NASA Astrophysics Data System (ADS)
Ren, Y. J.; Zhu, J. G.; Yang, X. Y.; Ye, S. H.
2006-10-01
The Virtex-II Pro FPGA is applied to the vision sensor tracking system of IRB2400 robot. The hardware platform, which undertakes the task of improving SNR and compressing data, is constructed by using the high-speed image processing of FPGA. The lower level image-processing algorithm is realized by combining the FPGA frame and the embedded CPU. The velocity of image processing is accelerated due to the introduction of FPGA and CPU. The usage of the embedded CPU makes it easily to realize the logic design of interface. Some key techniques are presented in the text, such as read-write process, template matching, convolution, and some modules are simulated too. In the end, the compare among the modules using this design, using the PC computer and using the DSP, is carried out. Because the high-speed image processing system core is a chip of FPGA, the function of which can renew conveniently, therefore, to a degree, the measure system is intelligent.
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing
Bravo, Ignacio; Baliñas, Javier; Gardel, Alfredo; Lázaro, José L.; Espinosa, Felipe; García, Jorge
2011-01-01
This article describes an image processing system based on an intelligent ad-hoc camera, whose two principle elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families makes it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely. PMID:22163739
Real-time FPGA architectures for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2000-03-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
A natural-color mapping for single-band night-time image based on FPGA
NASA Astrophysics Data System (ADS)
Wang, Yilun; Qian, Yunsheng
2018-01-01
A natural-color mapping for single-band night-time image method based on FPGA can transmit the color of the reference image to single-band night-time image, which is consistent with human visual habits and can help observers identify the target. This paper introduces the processing of the natural-color mapping algorithm based on FPGA. Firstly, the image can be transformed based on histogram equalization, and the intensity features and standard deviation features of reference image are stored in SRAM. Then, the real-time digital images' intensity features and standard deviation features are calculated by FPGA. At last, FPGA completes the color mapping through matching pixels between images using the features in luminance channel.
A single FPGA-based portable ultrasound imaging system for point-of-care applications.
Kim, Gi-Duck; Yoon, Changhan; Kye, Sang-Bum; Lee, Youngbae; Kang, Jeeun; Yoo, Yangmo; Song, Tai-kyong
2012-07-01
We present a cost-effective portable ultrasound system based on a single field-programmable gate array (FPGA) for point-of-care applications. In the portable ultrasound system developed, all the ultrasound signal and image processing modules, including an effective 32-channel receive beamformer with pseudo-dynamic focusing, are embedded in an FPGA chip. For overall system control, a mobile processor running Linux at 667 MHz is used. The scan-converted ultrasound image data from the FPGA are directly transferred to the system controller via external direct memory access without a video processing unit. The potable ultrasound system developed can provide real-time B-mode imaging with a maximum frame rate of 30, and it has a battery life of approximately 1.5 h. These results indicate that the single FPGA-based portable ultrasound system developed is able to meet the processing requirements in medical ultrasound imaging while providing improved flexibility for adapting to emerging POC applications.
A FPGA-based architecture for real-time image matching
NASA Astrophysics Data System (ADS)
Wang, Jianhui; Zhong, Sheng; Xu, Wenhui; Zhang, Weijun; Cao, Zhiguo
2013-10-01
Image matching is a fundamental task in computer vision. It is used to establish correspondence between two images taken at different viewpoint or different time from the same scene. However, its large computational complexity has been a challenge to most embedded systems. This paper proposes a single FPGA-based image matching system, which consists of SIFT feature detection, BRIEF descriptor extraction and BRIEF matching. It optimizes the FPGA architecture for the SIFT feature detection to reduce the FPGA resources utilization. Moreover, we implement BRIEF description and matching on FPGA also. The proposed system can implement image matching at 30fps (frame per second) for 1280x720 images. Its processing speed can meet the demand of most real-life computer vision applications.
Uranus: a rapid prototyping tool for FPGA embedded computer vision
NASA Astrophysics Data System (ADS)
Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.
2007-01-01
The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming
2011-11-01
Laser speckle imaging (LSI) is a noninvasive and full-field optical imaging technique which produces two-dimensional blood flow maps of tissues from the raw laser speckle images captured by a CCD camera without scanning. We present a hardware-friendly algorithm for the real-time processing of laser speckle imaging. The algorithm is developed and optimized specifically for LSI processing in the field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in FPGA. The pipeline processing scheme and parallel computing architecture are introduced into the design of this LSI hardware processor. When the LSI hardware processor is implemented in the FPGA running at the maximum frequency of 130 MHz, up to 85 raw images with the resolution of 640×480 pixels can be processed per second. Meanwhile, we also present a system on chip (SOC) solution for LSI processing by integrating the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution also can be used to produce an application specific integrated circuit for LSI processing.
FPGA-Based Reconfigurable Processor for Ultrafast Interlaced Ultrasound and Photoacoustic Imaging
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2016-01-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models. PMID:22828830
FPGA-based reconfigurable processor for ultrafast interlaced ultrasound and photoacoustic imaging.
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2012-07-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models.
Implementation of total focusing method for phased array ultrasonic imaging on FPGA
NASA Astrophysics Data System (ADS)
Guo, JianQiang; Li, Xi; Gao, Xiaorong; Wang, Zeyong; Zhao, Quanke
2015-02-01
This paper describes a multi-FPGA imaging system dedicated for the real-time imaging using the Total Focusing Method (TFM) and Full Matrix Capture (FMC). The system was entirely described using Verilog HDL language and implemented on Altera Stratix IV GX FPGA development board. The whole algorithm process is to: establish a coordinate system of image and divide it into grids; calculate the complete acoustic distance of array element between transmitting array element and receiving array element, and transform it into index value; then index the sound pressure values from ROM and superimpose sound pressure values to get pixel value of one focus point; and calculate the pixel values of all focus points to get the final imaging. The imaging result shows that this algorithm has high SNR of defect imaging. And FPGA with parallel processing capability can provide high speed performance, so this system can provide the imaging interface, with complete function and good performance.
NASA Astrophysics Data System (ADS)
Zou, Liang; Fu, Zhuang; Zhao, YanZheng; Yang, JunYan
2010-07-01
This paper proposes a kind of pipelined electric circuit architecture implemented in FPGA, a very large scale integrated circuit (VLSI), which efficiently deals with the real time non-uniformity correction (NUC) algorithm for infrared focal plane arrays (IRFPA). Dual Nios II soft-core processors and a DSP with a 64+ core together constitute this image system. Each processor undertakes own systematic task, coordinating its work with each other's. The system on programmable chip (SOPC) in FPGA works steadily under the global clock frequency of 96Mhz. Adequate time allowance makes FPGA perform NUC image pre-processing algorithm with ease, which has offered favorable guarantee for the work of post image processing in DSP. And at the meantime, this paper presents a hardware (HW) and software (SW) co-design in FPGA. Thus, this systematic architecture yields an image processing system with multiprocessor, and a smart solution to the satisfaction with the performance of the system.
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
A New FPGA Architecture of FAST and BRIEF Algorithm for On-Board Corner Detection and Matching.
Huang, Jingjin; Zhou, Guoqing; Zhou, Xiang; Zhang, Rongting
2018-03-28
Although some researchers have proposed the Field Programmable Gate Array (FPGA) architectures of Feature From Accelerated Segment Test (FAST) and Binary Robust Independent Elementary Features (BRIEF) algorithm, there is no consideration of image data storage in these traditional architectures that will result in no image data that can be reused by the follow-up algorithms. This paper proposes a new FPGA architecture that considers the reuse of sub-image data. In the proposed architecture, a remainder-based method is firstly designed for reading the sub-image, a FAST detector and a BRIEF descriptor are combined for corner detection and matching. Six pairs of satellite images with different textures, which are located in the Mentougou district, Beijing, China, are used to evaluate the performance of the proposed architecture. The Modelsim simulation results found that: (i) the proposed architecture is effective for sub-image reading from DDR3 at a minimum cost; (ii) the FPGA implementation is corrected and efficient for corner detection and matching, such as the average value of matching rate of natural areas and artificial areas are approximately 67% and 83%, respectively, which are close to PC's and the processing speed by FPGA is approximately 31 and 2.5 times faster than those by PC processing and by GPU processing, respectively.
Real-time field programmable gate array architecture for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2001-01-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on dedicated very- large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
A generic FPGA-based detector readout and real-time image processing board
NASA Astrophysics Data System (ADS)
Sarpotdar, Mayuresh; Mathew, Joice; Safonova, Margarita; Murthy, Jayant
2016-07-01
For space-based astronomical observations, it is important to have a mechanism to capture the digital output from the standard detector for further on-board analysis and storage. We have developed a generic (application- wise) field-programmable gate array (FPGA) board to interface with an image sensor, a method to generate the clocks required to read the image data from the sensor, and a real-time image processor system (on-chip) which can be used for various image processing tasks. The FPGA board is applied as the image processor board in the Lunar Ultraviolet Cosmic Imager (LUCI) and a star sensor (StarSense) - instruments developed by our group. In this paper, we discuss the various design considerations for this board and its applications in the future balloon and possible space flights.
DDGIPS: a general image processing system in robot vision
NASA Astrophysics Data System (ADS)
Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang
2000-10-01
Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi-algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6201s each has a local memory space, and they also have a shared system memory space which enables them to intercommunicate and exchange data efficiently. At the same time, they can be directly inter-connected in star-shaped architecture. All of these are under the control of a FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
DDGIPS: a general image processing system in robot vision
NASA Astrophysics Data System (ADS)
Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang
2000-10-01
Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi- algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6210s each has a local memory space, and they also have a shared system memory space which enable them to intercommunicate and exchange data efficiently. At the same time, they can be directly interconnected in star- shaped architecture. All of these are under the control of FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
FPGA Coprocessor for Accelerated Classification of Images
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.
2008-01-01
An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of recently developed hybrid Virtex-4FX field-programmable gate array (FPGA) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for increased onboard processing capability than previously demonstrated in software. The original C-language program demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, or cloud or unclassified. Current onboard processors, such as on EO-1, have limited computing power, extremely limited active storage capability and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program, and two newly formulated programs for a more capable expanded-linear-kernel and a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.
NASA Astrophysics Data System (ADS)
Jackson, Christopher Robert
"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
Remote hardware-reconfigurable robotic camera
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.
2001-10-01
In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.
Comparing an FPGA to a Cell for an Image Processing Application
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.
2010-12-01
Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to discover the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, Iris Recognition Systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view
NASA Astrophysics Data System (ADS)
Cao, Tam P.; Deng, Guang; Elton, Darrell
2009-02-01
In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms in different lighting conditions are initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images uses significantly less hardware resources on an FPGA while maintaining comparable system's performance. The system is capable of processing 60 live video frames per second.
Software-based high-level synthesis design of FPGA beamformers for synthetic aperture imaging.
Amaro, Joao; Yiu, Billy Y S; Falcao, Gabriel; Gomes, Marco A C; Yu, Alfred C H
2015-05-01
Field-programmable gate arrays (FPGAs) can potentially be configured as beamforming platforms for ultrasound imaging, but a long design time and skilled expertise in hardware programming are typically required. In this article, we present a novel approach to the efficient design of FPGA beamformers for synthetic aperture (SA) imaging via the use of software-based high-level synthesis techniques. Software kernels (coded in OpenCL) were first developed to stage-wise handle SA beamforming operations, and their corresponding FPGA logic circuitry was emulated through a high-level synthesis framework. After design space analysis, the fine-tuned OpenCL kernels were compiled into register transfer level descriptions to configure an FPGA as a beamformer module. The processing performance of this beamformer was assessed through a series of offline emulation experiments that sought to derive beamformed images from SA channel-domain raw data (40-MHz sampling rate, 12 bit resolution). With 128 channels, our FPGA-based SA beamformer can achieve 41 frames per second (fps) processing throughput (3.44 × 10(8) pixels per second for frame size of 256 × 256 pixels) at 31.5 W power consumption (1.30 fps/W power efficiency). It utilized 86.9% of the FPGA fabric and operated at a 196.5 MHz clock frequency (after optimization). Based on these findings, we anticipate that FPGA and high-level synthesis can together foster rapid prototyping of real-time ultrasound processor modules at low power consumption budgets.
Embedded real-time image processing hardware for feature extraction and clustering
NASA Astrophysics Data System (ADS)
Chiu, Lihu; Chang, Grant
2003-08-01
Printronix, Inc. uses scanner-based image systems to perform print quality measurements for line-matrix printers. The size of the image samples and image definition required make commercial scanners convenient to use. The image processing is relatively well defined, and we are able to simplify many of the calculations into hardware equations and "c" code. The process of rapidly prototyping the system using DSP based "c" code gets the algorithms well defined early in the development cycle. Once a working system is defined, the rest of the process involves splitting the task up for the FPGA and the DSP implementation. Deciding which of the two to use, the DSP or the FPGA, is a simple matter of trial benchmarking. There are two kinds of benchmarking: One for speed, and the other for memory. The more memory intensive algorithms should run in the DSP, and the simple real time tasks can use the FPGA most effectively. Once the task is split, we can decide which platform the algorithm should be executed. This involves prototyping all the code in the DSP, then timing various blocks of the algorithm. Slow routines can be optimized using the compiler tools, and if further reduction in time is needed, into tasks that the FPGA can perform.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perrine, Kenneth A.; Hopkins, Derek F.; Lamarche, Brian L.
2005-09-01
Biologists and computer engineers at Pacific Northwest National Laboratory have specified, designed, and implemented a hardware/software system for performing real-time, multispectral image processing on a confocal microscope. This solution is intended to extend the capabilities of the microscope, enabling scientists to conduct advanced experiments on cell signaling and other kinds of protein interactions. FRET (fluorescence resonance energy transfer) techniques are used to locate and monitor protein activity. In FRET, it is critical that spectral images be precisely aligned with each other despite disturbances in the physical imaging path caused by imperfections in lenses and cameras, and expansion and contraction ofmore » materials due to temperature changes. The central importance of this work is therefore automatic image registration. This runs in a framework that guarantees real-time performance (processing pairs of 1024x1024, 8-bit images at 15 frames per second) and enables the addition of other types of advanced image processing algorithms such as image feature characterization. The supporting system architecture consists of a Visual Basic front-end containing a series of on-screen interfaces for controlling various aspects of the microscope and a script engine for automation. One of the controls is an ActiveX component written in C++ for handling the control and transfer of images. This component interfaces with a pair of LVDS image capture boards and a PCI board containing a 6-million gate Xilinx Virtex-II FPGA. Several types of image processing are performed on the FPGA in a pipelined fashion, including the image registration. The FPGA offloads work that would otherwise need to be performed by the main CPU and has a guaranteed real-time throughput. Image registration is performed in the FPGA by applying a cubic warp on one image to precisely align it with the other image. Before each experiment, an automated calibration procedure is run in order to set up the cubic warp. During image acquisitions, the cubic warp is evaluated by way of forward differencing. Unwanted pixelation artifacts are minimized by bilinear sampling. The resulting system is state-of-the-art for biological imaging. Precisely registered images enable the reliable use of FRET techniques. In addition, real-time image processing performance allows computed images to be fed back and displayed to scientists immediately, and the pipelined nature of the FPGA allows additional image processing algorithms to be incorporated into the system without slowing throughput.« less
Fang, Simin; Zhou, Sheng; Wang, Xiaochun; Ye, Qingsheng; Tian, Ling; Ji, Jianjun; Wang, Yanqun
2015-01-01
To design and improve signal processing algorithms of ophthalmic ultrasonography based on FPGA. Achieved three signal processing modules: full parallel distributed dynamic filter, digital quadrature demodulation, logarithmic compression, using Verilog HDL hardware language in Quartus II. Compared to the original system, the hardware cost is reduced, the whole image shows clearer and more information of the deep eyeball contained in the image, the depth of detection increases from 5 cm to 6 cm. The new algorithms meet the design requirements and achieve the system's optimization that they can effectively improve the image quality of existing equipment.
Design of area array CCD image acquisition and display system based on FPGA
NASA Astrophysics Data System (ADS)
Li, Lei; Zhang, Ning; Li, Tianting; Pan, Yue; Dai, Yuming
2014-09-01
With the development of science and technology, CCD(Charge-coupled Device) has been widely applied in various fields and plays an important role in the modern sensing system, therefore researching a real-time image acquisition and display plan based on CCD device has great significance. This paper introduces an image data acquisition and display system of area array CCD based on FPGA. Several key technical challenges and problems of the system have also been analyzed and followed solutions put forward .The FPGA works as the core processing unit in the system that controls the integral time sequence .The ICX285AL area array CCD image sensor produced by SONY Corporation has been used in the system. The FPGA works to complete the driver of the area array CCD, then analog front end (AFE) processes the signal of the CCD image, including amplification, filtering, noise elimination, CDS correlation double sampling, etc. AD9945 produced by ADI Corporation to convert analog signal to digital signal. Developed Camera Link high-speed data transmission circuit, and completed the PC-end software design of the image acquisition, and realized the real-time display of images. The result through practical testing indicates that the system in the image acquisition and control is stable and reliable, and the indicators meet the actual project requirements.
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.
Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan
2017-07-01
In ultrasound image analysis, the speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" still remains as a challenge for the speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values, when the tissue deformation is large. The major drawback of these methods is the high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs on handling different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphic card (GeForce GTX 580).
Abid, Abdulbasit
2013-03-01
This paper presents a thorough discussion of the proposed field-programmable gate array (FPGA) implementation for fringe pattern demodulation using the one-dimensional continuous wavelet transform (1D-CWT) algorithm. This algorithm is also known as wavelet transform profilometry. Initially, the 1D-CWT is programmed using the C programming language and compiled into VHDL using the ImpulseC tool. This VHDL code is implemented on the Altera Cyclone IV GX EP4CGX150DF31C7 FPGA. A fringe pattern image with a size of 512×512 pixels is presented to the FPGA, which processes the image using the 1D-CWT algorithm. The FPGA requires approximately 100 ms to process the image and produce a wrapped phase map. For performance comparison purposes, the 1D-CWT algorithm is programmed using the C language. The C code is then compiled using the Intel compiler version 13.0. The compiled code is run on a Dell Precision state-of-the-art workstation. The time required to process the fringe pattern image is approximately 1 s. In order to further reduce the execution time, the 1D-CWT is reprogramed using Intel Integrated Primitive Performance (IPP) Library Version 7.1. The execution time was reduced to approximately 650 ms. This confirms that at least sixfold speedup was gained using FPGA implementation over a state-of-the-art workstation that executes heavily optimized implementation of the 1D-CWT algorithm.
A CMOS high speed imaging system design based on FPGA
NASA Astrophysics Data System (ADS)
Tang, Hong; Wang, Huawei; Cao, Jianzhong; Qiao, Mingrui
2015-10-01
CMOS sensors have more advantages than traditional CCD sensors. The imaging system based on CMOS has become a hot spot in research and development. In order to achieve the real-time data acquisition and high-speed transmission, we design a high-speed CMOS imaging system on account of FPGA. The core control chip of this system is XC6SL75T and we take advantages of CameraLink interface and AM41V4 CMOS image sensors to transmit and acquire image data. AM41V4 is a 4 Megapixel High speed 500 frames per second CMOS image sensor with global shutter and 4/3" optical format. The sensor uses column parallel A/D converters to digitize the images. The CameraLink interface adopts DS90CR287 and it can convert 28 bits of LVCMOS/LVTTL data into four LVDS data stream. The reflected light of objects is photographed by the CMOS detectors. CMOS sensors convert the light to electronic signals and then send them to FPGA. FPGA processes data it received and transmits them to upper computer which has acquisition cards through CameraLink interface configured as full models. Then PC will store, visualize and process images later. The structure and principle of the system are both explained in this paper and this paper introduces the hardware and software design of the system. FPGA introduces the driven clock of CMOS. The data in CMOS is converted to LVDS signals and then transmitted to the data acquisition cards. After simulation, the paper presents a row transfer timing sequence of CMOS. The system realized real-time image acquisition and external controls.
NASA Astrophysics Data System (ADS)
Cobos Arribas, Pedro; Monasterio Huelin Macia, Felix
2003-04-01
A FPGA based hardware implementation of the Santos-Victor optical flow algorithm, useful in robot guidance applications, is described in this paper. The system used to do contains an ALTERA FPGA (20K100), an interface with a digital camera, three VRAM memories to contain the data input and some output memories (a VRAM and a EDO) to contain the results. The system have been used previously to develop and test other vision algorithms, such as image compression, optical flow calculation with differential and correlation methods. The designed system let connect the digital camera, or the FPGA output (results of algorithms) to a PC, throw its Firewire or USB port. The problems take place in this occasion have motivated to adopt another hardware structure for certain vision algorithms with special requirements, that need a very hard code intensive processing.
Design of light-small high-speed image data processing system
NASA Astrophysics Data System (ADS)
Yang, Jinbao; Feng, Xue; Li, Fei
2015-10-01
A light-small high speed image data processing system was designed in order to meet the request of image data processing in aerospace. System was constructed of FPGA, DSP and MCU (Micro-controller), implementing a video compress of 3 million pixels@15frames and real-time return of compressed image to the upper system. Programmable characteristic of FPGA, high performance image compress IC and configurable MCU were made best use to improve integration. Besides, hard-soft board design was introduced and PCB layout was optimized. At last, system achieved miniaturization, light-weight and fast heat dispersion. Experiments show that, system's multifunction was designed correctly and worked stably. In conclusion, system can be widely used in the area of light-small imaging.
NASA Astrophysics Data System (ADS)
Nguyen, An Hung; Guillemette, Thomas; Lambert, Andrew J.; Pickering, Mark R.; Garratt, Matthew A.
2017-09-01
Image registration is a fundamental image processing technique. It is used to spatially align two or more images that have been captured at different times, from different sensors, or from different viewpoints. There have been many algorithms proposed for this task. The most common of these being the well-known Lucas-Kanade (LK) and Horn-Schunck approaches. However, the main limitation of these approaches is the computational complexity required to implement the large number of iterations necessary for successful alignment of the images. Previously, a multi-pass image interpolation algorithm (MP-I2A) was developed to considerably reduce the number of iterations required for successful registration compared with the LK algorithm. This paper develops a kernel-warping algorithm (KWA), a modified version of the MP-I2A, which requires fewer iterations to successfully register two images and less memory space for the field-programmable gate array (FPGA) implementation than the MP-I2A. These reductions increase feasibility of the implementation of the proposed algorithm on FPGAs with very limited memory space and other hardware resources. A two-FPGA system rather than single FPGA system is successfully developed to implement the KWA in order to compensate insufficiency of hardware resources supported by one FPGA, and increase parallel processing ability and scalability of the system.
Implementation of the 2-D Wavelet Transform into FPGA for Image
NASA Astrophysics Data System (ADS)
León, M.; Barba, L.; Vargas, L.; Torres, C. O.
2011-01-01
This paper presents a hardware system implementation of the of discrete wavelet transform algoritm in two dimensions for FPGA, using the Daubechies filter family of order 2 (db2). The decomposition algorithm of this transform is designed and simulated with the Hardware Description Language VHDL and is implemented in a programmable logic device (FPGA) XC3S1200E reference, Spartan IIIE family, by Xilinx, take advantage the parallels properties of these gives us and speeds processing that can reach them. The architecture is evaluated using images input of different sizes. This implementation is done with the aim of developing a future images encryption hardware system using wavelet transform for security information.
Real-time distortion correction for visual inspection systems based on FPGA
NASA Astrophysics Data System (ADS)
Liang, Danhua; Zhang, Zhaoxia; Chen, Xiaodong; Yu, Daoyin
2008-03-01
Visual inspection is a kind of new technology based on the research of computer vision, which focuses on the measurement of the object's geometry and location. It can be widely used in online measurement, and other real-time measurement process. Because of the defects of the traditional visual inspection, a new visual detection mode -all-digital intelligent acquisition and transmission is presented. The image processing, including filtering, image compression, binarization, edge detection and distortion correction, can be completed in the programmable devices -FPGA. As the wide-field angle lens is adopted in the system, the output images have serious distortion. Limited by the calculating speed of computer, software can only correct the distortion of static images but not the distortion of dynamic images. To reach the real-time need, we design a distortion correction system based on FPGA. The method of hardware distortion correction is that the spatial correction data are calculated first under software circumstance, then converted into the address of hardware storage and stored in the hardware look-up table, through which data can be read out to correct gray level. The major benefit using FPGA is that the same circuit can be used for other circularly symmetric wide-angle lenses without being modified.
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre- processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
Design of polarization imaging system based on CIS and FPGA
NASA Astrophysics Data System (ADS)
Zeng, Yan-an; Liu, Li-gang; Yang, Kun-tao; Chang, Da-ding
2008-02-01
As polarization is an important characteristic of light, polarization image detecting is a new image detecting technology of combining polarimetric and image processing technology. Contrasting traditional image detecting in ray radiation, polarization image detecting could acquire a lot of very important information which traditional image detecting couldn't. Polarization image detecting will be widely used in civilian field and military field. As polarization image detecting could resolve some problem which couldn't be resolved by traditional image detecting, it has been researched widely around the world. The paper introduces polarization image detecting in physical theory at first, then especially introduces image collecting and polarization image process based on CIS (CMOS image sensor) and FPGA. There are two parts including hardware and software for polarization imaging system. The part of hardware include drive module of CMOS image sensor, VGA display module, SRAM access module and the real-time image data collecting system based on FPGA. The circuit diagram and PCB was designed. Stokes vector and polarization angle computing method are analyzed in the part of software. The float multiply of Stokes vector is optimized into just shift and addition operation. The result of the experiment shows that real time image collecting system could collect and display image data from CMOS image sensor in real-time.
NASA Astrophysics Data System (ADS)
Gunay, Omer; Ozsarac, Ismail; Kamisli, Fatih
2017-05-01
Video recording is an essential property of new generation military imaging systems. Playback of the stored video on the same device is also desirable as it provides several operational benefits to end users. Two very important constraints for many military imaging systems, especially for hand-held devices and thermal weapon sights, are power consumption and size. To meet these constraints, it is essential to perform most of the processing applied to the video signal, such as preprocessing, compression, storing, decoding, playback and other system functions on a single programmable chip, such as FPGA, DSP, GPU or ASIC. In this work, H.264/AVC (Advanced Video Coding) compatible video compression, storage, decoding and playback blocks are efficiently designed and implemented on FPGA platforms using FPGA fabric and Altera NIOS II soft processor. Many subblocks that are used in video encoding are also used during video decoding in order to save FPGA resources and power. Computationally complex blocks are designed using FPGA fabric, while blocks such as SD card write/read, H.264 syntax decoding and CAVLC decoding are done using NIOS processor to benefit from software flexibility. In addition, to keep power consumption low, the system was designed to require limited external memory access. The design was tested using 640x480 25 fps thermal camera on CYCLONE V FPGA, which is the ALTERA's lowest power FPGA family, and consumes lower than 40% of CYCLONE V 5CEFA7 FPGA resources on average.
Onboard FPGA-based SAR processing for future spaceborne systems
NASA Technical Reports Server (NTRS)
Le, Charles; Chan, Samuel; Cheng, Frank; Fang, Winston; Fischman, Mark; Hensley, Scott; Johnson, Robert; Jourdan, Michael; Marina, Miguel; Parham, Bruce;
2004-01-01
We present a real-time high-performance and fault-tolerant FPGA-based hardware architecture for the processing of synthetic aperture radar (SAR) images in future spaceborne system. In particular, we will discuss the integrated design approach, from top-level algorithm specifications and system requirements, design methodology, functional verification and performance validation, down to hardware design and implementation.
Real-time blind image deconvolution based on coordinated framework of FPGA and DSP
NASA Astrophysics Data System (ADS)
Wang, Ze; Li, Hang; Zhou, Hua; Liu, Hongjun
2015-10-01
Image restoration takes a crucial place in several important application domains. With the increasing of computation requirement as the algorithms become much more complexity, there has been a significant rise in the need for accelerating implementation. In this paper, we focus on an efficient real-time image processing system for blind iterative deconvolution method by means of the Richardson-Lucy (R-L) algorithm. We study the characteristics of algorithm, and an image restoration processing system based on the coordinated framework of FPGA and DSP (CoFD) is presented. Single precision floating-point processing units with small-scale cascade and special FFT/IFFT processing modules are adopted to guarantee the accuracy of the processing. Finally, Comparing experiments are done. The system could process a blurred image of 128×128 pixels within 32 milliseconds, and is up to three or four times faster than the traditional multi-DSPs systems.
Real-time implementation of camera positioning algorithm based on FPGA & SOPC
NASA Astrophysics Data System (ADS)
Yang, Mingcao; Qiu, Yuehong
2014-09-01
In recent years, with the development of positioning algorithm and FPGA, to achieve the camera positioning based on real-time implementation, rapidity, accuracy of FPGA has become a possibility by way of in-depth study of embedded hardware and dual camera positioning system, this thesis set up an infrared optical positioning system based on FPGA and SOPC system, which enables real-time positioning to mark points in space. Thesis completion include: (1) uses a CMOS sensor to extract the pixel of three objects with total feet, implemented through FPGA hardware driver, visible-light LED, used here as the target point of the instrument. (2) prior to extraction of the feature point coordinates, the image needs to be filtered to avoid affecting the physical properties of the system to bring the platform, where the median filtering. (3) Coordinate signs point to FPGA hardware circuit extraction, a new iterative threshold selection method for segmentation of images. Binary image is then segmented image tags, which calculates the coordinates of the feature points of the needle through the center of gravity method. (4) direct linear transformation (DLT) and extreme constraints method is applied to three-dimensional reconstruction of the plane array CMOS system space coordinates. using SOPC system on a chip here, taking advantage of dual-core computing systems, which let match and coordinate operations separately, thus increase processing speed.
Design of CMOS imaging system based on FPGA
NASA Astrophysics Data System (ADS)
Hu, Bo; Chen, Xiaolai
2017-10-01
In order to meet the needs of engineering applications for high dynamic range CMOS camera under the rolling shutter mode, a complete imaging system is designed based on the CMOS imaging sensor NSC1105. The paper decides CMOS+ADC+FPGA+Camera Link as processing architecture and introduces the design and implementation of the hardware system. As for camera software system, which consists of CMOS timing drive module, image acquisition module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The ISE 14.6 emulator ISim is used in the simulation of signals. The imaging experimental results show that the system exhibits a 1280*1024 pixel resolution, has a frame frequency of 25 fps and a dynamic range more than 120dB. The imaging quality of the system satisfies the requirement of the index.
Zhao, Ming; Li, Yu; Peng, Leilei
2014-01-01
We report a fast non-iterative lifetime data analysis method for the Fourier multiplexed frequency-sweeping confocal FLIM (Fm-FLIM) system [ Opt. Express22, 10221 ( 2014)24921725]. The new method, named R-method, allows fast multi-channel lifetime image analysis in the system’s FPGA data processing board. Experimental tests proved that the performance of the R-method is equivalent to that of single-exponential iterative fitting, and its sensitivity is well suited for time-lapse FLIM-FRET imaging of live cells, for example cyclic adenosine monophosphate (cAMP) level imaging with GFP-Epac-mCherry sensors. With the R-method and its FPGA implementation, multi-channel lifetime images can now be generated in real time on the multi-channel frequency-sweeping FLIM system, and live readout of FRET sensors can be performed during time-lapse imaging. PMID:25321778
A real-time MTFC algorithm of space remote-sensing camera based on FPGA
NASA Astrophysics Data System (ADS)
Zhao, Liting; Huang, Gang; Lin, Zhe
2018-01-01
A real-time MTFC algorithm of space remote-sensing camera based on FPGA was designed. The algorithm can provide real-time image processing to enhance image clarity when the remote-sensing camera running on-orbit. The image restoration algorithm adopted modular design. The MTF measurement calculation module on-orbit had the function of calculating the edge extension function, line extension function, ESF difference operation, normalization MTF and MTFC parameters. The MTFC image filtering and noise suppression had the function of filtering algorithm and effectively suppressing the noise. The algorithm used System Generator to design the image processing algorithms to simplify the design structure of system and the process redesign. The image gray gradient dot sharpness edge contrast and median-high frequency were enhanced. The image SNR after recovery reduced less than 1 dB compared to the original image. The image restoration system can be widely used in various fields.
HALO: a reconfigurable image enhancement and multisensor fusion system
NASA Astrophysics Data System (ADS)
Wu, F.; Hickman, D. L.; Parker, Steve J.
2014-06-01
Contemporary high definition (HD) cameras and affordable infrared (IR) imagers are set to dramatically improve the effectiveness of security, surveillance and military vision systems. However, the quality of imagery is often compromised by camera shake, or poor scene visibility due to inadequate illumination or bad atmospheric conditions. A versatile vision processing system called HALO™ is presented that can address these issues, by providing flexible image processing functionality on a low size, weight and power (SWaP) platform. Example processing functions include video distortion correction, stabilisation, multi-sensor fusion and image contrast enhancement (ICE). The system is based around an all-programmable system-on-a-chip (SoC), which combines the computational power of a field-programmable gate array (FPGA) with the flexibility of a CPU. The FPGA accelerates computationally intensive real-time processes, whereas the CPU provides management and decision making functions that can automatically reconfigure the platform based on user input and scene content. These capabilities enable a HALO™ equipped reconnaissance or surveillance system to operate in poor visibility, providing potentially critical operational advantages in visually complex and challenging usage scenarios. The choice of an FPGA based SoC is discussed, and the HALO™ architecture and its implementation are described. The capabilities of image distortion correction, stabilisation, fusion and ICE are illustrated using laboratory and trials data.
Rapid and highly integrated FPGA-based Shack-Hartmann wavefront sensor for adaptive optics system
NASA Astrophysics Data System (ADS)
Chen, Yi-Pin; Chang, Chia-Yuan; Chen, Shean-Jen
2018-02-01
In this study, a field programmable gate array (FPGA)-based Shack-Hartmann wavefront sensor (SHWS) programmed on LabVIEW can be highly integrated into customized applications such as adaptive optics system (AOS) for performing real-time wavefront measurement. Further, a Camera Link frame grabber embedded with FPGA is adopted to enhance the sensor speed reacting to variation considering its advantage of the highest data transmission bandwidth. Instead of waiting for a frame image to be captured by the FPGA, the Shack-Hartmann algorithm are implemented in parallel processing blocks design and let the image data transmission synchronize with the wavefront reconstruction. On the other hand, we design a mechanism to control the deformable mirror in the same FPGA and verify the Shack-Hartmann sensor speed by controlling the frequency of the deformable mirror dynamic surface deformation. Currently, this FPGAbead SHWS design can achieve a 266 Hz cyclic speed limited by the camera frame rate as well as leaves 40% logic slices for additionally flexible design.
FPGA Implementation of Stereo Disparity with High Throughput for Mobility Applications
NASA Technical Reports Server (NTRS)
Villalpando, Carlos Y.; Morfopolous, Arin; Matthies, Larry; Goldberg, Steven
2011-01-01
High speed stereo vision can allow unmanned robotic systems to navigate safely in unstructured terrain, but the computational cost can exceed the capacity of typical embedded CPUs. In this paper, we describe an end-to-end stereo computation co-processing system optimized for fast throughput that has been implemented on a single Virtex 4 LX160 FPGA. This system is capable of operating on images from a 1024 x 768 3CCD (true RGB) camera pair at 15 Hz. Data enters the FPGA directly from the cameras via Camera Link and is rectified, pre-filtered and converted into a disparity image all within the FPGA, incurring no CPU load. Once complete, a rectified image and the final disparity image are read out over the PCI bus, for a bandwidth cost of 68 MB/sec. Within the FPGA there are 4 distinct algorithms: Camera Link capture, Bilinear rectification, Bilateral subtraction pre-filtering and the Sum of Absolute Difference (SAD) disparity. Each module will be described in brief along with the data flow and control logic for the system. The system has been successfully fielded upon the Carnegie Mellon University's National Robotics Engineering Center (NREC) Crusher system during extensive field trials in 2007 and 2008 and is being implemented for other surface mobility systems at JPL.
Design of an MR image processing module on an FPGA chip
NASA Astrophysics Data System (ADS)
Li, Limin; Wyrwicz, Alice M.
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments.
Design of an MR image processing module on an FPGA chip
Li, Limin; Wyrwicz, Alice M.
2015-01-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. PMID:25909646
FPGA Based High Speed Data Acquisition System for Electrical Impedance Tomography
Khan, S; Borsic, A; Manwaring, Preston; Hartov, Alexander; Halter, Ryan
2014-01-01
Electrical Impedance Tomography (EIT) systems are used to image tissue bio-impedance. EIT provides a number of features making it attractive for use as a medical imaging device including the ability to image fast physiological processes (>60 Hz), to meet a range of clinical imaging needs through varying electrode geometries and configurations, to impart only non-ionizing radiation to a patient, and to map the significant electrical property contrasts present between numerous benign and pathological tissues. To leverage these potential advantages for medical imaging, we developed a modular 32 channel data acquisition (DAQ) system using National Instruments’ PXI chassis, along with FPGA, ADC, Signal Generator and Timing and Synchronization modules. To achieve high frame rates, signal demodulation and spectral characteristics of higher order harmonics were computed using dedicated FFT-hardware built into the FPGA module. By offloading the computing onto FPGA, we were able to achieve a reduction in throughput required between the FPGA and PC by a factor of 32:1. A custom designed analog front end (AFE) was used to interface electrodes with our system. Our system is wideband, and capable of acquiring data for input signal frequencies ranging from 100 Hz to 12 MHz. The modular design of both the hardware and software will allow this system to be flexibly configured for the particular clinical application. PMID:24729790
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos
2011-01-01
General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
High performance embedded system for real-time pattern matching
NASA Astrophysics Data System (ADS)
Sotiropoulou, C.-L.; Luciano, P.; Gkaitatzis, S.; Citraro, S.; Giannetti, P.; Dell'Orso, M.
2017-02-01
In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton-proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering) are also implemented on the FPGA. The pattern matching can be executed on a 2D or 3D space, on black and white or grayscale images, depending on the application and thus increasing exponentially the processing requirements of the system. We present the firmware implementation of the training and pattern matching algorithm, performance and results on a latest generation Xilinx Kintex Ultrascale FPGA device.
OpenPET: A Flexible Electronics System for Radiotracer Imaging
NASA Astrophysics Data System (ADS)
Moses, W. W.; Buckley, S.; Vu, C.; Peng, Q.; Pavlov, N.; Choong, W.-S.; Wu, J.; Jackson, C.
2010-10-01
We present the design for OpenPET, an electronics readout system designed for prototype radiotracer imaging instruments. The critical requirements are that it has sufficient performance, channel count, channel density, and power consumption to service a complete camera, and yet be simple, flexible, and customizable enough to be used with almost any detector or camera design. An important feature of this system is that each analog input is processed independently. Each input can be configured to accept signals of either polarity as well as either differential or ground referenced signals. Each signal is digitized by a continuously sampled ADC, which is processed by an FPGA to extract pulse height information. A leading edge discriminator creates a timing edge that is “time stamped” by a TDC implemented inside the FPGA. This digital information from each channel is sent to an FPGA that services 16 analog channels, and information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc. As all of this processing is controlled by firmware and software, it can be modified/customized easily. The system is open source, meaning that all technical data (specifications, schematics and board layout files, source code, and instructions) will be publicly available.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choong, W. -S.; Abu-Nimeh, F.; Moses, W. W.
Here, we present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, whichmore » allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. In conclusion, this digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.« less
DSP+FPGA-based real-time histogram equalization system of infrared image
NASA Astrophysics Data System (ADS)
Gu, Dongsheng; Yang, Nansheng; Pi, Defu; Hua, Min; Shen, Xiaoyan; Zhang, Ruolan
2001-10-01
Histogram Modification is a simple but effective method to enhance an infrared image. There are several methods to equalize an infrared image's histogram due to the different characteristics of the different infrared images, such as the traditional HE (Histogram Equalization) method, and the improved HP (Histogram Projection) and PE (Plateau Equalization) method and so on. If to realize these methods in a single system, the system must have a mass of memory and extremely fast speed. In our system, we introduce a DSP + FPGA based real-time procession technology to do these things together. FPGA is used to realize the common part of these methods while DSP is to do the different part. The choice of methods and the parameter can be input by a keyboard or a computer. By this means, the function of the system is powerful while it is easy to operate and maintain. In this article, we give out the diagram of the system and the soft flow chart of the methods. And at the end of it, we give out the infrared image and its histogram before and after the process of HE method.
NASA Astrophysics Data System (ADS)
Alqasemi, Umar; Li, Hai; Aguirre, Andres; Zhu, Quing
2011-03-01
Co-registering ultrasound (US) and photoacoustic (PA) imaging is a logical extension to conventional ultrasound because both modalities provide complementary information of tumor morphology, tumor vasculature and hypoxia for cancer detection and characterization. In addition, both modalities are capable of providing real-time images for clinical applications. In this paper, a Field Programmable Gate Array (FPGA) and Digital Signal Processor (DSP) module-based real-time US/PA imaging system is presented. The system provides real-time US/PA data acquisition and image display for up to 5 fps* using the currently implemented DSP board. It can be upgraded to 15 fps, which is the maximum pulse repetition rate of the used laser, by implementing an advanced DSP module. Additionally, the photoacoustic RF data for each frame is saved for further off-line processing. The system frontend consists of eight 16-channel modules made of commercial and customized circuits. Each 16-channel module consists of two commercial 8-channel receiving circuitry boards and one FPGA board from Analog Devices. Each receiving board contains an IC† that combines. 8-channel low-noise amplifiers, variable-gain amplifiers, anti-aliasing filters, and ADC's‡ in a single chip with sampling frequency of 40MHz. The FPGA board captures the LVDSξ Double Data Rate (DDR) digital output of the receiving board and performs data conditioning and subbeamforming. A customized 16-channel transmission circuitry is connected to the two receiving boards for US pulseecho (PE) mode data acquisition. A DSP module uses External Memory Interface (EMIF) to interface with the eight 16-channel modules through a customized adaptor board. The DSP transfers either sub-beamformed data (US pulse-echo mode or PAI imaging mode) or raw data from FPGA boards to its DDR-2 memory through the EMIF link, then it performs additional processing, after that, it transfer the data to the PC** for further image processing. The PC code performs image processing including demodulation, beam envelope detection and scan conversion. Additionally, the PC code pre-calculates the delay coefficients used for transmission focusing and receiving dynamic focusing for different types of transducers to speed up the imaging process. To further speed up the imaging process, a multi-threads technique is implemented in order to allow formation of previous image frame data and acquisition of the next one simultaneously. The system is also capable of doing semi-real-time automated SO2 imaging at 10 seconds per frame by changing the wavelength knob of the laser automatically using a stepper motor controlled by the system. Initial in vivo experiments were performed on animal tumors to map out its vasculature and hypoxia level, which were superimposed on co-registered US images. The real-time system allows capturing co-registered US/PA images free of motion artifacts and also provides dynamitic information when contrast agents are used.
A Real-Time System for Lane Detection Based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Xiao, Jing; Li, Shutao; Sun, Bin
2016-12-01
This paper presents a real-time lane detection system including edge detection and improved Hough Transform based lane detection algorithm and its hardware implementation with field programmable gate array (FPGA) and digital signal processor (DSP). Firstly, gradient amplitude and direction information are combined to extract lane edge information. Then, the information is used to determine the region of interest. Finally, the lanes are extracted by using improved Hough Transform. The image processing module of the system consists of FPGA and DSP. Particularly, the algorithms implemented in FPGA are working in pipeline and processing in parallel so that the system can run in real-time. In addition, DSP realizes lane line extraction and display function with an improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
Design of an MR image processing module on an FPGA chip.
Li, Limin; Wyrwicz, Alice M
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128×128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. Copyright © 2015 Elsevier Inc. All rights reserved.
Real-time windowing in imaging radar using FPGA technique
NASA Astrophysics Data System (ADS)
Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique
2005-02-01
The imaging radar uses the high frequency electromagnetic waves reflected from different objects for estimating of its parameters. Pulse compression is a standard signal processing technique used to minimize the peak transmission power and to maximize SNR, and to get a better resolution. Usually the pulse compression can be achieved using a matched filter. The level of the side-lobes in the imaging radar can be reduced using the special weighting function processing. There are very known different weighting functions: Hamming, Hanning, Blackman, Chebyshev, Blackman-Harris, Kaiser-Bessel, etc., widely used in the signal processing applications. Field Programmable Gate Arrays (FPGAs) offers great benefits like instantaneous implementation, dynamic reconfiguration, design, and field programmability. This reconfiguration makes FPGAs a better solution over custom-made integrated circuits. This work aims at demonstrating a reasonably flexible implementation of FM-linear signal and pulse compression using Matlab, Simulink, and System Generator. Employing FPGA and mentioned software we have proposed the pulse compression design on FPGA using classical and novel windows technique to reduce the side-lobes level. This permits increasing the detection ability of the small or nearly placed targets in imaging radar. The advantage of FPGA that can do parallelism in real time processing permits to realize the proposed algorithms. The paper also presents the experimental results of proposed windowing procedure in the marine radar with such the parameters: signal is linear FM (Chirp); frequency deviation DF is 9.375MHz; the pulse width T is 3.2μs taps number in the matched filter is 800 taps; sampling frequency 253.125*106 MHz. It has been realized the reducing of side-lobes levels in real time permitting better resolution of the small targets.
NASA Astrophysics Data System (ADS)
Rosu-Hamzescu, Mihnea; Polonschii, Cristina; Oprea, Sergiu; Popescu, Dragos; David, Sorin; Bratu, Dumitru; Gheorghiu, Eugen
2018-06-01
Electro-optical measurements, i.e., optical waveguides and plasmonic based electrochemical impedance spectroscopy (P-EIS), are based on the sensitive dependence of refractive index of electro-optical sensors on surface charge density, modulated by an AC electrical field applied to the sensor surface. Recently, P-EIS has emerged as a new analytical tool that can resolve local impedance with high, optical spatial resolution, without using microelectrodes. This study describes a high speed image acquisition and processing system for electro-optical measurements, based on a high speed complementary metal-oxide semiconductor (CMOS) sensor and a field-programmable gate array (FPGA) board. The FPGA is used to configure CMOS parameters, as well as to receive and locally process the acquired images by performing Fourier analysis for each pixel, deriving the real and imaginary parts of the Fourier coefficients for the AC field frequencies. An AC field generator, for single or multi-sine signals, is synchronized with the high speed acquisition system for phase measurements. The system was successfully used for real-time angle-resolved electro-plasmonic measurements from 30 Hz up to 10 kHz, providing results consistent to ones obtained by a conventional electrical impedance approach. The system was able to detect amplitude variations with a relative variation of ±1%, even for rather low sampling rates per period (i.e., 8 samples per period). The PC (personal computer) acquisition and control software allows synchronized acquisition for multiple FPGA boards, making it also suitable for simultaneous angle-resolved P-EIS imaging.
Full image-processing pipeline in field-programmable gate array for a small endoscopic camera
NASA Astrophysics Data System (ADS)
Mostafa, Sheikh Shanawaz; Sousa, L. Natércia; Ferreira, Nuno Fábio; Sousa, Ricardo M.; Santos, Joao; Wäny, Martin; Morgado-Dias, F.
2017-01-01
Endoscopy is an imaging procedure used for diagnosis as well as for some surgical purposes. The camera used for the endoscopy should be small and able to produce a good quality image or video, to reduce discomfort of the patients, and to increase the efficiency of the medical team. To achieve these fundamental goals, a small endoscopy camera with a footprint of 1 mm×1 mm×1.65 mm is used. Due to the physical properties of the sensors and human vision system limitations, different image-processing algorithms, such as noise reduction, demosaicking, and gamma correction, among others, are needed to faithfully reproduce the image or video. A full image-processing pipeline is implemented using a field-programmable gate array (FPGA) to accomplish a high frame rate of 60 fps with minimum processing delay. Along with this, a viewer has also been developed to display and control the image-processing pipeline. The control and data transfer are done by a USB 3.0 end point in the computer. The full developed system achieves real-time processing of the image and fits in a Xilinx Spartan-6LX150 FPGA.
A front-end readout Detector Board for the OpenPET electronics system
NASA Astrophysics Data System (ADS)
Choong, W.-S.; Abu-Nimeh, F.; Moses, W. W.; Peng, Q.; Vu, C. Q.; Wu, J.-Y.
2015-08-01
We present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, which allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is ``time stamped'' by a time-to-digital converter (TDC) implemented inside the FPGA . This digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.
A front-end readout Detector Board for the OpenPET electronics system
Choong, W. -S.; Abu-Nimeh, F.; Moses, W. W.; ...
2015-08-12
Here, we present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, whichmore » allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. In conclusion, this digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.« less
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Li, Bingyi; Chen, Liang; Yu, Wenyue; Xie, Yizhuang; Bian, Mingming; Zhang, Qingjun; Pang, Long
2018-01-01
With the development of satellite load technology and very large-scale integrated (VLSI) circuit technology, on-board real-time synthetic aperture radar (SAR) imaging systems have facilitated rapid response to disasters. A key goal of the on-board SAR imaging system design is to achieve high real-time processing performance under severe size, weight, and power consumption constraints. This paper presents a multi-node prototype system for real-time SAR imaging processing. We decompose the commonly used chirp scaling (CS) SAR imaging algorithm into two parts according to the computing features. The linearization and logic-memory optimum allocation methods are adopted to realize the nonlinear part in a reconfigurable structure, and the two-part bandwidth balance method is used to realize the linear part. Thus, float-point SAR imaging processing can be integrated into a single Field Programmable Gate Array (FPGA) chip instead of relying on distributed technologies. A single-processing node requires 10.6 s and consumes 17 W to focus on 25-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384. The design methodology of the multi-FPGA parallel accelerating system under the real-time principle is introduced. As a proof of concept, a prototype with four processing nodes and one master node is implemented using a Xilinx xc6vlx315t FPGA. The weight and volume of one single machine are 10 kg and 32 cm × 24 cm × 20 cm, respectively, and the power consumption is under 100 W. The real-time performance of the proposed design is demonstrated on Chinese Gaofen-3 stripmap continuous imaging. PMID:29495637
Design of video processing and testing system based on DSP and FPGA
NASA Astrophysics Data System (ADS)
Xu, Hong; Lv, Jun; Chen, Xi'ai; Gong, Xuexia; Yang, Chen'na
2007-12-01
Based on high speed Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA), a video capture, processing and display system is presented, which is of miniaturization and low power. In this system, a triple buffering scheme was used for the capture and display, so that the application can always get a new buffer without waiting; The Digital Signal Processor has an image process ability and it can be used to test the boundary of workpiece's image. A video graduation technology is used to aim at the position which is about to be tested, also, it can enhance the system's flexibility. The character superposition technology realized by DSP is used to display the test result on the screen in character format. This system can process image information in real time, ensure test precision, and help to enhance product quality and quality management.
An FPGA-Based Rapid Wheezing Detection System
Lin, Bor-Shing; Yen, Tian-Shiue
2014-01-01
Wheezing is often treated as a crucial indicator in the diagnosis of obstructive pulmonary diseases. A rapid wheezing detection system may help physicians to monitor patients over the long-term. In this study, a portable wheezing detection system based on a field-programmable gate array (FPGA) is proposed. This system accelerates wheezing detection, and can be used as either a single-process system, or as an integrated part of another biomedical signal detection system. The system segments sound signals into 2-second units. A short-time Fourier transform was used to determine the relationship between the time and frequency components of wheezing sound data. A spectrogram was processed using 2D bilateral filtering, edge detection, multithreshold image segmentation, morphological image processing, and image labeling, to extract wheezing features according to computerized respiratory sound analysis (CORSA) standards. These features were then used to train the support vector machine (SVM) and build the classification models. The trained model was used to analyze sound data to detect wheezing. The system runs on a Xilinx Virtex-6 FPGA ML605 platform. The experimental results revealed that the system offered excellent wheezing recognition performance (0.912). The detection process can be used with a clock frequency of 51.97 MHz, and is able to perform rapid wheezing classification. PMID:24481034
Packet based serial link realized in FPGA dedicated for high resolution infrared image transmission
NASA Astrophysics Data System (ADS)
Bieszczad, Grzegorz
2015-05-01
In article the external digital interface specially designed for thermographic camera built in Military University of Technology is described. The aim of article is to illustrate challenges encountered during design process of thermal vision camera especially related to infrared data processing and transmission. Article explains main requirements for interface to transfer Infra-Red or Video digital data and describes the solution which we elaborated based on Low Voltage Differential Signaling (LVDS) physical layer and signaling scheme. Elaborated link for image transmission is built using FPGA integrated circuit with built-in high speed serial transceivers achieving up to 2500Gbps throughput. Image transmission is realized using proprietary packet protocol. Transmission protocol engine was described in VHDL language and tested in FPGA hardware. The link is able to transmit 1280x1024@60Hz 24bit video data using one signal pair. Link was tested to transmit thermal-vision camera picture to remote monitor. Construction of dedicated video link allows to reduce power consumption compared to solutions with ASIC based encoders and decoders realizing video links like DVI or packed based Display Port, with simultaneous reduction of wires needed to establish link to one pair. Article describes functions of modules integrated in FPGA design realizing several functions like: synchronization to video source, video stream packeting, interfacing transceiver module and dynamic clock generation for video standard conversion.
NASA Astrophysics Data System (ADS)
Naldi, G.; Bartolini, M.; Mattana, A.; Pupillo, G.; Hickish, J.; Foster, G.; Bianchi, G.; Lingua, A.; Monari, J.; Montebugnoli, S.; Perini, F.; Rusticelli, S.; Schiaffino, M.; Virone, G.; Zarb Adami, K.
In radio astronomy Field Programmable Gate Array (FPGA) technology is largely used for the implementation of digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption and cost offered by FPGA chip compared to other technologies like ASIC, GPU and CPU. In the last years several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments like FX correlator, direct imager, beamformer, multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper we present the gained experience in this kind of applications.
A real-time multi-scale 2D Gaussian filter based on FPGA
NASA Astrophysics Data System (ADS)
Luo, Haibo; Gai, Xingqin; Chang, Zheng; Hui, Bin
2014-11-01
Multi-scale 2-D Gaussian filter has been widely used in feature extraction (e.g. SIFT, edge etc.), image segmentation, image enhancement, image noise removing, multi-scale shape description etc. However, their computational complexity remains an issue for real-time image processing systems. Aimed at this problem, we propose a framework of multi-scale 2-D Gaussian filter based on FPGA in this paper. Firstly, a full-hardware architecture based on parallel pipeline was designed to achieve high throughput rate. Secondly, in order to save some multiplier, the 2-D convolution is separated into two 1-D convolutions. Thirdly, a dedicate first in first out memory named as CAFIFO (Column Addressing FIFO) was designed to avoid the error propagating induced by spark on clock. Finally, a shared memory framework was designed to reduce memory costs. As a demonstration, we realized a 3 scales 2-D Gaussian filter on a single ALTERA Cyclone III FPGA chip. Experimental results show that, the proposed framework can computing a Multi-scales 2-D Gaussian filtering within one pixel clock period, is further suitable for real-time image processing. Moreover, the main principle can be popularized to the other operators based on convolution, such as Gabor filter, Sobel operator and so on.
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
NASA Astrophysics Data System (ADS)
Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery
2016-10-01
This paper describes the implementation of the orientation estimation algorithm in FPGA-based vision system. An approach to estimate an orientation of objects lacking axial symmetry is proposed. Suggested algorithm is intended to estimate orientation of a specific known 3D object based on object 3D model. The proposed orientation estimation algorithm consists of two stages: learning and estimation. Learning stage is devoted to the exploring of studied object. Using 3D model we can gather set of training images by capturing 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage is focusing on matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.
Logic design and implementation of FPGA for a high frame rate ultrasound imaging system
NASA Astrophysics Data System (ADS)
Liu, Anjun; Wang, Jing; Lu, Jian-Yu
2002-05-01
Recently, a method has been developed for high frame rate medical imaging [Jian-yu Lu, ``2D and 3D high frame rate imaging with limited diffraction beams,'' IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44(4), 839-856 (1997)]. To realize this method, a complicated system [multiple-channel simultaneous data acquisition, large memory in each channel for storing up to 16 seconds of data at 40 MHz and 12-bit resolution, time-variable-gain (TGC) control, Doppler imaging, harmonic imaging, as well as coded transmissions] is designed. Due to the complexity of the system, field programmable gate array (FPGA) (Xilinx Spartn II) is used. In this presentation, the design and implementation of the FPGA for the system will be reported. This includes the synchronous dynamic random access memory (SDRAM) controller and other system controllers, time sharing for auto-refresh of SDRAMs to reduce peak power, transmission and imaging modality selections, ECG data acquisition and synchronization, 160 MHz delay locked loop (DLL) for accurate timing, and data transfer via either a parallel port or a PCI bus for post image processing. [Work supported in part by Grant 5RO1 HL60301 from NIH.
An improved non-uniformity correction algorithm and its hardware implementation on FPGA
NASA Astrophysics Data System (ADS)
Rong, Shenghui; Zhou, Huixin; Wen, Zhigang; Qin, Hanlin; Qian, Kun; Cheng, Kuanhong
2017-09-01
The Non-uniformity of Infrared Focal Plane Arrays (IRFPA) severely degrades the infrared image quality. An effective non-uniformity correction (NUC) algorithm is necessary for an IRFPA imaging and application system. However traditional scene-based NUC algorithm suffers the image blurring and artificial ghosting. In addition, few effective hardware platforms have been proposed to implement corresponding NUC algorithms. Thus, this paper proposed an improved neural-network based NUC algorithm by the guided image filter and the projection-based motion detection algorithm. First, the guided image filter is utilized to achieve the accurate desired image to decrease the artificial ghosting. Then a projection-based moving detection algorithm is utilized to determine whether the correction coefficients should be updated or not. In this way the problem of image blurring can be overcome. At last, an FPGA-based hardware design is introduced to realize the proposed NUC algorithm. A real and a simulated infrared image sequences are utilized to verify the performance of the proposed algorithm. Experimental results indicated that the proposed NUC algorithm can effectively eliminate the fix pattern noise with less image blurring and artificial ghosting. The proposed hardware design takes less logic elements in FPGA and spends less clock cycles to process one frame of image.
SAD5 Stereo Correlation Line-Striping in an FPGA
NASA Technical Reports Server (NTRS)
Villalpando, Carlos Y.; Morfopoulos, Arin C.
2011-01-01
High precision SAD5 stereo computations can be performed in an FPGA (field-programmable gate array) at much higher speeds than possible in a conventional CPU (central processing unit), but this uses large amounts of FPGA resources that scale with image size. Of the two key resources in an FPGA, Slices and BRAM (block RAM), Slices scale linearly in the new algorithm with image size, and BRAM scales quadratically with image size. An approach was developed to trade latency for BRAM by sub-windowing the image vertically into overlapping strips and stitching the outputs together to create a single continuous disparity output. In stereo, the general rule of thumb is that the disparity search range must be 1/10 the image size. In the new algorithm, BRAM usage scales linearly with disparity search range and scales again linearly with line width. So a doubling of image size, say from 640 to 1,280, would in the previous design be an effective 4 of BRAM usage: 2 for line width, 2 again for disparity search range. The minimum strip size is twice the search range, and will produce an output strip width equal to the disparity search range. So assuming a disparity search range of 1/10 image width, 10 sequential runs of the minimum strip size would produce a full output image. This approach allowed the innovators to fit 1280 960 wide SAD5 stereo disparity in less than 80 BRAM, 52k Slices on a Virtex 5LX330T, 25% and 24% of resources, respectively. Using a 100-MHz clock, this build would perform stereo at 39 Hz. Of particular interest to JPL is that there is a flight qualified version of the Virtex 5: this could produce stereo results even for very large image sizes at 3 orders of magnitude faster than could be computed on the PowerPC 750 flight computer. The work covered in the report allows the stereo algorithm to run on much larger images than before, and using much less BRAM. This opens up choices for a smaller flight FPGA (which saves power and space), or for other algorithms in addition to SAD5 to be run on the same FPGA.
Yang, Fan; Paindavoine, M
2003-01-01
This paper describes a real time vision system that allows us to localize faces in video sequences and verify their identity. These processes are image processing techniques based on the radial basis function (RBF) neural network approach. The robustness of this system has been evaluated quantitatively on eight video sequences. We have adapted our model for an application of face recognition using the Olivetti Research Laboratory (ORL), Cambridge, UK, database so as to compare the performance against other systems. We also describe three hardware implementations of our model on embedded systems based on the field programmable gate array (FPGA), zero instruction set computer (ZISC) chips, and digital signal processor (DSP) TMS320C62, respectively. We analyze the algorithm complexity and present results of hardware implementations in terms of the resources used and processing speed. The success rates of face tracking and identity verification are 92% (FPGA), 85% (ZISC), and 98.2% (DSP), respectively. For the three embedded systems, the processing speeds for images size of 288 /spl times/ 352 are 14 images/s, 25 images/s, and 4.8 images/s, respectively.
Hardware Implementation of a Bilateral Subtraction Filter
NASA Technical Reports Server (NTRS)
Huertas, Andres; Watson, Robert; Villalpando, Carlos; Goldberg, Steven
2009-01-01
A bilateral subtraction filter has been implemented as a hardware module in the form of a field-programmable gate array (FPGA). In general, a bilateral subtraction filter is a key subsystem of a high-quality stereoscopic machine vision system that utilizes images that are large and/or dense. Bilateral subtraction filters have been implemented in software on general-purpose computers, but the processing speeds attainable in this way even on computers containing the fastest processors are insufficient for real-time applications. The present FPGA bilateral subtraction filter is intended to accelerate processing to real-time speed and to be a prototype of a link in a stereoscopic-machine- vision processing chain, now under development, that would process large and/or dense images in real time and would be implemented in an FPGA. In terms that are necessarily oversimplified for the sake of brevity, a bilateral subtraction filter is a smoothing, edge-preserving filter for suppressing low-frequency noise. The filter operation amounts to replacing the value for each pixel with a weighted average of the values of that pixel and the neighboring pixels in a predefined neighborhood or window (e.g., a 9 9 window). The filter weights depend partly on pixel values and partly on the window size. The present FPGA implementation of a bilateral subtraction filter utilizes a 9 9 window. This implementation was designed to take advantage of the ability to do many of the component computations in parallel pipelines to enable processing of image data at the rate at which they are generated. The filter can be considered to be divided into the following parts (see figure): a) An image pixel pipeline with a 9 9- pixel window generator, b) An array of processing elements; c) An adder tree; d) A smoothing-and-delaying unit; and e) A subtraction unit. After each 9 9 window is created, the affected pixel data are fed to the processing elements. Each processing element is fed the pixel value for its position in the window as well as the pixel value for the central pixel of the window. The absolute difference between these two pixel values is calculated and used as an address in a lookup table. Each processing element has a lookup table, unique for its position in the window, containing the weight coefficients for the Gaussian function for that position. The pixel value is multiplied by the weight, and the outputs of the processing element are the weight and pixel-value weight product. The products and weights are fed to the adder tree. The sum of the products and the sum of the weights are fed to the divider, which computes the sum of products the sum of weights. The output of the divider is denoted the bilateral smoothed image. The smoothing function is a simple weighted average computed over a 3 3 subwindow centered in the 9 9 window. After smoothing, the image is delayed by an additional amount of time needed to match the processing time for computing the bilateral smoothed image. The bilateral smoothed image is then subtracted from the 3 3 smoothed image to produce the final output. The prototype filter as implemented in a commercially available FPGA processes one pixel per clock cycle. Operation at a clock speed of 66 MHz has been demonstrated, and results of a static timing analysis have been interpreted as suggesting that the clock speed could be increased to as much as 100 MHz.
An FPGA-based heterogeneous image fusion system design method
NASA Astrophysics Data System (ADS)
Song, Le; Lin, Yu-chi; Chen, Yan-hua; Zhao, Mei-rong
2011-08-01
Taking the advantages of FPGA's low cost and compact structure, an FPGA-based heterogeneous image fusion platform is established in this study. Altera's Cyclone IV series FPGA is adopted as the core processor of the platform, and the visible light CCD camera and infrared thermal imager are used as the image-capturing device in order to obtain dualchannel heterogeneous video images. Tailor-made image fusion algorithms such as gray-scale weighted averaging, maximum selection and minimum selection methods are analyzed and compared. VHDL language and the synchronous design method are utilized to perform a reliable RTL-level description. Altera's Quartus II 9.0 software is applied to simulate and implement the algorithm modules. The contrast experiments of various fusion algorithms show that, preferably image quality of the heterogeneous image fusion can be obtained on top of the proposed system. The applied range of the different fusion algorithms is also discussed.
Real-time FPGA-based radar imaging for smart mobility systems
NASA Astrophysics Data System (ADS)
Saponara, Sergio; Neri, Bruno
2016-04-01
The paper presents an X-band FMCW (Frequency Modulated Continuous Wave) Radar Imaging system, called X-FRI, for surveillance in smart mobility applications. X-FRI allows for detecting the presence of targets (e.g. obstacles in a railway crossing or urban road crossing, or ships in a small harbor), as well as their speed and their position. With respect to alternative solutions based on LIDAR or camera systems, X-FRI operates in real-time also in bad lighting and weather conditions, night and day. The radio-frequency transceiver is realized through COTS (Commercial Off The Shelf) components on a single-board. An FPGA-based baseband platform allows for real-time Radar image processing.
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe
2017-01-01
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l’information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N-th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work. PMID:28718788
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe; Thom, Christian
2017-07-18
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l'information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N -th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work.
An embedded processor for real-time atmoshperic compensation
NASA Astrophysics Data System (ADS)
Bodnar, Michael R.; Curt, Petersen F.; Ortiz, Fernando E.; Carrano, Carmen J.; Kelmelis, Eric J.
2009-05-01
Imaging over long distances is crucial to a number of defense and security applications, such as homeland security and launch tracking. However, the image quality obtained from current long-range optical systems can be severely degraded by the turbulent atmosphere in the path between the region under observation and the imager. While this obscured image information can be recovered using post-processing techniques, the computational complexity of such approaches has prohibited deployment in real-time scenarios. To overcome this limitation, we have coupled a state-of-the-art atmospheric compensation algorithm, the average-bispectrum speckle method, with a powerful FPGA-based embedded processing board. The end result is a light-weight, lower-power image processing system that improves the quality of long-range imagery in real-time, and uses modular video I/O to provide a flexible interface to most common digital and analog video transport methods. By leveraging the custom, reconfigurable nature of the FPGA, a 20x speed increase over a modern desktop PC was achieved in a form-factor that is compact, low-power, and field-deployable.
Hardware-based image processing for high-speed inspection of grains
USDA-ARS?s Scientific Manuscript database
A high-speed, low-cost, image-based sorting device was developed to detect and separate grains with slight color differences and small defects on grains The device directly combines a complementary metal–oxide–semiconductor (CMOS) color image sensor with a field-programmable gate array (FPGA) which...
A Control System and Streaming DAQ Platform with Image-Based Trigger for X-ray Imaging
NASA Astrophysics Data System (ADS)
Stevanovic, Uros; Caselle, Michele; Cecilia, Angelica; Chilingaryan, Suren; Farago, Tomas; Gasilov, Sergey; Herth, Armin; Kopmann, Andreas; Vogelgesang, Matthias; Balzer, Matthias; Baumbach, Tilo; Weber, Marc
2015-06-01
High-speed X-ray imaging applications play a crucial role for non-destructive investigations of the dynamics in material science and biology. On-line data analysis is necessary for quality assurance and data-driven feedback, leading to a more efficient use of a beam time and increased data quality. In this article we present a smart camera platform with embedded Field Programmable Gate Array (FPGA) processing that is able to stream and process data continuously in real-time. The setup consists of a Complementary Metal-Oxide-Semiconductor (CMOS) sensor, an FPGA readout card, and a readout computer. It is seamlessly integrated in a new custom experiment control system called Concert that provides a more efficient way of operating a beamline by integrating device control, experiment process control, and data analysis. The potential of the embedded processing is demonstrated by implementing an image-based trigger. It records the temporal evolution of physical events with increased speed while maintaining the full field of view. The complete data acquisition system, with Concert and the smart camera platform was successfully integrated and used for fast X-ray imaging experiments at KIT's synchrotron radiation facility ANKA.
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Werne, Thomas A.; Bekker, Dmitriy L.; Wilson, Thor O.
2011-01-01
The Xilinx Virtex-5QV is a new Single-event Immune Reconfigurable FPGA (SIRF) device that is targeted as the spaceborne processor for the NASA Decadal Survey Aerosol-Cloud-Ecosystem (ACE) mission's Multiangle SpectroPolarimetric Imager (MSPI) instrument, currently under development at JPL. A key technology needed for MSPI is on-board processing (OBP) to calculate polarimetry data as imaged by each of the 9 cameras forming the instrument. With funding from NASA's ESTO1 AIST2 Program, JPL is demonstrating how signal data at 95 Mbytes/sec over 16 channels for each of the 9 multi-angle cameras can be reduced to 0.45 Mbytes/sec, thereby substantially reducing the image data volume for spacecraft downlink without loss of science information. This is done via a least-squares fitting algorithm implemented on the Virtex-5 FPGA operating in real-time on the raw video data stream.
High-performance camera module for fast quality inspection in industrial printing applications
NASA Astrophysics Data System (ADS)
Fürtler, Johannes; Bodenstorfer, Ernst; Mayer, Konrad J.; Brodersen, Jörg; Heiss, Dorothea; Penz, Harald; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert
2007-02-01
Today, printing products which must meet highest quality standards, e.g., banknotes, stamps, or vouchers, are automatically checked by optical inspection systems. Typically, the examination of fine details of the print or security features demands images taken from various perspectives, with different spectral sensitivity (visible, infrared, ultraviolet), and with high resolution. Consequently, the inspection system is equipped with several cameras and has to cope with an enormous data rate to be processed in real-time. Hence, it is desirable to move image processing tasks into the camera to reduce the amount of data which has to be transferred to the (central) image processing system. The idea is to transfer relevant information only, i.e., features of the image instead of the raw image data from the sensor. These features are then further processed. In this paper a color line-scan camera for line rates up to 100 kHz is presented. The camera is based on a commercial CMOS (complementary metal oxide semiconductor) area image sensor and a field programmable gate array (FPGA). It implements extraction of image features which are well suited to detect print flaws like blotches of ink, color smears, splashes, spots and scratches. The camera design and several image processing methods implemented on the FPGA are described, including flat field correction, compensation of geometric distortions, color transformation, as well as decimation and neighborhood operations.
Compact FPGA-based beamformer using oversampled 1-bit A/D converters.
Tomov, Borislav Gueorguiev; Jensen, Jørgen Arendt
2005-05-01
A compact medical ultrasound beamformer architecture that uses oversampled 1-bit analog-to-digital (A/D) converters is presented. Sparse sample processing is used, as the echo signal for the image lines is reconstructed in 512 equidistant focal points along the line through its in-phase and quadrature components. That information is sufficient for presenting a B-mode image and creating a color flow map. The high sampling rate provides the necessary delay resolution for the focusing. The low channel data width (1-bit) makes it possible to construct a compact beamformer logic. The signal reconstruction is done using finite impulse reponse (FIR) filters, applied on selected bit sequences of the delta-sigma modulator output stream. The approach allows for a multichannel beamformer to fit in a single field programmable gate array (FPGA) device. A 32-channel beamformer is estimated to occupy 50% of the available logic resources in a commercially available mid-range FPGA, and to be able to operate at 129 MHz. Simulation of the architecture at 140 MHz provides images with a dynamic range approaching 60 dB for an excitation frequency of 3 MHz.
Design of low noise imaging system
NASA Astrophysics Data System (ADS)
Hu, Bo; Chen, Xiaolai
2017-10-01
In order to meet the needs of engineering applications for low noise imaging system under the mode of global shutter, a complete imaging system is designed based on the SCMOS (Scientific CMOS) image sensor CIS2521F. The paper introduces hardware circuit and software system design. Based on the analysis of key indexes and technologies about the imaging system, the paper makes chips selection and decides SCMOS + FPGA+ DDRII+ Camera Link as processing architecture. Then it introduces the entire system workflow and power supply and distribution unit design. As for the software system, which consists of the SCMOS control module, image acquisition module, data cache control module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The imaging experimental results show that the imaging system exhibits a 2560*2160 pixel resolution, has a maximum frame frequency of 50 fps. The imaging quality of the system satisfies the requirement of the index.
NASA Astrophysics Data System (ADS)
Curt, Petersen F.; Bodnar, Michael R.; Ortiz, Fernando E.; Carrano, Carmen J.; Kelmelis, Eric J.
2009-02-01
While imaging over long distances is critical to a number of security and defense applications, such as homeland security and launch tracking, current optical systems are limited in resolving power. This is largely a result of the turbulent atmosphere in the path between the region under observation and the imaging system, which can severely degrade captured imagery. There are a variety of post-processing techniques capable of recovering this obscured image information; however, the computational complexity of such approaches has prohibited real-time deployment and hampers the usability of these technologies in many scenarios. To overcome this limitation, we have designed and manufactured an embedded image processing system based on commodity hardware which can compensate for these atmospheric disturbances in real-time. Our system consists of a reformulation of the average bispectrum speckle method coupled with a high-end FPGA processing board, and employs modular I/O capable of interfacing with most common digital and analog video transport methods (composite, component, VGA, DVI, SDI, HD-SDI, etc.). By leveraging the custom, reconfigurable nature of the FPGA, we have achieved performance twenty times faster than a modern desktop PC, in a form-factor that is compact, low-power, and field-deployable.
Circuit design of an EMCCD camera
NASA Astrophysics Data System (ADS)
Li, Binhua; Song, Qian; Jin, Jianhui; He, Chun
2012-07-01
EMCCDs have been used in the astronomical observations in many ways. Recently we develop a camera using an EMCCD TX285. The CCD chip is cooled to -100°C in an LN2 dewar. The camera controller consists of a driving board, a control board and a temperature control board. Power supplies and driving clocks of the CCD are provided by the driving board, the timing generator is located in the control board. The timing generator and an embedded Nios II CPU are implemented in an FPGA. Moreover the ADC and the data transfer circuit are also in the control board, and controlled by the FPGA. The data transfer between the image workstation and the camera is done through a Camera Link frame grabber. The software of image acquisition is built using VC++ and Sapera LT. This paper describes the camera structure, the main components and circuit design for video signal processing channel, clock driver, FPGA and Camera Link interfaces, temperature metering and control system. Some testing results are presented.
NASA Astrophysics Data System (ADS)
Jin, Minglei; Jin, Weiqi; Li, Yiyang; Li, Shuo
2015-08-01
In this paper, we propose a novel scene-based non-uniformity correction algorithm for infrared image processing-temporal high-pass non-uniformity correction algorithm based on grayscale mapping (THP and GM). The main sources of non-uniformity are: (1) detector fabrication inaccuracies; (2) non-linearity and variations in the read-out electronics and (3) optical path effects. The non-uniformity will be reduced by non-uniformity correction (NUC) algorithms. The NUC algorithms are often divided into calibration-based non-uniformity correction (CBNUC) algorithms and scene-based non-uniformity correction (SBNUC) algorithms. As non-uniformity drifts temporally, CBNUC algorithms must be repeated by inserting a uniform radiation source which SBNUC algorithms do not need into the view, so the SBNUC algorithm becomes an essential part of infrared imaging system. The SBNUC algorithms' poor robustness often leads two defects: artifacts and over-correction, meanwhile due to complicated calculation process and large storage consumption, hardware implementation of the SBNUC algorithms is difficult, especially in Field Programmable Gate Array (FPGA) platform. The THP and GM algorithm proposed in this paper can eliminate the non-uniformity without causing defects. The hardware implementation of the algorithm only based on FPGA has two advantages: (1) low resources consumption, and (2) small hardware delay: less than 20 lines, it can be transplanted to a variety of infrared detectors equipped with FPGA image processing module, it can reduce the stripe non-uniformity and the ripple non-uniformity.
Real-time laser cladding control with variable spot size
NASA Astrophysics Data System (ADS)
Arias, J. L.; Montealegre, M. A.; Vidal, F.; Rodríguez, J.; Mann, S.; Abels, P.; Motmans, F.
2014-03-01
Laser cladding processing has been used in different industries to improve the surface properties or to reconstruct damaged pieces. In order to cover areas considerably larger than the diameter of the laser beam, successive partially overlapping tracks are deposited. With no control over the process variables this conduces to an increase of the temperature, which could decrease mechanical properties of the laser cladded material. Commonly, the process is monitored and controlled by a PC using cameras, but this control suffers from a lack of speed caused by the image processing step. The aim of this work is to design and develop a FPGA-based laser cladding control system. This system is intended to modify the laser beam power according to the melt pool width, which is measured using a CMOS camera. All the control and monitoring tasks are carried out by a FPGA, taking advantage of its abundance of resources and speed of operation. The robustness of the image processing algorithm is assessed, as well as the control system performance. Laser power is decreased as substrate temperature increases, thus maintaining a constant clad width. This FPGA-based control system is integrated in an adaptive laser cladding system, which also includes an adaptive optical system that will control the laser focus distance on the fly. The whole system will constitute an efficient instrument for part repair with complex geometries and coating selective surfaces. This will be a significant step forward into the total industrial implementation of an automated industrial laser cladding process.
NASA Astrophysics Data System (ADS)
Wright, Adam A.; Momin, Orko; Shin, Young Ho; Shakya, Rahul; Nepal, Kumud; Ahlgren, David J.
2010-01-01
This paper presents the application of a distributed systems architecture to an autonomous ground vehicle, Q, that participates in both the autonomous and navigation challenges of the Intelligent Ground Vehicle Competition. In the autonomous challenge the vehicle is required to follow a course, while avoiding obstacles and staying within the course boundaries, which are marked by white lines. For the navigation challenge, the vehicle is required to reach a set of target destinations, known as way points, with given GPS coordinates and avoid obstacles that it encounters in the process. Previously the vehicle utilized a single laptop to execute all processing activities including image processing, sensor interfacing and data processing, path planning and navigation algorithms and motor control. National Instruments' (NI) LabVIEW served as the programming language for software implementation. As an upgrade to last year's design, a NI compact Reconfigurable Input/Output system (cRIO) was incorporated to the system architecture. The cRIO is NI's solution for rapid prototyping that is equipped with a real time processor, an FPGA and modular input/output. Under the current system, the real time processor handles the path planning and navigation algorithms, the FPGA gathers and processes sensor data. This setup leaves the laptop to focus on running the image processing algorithm. Image processing as previously presented by Nepal et. al. is a multi-step line extraction algorithm and constitutes the largest processor load. This distributed approach results in a faster image processing algorithm which was previously Q's bottleneck. Additionally, the path planning and navigation algorithms are executed more reliably on the real time processor due to the deterministic nature of operation. The implementation of this architecture required exploration of various inter-system communication techniques. Data transfer between the laptop and the real time processor using UDP packets was established as the most reliable protocol after testing various options. Improvement can be made to the system by migrating more algorithms to the hardware based FPGA to further speed up the operations of the vehicle.
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
FPGA-Based Pulse Pile-Up Correction With Energy and Timing Recovery.
Haselman, M D; Pasko, J; Hauck, S; Lewellen, T K; Miyaoka, R S
2012-10-01
Modern field programmable gate arrays (FPGAs) are capable of performing complex discrete signal processing algorithms with clock rates well above 100 MHz. This, combined with FPGA's low expense, ease of use, and selected dedicated hardware make them an ideal technology for a data acquisition system for a positron emission tomography (PET) scanner. The University of Washington is producing a high-resolution, small-animal PET scanner that utilizes FPGAs as the core of the front-end electronics. For this scanner, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilized to add significant signal processing power to produce higher quality images. In this paper we report on an all-digital pulse pile-up correction algorithm that has been developed for the FPGA. The pile-up mitigation algorithm will allow the scanner to run at higher count rates without incurring large data losses due to the overlapping of scintillation signals. This correction technique utilizes a reference pulse to extract timing and energy information for most pile-up events. Using pulses acquired from a Zecotech Photonics MAPD-N with an LFS-3 scintillator, we show that good timing and energy information can be achieved in the presence of pile-up utilizing a moderate amount of FPGA resources.
The sequence measurement system of the IR camera
NASA Astrophysics Data System (ADS)
Geng, Ai-hui; Han, Hong-xia; Zhang, Hai-bo
2011-08-01
Currently, the IR cameras are broadly used in the optic-electronic tracking, optic-electronic measuring, fire control and optic-electronic countermeasure field, but the output sequence of the most presently applied IR cameras in the project is complex and the giving sequence documents from the leave factory are not detailed. Aiming at the requirement that the continuous image transmission and image procession system need the detailed sequence of the IR cameras, the sequence measurement system of the IR camera is designed, and the detailed sequence measurement way of the applied IR camera is carried out. The FPGA programming combined with the SignalTap online observation way has been applied in the sequence measurement system, and the precise sequence of the IR camera's output signal has been achieved, the detailed document of the IR camera has been supplied to the continuous image transmission system, image processing system and etc. The sequence measurement system of the IR camera includes CameraLink input interface part, LVDS input interface part, FPGA part, CameraLink output interface part and etc, thereinto the FPGA part is the key composed part in the sequence measurement system. Both the video signal of the CmaeraLink style and the video signal of LVDS style can be accepted by the sequence measurement system, and because the image processing card and image memory card always use the CameraLink interface as its input interface style, the output signal style of the sequence measurement system has been designed into CameraLink interface. The sequence measurement system does the IR camera's sequence measurement work and meanwhile does the interface transmission work to some cameras. Inside the FPGA of the sequence measurement system, the sequence measurement program, the pixel clock modification, the SignalTap file configuration and the SignalTap online observation has been integrated to realize the precise measurement to the IR camera. Te sequence measurement program written by the verilog language combining the SignalTap tool on line observation can count the line numbers in one frame, pixel numbers in one line and meanwhile account the line offset and row offset of the image. Aiming at the complex sequence of the IR camera's output signal, the sequence measurement system of the IR camera accurately measures the sequence of the project applied camera, supplies the detailed sequence document to the continuous system such as image processing system and image transmission system and gives out the concrete parameters of the fval, lval, pixclk, line offset and row offset. The experiment shows that the sequence measurement system of the IR camera can get the precise sequence measurement result and works stably, laying foundation for the continuous system.
Invited Article: Digital beam-forming imaging riometer systems
NASA Astrophysics Data System (ADS)
Honary, Farideh; Marple, Steve R.; Barratt, Keith; Chapman, Peter; Grill, Martin; Nielsen, Erling
2011-03-01
The design and operation of a new generation of digital imaging riometer systems developed by Lancaster University are presented. In the heart of the digital imaging riometer is a field-programmable gate array (FPGA), which is used for the digital signal processing and digital beam forming, completely replacing the analog Butler matrices which have been used in previous designs. The reconfigurable nature of the FPGA has been exploited to produce tools for remote system testing and diagnosis which have proven extremely useful for operation in remote locations such as the Arctic and Antarctic. Different FPGA programs enable different instrument configurations, including a 4 × 4 antenna filled array (producing 4 × 4 beams), an 8 × 8 antenna filled array (producing 7 × 7 beams), and a Mills cross system utilizing 63 antennas producing 556 usable beams. The concept of using a Mills cross antenna array for riometry has been successfully demonstrated for the first time. The digital beam forming has been validated by comparing the received signal power from cosmic radio sources with results predicted from the theoretical beam radiation pattern. The performances of four digital imaging riometer systems are compared against each other and a traditional imaging riometer utilizing analog Butler matrices. The comparison shows that digital imaging riometer systems, with independent receivers for each antenna, can obtain much better measurement precision for filled arrays or much higher spatial resolution for the Mills cross configuration when compared to existing imaging riometer systems.
Design of embedded endoscopic ultrasonic imaging system
NASA Astrophysics Data System (ADS)
Li, Ming; Zhou, Hao; Wen, Shijie; Chen, Xiodong; Yu, Daoyin
2008-12-01
Endoscopic ultrasonic imaging system is an important component in the endoscopic ultrasonography system (EUS). Through the ultrasonic probe, the characteristics of the fault histology features of digestive organs is detected by EUS, and then received by the reception circuit which making up of amplifying, gain compensation, filtering and A/D converter circuit, in the form of ultrasonic echo. Endoscopic ultrasonic imaging system is the back-end processing system of the EUS, with the function of receiving digital ultrasonic echo modulated by the digestive tract wall from the reception circuit, acquiring and showing the fault histology features in the form of image and characteristic data after digital signal processing, such as demodulation, etc. Traditional endoscopic ultrasonic imaging systems are mainly based on image acquisition and processing chips, which connecting to personal computer with USB2.0 circuit, with the faults of expensive, complicated structure, poor portability, and difficult to popularize. To against the shortcomings above, this paper presents the methods of digital signal acquisition and processing specially based on embedded technology with the core hardware structure of ARM and FPGA for substituting the traditional design with USB2.0 and personal computer. With built-in FIFO and dual-buffer, FPGA implement the ping-pong operation of data storage, simultaneously transferring the image data into ARM through the EBI bus by DMA function, which is controlled by ARM to carry out the purpose of high-speed transmission. The ARM system is being chosen to implement the responsibility of image display every time DMA transmission over and actualizing system control with the drivers and applications running on the embedded operating system Windows CE, which could provide a stable, safe and reliable running platform for the embedded device software. Profiting from the excellent graphical user interface (GUI) and good performance of Windows CE, we can not only clearly show 511×511 pixels ultrasonic echo images through application program, but also provide a simple and friendly operating interface with mouse and touch screen which is more convenient than the traditional endoscopic ultrasonic imaging system. Including core and peripheral circuits of FPGA and ARM, power network circuit and LCD display circuit, we designed the whole embedded system, achieving the desired purpose by implementing ultrasonic image display properly after the experimental verification, solving the problem of hugeness and complexity of the traditional endoscopic ultrasonic imaging system.
Implementation of a pulse coupled neural network in FPGA.
Waldemark, J; Millberg, M; Lindblad, T; Waldemark, K; Becanovic, V
2000-06-01
The Pulse Coupled neural network, PCNN, is a biologically inspired neural net and it can be used in various image analysis applications, e.g. time-critical applications in the field of image pre-processing like segmentation, filtering, etc. a VHDL implementation of the PCNN targeting FPGA was undertaken and the results presented here. The implementation contains many interesting features. By pipelining the PCNN structure a very high throughput of 55 million neuron iterations per second could be achieved. By making the coefficients re-configurable during operation, a complete recognition system could be implemented on one, or maybe two, chip(s). Reconsidering the ranges and resolutions of the constants may save a lot of hardware, since the higher resolution requires larger multipliers, adders, memories etc.
A design of camera simulator for photoelectric image acquisition system
NASA Astrophysics Data System (ADS)
Cai, Guanghui; Liu, Wen; Zhang, Xin
2015-02-01
In the process of developing the photoelectric image acquisition equipment, it needs to verify the function and performance. In order to make the photoelectric device recall the image data formerly in the process of debugging and testing, a design scheme of the camera simulator is presented. In this system, with FPGA as the control core, the image data is saved in NAND flash trough USB2.0 bus. Due to the access rate of the NAND, flash is too slow to meet the requirement of the sytsem, to fix the problem, the pipeline technique and the High-Band-Buses technique are applied in the design to improve the storage rate. It reads image data out from flash in the control logic of FPGA and output separately from three different interface of Camera Link, LVDS and PAL, which can provide image data for photoelectric image acquisition equipment's debugging and algorithm validation. However, because the standard of PAL image resolution is 720*576, the resolution is different between PAL image and input image, so the image can be output after the resolution conversion. The experimental results demonstrate that the camera simulator outputs three format image sequence correctly, which can be captured and displayed by frame gather. And the three-format image data can meet test requirements of the most equipment, shorten debugging time and improve the test efficiency.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-01-01
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-12-15
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
Real-time digital signal processing in multiphoton and time-resolved microscopy
NASA Astrophysics Data System (ADS)
Wilson, Jesse W.; Warren, Warren S.; Fischer, Martin C.
2016-03-01
The use of multiphoton interactions in biological tissue for imaging contrast requires highly sensitive optical measurements. These often involve signal processing and filtering steps between the photodetector and the data acquisition device, such as photon counting and lock-in amplification. These steps can be implemented as real-time digital signal processing (DSP) elements on field-programmable gate array (FPGA) devices, an approach that affords much greater flexibility than commercial photon counting or lock-in devices. We will present progress toward developing two new FPGA-based DSP devices for multiphoton and time-resolved microscopy applications. The first is a high-speed multiharmonic lock-in amplifier for transient absorption microscopy, which is being developed for real-time analysis of the intensity-dependence of melanin, with applications in vivo and ex vivo (noninvasive histopathology of melanoma and pigmented lesions). The second device is a kHz lock-in amplifier running on a low cost (50-200) development platform. It is our hope that these FPGA-based DSP devices will enable new, high-speed, low-cost applications in multiphoton and time-resolved microscopy.
An embedded face-classification system for infrared images on an FPGA
NASA Astrophysics Data System (ADS)
Soto, Javier E.; Figueroa, Miguel
2014-10-01
We present a face-classification architecture for long-wave infrared (IR) images implemented on a Field Programmable Gate Array (FPGA). The circuit is fast, compact and low power, can recognize faces in real time and be embedded in a larger image-processing and computer vision system operating locally on an IR camera. The algorithm uses Local Binary Patterns (LBP) to perform feature extraction on each IR image. First, each pixel in the image is represented as an LBP pattern that encodes the similarity between the pixel and its neighbors. Uniform LBP codes are then used to reduce the number of patterns to 59 while preserving more than 90% of the information contained in the original LBP representation. Then, the image is divided into 64 non-overlapping regions, and each region is represented as a 59-bin histogram of patterns. Finally, the algorithm concatenates all 64 regions to create a 3,776-bin spatially enhanced histogram. We reduce the dimensionality of this histogram using Linear Discriminant Analysis (LDA), which improves clustering and enables us to store an entire database of 53 subjects on-chip. During classification, the circuit applies LBP and LDA to each incoming IR image in real time, and compares the resulting feature vector to each pattern stored in the local database using the Manhattan distance. We implemented the circuit on a Xilinx Artix-7 XC7A100T FPGA and tested it with the UCHThermalFace database, which consists of 28 81 x 150-pixel images of 53 subjects in indoor and outdoor conditions. The circuit achieves a 98.6% hit ratio, trained with 16 images and tested with 12 images of each subject in the database. Using a 100 MHz clock, the circuit classifies 8,230 images per second, and consumes only 309mW.
Pursley, Randall H.; Salem, Ghadi; Devasahayam, Nallathamby; Subramanian, Sankaran; Koscielniak, Janusz; Krishna, Murali C.; Pohida, Thomas J.
2006-01-01
The integration of modern data acquisition and digital signal processing (DSP) technologies with Fourier transform electron paramagnetic resonance (FT-EPR) imaging at radiofrequencies (RF) is described. The FT-EPR system operates at a Larmor frequency (Lf) of 300 MHz to facilitate in vivo studies. This relatively low frequency Lf, in conjunction with our ~10 MHz signal bandwidth, enables the use of direct free induction decay time-locked subsampling (TLSS). This particular technique provides advantages by eliminating the traditional analog intermediate frequency downconversion stage along with the corresponding noise sources. TLSS also results in manageable sample rates that facilitate the design of DSP-based data acquisition and image processing platforms. More specifically, we utilize a high-speed field programmable gate array (FPGA) and a DSP processor to perform advanced real-time signal and image processing. The migration to a DSP-based configuration offers the benefits of improved EPR system performance, as well as increased adaptability to various EPR system configurations (i.e., software configurable systems instead of hardware reconfigurations). The required modifications to the FT-EPR system design are described, with focus on the addition of DSP technologies including the application-specific hardware, software, and firmware developed for the FPGA and DSP processor. The first results of using real-time DSP technologies in conjunction with direct detection bandpass sampling to implement EPR imaging at RF frequencies are presented. PMID:16243552
A computational approach to real-time image processing for serial time-encoded amplified microscopy
NASA Astrophysics Data System (ADS)
Oikawa, Minoru; Hiyama, Daisuke; Hirayama, Ryuji; Hasegawa, Satoki; Endo, Yutaka; Sugie, Takahisa; Tsumura, Norimichi; Kuroshima, Mai; Maki, Masanori; Okada, Genki; Lei, Cheng; Ozeki, Yasuyuki; Goda, Keisuke; Shimobaba, Tomoyoshi
2016-03-01
High-speed imaging is an indispensable technique, particularly for identifying or analyzing fast-moving objects. The serial time-encoded amplified microscopy (STEAM) technique was proposed to enable us to capture images with a frame rate 1,000 times faster than using conventional methods such as CCD (charge-coupled device) cameras. The application of this high-speed STEAM imaging technique to a real-time system, such as flow cytometry for a cell-sorting system, requires successively processing a large number of captured images with high throughput in real time. We are now developing a high-speed flow cytometer system including a STEAM camera. In this paper, we describe our approach to processing these large amounts of image data in real time. We use an analog-to-digital converter that has up to 7.0G samples/s and 8-bit resolution for capturing the output voltage signal that involves grayscale images from the STEAM camera. Therefore the direct data output from the STEAM camera generates 7.0G byte/s continuously. We provided a field-programmable gate array (FPGA) device as a digital signal pre-processor for image reconstruction and finding objects in a microfluidic channel with high data rates in real time. We also utilized graphics processing unit (GPU) devices for accelerating the calculation speed of identification of the reconstructed images. We built our prototype system, which including a STEAM camera, a FPGA device and a GPU device, and evaluated its performance in real-time identification of small particles (beads), as virtual biological cells, owing through a microfluidic channel.
Method and apparatus for optical encoding with compressible imaging
NASA Technical Reports Server (NTRS)
Leviton, Douglas B. (Inventor)
2006-01-01
The present invention presents an optical encoder with increased conversion rates. Improvement in the conversion rate is a result of combining changes in the pattern recognition encoder's scale pattern with an image sensor readout technique which takes full advantage of those changes, and lends itself to operation by modern, high-speed, ultra-compact microprocessors and digital signal processors (DSP) or field programmable gate array (FPGA) logic elements which can process encoder scale images at the highest speeds. Through these improvements, all three components of conversion time (reciprocal conversion rate)--namely exposure time, image readout time, and image processing time--are minimized.
Low-SWaP coincidence processing for Geiger-mode LIDAR video
NASA Astrophysics Data System (ADS)
Schultz, Steven E.; Cervino, Noel P.; Kurtz, Zachary D.; Brown, Myron Z.
2015-05-01
Photon-counting Geiger-mode lidar detector arrays provide a promising approach for producing three-dimensional (3D) video at full motion video (FMV) data rates, resolution, and image size from long ranges. However, coincidence processing required to filter raw photon counts is computationally expensive, generally requiring significant size, weight, and power (SWaP) and also time. In this paper, we describe a laboratory test-bed developed to assess the feasibility of low-SWaP, real-time processing for 3D FMV based on Geiger-mode lidar. First, we examine a design based on field programmable gate arrays (FPGA) and demonstrate proof-of-concept results. Then we examine a design based on a first-of-its-kind embedded graphical processing unit (GPU) and compare performance with the FPGA. Results indicate feasibility of real-time Geiger-mode lidar processing for 3D FMV and also suggest utility for real-time onboard processing for mapping lidar systems.
Real-time object tracking based on scale-invariant features employing bio-inspired hardware.
Yasukawa, Shinsuke; Okuno, Hirotsugu; Ishii, Kazuo; Yagi, Tetsuya
2016-09-01
We developed a vision sensor system that performs a scale-invariant feature transform (SIFT) in real time. To apply the SIFT algorithm efficiently, we focus on a two-fold process performed by the visual system: whole-image parallel filtering and frequency-band parallel processing. The vision sensor system comprises an active pixel sensor, a metal-oxide semiconductor (MOS)-based resistive network, a field-programmable gate array (FPGA), and a digital computer. We employed the MOS-based resistive network for instantaneous spatial filtering and a configurable filter size. The FPGA is used to pipeline process the frequency-band signals. The proposed system was evaluated by tracking the feature points detected on an object in a video. Copyright © 2016 Elsevier Ltd. All rights reserved.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.
Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang
2017-03-01
Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ ( h ), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h . Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O ( n 2 ) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures
Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang
2016-01-01
Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O (n2) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
Khan, Shadab; Manwaring, Preston; Borsic, Andrea; Halter, Ryan
2015-04-01
Electrical impedance tomography (EIT) is used to image the electrical property distribution of a tissue under test. An EIT system comprises complex hardware and software modules, which are typically designed for a specific application. Upgrading these modules is a time-consuming process, and requires rigorous testing to ensure proper functioning of new modules with the existing ones. To this end, we developed a modular and reconfigurable data acquisition (DAQ) system using National Instruments' (NI) hardware and software modules, which offer inherent compatibility over generations of hardware and software revisions. The system can be configured to use up to 32-channels. This EIT system can be used to interchangeably apply current or voltage signal, and measure the tissue response in a semi-parallel fashion. A novel signal averaging algorithm, and 512-point fast Fourier transform (FFT) computation block was implemented on the FPGA. FFT output bins were classified as signal or noise. Signal bins constitute a tissue's response to a pure or mixed tone signal. Signal bins' data can be used for traditional applications, as well as synchronous frequency-difference imaging. Noise bins were used to compute noise power on the FPGA. Noise power represents a metric of signal quality, and can be used to ensure proper tissue-electrode contact. Allocation of these computationally expensive tasks to the FPGA reduced the required bandwidth between PC, and the FPGA for high frame rate EIT. In 16-channel configuration, with a signal-averaging factor of 8, the DAQ frame rate at 100 kHz exceeded 110 frames s (-1), and signal-to-noise ratio exceeded 90 dB across the spectrum. Reciprocity error was found to be for frequencies up to 1 MHz. Static imaging experiments were performed on a high-conductivity inclusion placed in a saline filled tank; the inclusion was clearly localized in the reconstructions obtained for both absolute current and voltage mode data.
A real-time tracking system of infrared dim and small target based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Rong, Sheng-hui; Zhou, Hui-xin; Qin, Han-lin; Wang, Bing-jian; Qian, Kun
2014-11-01
A core technology in the infrared warning system is the detection tracking of dim and small targets with complicated background. Consequently, running the detection algorithm on the hardware platform has highly practical value in the military field. In this paper, a real-time detection tracking system of infrared dim and small target which is used FPGA (Field Programmable Gate Array) and DSP (Digital Signal Processor) as the core was designed and the corresponding detection tracking algorithm and the signal flow is elaborated. At the first stage, the FPGA obtain the infrared image sequence from the sensor, then it suppresses background clutter by mathematical morphology method and enhances the target intensity by Laplacian of Gaussian operator. At the second stage, the DSP obtain both the original image and the filtered image form the FPGA via the video port. Then it segments the target from the filtered image by an adaptive threshold segmentation method and gets rid of false target by pipeline filter. Experimental results show that our system can achieve higher detection rate and lower false alarm rate.
Research based on the SoPC platform of feature-based image registration
NASA Astrophysics Data System (ADS)
Shi, Yue-dong; Wang, Zhi-hui
2015-12-01
This paper focuses on the study of implementing feature-based image registration by System on a Programmable Chip (SoPC) hardware platform. We solidify the image registration algorithm on the FPGA chip, in which embedded soft core processor Nios II can speed up the image processing system. In this way, we can make image registration technology get rid of the PC. And, consequently, this kind of technology will be got an extensive use. The experiment result indicates that our system shows stable performance, particularly in terms of matching processing which noise immunity is good. And feature points of images show a reasonable distribution.
Li, Bingyi; Chen, Liang; Wei, Chunpeng; Xie, Yizhuang; Chen, He; Yu, Wenyue
2017-01-01
With the development of satellite load technology and very large scale integrated (VLSI) circuit technology, onboard real-time synthetic aperture radar (SAR) imaging systems have become a solution for allowing rapid response to disasters. A key goal of the onboard SAR imaging system design is to achieve high real-time processing performance with severe size, weight, and power consumption constraints. In this paper, we analyse the computational burden of the commonly used chirp scaling (CS) SAR imaging algorithm. To reduce the system hardware cost, we propose a partial fixed-point processing scheme. The fast Fourier transform (FFT), which is the most computation-sensitive operation in the CS algorithm, is processed with fixed-point, while other operations are processed with single precision floating-point. With the proposed fixed-point processing error propagation model, the fixed-point processing word length is determined. The fidelity and accuracy relative to conventional ground-based software processors is verified by evaluating both the point target imaging quality and the actual scene imaging quality. As a proof of concept, a field- programmable gate array—application-specific integrated circuit (FPGA-ASIC) hybrid heterogeneous parallel accelerating architecture is designed and realized. The customized fixed-point FFT is implemented using the 130 nm complementary metal oxide semiconductor (CMOS) technology as a co-processor of the Xilinx xc6vlx760t FPGA. A single processing board requires 12 s and consumes 21 W to focus a 50-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384. PMID:28672813
Yang, Chen; Li, Bingyi; Chen, Liang; Wei, Chunpeng; Xie, Yizhuang; Chen, He; Yu, Wenyue
2017-06-24
With the development of satellite load technology and very large scale integrated (VLSI) circuit technology, onboard real-time synthetic aperture radar (SAR) imaging systems have become a solution for allowing rapid response to disasters. A key goal of the onboard SAR imaging system design is to achieve high real-time processing performance with severe size, weight, and power consumption constraints. In this paper, we analyse the computational burden of the commonly used chirp scaling (CS) SAR imaging algorithm. To reduce the system hardware cost, we propose a partial fixed-point processing scheme. The fast Fourier transform (FFT), which is the most computation-sensitive operation in the CS algorithm, is processed with fixed-point, while other operations are processed with single precision floating-point. With the proposed fixed-point processing error propagation model, the fixed-point processing word length is determined. The fidelity and accuracy relative to conventional ground-based software processors is verified by evaluating both the point target imaging quality and the actual scene imaging quality. As a proof of concept, a field- programmable gate array-application-specific integrated circuit (FPGA-ASIC) hybrid heterogeneous parallel accelerating architecture is designed and realized. The customized fixed-point FFT is implemented using the 130 nm complementary metal oxide semiconductor (CMOS) technology as a co-processor of the Xilinx xc6vlx760t FPGA. A single processing board requires 12 s and consumes 21 W to focus a 50-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384.
NASA Astrophysics Data System (ADS)
García, Aday; Santos, Lucana; López, Sebastián.; Callicó, Gustavo M.; Lopez, Jose F.; Sarmiento, Roberto
2014-05-01
Efficient onboard satellite hyperspectral image compression represents a necessity and a challenge for current and future space missions. Therefore, it is mandatory to provide hardware implementations for this type of algorithms in order to achieve the constraints required for onboard compression. In this work, we implement the Lossy Compression for Exomars (LCE) algorithm on an FPGA by means of high-level synthesis (HSL) in order to shorten the design cycle. Specifically, we use CatapultC HLS tool to obtain a VHDL description of the LCE algorithm from C-language specifications. Two different approaches are followed for HLS: on one hand, introducing the whole C-language description in CatapultC and on the other hand, splitting the C-language description in functional modules to be implemented independently with CatapultC, connecting and controlling them by an RTL description code without HLS. In both cases the goal is to obtain an FPGA implementation. We explain the several changes applied to the original Clanguage source code in order to optimize the results obtained by CatapultC for both approaches. Experimental results show low area occupancy of less than 15% for a SRAM-based Virtex-5 FPGA and a maximum frequency above 80 MHz. Additionally, the LCE compressor was implemented into an RTAX2000S antifuse-based FPGA, showing an area occupancy of 75% and a frequency around 53 MHz. All these serve to demonstrate that the LCE algorithm can be efficiently executed on an FPGA onboard a satellite. A comparison between both implementation approaches is also provided. The performance of the algorithm is finally compared with implementations on other technologies, specifically a graphics processing unit (GPU) and a single-threaded CPU.
Real-time implementation of a multispectral mine target detection algorithm
NASA Astrophysics Data System (ADS)
Samson, Joseph W.; Witter, Lester J.; Kenton, Arthur C.; Holloway, John H., Jr.
2003-09-01
Spatial-spectral anomaly detection (the "RX Algorithm") has been exploited on the USMC's Coastal Battlefield Reconnaissance and Analysis (COBRA) Advanced Technology Demonstration (ATD) and several associated technology base studies, and has been found to be a useful method for the automated detection of surface-emplaced antitank land mines in airborne multispectral imagery. RX is a complex image processing algorithm that involves the direct spatial convolution of a target/background mask template over each multispectral image, coupled with a spatially variant background spectral covariance matrix estimation and inversion. The RX throughput on the ATD was about 38X real time using a single Sun UltraSparc system. A goal to demonstrate RX in real-time was begun in FY01. We now report the development and demonstration of a Field Programmable Gate Array (FPGA) solution that achieves a real-time implementation of the RX algorithm at video rates using COBRA ATD data. The approach uses an Annapolis Microsystems Firebird PMC card containing a Xilinx XCV2000E FPGA with over 2,500,000 logic gates and 18MBytes of memory. A prototype system was configured using a Tek Microsystems VME board with dual-PowerPC G4 processors and two PMC slots. The RX algorithm was translated from its C programming implementation into the VHDL language and synthesized into gates that were loaded into the FPGA. The VHDL/synthesizer approach allows key RX parameters to be quickly changed and a new implementation automatically generated. Reprogramming the FPGA is done rapidly and in-circuit. Implementation of the RX algorithm in a single FPGA is a major first step toward achieving real-time land mine detection.
NASA Astrophysics Data System (ADS)
Zhang, Zhenhai; Li, Kejie; Wu, Xiaobing; Zhang, Shujiang
2008-03-01
The unwrapped and correcting algorithm based on Coordinate Rotation Digital Computer (CORDIC) and bilinear interpolation algorithm was presented in this paper, with the purpose of processing dynamic panoramic annular image. An original annular panoramic image captured by panoramic annular lens (PAL) can be unwrapped and corrected to conventional rectangular image without distortion, which is much more coincident with people's vision. The algorithm for panoramic image processing is modeled by VHDL and implemented in FPGA. The experimental results show that the proposed panoramic image algorithm for unwrapped and distortion correction has the lower computation complexity and the architecture for dynamic panoramic image processing has lower hardware cost and power consumption. And the proposed algorithm is valid.
NASA Astrophysics Data System (ADS)
Shatravin, V.; Shashev, D. V.
2018-05-01
Currently, robots are increasingly being used in every industry. One of the most high-tech areas is creation of completely autonomous robotic devices including vehicles. The results of various global research prove the efficiency of vision systems in autonomous robotic devices. However, the use of these systems is limited because of the computational and energy resources available in the robot device. The paper describes the results of applying the original approach for image processing on reconfigurable computing environments by the example of morphological operations over grayscale images. This approach is prospective for realizing complex image processing algorithms and real-time image analysis in autonomous robotic devices.
Integrated circuits for volumetric ultrasound imaging with 2-D CMUT arrays.
Bhuyan, Anshuman; Choe, Jung Woo; Lee, Byung Chul; Wygant, Ira O; Nikoozadeh, Amin; Oralkan, Ömer; Khuri-Yakub, Butrus T
2013-12-01
Real-time volumetric ultrasound imaging systems require transmit and receive circuitry to generate ultrasound beams and process received echo signals. The complexity of building such a system is high due to requirement of the front-end electronics needing to be very close to the transducer. A large number of elements also need to be interfaced to the back-end system and image processing of a large dataset could affect the imaging volume rate. In this work, we present a 3-D imaging system using capacitive micromachined ultrasonic transducer (CMUT) technology that addresses many of the challenges in building such a system. We demonstrate two approaches in integrating the transducer and the front-end electronics. The transducer is a 5-MHz CMUT array with an 8 mm × 8 mm aperture size. The aperture consists of 1024 elements (32 × 32) with an element pitch of 250 μm. An integrated circuit (IC) consists of a transmit beamformer and receive circuitry to improve the noise performance of the overall system. The assembly was interfaced with an FPGA and a back-end system (comprising of a data acquisition system and PC). The FPGA provided the digital I/O signals for the IC and the back-end system was used to process the received RF echo data (from the IC) and reconstruct the volume image using a phased array imaging approach. Imaging experiments were performed using wire and spring targets, a ventricle model and a human prostrate. Real-time volumetric images were captured at 5 volumes per second and are presented in this paper.
Fast semivariogram computation using FPGA architectures
NASA Astrophysics Data System (ADS)
Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang
2015-02-01
The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery
Qi, Baogui; Zhuang, Yin; Chen, He; Chen, Liang
2018-01-01
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited. PMID:29693585
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery.
Qi, Baogui; Shi, Hao; Zhuang, Yin; Chen, He; Chen, Liang
2018-04-25
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited.
FPGA-Based Front-End Electronics for Positron Emission Tomography
Haselman, Michael; DeWitt, Don; McDougald, Wendy; Lewellen, Thomas K.; Miyaoka, Robert; Hauck, Scott
2010-01-01
Modern Field Programmable Gate Arrays (FPGAs) are capable of performing complex discrete signal processing algorithms with clock rates above 100MHz. This combined with FPGA’s low expense, ease of use, and selected dedicated hardware make them an ideal technology for a data acquisition system for positron emission tomography (PET) scanners. Our laboratory is producing a high-resolution, small-animal PET scanner that utilizes FPGAs as the core of the front-end electronics. For this next generation scanner, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilizes to add significant signal processing power to produce higher resolution images. In this paper two such processes, sub-clock rate pulse timing and event localization, will be discussed in detail. We show that timing performed in the FPGA can achieve a resolution that is suitable for small-animal scanners, and will outperform the analog version given a low enough sampling period for the ADC. We will also show that the position of events in the scanner can be determined in real time using a statistical positioning based algorithm. PMID:21961085
Research and implementation of SATA protocol link layer based on FPGA
NASA Astrophysics Data System (ADS)
Liu, Wen-long; Liu, Xue-bin; Qiang, Si-miao; Yan, Peng; Wen, Zhi-gang; Kong, Liang; Liu, Yong-zheng
2018-02-01
In order to solve the problem high-performance real-time, high-speed the image data storage generated by the detector. In this thesis, it choose an suitable portable image storage hard disk of SATA interface, it is relative to the existing storage media. It has a large capacity, high transfer rate, inexpensive, power-down data which is not lost, and many other advantages. This paper focuses on the link layer of the protocol, analysis the implementation process of SATA2.0 protocol, and build state machines. Then analyzes the characteristics resources of Kintex-7 FPGA family, builds state machines according to the agreement, write Verilog implement link layer modules, and run the simulation test. Finally, the test is on the Kintex-7 development board platform. It meets the requirements SATA2.0 protocol basically.
NASA Astrophysics Data System (ADS)
Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.
2004-11-01
Recently surveillance and Automatic Target Recognition (ATR) applications are increasing as the cost of computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-Art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/compute vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval, are often important components of intelligent systems. The aim of this work is using embedded FPGA as a fast, configurable and synthesizable search engine in fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.
FPGA-based coprocessor for matrix algorithms implementation
NASA Astrophysics Data System (ADS)
Amira, Abbes; Bensaali, Faycal
2003-03-01
Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication which is of O (N3) on a sequential computer and O (N3/p) on a parallel system with p processors complexity. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.
Investigation of Electromagnetic Signatures of a FPGA Using an APREL EM-ISIGHT System
2015-12-01
unprofessional workmanship in the bonding process. Focused ion beam (FIB) images often consist of some type of etch and/or deposition of material from/to...characteristics of conducted emissions." Electromagnetic Compatibility, 2008. EMC 2008. IEEE International Symposium (2008): 1-4. Montanari, Ivan
Large-N correlator systems for low frequency radio astronomy
NASA Astrophysics Data System (ADS)
Foster, Griffin
Low frequency radio astronomy has entered a second golden age driven by the development of a new class of large-N interferometric arrays. The low frequency array (LOFAR) and a number of redshifted HI Epoch of Reionization (EoR) arrays are currently undergoing commission and regularly observing. Future arrays of unprecedented sensitivity and resolutions at low frequencies, such as the square kilometer array (SKA) and the hydrogen epoch of reionization array (HERA), are in development. The combination of advancements in specialized field programmable gate array (FPGA) hardware for signal processing, computing and graphics processing unit (GPU) resources, and new imaging and calibration algorithms has opened up the oft underused radio band below 300 MHz. These interferometric arrays require efficient implementation of digital signal processing (DSP) hardware to compute the baseline correlations. FPGA technology provides an optimal platform to develop new correlators. The significant growth in data rates from these systems requires automated software to reduce the correlations in real time before storing the data products to disk. Low frequency, widefield observations introduce a number of unique calibration and imaging challenges. The efficient implementation of FX correlators using FPGA hardware is presented. Two correlators have been developed, one for the 32 element BEST-2 array at Medicina Observatory and the other for the 96 element LOFAR station at Chilbolton Observatory. In addition, calibration and imaging software has been developed for each system which makes use of the radio interferometry measurement equation (RIME) to derive calibrations. A process for generating sky maps from widefield LOFAR station observations is presented. Shapelets, a method of modelling extended structures such as resolved sources and beam patterns has been adapted for radio astronomy use to further improve system calibration. Scaling of computing technology allows for the development of larger correlator systems, which in turn allows for improvements in sensitivity and resolution. This requires new calibration techniques which account for a broad range of systematic effects.
A hardware implementation of the discrete Pascal transform for image processing
NASA Astrophysics Data System (ADS)
Goodman, Thomas J.; Aburdene, Maurice F.
2006-02-01
The discrete Pascal transform is a polynomial transform with applications in pattern recognition, digital filtering, and digital image processing. It already has been shown that the Pascal transform matrix can be decomposed into a product of binary matrices. Such a factorization leads to a fast and efficient hardware implementation without the use of multipliers, which consume large amounts of hardware. We recently developed a field-programmable gate array (FPGA) implementation to compute the Pascal transform. Our goal was to demonstrate the computational efficiency of the transform while keeping hardware requirements at a minimum. Images are uploaded into memory from a remote computer prior to processing, and the transform coefficients can be offloaded from the FPGA board for analysis. Design techniques like as-soon-as-possible scheduling and adder sharing allowed us to develop a fast and efficient system. An eight-point, one-dimensional transform completes in 13 clock cycles and requires only four adders. An 8x8 two-dimensional transform completes in 240 cycles and requires only a top-level controller in addition to the one-dimensional transform hardware. Finally, through minor modifications to the controller, the transform operations can be pipelined to achieve 100% utilization of the four adders, allowing one eight-point transform to complete every seven clock cycles.
Preliminary Study of Image Reconstruction Algorithm on a Digital Signal Processor
2014-03-01
5.2 Comparison of CPU-GPU, CPU-FPGA, and CPU-DSP Designs The work for implementing VHDL description of the back-projection algorithm on a physical...FPGA was not complete. Hence, the DSP implementation results are compared with the simulated results for the VHDL design. Simulating VHDL provides an...rather than at the software level. Depending on an application’s characteristics, FPGA implementations can provide a significant performance
Rapid Corner Detection Using FPGAs
NASA Technical Reports Server (NTRS)
Morfopoulos, Arin C.; Metz, Brandon C.
2010-01-01
In order to perform precision landings for space missions, a control system must be accurate to within ten meters. Feature detection applied against images taken during descent and correlated against the provided base image is computationally expensive and requires tens of seconds of processing time to do just one image while the goal is to process multiple images per second. To solve this problem, this algorithm takes that processing load from the central processing unit (CPU) and gives it to a reconfigurable field programmable gate array (FPGA), which is able to compute data in parallel at very high clock speeds. The workload of the processor then becomes simpler; to read an image from a camera, it is transferred into the FPGA, and the results are read back from the FPGA. The Harris Corner Detector uses the determinant and trace to find a corner score, with each step of the computation occurring on independent clock cycles. Essentially, the image is converted into an x and y derivative map. Once three lines of pixel information have been queued up, valid pixel derivatives are clocked into the product and averaging phase of the pipeline. Each x and y derivative is squared against itself, as well as the product of the ix and iy derivative, and each value is stored in a WxN size buffer, where W represents the size of the integration window and N is the width of the image. In this particular case, a window size of 5 was chosen, and the image is 640 480. Over a WxN size window, an equidistance Gaussian is applied (to bring out the stronger corners), and then each value in the entire window is summed and stored. The required components of the equation are in place, and it is just a matter of taking the determinant and trace. It should be noted that the trace is being weighted by a constant k, a value that is found empirically to be within 0.04 to 0.15 (and in this implementation is 0.05). The constant k determines the number of corners available to be compared against a threshold sigma to mark a valid corner. After a fixed delay from when the first pixel is clocked in (to fill the pipeline), a score is achieved after each successive clock. This score corresponds with an (x,y) location within the image. If the score is higher than the predetermined threshold sigma, then a flag is set high and the location is recorded.
NASA Astrophysics Data System (ADS)
Oh, Gyong Jin; Kim, Lyang-June; Sheen, Sue-Ho; Koo, Gyou-Phyo; Jin, Sang-Hun; Yeo, Bo-Yeon; Lee, Jong-Ho
2009-05-01
This paper presents a real time implementation of Non Uniformity Correction (NUC). Two point correction and one point correction with shutter were carried out in an uncooled imaging system which will be applied to a missile application. To design a small, light weight and high speed imaging system for a missile system, SoPC (System On a Programmable Chip) which comprises of FPGA and soft core (Micro-blaze) was used. Real time NUC and generation of control signals are implemented using FPGA. Also, three different NUC tables were made to make the operating time shorter and to reduce the power consumption in a large range of environment temperature. The imaging system consists of optics and four electronics boards which are detector interface board, Analog to Digital converter board, Detector signal generation board and Power supply board. To evaluate the imaging system, NETD was measured. The NETD was less than 160mK in three different environment temperatures.
NASA Technical Reports Server (NTRS)
Ohara, Tetsuo
2012-01-01
A sub-aperture stitching optical interferometer can provide a cost-effective solution for an in situ metrology tool for large optics; however, the currently available technologies are not suitable for high-speed and real-time continuous scan. NanoWave s SPPE (Scanning Probe Position Encoder) has been proven to exhibit excellent stability and sub-nanometer precision with a large dynamic range. This same technology can transform many optical interferometers into real-time subnanometer precision tools with only minor modification. The proposed field-programmable gate array (FPGA) signal processing concept, coupled with a new-generation, high-speed, mega-pixel CMOS (complementary metal-oxide semiconductor) image sensor, enables high speed (>1 m/s) and real-time continuous surface profiling that is insensitive to variation of pixel sensitivity and/or optical transmission/reflection. This is especially useful for large optics surface profiling.
Photoelectric radar servo control system based on ARM+FPGA
NASA Astrophysics Data System (ADS)
Wu, Kaixuan; Zhang, Yue; Li, Yeqiu; Dai, Qin; Yao, Jun
2016-01-01
In order to get smaller, faster, and more responsive requirements of the photoelectric radar servo control system. We propose a set of core ARM + FPGA architecture servo controller. Parallel processing capability of FPGA to be used for the encoder feedback data, PWM carrier modulation, A, B code decoding processing and so on; Utilizing the advantage of imaging design in ARM Embedded systems achieves high-speed implementation of the PID algorithm. After the actual experiment, the closed-loop speed of response of the system cycles up to 2000 times/s, in the case of excellent precision turntable shaft, using a PID algorithm to achieve the servo position control with the accuracy of + -1 encoder input code. Firstly, This article carry on in-depth study of the embedded servo control system hardware to determine the ARM and FPGA chip as the main chip with systems based on a pre-measured target required to achieve performance requirements, this article based on ARM chip used Samsung S3C2440 chip of ARM7 architecture , the FPGA chip is chosen xilinx's XC3S400 . ARM and FPGA communicate by using SPI bus, the advantage of using SPI bus is saving a lot of pins for easy system upgrades required thereafter. The system gets the speed datas through the photoelectric-encoder that transports the datas to the FPGA, Then the system transmits the datas through the FPGA to ARM, transforms speed datas into the corresponding position and velocity data in a timely manner, prepares the corresponding PWM wave to control motor rotation by making comparison between the position data and the velocity data setted in advance . According to the system requirements to draw the schematics of the photoelectric radar servo control system and PCB board to produce specially. Secondly, using PID algorithm to control the servo system, the datas of speed obtained from photoelectric-encoder is calculated position data and speed data via high-speed digital PID algorithm and coordinate models. Finally, a large number of experiments verify the reliability of embedded servo control system's functions, the stability of the program and the stability of the hardware circuit. Meanwhile, the system can also achieve the satisfactory of user experience, to achieve a multi-mode motion, real-time motion status monitoring, online system parameter changes and other convenient features.
Free-running ADC- and FPGA-based signal processing method for brain PET using GAPD arrays
NASA Astrophysics Data System (ADS)
Hu, Wei; Choi, Yong; Hong, Key Jo; Kang, Jihoon; Jung, Jin Ho; Huh, Youn Suk; Lim, Hyun Keong; Kim, Sang Su; Kim, Byung-Tae; Chung, Yonghyun
2012-02-01
Currently, for most photomultiplier tube (PMT)-based PET systems, constant fraction discriminators (CFD) and time to digital converters (TDC) have been employed to detect gamma ray signal arrival time, whereas anger logic circuits and peak detection analog-to-digital converters (ADCs) have been implemented to acquire position and energy information of detected events. As compared to PMT the Geiger-mode avalanche photodiodes (GAPDs) have a variety of advantages, such as compactness, low bias voltage requirement and MRI compatibility. Furthermore, the individual read-out method using a GAPD array coupled 1:1 with an array scintillator can provide better image uniformity than can be achieved using PMT and anger logic circuits. Recently, a brain PET using 72 GAPD arrays (4×4 array, pixel size: 3 mm×3 mm) coupled 1:1 with LYSO scintillators (4×4 array, pixel size: 3 mm×3 mm×20 mm) has been developed for simultaneous PET/MRI imaging in our laboratory. Eighteen 64:1 position decoder circuits (PDCs) were used to reduce GAPD channel number and three off-the-shelf free-running ADC and field programmable gate array (FPGA) combined data acquisition (DAQ) cards were used for data acquisition and processing. In this study, a free-running ADC- and FPGA-based signal processing method was developed for the detection of gamma ray signal arrival time, energy and position information all together for each GAPD channel. For the method developed herein, three DAQ cards continuously acquired 18 channels of pre-amplified analog gamma ray signals and 108-bit digital addresses from 18 PDCs. In the FPGA, the digitized gamma ray pulses and digital addresses were processed to generate data packages containing pulse arrival time, baseline value, energy value and GAPD channel ID. Finally, these data packages were saved to a 128 Mbyte on-board synchronous dynamic random access memory (SDRAM) and then transferred to a host computer for coincidence sorting and image reconstruction. In order to evaluate the functionality of the developed signal processing method, energy and timing resolutions for brain PET were measured via the placement of a 6 μCi 22Na point source at the center of the PET scanner. Furthermore the PET image of the hot rod phantom (rod diameter: from 2.5 mm to 6.5 mm) with activity of 1 mCi was simulated, and then image acquisition experiment was performed using the brain PET. Measured average energy resolution for 1152 GAPD channels and system timing resolution were 19.5% (FWHM%) and 2.7 ns (FWHM), respectively. With regard to the acquisition of the hot rod phantom image, rods could be resolved down to a diameter of 2.5 mm, which was similar to simulated results. The experimental results demonstrated that the signal processing method developed herein was successfully implemented for brain PET. This reduced the complexity, cost and developing duration for PET system relative to normal PET electronics, and it will obviously be useful for the development of high-performance investigational PET systems.
Real-time orthorectification by FPGA-based hardware acceleration
NASA Astrophysics Data System (ADS)
Kuo, David; Gordon, Don
2010-10-01
Orthorectification that corrects the perspective distortion of remote sensing imagery, providing accurate geolocation and ease of correlation to other images is a valuable first-step in image processing for information extraction. However, the large amount of metadata and the floating-point matrix transformations required to operate on each pixel make this a computation and I/O (Input/Output) intensive process. As result much imagery is either left unprocessed or loses timesensitive value in the long processing cycle. However, the computation on each pixel can be reduced substantially by using computational results of the neighboring pixels and accelerated by special pipelined hardware architecture in one to two orders of magnitude. A specialized coprocessor that is implemented inside an FPGA (Field Programmable Gate Array) chip and surrounded by vendorsupported hardware IP (Intellectual Property) shares the computation workload with CPU through PCI-Express interface. The ultimate speed of one pixel per clock (125 MHz) is achieved by the pipelined systolic array architecture. The optimal partition between software and hardware, the timing profile among image I/O and computation, and the highly automated GUI (Graphical User Interface) that fully exploits this speed increase to maximize overall image production throughput will also be discussed. The software that runs on a workstation with the acceleration hardware orthorectifies 16 Megapixels per second, which is 16 times faster than without the hardware. It turns the production time from months to days. A real-life successful story of an imaging satellite company that adopted such workstations for their orthorectified imagery production will be presented. The potential candidacy of the image processing computation that can be accelerated more efficiently by the same approach will also be analyzed.
Research of real-time video processing system based on 6678 multi-core DSP
NASA Astrophysics Data System (ADS)
Li, Xiangzhen; Xie, Xiaodan; Yin, Xiaoqiang
2017-10-01
In the information age, the rapid development in the direction of intelligent video processing, complex algorithm proposed the powerful challenge on the performance of the processor. In this article, through the FPGA + TMS320C6678 frame structure, the image to fog, merge into an organic whole, to stabilize the image enhancement, its good real-time, superior performance, break through the traditional function of video processing system is simple, the product defects such as single, solved the video application in security monitoring, video, etc. Can give full play to the video monitoring effectiveness, improve enterprise economic benefits.
Design for Review - Applying Lessons Learned to Improve the FPGA Review Process
NASA Technical Reports Server (NTRS)
Figueiredo, Marco A.; Li, Kenneth E.
2014-01-01
Flight Field Programmable Gate Array (FPGA) designs are required to be independently reviewed. This paper provides recommendations to Flight FPGA designers to properly prepare their designs for review in order to facilitate the review process, and reduce the impact of the review time in the overall project schedule.
Living in a digital world: features and applications of FPGA in photon detection
NASA Astrophysics Data System (ADS)
Arnesano, Cosimo
Optical spectroscopy and imaging outcomes rely upon many factors; one of the most critical is the photon acquisition and processing method employed. For some types of measurements it may be crucial to acquire every single photon quickly with temporal resolution, but in other cases it is important to acquire as many photons as possible, regardless of the time information about each of them. Fluorescence Lifetime Imaging Microscopy belongs to the first case, where the information of the time of arrival of every single photon in every single pixel is fundamental in obtaining the desired information. Spectral tissue imaging belongs to the second case, where high photon density is needed in order to calculate the optical parameters necessary to build the spectral image. In both cases, the current instrumentation suffers from limitations in terms of acquisition time, duty cycle, cost, and radio-frequency interference and emission. We developed the Digital Frequency-Domain approach for photon acquisition and processing purpose using new digital technology. This approach is based on the use of photon detectors in photon counting mode, and the digital heterodyning method to acquire data which is analyzed in the frequency domain to provide the information of the time of arrival of the photons . In conjunction with the use of pulsed laser sources, this method allows the determination of the time of arrival of the photons using the harmonic content of the frequency domain analysis. The parallel digital FD design is a powerful approach that others the possibility to implement a variety of different applications in fluorescence spectroscopy and microscopy. It can be applied to fluorometry, Fluorescence Lifetime Imaging (FLIM), and Fluorescence Correlation Spectroscopy (FCS), as well as multi frequency and multi wavelength tissue imaging in compact portable medical devices. It dramatically reduces the acquisition time from the several minutes scale to the seconds scale, performs signal processing in a digital fashion avoiding RF emission and it is extremely inexpensive. This development is the result of a systematic study carried on a previous design known as the FLIMBox developed as part of a thesis of another graduate student. The extensive work done in maximizing the performance of the original FLIMBox led us to develop a new hardware solution with exciting and promising results and potential that were not possible in the previous hardware realization, where the signal harmonic content was limited by the FPGA technology. The new design permits acquisition of a much larger harmonic content of the sample response when it is excited with a pulsed light source in one single measurement using the digital mixing principle that was developed in the original design. Furthermore, we used the parallel digital FD principle to perform tissue imaging through Diffuse Optical Spectroscopy (DOS) measurements. We integrated the FLIMBox in a new system that uses a supercontinuum white laser with high brightness as a single light source and photomultipliers with large detection area, both allowing a high penetration depth with extremely low power at the sample. The parallel acquisition, achieved by using the FlimBox, decreases the time required for standard serial systems that scan through all modulation frequencies. Furthermore, the all-digital acquisition avoids analog noise, removes the analog mixer of the conventional frequency domain approach, and it does not generate radio-frequencies, normally present in current analog systems. We are able to obtain a very sensitive acquisition due to the high signal to noise ratio (S/N). The successful results obtained by utilizing digital technology in photon acquisition and processing, prompted us to extend the use of FPGA to other applications, such as phosphorescence detection. Using the FPGA concept we proposed possible solutions to outstanding problems with the current technology. In this thesis I discuss new possible scenarios where new FPGA chips are applied to spectral tissue imaging.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Xiao, Yong; Cheng, Xinyi; Li, Deng; Wang, Liwei
2016-02-01
For the continuous crystal-based positron emission tomography (PET) detector built in our lab, a maximum likelihood algorithm adapted for implementation on a field programmable gate array (FPGA) is proposed to estimate the three-dimensional (3D) coordinate of interaction position with the single-end detected scintillation light response. The row-sum and column-sum readout scheme organizes the 64 channels of photomultiplier (PMT) into eight row signals and eight column signals to be readout for X- and Y-coordinates estimation independently. By the reference events irradiated in a known oblique angle, the probability density function (PDF) for each depth-of-interaction (DOI) segment is generated, by which the reference events in perpendicular irradiation are assigned to DOI segments for generating the PDFs for X and Y estimation in each DOI layer. Evaluated by the experimental data, the algorithm achieves an average X resolution of 1.69 mm along the central X-axis, and DOI resolution of 3.70 mm over the whole thickness (0-10 mm) of crystal. The performance improvements from 2D estimation to the 3D algorithm are also presented. Benefiting from abundant resources of FPGA and a hierarchical storage arrangement, the whole algorithm can be implemented into a middle-scale FPGA. By a parallel structure in pipelines, the 3D position estimator on the FPGA can achieve a processing throughput of 15 M events/s, which is sufficient for the requirement of real-time PET imaging.
Luo, Qiang; Yan, Zhuangzhi; Gu, Dongxing; Cao, Lei
This paper proposed an image interpolation algorithm based on bilinear interpolation and a color correction algorithm based on polynomial regression on FPGA, which focused on the limited number of imaging pixels and color distortion of the ultra-thin electronic endoscope. Simulation experiment results showed that the proposed algorithm realized the real-time display of 1280 x 720@60Hz HD video, and using the X-rite color checker as standard colors, the average color difference was reduced about 30% comparing with that before color correction.
Comparison of laser Doppler and laser speckle contrast imaging using a concurrent processing system
NASA Astrophysics Data System (ADS)
Sun, Shen; Hayes-Gill, Barrie R.; He, Diwei; Zhu, Yiqun; Huynh, Nam T.; Morgan, Stephen P.
2016-08-01
Full field laser Doppler imaging (LDI) and single exposure laser speckle contrast imaging (LSCI) are directly compared using a novel instrument which can concurrently image blood flow using both LDI and LSCI signal processing. Incorporating a commercial CMOS camera chip and a field programmable gate array (FPGA) the flow images of LDI and the contrast maps of LSCI are simultaneously processed by utilizing the same detected optical signals. The comparison was carried out by imaging a rotating diffuser. LDI has a linear response to the velocity. In contrast, LSCI is exposure time dependent and does not provide a linear response in the presence of static speckle. It is also demonstrated that the relationship between LDI and LSCI can be related through a power law which depends on the exposure time of LSCI.
Flight Qualified Micro Sun Sensor
NASA Technical Reports Server (NTRS)
Liebe, Carl Christian; Mobasser, Sohrab; Wrigley, Chris; Schroeder, Jeffrey; Bae, Youngsam; Naegle, James; Katanyoutanant, Sunant; Jerebets, Sergei; Schatzel, Donald; Lee, Choonsup
2007-01-01
A prototype small, lightweight micro Sun sensor (MSS) has been flight qualified as part of the attitude-determination system of a spacecraft or for Mars surface operations. The MSS has previously been reported at a very early stage of development in NASA Tech Briefs, Vol. 28, No. 1 (January 2004). An MSS is essentially a miniature multiple-pinhole electronic camera combined with digital processing electronics that functions analogously to a sundial. A micromachined mask containing a number of microscopic pinholes is mounted in front of an active-pixel sensor (APS). Electronic circuits for controlling the operation of the APS, readout from the pixel photodetectors, and analog-to-digital conversion are all integrated onto the same chip along with the APS. The digital processing includes computation of the centroids of the pinhole Sun images on the APS. The spacecraft computer has the task of converting the Sun centroids into Sun angles utilizing a calibration polynomial. The micromachined mask comprises a 500-micron-thick silicon wafer, onto which is deposited a 57-nm-thick chromium adhesion- promotion layer followed by a 200-nm-thick gold light-absorption layer. The pinholes, 50 microns in diameter, are formed in the gold layer by photolithography. The chromium layer is thin enough to be penetrable by an amount of Sunlight adequate to form measurable pinhole images. A spacer frame between the mask and the APS maintains a gap of .1 mm between the pinhole plane and the photodetector plane of the APS. To minimize data volume, mass, and power consumption, the digital processing of the APS readouts takes place in a single field-programmable gate array (FPGA). The particular FPGA is a radiation- tolerant unit that contains .32,000 gates. No external memory is used so the FPGA calculates the centroids in real time as pixels are read off the APS with minimal internal memory. To enable the MSS to fit into a small package, the APS, the FPGA, and other components are mounted on a single two-sided board following chip-on-board design practices
NASA Astrophysics Data System (ADS)
Xue, Xinwei; Cheryauka, Arvi; Tubbs, David
2006-03-01
CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operational room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain of computational intensity with comparable hardware and program coding/testing expenses. In this paper, using a sample 2D and 3D CT problem, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms: a single CPU, a single GPU, and a solution based on FPGA technology have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with similar levels of noise and clarity of feature when compared to program execution on a CPU, but gaining a performance increase at one or more orders of magnitude faster. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.
Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph
2013-08-01
Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable aate Array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Architectural design for a low cost FPGA-based traffic signal detection system in vehicles
NASA Astrophysics Data System (ADS)
López, Ignacio; Salvador, Rubén; Alarcón, Jaime; Moreno, Félix
2007-05-01
In this paper we propose an architecture for an embedded traffic signal detection system. Development of Advanced Driver Assistance Systems (ADAS) is one of the major trends of research in automotion nowadays. Examples of past and ongoing projects in the field are CHAMELEON ("Pre-Crash Application all around the vehicle" IST 1999-10108), PREVENT (Preventive and Active Safety Applications, FP6-507075, http://www.prevent-ip.org/) and AVRT in the US (Advanced Vision-Radar Threat Detection (AVRT): A Pre-Crash Detection and Active Safety System). It can be observed a major interest in systems for real-time analysis of complex driving scenarios, evaluating risk and anticipating collisions. The system will use a low cost CCD camera on the dashboard facing the road. The images will be processed by an Altera Cyclone family FPGA. The board does median and Sobel filtering of the incoming frames at PAL rate, and analyzes them for several categories of signals. The result is conveyed to the driver. The scarce resources provided by the hardware require an architecture developed for optimal use. The system will use a combination of neural networks and an adapted blackboard architecture. Several neural networks will be used in sequence for image analysis, by reconfiguring a single, generic hardware neural network in the FPGA. This generic network is optimized for speed, in order to admit several executions within the frame rate. The sequence will follow the execution cycle of the blackboard architecture. The global, blackboard architecture being developed and the hardware architecture for the generic, reconfigurable FPGA perceptron will be explained in this paper. The project is still at an early stage. However, some hardware implementation results are already available and will be offered in the paper.
Economical Implementation of a Filter Engine in an FPGA
NASA Technical Reports Server (NTRS)
Kowalski, James E.
2009-01-01
A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. For these and other applications, it could be advantageous to combine compact FPGA digital filter implementations with other application-specific logic implementations on single integrated-circuit chips. An FPGA could readily be tailored to implement a variety of filters because the filter coefficients would be loaded into memory at startup.
FPGA implementation of image dehazing algorithm for real time applications
NASA Astrophysics Data System (ADS)
Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.
2017-09-01
Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomena, which makes this problem non trivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation system, etc. However, these applications require low latency of the preprocessing block. In this work, single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although conventional single image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two stage image dehazing architecture is introduced, wherein, dark channel and airlight are estimated in the first stage. Whereas, transmission map and intensity restoration are computed in the next stages. The algorithm is implemented using Xilinx Vivado software and validated by using Xilinx zc702 development board, which contains an Artix7 equivalent Field Programmable Gate Array (FPGA) and ARM Cortex A9 dual core processor. Additionally, high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second for the image resolution of 1920x1080 which is suitable of real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
Design of a real-time system of moving ship tracking on-board based on FPGA in remote sensing images
NASA Astrophysics Data System (ADS)
Yang, Tie-jun; Zhang, Shen; Zhou, Guo-qing; Jiang, Chuan-xian
2015-12-01
With the broad attention of countries in the areas of sea transportation and trade safety, the requirements of efficiency and accuracy of moving ship tracking are becoming higher. Therefore, a systematic design of moving ship tracking onboard based on FPGA is proposed, which uses the Adaptive Inter Frame Difference (AIFD) method to track a ship with different speed. For the Frame Difference method (FD) is simple but the amount of computation is very large, it is suitable for the use of FPGA to implement in parallel. But Frame Intervals (FIs) of the traditional FD method are fixed, and in remote sensing images, a ship looks very small (depicted by only dozens of pixels) and moves slowly. By applying invariant FIs, the accuracy of FD for moving ship tracking is not satisfactory and the calculation is highly redundant. So we use the adaptation of FD based on adaptive extraction of key frames for moving ship tracking. A FPGA development board of Xilinx Kintex-7 series is used for simulation. The experiments show that compared with the traditional FD method, the proposed one can achieve higher accuracy of moving ship tracking, and can meet the requirement of real-time tracking in high image resolution.
NASA Technical Reports Server (NTRS)
Gregory, Kyle J.; Hill, Joanne E. (Editor); Black, J. Kevin; Baumgartner, Wayne H.; Jahoda, Keith
2016-01-01
A fundamental challenge in a spaceborne application of a gas-based Time Projection Chamber (TPC) for observation of X-ray polarization is handling the large amount of data collected. The TPC polarimeter described uses the APV-25 Application Specific Integrated Circuit (ASIC) to readout a strip detector. Two dimensional photoelectron track images are created with a time projection technique and used to determine the polarization of the incident X-rays. The detector produces a 128x30 pixel image per photon interaction with each pixel registering 12 bits of collected charge. This creates challenging requirements for data storage and downlink bandwidth with only a modest incidence of photons and can have a significant impact on the overall mission cost. An approach is described for locating and isolating the photoelectron track within the detector image, yielding a much smaller data product, typically between 8x8 pixels and 20x20 pixels. This approach is implemented using a Microsemi RT-ProASIC3-3000 Field-Programmable Gate Array (FPGA), clocked at 20 MHz and utilizing 10.7k logic gates (14% of FPGA), 20 Block RAMs (17% of FPGA), and no external RAM. Results will be presented, demonstrating successful photoelectron track cluster detection with minimal impact to detector dead-time.
Okamoto, Takumi; Koide, Tetsushi; Sugi, Koki; Shimizu, Tatsuya; Anh-Tuan Hoang; Tamaki, Toru; Raytchev, Bisser; Kaneda, Kazufumi; Kominami, Yoko; Yoshida, Shigeto; Mieno, Hiroshi; Tanaka, Shinji
2015-08-01
With the increase of colorectal cancer patients in recent years, the needs of quantitative evaluation of colorectal cancer are increased, and the computer-aided diagnosis (CAD) system which supports doctor's diagnosis is essential. In this paper, a hardware design of type identification module in CAD system for colorectal endoscopic images with narrow band imaging (NBI) magnification is proposed for real-time processing of full high definition image (1920 × 1080 pixel). A pyramid style image segmentation with SVMs for multi-size scan windows, which can be implemented on an FPGA with small circuit area and achieve high accuracy, is proposed for actual complex colorectal endoscopic images.
2015-03-26
REAL-TIME RF-DNA FINGERPRINTING OF ZIGBEE DEVICES USING A SOFTWARE-DEFINED RADIO WITH FPGA...not subject to copyright protection in the United States. AFIT-ENG-MS-15-M-054 REAL-TIME RF-DNA FINGERPRINTING OF ZIGBEE DEVICES USING A...REAL-TIME RF-DNA FINGERPRINTING OF ZIGBEE DEVICES USING A SOFTWARE-DEFINED RADIO WITH FPGA PROCESSING William M. Lowder, BSEE, BSCPE
NASA Astrophysics Data System (ADS)
Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza
2013-01-01
Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.
Mobile Ultrasound Plane Wave Beamforming on iPhone or iPad using Metal- based GPU Processing
NASA Astrophysics Data System (ADS)
Hewener, Holger J.; Tretbar, Steffen H.
Mobile and cost effective ultrasound devices are being used in point of care scenarios or the drama room. To reduce the costs of such devices we already presented the possibilities of consumer devices like the Apple iPad for full signal processing of raw data for ultrasound image generation. Using technologies like plane wave imaging to generate a full image with only one excitation/reception event the acquisition times and power consumption of ultrasound imaging can be reduced for low power mobile devices based on consumer electronics realizing the transition from FPGA or ASIC based beamforming into more flexible software beamforming. The massive parallel beamforming processing can be done with the Apple framework "Metal" for advanced graphics and general purpose GPU processing for the iOS platform. We were able to integrate the beamforming reconstruction into our mobile ultrasound processing application with imaging rates up to 70 Hz on iPad Air 2 hardware.
SAD-Based Stereo Vision Machine on a System-on-Programmable-Chip (SoPC)
Zhang, Xiang; Chen, Zhangwei
2013-01-01
This paper, proposes a novel solution for a stereo vision machine based on the System-on-Programmable-Chip (SoPC) architecture. The SOPC technology provides great convenience for accessing many hardware devices such as DDRII, SSRAM, Flash, etc., by IP reuse. The system hardware is implemented in a single FPGA chip involving a 32-bit Nios II microprocessor, which is a configurable soft IP core in charge of managing the image buffer and users' configuration data. The Sum of Absolute Differences (SAD) algorithm is used for dense disparity map computation. The circuits of the algorithmic module are modeled by the Matlab-based DSP Builder. With a set of configuration interfaces, the machine can process many different sizes of stereo pair images. The maximum image size is up to 512 K pixels. This machine is designed to focus on real time stereo vision applications. The stereo vision machine offers good performance and high efficiency in real time. Considering a hardware FPGA clock of 90 MHz, 23 frames of 640 × 480 disparity maps can be obtained in one second with 5 × 5 matching window and maximum 64 disparity pixels. PMID:23459385
High-speed polarization sensitive optical coherence tomography for retinal diagnostics
NASA Astrophysics Data System (ADS)
Yin, Biwei; Wang, Bingqing; Vemishetty, Kalyanramu; Nagle, Jim; Liu, Shuang; Wang, Tianyi; Rylander, Henry G., III; Milner, Thomas E.
2012-01-01
We report design and construction of an FPGA-based high-speed swept-source polarization-sensitive optical coherence tomography (SS-PS-OCT) system for clinical retinal imaging. Clinical application of the SS-PS-OCT system is accurate measurement and display of thickness, phase retardation and birefringence maps of the retinal nerve fiber layer (RNFL) in human subjects for early detection of glaucoma. The FPGA-based SS-PS-OCT system provides three incident polarization states on the eye and uses a bulk-optic polarization sensitive balanced detection module to record two orthogonal interference fringe signals. Interference fringe signals and relative phase retardation between two orthogonal polarization states are used to obtain Stokes vectors of light returning from each RNFL depth. We implement a Levenberg-Marquardt algorithm on a Field Programmable Gate Array (FPGA) to compute accurate phase retardation and birefringence maps. For each retinal scan, a three-state Levenberg-Marquardt nonlinear algorithm is applied to 360 clusters each consisting of 100 A-scans to determine accurate maps of phase retardation and birefringence in less than 1 second after patient measurement allowing real-time clinical imaging-a speedup of more than 300 times over previous implementations. We report application of the FPGA-based SS-PS-OCT system for real-time clinical imaging of patients enrolled in a clinical study at the Eye Institute of Austin and Duke Eye Center.
FPGA-based gating and logic for multichannel single photon counting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pooser, Raphael C; Earl, Dennis Duncan; Evans, Philip G
2012-01-01
We present results characterizing multichannel InGaAs single photon detectors utilizing gated passive quenching circuits (GPQC), self-differencing techniques, and field programmable gate array (FPGA)-based logic for both diode gating and coincidence counting. Utilizing FPGAs for the diode gating frontend and the logic counting backend has the advantage of low cost compared to custom built logic circuits and current off-the-shelf detector technology. Further, FPGA logic counters have been shown to work well in quantum key distribution (QKD) test beds. Our setup combines multiple independent detector channels in a reconfigurable manner via an FPGA backend and post processing in order to perform coincidencemore » measurements between any two or more detector channels simultaneously. Using this method, states from a multi-photon polarization entangled source are detected and characterized via coincidence counting on the FPGA. Photons detection events are also processed by the quantum information toolkit for application testing (QITKAT)« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kirkham, R.; Siddons, D.; Dunn, P.A.
2010-06-23
The Maia detector system is engineered for energy dispersive x-ray fluorescence spectroscopy and elemental imaging at photon rates exceeding 10{sup 7}/s, integrated scanning of samples for pixel transit times as small as 50 {micro}s and high definition images of 10{sup 8} pixels and real-time processing of detected events for spectral deconvolution and online display of pure elemental images. The system developed by CSIRO and BNL combines a planar silicon 384 detector array, application-specific integrated circuits for pulse shaping and peak detection and sampling and optical data transmission to an FPGA-based pipelined, parallel processor. This paper describes the system and themore » underpinning engineering solutions.« less
FPGA design for constrained energy minimization
NASA Astrophysics Data System (ADS)
Wang, Jianwei; Chang, Chein-I.; Cao, Mang
2004-02-01
The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has been also demonstrated. The main challenge of realizing the CEM in hardware architecture in the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Arrays (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is to use the Coordinate Rotation DIgital Computer (CORDIC) algorithm that can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, therefore in FPGA. Since the computation of the inverse of the data correlction involves a series of Givens rotations, the utility of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM will be studied and its detailed architecture will be also described.
Trigger and Readout System for the Ashra-1 Detector
NASA Astrophysics Data System (ADS)
Aita, Y.; Aoki, T.; Asaoka, Y.; Morimoto, Y.; Motz, H. M.; Sasaki, M.; Abiko, C.; Kanokohata, C.; Ogawa, S.; Shibuya, H.; Takada, T.; Kimura, T.; Learned, J. G.; Matsuno, S.; Kuze, S.; Binder, P. M.; Goldman, J.; Sugiyama, N.; Watanabe, Y.
Highly sophisticated trigger and readout system has been developed for All-sky Survey High Resolution Air-shower (Ashra) detector. Ashra-1 detector has 42 degree diameter field of view. Detection of Cherenkov and fluorescence light from large background in the large field of view requires finely segmented and high speed trigger and readout system. The system is composed of optical fiber image transmission system, 64 × 64 channel trigger sensor and FPGA based trigger logic processor. The system typically processes the image within 10 to 30 ns and opens the shutter on the fine CMOS sensor. 64 × 64 coarse split image is transferred via 64 × 64 precisely aligned optical fiber bundle to a photon sensor. Current signals from the photon sensor are discriminated by custom made trigger amplifiers. FPGA based processor processes 64 × 64 hit pattern and correspondent partial area of the fine image is acquired. Commissioning earth skimming tau neutrino observational search was carried out with this trigger system. In addition to the geometrical advantage of the Ashra observational site, the excellent tau shower axis measurement based on the fine imaging and the night sky background rejection based on the fine and fast imaging allow zero background tau shower search. Adoption of the optical fiber bundle and trigger LSI realized 4k channel trigger system cheaply. Detectability of tau shower is also confirmed by simultaneously observed Cherenkov air shower. Reduction of the trigger threshold appears to enhance the effective area especially in PeV tau neutrino energy region. New two dimensional trigger LSI was introduced and the trigger threshold was lowered. New calibration system of the trigger system was recently developed and introduced to the Ashra detector
Neuromorphic VLSI vision system for real-time texture segregation.
Shimonomura, Kazuhiro; Yagi, Tetsuya
2008-10-01
The visual system of the brain can perceive an external scene in real-time with extremely low power dissipation, although the response speed of an individual neuron is considerably lower than that of semiconductor devices. The neurons in the visual pathway generate their receptive fields using a parallel and hierarchical architecture. This architecture of the visual cortex is interesting and important for designing a novel perception system from an engineering perspective. The aim of this study is to develop a vision system hardware, which is designed inspired by a hierarchical visual processing in V1, for real time texture segregation. The system consists of a silicon retina, orientation chip, and field programmable gate array (FPGA) circuit. The silicon retina emulates the neural circuits of the vertebrate retina and exhibits a Laplacian-Gaussian-like receptive field. The orientation chip selectively aggregates multiple pixels of the silicon retina in order to produce Gabor-like receptive fields that are tuned to various orientations by mimicking the feed-forward model proposed by Hubel and Wiesel. The FPGA circuit receives the output of the orientation chip and computes the responses of the complex cells. Using this system, the neural images of simple cells were computed in real-time for various orientations and spatial frequencies. Using the orientation-selective outputs obtained from the multi-chip system, a real-time texture segregation was conducted based on a computational model inspired by psychophysics and neurophysiology. The texture image was filtered by the two orthogonally oriented receptive fields of the multi-chip system and the filtered images were combined to segregate the area of different texture orientation with the aid of FPGA. The present system is also useful for the investigation of the functions of the higher-order cells that can be obtained by combining the simple and complex cells.
Image processing for a tactile/vision substitution system using digital CNN.
Lin, Chien-Nan; Yu, Sung-Nien; Hu, Jin-Cheng
2006-01-01
In view of the parallel processing and easy implementation properties of CNN, we propose to use digital CNN as the image processor of a tactile/vision substitution system (TVSS). The digital CNN processor is used to execute the wavelet down-sampling filtering and the half-toning operations, aiming to extract important features from the images. A template combination method is used to embed the two image processing functions into a single CNN processor. The digital CNN processor is implemented on an intellectual property (IP) and is implemented on a XILINX VIRTEX II 2000 FPGA board. Experiments are designated to test the capability of the CNN processor in the recognition of characters and human subjects in different environments. The experiments demonstrates impressive results, which proves the proposed digital CNN processor a powerful component in the design of efficient tactile/vision substitution systems for the visually impaired people.
Laser Speckle Imaging of Cerebral Blood Flow
NASA Astrophysics Data System (ADS)
Luo, Qingming; Jiang, Chao; Li, Pengcheng; Cheng, Haiying; Wang, Zhen; Wang, Zheng; Tuchin, Valery V.
Monitoring the spatio-temporal characteristics of cerebral blood flow (CBF) is crucial for studying the normal and pathophysiologic conditions of brain metabolism. By illuminating the cortex with laser light and imaging the resulting speckle pattern, relative CBF images with tens of microns spatial and millisecond temporal resolution can be obtained. In this chapter, a laser speckle imaging (LSI) method for monitoring dynamic, high-resolution CBF is introduced. To improve the spatial resolution of current LSI, a modified LSI method is proposed. To accelerate the speed of data processing, three LSI data processing frameworks based on graphics processing unit (GPU), digital signal processor (DSP), and field-programmable gate array (FPGA) are also presented. Applications for detecting the changes in local CBF induced by sensory stimulation and thermal stimulation, the influence of a chemical agent on CBF, and the influence of acute hyperglycemia following cortical spreading depression on CBF are given.
Lifting Scheme DWT Implementation in a Wireless Vision Sensor Network
NASA Astrophysics Data System (ADS)
Ong, Jia Jan; Ang, L.-M.; Seng, K. P.
This paper presents the practical implementation of a Wireless Visual Sensor Network (WVSN) with DWT processing on the visual nodes. WVSN consists of visual nodes that capture video and transmit to the base-station without processing. Limitation of network bandwidth restrains the implementation of real time video streaming from remote visual nodes through wireless communication. Three layers of DWT filters are implemented to process the captured image from the camera. With having all the wavelet coefficients produced, it is possible just to transmit the low frequency band coefficients and obtain an approximate image at the base-station. This will reduce the amount of power required in transmission. When necessary, transmitting all the wavelet coefficients will produce the full detail of image, which is similar to the image captured at the visual nodes. The visual node combines the CMOS camera, Xilinx Spartan-3L FPGA and wireless ZigBee® network that uses the Ember EM250 chip.
Wilson, Jesse W.; Park, Jong Kang; Warren, Warren S.
2015-01-01
The lock-in amplifier is a critical component in many different types of experiments, because of its ability to reduce spurious or environmental noise components by restricting detection to a single frequency and phase. One example application is pump-probe microscopy, a multiphoton technique that leverages excited-state dynamics for imaging contrast. With this application in mind, we present here the design and implementation of a high-speed lock-in amplifier on the field-programmable gate array (FPGA) coprocessor of a data acquisition board. The most important advantage is the inherent ability to filter signals based on more complex modulation patterns. As an example, we use the flexibility of the FPGA approach to enable a novel pump-probe detection scheme based on spread-spectrum communications techniques. PMID:25832238
Wilson, Jesse W; Park, Jong Kang; Warren, Warren S; Fischer, Martin C
2015-03-01
The lock-in amplifier is a critical component in many different types of experiments, because of its ability to reduce spurious or environmental noise components by restricting detection to a single frequency and phase. One example application is pump-probe microscopy, a multiphoton technique that leverages excited-state dynamics for imaging contrast. With this application in mind, we present here the design and implementation of a high-speed lock-in amplifier on the field-programmable gate array (FPGA) coprocessor of a data acquisition board. The most important advantage is the inherent ability to filter signals based on more complex modulation patterns. As an example, we use the flexibility of the FPGA approach to enable a novel pump-probe detection scheme based on spread-spectrum communications techniques.
A FPGA implementation for linearly unmixing a hyperspectral image using OpenCL
NASA Astrophysics Data System (ADS)
Guerra, Raúl; López, Sebastián.; Sarmiento, Roberto
2017-10-01
Hyperspectral imaging systems provide images in which single pixels have information from across the electromagnetic spectrum of the scene under analysis. These systems divide the spectrum into many contiguos channels, which may be even out of the visible part of the spectra. The main advantage of the hyperspectral imaging technology is that certain objects leave unique fingerprints in the electromagnetic spectrum, known as spectral signatures, which allow to distinguish between different materials that may look like the same in a traditional RGB image. Accordingly, the most important hyperspectral imaging applications are related with distinguishing or identifying materials in a particular scene. In hyperspectral imaging applications under real-time constraints, the huge amount of information provided by the hyperspectral sensors has to be rapidly processed and analysed. For such purpose, parallel hardware devices, such as Field Programmable Gate Arrays (FPGAs) are typically used. However, developing hardware applications typically requires expertise in the specific targeted device, as well as in the tools and methodologies which can be used to perform the implementation of the desired algorithms in the specific device. In this scenario, the Open Computing Language (OpenCL) emerges as a very interesting solution in which a single high-level synthesis design language can be used to efficiently develop applications in multiple and different hardware devices. In this work, the Fast Algorithm for Linearly Unmixing Hyperspectral Images (FUN) has been implemented into a Bitware Stratix V Altera FPGA using OpenCL. The obtained results demonstrate the suitability of OpenCL as a viable design methodology for quickly creating efficient FPGAs designs for real-time hyperspectral imaging applications.
The Real Time Correction of Stereoscopic Images: From the Serial to a Parallel Treatment
NASA Astrophysics Data System (ADS)
Irki, Zohir; Devy, Michel; Achour, Karim; Azzaz, Mohamed Salah
2008-06-01
The correction of the stereoscopic images is a task which consists in replacing acquired images by other images having the same properties but which are simpler to use in the other stages of stereovision. The use of the pre-calculated tables, built during an off line calibration step, made it possible to carry out the off line stereoscopic images rectification. An improvement of the built tables made it possible to carry out the real time rectification. In this paper, we describe an improvement of the real time correction approach so it can be exploited for a possible implementation on an FPGA component. This improvement holds in account the real time aspect of the correction and the available resources that can offer the FPGA Type Stratix 1S40F780C5.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms
NASA Astrophysics Data System (ADS)
Du, Hongtao; Qi, Hairong; Szu, Harold H.
2005-03-01
Recent advances in intelligence technology have boosted the development of micro- Unmanned Air Vehicles (UAVs) including Sliver Fox, Shadow, and Scan Eagle for various surveillance and reconnaissance applications. These affordable and reusable devices have to fit a series of size, weight, and power constraints. Cameras used on such micro-UAVs are therefore mounted directly at a fixed angle without any motion-compensated gimbals. This mounting scheme has resulted in the so-called jitter effect in which jitter is defined as sub-pixel or small amplitude vibrations. The jitter blur caused by the jitter effect needs to be corrected before any other processing algorithms can be practically applied. Jitter restoration has been solved by various optimization techniques, including Wiener approximation, maximum a-posteriori probability (MAP), etc. However, these algorithms normally assume a spatial-invariant blur model that is not the case with jitter blur. Szu et al. developed a smart real-time algorithm based on auto-regression (AR) with its natural generalization of unsupervised artificial neural network (ANN) learning to achieve restoration accuracy at the sub-pixel level. This algorithm resembles the capability of the human visual system, in which an agreement between the pair of eyes indicates "signal", otherwise, the jitter noise. Using this non-statistical method, for each single pixel, a deterministic blind sources separation (BSS) process can then be carried out independently based on a deterministic minimum of the Helmholtz free energy with a generalization of Shannon's information theory applied to open dynamic systems. From a hardware implementation point of view, the process of jitter restoration of an image using Szu's algorithm can be optimized by pixel-based parallelization. In our previous work, a parallelly structured independent component analysis (ICA) algorithm has been implemented on both Field Programmable Gate Array (FPGA) and Application-Specific Integrated Circuit (ASIC) using standard-height cells. ICA is an algorithm that can solve BSS problems by carrying out the all-order statistical, decorrelation-based transforms, in which an assumption that neighborhood pixels share the same but unknown mixing matrix A is made. In this paper, we continue our investigation on the design challenges of firmware approaches to smart algorithms. We think two levels of parallelization can be explored, including pixel-based parallelization and the parallelization of the restoration algorithm performed at each pixel. This paper focuses on the latter and we use ICA as an example to explain the design and implementation methods. It is well known that the capacity constraints of single FPGA have limited the implementation of many complex algorithms including ICA. Using the reconfigurability of FPGA, we show, in this paper, how to manipulate the FPGA-based system to provide extra computing power for the parallelized ICA algorithm with limited FPGA resources. The synthesis aiming at the pilchard re-configurable FPGA platform is reported. The pilchard board is embedded with single Xilinx VIRTEX 1000E FPGA and transfers data directly to CPU on the 64-bit memory bus at the maximum frequency of 133MHz. Both the feasibility performance evaluations and experimental results validate the effectiveness and practicality of this synthesis, which can be extended to the spatial-variant jitter restoration for micro-UAV deployment.
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA
NASA Astrophysics Data System (ADS)
Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing
Large-scale matrix inversion play an important role in many applications. However to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and the maximum power-performance of 41 can be achieved compare to Pentium Dual CPU with double SSE threads.
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which lead to more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, need to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems, such as task distribution, node communication, system verification, are discussed.
Estimating the circuit delay of FPGA with a transfer learning method
NASA Astrophysics Data System (ADS)
Cui, Xiuhai; Liu, Datong; Peng, Yu; Peng, Xiyuan
2017-10-01
With the increase of FPGA (Field Programmable Gate Array, FPGA) functionality, FPGA has become an on-chip system platform. Due to increase the complexity of FPGA, estimating the delay of FPGA is a very challenge work. To solve the problems, we propose a transfer learning estimation delay (TLED) method to simplify the delay estimation of different speed grade FPGA. In fact, the same style different speed grade FPGA comes from the same process and layout. The delay has some correlation among different speed grade FPGA. Therefore, one kind of speed grade FPGA is chosen as a basic training sample in this paper. Other training samples of different speed grade can get from the basic training samples through of transfer learning. At the same time, we also select a few target FPGA samples as training samples. A general predictive model is trained by these samples. Thus one kind of estimation model is used to estimate different speed grade FPGA circuit delay. The framework of TRED includes three phases: 1) Building a basic circuit delay library which includes multipliers, adders, shifters, and so on. These circuits are used to train and build the predictive model. 2) By contrasting experiments among different algorithms, the forest random algorithm is selected to train predictive model. 3) The target circuit delay is predicted by the predictive model. The Artix-7, Kintex-7, and Virtex-7 are selected to do experiments. Each of them includes -1, -2, -2l, and -3 different speed grade. The experiments show the delay estimation accuracy score is more than 92% with the TLED method. This result shows that the TLED method is a feasible delay assessment method, especially in the high-level synthesis stage of FPGA tool, which is an efficient and effective delay assessment method.
Research of aerial imaging spectrometer data acquisition technology based on USB 3.0
NASA Astrophysics Data System (ADS)
Huang, Junze; Wang, Yueming; He, Daogang; Yu, Yanan
2016-11-01
With the emergence of UAV (unmanned aerial vehicle) platform for aerial imaging spectrometer, research of aerial imaging spectrometer DAS(data acquisition system) faces new challenges. Due to the limitation of platform and other factors, the aerial imaging spectrometer DAS requires small-light, low-cost and universal. Traditional aerial imaging spectrometer DAS system is expensive, bulky, non-universal and unsupported plug-and-play based on PCIe. So that has been unable to meet promotion and application of the aerial imaging spectrometer. In order to solve these problems, the new data acquisition scheme bases on USB3.0 interface.USB3.0 can provide guarantee of small-light, low-cost and universal relying on the forward-looking technology advantage. USB3.0 transmission theory is up to 5Gbps.And the GPIF programming interface achieves 3.2Gbps of the effective theoretical data bandwidth.USB3.0 can fully meet the needs of the aerial imaging spectrometer data transmission rate. The scheme uses the slave FIFO asynchronous data transmission mode between FPGA and USB3014 interface chip. Firstly system collects spectral data from TLK2711 of high-speed serial interface chip. Then FPGA receives data in DDR2 cache after ping-pong data processing. Finally USB3014 interface chip transmits data via automatic-dma approach and uploads to PC by USB3.0 cable. During the manufacture of aerial imaging spectrometer, the DAS can achieve image acquisition, transmission, storage and display. All functions can provide the necessary test detection for aerial imaging spectrometer. The test shows that system performs stable and no data lose. Average transmission speed and storage speed of writing SSD can stabilize at 1.28Gbps. Consequently ,this data acquisition system can meet application requirements for aerial imaging spectrometer.
A Subsystem Test Bed for Chinese Spectral Radioheliograph
NASA Astrophysics Data System (ADS)
Zhao, An; Yan, Yihua; Wang, Wei
2014-11-01
The Chinese Spectral Radioheliograph is a solar dedicated radio interferometric array that will produce high spatial resolution, high temporal resolution, and high spectral resolution images of the Sun simultaneously in decimetre and centimetre wave range. Digital processing of intermediate frequency signal is an important part in a radio telescope. This paper describes a flexible and high-speed digital down conversion system for the CSRH by applying complex mixing, parallel filtering, and extracting algorithms to process IF signal at the time of being designed and incorporates canonic-signed digit coding and bit-plane method to improve program efficiency. The DDC system is intended to be a subsystem test bed for simulation and testing for CSRH. Software algorithms for simulation and hardware language algorithms based on FPGA are written which use less hardware resources and at the same time achieve high performances such as processing high-speed data flow (1 GHz) with 10 MHz spectral resolution. An experiment with the test bed is illustrated by using geostationary satellite data observed on March 20, 2014. Due to the easy alterability of the algorithms on FPGA, the data can be recomputed with different digital signal processing algorithms for selecting optimum algorithm.
Low-cost, high-speed back-end processing system for high-frequency ultrasound B-mode imaging.
Chang, Jin Ho; Sun, Lei; Yen, Jesse T; Shung, K Kirk
2009-07-01
For real-time visualization of the mouse heart (6 to 13 beats per second), a back-end processing system involving high-speed signal processing functions to form and display images has been developed. This back-end system was designed with new signal processing algorithms to achieve a frame rate of more than 400 images per second. These algorithms were implemented in a simple and cost-effective manner with a single field-programmable gate array (FPGA) and software programs written in C++. The operating speed of the back-end system was investigated by recording the time required for transferring an image to a personal computer. Experimental results showed that the back-end system is capable of producing 433 images per second. To evaluate the imaging performance of the back-end system, a complete imaging system was built. This imaging system, which consisted of a recently reported high-speed mechanical sector scanner assembled with the back-end system, was tested by imaging a wire phantom, a pig eye (in vitro), and a mouse heart (in vivo). It was shown that this system is capable of providing high spatial resolution images with fast temporal resolution.
Low-Cost, High-Speed Back-End Processing System for High-Frequency Ultrasound B-Mode Imaging
Chang, Jin Ho; Sun, Lei; Yen, Jesse T.; Shung, K. Kirk
2009-01-01
For real-time visualization of the mouse heart (6 to 13 beats per second), a back-end processing system involving high-speed signal processing functions to form and display images has been developed. This back-end system was designed with new signal processing algorithms to achieve a frame rate of more than 400 images per second. These algorithms were implemented in a simple and cost-effective manner with a single field-programmable gate array (FPGA) and software programs written in C++. The operating speed of the back-end system was investigated by recording the time required for transferring an image to a personal computer. Experimental results showed that the back-end system is capable of producing 433 images per second. To evaluate the imaging performance of the back-end system, a complete imaging system was built. This imaging system, which consisted of a recently reported high-speed mechanical sector scanner assembled with the back-end system, was tested by imaging a wire phantom, a pig eye (in vitro), and a mouse heart (in vivo). It was shown that this system is capable of providing high spatial resolution images with fast temporal resolution. PMID:19574160
FPGA Sequencer for Radar Altimeter Applications
NASA Technical Reports Server (NTRS)
Berkun, Andrew C.; Pollard, Brian D.; Chen, Curtis W.
2011-01-01
A sequencer for a radar altimeter provides accurate attitude information for a reliable soft landing of the Mars Science Laboratory (MSL). This is a field-programmable- gate-array (FPGA)-only implementation. A table loaded externally into the FPGA controls timing, processing, and decision structures. Radar is memory-less and does not use previous acquisitions to assist in the current acquisition. All cycles complete in exactly 50 milliseconds, regardless of range or whether a target was found. A RAM (random access memory) within the FPGA holds instructions for up to 15 sets. For each set, timing is run, echoes are processed, and a comparison is made. If a target is seen, more detailed processing is run on that set. If no target is seen, the next set is tried. When all sets have been run, the FPGA terminates and waits for the next 50-millisecond event. This setup simplifies testing and improves reliability. A single vertex chip does the work of an entire assembly. Output products require minor processing to become range and velocity. This technology is the heart of the Terminal Descent Sensor, which is an integral part of the Entry Decent and Landing system for MSL. In addition, it is a strong candidate for manned landings on Mars or the Moon.
NASA Astrophysics Data System (ADS)
Yang, C.; Zheng, W.; Zhang, M.; Yuan, T.; Zhuang, G.; Pan, Y.
2016-06-01
Measurement and control of the plasma in real-time are critical for advanced Tokamak operation. It requires high speed real-time data acquisition and processing. ITER has designed the Fast Plant System Controllers (FPSC) for these purposes. At J-TEXT Tokamak, a real-time data acquisition and processing framework has been designed and implemented using standard ITER FPSC technologies. The main hardware components of this framework are an Industrial Personal Computer (IPC) with a real-time system and FlexRIO devices based on FPGA. With FlexRIO devices, data can be processed by FPGA in real-time before they are passed to the CPU. The software elements are based on a real-time framework which runs under Red Hat Enterprise Linux MRG-R and uses Experimental Physics and Industrial Control System (EPICS) for monitoring and configuring. That makes the framework accord with ITER FPSC standard technology. With this framework, any kind of data acquisition and processing FlexRIO FPGA program can be configured with a FPSC. An application using the framework has been implemented for the polarimeter-interferometer diagnostic system on J-TEXT. The application is able to extract phase-shift information from the intermediate frequency signal produced by the polarimeter-interferometer diagnostic system and calculate plasma density profile in real-time. Different algorithms implementations on the FlexRIO FPGA are compared in the paper.
Reconfigurable Gabor Filter For Fingerprint Recognition Using FPGA Verilog
NASA Astrophysics Data System (ADS)
Rosshidi, H. T.; Hadi, A. R.
2009-06-01
This paper present the implementations of Gabor filter for fingerprint recognition using Verilog HDL. This work demonstrates the application of Gabor Filter technique to enhance the fingerprint image. The incoming signal in form of image pixel will be filter out or convolute by the Gabor filter to define the ridge and valley regions of fingerprint. This is done with the application of a real time convolve based on Field Programmable Gate Array (FPGA) to perform the convolution operation. The main characteristic of the proposed approach are the usage of memory to store the incoming image pixel and the coefficient of the Gabor filter before the convolution matrix take place. The result was the signal convoluted with the Gabor coefficient.
Adapting the Reconfigurable SpaceCube Processing System for Multiple Mission Applications
NASA Technical Reports Server (NTRS)
Petrick, Dave
2014-01-01
This paper will detail the use of SpaceCube in multiple space flight applications including the Hubble Space Telescope Servicing Mission 4 (HST-SM4), an International Space Station (ISS) radiation test bed experiment, and the main avionics subsystem for two separate ISS attached payloads. Each mission has had varying degrees of data processing complexities, performance requirements, and external interfaces. We will show the methodology used to minimize the changes required to the physical hardware, FPGA designs, embedded software interfaces, and testing.This paper will summarize significant results as they apply to each mission application. In the HST-SM4 application we utilized the FPGA resources to accelerate portions of the image processing algorithms more than 25 times faster than a standard space processor in order to meet computational speed requirements. For the ISS radiation on-orbit demonstration, the main goal is to show that we can rely on the commercial FPGAs and processors in a space environment. We describe our FPGA and processor radiation mitigation strategies that have resulted in our eight PowerPCs being available and error free for more than 99.99 of the time over the period of four years. This positive data and proven reliability of the SpaceCube on ISS resulted in the Department of Defense (DoD) selecting SpaceCube, which is replacing an older and slower computer currently used on ISS, as the main avionics for two upcoming ISS experiment campaigns. This paper will show how we quickly reconfigured the SpaceCube system to meet the more stringent reliability requirements
Design of an FPGA-Based Algorithm for Real-Time Solutions of Statistics-Based Positioning
DeWitt, Don; Johnson-Williams, Nathan G.; Miyaoka, Robert S.; Li, Xiaoli; Lockhart, Cate; Lewellen, Tom K.; Hauck, Scott
2010-01-01
We report on the implementation of an algorithm and hardware platform to allow real-time processing of the statistics-based positioning (SBP) method for continuous miniature crystal element (cMiCE) detectors. The SBP method allows an intrinsic spatial resolution of ~1.6 mm FWHM to be achieved using our cMiCE design. Previous SBP solutions have required a postprocessing procedure due to the computation and memory intensive nature of SBP. This new implementation takes advantage of a combination of algebraic simplifications, conversion to fixed-point math, and a hierarchal search technique to greatly accelerate the algorithm. For the presented seven stage, 127 × 127 bin LUT implementation, these algorithm improvements result in a reduction from >7 × 106 floating-point operations per event for an exhaustive search to < 5 × 103 integer operations per event. Simulations show nearly identical FWHM positioning resolution for this accelerated SBP solution, and positioning differences of <0.1 mm from the exhaustive search solution. A pipelined field programmable gate array (FPGA) implementation of this optimized algorithm is able to process events in excess of 250 K events per second, which is greater than the maximum expected coincidence rate for an individual detector. In contrast with all detectors being processed at a centralized host, as in the current system, a separate FPGA is available at each detector, thus dividing the computational load. These methods allow SBP results to be calculated in real-time and to be presented to the image generation components in real-time. A hardware implementation has been developed using a commercially available prototype board. PMID:21197135
NASA Astrophysics Data System (ADS)
Benkrid, K.; Belkacemi, S.; Sukhsawas, S.
2005-06-01
This paper proposes an integrated framework for the high level design of high performance signal processing algorithms' implementations on FPGAs. The framework emerged from a constant need to rapidly implement increasingly complicated algorithms on FPGAs while maintaining the high performance needed in many real time digital signal processing applications. This is particularly important for application developers who often rely on iterative and interactive development methodologies. The central idea behind the proposed framework is to dynamically integrate high performance structural hardware description languages with higher level hardware languages in other to help satisfy the dual requirement of high level design and high performance implementation. The paper illustrates this by integrating two environments: Celoxica's Handel-C language, and HIDE, a structural hardware environment developed at the Queen's University of Belfast. On the one hand, Handel-C has been proven to be very useful in the rapid design and prototyping of FPGA circuits, especially control intensive ones. On the other hand, HIDE, has been used extensively, and successfully, in the generation of highly optimised parameterisable FPGA cores. In this paper, this is illustrated in the construction of a scalable and fully parameterisable core for image algebra's five core neighbourhood operations, where fully floorplanned efficient FPGA configurations, in the form of EDIF netlists, are generated automatically for instances of the core. In the proposed combined framework, highly optimised data paths are invoked dynamically from within Handel-C, and are synthesized using HIDE. Although the idea might seem simple prima facie, it could have serious implications on the design of future generations of hardware description languages.
A visually guided collision warning system with a neuromorphic architecture.
Okuno, Hirotsugu; Yagi, Tetsuya
2008-12-01
We have designed a visually guided collision warning system with a neuromorphic architecture, employing an algorithm inspired by the visual nervous system of locusts. The system was implemented with mixed analog-digital integrated circuits consisting of an analog resistive network and field-programmable gate array (FPGA) circuits. The resistive network processes the interaction between the laterally spreading excitatory and inhibitory signals instantaneously, which is essential for real-time computation of collision avoidance with a low power consumption and a compact hardware. The system responded selectively to approaching objects of simulated movie images at close range. The system was, however, confronted with serious noise problems due to the vibratory ego-motion, when it was installed in a mobile miniature car. To overcome this problem, we developed the algorithm, which is also installable in FPGA circuits, in order for the system to respond robustly during the ego-motion.
In-Storage Embedded Accelerator for Sparse Pattern Processing
2016-08-13
performance of RAM disk. Since this configuration offloads most of processing onto the FPGA, the host software consists of only two threads for...more. Fig. 13. Document Processed vs CPU Threads Note that BlueDBM efficiency comes from our in-store processing paradigm that uses the FPGA...In-Storage Embedded Accelerator for Sparse Pattern Processing Sang-Woo Jun*, Huy T. Nguyen#, Vijay Gadepally#*, and Arvind* #MIT Lincoln Laboratory
NASA Astrophysics Data System (ADS)
Xu, Zhipeng; Wei, Jun; Li, Jianwei; Zhou, Qianting
2010-11-01
An image spectrometer of a spatial remote sensing satellite requires shortwave band range from 2.1μm to 3μm which is one of the most important bands in remote sensing. We designed an infrared sub-system of the image spectrometer using a homemade 640x1 InGaAs shortwave infrared sensor working on FPA system which requires high uniformity and low level of dark current. The working temperature should be -15+/-0.2 Degree Celsius. This paper studies the model of noise for focal plane array (FPA) system, investigated the relationship with temperature and dark current noise, and adopts Incremental PID algorithm to generate PWM wave in order to control the temperature of the sensor. There are four modules compose of the FPGA module design. All of the modules are coded by VHDL and implemented in FPGA device APA300. Experiment shows the intelligent temperature control system succeeds in controlling the temperature of the sensor.
An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed
NASA Astrophysics Data System (ADS)
Kim, H.; Choi, Y.; Yang, Y.
In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of Shack-Hartmann sensor (SHS) and deformable mirror (DM), tip-tilt sensor (TTS), tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM 277 actuators with Fried geometry, requiring high speed parallel computing capability SPS. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instrument's (NI's) real time computer and five FPGA boards based on state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabView environment, providing flexibility when applying different algorithms for WFE correction. It also facilitates faster programming and debugging environment as compared to conventional ones. One of the five FPGA's is assigned to measure TTS and calculate control signals for TTM, while the rest four are used to receive SHS signal, calculate slops for each subaperture and correction signal for DM. With this parallel processing capabilities of the SPS the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
NASA Astrophysics Data System (ADS)
Faerber, Christian
2017-10-01
The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to readout the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collision for later storage. The architecture of such a computing farm, which can process this amount of data as efficiently as possible, is a challenging task and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). Also for the LHCb upgrade the usage of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link which is for this platform a QPI link. On the FPGA an accelerator is implemented. The used system is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Also another FPGA accelerator the Nallatech 385 PCIe accelerator with the same Stratix V FPGA were tested for performance. The results show that the Intel Xeon/FPGA platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community.
GPU real-time processing in NA62 trigger system
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-01-01
A commercial Graphics Processing Unit (GPU) is used to build a fast Level 0 (L0) trigger system tested parasitically with the TDAQ (Trigger and Data Acquisition systems) of the NA62 experiment at CERN. In particular, the parallel computing power of the GPU is exploited to perform real-time fitting in the Ring Imaging CHerenkov (RICH) detector. Direct GPU communication using a FPGA-based board has been used to reduce the data transmission latency. The performance of the system for multi-ring reconstrunction obtained during the NA62 physics run will be presented.
Yang, Qiang; Arathorn, David W.; Tiruveedhula, Pavan; Vogel, Curtis R.; Roorda, Austin
2010-01-01
We demonstrate an integrated FPGA solution to project highly stabilized, aberration-corrected stimuli directly onto the retina by means of real-time retinal image motion signals in combination with high speed modulation of a scanning laser. By reducing the latency between target location prediction and stimulus delivery, the stimulus location accuracy, in a subject with good fixation, is improved to 0.15 arcminutes from 0.26 arcminutes in our earlier solution. We also demonstrate the new FPGA solution is capable of delivering stabilized large stimulus pattern (up to 256x256 pixels) to the retina. PMID:20721171
Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions
Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners
1995-01-01
In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...
Implementation of a Real-Time Stacking Algorithm in a Photogrammetric Digital Camera for Uavs
NASA Astrophysics Data System (ADS)
Audi, A.; Pierrot-Deseilligny, M.; Meynard, C.; Thom, C.
2017-08-01
In the recent years, unmanned aerial vehicles (UAVs) have become an interesting tool in aerial photography and photogrammetry activities. In this context, some applications (like cloudy sky surveys, narrow-spectral imagery and night-vision imagery) need a longexposure time where one of the main problems is the motion blur caused by the erratic camera movements during image acquisition. This paper describes an automatic real-time stacking algorithm which produces a high photogrammetric quality final composite image with an equivalent long-exposure time using several images acquired with short-exposure times. Our method is inspired by feature-based image registration technique. The algorithm is implemented on the light-weight IGN camera, which has an IMU sensor and a SoC/FPGA. To obtain the correct parameters for the resampling of images, the presented method accurately estimates the geometrical relation between the first and the Nth image, taking into account the internal parameters and the distortion of the camera. Features are detected in the first image by the FAST detector, than homologous points on other images are obtained by template matching aided by the IMU sensors. The SoC/FPGA in the camera is used to speed up time-consuming parts of the algorithm such as features detection and images resampling in order to achieve a real-time performance as we want to write only the resulting final image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images, as well as block diagrams of the described architecture. The resulting stacked image obtained on real surveys doesn't seem visually impaired. Timing results demonstrate that our algorithm can be used in real-time since its processing time is less than the writing time of an image in the storage device. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real-time the gyrometers of the IMU.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.
Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir
2013-01-01
DNA sequence alignment is a cardinal process in computational biology but also is much expensive computationally when performing through traditional computational platforms like CPU. Of many off the shelf platforms explored for speeding up the computation process, FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make FPGA as the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimization over previous FPGA implementation that increases the overall speed-up achieved as well as the price incurred by the platform that was optimized. The optimizations are (1) The array of processing elements is made to run on change in input value and not on clock, so eliminating the need for tight clock synchronization, (2) the implementation is unrestrained by the size of the sequences to be aligned, (3) the waiting time required for the sequences to load to FPGA is reduced to the minimum possible and (4) an efficient method is devised to store the output matrix that make possible to save the diagonal elements to be used in next pass, in parallel with the computation of output matrix. Implemented on Spartan3 FPGA, this implementation achieved 20 times performance improvement in terms of CUPS over GPP implementation.
Experiences on developing digital down conversion algorithms using Xilinx system generator
NASA Astrophysics Data System (ADS)
Xu, Chengfa; Yuan, Yuan; Zhao, Lizhi
2013-07-01
The Digital Down Conversion (DDC) algorithm is a classical signal processing method which is widely used in radar and communication systems. In this paper, the DDC function is implemented by Xilinx System Generator tool on FPGA. System Generator is an FPGA design tool provided by Xilinx Inc and MathWorks Inc. It is very convenient for programmers to manipulate the design and debug the function, especially for the complex algorithm. Through the developing process of DDC function based on System Generator, the results show that System Generator is a very fast and efficient tool for FPGA design.
Design of video interface conversion system based on FPGA
NASA Astrophysics Data System (ADS)
Zhao, Heng; Wang, Xiang-jun
2014-11-01
This paper presents a FPGA based video interface conversion system that enables the inter-conversion between digital and analog video. Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and single-chip is used as the information interaction control unit between FPGA and PC. The system is able to encode/decode messages from the PC. Technologies including video decoding/encoding circuits, bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion and the Camera Link timing generator module of FPGA are introduced. The system converts Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which will be collected by the video processing unit with Camera Link interface. The processed video signals will then be inputted to system output board and displayed on the monitor.The current experiment shows that it can achieve high-quality video conversion with minimum board size.
A Test Methodology for Determining Space-Readiness of Xilinx SRAM-Based FPGA Designs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather M; Graham, Paul S; Morgan, Keith S
2008-01-01
Using reconfigurable, static random-access memory (SRAM) based field-programmable gate arrays (FPGAs) for space-based computation has been an exciting area of research for the past decade. Since both the circuit and the circuit's state is stored in radiation-tolerant memory, both could be alterd by the harsh space radiation environment. Both the circuit and the circuit's state can be prote cted by triple-moduler redundancy (TMR), but applying TMR to FPGA user designs is often an error-prone process. Faulty application of TMR could cause the FPGA user circuit to output incorrect data. This paper will describe a three-tiered methodology for testing FPGA usermore » designs for space-readiness. We will describe the standard approach to testing FPGA user designs using a particle accelerator, as well as two methods using fault injection and a modeling tool. While accelerator testing is the current 'gold standard' for pre-launch testing, we believe the use of fault injection and modeling tools allows for easy, cheap and uniform access for discovering errors early in the design process.« less
The RTE inversion on FPGA aboard the solar orbiter PHI instrument
NASA Astrophysics Data System (ADS)
Cobos Carrascosa, J. P.; Aparicio del Moral, B.; Ramos Mas, J. L.; Balaguer, M.; López Jiménez, A. C.; del Toro Iniesta, J. C.
2016-07-01
In this work we propose a multiprocessor architecture to reach high performance in floating point operations by using radiation tolerant FPGA devices, and under narrow time and power constraints. This architecture is used in the PHI instrument that carries out the scientific analysis aboard the ESA's Solar Orbiter mission. The proposed architecture, in a SIMD flavor, is aimed to be an accelerator within the Data Processing Unit (it is composed by a main Leon processor and two FPGAs) for carrying out the RTE inversion on board the spacecraft using a relatively slow FPGA device - Xilinx XQR4VSX55-. The proposed architecture squeezes the FPGA resources in order to reach the computational requirements and improves the ground-based system performance based on commercial CPUs regarding time and power consumption. In this work we demonstrate the feasibility of using this FPGA devices embedded in the SO/PHI instrument. With that goal in mind, we perform tests to evaluate the scientific results and to measure the processing time and power consumption for carrying out the RTE inversion.
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in details. The Appendix lists the kernel source code.« less
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System
NASA Astrophysics Data System (ADS)
Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.
2016-03-01
We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators — data rates exceeding a terabyte per second — are becoming common. The ICE system is built around a custom FPGA motherboard that makes use of an Xilinx Kintex-7 FPGA and ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array. Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio interferometers.
Embedded algorithms within an FPGA-based system to process nonlinear time series data
NASA Astrophysics Data System (ADS)
Jones, Jonathan D.; Pei, Jin-Song; Tull, Monte P.
2008-03-01
This paper presents some preliminary results of an ongoing project. A pattern classification algorithm is being developed and embedded into a Field-Programmable Gate Array (FPGA) and microprocessor-based data processing core in this project. The goal is to enable and optimize the functionality of onboard data processing of nonlinear, nonstationary data for smart wireless sensing in structural health monitoring. Compared with traditional microprocessor-based systems, fast growing FPGA technology offers a more powerful, efficient, and flexible hardware platform including on-site (field-programmable) reconfiguration capability of hardware. An existing nonlinear identification algorithm is used as the baseline in this study. The implementation within a hardware-based system is presented in this paper, detailing the design requirements, validation, tradeoffs, optimization, and challenges in embedding this algorithm. An off-the-shelf high-level abstraction tool along with the Matlab/Simulink environment is utilized to program the FPGA, rather than coding the hardware description language (HDL) manually. The implementation is validated by comparing the simulation results with those from Matlab. In particular, the Hilbert Transform is embedded into the FPGA hardware and applied to the baseline algorithm as the centerpiece in processing nonlinear time histories and extracting instantaneous features of nonstationary dynamic data. The selection of proper numerical methods for the hardware execution of the selected identification algorithm and consideration of the fixed-point representation are elaborated. Other challenges include the issues of the timing in the hardware execution cycle of the design, resource consumption, approximation accuracy, and user flexibility of input data types limited by the simplicity of this preliminary design. Future work includes making an FPGA and microprocessor operate together to embed a further developed algorithm that yields better computational and power efficiency.
Development of IR imaging system simulator
NASA Astrophysics Data System (ADS)
Xiang, Xinglang; He, Guojing; Dong, Weike; Dong, Lu
2017-02-01
To overcome the disadvantages of the tradition semi-physical simulation and injection simulation equipment in the performance evaluation of the infrared imaging system (IRIS), a low-cost and reconfigurable IRIS simulator, which can simulate the realistic physical process of infrared imaging, is proposed to test and evaluate the performance of the IRIS. According to the theoretical simulation framework and the theoretical models of the IRIS, the architecture of the IRIS simulator is constructed. The 3D scenes are generated and the infrared atmospheric transmission effects are simulated using OGRE technology in real-time on the computer. The physical effects of the IRIS are classified as the signal response characteristic, modulation transfer characteristic and noise characteristic, and they are simulated on the single-board signal processing platform based on the core processor FPGA in real-time using high-speed parallel computation method.
A novel parallel architecture for local histogram equalization
NASA Astrophysics Data System (ADS)
Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan
2005-07-01
Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
Study on nondestructive detection system based on x-ray for wire ropes conveyer belt
NASA Astrophysics Data System (ADS)
Miao, Changyun; Shi, Boya; Wan, Peng; Li, Jie
2008-03-01
A nondestructive detection system based on X-ray for wire ropes conveyer belt is designed by X-ray detection technology. In this paper X-ray detection principle is analyzed, a design scheme of the system is presented; image processing of conveyer belt is researched and image processing algorithms are given; X-ray acquisition receiving board is designed with the use of FPGA and DSP; the software of the system is programmed by C#.NET on WINXP/WIN2000 platform. The experiment indicates the system can implement remote real-time detection of wire ropes conveyer belt images, find faults and give an alarm in time. The system is direct perceived, strong real-time and high accurate. It can be used for fault detection of wire ropes conveyer belts in mines, ports, terminals and other fields.
NASA Astrophysics Data System (ADS)
Flatt, H.; Tarnowsky, A.; Blume, H.; Pirsch, P.
2010-10-01
Dieser Beitrag behandelt die Abbildung eines videobasierten Verfahrens zur echtzeitfähigen Auswertung von Winkelhistogrammen auf eine modulare Coprozessor-Architektur. Die Architektur besteht aus mehreren dedizierten Recheneinheiten zur parallelen Verarbeitung rechenintensiver Bildverarbeitungsverfahren und ist mit einem RISC-Prozessor verbunden. Eine konfigurierbare Architekturerweiterung um eine Recheneinheit zur Auswertung von Winkelhistogrammen von Objekten ermöglicht in Verbindung mit dem RISC eine echtzeitfähige Klassifikation. Je nach Konfiguration sind für die Architekturerweiterung auf einem Xilinx Virtex-5-FPGA zwischen 3300 und 12 000 Lookup-Tables erforderlich. Bei einer Taktfrequenz von 100 MHz können unabhängig von der Bildauflösung pro Einzelbild in einem 25-Hz-Videodatenstrom bis zu 100 Objekte der Größe 256×256 Pixel analysiert werden. This paper presents the mapping of a video-based approach for real-time evaluation of angular histograms on a modular coprocessor architecture. The architecture comprises several dedicated processing elements for parallel processing of computation-intensive image processing tasks and is coupled with a RISC processor. A configurable architecture extension, especially a processing element for evaluating angular histograms of objects in conjunction with a RISC processor, provides a real-time classification. Depending on the configuration of the architecture extension, 3 300 to 12 000 look-up tables are required for a Xilinx Virtex-5 FPGA implementation. Running at a clock frequency of 100 MHz and independently of the image resolution per frame, 100 objects of size 256×256 pixels are analyzed in a 25 Hz video stream by the architecture.
ERIC Educational Resources Information Center
Meyer-Base, U.; Vera, A.; Meyer-Base, A.; Pattichis, M. S.; Perry, R. J.
2010-01-01
In this paper, an innovative educational approach to introducing undergraduates to both digital signal processing (DSP) and field programmable gate array (FPGA)-based design in a one-semester course and laboratory is described. While both DSP and FPGA-based courses are currently present in different curricula, this integrated approach reduces the…
A digital frequency stabilization system of external cavity diode laser based on LabVIEW FPGA
NASA Astrophysics Data System (ADS)
Liu, Zhuohuan; Hu, Zhaohui; Qi, Lu; Wang, Tao
2015-10-01
Frequency stabilization for external cavity diode laser has played an important role in physics research. Many laser frequency locking solutions have been proposed by researchers. Traditionally, the locking process was accomplished by analog system, which has fast feedback control response speed. However, analog system is susceptible to the effects of environment. In order to improve the automation level and reliability of the frequency stabilization system, we take a grating-feedback external cavity diode laser as the laser source and set up a digital frequency stabilization system based on National Instrument's FPGA (NI FPGA). The system consists of a saturated absorption frequency stabilization of beam path, a differential photoelectric detector, a NI FPGA board and a host computer. Many functions, such as piezoelectric transducer (PZT) sweeping, atomic saturation absorption signal acquisition, signal peak identification, error signal obtaining and laser PZT voltage feedback controlling, are totally completed by LabVIEW FPGA program. Compared with the analog system, the system built by the logic gate circuits, performs stable and reliable. User interface programmed by LabVIEW is friendly. Besides, benefited from the characteristics of reconfiguration, the LabVIEW program is good at transplanting in other NI FPGA boards. Most of all, the system periodically checks the error signal. Once the abnormal error signal is detected, FPGA will restart frequency stabilization process without manual control. Through detecting the fluctuation of error signal of the atomic saturation absorption spectrum line in the frequency locking state, we can infer that the laser frequency stability can reach 1MHz.
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.« less
FPGA Accelerated Discrete-SURF for Real-Time Homography Estimation
2015-03-26
allows for the sum of a group of pixels to be found with only four memory accesses, and a single calculation...of pixels are retrieved from memory and their Hessian determinant values are compared. If the center pixel of the 3x3 block is greater than the other...process- ing on the FPGA[5][24][31]. Third, previous approaches rely heavily on external memory and other components external to the FPGA, while a logic
A novel pipeline based FPGA implementation of a genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2014-05-01
To solve problems when an analytical solution is not available, more and more bio-inspired computation techniques have been applied in the last years. Thus, an efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong has higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). GA performs several processes with the population individuals to produce a new population, like in the biological evolution. To provide a high speed solution, pipelined based FPGA hardware implementations are used, with a nstages pipeline for a n-phases genetic algorithm. The FPGA pipeline implementations are constraints by the different execution time of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique to modify the crossover step by using non identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children) not only one as in classical GA. We analyze the contribution of this method to reduce the execution time in the asynchronous and synchronous pipelines and also the possibility to a cheaper FPGA implementation, by using smaller populations. The full hardware architecture for a FPGA implementation to our target ALTERA development card is presented and analyzed.
Real Time Coincidence Processing Algorithm for Geiger Mode LADAR using FPGAs
2017-01-09
Defense for Research and Engineering. Real Time Coincidence Processing Algorithm for Geiger-Mode Ladar using FPGAs Rufo A. Antonio1, Alexandru N...the first ever Geiger-mode ladar processing al- gorithm that is suitable for implementation on an FPGA enabling real time pro- cessing and data...developed embedded FPGA real time processing algorithms that take noisy raw data, streaming at upwards of 1GB/sec, and filters the data to obtain a near- ly
Timing generator of scientific grade CCD camera and its implementation based on FPGA technology
NASA Astrophysics Data System (ADS)
Si, Guoliang; Li, Yunfei; Guo, Yongfei
2010-10-01
The Timing Generator's functions of Scientific Grade CCD Camera is briefly presented: it generates various kinds of impulse sequence for the TDI-CCD, video processor and imaging data output, acting as the synchronous coordinator for time in the CCD imaging unit. The IL-E2TDI-CCD sensor produced by DALSA Co.Ltd. use in the Scientific Grade CCD Camera. Driving schedules of IL-E2 TDI-CCD sensor has been examined in detail, the timing generator has been designed for Scientific Grade CCD Camera. FPGA is chosen as the hardware design platform, schedule generator is described with VHDL. The designed generator has been successfully fulfilled function simulation with EDA software and fitted into XC2VP20-FF1152 (a kind of FPGA products made by XILINX). The experiments indicate that the new method improves the integrated level of the system. The Scientific Grade CCD camera system's high reliability, stability and low power supply are achieved. At the same time, the period of design and experiment is sharply shorted.
FPGA-based architecture for motion recovering in real-time
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar
2002-03-01
A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher level goals in different applications. Motion estimation algorithms pose a significant computational load for the sequential processors limiting its use in practical applications. In this work we propose a hardware architecture for motion estimation in real time based on FPGA technology. The technique used for motion estimation is Optical Flow due to its accuracy, and the density of velocity estimation, however other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates near gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility to cover different applications. Some results will be presented and the real-time performance will be discussed and analyzed. The architecture is prototyped in an FPGA board with a Virtex device interfaced to a digital imager.
FPGA-based real time processing of the Plenoptic Wavefront Sensor
NASA Astrophysics Data System (ADS)
Rodríguez-Ramos, L. F.; Marín, Y.; Díaz, J. J.; Piqueras, J.; García-Jiménez, J.; Rodríguez-Ramos, J. M.
The plenoptic wavefront sensor combines measurements at pupil and image planes in order to obtain simultaneously wavefront information from different points of view, being capable to sample the volume above the telescope to extract the tomographic information of the atmospheric turbulence. The advantages of this sensor are presented elsewhere at this conference (José M. Rodríguez-Ramos et al). This paper will concentrate in the processing required for pupil plane phase recovery, and its computation in real time using FPGAs (Field Programmable Gate Arrays). This technology eases the implementation of massive parallel processing and allows tailoring the system to the requirements, maintaining flexibility, speed and cost figures.
Efficient Sample Delay Calculation for 2-D and 3-D Ultrasound Imaging.
Ibrahim, Aya; Hager, Pascal A; Bartolini, Andrea; Angiolini, Federico; Arditi, Marcel; Thiran, Jean-Philippe; Benini, Luca; De Micheli, Giovanni
2017-08-01
Ultrasound imaging is a reference medical diagnostic technique, thanks to its blend of versatility, effectiveness, and moderate cost. The core computation of all ultrasound imaging methods is based on simple formulae, except for those required to calculate acoustic propagation delays with high precision and throughput. Unfortunately, advanced three-dimensional (3-D) systems require the calculation or storage of billions of such delay values per frame, which is a challenge. In 2-D systems, this requirement can be four orders of magnitude lower, but efficient computation is still crucial in view of low-power implementations that can be battery-operated, enabling usage in numerous additional scenarios. In this paper, we explore two smart designs of the delay generation function. To quantify their hardware cost, we implement them on FPGA and study their footprint and performance. We evaluate how these architectures scale to different ultrasound applications, from a low-power 2-D system to a next-generation 3-D machine. When using numerical approximations, we demonstrate the ability to generate delay values with sufficient throughput to support 10 000-channel 3-D imaging at up to 30 fps while using 63% of a Virtex 7 FPGA, requiring 24 MB of external memory accessed at about 32 GB/s bandwidth. Alternatively, with similar FPGA occupation, we show an exact calculation method that reaches 24 fps on 1225-channel 3-D imaging and does not require external memory at all. Both designs can be scaled to use a negligible amount of resources for 2-D imaging in low-power applications and for ultrafast 2-D imaging at hundreds of frames per second.
Design and Evaluation of a Scalable and Reconfigurable Multi-Platform System for Acoustic Imaging
Izquierdo, Alberto; Villacorta, Juan José; del Val Puente, Lara; Suárez, Luis
2016-01-01
This paper proposes a scalable and multi-platform framework for signal acquisition and processing, which allows for the generation of acoustic images using planar arrays of MEMS (Micro-Electro-Mechanical Systems) microphones with low development and deployment costs. Acoustic characterization of MEMS sensors was performed, and the beam pattern of a module, based on an 8 × 8 planar array and of several clusters of modules, was obtained. A flexible framework, formed by an FPGA, an embedded processor, a computer desktop, and a graphic processing unit, was defined. The processing times of the algorithms used to obtain the acoustic images, including signal processing and wideband beamforming via FFT, were evaluated in each subsystem of the framework. Based on this analysis, three frameworks are proposed, defined by the specific subsystems used and the algorithms shared. Finally, a set of acoustic images obtained from sound reflected from a person are presented as a case study in the field of biometric identification. These results reveal the feasibility of the proposed system. PMID:27727174
Real-Time On-Board Processing Validation of MSPI Ground Camera Images
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Werne, Thomas A.; Bekker, Dmitriy L.
2010-01-01
The Earth Sciences Decadal Survey identifies a multiangle, multispectral, high-accuracy polarization imager as one requirement for the Aerosol-Cloud-Ecosystem (ACE) mission. JPL has been developing a Multiangle SpectroPolarimetric Imager (MSPI) as a candidate to fill this need. A key technology development needed for MSPI is on-board signal processing to calculate polarimetry data as imaged by each of the 9 cameras forming the instrument. With funding from NASA's Advanced Information Systems Technology (AIST) Program, JPL is solving the real-time data processing requirements to demonstrate, for the first time, how signal data at 95 Mbytes/sec over 16-channels for each of the 9 multiangle cameras in the spaceborne instrument can be reduced on-board to 0.45 Mbytes/sec. This will produce the intensity and polarization data needed to characterize aerosol and cloud microphysical properties. Using the Xilinx Virtex-5 FPGA including PowerPC440 processors we have implemented a least squares fitting algorithm that extracts intensity and polarimetric parameters in real-time, thereby substantially reducing the image data volume for spacecraft downlink without loss of science information.
Development of AN Innovative Three-Dimensional Complete Body Screening Device - 3D-CBS
NASA Astrophysics Data System (ADS)
Crosetto, D. B.
2004-07-01
This article describes an innovative technological approach that increases the efficiency with which a large number of particles (photons) can be detected and analyzed. The three-dimensional complete body screening (3D-CBS) combines the functional imaging capability of the Positron Emission Tomography (PET) with those of the anatomical imaging capability of Computed Tomography (CT). The novel techniques provide better images in a shorter time with less radiation to the patient. A primary means of accomplishing this is the use of a larger solid angle, but this requires a new electronic technique capable of handling the increased data rate. This technique, combined with an improved and simplified detector assembly, enables executing complex real-time algorithms and allows more efficiently use of economical crystals. These are the principal features of this invention. A good synergy of advanced techniques in particle detection, together with technological progress in industry (latest FPGA technology) and simple, but cost-effective ideas provide a revolutionary invention. This technology enables over 400 times PET efficiency improvement at once compared to two to three times improvements achieved every five years during the past decades. Details of the electronics are provided, including an IBM PC board with a parallel-processing architecture implemented in FPGA, enabling the execution of a programmable complex real-time algorithm for best detection of photons.
NASA Astrophysics Data System (ADS)
Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia
2017-07-01
The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many imaging systems in real time. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters on the three directions (Horizontal, Vertical and Diagonal) of the image is used to present the maximum of the existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Sparten6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
NASA Astrophysics Data System (ADS)
Hashimoto, Ryoji; Matsumura, Tomoya; Nozato, Yoshihiro; Watanabe, Kenji; Onoye, Takao
A multi-agent object attention system is proposed, which is based on biologically inspired attractor selection model. Object attention is facilitated by using a video sequence and a depth map obtained through a compound-eye image sensor TOMBO. Robustness of the multi-agent system over environmental changes is enhanced by utilizing the biological model of adaptive response by attractor selection. To implement the proposed system, an efficient VLSI architecture is employed with reducing enormous computational costs and memory accesses required for depth map processing and multi-agent attractor selection process. According to the FPGA implementation result of the proposed object attention system, which is accomplished by using 7,063 slices, 640×512 pixel input images can be processed in real-time with three agents at a rate of 9fps in 48MHz operation.
2009-09-01
suffer the power and complexity requirements of a public key system. 28 In [18], a simulation of the SHA –1 algorithm is performed on a Xilinx FPGA ... 256 bits. Thus, the construction of a hash table would need 2512 independent comparisons. It is known that hash collisions of the SHA –1 algorithm... SHA –1 algorithm for small-core FPGA design. Small-core FPGA design is the process by which a circuit is adapted to use the minimal amount of logic
Radiation Tolerant, FPGA-Based SmallSat Computer System
NASA Technical Reports Server (NTRS)
LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew
2015-01-01
The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
FASEA: A FPGA Acquisition System and Software Event Analysis for liquid scintillation counting
NASA Astrophysics Data System (ADS)
Steele, T.; Mo, L.; Bignell, L.; Smith, M.; Alexiev, D.
2009-10-01
The FASEA (FPGA based Acquisition and Software Event Analysis) system has been developed to replace the MAC3 for coincidence pulse processing. The system uses a National Instruments Virtex 5 FPGA card (PXI-7842R) for data acquisition and a purpose developed data analysis software for data analysis. Initial comparisons to the MAC3 unit are included based on measurements of 89Sr and 3H, confirming that the system is able to accurately emulate the behaviour of the MAC3 unit.
Novel windowing technique realized in FPGA for radar system
NASA Astrophysics Data System (ADS)
Escamilla-Hernandez, E.; Kravchenko, V. F.; Ponomaryov, V. I.; Ikuo, Arai
2006-02-01
To improve the weak target detection ability in radar applications a pulse compression is usually used that in the case linear FM modulation can improve the SNR. One drawback in here is that it can add the range side-lobes in reflectivity measurements. Using weighting window processing in time domain it is possible to decrease significantly the side-lobe level (SLL) and resolve small or low power targets those are masked by powerful ones. There are usually used classical windows such as Hamming, Hanning, etc. in window processing. Additionally to classical ones in this paper we also use a novel class of windows based on atomic functions (AF) theory. For comparison of simulation and experimental results we applied the standard parameters, such as coefficient of amplification, maximum level of side-lobe, width of main lobe, etc. To implement the compression-windowing model on hardware level it has been employed FPGA. This work aims at demonstrating a reasonably flexible implementation of FM-linear signal, pulse compression and windowing employing FPGA's. Classical and novel AF window technique has been investigated to reduce the SLL taking into account the noise influence and increasing the detection ability of the small or weak targets in the imaging radar. Paper presents the experimental hardware results of windowing in pulse compression radar resolving several targets for rectangular, Hamming, Kaiser-Bessel, (see manuscript for formula) functions windows. The windows created by use the atomic functions offer sufficiently better decreasing of the SLL in case of noise presence and when we move away of the main lobe in comparison with classical windows.
FPGA based hardware optimized implementation of signal processing system for LFM pulsed radar
NASA Astrophysics Data System (ADS)
Azim, Noor ul; Jun, Wang
2016-11-01
Signal processing is one of the main parts of any radar system. Different signal processing algorithms are used to extract information about different parameters like range, speed, direction etc, of a target in the field of radar communication. This paper presents LFM (Linear Frequency Modulation) pulsed radar signal processing algorithms which are used to improve target detection, range resolution and to estimate the speed of a target. Firstly, these algorithms are simulated in MATLAB to verify the concept and theory. After the conceptual verification in MATLAB, the simulation is converted into implementation on hardware using Xilinx FPGA. Chosen FPGA is Xilinx Virtex-6 (XC6LVX75T). For hardware implementation pipeline optimization is adopted and also other factors are considered for resources optimization in the process of implementation. Focusing algorithms in this work for improving target detection, range resolution and speed estimation are hardware optimized fast convolution processing based pulse compression and pulse Doppler processing.
Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam
2013-01-01
Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness. PMID:23867746
Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam
2013-07-17
Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.
Reconfigurable Processing Module
NASA Technical Reports Server (NTRS)
Somervill, Kevin; Hodson, Robert; Jones, Robert; Williams, John
2005-01-01
To accommodate a wide spectrum of applications and technologies, NASA s Exploration System's Missions Directorate has called for reconfigurable and modular technologies to support future missions to the moon and Mars. In response, Langley Research Center is leading a program entitled Reconfigurable Scaleable Computing (RSC) that is centered on the development of FPGA-based computing resources in a stackable form factor. This paper details the architecture and implementation of the Reconfigurable Processing Module (RPM), which is the key element of the RSC system. The RPM is an FPGA-based, space-qualified printed circuit assembly leveraging terrestrial/commercial design standards into the space applications domain. The form factor is similar to, and backwards compatible with, the PCI-104 standard utilizing only the PCI interface. The size is expanded to accommodate the required functionality while still better than 30% smaller than a 3U CompactPCI(TradeMark)card and without the overhead of the backplane. The architecture is built around two FPGA devices, one hosting PCI and memory interfaces, and another hosting mission application resources; both of which are connected with a high-speed data bus. The PCI interface FPGA provides access via the PCI bus to onboard SDRAM, flash PROM, and the application resources; both configuration management as well as runtime interaction. The reconfigurable FPGA, referred to as the Application FPGA - or simply "the application" - is a radiation-tolerant Xilinx Virtex-4 FX60 hosting custom application specific logic or soft microprocessor IP. The RPM implements various SEE mitigation techniques including TMR, EDAC, and configuration scrubbing of the reconfigurable FPGA. Prototype hardware and formal modeling techniques are used to explore the performability trade space. These models provide a novel way to calculate quality-of-service performance measures while simultaneously considering fault-related behavior due to SEE soft errors.
A design approach for small vision-based autonomous vehicles
NASA Astrophysics Data System (ADS)
Edwards, Barrett B.; Fife, Wade S.; Archibald, James K.; Lee, Dah-Jye; Wilde, Doran K.
2006-10-01
This paper describes the design of a small autonomous vehicle based on the Helios computing platform, a custom FPGA-based board capable of supporting on-board vision. Target applications for the Helios computing platform are those that require lightweight equipment and low power consumption. To demonstrate the capabilities of FPGAs in real-time control of autonomous vehicles, a 16 inch long R/C monster truck was outfitted with a Helios board. The platform provided by such a small vehicle is ideal for testing and development. The proof of concept application for this autonomous vehicle was a timed race through an environment with obstacles. Given the size restrictions of the vehicle and its operating environment, the only feasible on-board sensor is a small CMOS camera. The single video feed is therefore the only source of information from the surrounding environment. The image is then segmented and processed by custom logic in the FPGA that also controls direction and speed of the vehicle based on visual input.
FPGA-based real-time swept-source OCT systems for B-scan live-streaming or volumetric imaging
NASA Astrophysics Data System (ADS)
Bandi, Vinzenz; Goette, Josef; Jacomet, Marcel; von Niederhäusern, Tim; Bachmann, Adrian H.; Duelk, Marcus
2013-03-01
We have developed a Swept-Source Optical Coherence Tomography (Ss-OCT) system with high-speed, real-time signal processing on a commercially available Data-Acquisition (DAQ) board with a Field-Programmable Gate Array (FPGA). The Ss-OCT system simultaneously acquires OCT and k-clock reference signals at 500MS/s. From the k-clock signal of each A-scan we extract a remap vector for the k-space linearization of the OCT signal. The linear but oversampled interpolation is followed by a 2048-point FFT, additional auxiliary computations, and a data transfer to a host computer for real-time, live-streaming of B-scan or volumetric C-scan OCT visualization. We achieve a 100 kHz A-scan rate by parallelization of our hardware algorithms, which run on standard and affordable, commercially available DAQ boards. Our main development tool for signal analysis as well as for hardware synthesis is MATLAB® with add-on toolboxes and 3rd-party tools.
FPGA design of correlation-based pattern recognition
NASA Astrophysics Data System (ADS)
Jridi, Maher; Alfalou, Ayman
2017-05-01
Optical/Digital pattern recognition and tracking based on optical/digital correlation are a well-known techniques to detect, identify and localize a target object in a scene. Despite the limited number of treatments required by the correlation scheme, computational time and resources are relatively high. The most computational intensive treatment required by the correlation is the transformation from spatial to spectral domain and then from spectral to spatial domain. Furthermore, these transformations are used on optical/digital encryption schemes like the double random phase encryption (DRPE). In this paper, we present a VLSI architecture for the correlation scheme based on the fast Fourier transform (FFT). One interesting feature of the proposed scheme is its ability to stream image processing in order to perform correlation for video sequences. A trade-off between the hardware consumption and the robustness of the correlation can be made in order to understand the limitations of the correlation implementation in reconfigurable and portable platforms. Experimental results obtained from HDL simulations and FPGA prototype have demonstrated the advantages of the proposed scheme.
NASA Astrophysics Data System (ADS)
Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki
At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
Serial data acquisition for GEM-2D detector
NASA Astrophysics Data System (ADS)
Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech; Zienkiewicz, Pawel; Mazon, Didier; Malard, Philippe; Herrmann, Albrecht; Vezinet, Didier
2014-11-01
This article debates about data fast acquisition and histogramming method for the X-ray GEM detector. The whole process of histogramming is performed by FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in an internal FPGA memory and then sent to PC. In PC data is merged and processed by MATLAB. The structure of firmware functionality implemented in the FPGAs is described. Examples of test measurements and results are presented.
Video Guidance Sensor and Time-of-Flight Rangefinder
NASA Technical Reports Server (NTRS)
Bryan, Thomas; Howard, Richard; Bell, Joseph L.; Roe, Fred D.; Book, Michael L.
2007-01-01
A proposed video guidance sensor (VGS) would be based mostly on the hardware and software of a prior Advanced VGS (AVGS), with some additions to enable it to function as a time-of-flight rangefinder (in contradistinction to a triangulation or image-processing rangefinder). It would typically be used at distances of the order of 2 or 3 kilometers, where a typical target would appear in a video image as a single blob, making it possible to extract the direction to the target (but not the orientation of the target or the distance to the target) from a video image of light reflected from the target. As described in several previous NASA Tech Briefs articles, an AVGS system is an optoelectronic system that provides guidance for automated docking of two vehicles. In the original application, the two vehicles are spacecraft, but the basic principles of design and operation of the system are applicable to aircraft, robots, objects maneuvered by cranes, or other objects that may be required to be aligned and brought together automatically or under remote control. In a prior AVGS system of the type upon which the now-proposed VGS is largely based, the tracked vehicle is equipped with one or more passive targets that reflect light from one or more continuous-wave laser diode(s) on the tracking vehicle, a video camera on the tracking vehicle acquires images of the targets in the reflected laser light, the video images are digitized, and the image data are processed to obtain the direction to the target. The design concept of the proposed VGS does not call for any memory or processor hardware beyond that already present in the prior AVGS, but does call for some additional hardware and some additional software. It also calls for assignment of some additional tasks to two subsystems that are parts of the prior VGS: a field-programmable gate array (FPGA) that generates timing and control signals, and a digital signal processor (DSP) that processes the digitized video images. The additional timing and control signals generated by the FPGA would cause the VGS to alternate between an imaging (direction-finding) mode and a time-of-flight (range-finding mode) and would govern operation in the range-finding mode.
Rapid prototyping of update algorithm of discrete Fourier transform for real-time signal processing
NASA Astrophysics Data System (ADS)
Kakad, Yogendra P.; Sherlock, Barry G.; Chatapuram, Krishnan V.; Bishop, Stephen
2001-10-01
An algorithm is developed in the companion paper, to update the existing DFT to represent the new data series that results when a new signal point is received. Updating the DFT in this way uses less computation than directly evaluating the DFT using the FFT algorithm, This reduces the computational order by a factor of log2 N. The algorithm is able to work in the presence of data window function, for use with rectangular window, the split triangular, Hanning, Hamming, and Blackman windows. In this paper, a hardware implementation of this algorithm, using FPGA technology, is outlined. Unlike traditional fully customized VLSI circuits, FPGAs represent a technical break through in the corresponding industry. The FPGA implements thousands of gates of logic in a single IC chip and it can be programmed by users at their site in a few seconds or less depending on the type of device used. The risk is low and the development time is short. The advantages have made FPGAs very popular for rapid prototyping of algorithms in the area of digital communication, digital signal processing, and image processing. Our paper addresses the related issues of implementation using hardware descriptive language in the development of the design and the subsequent downloading on the programmable hardware chip.
Printed Circuit Board Design (PCB) with HDL Designer
NASA Technical Reports Server (NTRS)
Winkert, Thomas K.; LaFourcade, Teresa
2004-01-01
Contents include the following: PCB design with HDL designer, design process and schematic capture - symbols and diagrams: 1. Motivation: time savings, money savings, simplicity. 2. Approach: use single tool PCB for FPGA design, more FPGA designs than PCB designers. 3. Use HDL designer for schematic capture.
Seam tracking with adaptive image capture for fine-tuning of a high power laser welding process
NASA Astrophysics Data System (ADS)
Lahdenoja, Olli; Säntti, Tero; Laiho, Mika; Paasio, Ari; Poikonen, Jonne K.
2015-02-01
This paper presents the development of methods for real-time fine-tuning of a high power laser welding process of thick steel by using a compact smart camera system. When performing welding in butt-joint configuration, the laser beam's location needs to be adjusted exactly according to the seam line in order to allow the injected energy to be absorbed uniformly into both steel sheets. In this paper, on-line extraction of seam parameters is targeted by taking advantage of a combination of dynamic image intensity compression, image segmentation with a focal-plane processor ASIC, and Hough transform on an associated FPGA. Additional filtering of Hough line candidates based on temporal windowing is further applied to reduce unrealistic frame-to-frame tracking variations. The proposed methods are implemented in Matlab by using image data captured with adaptive integration time. The simulations are performed in a hardware oriented way to allow real-time implementation of the algorithms on the smart camera system.
Embedded System Implementation on FPGA System With μCLinux OS
NASA Astrophysics Data System (ADS)
Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna
2011-02-01
Embedded systems are taking on more complicated tasks as the processors involved become more powerful. The embedded systems have been widely used in many areas such as in industries, automotives, medical imaging, communications, speech recognition and computer vision. The complexity requirements in hardware and software nowadays need a flexibility system for further enhancement in any design without adding new hardware. Therefore, any changes in the design system will affect the processor that need to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using Field Programmable Gate Array (FPGA). A softcore processor, NIOS II 32-bit RISC, which is the microprocessor core was utilized in FPGA system together with the embedded operating system(OS), μClinux. In this paper, an example of web server is explained and demonstrated
High-speed line-scan camera with digital time delay integration
NASA Astrophysics Data System (ADS)
Bodenstorfer, Ernst; Fürtler, Johannes; Brodersen, Jörg; Mayer, Konrad J.; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert
2007-02-01
Dealing with high-speed image acquisition and processing systems, the speed of operation is often limited by the amount of available light, due to short exposure times. Therefore, high-speed applications often use line-scan cameras, based on charge-coupled device (CCD) sensors with time delayed integration (TDI). Synchronous shift and accumulation of photoelectric charges on the CCD chip - according to the objects' movement - result in a longer effective exposure time without introducing additional motion blur. This paper presents a high-speed color line-scan camera based on a commercial complementary metal oxide semiconductor (CMOS) area image sensor with a Bayer filter matrix and a field programmable gate array (FPGA). The camera implements a digital equivalent to the TDI effect exploited with CCD cameras. The proposed design benefits from the high frame rates of CMOS sensors and from the possibility of arbitrarily addressing the rows of the sensor's pixel array. For the digital TDI just a small number of rows are read out from the area sensor which are then shifted and accumulated according to the movement of the inspected objects. This paper gives a detailed description of the digital TDI algorithm implemented on the FPGA. Relevant aspects for the practical application are discussed and key features of the camera are listed.
Powerful conveyer belt real-time online detection system based on x-ray
NASA Astrophysics Data System (ADS)
Rong, Feng; Miao, Chang-yun; Meng, Wei
2009-07-01
The powerful conveyer belt is widely used in the mine, dock, and so on. After used for a long time, internal steel rope of the conveyor belt may fracture, rust, joints moving, and so on .This would bring potential safety problems. A kind of detection system based on x-ray is designed in this paper. Linear array detector (LDA) is used. LDA cost is low, response fast; technology mature .Output charge of LDA is transformed into differential voltage signal by amplifier. This kind of signal have great ability of anti-noise, is suitable for long-distance transmission. The processor is FPGA. A IP core control 4-channel A/D convertor, achieve parallel output data collection. Soft-core processor MicroBlaze which process tcp/ip protocol is embedded in FPGA. Sampling data are transferred to a computer via Ethernet. In order to improve the image quality, algorithm of getting rid of noise from the measurement result and taking gain normalization for pixel value is studied and designed. Experiments show that this system work well, can real-time online detect conveyor belt of width of 2.0m and speed of 5 m/s, does not affect the production. Image is clear, visual and can easily judge the situation of conveyor belt.
A high performance parallel computing architecture for robust image features
NASA Astrophysics Data System (ADS)
Zhou, Renyan; Liu, Leibo; Wei, Shaojun
2014-03-01
A design of parallel architecture for image feature detection and description is proposed in this article. The major component of this architecture is a 2D cellular network composed of simple reprogrammable processors, enabling the Hessian Blob Detector and Haar Response Calculation, which are the most computing-intensive stage of the Speeded Up Robust Features (SURF) algorithm. Combining this 2D cellular network and dedicated hardware for SURF descriptors, this architecture achieves real-time image feature detection with minimal software in the host processor. A prototype FPGA implementation of the proposed architecture achieves 1318.9 GOPS general pixel processing @ 100 MHz clock and achieves up to 118 fps in VGA (640 × 480) image feature detection. The proposed architecture is stand-alone and scalable so it is easy to be migrated into VLSI implementation.
Achieving High Performance with FPGA-Based Computing
Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug
2011-01-01
Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088
Compact and portable X-ray imager system using Medipix3RX
NASA Astrophysics Data System (ADS)
Garcia-Nathan, T. B.; Kachatkou, A.; Jiang, C.; Omar, D.; Marchal, J.; Changani, H.; Tartoni, N.; van Silfhout, R. G.
2017-10-01
In this paper the design and implementation of a novel portable X-ray imager system is presented. The design features a direct X-ray detection scheme by making use of a hybrid detector (Medipix3RX). Taking advantages of the capabilities of the Medipix3RX, like a high resolution, zero dead-time, single photon detection and charge-sharing mode, the imager has a better resolution and higher sensitivity compared to using traditional indirect detection schemes. A detailed description of the system is presented, which consists of a vacuum chamber containing the sensor, an electronic board for temperature management, conditioning and readout of the sensor and a data processing unit which also handles network connection and allow communication with clients by acting as a server. A field programmable gate array (FPGA) device is used to implement the readout protocol for the Medipix3RX, apart from the readout the FPGA can perform complex image processing functions such as feature extraction, histogram, profiling and image compression at high speeds. The temperature of the sensor is monitored and controlled through a PID algorithm making use of a Peltier cooler, improving the energy resolution and response stability of the sensor. Without implementing data compression techniques, the system is capable of transferring 680 profiles/s or 240 images/s in a continuous mode. Implementation of equalization procedures and tests on colour mode are presented in this paper. For the experimental measurements the Medipix3RX sensor was used with a Silicon layer. One of the tested applications of the system is as an X-ray beam position monitor (XBPM) device for synchrotron applications. The XBPM allows a non-destructive real time measurement of the beam position, size and intensity. A Kapton foil is placed in the beam path scattering radiation towards a pinhole camera setup that allows the sensor to obtain an image of the beam. By using profiles of the synchrotron X-ray beam, high frequency movement of the beam position can be studied, up to 340 Hz. The system is capable of realizing an independent energy measure of the beam by using the Medipix3RX variable energy threshold feature.
Heterogeneous real-time computing in radio astronomy
NASA Astrophysics Data System (ADS)
Ford, John M.; Demorest, Paul; Ransom, Scott
2010-07-01
Modern computer architectures suited for general purpose computing are often not the best choice for either I/O-bound or compute-bound problems. Sometimes the best choice is not to choose a single architecture, but to take advantage of the best characteristics of different computer architectures to solve your problems. This paper examines the tradeoffs between using computer systems based on the ubiquitous X86 Central Processing Units (CPU's), Field Programmable Gate Array (FPGA) based signal processors, and Graphical Processing Units (GPU's). We will show how a heterogeneous system can be produced that blends the best of each of these technologies into a real-time signal processing system. FPGA's tightly coupled to analog-to-digital converters connect the instrument to the telescope and supply the first level of computing to the system. These FPGA's are coupled to other FPGA's to continue to provide highly efficient processing power. Data is then packaged up and shipped over fast networks to a cluster of general purpose computers equipped with GPU's, which are used for floating-point intensive computation. Finally, the data is handled by the CPU and written to disk, or further processed. Each of the elements in the system has been chosen for its specific characteristics and the role it can play in creating a system that does the most for the least, in terms of power, space, and money.
An improved real time superresolution FPGA system
NASA Astrophysics Data System (ADS)
Lakshmi Narasimha, Pramod; Mudigoudar, Basavaraj; Yue, Zhanfeng; Topiwala, Pankaj
2009-05-01
In numerous computer vision applications, enhancing the quality and resolution of captured video can be critical. Acquired video is often grainy and low quality due to motion, transmission bottlenecks, etc. Postprocessing can enhance it. Superresolution greatly decreases camera jitter to deliver a smooth, stabilized, high quality video. In this paper, we extend previous work on a real-time superresolution application implemented in ASIC/FPGA hardware. A gradient based technique is used to register the frames at the sub-pixel level. Once we get the high resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. The algorithm was first tested in software and migrated to hardware, to achieve 320x240 -> 1280x960, about 30 fps, a stunning superresolution by 16X in total pixels. Various input parameters, such as size of input image, enlarging factor and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab Simulink as well as in FPGA hardware, which gives us a fine balance between the number of bits and performance. The proposed system is robust and highly efficient. We have shown the performance improvement of the hardware superresolution over the software version (C code).
A Timing Synchronizer System for Beam Test Setups Requiring Galvanic Isolation
NASA Astrophysics Data System (ADS)
Meder, Lukas Dominik; Emschermann, David; Frühauf, Jochen; Müller, Walter F. J.; Becker, Jürgen
2017-07-01
In beam test setups detector elements together with a readout composed of frontend electronics (FEE) and usually a layer of field-programmable gate arrays (FPGAs) are being analyzed. The FEE is in this scenario often directly connected to both the detector and the FPGA layer what in many cases requires sharing the ground potentials of these layers. This setup can become problematic if parts of the detector need to be operated at different high-voltage potentials, since all of the FPGA boards need to receive a common clock and timing reference for getting the readout synchronized. Thus, for the context of the compressed baryonic matter experiment a versatile timing synchronizer (TS) system was designed providing galvanically isolated timing distribution links over twisted-pair cables. As an electrical interface the so-called timing data processing board FPGA mezzanine card was created for being mounted onto FPGA-based advanced mezzanine cards for mTCA.4 crates. The FPGA logic of the TS system connects to this card and can be monitored and controlled through IPBus slow-control links. Evaluations show that the system is capable of stably synchronizing the FPGA boards of a beam test setup being integrated into a hierarchical TS network.
Frontend electronics for high-precision single photo-electron timing using FPGA-TDCs
NASA Astrophysics Data System (ADS)
Cardinali, M.; Dzyhgadlo, R.; Gerhardt, A.; Götzen, K.; Hohler, R.; Kalicy, G.; Kumawat, H.; Lehmann, D.; Lewandowski, B.; Patsyuk, M.; Peters, K.; Schepers, G.; Schmitt, L.; Schwarz, C.; Schwiening, J.; Traxler, M.; Ugur, C.; Zühlsdorf, M.; Dodokhov, V. Kh.; Britting, A.; Eyrich, W.; Lehmann, A.; Uhlig, F.; Düren, M.; Föhl, K.; Hayrapetyan, A.; Kröck, B.; Merle, O.; Rieke, J.; Cowie, E.; Keri, T.; Montgomery, R.; Rosner, G.; Achenbach, P.; Corell, O.; Ferretti Bondy, M. I.; Hoek, M.; Lauth, W.; Rosner, C.; Sfienti, C.; Thiel, M.; Bühler, P.; Gruber, L.; Marton, J.; Suzuki, K.
2014-12-01
The next generation of high-luminosity experiments requires excellent particle identification detectors which calls for Imaging Cherenkov counters with fast electronics to cope with the expected hit rates. A Barrel DIRC will be used in the central region of the Target Spectrometer of the planned PANDA experiment at FAIR. A single photo-electron timing resolution of better than 100 ps is required by the Barrel DIRC to disentangle the complicated patterns created on the image plane. R&D studies have been performed to provide a design based on the TRB3 readout using FPGA-TDCs with a precision better than 20 ps RMS and custom frontend electronics with high-bandwidth pre-amplifiers and fast discriminators. The discriminators also provide time-over-threshold information thus enabling walk corrections to improve the timing resolution. Two types of frontend electronics cards optimised for reading out 64-channel PHOTONIS Planacon MCP-PMTs were tested: one based on the NINO ASIC and the other, called PADIWA, on FPGA discriminators. Promising results were obtained in a full characterisation using a fast laser setup and in a test experiment at MAMI, Mainz, with a small scale DIRC prototype.
High-speed sorting of grains by color and surface texture
USDA-ARS?s Scientific Manuscript database
A high-speed, low-cost, image-based sorting device was developed to detect and separate grains with different colors/textures. The device directly combines a complementary metal–oxide–semiconductor (CMOS) color image sensor with a field-programmable gate array (FPGA) that was programmed to execute ...
Design and fabrication of vertically-integrated CMOS image sensors.
Skorka, Orit; Joseph, Dileepan
2011-01-01
Technologies to fabricate integrated circuits (IC) with 3D structures are an emerging trend in IC design. They are based on vertical stacking of active components to form heterogeneous microsystems. Electronic image sensors will benefit from these technologies because they allow increased pixel-level data processing and device optimization. This paper covers general principles in the design of vertically-integrated (VI) CMOS image sensors that are fabricated by flip-chip bonding. These sensors are composed of a CMOS die and a photodetector die. As a specific example, the paper presents a VI-CMOS image sensor that was designed at the University of Alberta, and fabricated with the help of CMC Microsystems and Micralyne Inc. To realize prototypes, CMOS dies with logarithmic active pixels were prepared in a commercial process, and photodetector dies with metal-semiconductor-metal devices were prepared in a custom process using hydrogenated amorphous silicon. The paper also describes a digital camera that was developed to test the prototype. In this camera, scenes captured by the image sensor are read using an FPGA board, and sent in real time to a PC over USB for data processing and display. Experimental results show that the VI-CMOS prototype has a higher dynamic range and a lower dark limit than conventional electronic image sensors.
Design and Fabrication of Vertically-Integrated CMOS Image Sensors
Skorka, Orit; Joseph, Dileepan
2011-01-01
Technologies to fabricate integrated circuits (IC) with 3D structures are an emerging trend in IC design. They are based on vertical stacking of active components to form heterogeneous microsystems. Electronic image sensors will benefit from these technologies because they allow increased pixel-level data processing and device optimization. This paper covers general principles in the design of vertically-integrated (VI) CMOS image sensors that are fabricated by flip-chip bonding. These sensors are composed of a CMOS die and a photodetector die. As a specific example, the paper presents a VI-CMOS image sensor that was designed at the University of Alberta, and fabricated with the help of CMC Microsystems and Micralyne Inc. To realize prototypes, CMOS dies with logarithmic active pixels were prepared in a commercial process, and photodetector dies with metal-semiconductor-metal devices were prepared in a custom process using hydrogenated amorphous silicon. The paper also describes a digital camera that was developed to test the prototype. In this camera, scenes captured by the image sensor are read using an FPGA board, and sent in real time to a PC over USB for data processing and display. Experimental results show that the VI-CMOS prototype has a higher dynamic range and a lower dark limit than conventional electronic image sensors. PMID:22163860
A low delay transmission method of multi-channel video based on FPGA
NASA Astrophysics Data System (ADS)
Fu, Weijian; Wei, Baozhi; Li, Xiaobin; Wang, Quan; Hu, Xiaofei
2018-03-01
In order to guarantee the fluency of multi-channel video transmission in video monitoring scenarios, we designed a kind of video format conversion method based on FPGA and its DMA scheduling for video data, reduces the overall video transmission delay.In order to sace the time in the conversion process, the parallel ability of FPGA is used to video format conversion. In order to improve the direct memory access (DMA) writing transmission rate of PCIe bus, a DMA scheduling method based on asynchronous command buffer is proposed. The experimental results show that this paper designs a low delay transmission method based on FPGA, which increases the DMA writing transmission rate by 34% compared with the existing method, and then the video overall delay is reduced to 23.6ms.
Design and construction of a high frame rate imaging system
NASA Astrophysics Data System (ADS)
Wang, Jing; Waugaman, John L.; Liu, Anjun; Lu, Jian-Yu
2002-05-01
A new high frame rate imaging method has been developed recently [Jian-yu Lu, ``2D and 3D high frame rate imaging with limited diffraction beams,'' IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44, 839-856 (1997)]. This method may have a clinical application for imaging of fast moving objects such as human hearts, velocity vector imaging, and low-speckle imaging. To implement the method, an imaging system has been designed. The system consists of one main printed circuit board (PCB) and 16 channel boards (each channel board contains 8 channels), in addition to a set-top box for connections to a personal computer (PC), a front panel board for user control and message display, and a power control and distribution board. The main board contains a field programmable gate array (FPGA) and controls all channels (each channel has also an FPGA). We will report the analog and digital circuit design and simulations, multiplayer PCB designs with commercial software (Protel 99), PCB signal integrity testing and system RFI/EMI shielding, and the assembly and construction of the entire system. [Work supported in part by Grant 5RO1 HL60301 from NIH.
The artificial retina processor for track reconstruction at the LHC crossing rate
Abba, A.; Bedeschi, F.; Citterio, M.; ...
2015-03-16
We present results of an R&D study for a specialized processor capable of precisely reconstructing, in pixel detectors, hundreds of charged-particle tracks from high-energy collisions at 40 MHz rate. We apply a highly parallel pattern-recognition algorithm, inspired by studies of the processing of visual images by the brain as it happens in nature, and describe in detail an efficient hardware implementation in high-speed, high-bandwidth FPGA devices. This is the first detailed demonstration of reconstruction of offline-quality tracks at 40 MHz and makes the device suitable for processing Large Hadron Collider events at the full crossing frequency.
High-performance computing in image registration
NASA Astrophysics Data System (ADS)
Zanin, Michele; Remondino, Fabio; Dalla Mura, Mauro
2012-10-01
Thanks to the recent technological advances, a large variety of image data is at our disposal with variable geometric, radiometric and temporal resolution. In many applications the processing of such images needs high performance computing techniques in order to deliver timely responses e.g. for rapid decisions or real-time actions. Thus, parallel or distributed computing methods, Digital Signal Processor (DSP) architectures, Graphical Processing Unit (GPU) programming and Field-Programmable Gate Array (FPGA) devices have become essential tools for the challenging issue of processing large amount of geo-data. The article focuses on the processing and registration of large datasets of terrestrial and aerial images for 3D reconstruction, diagnostic purposes and monitoring of the environment. For the image alignment procedure, sets of corresponding feature points need to be automatically extracted in order to successively compute the geometric transformation that aligns the data. The feature extraction and matching are ones of the most computationally demanding operations in the processing chain thus, a great degree of automation and speed is mandatory. The details of the implemented operations (named LARES) exploiting parallel architectures and GPU are thus presented. The innovative aspects of the implementation are (i) the effectiveness on a large variety of unorganized and complex datasets, (ii) capability to work with high-resolution images and (iii) the speed of the computations. Examples and comparisons with standard CPU processing are also reported and commented.
Simpler Adaptive Optics using a Single Device for Processing and Control
NASA Astrophysics Data System (ADS)
Zovaro, A.; Bennet, F.; Rye, D.; D'Orgeville, C.; Rigaut, F.; Price, I.; Ritchie, I.; Smith, C.
The management of low Earth orbit is becoming more urgent as satellite and debris densities climb, in order to avoid a Kessler syndrome. A key part of this management is to precisely measure the orbit of both active satellites and debris. The Research School of Astronomy and Astrophysics at the Australian National University have been developing an adaptive optics (AO) system to image and range orbiting objects. The AO system provides atmospheric correction for imaging and laser ranging, allowing for the detection of smaller angular targets and drastically increasing the number of detectable objects. AO systems are by nature very complex and high cost systems, often costing millions of dollars and taking years to design. It is not unusual for AO systems to comprise multiple servers, digital signal processors (DSP) and field programmable gate arrays (FPGA), with dedicated tasks such as wavefront sensor data processing or wavefront reconstruction. While this multi-platform approach has been necessary in AO systems to date due to computation and latency requirements, this may no longer be the case for those with less demanding processing needs. In recent years, large strides have been made in FPGA and microcontroller technology, with todays devices having clock speeds in excess of 200 MHz whilst using a < 5 V power supply. AO systems using a single such device for all data processing and control may present a far simpler, cheaper, smaller and more efficient solution than existing systems. A novel AO system design based around a single, low-cost controller is presented. The objective is to determine the performance which can be achieved in terms of bandwidth and correction order, with a focus on optimisation and parallelisation of AO algorithms such as wavefront measurement and reconstruction. The AO system consists of a Shack-Hartmann wavefront sensor and a deformable mirror to correct light from a 1.8 m telescope for the purpose of imaging orbiting satellites. The microcontroller or FPGA interfaces directly with the wavefront sensor detector and deformable mirror. Wavefront slopes are calculated from each detector frame and converted into actuator commands to complete the closed loop AO control system. A particular challenge of this system is to optimise the AO algorithms to achieve a high rate (> 1kHz) with low latency (< 1ms) to achieve a good AO correction. As part of the Space Environment Cooperative Research Centre (SERC) this AO system design will be used as a demonstrator for what is possible with ground based AO corrected satellite imaging and ranging systems. The ability to directly and efficiently interface the wavefront sensor and deformable mirror is an important step in reducing the cost and complexity of an AO system. It is hoped that in the future this design can be modified for use in general AO applications, such as in 1-3 m telescopes for space surveillance, or even for amateur astronomy.
High altitude subsonic parachute field programmable gate array
NASA Technical Reports Server (NTRS)
Kowalski, James E.; Gromov, Konstantin; Konefat, Edward H.
2005-01-01
This paper describes a rapid, top down requirements-driven design of an FPGA used in an Earth qualification test program for a new Mars subsonic parachute. The FPGA is used to process and control storage of telemetry data from multiple sensors throughout; launch, ascent, deployment and descent phases of the subsonic parachute test.
Spacecube: A Family of Reconfigurable Hybrid On-Board Science Data Processors
NASA Technical Reports Server (NTRS)
Flatley, Thomas P.
2015-01-01
SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, eventfeature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected eventsfeatures in the instrument data stream.
Fpga based L-band pulse doppler radar design and implementation
NASA Astrophysics Data System (ADS)
Savci, Kubilay
As its name implies RADAR (Radio Detection and Ranging) is an electromagnetic sensor used for detection and locating targets from their return signals. Radar systems propagate electromagnetic energy, from the antenna which is in part intercepted by an object. Objects reradiate a portion of energy which is captured by the radar receiver. The received signal is then processed for information extraction. Radar systems are widely used for surveillance, air security, navigation, weather hazard detection, as well as remote sensing applications. In this work, an FPGA based L-band Pulse Doppler radar prototype, which is used for target detection, localization and velocity calculation has been built and a general-purpose Pulse Doppler radar processor has been developed. This radar is a ground based stationary monopulse radar, which transmits a short pulse with a certain pulse repetition frequency (PRF). Return signals from the target are processed and information about their location and velocity is extracted. Discrete components are used for the transmitter and receiver chain. The hardware solution is based on Xilinx Virtex-6 ML605 FPGA board, responsible for the control of the radar system and the digital signal processing of the received signal, which involves Constant False Alarm Rate (CFAR) detection and Pulse Doppler processing. The algorithm is implemented in MATLAB/SIMULINK using the Xilinx System Generator for DSP tool. The field programmable gate arrays (FPGA) implementation of the radar system provides the flexibility of changing parameters such as the PRF and pulse length therefore it can be used with different radar configurations as well. A VHDL design has been developed for 1Gbit Ethernet connection to transfer digitized return signal and detection results to PC. An A-Scope software has been developed with C# programming language to display time domain radar signals and detection results on PC. Data are processed both in FPGA chip and on PC. FPGA uses fixed point arithmetic operations as it is fast and facilitates source requirement as it consumes less hardware than floating point arithmetic operations. The software uses floating point arithmetic operations, which ensure precision in processing at the expense of speed. The functionality of the radar system has been tested for experimental validation in the field with a moving car and the validation of submodules are tested with synthetic data simulated on MATLAB.
Generation of Custom DSP Transform IP Cores: Case Study Walsh-Hadamard Transform
2002-09-01
mathematics and hardware design What I know: Finite state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing...state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing Adaptive filter theory … A math guy A hardware engineer...Synthesis Technology Libary Bit-width (8) HF factor (1,2,3,6) VF factor (1,2,4, ... 32) Xilinx FPGA Place&Route Xilinx FPGA Place&Route Performance
High-definition video display based on the FPGA and THS8200
NASA Astrophysics Data System (ADS)
Qian, Jia; Sui, Xiubao
2014-11-01
This paper presents a high-definition video display solution based on the FPGA and THS8200. THS8200 is a video decoder chip launched by TI company, this chip has three 10-bit DAC channels which can capture video data in both 4:2:2 and 4:4:4 formats, and its data synchronization can be either through the dedicated synchronization signals HSYNC and VSYNC, or extracted from the embedded video stream synchronization information SAV / EAV code. In this paper, we will utilize the address and control signals generated by FPGA to access to the data-storage array, and then the FPGA generates the corresponding digital video signals YCbCr. These signals combined with the synchronization signals HSYNC and VSYNC that are also generated by the FPGA act as the input signals of THS8200. In order to meet the bandwidth requirements of the high-definition TV, we adopt video input in the 4:2:2 format over 2×10-bit interface. THS8200 is needed to be controlled by FPGA with I2C bus to set the internal registers, and as a result, it can generate the synchronous signal that is satisfied with the standard SMPTE and transfer the digital video signals YCbCr into analog video signals YPbPr. Hence, the composite analog output signals YPbPr are consist of image data signal and synchronous signal which are superimposed together inside the chip THS8200. The experimental research indicates that the method presented in this paper is a viable solution for high-definition video display, which conforms to the input requirements of the new high-definition display devices.
Integrated test system of infrared and laser data based on USB 3.0
NASA Astrophysics Data System (ADS)
Fu, Hui Quan; Tang, Lin Bo; Zhang, Chao; Zhao, Bao Jun; Li, Mao Wen
2017-07-01
Based on USB3.0, this paper presents the design method of an integrated test system for both infrared image data and laser signal data processing module. The core of the design is FPGA logic control, the design uses dual-chip DDR3 SDRAM to achieve high-speed laser data cache, and receive parallel LVDS image data through serial-to-parallel conversion chip, and it achieves high-speed data communication between the system and host computer through the USB3.0 bus. The experimental results show that the developed PC software realizes the real-time display of 14-bit LVDS original image after 14-to-8 bit conversion and JPEG2000 compressed image after decompression in software, and can realize the real-time display of the acquired laser signal data. The correctness of the test system design is verified, indicating that the interface link is normal.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Ormsby, John (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing (DSP) functions. Such capability also makes and FPGA a suitable platform for the digital implementation of closed loop controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance in a compact form-factor. Other researchers have presented the notion that a second order digital filter with proportional-integral-derivative (PID) control functionality can be implemented in an FPGA. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSF) devices. Our goal is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. Meeting our goals requires alternative compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching these goals.
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2017-01-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft’s FPGA deployment in its Bing search engine and Intel’s 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems—like Apache Spark and Hadoop—to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2016-10-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems-like Apache Spark and Hadoop-to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
HDL Based FPGA Interface Library for Data Acquisition and Multipurpose Real Time Algorithms
NASA Astrophysics Data System (ADS)
Fernandes, Ana M.; Pereira, R. C.; Sousa, J.; Batista, A. J. N.; Combo, A.; Carvalho, B. B.; Correia, C. M. B. A.; Varandas, C. A. F.
2011-08-01
The inherent parallelism of the logic resources, the flexibility in its configuration and the performance at high processing frequencies makes the field programmable gate array (FPGA) the most suitable device to be used both for real time algorithm processing and data transfer in instrumentation modules. Moreover, the reconfigurability of these FPGA based modules enables exploiting different applications on the same module. When using a reconfigurable module for various applications, the availability of a common interface library for easier implementation of the algorithms on the FPGA leads to more efficient development. The FPGA configuration is usually specified in a hardware description language (HDL) or other higher level descriptive language. The critical paths, such as the management of internal hardware clocks that require deep knowledge of the module behavior shall be implemented in HDL to optimize the timing constraints. The common interface library should include these critical paths, freeing the application designer from hardware complexity and able to choose any of the available high-level abstraction languages for the algorithm implementation. With this purpose a modular Verilog code was developed for the Virtex 4 FPGA of the in-house Transient Recorder and Processor (TRP) hardware module, based on the Advanced Telecommunications Computing Architecture (ATCA), with eight channels sampling at up to 400 MSamples/s (MSPS). The TRP was designed to perform real time Pulse Height Analysis (PHA), Pulse Shape Discrimination (PSD) and Pile-Up Rejection (PUR) algorithms at a high count rate (few Mevent/s). A brief description of this modular code is presented and examples of its use as an interface with end user algorithms, including a PHA with PUR, are described.
Hardware Implementation of Lossless Adaptive Compression of Data From a Hyperspectral Imager
NASA Technical Reports Server (NTRS)
Keymeulen, Didlier; Aranki, Nazeeh I.; Klimesh, Matthew A.; Bakhshi, Alireza
2012-01-01
Efficient onboard data compression can reduce the data volume from hyperspectral imagers on NASA and DoD spacecraft in order to return as much imagery as possible through constrained downlink channels. Lossless compression is important for signature extraction, object recognition, and feature classification capabilities. To provide onboard data compression, a hardware implementation of a lossless hyperspectral compression algorithm was developed using a field programmable gate array (FPGA). The underlying algorithm is the Fast Lossless (FL) compression algorithm reported in Fast Lossless Compression of Multispectral- Image Data (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), p. 26 with the modification reported in Lossless, Multi-Spectral Data Comressor for Improved Compression for Pushbroom-Type Instruments (NPO-45473), NASA Tech Briefs, Vol. 32, No. 7 (July 2008) p. 63, which provides improved compression performance for data from pushbroom-type imagers. An FPGA implementation of the unmodified FL algorithm was previously developed and reported in Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System (NPO-46867), NASA Tech Briefs, Vol. 36, No. 5 (May 2012) p. 42. The essence of the FL algorithm is adaptive linear predictive compression using the sign algorithm for filter adaption. The FL compressor achieves a combination of low complexity and compression effectiveness that exceeds that of stateof- the-art techniques currently in use. The modification changes the predictor structure to tolerate differences in sensitivity of different detector elements, as occurs in pushbroom-type imagers, which are suitable for spacecraft use. The FPGA implementation offers a low-cost, flexible solution compared to traditional ASIC (application specific integrated circuit) and can be integrated as an intellectual property (IP) for part of, e.g., a design that manages the instrument interface. The FPGA implementation was benchmarked on the Xilinx Virtex IV LX25 device, and ported to a Xilinx prototype board. The current implementation has a critical path of 29.5 ns, which dictated a clock speed of 33 MHz. The critical path delay is end-to-end measurement between the uncompressed input data and the output compression data stream. The implementation compresses one sample every clock cycle, which results in a speed of 33 Msample/s. The implementation has a rather low device use of the Xilinx Virtex IV LX25, making the total power consumption of the implementation about 1.27 W.
Super-Resolution in Plenoptic Cameras Using FPGAs
Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime
2014-01-01
Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable graphic array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes. PMID:24841246
Super-resolution in plenoptic cameras using FPGAs.
Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime
2014-05-16
Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable graphic array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes.
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Cotta Ramusino, A.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Gianoli, A.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-03-01
This project aims to exploit the parallel computing power of a commercial Graphics Processing Unit (GPU) to implement fast pattern matching in the Ring Imaging Cherenkov (RICH) detector for the level 0 (L0) trigger of the NA62 experiment. In this approach, the ring-fitting algorithm is seedless, being fed with raw RICH data, with no previous information on the ring position from other detectors. Moreover, since the L0 trigger is provided with a more elaborated information than a simple multiplicity number, it results in a higher selection power. Two methods have been studied in order to reduce the data transfer latency from the readout boards of the detector to the GPU, i.e., the use of a dedicated NIC device driver with very low latency and a direct data transfer protocol from a custom FPGA-based NIC to the GPU. The performance of the system, developed through the FPGA approach, for multi-ring Cherenkov online reconstruction obtained during the NA62 physics runs is presented.
Counting neutrons with a commercial S-CMOS camera
NASA Astrophysics Data System (ADS)
Patrick, Van Esch; Paolo, Mutti; Emilio, Ruiz-Martinez; Estefania, Abad Garcia; Marita, Mosconi; Jon, Ortega
2018-01-01
It is possible to detect individual flashes from thermal neutron impacts in a ZnS scintillator using a CMOS camera looking at the scintillator screen, and off line image processing. Some preliminary results indicated that the efficiency of recognition could be improved by optimizing the light collection and the image processing. We will report on this ongoing work which is a result from the collaboration between ESS Bilbao and the ILL. The main progress to be reported is situated on the level of the on-line treatment of the imaging data. If this technology is to work on a genuine scientific instrument, it is necessary that all the processing happens on line, to avoid the accumulation of large amounts of image data to be analyzed off line. An FPGA-based real-time full-deca mode VME-compatible CameraLink board has been developed at the SCI of the ILL, which is able to manage the data flow from the camera and convert it in a reasonable "neutron impact" data flow like from a usual neutron counting detector. The main challenge of the endeavor is the optical light collection from the scintillator. While the light yield of a ZnS scintillator is a priori rather important, the amount of light collected with a photographic objective is small. Different scintillators and different light collection techniques have been experimented with and results will be shown for different setups improving upon the light recuperation on the camera sensor. Improvements on the algorithm side will also be presented. The algorithms have to be at the same time efficient in their recognition of neutron signals, in their rejection of noise signals (internal and external to the camera) but also have to be simple enough to be easily implemented in the FPGA. The path from the idea of detecting individual neutron impacts with a CMOS camera to a practical working instrument detector is challenging, and in this paper we will give an overview of the part of the road that has already been walked.
Separating higher-order nonlinearities in transient absorption microscopy
NASA Astrophysics Data System (ADS)
Wilson, Jesse W.; Anderson, Miguel; Park, Jong Kang; Fischer, Martin C.; Warren, Warren S.
2015-08-01
The transient absorption response of melanin is a promising optically-accessible biomarker for distinguishing malignant melanoma from benign pigmented lesions, as demonstrated by earlier experiments on thin sections from biopsied tissue. The technique has also been demonstrated in vivo, but the higher optical intensity required for detecting these signals from backscattered light introduces higher-order nonlinearities in the transient response of melanin. These components that are higher than linear with respect to the pump or the probe introduce intensity-dependent changes to the overall response that complicate data analysis. However, our data also suggest these nonlinearities might be advantageous to in vivo imaging, in that different types of melanins have different nonlinear responses. Therefore, methods to separate linear from nonlinear components in transient absorption measurements might provide additional information to aid in the diagnosis of melanoma. We will discuss numerical methods for analyzing the various nonlinear contributions to pump-probe signals, with the ultimate objective of real time analysis using digital signal processing techniques. To that end, we have replaced the lock-in amplifier in our pump-probe microscope with a high-speed data acquisition board, and reprogrammed the coprocessor field-programmable gate array (FPGA) to perform lock-in detection. The FPGA lock-in offers better performance than the commercial instrument, in terms of both signal to noise ratio and speed. In addition, the flexibility of the digital signal processing approach enables demodulation of more complicated waveforms, such as spread-spectrum sequences, which has the potential to accelerate microscopy methods that rely on slow relaxation phenomena, such as photo-thermal and phosphorescence lifetime imaging.
An FPGA-based bolometer for the MAST-U Super-X divertor.
Lovell, Jack; Naylor, Graham; Field, Anthony; Drewelow, Peter; Sharples, Ray
2016-11-01
A new resistive bolometer system has been developed for MAST-Upgrade. It will measure radiated power in the new Super-X divertor, with millisecond time resolution, along 16 vertical and 16 horizontal lines of sight. The system uses a Xilinx Zynq-7000 series Field-Programmable Gate Array (FPGA) in the D-TACQ ACQ2106 carrier to perform real time data acquisition and signal processing. The FPGA enables AC-synchronous detection using high performance digital filtering to achieve a high signal-to-noise ratio and will be able to output processed data in real time with millisecond latency. The system has been installed on 8 previously unused channels of the JET vertical bolometer system. Initial results suggest good agreement with data from existing vertical channels but with higher bandwidth and signal-to-noise ratio.
Quick acquisition and recognition method for the beacon in deep space optical communications.
Wang, Qiang; Liu, Yuefei; Ma, Jing; Tan, Liying; Yu, Siyuan; Li, Changjiang
2016-12-01
In deep space optical communications, it is very difficult to acquire the beacon given the long communication distance. Acquisition efficiency is essential for establishing and holding the optical communication link. Here we proposed a quick acquisition and recognition method for the beacon in deep optical communications based on the characteristics of the deep optical link. To identify the beacon from the background light efficiently, we utilized the maximum similarity between the collecting image and the reference image for accurate recognition and acquisition of the beacon in the area of uncertainty. First, the collecting image and the reference image were processed by Fourier-Mellin. Second, image sampling and image matching were applied for the accurate positioning of the beacon. Finally, the field programmable gate array (FPGA)-based system was used to verify and realize this method. The experimental results showed that the acquisition time for the beacon was as fast as 8.1s. Future application of this method in the system design of deep optical communication will be beneficial.
FPGA based charge fast histogramming for GEM detector
NASA Astrophysics Data System (ADS)
Poźniak, Krzysztof T.; Byszuk, A.; Chernyshova, M.; Cieszewski, R.; Czarski, T.; Dominik, W.; Jakubowska, K.; Kasprowicz, G.; Rzadkiewicz, J.; Scholz, M.; Zabolotny, W.
2013-10-01
This article presents a fast charge histogramming method for the position sensitive X-ray GEM detector. The energy resolved measurements are carried out simultaneously for 256 channels of the GEM detector. The whole process of histogramming is performed in 21 FPGA chips (Spartan-6 series from Xilinx) . The results of the histogramming process are stored in an external DDR3 memory. The structure of an electronic measuring equipment and a firmware functionality implemented in the FPGAs is described. Examples of test measurements are presented.
Iris unwrapping using the Bresenham circle algorithm for real-time iris recognition
NASA Astrophysics Data System (ADS)
Carothers, Matthew T.; Ngo, Hau T.; Rakvic, Ryan N.; Broussard, Randy P.
2015-02-01
An efficient parallel architecture design for the iris unwrapping process in a real-time iris recognition system using the Bresenham Circle Algorithm is presented in this paper. Based on the characteristics of the model parameters this algorithm was chosen over the widely used polar conversion technique as the iris unwrapping model. The architecture design is parallelized to increase the throughput of the system and is suitable for processing an inputted image size of 320 × 240 pixels in real-time using Field Programmable Gate Array (FPGA) technology. Quartus software is used to implement, verify, and analyze the design's performance using the VHSIC Hardware Description Language. The system's predicted processing time is faster than the modern iris unwrapping technique used today∗.
Smart Cameras for Remote Science Survey
NASA Technical Reports Server (NTRS)
Thompson, David R.; Abbey, William; Allwood, Abigail; Bekker, Dmitriy; Bornstein, Benjamin; Cabrol, Nathalie A.; Castano, Rebecca; Estlin, Tara; Fuchs, Thomas; Wagstaff, Kiri L.
2012-01-01
Communication with remote exploration spacecraft is often intermittent and bandwidth is highly constrained. Future missions could use onboard science data understanding to prioritize downlink of critical features [1], draft summary maps of visited terrain [2], or identify targets of opportunity for followup measurements [3]. We describe a generic approach to classify geologic surfaces for autonomous science operations, suitable for parallelized implementations in FPGA hardware. We map these surfaces with texture channels - distinctive numerical signatures that differentiate properties such as roughness, pavement coatings, regolith characteristics, sedimentary fabrics and differential outcrop weathering. This work describes our basic image analysis approach and reports an initial performance evaluation using surface images from the Mars Exploration Rovers. Future work will incorporate these methods into camera hardware for real-time processing.
High-performance electronic image stabilisation for shift and rotation correction
NASA Astrophysics Data System (ADS)
Parker, Steve C. J.; Hickman, D. L.; Wu, F.
2014-06-01
A novel low size, weight and power (SWaP) video stabiliser called HALO™ is presented that uses a SoC to combine the high processing bandwidth of an FPGA, with the signal processing flexibility of a CPU. An image based architecture is presented that can adapt the tiling of frames to cope with changing scene dynamics. A real-time implementation is then discussed that can generate several hundred optical flow vectors per video frame, to accurately calculate the unwanted rigid body translation and rotation of camera shake. The performance of the HALO™ stabiliser is comprehensively benchmarked against the respected Deshaker 3.0 off-line stabiliser plugin to VirtualDub. Eight different videos are used for benchmarking, simulating: battlefield, surveillance, security and low-level flight applications in both visible and IR wavebands. The results show that HALO™ rivals the performance of Deshaker within its operating envelope. Furthermore, HALO™ may be easily reconfigured to adapt to changing operating conditions or requirements; and can be used to host other video processing functionality like image distortion correction, fusion and contrast enhancement.
Parallel Fixed Point Implementation of a Radial Basis Function Network in an FPGA
de Souza, Alisson C. D.; Fernandes, Marcelo A. C.
2014-01-01
This paper proposes a parallel fixed point radial basis function (RBF) artificial neural network (ANN), implemented in a field programmable gate array (FPGA) trained online with a least mean square (LMS) algorithm. The processing time and occupied area were analyzed for various fixed point formats. The problems of precision of the ANN response for nonlinear classification using the XOR gate and interpolation using the sine function were also analyzed in a hardware implementation. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA. PMID:25268918
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Morookian, John M.; Monacos, Steve P.; Lam, Raymond K.; Lebaw, C.; Bond, A.
2004-04-01
Eyetracking is one of the latest technologies that has shown potential in several areas including human-computer interaction for people with and without disabilities, and for noninvasive monitoring, detection, and even diagnosis of physiological and neurological problems in individuals. Current non-invasive eyetracking methods achieve a 30 Hz rate with possibly low accuracy in gaze estimation, that is insufficient for many applications. We propose a new non-invasive visual eyetracking system that is capable of operating at speeds as high as 6-12 KHz. A new CCD video camera and hardware architecture is used, and a novel fast image processing algorithm leverages specific features of the input CCD camera to yield a real-time eyetracking system. A field programmable gate array (FPGA) is used to control the CCD camera and execute the image processing operations. Initial results show the excellent performance of our system under severe head motion and low contrast conditions.
Real-time depth processing for embedded platforms
NASA Astrophysics Data System (ADS)
Rahnama, Oscar; Makarov, Aleksej; Torr, Philip
2017-05-01
Obtaining depth information of a scene is an important requirement in many computer-vision and robotics applications. For embedded platforms, passive stereo systems have many advantages over their active counterparts (i.e. LiDAR, Infrared). They are power efficient, cheap, robust to lighting conditions and inherently synchronized to the RGB images of the scene. However, stereo depth estimation is a computationally expensive task that operates over large amounts of data. For embedded applications which are often constrained by power consumption, obtaining accurate results in real-time is a challenge. We demonstrate a computationally and memory efficient implementation of a stereo block-matching algorithm in FPGA. The computational core achieves a throughput of 577 fps at standard VGA resolution whilst consuming less than 3 Watts of power. The data is processed using an in-stream approach that minimizes memory-access bottlenecks and best matches the raster scan readout of modern digital image sensors.
Automated Design of Board and MCM Level Digital Systems.
1997-10-01
Partitioning for Multicomponent Synthesis 159 Appendix K: Resource Constrained RTL Partitioning for Synthesis of Multi- FPGA Designs 169 Appendix L...digital signal processing) ar- chitectures. These target architectures, illustrated in Figure 1, can contain application-specific ASICS, FPGAs ...synthesis tools for ASIC, FPGA and MCM synthesis (Figure 8). Multicomponent Partitioning Engine The par- titioning engine is a hierarchical partitioning
V&V Plan for FPGA-based ESF-CCS Using System Engineering Approach.
NASA Astrophysics Data System (ADS)
Maerani, Restu; Mayaka, Joyce; El Akrat, Mohamed; Cheon, Jung Jae
2018-02-01
Instrumentation and Control (I&C) systems play an important role in maintaining the safety of Nuclear Power Plant (NPP) operation. However, most current I&C safety systems are based on Programmable Logic Controller (PLC) hardware, which is difficult to verify and validate, and is susceptible to software common cause failure. Therefore, a plan for the replacement of the PLC-based safety systems, such as the Engineered Safety Feature - Component Control System (ESF-CCS), with Field Programmable Gate Arrays (FPGA) is needed. By using a systems engineering approach, which ensures traceability in every phase of the life cycle, from system requirements, design implementation to verification and validation, the system development is guaranteed to be in line with the regulatory requirements. The Verification process will ensure that the customer and stakeholder’s needs are satisfied in a high quality, trustworthy, cost efficient and schedule compliant manner throughout a system’s entire life cycle. The benefit of the V&V plan is to ensure that the FPGA based ESF-CCS is correctly built, and to ensure that the measurement of performance indicators has positive feedback that “do we do the right thing” during the re-engineering process of the FPGA based ESF-CCS.
NASA Astrophysics Data System (ADS)
Villar, Xabier; Piso, Daniel; Bruguera, Javier D.
2014-02-01
This paper presents an FPGA implementation of an algorithm, previously published, for the the reconstruction of cosmic rays' trajectories and the determination of the time of arrival and velocity of the particles. The accuracy and precision issues of the algorithm have been analyzed to propose a suitable implementation. Thus, a 32-bit fixed-point format has been used for the representation of the data values. Moreover, the dependencies among the different operations have been taken into account to obtain a highly parallel and efficient hardware implementation. The final hardware architecture requires 18 cycles to process every particle, and has been exhaustively simulated to validate all the design decisions. The architecture has been mapped over different commercial FPGAs, with a frequency of operation ranging from 300 MHz to 1.3 GHz, depending on the FPGA being used. Consequently, the number of particle trajectories processed per second is between 16 million and 72 million. The high number of particle trajectories calculated per second shows that the proposed FPGA implementation might be used also in high rate environments such as those found in particle and nuclear physics experiments.
Rapid Onboard Data Product Generation with Multicore Processors and FPGA
NASA Astrophysics Data System (ADS)
Mandl, D.; Sohlberg, R. A.; Cappelaere, P. G.; Frye, S. W.; Ly, V.; Handy, M.; Ambrosia, V. G.; Sullivan, D. V.; Bland, G.; Pastor, E.; Crago, S.; Flatley, C.; Shah, N.; Bronston, J.; Creech, T.
2012-12-01
The Intelligent Payload Module (IPM) is an experimental testbed with multicore processors and Field Programmable Gate Array (FPGA). This effort is being funded by the NASA Earth Science Technology Office as part of an Advanced Information Systems Technology (AIST) 2011 research grant to investigate the use of high performance onboard processing to create an onboard data processing pipeline that can rapidly process a subset of onboard imaging spectrometer data (1) through radiance to reflectance conversion (2) atmospheric correction (3) geolocation and co-registration and (4) level 2 data product generation. The requirements are driven by the mission concept for the HyspIRI NASA Decadal mission, although other NASA Decadal missions could use the same concept. The system is being set up to make use of the same ground and flight software being used by other satellites at NASA/GSFC. Furthermore, a Web Coverage Processing Service (WCPS) is installed as part of the flight software which enables a user on the ground to specify the desired algorithm to run onboard against the data in realtime. Benchmark demonstrations are being run and will be run through the three year effort on various platforms including a helicopter and various airplane platforms with various instruments to demonstrate various configurations that would be compatible with the HyspIRI mission and other similar missions. This presentation will lay out the demonstrations conducted to date along with any benchmark performance metrics and future demonstration efforts and objectives.Initial IPM Test Box
Spaceborne synthetic aperture radar signal processing using FPGAs
NASA Astrophysics Data System (ADS)
Sugimoto, Yohei; Ozawa, Satoru; Inaba, Noriyasu
2017-10-01
Synthetic Aperture Radar (SAR) imagery requires image reproduction through successive signal processing of received data before browsing images and extracting information. The received signal data records of the ALOS-2/PALSAR-2 are stored in the onboard mission data storage and transmitted to the ground. In order to compensate the storage usage and the capacity of transmission data through the mission date communication networks, the operation duty of the PALSAR-2 is limited. This balance strongly relies on the network availability. The observation operations of the present spaceborne SAR systems are rigorously planned by simulating the mission data balance, given conflicting user demands. This problem should be solved such that we do not have to compromise the operations and the potential of the next-generation spaceborne SAR systems. One of the solutions is to compress the SAR data through onboard image reproduction and information extraction from the reproduced images. This is also beneficial for fast delivery of information products and event-driven observations by constellation. The Emergence Studio (Sōhatsu kōbō in Japanese) with Japan Aerospace Exploration Agency is developing evaluation models of FPGA-based signal processing system for onboard SAR image reproduction. The model, namely, "Fast L1 Processor (FLIP)" developed in 2016 can reproduce a 10m-resolution single look complex image (Level 1.1) from ALOS/PALSAR raw signal data (Level 1.0). The processing speed of the FLIP at 200 MHz results in twice faster than CPU-based computing at 3.7 GHz. The image processed by the FLIP is no way inferior to the image processed with 32-bit computing in MATLAB.
Evaluation of the FIR Example using Xilinx Vivado High-Level Synthesis Compiler
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Finkel, Hal; Yoshii, Kazutomo
Compared to central processing units (CPUs) and graphics processing units (GPUs), field programmable gate arrays (FPGAs) have major advantages in reconfigurability and performance achieved per watt. This development flow has been augmented with high-level synthesis (HLS) flow that can convert programs written in a high-level programming language to Hardware Description Language (HDL). Using high-level programming languages such as C, C++, and OpenCL for FPGA-based development could allow software developers, who have little FPGA knowledge, to take advantage of the FPGA-based application acceleration. This improves developer productivity and makes the FPGA-based acceleration accessible to hardware and software developers. Xilinx Vivado HLSmore » compiler is a high-level synthesis tool that enables C, C++ and System C specification to be directly targeted into Xilinx FPGAs without the need to create RTL manually. The white paper [1] published recently by Xilinx uses a finite impulse response (FIR) example to demonstrate the variable-precision features in the Vivado HLS compiler and the resource and power benefits of converting floating point to fixed point for a design. To get a better understanding of variable-precision features in terms of resource usage and performance, this report presents the experimental results of evaluating the FIR example using Vivado HLS 2017.1 and a Kintex Ultrascale FPGA. In addition, we evaluated the half-precision floating-point data type against the double-precision and single-precision data type and present the detailed results.« less
Event-Driven Random-Access-Windowing CCD Imaging System
NASA Technical Reports Server (NTRS)
Monacos, Steve; Portillo, Angel; Ortiz, Gerardo; Alexander, James; Lam, Raymond; Liu, William
2004-01-01
A charge-coupled-device (CCD) based high-speed imaging system, called a realtime, event-driven (RARE) camera, is undergoing development. This camera is capable of readout from multiple subwindows [also known as regions of interest (ROIs)] within the CCD field of view. Both the sizes and the locations of the ROIs can be controlled in real time and can be changed at the camera frame rate. The predecessor of this camera was described in High-Frame-Rate CCD Camera Having Subwindow Capability (NPO- 30564) NASA Tech Briefs, Vol. 26, No. 12 (December 2002), page 26. The architecture of the prior camera requires tight coupling between camera control logic and an external host computer that provides commands for camera operation and processes pixels from the camera. This tight coupling limits the attainable frame rate and functionality of the camera. The design of the present camera loosens this coupling to increase the achievable frame rate and functionality. From a host computer perspective, the readout operation in the prior camera was defined on a per-line basis; in this camera, it is defined on a per-ROI basis. In addition, the camera includes internal timing circuitry. This combination of features enables real-time, event-driven operation for adaptive control of the camera. Hence, this camera is well suited for applications requiring autonomous control of multiple ROIs to track multiple targets moving throughout the CCD field of view. Additionally, by eliminating the need for control intervention by the host computer during the pixel readout, the present design reduces ROI-readout times to attain higher frame rates. This camera (see figure) includes an imager card consisting of a commercial CCD imager and two signal-processor chips. The imager card converts transistor/ transistor-logic (TTL)-level signals from a field programmable gate array (FPGA) controller card. These signals are transmitted to the imager card via a low-voltage differential signaling (LVDS) cable assembly. The FPGA controller card is connected to the host computer via a standard peripheral component interface (PCI).
Independent component analysis algorithm FPGA design to perform real-time blind source separation
NASA Astrophysics Data System (ADS)
Meyer-Baese, Uwe; Odom, Crispin; Botella, Guillermo; Meyer-Baese, Anke
2015-05-01
The conditions that arise in the Cocktail Party Problem prevail across many fields creating a need for of Blind Source Separation. The need for BSS has become prevalent in several fields of work. These fields include array processing, communications, medical signal processing, and speech processing, wireless communication, audio, acoustics and biomedical engineering. The concept of the cocktail party problem and BSS led to the development of Independent Component Analysis (ICA) algorithms. ICA proves useful for applications needing real time signal processing. The goal of this research was to perform an extensive study on ability and efficiency of Independent Component Analysis algorithms to perform blind source separation on mixed signals in software and implementation in hardware with a Field Programmable Gate Array (FPGA). The Algebraic ICA (A-ICA), Fast ICA, and Equivariant Adaptive Separation via Independence (EASI) ICA were examined and compared. The best algorithm required the least complexity and fewest resources while effectively separating mixed sources. The best algorithm was the EASI algorithm. The EASI ICA was implemented on hardware with Field Programmable Gate Arrays (FPGA) to perform and analyze its performance in real time.
FPGA Online Tracking Algorithm for the PANDA Straw Tube Tracker
NASA Astrophysics Data System (ADS)
Liang, Yutie; Ye, Hua; Galuska, Martin J.; Gessler, Thomas; Kuhn, Wolfgang; Lange, Jens Soren; Wagner, Milan N.; Liu, Zhen'an; Zhao, Jingzhou
2017-06-01
A novel FPGA based online tracking algorithm for helix track reconstruction in a solenoidal field, developed for the PANDA spectrometer, is described. Employing the Straw Tube Tracker detector with 4636 straw tubes, the algorithm includes a complex track finder, and a track fitter. Implemented in VHDL, the algorithm is tested on a Xilinx Virtex-4 FX60 FPGA chip with different types of events, at different event rates. A processing time of 7 $\\mu$s per event for an average of 6 charged tracks is obtained. The momentum resolution is about 3\\% (4\\%) for $p_t$ ($p_z$) at 1 GeV/c. Comparing to the algorithm running on a CPU chip (single core Intel Xeon E5520 at 2.26 GHz), an improvement of 3 orders of magnitude in processing time is obtained. The algorithm can handle severe overlapping of events which are typical for interaction rates above 10 MHz.
A new FPGA architecture suitable for DSP applications
NASA Astrophysics Data System (ADS)
Liyun, Wang; Jinmei, Lai; Jiarong, Tong; Pushan, Tang; Xing, Chen; Xueyan, Duan; Liguang, Chen; Jian, Wang; Yuan, Wang
2011-05-01
A new FPGA architecture suitable for digital signal processing applications is presented. DSP modules can be inserted into FPGA conveniently with the proposed architecture, which is much faster when used in the field of digital signal processing compared with traditional FPGAs. An advanced 2-level MUX (multiplexer) is also proposed. With the added SLEEP MODE PASS to traditional 2-level MUX, static leakage is reduced. Furthermore, buffers are inserted at early returns of long lines. With this kind of buffer, the delay of the long line is improved by 9.8% while the area increases by 4.37%. The layout of this architecture has been taped out in standard 0.13 μm CMOS technology successfully. The die size is 6.3 × 4.5 mm2 with the QFP208 package. Test results show that performances of presented classical DSP cases are improved by 28.6%-302% compared with traditional FPGAs.
A fast one-chip event-preprocessor and sequencer for the Simbol-X Low Energy Detector
NASA Astrophysics Data System (ADS)
Schanz, T.; Tenzer, C.; Maier, D.; Kendziorra, E.; Santangelo, A.
2010-12-01
We present an FPGA-based digital camera electronics consisting of an Event-Preprocessor (EPP) for on-board data preprocessing and a related Sequencer (SEQ) to generate the necessary signals to control the readout of the detector. The device has been originally designed for the Simbol-X low energy detector (LED). The EPP operates on 64×64 pixel images and has a real-time processing capability of more than 8000 frames per second. The already working releases of the EPP and the SEQ are now combined into one Digital-Camera-Controller-Chip (D3C).
HIFU Monitoring and Control with Dual-Mode Ultrasound Arrays
NASA Astrophysics Data System (ADS)
Casper, Andrew Jacob
The biological effects of high-intensity focused ultrasound (HIFU) have been known and studied for decades. HIFU has been shown capable of treating a wide variety of diseases and disorders. However, despite its demonstrated potential, HIFU has been slow to gain clinical acceptance. This is due, in part, to the difficulty associated with robustly monitoring and controlling the delivery of the HIFU energy. The non-invasive nature of the surgery makes the assessment of treatment progression difficult, leading to long treatment times and a significant risk of under treatment. This thesis research develops new techniques and systems for robustly monitoring HIFU therapies for the safe and efficacious delivery of the intended treatment. Systems and algorithms were developed for the two most common modes of HIFU delivery systems: single-element and phased array applicators. Delivering HIFU with a single element transducer is a widely used technique in HIFU therapies. The simplicity of a single element offers many benefits in terms of cost and overall system complexity. Typical monitoring schemes rely on an external device (e.g. diagnostic ultrasound or MRI) to assess the progression of therapy. The research presented in this thesis explores using the same element to both deliver and monitor the HIFU therapy. The use of a dual-mode ultrasound transducer (DMUT) required the development of an FPGA based single-channel arbitrary waveform generator and high-speed data acquisition unit. Data collected from initial uncontrolled ablations led to the development of monitoring and control algorithms which were implemented directly on the FPGA. Close integration between the data acquisition and arbitrary waveform units allowed for fast, low latency control over the ablation process. Results are presented that demonstrate control of HIFU therapies over a broad range of intensities and in multiple in vitro tissues. The second area of investigation expands the DMUT research to an ultrasound phased-array. The phased-array allows for electronic steering of the HIFU focus and imaging of the acoustic medium. Investigating the dual-mode ultrasound array (DMUA) required the design and construction of a novel ultrasound-guided focused ultrasound (USgFUS) platform. The platform consisted of custom hardware designed for the unique requirements of operating a phased-array in both therapeutic and imaging modes. The platform also required the development of FPGA based signal processing and GPU based beamforming algorithms for online monitoring of the therapy process. The results presented in this thesis represent the first demonstration of a real-time USgFUS platform based around a DMUA. Experimental imaging and therapy results from series of animal experiments, including a 12 animal GLP study, are presented. In addition, in vitro control results, which build upon the DMUT work, are presented.
An Insect Eye Inspired Miniaturized Multi-Camera System for Endoscopic Imaging.
Cogal, Omer; Leblebici, Yusuf
2017-02-01
In this work, we present a miniaturized high definition vision system inspired by insect eyes, with a distributed illumination method, which can work in dark environments for proximity imaging applications such as endoscopy. Our approach is based on modeling biological systems with off-the-shelf miniaturized cameras combined with digital circuit design for real time image processing. We built a 5 mm radius hemispherical compound eye, imaging a 180 ° ×180 ° degrees field of view while providing more than 1.1 megapixels (emulated ommatidias) as real-time video with an inter-ommatidial angle ∆ϕ = 0.5 ° at 18 mm radial distance. We made an FPGA implementation of the image processing system which is capable of generating 25 fps video with 1080 × 1080 pixel resolution at a 120 MHz processing clock frequency. When compared to similar size insect eye mimicking systems in literature, the system proposed in this paper features 1000 × resolution increase. To the best of our knowledge, this is the first time that a compound eye with built-in illumination idea is reported. We are offering our miniaturized imaging system for endoscopic applications like colonoscopy or laparoscopic surgery where there is a need for large field of view high definition imagery. For that purpose we tested our system inside a human colon model. We also present the resulting images and videos from the human colon model in this paper.
Wirtz, Sebastian F; Cunha, Adauto P A; Labusch, Marc; Marzun, Galina; Barcikowski, Stephan; Söffker, Dirk
2018-06-01
Today, the demand for continuous monitoring of valuable or safety critical equipment is increasing in many industrial applications due to safety and economical requirements. Therefore, reliable in-situ measurement techniques are required for instance in Structural Health Monitoring (SHM) as well as process monitoring and control. Here, current challenges are related to the processing of sensor data with a high data rate and low latency. In particular, measurement and analyses of Acoustic Emission (AE) are widely used for passive, in-situ inspection. Advantages of AE are related to its sensitivity to different micro-mechanical mechanisms on the material level. However, online processing of AE waveforms is computationally demanding. The related equipment is typically bulky, expensive, and not well suited for permanent installation. The contribution of this paper is the development of a Field Programmable Gate Array (FPGA)-based measurement system using ZedBoard devlopment kit with Zynq-7000 system on chip for embedded implementation of suitable online processing algorithms. This platform comprises a dual-core Advanced Reduced Instruction Set Computer Machine (ARM) architecture running a Linux operating system and FPGA fabric. A FPGA-based hardware implementation of the discrete wavelet transform is realized to accelerate processing the AE measurements. Key features of the system are low cost, small form factor, and low energy consumption, which makes it suitable to serve as field-deployed measurement and control device. For verification of the functionality, a novel automatically realized adjustment of the working distance during pulsed laser ablation in liquids is established as an example. A sample rate of 5 MHz is achieved at 16 bit resolution.
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor
NASA Astrophysics Data System (ADS)
Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram
2007-09-01
Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the the full capacity and performance of an FPGA-based coprocessor.
Implementation en VHDl/FPGA d'afficheur video numerique (AVN) pour des applications aerospatiales
NASA Astrophysics Data System (ADS)
Pelletier, Sebastien
L'objectif de ce projet est de developper un controleur video en langage VHDL afin de remplacer la composante specialisee presentement utilisee chez CMC Electronique. Une recherche approfondie des tendances et de ce qui se fait actuellement dans le domaine des controleurs video est effectuee afin de definir les specifications du systeme. Les techniques d'entreposage et d'affichage des images sont expliquees afin de mener ce projet a terme. Le nouveau controleur est developpe sur une plateforme electronique possedant un FPGA, un port VGA et de la memoire pour emmagasiner les donnees. Il est programmable et prend peu d'espace dans un FPGA, ce qui lui permet de s'inserer dans n'importe quelle nouvelle technologie de masse a faible cout. Il s'adapte rapidement a toutes les resolutions d'affichage puisqu'il est modulaire et configurable. A court terme, ce projet permettra un controle ameliore des specifications et des normes de qualite liees aux contraintes de l'avionique.
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Ken
2007-01-01
This viewgraph presentation reviews the selection of the optimum Field Programmable Gate Arrays (FPGA) for space missions. Included in this review is a discussion on differentiating amongst various FPGAs, cost analysis of the various options, the investigation of radiation effects, an expansion of the evaluation criteria, and the application of the evaluation criteria to the selection process.
Parametric dense stereovision implementation on a system-on chip (SoC).
Gardel, Alfredo; Montejo, Pablo; García, Jorge; Bravo, Ignacio; Lázaro, José L
2012-01-01
This paper proposes a novel hardware implementation of a dense recovery of stereovision 3D measurements. Traditionally 3D stereo systems have imposed the maximum number of stereo correspondences, introducing a large restriction on artificial vision algorithms. The proposed system-on-chip (SoC) provides great performance and efficiency, with a scalable architecture available for many different situations, addressing real time processing of stereo image flow. Using double buffering techniques properly combined with pipelined processing, the use of reconfigurable hardware achieves a parametrisable SoC which gives the designer the opportunity to decide its right dimension and features. The proposed architecture does not need any external memory because the processing is done as image flow arrives. Our SoC provides 3D data directly without the storage of whole stereo images. Our goal is to obtain high processing speed while maintaining the accuracy of 3D data using minimum resources. Configurable parameters may be controlled by later/parallel stages of the vision algorithm executed on an embedded processor. Considering hardware FPGA clock of 100 MHz, image flows up to 50 frames per second (fps) of dense stereo maps of more than 30,000 depth points could be obtained considering 2 Mpix images, with a minimum initial latency. The implementation of computer vision algorithms on reconfigurable hardware, explicitly low level processing, opens up the prospect of its use in autonomous systems, and they can act as a coprocessor to reconstruct 3D images with high density information in real time.
NASA Astrophysics Data System (ADS)
Tickle, Andrew J.; Harvey, Paul K.; Smith, Jeremy S.
2010-10-01
Morphological Scene Change Detection (MSCD) is a process typically tasked at detecting relevant changes in a guarded environment for security applications. This can be implemented on a Field Programmable Gate Array (FPGA) by a combination of binary differences based around exclusive-OR (XOR) gates, mathematical morphology and a crucial threshold setting. The additional ability to set up the system in virtually any location due to the FPGA makes it ideal for insertion into an autonomous mobile robot for patrol duties. However, security is not the only potential of this robust algorithm. This paper details how such a system can be used for the detection of leaks in piping for use in the process and chemical industries and could be deployed as stated in the above manner. The test substance in this work was water, which was pumped either as a liquid or as low pressure steam through a simple pipe configuration with holes at set points to simulate the leaks. These holes were situated randomly at either the center of a pipe (in order to simulate an impact to it) or at a joint or corner (to simulate a failed weld). Imagery of the resultant leaks, which were visualised as drips or the accumulation of steam, which where analysed using MATLAB to determine their pixel volume in order to calibrate the trigger for the MSCD. The triggering mechanism is adaptive to make it possible in theory for the type of leak to be determined by the number of pixels in the threshold of the image and a numerical output signal to state which of the leak situations is being observed. The system was designed using the DSP Builder package from Altera so that its graphical nature is easily comprehensible to the non-embedded system designer. Furthermore, all the data from the DSP Builder simulation underwent verification against MATLAB comparisons using the image processing toolbox in order to validate the results.
Optimized smith waterman processor design for breast cancer early diagnosis
NASA Astrophysics Data System (ADS)
Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.
2017-09-01
This paper presents an optimized design of Processing Element (PE) of Systolic Array (SA) which implements affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources to increase number of PEs in FPGA for higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequence such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area with 15 slices in a PE and 776 PEs implemented in the Virtex - 6 FPGA.
Digital hardware implementation of a stochastic two-dimensional neuron model.
Grassia, F; Kohno, T; Levi, T
2016-11-01
This study explores the feasibility of stochastic neuron simulation in digital systems (FPGA), which realizes an implementation of a two-dimensional neuron model. The stochasticity is added by a source of current noise in the silicon neuron using an Ornstein-Uhlenbeck process. This approach uses digital computation to emulate individual neuron behavior using fixed point arithmetic operation. The neuron model's computations are performed in arithmetic pipelines. It was designed in VHDL language and simulated prior to mapping in the FPGA. The experimental results confirmed the validity of the developed stochastic FPGA implementation, which makes the implementation of the silicon neuron more biologically plausible for future hybrid experiments. Copyright © 2017 Elsevier Ltd. All rights reserved.
FPGA Based Reconfigurable ATM Switch Test Bed
NASA Technical Reports Server (NTRS)
Chu, Pong P.; Jones, Robert E.
1998-01-01
Various issues associated with "FPGA Based Reconfigurable ATM Switch Test Bed" are presented in viewgraph form. Specific topics include: 1) Network performance evaluation; 2) traditional approaches; 3) software simulation; 4) hardware emulation; 5) test bed highlights; 6) design environment; 7) test bed architecture; 8) abstract sheared-memory switch; 9) detailed switch diagram; 10) traffic generator; 11) data collection circuit and user interface; 12) initial results; and 13) the following conclusions: Advances in FPGA make hardware emulation feasible for performance evaluation, hardware emulation can provide several orders of magnitude speed-up over software simulation; due to the complexity of hardware synthesis process, development in emulation is much more difficult than simulation and requires knowledge in both networks and digital design.
Implementation of Multispectral Image Classification on a Remote Adaptive Computer
NASA Technical Reports Server (NTRS)
Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna
1999-01-01
As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms its justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of m a,gnitude performance increase over microprocessor based systems. The automatic classification of spaceborne multispectral images is an example of a computation intensive application, that, can benefit from implementation on an FPGA - based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm implemented on a typical general - purpose computer).
Radiation effects in reconfigurable FPGAs
NASA Astrophysics Data System (ADS)
Quinn, Heather
2017-04-01
Field-programmable gate arrays (FPGAs) are co-processing hardware used in image and signal processing. FPGA are programmed with custom implementations of an algorithm. These algorithms are highly parallel hardware designs that are faster than software implementations. This flexibility and speed has made FPGAs attractive for many space programs that need in situ, high-speed signal processing for data categorization and data compression. Most commercial FPGAs are affected by the space radiation environment, though. Problems with TID has restricted the use of flash-based FPGAs. Static random access memory based FPGAs must be mitigated to suppress errors from single-event upsets. This paper provides a review of radiation effects issues in reconfigurable FPGAs and discusses methods for mitigating these problems. With careful design it is possible to use these components effectively and resiliently.
Aguiar Santos, Susana; Robens, Anne; Boehm, Anna; Leonhardt, Steffen; Teichmann, Daniel
2016-01-01
A new prototype of a multi-frequency electrical impedance tomography system is presented. The system uses a field-programmable gate array as a main controller and is configured to measure at different frequencies simultaneously through a composite waveform. Both real and imaginary components of the data are computed for each frequency and sent to the personal computer over an ethernet connection, where both time-difference imaging and frequency-difference imaging are reconstructed and visualized. The system has been tested for both time-difference and frequency-difference imaging for diverse sets of frequency pairs in a resistive/capacitive test unit and in self-experiments. To our knowledge, this is the first work that shows preliminary frequency-difference images of in-vivo experiments. Results of time-difference imaging were compared with simulation results and shown that the new prototype performs well at all frequencies in the tested range of 60 kHz–960 kHz. For frequency-difference images, further development of algorithms and an improved normalization process is required to correctly reconstruct and interpreted the resulting images. PMID:27463715
Integrated optical 3D digital imaging based on DSP scheme
NASA Astrophysics Data System (ADS)
Wang, Xiaodong; Peng, Xiang; Gao, Bruce Z.
2008-03-01
We present a scheme of integrated optical 3-D digital imaging (IO3DI) based on digital signal processor (DSP), which can acquire range images independently without PC support. This scheme is based on a parallel hardware structure with aid of DSP and field programmable gate array (FPGA) to realize 3-D imaging. In this integrated scheme of 3-D imaging, the phase measurement profilometry is adopted. To realize the pipeline processing of the fringe projection, image acquisition and fringe pattern analysis, we present a multi-threads application program that is developed under the environment of DSP/BIOS RTOS (real-time operating system). Since RTOS provides a preemptive kernel and powerful configuration tool, with which we are able to achieve a real-time scheduling and synchronization. To accelerate automatic fringe analysis and phase unwrapping, we make use of the technique of software optimization. The proposed scheme can reach a performance of 39.5 f/s (frames per second), so it may well fit into real-time fringe-pattern analysis and can implement fast 3-D imaging. Experiment results are also presented to show the validity of proposed scheme.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Monenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used proportional-integral-derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a DSP (Digital Signal Processor) or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching this goal.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of Field Programmable Gate Arrays (FPGA's) in the hardware implementation of fast digital signal processing functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used Proportional-Integral-Derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a Digital Signal Processor (DSP) device or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using DSP devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, Pulse Width Modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacemap. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive-control algorithm approaches. Radiation tolerant FPGA's are a feasible option for reaching this goal.
NASA Technical Reports Server (NTRS)
Geist, Alessandro; Lin, Michael; Flatley, Tom; Petrick, David
2013-01-01
SpaceCube 1.5 is a high-performance and low-power system in a compact form factor. It is a hybrid processing system consisting of CPU (central processing unit), FPGA (field-programmable gate array), and DSP (digital signal processor) processing elements. The primary processing engine is the Virtex- 5 FX100T FPGA, which has two embedded processors. The SpaceCube 1.5 System was a bridge to the SpaceCube 2.0 and SpaceCube 2.0 Mini processing systems. The SpaceCube 1.5 system was the primary avionics in the successful SMART (Small Rocket/Spacecraft Technology) Sounding Rocket mission that was launched in the summer of 2011. For SMART and similar missions, an avionics processor is required that is reconfigurable, has high processing capability, has multi-gigabit interfaces, is low power, and comes in a rugged/compact form factor. The original SpaceCube 1.0 met a number of the criteria, but did not possess the multi-gigabit interfaces that were required and is a higher-cost system. The SpaceCube 1.5 was designed with those mission requirements in mind. The SpaceCube 1.5 features one Xilinx Virtex-5 FX100T FPGA and has excellent size, weight, and power characteristics [4×4×3 in. (approx. = 10×10×8 cm), 3 lb (approx. = 1.4 kg), and 5 to 15 W depending on the application]. The estimated computing power of the two PowerPC 440s in the Virtex-5 FPGA is 1100 DMIPS each. The SpaceCube 1.5 includes two Gigabit Ethernet (1 Gbps) interfaces as well as two SATA-I/II interfaces (1.5 to 3.0 Gbps) for recording to data drives. The SpaceCube 1.5 also features DDR2 SDRAM (double data rate synchronous dynamic random access memory); 4- Gbit Flash for storing application code for the CPU, FPGA, and DSP processing elements; and a Xilinx Platform Flash XL to store FPGA configuration files or application code. The system also incorporates a 12 bit analog to digital converter with the ability to read 32 discrete analog sensor inputs. The SpaceCube 1.5 design also has a built-in accelerometer. In addition, the system has 12 receive and transmit RS- 422 interfaces for legacy support. The SpaceCube 1.5 processor card represents the first NASA Goddard design in a compact form factor featuring the Xilinx Virtex- 5. The SpaceCube 1.5 incorporates backward compatibility with the Space- Cube 1.0 form factor and stackable architecture. It also makes use of low-cost commercial parts, but is designed for operation in harsh environments.
NASA Astrophysics Data System (ADS)
Burri, Samuel; Homulle, Harald; Bruschini, Claudio; Charbon, Edoardo
2016-04-01
LinoSPAD is a reconfigurable camera sensor with a 256×1 CMOS SPAD (single-photon avalanche diode) pixel array connected to a low cost Xilinx Spartan 6 FPGA. The LinoSPAD sensor's line of pixels has a pitch of 24 μm and 40% fill factor. The FPGA implements an array of 64 TDCs and histogram engines capable of processing up to 8.5 giga-photons per second. The LinoSPAD sensor measures 1.68 mm×6.8 mm and each pixel has a direct digital output to connect to the FPGA. The chip is bonded on a carrier PCB to connect to the FPGA motherboard. 64 carry chain based TDCs sampled at 400 MHz can generate a timestamp every 7.5 ns with a mean time resolution below 25 ps per code. The 64 histogram engines provide time-of-arrival histograms covering up to 50 ns. An alternative mode allows the readout of 28 bit timestamps which have a range of up to 4.5 ms. Since the FPGA TDCs have considerable non-linearity we implemented a correction module capable of increasing histogram linearity at real-time. The TDC array is interfaced to a computer using a super-speed USB3 link to transfer over 150k histograms per second for the 12.5 ns reference period used in our characterization. After characterization and subsequent programming of the post-processing we measure an instrument response histogram shorter than 100 ps FWHM using a strong laser pulse with 50 ps FWHM. A timing resolution that when combined with the high fill factor makes the sensor well suited for a wide variety of applications from fluorescence lifetime microscopy over Raman spectroscopy to 3D time-of-flight.
NASA Astrophysics Data System (ADS)
Berdychowski, Piotr P.; Zabolotny, Wojciech M.
2010-09-01
The main goal of C to VHDL compiler project is to make FPGA platform more accessible for scientists and software developers. FPGA platform offers unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for efficient heterogeneous, computing environment. Current industry standard in development of digital systems on FPGA platform is based on HDLs. Although very effective and expressive in hands of hardware development specialists, these languages require specific knowledge and experience, unreachable for most scientists and software programmers. C to VHDL compiler project attempts to remedy that by creating an application, that derives initial VHDL description of a digital system (for further compilation and synthesis), from purely algorithmic description in C programming language. This idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many previous years, with the introduction of some new unique improvements.
Diagnostic layer integration in FPGA-based pipeline measurement systems for HEP experiments
NASA Astrophysics Data System (ADS)
Pozniak, Krzysztof T.
2007-08-01
Integrated triggering and data acquisition systems for high energy physics experiments may be considered as fast, multichannel, synchronous, distributed, pipeline measurement systems. A considerable extension of functional, technological and monitoring demands, which has recently been imposed on them, forced a common usage of large field-programmable gate array (FPGA), digital signal processing-enhanced matrices and fast optical transmission for their realization. This paper discusses modelling, design, realization and testing of pipeline measurement systems. A distribution of synchronous data stream flows is considered in the network. A general functional structure of a single network node is presented. A suggested, novel block structure of the node model facilitates full implementation in the FPGA chip, circuit standardization and parametrization, as well as integration of functional and diagnostic layers. A general method for pipeline system design was derived. This method is based on a unified model of the synchronous data network node. A few examples of practically realized, FPGA-based, pipeline measurement systems were presented. The described systems were applied in ZEUS and CMS.
NASA Astrophysics Data System (ADS)
Deliparaschos, Kyriakos M.; Michail, Konstantinos; Zolotas, Argyrios C.; Tzafestas, Spyros G.
2016-05-01
This work presents a field programmable gate array (FPGA)-based embedded software platform coupled with a software-based plant, forming a hardware-in-the-loop (HIL) that is used to validate a systematic sensor selection framework. The systematic sensor selection framework combines multi-objective optimization, linear-quadratic-Gaussian (LQG)-type control, and the nonlinear model of a maglev suspension. A robustness analysis of the closed-loop is followed (prior to implementation) supporting the appropriateness of the solution under parametric variation. The analysis also shows that quantization is robust under different controller gains. While the LQG controller is implemented on an FPGA, the physical process is realized in a high-level system modeling environment. FPGA technology enables rapid evaluation of the algorithms and test designs under realistic scenarios avoiding heavy time penalty associated with hardware description language (HDL) simulators. The HIL technique facilitates significant speed-up in the required execution time when compared to its software-based counterpart model.
FPGA based data processing in the ALICE High Level Trigger in LHC Run 2
NASA Astrophysics Data System (ADS)
Engel, Heiko; Alt, Torsten; Kebschull, Udo;
2017-10-01
The ALICE High Level Trigger (HLT) is a computing cluster dedicated to the online compression, reconstruction and calibration of experimental data. The HLT receives detector data via serial optical links into FPGA based readout boards that process the data on a per-link level already inside the FPGA and provide it to the host machines connected with a data transport framework. FPGA based data pre-processing is enabled for the biggest detector of ALICE, the Time Projection Chamber (TPC), with a hardware cluster finding algorithm. This algorithm was ported to the Common Read-Out Receiver Card (C-RORC) as used in the HLT for RUN 2. It was improved to handle double the input bandwidth and adjusted to the upgraded TPC Readout Control Unit (RCU2). A flexible firmware implementation in the HLT handles both the old and the new TPC data format and link rates transparently. Extended protocol and data error detection, error handling and the enhanced RCU2 data ordering scheme provide an improved physics performance of the cluster finder. The performance of the cluster finder was verified against large sets of reference data both in terms of throughput and algorithmic correctness. Comparisons with a software reference implementation confirm significant savings on CPU processing power using the hardware implementation. The C-RORC hardware with the cluster finder for RCU1 data is in use in the HLT since the start of RUN 2. The extended hardware cluster finder implementation for the RCU2 with doubled throughput is active since the upgrade of the TPC readout electronics in early 2016.
Roll Angle Estimation Using Thermopiles for a Flight Controlled Mortar
2012-06-01
Using Xilinx’s System generator, the entire design was implemented at a relatively high level within Malab’s Simulink. This allowed VHDL code to...thermopile data with a Recursive Least Squares (RLS) filter implemented on a field programmable gate array (FPGA). These results demonstrate the...accurately estimated by processing the thermopile data with a Recursive Least Squares (RLS) filter implemented on a field programmable gate array (FPGA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Learn, Mark Walter
Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not availablemore » to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.« less
Volumetric visualization algorithm development for an FPGA-based custom computing machine
NASA Astrophysics Data System (ADS)
Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim
1998-05-01
Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
Mobility and orientation aid for blind persons using artificial vision
NASA Astrophysics Data System (ADS)
Costa, Gustavo; Gusberti, Adrián; Graffigna, Juan Pablo; Guzzo, Martín; Nasisi, Oscar
2007-11-01
Blind or vision-impaired persons are limited in their normal life activities. Mobility and orientation of blind persons is an ever-present research subject because no total solution has yet been reached for these activities that pose certain risks for the affected persons. The current work presents the design and development of a device conceived on capturing environment information through stereoscopic vision. The images captured by a couple of video cameras are transferred and processed by configurable and sequential FPGA and DSP devices that issue action signals to a tactile feedback system. Optimal processing algorithms are implemented to perform this feedback in real time. The components selected permit portability; that is, to readily get used to wearing the device.
High frequency signal acquisition and control system based on DSP+FPGA
NASA Astrophysics Data System (ADS)
Liu, Xiao-qi; Zhang, Da-zhi; Yin, Ya-dong
2017-10-01
This paper introduces a design and implementation of high frequency signal acquisition and control system based on DSP + FPGA. The system supports internal/external clock and internal/external trigger sampling. It has a maximum sampling rate of 400MBPS and has a 1.4GHz input bandwidth for the ADC. Data can be collected continuously or periodically in systems and they are stored in DDR2. At the same time, the system also supports real-time acquisition, the collected data after digital frequency conversion and Cascaded Integrator-Comb (CIC) filtering, which then be sent to the CPCI bus through the high-speed DSP, can be assigned to the fiber board for subsequent processing. The system integrates signal acquisition and pre-processing functions, which uses high-speed A/D, high-speed DSP and FPGA mixed technology and has a wide range of uses in data acquisition and recording. In the signal processing, the system can be seamlessly connected to the dedicated processor board. The system has the advantages of multi-selectivity, good scalability and so on, which satisfies the different requirements of different signals in different projects.
Study on the high-frequency laser measurement of slot surface difference
NASA Astrophysics Data System (ADS)
Bing, Jia; Lv, Qiongying; Cao, Guohua
2017-10-01
In view of the measurement of the slot surface difference in the large-scale mechanical assembly process, Based on high frequency laser scanning technology and laser detection imaging principle, This paragraph designs a double galvanometer pulse laser scanning system. Laser probe scanning system architecture consists of three parts: laser ranging part, mechanical scanning part, data acquisition and processing part. The part of laser range uses high-frequency laser range finder to measure the distance information of the target shape and get a lot of point cloud data. Mechanical scanning part includes high-speed rotary table, high-speed transit and related structure design, in order to realize the whole system should be carried out in accordance with the design of scanning path on the target three-dimensional laser scanning. Data processing part mainly by FPGA hardware with LAbVIEW software to design a core, to process the point cloud data collected by the laser range finder at the high-speed and fitting calculation of point cloud data, to establish a three-dimensional model of the target, so laser scanning imaging is realized.
Windowing technique in FM radar realized by FPGA for better target resolution
NASA Astrophysics Data System (ADS)
Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique; Kravchenko, Victor F.
2006-09-01
Remote sensing systems, such as SAR usually apply FM signals to resolve nearly placed targets (objects) and improve SNR. Main drawbacks in the pulse compression of FM radar signal that it can add the range side-lobes in reflectivity measurements. Using weighting window processing in time domain it is possible to decrease significantly the side-lobe level (SLL) of output radar signal that permits to resolve small or low power targets those are masked by powerful ones. There are usually used classical windows such as Hamming, Hanning, Blackman-Harris, Kaiser-Bessel, Dolph-Chebyshev, Gauss, etc. in window processing. Additionally to classical ones in here we also use a novel class of windows based on atomic functions (AF) theory. For comparison of simulation and experimental results we applied the standard parameters, such as coefficient of amplification, maximum level of side-lobe, width of main lobe, etc. In this paper we also proposed to implement the compression-windowing model on a hardware level employing Field Programmable Gate Array (FPGA) that offers some benefits like instantaneous implementation, dynamic reconfiguration, design, and field programmability. It has been investigated the pulse compression design on FPGA applying classical and novel window technique to reduce the SLL in absence and presence of noise. The paper presents simulated and experimental examples of detection of small or nearly placed targets in the imaging radar. Paper also presents the experimental hardware results of windowing in FM radar demonstrating resolution of the several targets for classical rectangular, Hamming, Kaiser-Bessel, and some novel ones: Up(x), fup 4(x)•D 3(x), fup 6(x)•G 3(x), etc. It is possible to conclude that windows created on base of the AFs offer better decreasing of the SLL in cases of presence or absence of noise and when we move away of the main lobe in comparison with classical windows.
A real-time n/γ digital pulse shape discriminator based on FPGA.
Li, Shiping; Xu, Xiufeng; Cao, Hongrui; Yuan, Guoliang; Yang, Qingwei; Yin, Zejie
2013-02-01
A FPGA-based real-time digital pulse shape discriminator has been employed to distinguish between neutrons (n) and gammas (γ) in the Neutron Flux Monitor (NFM) for International Thermonuclear Experimental Reactor (ITER). The discriminator takes advantages of the Field Programmable Gate Array (FPGA) parallel and pipeline process capabilities to carry out the real-time sifting of neutrons in n/γ mixed radiation fields, and uses the rise time and amplitude inspection techniques simultaneously as the discrimination algorithm to observe good n/γ separation. Some experimental results have been presented which show that this discriminator can realize the anticipated goals of NFM perfectly with its excellent discrimination quality and zero dead time. Copyright © 2012 Elsevier Ltd. All rights reserved.
Embedded system of image storage based on fiber channel
NASA Astrophysics Data System (ADS)
Chen, Xiaodong; Su, Wanxin; Xing, Zhongbao; Wang, Hualong
2008-03-01
In domains of aerospace, aviation, aiming, and optic measure etc., the embedded system of imaging, processing and recording is absolutely necessary, which has small volume, high processing speed and high resolution. But the embedded storage technology becomes system bottleneck because of developing slowly. It is used to use RAID to promote storage speed, but it is unsuitable for the embedded system because of its big volume. Fiber channel (FC) technology offers a new method to develop the high-speed, portable storage system. In order to make storage subsystem meet the needs of high storage rate, make use of powerful Virtex-4 FPGA and high speed fiber channel, advance a project of embedded system of digital image storage based on Xilinx Fiber Channel Arbitrated Loop LogiCORE. This project utilizes Virtex- 4 RocketIO MGT transceivers to transmit the data serially, and connects many Fiber Channel hard drivers by using of Arbitrated Loop optionally. It can achieve 400MBps storage rate, breaks through the bottleneck of PCI interface, and has excellences of high-speed, real-time, portable and massive capacity.
FPGA applications for single dish activity at Medicina radio telescopes
NASA Astrophysics Data System (ADS)
Bartolini, M.; Naldi, G.; Mattana, A.; Maccaferri, A.; De Biaggi, M.
FPGA technologies are gaining major attention in the recent years in the field of radio astronomy. At Medicina radio telescopes, FPGAs have been used in the last ten years for a number of purposes and in this article we will take into exam the applications developed and installed for the Medicina Single Dish 32m Antenna: these range from high performance digital signal processing to instrument control developed on top of smaller FPGAs.
An Efficient Pipeline Wavefront Phase Recovery for the CAFADIS Camera for Extremely Large Telescopes
Magdaleno, Eduardo; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel
2010-01-01
In this paper we show a fast, specialized hardware implementation of the wavefront phase recovery algorithm using the CAFADIS camera. The CAFADIS camera is a new plenoptic sensor patented by the Universidad de La Laguna (Canary Islands, Spain): international patent PCT/ES2007/000046 (WIPO publication number WO/2007/082975). It can simultaneously measure the wavefront phase and the distance to the light source in a real-time process. The pipeline algorithm is implemented using Field Programmable Gate Arrays (FPGA). These devices present architecture capable of handling the sensor output stream using a massively parallel approach and they are efficient enough to resolve several Adaptive Optics (AO) problems in Extremely Large Telescopes (ELTs) in terms of processing time requirements. The FPGA implementation of the wavefront phase recovery algorithm using the CAFADIS camera is based on the very fast computation of two dimensional fast Fourier Transforms (FFTs). Thus we have carried out a comparison between our very novel FPGA 2D-FFTa and other implementations. PMID:22315523
Magdaleno, Eduardo; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel
2010-01-01
In this paper we show a fast, specialized hardware implementation of the wavefront phase recovery algorithm using the CAFADIS camera. The CAFADIS camera is a new plenoptic sensor patented by the Universidad de La Laguna (Canary Islands, Spain): international patent PCT/ES2007/000046 (WIPO publication number WO/2007/082975). It can simultaneously measure the wavefront phase and the distance to the light source in a real-time process. The pipeline algorithm is implemented using Field Programmable Gate Arrays (FPGA). These devices present architecture capable of handling the sensor output stream using a massively parallel approach and they are efficient enough to resolve several Adaptive Optics (AO) problems in Extremely Large Telescopes (ELTs) in terms of processing time requirements. The FPGA implementation of the wavefront phase recovery algorithm using the CAFADIS camera is based on the very fast computation of two dimensional fast Fourier Transforms (FFTs). Thus we have carried out a comparison between our very novel FPGA 2D-FFTa and other implementations.
System Design of One-chip Wave Particle Interaction Analyzer for SCOPE mission.
NASA Astrophysics Data System (ADS)
Fukuhara, Hajime; Ueda, Yoshikatsu; Kojima, Hiro; Yamakawa, Hiroshi
In past science spacecrafts such like GEOTAIL, we usually capture electric and magnetic field waveforms and observe energetic eletron and ion particles as velocity distributions by each sensor. We analyze plasma wave-particle interactions by these respective data and the discussions are sometimes restricted by the difference of time resolution and by the data loss in desired regions. One-chip Wave Particle Interaction Analyzer (OWPIA) conducts direct quantitative observations of wave-particle interaction by direct 'E dot v' calculation on-board. This new instruments have a capability to use all plasma waveform data and electron particle informations. In the OWPIA system, we have to calibrate the digital observation data and transform the same coordinate system. All necessary calculations are processed in Field Programmable Gate Array(FPGA). In our study, we introduce a basic concept of the OWPIA system and a optimization method for each calculation functions installed in FPGA. And we also discuss the process speed, the FPGA utilization efficiency, the total power consumption.
Goavec-Mérou, G; Chrétien, N; Friedt, J-M; Sandoz, P; Martin, G; Lenczner, M; Ballandras, S
2014-01-01
Vibrating mechanical structure characterization is demonstrated using contactless techniques best suited for mobile and rotating equipments. Fast measurement rates are achieved using Field Programmable Gate Array (FPGA) devices as real-time digital signal processors. Two kinds of algorithms are implemented on FPGA and experimentally validated in the case of the vibrating tuning fork. A first application concerns in-plane displacement detection by vision with sampling rates above 10 kHz, thus reaching frequency ranges above the audio range. A second demonstration concerns pulsed-RADAR cooperative target phase detection and is applied to radiofrequency acoustic transducers used as passive wireless strain gauges. In this case, the 250 ksamples/s refresh rate achieved is only limited by the acoustic sensor design but not by the detection bandwidth. These realizations illustrate the efficiency, interest, and potentialities of FPGA-based real-time digital signal processing for the contactless interrogation of passive embedded probes with high refresh rates.
Fast Image Texture Classification Using Decision Trees
NASA Technical Reports Server (NTRS)
Thompson, David R.
2011-01-01
Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation- hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
Real-time image processing for particle tracking velocimetry
NASA Astrophysics Data System (ADS)
Kreizer, Mark; Ratner, David; Liberzon, Alex
2010-01-01
We present a novel high-speed particle tracking velocimetry (PTV) experimental system. Its novelty is due to the FPGA-based, real-time image processing "on camera". Instead of an image, the camera transfers to the computer using a network card, only the relevant information of the identified flow tracers. Therefore, the system is ideal for the remote particle tracking systems in research and industrial applications, while the camera can be controlled and data can be transferred over any high-bandwidth network. We present the hardware and the open source software aspects of the PTV experiments. The tracking results of the new experimental system has been compared to the flow visualization and particle image velocimetry measurements. The canonical flow in the central cross section of a a cubic cavity (1:1:1 aspect ratio) in our lid-driven cavity apparatus is used for validation purposes. The downstream secondary eddy (DSE) is the sensitive portion of this flow and its size was measured with increasing Reynolds number (via increasing belt velocity). The size of DSE estimated from the flow visualization, PIV and compressed PTV is shown to agree within the experimental uncertainty of the methods applied.
FPGA based charge acquisition algorithm for soft x-ray diagnostics system
NASA Astrophysics Data System (ADS)
Wojenski, A.; Kasprowicz, G.; Pozniak, K. T.; Zabolotny, W.; Byszuk, A.; Juszczyk, B.; Kolasinski, P.; Krawczyk, R. D.; Zienkiewicz, P.; Chernyshova, M.; Czarski, T.
2015-09-01
Soft X-ray (SXR) measurement systems working in tokamaks or with laser generated plasma can expect high photon fluxes. Therefore it is necessary to focus on data processing algorithms to have the best possible efficiency in term of processed photon events per second. This paper refers to recently designed algorithm and data-flow for implementation of charge data acquisition in FPGA. The algorithms are currently on implementation stage for the soft X-ray diagnostics system. In this paper despite of the charge processing algorithm is also described general firmware overview, data storage methods and other key components of the measurement system. The simulation section presents algorithm performance and expected maximum photon rate.
Different source image fusion based on FPGA
NASA Astrophysics Data System (ADS)
Luo, Xiao; Piao, Yan
2016-03-01
The fusion technology of video image is to make the video obtained by different image sensors complementary to each other by some technical means, so as to obtain the video information which is rich in information and suitable for the human eye system. Infrared cameras in harsh environments such as when smoke, fog and low light situations penetrating power, but the ability to obtain the details of the image is poor, does not meet the human visual system. Single visible light imaging can be rich in detail, high resolution images and for the visual system, but the visible image easily affected by the external environment. Infrared image and visible image fusion process involved in the video image fusion algorithm complexity and high calculation capacity, have occupied more memory resources, high clock rate requirements, such as software, c ++, c, etc. to achieve more, but based on Hardware platform less. In this paper, based on the imaging characteristics of infrared images and visible light images, the software and hardware are combined to obtain the registration parameters through software matlab, and the gray level weighted average method is used to implement the hardware platform. Information fusion, and finally the fusion image can achieve the goal of effectively improving the acquisition of information to increase the amount of information in the image.
Random number generators for large-scale parallel Monte Carlo simulations on FPGA
NASA Astrophysics Data System (ADS)
Lin, Y.; Wang, F.; Liu, B.
2018-05-01
Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
NASA Astrophysics Data System (ADS)
Saxena, Shefali; Hawari, Ayman I.
2017-07-01
Digital signal processing techniques have been widely used in radiation spectrometry to provide improved stability and performance with compact physical size over the traditional analog signal processing. In this paper, field-programmable gate array (FPGA)-based adaptive digital pulse shaping techniques are investigated for real-time signal processing. National Instruments (NI) NI 5761 14-bit, 250-MS/s adaptor module is used for digitizing high-purity germanium (HPGe) detector's preamplifier pulses. Digital pulse processing algorithms are implemented on the NI PXIe-7975R reconfigurable FPGA (Kintex-7) using the LabVIEW FPGA module. Based on the time separation between successive input pulses, the adaptive shaping algorithm selects the optimum shaping parameters (rise time and flattop time of trapezoid-shaping filter) for each incoming signal. A digital Sallen-Key low-pass filter is implemented to enhance signal-to-noise ratio and reduce baseline drifting in trapezoid shaping. A recursive trapezoid-shaping filter algorithm is employed for pole-zero compensation of exponentially decayed (with two-decay constants) preamplifier pulses of an HPGe detector. It allows extraction of pulse height information at the beginning of each pulse, thereby reducing the pulse pileup and increasing throughput. The algorithms for RC-CR2 timing filter, baseline restoration, pile-up rejection, and pulse height determination are digitally implemented for radiation spectroscopy. Traditionally, at high-count-rate conditions, a shorter shaping time is preferred to achieve high throughput, which deteriorates energy resolution. In this paper, experimental results are presented for varying count-rate and pulse shaping conditions. Using adaptive shaping, increased throughput is accepted while preserving the energy resolution observed using the longer shaping times.
VLF and X-ray Instruments for Stratospheric Balloons: ABOVE2 and EPEx
NASA Astrophysics Data System (ADS)
Cully, C. M.; Galts, D.; Patrick, M.; Duffin, C.; Jang, A. C.; Pitzel, J.; Trumpour, T.; McCarthy, M.; Milling, D. K.
2017-12-01
The ABOVE2 (2016) and EPEx (2018) stratospheric balloon missions are designed to study energetic electrons precipitating from the radiation belts into the atmosphere. The payloads include instruments that measure Very Low Frequency (VLF) magnetic and electric fields, and bremsstrahlung X-rays. The ABOVE2 VLF instrument is an FPGA-based design with >200 kHz sampling rates, sub-microsecond timing accuracy and onboard spectral processing, designed in a Cubesat-friendly format. The EPEx X-ray instrument is a hard X-ray imaging system, also in a Cubesat-friendly format, incorporating a commercially-available Cadmium-Zinc-Telluride module. The imager is sufficiently lightweight that we can launch it on-demand with low-volume latex balloons. I will discuss the design and performance of both instruments, and present data from the ABOVE2 flights.
NASA Astrophysics Data System (ADS)
Rais, Muhammad H.
2010-06-01
This paper presents Field Programmable Gate Array (FPGA) implementation of standard and truncated multipliers using Very High Speed Integrated Circuit Hardware Description Language (VHDL). Truncated multiplier is a good candidate for digital signal processing (DSP) applications such as finite impulse response (FIR) and discrete cosine transform (DCT). Remarkable reduction in FPGA resources, delay, and power can be achieved using truncated multipliers instead of standard parallel multipliers when the full precision of the standard multiplier is not required. The truncated multipliers show significant improvement as compared to standard multipliers. Results show that the anomaly in Spartan-3 AN average connection and maximum pin delay have been efficiently reduced in Virtex-4 device.
A low power flash-FPGA based brain implant micro-system of PID control.
Lijuan Xia; Fattah, Nabeel; Soltan, Ahmed; Jackson, Andrew; Chester, Graeme; Degenaar, Patrick
2017-07-01
In this paper, we demonstrate that a low power flash FPGA based micro-system can provide a low power programmable interface for closed-loop brain implant inter- faces. The proposed micro-system receives recording local field potential (LFP) signals from an implanted probe, performs closed-loop control using a first order control system, then converts the signal into an optogenetic control stimulus pattern. Stimulus can be implemented through optoelectronic probes. The long term target is for both fundamental neuroscience applications and for clinical use in treating epilepsy. Utilizing our device, closed-loop processing consumes only 14nJ of power per PID cycle compared to 1.52μJ per cycle for a micro-controller implementation. Compared to an application specific digital integrated circuit, flash FPGA's are inherently programmable.
A Fixed Point VHDL Component Library for a High Efficiency Reconfigurable Radio Design Methodology
NASA Technical Reports Server (NTRS)
Hoy, Scott D.; Figueiredo, Marco A.
2006-01-01
Advances in Field Programmable Gate Array (FPGA) technologies enable the implementation of reconfigurable radio systems for both ground and space applications. The development of such systems challenges the current design paradigms and requires more robust design techniques to meet the increased system complexity. Among these techniques is the development of component libraries to reduce design cycle time and to improve design verification, consequently increasing the overall efficiency of the project development process while increasing design success rates and reducing engineering costs. This paper describes the reconfigurable radio component library developed at the Software Defined Radio Applications Research Center (SARC) at Goddard Space Flight Center (GSFC) Microwave and Communications Branch (Code 567). The library is a set of fixed-point VHDL components that link the Digital Signal Processing (DSP) simulation environment with the FPGA design tools. This provides a direct synthesis path based on the latest developments of the VHDL tools as proposed by the BEE VBDL 2004 which allows for the simulation and synthesis of fixed-point math operations while maintaining bit and cycle accuracy. The VHDL Fixed Point Reconfigurable Radio Component library does not require the use of the FPGA vendor specific automatic component generators and provide a generic path from high level DSP simulations implemented in Mathworks Simulink to any FPGA device. The access to the component synthesizable, source code provides full design verification capability:
STRS Compliant FPGA Waveform Development
NASA Technical Reports Server (NTRS)
Nappier, Jennifer; Downey, Joseph; Mortensen, Dale
2008-01-01
The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. The extension of STRS to the SSP hardware will promote easier waveform reconfiguration and reuse. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. A FPGA-based transmit waveform implementation of the proposed standard interfaces on a laboratory breadboard SDR will be discussed.
A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations
Mak, Terrence S. T.; Rachmuth, Guy; Lam, Kai-Pui; Poon, Chi-Sang
2008-01-01
Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. Field Programmable Gate Array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to conserve resource consumption. Performances of the various FPGA design approaches are compared theoretically and experimentally in corresponding implementations of the AMPA and NMDA synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory economic solution as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired. PMID:17190033
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cohen, J; Dossa, D; Gokhale, M
Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe:more » (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared performance of software-only to GPU-accelerated. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows: SuperMicro X7DBE Xeon Dual Socket Blackford Server Motherboard; 2 Intel Xeon Dual-Core 2.66 GHz processors; 1 GB DDR2 PC2-5300 RAM (2 x 512); 80GB Hard Drive (Seagate SATA II Barracuda). The Fusion board is presently capable of 4X in a PCIe slot. The image resampling benchmark was run on a dual Xeon workstation with NVIDIA graphics card (see Chapter 5 for full specification). An XtremeData Opteron+FPGA was used for the language classification application. We observed that these benchmarks are not uniformly I/O intensive. The only benchmark that showed greater that 50% of the time in I/O was the graph algorithm when it accessed data files over NFS. When local disk was used, the graph benchmark spent at most 40% of its time in I/O. The other benchmarks were CPU dominated. The image resampling benchmark and language classification showed order of magnitude speedup over software by using co-processor technology to offload the CPU-intensive kernels. Our experiments to date suggest that emerging hardware technologies offer significant benefit to boosting the performance of data-intensive algorithms. Using GPU and FPGA co-processors, we were able to improve performance by more than an order of magnitude on the benchmark algorithms, eliminating the processor bottleneck of CPU-bound tasks. Experiments with a prototype solid state nonvolative memory available today show 10X better throughput on random reads than disk, with a 2X speedup on a graph processing benchmark when compared to the use of local SATA disk.« less
Data path design and image quality aspects of the next generation multifunctional printer
NASA Astrophysics Data System (ADS)
Brassé, M. H. H.; de Smet, S. P. R. C.
2008-01-01
Multifunctional devices (MFDs) are increasingly used as a document hub. The MFD is used as a copier, scanner, printer, and it facilitates digital document distribution and sharing. This imposes new requirements on the design of the data path and its image processing. Various design aspects need to be taken into account, including system performance, features, image quality, and cost price. A good balance is required in order to develop a competitive MFD. A modular datapath architecture is presented that supports all the envisaged use cases. Besides copying, colour scanning is becoming an important use case of a modern MFD. The copy-path use case is described and it is shown how colour scanning can also be supported with a minimal adaptation to the architecture. The key idea is to convert the scanner data to an opponent colour space representation at the beginning of the image processing pipeline. The sub-sampling of chromatic information allows for the saving of scarce hardware resources without significant perceptual loss of quality. In particular, we have shown that functional FPGA modules from the copy application can also be used for the scan-to-file application. This makes the presented approach very cost-effective while complying with market conform image quality standards.
NASA Tech Briefs, December 2008
NASA Technical Reports Server (NTRS)
2008-01-01
Topics covered include: Crew Activity Analyzer; Distributing Data to Hand-Held Devices in a Wireless Network; Reducing Surface Clutter in Cloud Profiling Radar Data; MODIS Atmospheric Data Handler; Multibeam Altimeter Navigation Update Using Faceted Shape Model; Spaceborne Hybrid-FPGA System for Processing FTIR Data; FPGA Coprocessor for Accelerated Classification of Images; SiC JFET Transistor Circuit Model for Extreme Temperature Range; TDR Using Autocorrelation and Varying-Duration Pulses; Update on Development of SiC Multi-Chip Power Modules; Radio Ranging System for Guidance of Approaching Spacecraft; Electromagnetically Clean Solar Arrays; Improved Short-Circuit Protection for Power Cells in Series; Electromagnetically Clean Solar Arrays; Logic Gates Made of N-Channel JFETs and Epitaxial Resistors; Improved Short-Circuit Protection for Power Cells in Series; Communication Limits Due to Photon-Detector Jitter; System for Removing Pollutants from Incinerator Exhaust; Sealing and External Sterilization of a Sample Container; Converting EOS Data from HDF-EOS to netCDF; HDF-EOS 2 and HDF-EOS 5 Compatibility Library; HDF-EOS Web Server; HDF-EOS 5 Validator; XML DTD and Schemas for HDF-EOS; Converting from XML to HDF-EOS; Simulating Attitudes and Trajectories of Multiple Spacecraft; Specialized Color Function for Display of Signed Data; Delivering Alert Messages to Members of a Work Force; Delivering Images for Mars Rover Science Planning; Oxide Fiber Cathode Materials for Rechargeable Lithium Cells; Electrocatalytic Reduction of Carbon Dioxide to Methane; Heterogeneous Superconducting Low-Noise Sensing Coils; Progress toward Making Epoxy/Carbon-Nanotube Composites; Predicting Properties of Unidirectional-Nanofiber Composites; Deployable Crew Quarters; Nonventing, Regenerable, Lightweight Heat Absorber; Miniature High-Force, Long-Stroke SMA Linear Actuators; "Bootstrap" Configuration for Multistage Pulse-Tube Coolers; Reducing Liquid Loss during Ullage Venting in Microgravity; Ka-Band Transponder for Deep-Space Radio Science; Replication of Space-Shuttle Computers in FPGAs and ASICs; Demisable Reaction-Wheel Assembly; Spatial and Temporal Low-Dimensional Models for Fluid Flow; Advanced Land Imager Assessment System; Range Imaging without Moving Parts.
NASA Astrophysics Data System (ADS)
Abdelazim, S.; Santoro, D.; Arend, M.; Moshary, F.; Ahmed, S.
2011-11-01
A field deployable all-fiber eye-safe Coherent Doppler LIDAR is being developed at the Optical Remote Sensing Lab at the City College of New York (CCNY) and is designed to monitor wind fields autonomously and continuously in urban settings. Data acquisition is accomplished by sampling lidar return signals at 400 MHz and performing onboard processing using field programmable gate arrays (FPGAs). The FPGA is programmed to accumulate signal information that is used to calculate the power spectrum of the atmospherically back scattered signal. The advantage of using FPGA is that signal processing will be performed at the hardware level, reducing the load on the host computer and allowing for 100% return signal processing. An experimental setup measured wind speeds at ranges of up to 3 km.
Region-Oriented Placement Algorithm for Coarse-Grained Power-Gating FPGA Architecture
NASA Astrophysics Data System (ADS)
Li, Ce; Dong, Yiping; Watanabe, Takahiro
An FPGA plays an essential role in industrial products due to its fast, stable and flexible features. But the power consumption of FPGAs used in portable devices is one of critical issues. Top-down hierarchical design method is commonly used in both ASIC and FPGA design. But, in the case where plural modules are integrated in an FPGA and some of them might be in sleep-mode, current FPGA architecture cannot be fully effective. In this paper, coarse-grained power gating FPGA architecture is proposed where a whole area of an FPGA is partitioned into several regions and power supply is controlled for each region, so that modules in sleep mode can be effectively power-off. We also propose a region oriented FPGA placement algorithm fitted to this user's hierarchical design based on VPR[1]. Simulation results show that this proposed method could reduce power consumption of FPGA by 38% on average by setting unused modules or regions in sleep mode.
In-camera video-stream processing for bandwidth reduction in web inspection
NASA Astrophysics Data System (ADS)
Jullien, Graham A.; Li, QiuPing; Hajimowlana, S. Hossain; Morvay, J.; Conflitti, D.; Roberts, James W.; Doody, Brian C.
1996-02-01
Automated machine vision systems are now widely used for industrial inspection tasks where video-stream data information is taken in by the camera and then sent out to the inspection system for future processing. In this paper we describe a prototype system for on-line programming of arbitrary real-time video data stream bandwidth reduction algorithms; the output of the camera only contains information that has to be further processed by a host computer. The processing system is built into a DALSA CCD camera and uses a microcontroller interface to download bit-stream data to a XILINXTM FPGA. The FPGA is directly connected to the video data-stream and outputs data to a low bandwidth output bus. The camera communicates to a host computer via an RS-232 link to the microcontroller. Static memory is used to both generate a FIFO interface for buffering defect burst data, and for off-line examination of defect detection data. In addition to providing arbitrary FPGA architectures, the internal program of the microcontroller can also be changed via the host computer and a ROM monitor. This paper describes a prototype system board, mounted inside a DALSA camera, and discusses some of the algorithms currently being implemented for web inspection applications.
NASA Astrophysics Data System (ADS)
Lazarev, Grigory; Bonifer, Stefanie; Engel, Philip; Höhne, Daniel; Notni, Gunther
2017-06-01
We report about the implementation of the liquid crystal on silicon (LCOS) microdisplay with 1920 by 1080 resolution and 720 Hz frame rate. The driving solution is FPGA-based. The input signal is converted from the ultrahigh-resolution HDMI 2.0 signal into HD frames, which follow with the specified 720 Hz frame rate. Alternatively the signal is generated directly on the FPGA with built-in pattern generator. The display is showing switching times below 1.5 ms for the selected working temperature. The bit depth of the addressed image achieves 8 bit within each frame. The microdisplay is used in the fringe projection-based 3D sensing system, implemented by Fraunhofer IOF.
Flexible electronic control system based on FPGA for liquid-crystal microlens
NASA Astrophysics Data System (ADS)
Zhang, Bo; Xin, Zhaowei; Li, Dapeng; Wei, Dong; Zhang, Xinyu; Wang, Haiwei; Xie, Changsheng
2018-02-01
Traditional imaging based on common optical lens can only be used to collect intensity information of incident beams, but actually lightwave also carries other mode information about targets and environment, including: spectrum, wavefront, and depth of target, and so on. It is very important to acquire those information mentioned for efficiently detecting and identifying targets in complex background. There is a urgent need to develop new high-performance optical imaging components. The liquid-crystal microlens (LCMs) only by applying spatial electrical field to change optical performance, have demonstrated remarkable advantages comparing conventional lenses, and therefore show a widely application prospect. Because the physical properties of the spatial electric fields between electrode plates in LCMs are directly related to the light-field performances of LCMs, the quality of voltage signal applied to LCMs needs high requirements. In this paper, we design and achieve a new type of digital voltage equipment with a wide adjustable voltage range and high precise voltage to effectively drive and adjust LCMs. More importantly, the device primarily based on field-programmable gate array(FPGA) can generate flexible and stable voltage signals to cooperate with the various functions of LCMs. Our experiments show that through the electronic control system, the LCMs already realize several significant functions including: electrically swing focus, wavefront imaging, electrically tunable spectral imaging and light-field imaging.
NASA Technical Reports Server (NTRS)
Arnold, Jeffrey M.; Buell, Duncan A.; Kleinfelder, Walter J.
1993-01-01
Splash 2 is an attached processor system for Sun SPARC 2 workstations that uses Xilinx 4010 Field Programmable Gate Arrays (FPGA's) as its processing elements. The purpose of this paper is to describe Splash 2. The predecessor system, Splash 1, was designed to be used as a systolic processing system. Although it was very successful in that mode, there were many other applications that were not systolic, but which were successful, nonetheless, on Splash 1, or that were not implemented successfully due to one or more architectural limitations, most notably I/O bandwidth and interprocessor communication. Although other uses to increase computational performance have been found for the Xilinx FPGA's that are Splash's processing elements. Splash is unique in its goal to be programmable in a general sense.
NASA Astrophysics Data System (ADS)
Babayan, Pavel; Smirnov, Sergey; Strotov, Valery
2017-10-01
This paper describes the aerial object recognition algorithm for on-board and stationary vision system. Suggested algorithm is intended to recognize the objects of a specific kind using the set of the reference objects defined by 3D models. The proposed algorithm based on the outer contour descriptor building. The algorithm consists of two stages: learning and recognition. Learning stage is devoted to the exploring of reference objects. Using 3D models we can build the database containing training images by rendering the 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the recognition stage of the algorithm. The recognition stage is focusing on estimating the similarity of the captured object and the reference objects by matching an observed image descriptor and the reference object descriptors. The experimental research was performed using a set of the models of the aircraft of the different types (airplanes, helicopters, UAVs). The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
Parallel algorithm for computation of second-order sequential best rotations
NASA Astrophysics Data System (ADS)
Redif, Soydan; Kasap, Server
2013-12-01
Algorithms for computing an approximate polynomial matrix eigenvalue decomposition of para-Hermitian systems have emerged as a powerful, generic signal processing tool. A technique that has shown much success in this regard is the sequential best rotation (SBR2) algorithm. Proposed is a scheme for parallelising SBR2 with a view to exploiting the modern architectural features and inherent parallelism of field-programmable gate array (FPGA) technology. Experiments show that the proposed scheme can achieve low execution times while requiring minimal FPGA resources.
Implementing Legacy-C Algorithms in FPGA Co-Processors for Performance Accelerated Smart Payloads
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.; Hartzell, Christine
2008-01-01
Accurate, on-board classification of instrument data is used to increase science return by autonomously identifying regions of interest for priority transmission or generating summary products to conserve transmission bandwidth. Due to on-board processing constraints, such classification has been limited to using the simplest functions on a small subset of the full instrument data. FPGA co-processor designs for SVM1 classifiers will lead to significant improvement in on-board classification capability and accuracy.
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based onmore » a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, defined by where Vin is the primary-side DC bus voltage, Vout is the secondary-side DC bus voltage, and n is the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, and user-settable switching frequencies and synchronized control loop update rates of tens of kHz.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication andmore » kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29W to 42W.« less
Systems-on-chip approach for real-time simulation of wheel-rail contact laws
NASA Astrophysics Data System (ADS)
Mei, T. X.; Zhou, Y. J.
2013-04-01
This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
FPGA-based firmware model for extended measurement systems with data quality monitoring
NASA Astrophysics Data System (ADS)
Wojenski, A.; Pozniak, K. T.; Mazon, D.; Chernyshova, M.
2017-08-01
Modern physics experiments requires construction of advanced, modular measurement systems for data processing and registration purposes. Components are often designed in one of the common mechanical and electrical standards, e.g. VME or uTCA. The paper is focused on measurement systems using FPGAs as data processing blocks, especially for plasma diagnostics using GEM detectors with data quality monitoring aspects. In the article is proposed standardized model of HDL FPGA firmware implementation, for use in a wide range of different measurement system. The effort was made in term of flexible implementation of data quality monitoring along with source data dynamic selection. In the paper is discussed standard measurement system model followed by detailed model of FPGA firmware for modular measurement systems. Considered are both: functional blocks and data buses. In the summary, necessary blocks and signal lines are described. Implementation of firmware following the presented rules should provide modular design, with ease of change different parts of it. The key benefit is construction of universal, modular HDL design, that can be applied in different measurement system with simple adjustments.
Berger, Andrew J; Page, Michael R; Jacob, Jan; Young, Justin R; Lewis, Jim; Wenzel, Lothar; Bhallamudi, Vidya P; Johnston-Halperin, Ezekiel; Pelekhov, Denis V; Hammel, P Chris
2014-12-01
Understanding the complex properties of electronic and spintronic devices at the micro- and nano-scale is a topic of intense current interest as it becomes increasingly important for scientific progress and technological applications. In operando characterization of such devices by scanning probe techniques is particularly well-suited for the microscopic study of these properties. We have developed a scanning probe microscope (SPM) which is capable of both standard force imaging (atomic, magnetic, electrostatic) and simultaneous electrical transport measurements. We utilize flexible and inexpensive FPGA (field-programmable gate array) hardware and a custom software framework developed in National Instrument's LabVIEW environment to perform the various aspects of microscope operation and device measurement. The FPGA-based approach enables sensitive, real-time cantilever frequency-shift detection. Using this system, we demonstrate electrostatic force microscopy of an electrically biased graphene field-effect transistor device. The combination of SPM and electrical transport also enables imaging of the transport response to a localized perturbation provided by the scanned cantilever tip. Facilitated by the broad presence of LabVIEW in the experimental sciences and the openness of our software solution, our system permits a wide variety of combined scanning and transport measurements by providing standardized interfaces and flexible access to all aspects of a measurement (input and output signals, and processed data). Our system also enables precise control of timing (synchronization of scanning and transport operations) and implementation of sophisticated feedback protocols, and thus should be broadly interesting and useful to practitioners in the field.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berger, Andrew J., E-mail: berger.156@osu.edu; Page, Michael R.; Young, Justin R.
Understanding the complex properties of electronic and spintronic devices at the micro- and nano-scale is a topic of intense current interest as it becomes increasingly important for scientific progress and technological applications. In operando characterization of such devices by scanning probe techniques is particularly well-suited for the microscopic study of these properties. We have developed a scanning probe microscope (SPM) which is capable of both standard force imaging (atomic, magnetic, electrostatic) and simultaneous electrical transport measurements. We utilize flexible and inexpensive FPGA (field-programmable gate array) hardware and a custom software framework developed in National Instrument's LabVIEW environment to perform themore » various aspects of microscope operation and device measurement. The FPGA-based approach enables sensitive, real-time cantilever frequency-shift detection. Using this system, we demonstrate electrostatic force microscopy of an electrically biased graphene field-effect transistor device. The combination of SPM and electrical transport also enables imaging of the transport response to a localized perturbation provided by the scanned cantilever tip. Facilitated by the broad presence of LabVIEW in the experimental sciences and the openness of our software solution, our system permits a wide variety of combined scanning and transport measurements by providing standardized interfaces and flexible access to all aspects of a measurement (input and output signals, and processed data). Our system also enables precise control of timing (synchronization of scanning and transport operations) and implementation of sophisticated feedback protocols, and thus should be broadly interesting and useful to practitioners in the field.« less
JPEG XS, a new standard for visually lossless low-latency lightweight image compression
NASA Astrophysics Data System (ADS)
Descampe, Antonin; Keinert, Joachim; Richter, Thomas; Fößel, Siegfried; Rouvroy, Gaël.
2017-09-01
JPEG XS is an upcoming standard from the JPEG Committee (formally known as ISO/IEC SC29 WG1). It aims to provide an interoperable visually lossless low-latency lightweight codec for a wide range of applications including mezzanine compression in broadcast and Pro-AV markets. This requires optimal support of a wide range of implementation technologies such as FPGAs, CPUs and GPUs. Targeted use cases are professional video links, IP transport, Ethernet transport, real-time video storage, video memory buffers, and omnidirectional video capture and rendering. In addition to the evaluation of the visual transparency of the selected technologies, a detailed analysis of the hardware and software complexity as well as the latency has been done to make sure that the new codec meets the requirements of the above-mentioned use cases. In particular, the end-to-end latency has been constrained to a maximum of 32 lines. Concerning the hardware complexity, neither encoder nor decoder should require more than 50% of an FPGA similar to Xilinx Artix 7 or 25% of an FPGA similar to Altera Cyclon 5. This process resulted in a coding scheme made of an optional color transform, a wavelet transform, the entropy coding of the highest magnitude level of groups of coefficients, and the raw inclusion of the truncated wavelet coefficients. This paper presents the details and status of the standardization process, a technical description of the future standard, and the latest performance evaluation results.
NASA Astrophysics Data System (ADS)
Mandal, Swagata; Saini, Jogender; Zabołotny, Wojciech M.; Sau, Suman; Chakrabarti, Amlan; Chattopadhyay, Subhasis
2017-03-01
Due to the dramatic increase of data volume in modern high energy physics (HEP) experiments, a robust high-speed data acquisition (DAQ) system is very much needed to gather the data generated during different nuclear interactions. As the DAQ works under harsh radiation environment, there is a fair chance of data corruption due to various energetic particles like alpha, beta, or neutron. Hence, a major challenge in the development of DAQ in the HEP experiment is to establish an error resilient communication system between front-end sensors or detectors and back-end data processing computing nodes. Here, we have implemented the DAQ using field-programmable gate array (FPGA) due to some of its inherent advantages over the application-specific integrated circuit. A novel orthogonal concatenated code and cyclic redundancy check (CRC) have been used to mitigate the effects of data corruption in the user data. Scrubbing with a 32-b CRC has been used against error in the configuration memory of FPGA. Data from front-end sensors will reach to the back-end processing nodes through multiple stages that may add an uncertain amount of delay to the different data packets. We have also proposed a novel memory management algorithm that helps to process the data at the back-end computing nodes removing the added path delays. To the best of our knowledge, the proposed FPGA-based DAQ utilizing optical link with channel coding and efficient memory management modules can be considered as first of its kind. Performance estimation of the implemented DAQ system is done based on resource utilization, bit error rate, efficiency, and robustness to radiation.
Fiber laser-microscope system for femtosecond photodisruption of biological samples
Yavaş, Seydi; Erdogan, Mutlu; Gürel, Kutan; Ilday, F. Ömer; Eldeniz, Y. Burak; Tazebay, Uygar H.
2012-01-01
We report on the development of a ultrafast fiber laser-microscope system for femtosecond photodisruption of biological targets. A mode-locked Yb-fiber laser oscillator generates few-nJ pulses at 32.7 MHz repetition rate, amplified up to ∼125 nJ at 1030 nm. Following dechirping in a grating compressor, ∼240 fs-long pulses are delivered to the sample through a diffraction-limited microscope, which allows real-time imaging and control. The laser can generate arbitrary pulse patterns, formed by two acousto-optic modulators (AOM) controlled by a custom-developed field-programmable gate array (FPGA) controller. This capability opens the route to fine optimization of the ablation processes and management of thermal effects. Sample position, exposure time and imaging are all computerized. The capability of the system to perform femtosecond photodisruption is demonstrated through experiments on tissue and individual cells. PMID:22435105
Design of dual energy x-ray detector for conveyor belt with steel wire ropes
NASA Astrophysics Data System (ADS)
Dai, Yue; Miao, Changyun; Rong, Feng
2009-07-01
A dual energy X-ray detector for conveyor belt with steel wire ropes is researched in the paper. Conveyor belt with steel wire ropes is one of primary transfer equipments in modern production. The traditional test methods like electromagnetic induction principle could not display inner image of steel wire ropes directly. So X-ray detection technology has used to detect the conveyor belt. However the image was not so clear by the interference of the rubber belt. Therefore, the dualenergy X-ray detection technology with subtraction method is developed to numerically remove the rubber belt from radiograph, thus improving the definition of the ropes image. The purpose of this research is to design a dual energy Xray detector that could make the operator easier to found the faulty of the belt. This detection system is composed of Xray source, detector controlled by FPGA chip, PC for running image processing system and so on. With the result of the simulating, this design really improved the capability of the staff to test the conveyor belt.
Wide-field ultraviolet imager for astronomical transient studies
NASA Astrophysics Data System (ADS)
Mathew, Joice; Ambily, S.; Prakash, Ajin; Sarpotdar, Mayuresh; Nirmal, K.; G. Sreejith, A.; Safonova, Margarita; Murthy, Jayant; Brosch, Noah
2018-04-01
Though the ultraviolet (UV) domain plays a vital role in the studies of astronomical transient events, the UV time-domain sky remains largely unexplored. We have designed a wide-field UV imager that can be flown on a range of available platforms, such as high-altitude balloons, CubeSats, and larger space missions. The major scientific goals are the variability of astronomical sources, detection of transients such as supernovae, novae, tidal disruption events, and characterizing active galactic nuclei variability. The instrument has a 80 mm aperture with a circular field of view of 10.8 degrees, an angular resolution of ˜22 arcsec, and a 240 - 390 nm spectral observation window. The detector for the instrument is a Microchannel Plate (MCP)-based image intensifier with both photon counting and integration capabilities. An FPGA-based detector readout mechanism and real time data processing have been implemented. The imager is designed in such a way that its lightweight and compact nature are well fitted for the CubeSat dimensions. Here we present various design and developmental aspects of this UV wide-field transient explorer.
PCIE interface design for high-speed image storage system based on SSD
NASA Astrophysics Data System (ADS)
Wang, Shiming
2015-02-01
This paper proposes and implements a standard interface of miniaturized high-speed image storage system, which combines PowerPC with FPGA and utilizes PCIE bus as the high speed switching channel. Attached to the PowerPC, mSATA interface SSD(Solid State Drive) realizes RAID3 array storage. At the same time, a high-speed real-time image compression patent IP core also can be embedded in FPGA, which is in the leading domestic level with compression rate and image quality, making that the system can record higher image data rate or achieve longer recording time. The notebook memory card buckle type design is used in the mSATA interface SSD, which make it possible to complete the replacement in 5 seconds just using single hand, thus the total length of repeated recordings is increased. MSI (Message Signaled Interrupts) interruption guarantees the stability and reliability of continuous DMA transmission. Furthermore, only through the gigabit network, the remote display, control and upload to backup function can be realized. According to an optional 25 frame/s or 30 frame/s, upload speeds can be up to more than 84 MB/s. Compared with the existing FLASH array high-speed memory systems, it has higher degree of modularity, better stability and higher efficiency on development, maintenance and upgrading. Its data access rate is up to 300MB/s, realizing the high speed image storage system miniaturization, standardization and modularization, thus it is fit for image acquisition, storage and real-time transmission to server on mobile equipment.
High speed true random number generator with a new structure of coarse-tuning PDL in FPGA
NASA Astrophysics Data System (ADS)
Fang, Hongzhen; Wang, Pengjun; Cheng, Xu; Zhou, Keji
2018-03-01
A metastability-based TRNG (true random number generator) is presented in this paper, and implemented in FPGA. The metastable state of a D flip-flop is tunable through a two-stage PDL (programmable delay line). With the proposed coarse-tuning PDL structure, the TRNG core does not require extra placement and routing to ensure its entropy. Furthermore, the core needs fewer stages of coarse-tuning PDL at higher operating frequency, and thus saves more resources in FPGA. The designed TRNG achieves 25 Mbps @ 100 MHz throughput after proper post-processing, which is several times higher than other previous TRNGs based on FPGA. Moreover, the robustness of the system is enhanced with the adoption of a feedback system. The quality of the designed TRNG is verified by NIST (National Institute of Standards and Technology) and also accepted by class P1 of the AIS-20/31 test suite. Project supported by the S&T Plan of Zhejiang Provincial Science and Technology Department (No. 2016C31078), the National Natural Science Foundation of China (Nos. 61574041, 61474068, 61234002), and the K.C. Wong Magna Fund in Ningbo University, China.
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
NASA Astrophysics Data System (ADS)
Hamada, Tsuyoshi; Fukushige, Toshiyuki; Kawai, Atsushi; Makino, Junichiro
2000-10-01
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and ``traditional'' GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter relies on a hardwired pipeline processor specialized to gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions, but also other forms of interactions, such as the van der Waals force, hydro\\-dynamical interactions in the SPHr calculation, and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interactions similar to that of GRAPE-3. One pipeline is fitted into a single FPGA chip, operated at 16 MHz clock. Thus, for gravitational interactions, PROGRAPE-1 provided a speed of 0.96 Gflops-equivalent. PROGRAPE will prove to be useful for a wide-range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA
Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang
2009-01-01
Background In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers including parallel computers or multi-core computers exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results RNAalifold shows complicated data dependences, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine grain hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981-residue running on a Personal Computer (PC) platform with Pentium 4 2.6 GHz CPU. PMID:19208138
Removal of anti-Stokes emission background in STED microscopy by FPGA-based synchronous detection
NASA Astrophysics Data System (ADS)
Castello, M.; Tortarolo, G.; Coto Hernández, I.; Deguchi, T.; Diaspro, A.; Vicidomini, G.
2017-05-01
In stimulated emission depletion (STED) microscopy, the role of the STED beam is to de-excite, via stimulated emission, the fluorophores that have been previously excited by the excitation beam. This condition, together with specific beam intensity distributions, allows obtaining true sub-diffraction spatial resolution images. However, if the STED beam has a non-negligible probability to excite the fluorophores, a strong fluorescent background signal (anti-Stokes emission) reduces the effective resolution. For STED scanning microscopy, different synchronous detection methods have been proposed to remove this anti-Stokes emission background and recover the resolution. However, every method works only for a specific STED microscopy implementation. Here we present a user-friendly synchronous detection method compatible with any STED scanning microscope. It exploits a data acquisition (DAQ) card based on a field-programmable gate array (FPGA), which is progressively used in STED microscopy. In essence, the FPGA-based DAQ card synchronizes the fluorescent signal registration, the beam deflection, and the excitation beam interruption, providing a fully automatic pixel-by-pixel synchronous detection method. We validate the proposed method in both continuous wave and pulsed STED microscope systems.
A phase-based stereo vision system-on-a-chip.
Díaz, Javier; Ros, Eduardo; Sabatini, Silvio P; Solari, Fabio; Mota, Sonia
2007-02-01
A simple and fast technique for depth estimation based on phase measurement has been adopted for the implementation of a real-time stereo system with sub-pixel resolution on an FPGA device. The technique avoids the attendant problem of phase warping. The designed system takes full advantage of the inherent processing parallelism and segmentation capabilities of FPGA devices to achieve a computation speed of 65megapixels/s, which can be arranged with a customized frame-grabber module to process 211frames/s at a size of 640x480 pixels. The processing speed achieved is higher than conventional camera frame rates, thus allowing the system to extract multiple estimations and be used as a platform to evaluate integration schemes of a population of neurons without increasing hardware resource demands.
Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research
Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M
2008-01-01
Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation. PMID:18412963
Improving wavelet denoising based on an in-depth analysis of the camera color processing
NASA Astrophysics Data System (ADS)
Seybold, Tamara; Plichta, Mathias; Stechele, Walter
2015-02-01
While Denoising is an extensively studied task in signal processing research, most denoising methods are designed and evaluated using readily processed image data, e.g. the well-known Kodak data set. The noise model is usually additive white Gaussian noise (AWGN). This kind of test data does not correspond to nowadays real-world image data taken with a digital camera. Using such unrealistic data to test, optimize and compare denoising algorithms may lead to incorrect parameter tuning or suboptimal choices in research on real-time camera denoising algorithms. In this paper we derive a precise analysis of the noise characteristics for the different steps in the color processing. Based on real camera noise measurements and simulation of the processing steps, we obtain a good approximation for the noise characteristics. We further show how this approximation can be used in standard wavelet denoising methods. We improve the wavelet hard thresholding and bivariate thresholding based on our noise analysis results. Both the visual quality and objective quality metrics show the advantage of the proposed method. As the method is implemented using look-up-tables that are calculated before the denoising step, our method can be implemented with very low computational complexity and can process HD video sequences real-time in an FPGA.
High-speed image processing system and its micro-optics application
NASA Astrophysics Data System (ADS)
Ohba, Kohtaro; Ortega, Jesus C. P.; Tanikawa, Tamio; Tanie, Kazuo; Tajima, Kenji; Nagai, Hiroshi; Tsuji, Masataka; Yamada, Shigeru
2003-07-01
In this paper, a new application system with high speed photography, i.e. an observational system for the tele-micro-operation, has been proposed with a dynamic focusing system and a high-speed image processing system using the "Depth From Focus (DFF)" criteria. In micro operation, such as for the microsurgery, DNA operation and etc., the small depth of a focus on the microscope makes bad observation. For example, if the focus is on the object, the actuator cannot be seen with the microscope. On the other hand, if the focus is on the actuator, the object cannot be observed. In this sense, the "all-in-focus image," which holds the in-focused texture all over the image, is useful to observe the microenvironments on the microscope. It is also important to obtain the "depth map" which could show the 3D micro virtual environments in real-time to actuate the micro objects, intuitively. To realize the real-time micro operation with DFF criteria, which has to integrate several images to obtain "all-in-focus image" and "depth map," at least, the 240 frames par second based image capture and processing system should be required. At first, this paper briefly reviews the criteria of "depth from focus" to achieve the all-in-focus image and the 3D microenvironments' reconstruction, simultaneously. After discussing the problem in our past system, a new frame-rate system is constructed with the high-speed video camera and FPGA hardware with 240 frames par second. To apply this system in the real microscope, a new criterion "ghost filtering" technique to reconstruct the all-in-focus image is proposed. Finally, the micro observation shows the validity of this system.
Development of ROACH firmware for microwave multiplexed X-ray TES microcalorimeters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madden, T. J.; Cecil, T. W.; Gades, L. M.
We are developing room temperature electronics based upon the ROACH platform for reading out microwave multiplexed X-ray TES. ROACH is an open-source hardware and software platform featuring a large Xilinx Field Programmable Gate Array (FPGA), Power PC processor, several 10GB Ethernet SFP+ interfaces, and a collection of daughter boards for analog signal generation and acquisition. The combination of a ROACH board, ADC/DAC conversion daughter boards, and hardware for RF mixing allows for the generation and capture of multiple RF tones for reading out microwave multiplexed x-ray TES microcalorimeters. The FPGA is used to generate multiple tones in base band, frommore » 10MHz to 250MHz, which are subsequently mixed to RF in the multiple GHz range and sent through the microwave multiplexer. The tones are generated in the FPGA by storing a large lookup table in Quad Data Rate (QDR) SRAM modules and playing out the waveform to a DAC board. Once the signal has been modulated to RF, passed through the microwave multiplexer, and has been modulated back to base band, the signal is digitized by an ADC board. The tones are modulated to 0Hz by using a FPGA circuit consisting of a polyphase filter bank, several Xilinx FFT blocks, Xilinx CORDIC blocks (for converting to magnitude and phase), and special phase accumulator circuit for mixing to exactly 0Hz. Upwards of 256 channels can be simultaneously captured and written into a bank of 256 First-In-First-Out (FIFO) memories, with each FIFO corresponding to a channel. Individual channel data can be further processed in the FPGA before being streamed through a 10GB Ethernet fiber-optic interface to a Linux system. The Linux system runs software written in Python and QT C++ for controlling the ROACH system, capturing data, and processing data.« less
NASA Astrophysics Data System (ADS)
Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.
2008-04-01
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
NASA Astrophysics Data System (ADS)
Jorge, L. S.; Bonifacio, D. A. B.; DeWitt, Don; Miyaoka, R. S.
2016-12-01
Continuous scintillator-based detectors have been considered as a competitive and cheaper approach than highly pixelated discrete crystal positron emission tomography (PET) detectors, despite the need for algorithms to estimate 3D gamma interaction position. In this work, we report on the implementation of a positioning algorithm to estimate the 3D interaction position in a continuous crystal PET detector using a Field Programmable Gate Array (FPGA). The evaluated method is the Statistics-Based Processing (SBP) technique that requires light response function and event position characterization. An algorithm has been implemented using the Verilog language and evaluated using a data acquisition board that contains an Altera Stratix III FPGA. The 3D SBP algorithm was previously successfully implemented on a Stratix II FPGA using simulated data and a different module design. In this work, improvements were made to the FPGA coding of the 3D positioning algorithm, reducing the total memory usage to around 34%. Further the algorithm was evaluated using experimental data from a continuous miniature crystal element (cMiCE) detector module. Using our new implementation, average FWHM (Full Width at Half Maximum) for the whole block is 1.71±0.01 mm, 1.70±0.01 mm and 1.632±0.005 mm for x, y and z directions, respectively. Using a pipelined architecture, the FPGA is able to process 245,000 events per second for interactions inside of the central area of the detector that represents 64% of the total block area. The weighted average of the event rate by regional area (corner, border and central regions) is about 198,000 events per second. This event rate is greater than the maximum expected coincidence rate for any given detector module in future PET systems using the cMiCE detector design.
CBM First-level Event Selector Input Interface Demonstrator
NASA Astrophysics Data System (ADS)
Hutter, Dirk; de Cuveland, Jan; Lindenstruth, Volker
2017-10-01
CBM is a heavy-ion experiment at the future FAIR facility in Darmstadt, Germany. Featuring self-triggered front-end electronics and free-streaming read-out, event selection will exclusively be done by the First Level Event Selector (FLES). Designed as an HPC cluster with several hundred nodes its task is an online analysis and selection of the physics data at a total input data rate exceeding 1 TByte/s. To allow efficient event selection, the FLES performs timeslice building, which combines the data from all given input links to self-contained, potentially overlapping processing intervals and distributes them to compute nodes. Partitioning the input data streams into specialized containers allows performing this task very efficiently. The FLES Input Interface defines the linkage between the FEE and the FLES data transport framework. A custom FPGA PCIe board, the FLES Interface Board (FLIB), is used to receive data via optical links and transfer them via DMA to the host’s memory. The current prototype of the FLIB features a Kintex-7 FPGA and provides up to eight 10 GBit/s optical links. A custom FPGA design has been developed for this board. DMA transfers and data structures are optimized for subsequent timeslice building. Index tables generated by the FPGA enable fast random access to the written data containers. In addition the DMA target buffers can directly serve as InfiniBand RDMA source buffers without copying the data. The usage of POSIX shared memory for these buffers allows data access from multiple processes. An accompanying HDL module has been developed to integrate the FLES link into the front-end FPGA designs. It implements the front-end logic interface as well as the link protocol. Prototypes of all Input Interface components have been implemented and integrated into the FLES test framework. This allows the implementation and evaluation of the foreseen CBM read-out chain.
The Digital Data Acquisition System for the Russian VLBI Network of New Generation
NASA Technical Reports Server (NTRS)
Fedotov, Leonid; Nosov, Eugeny; Grenkov, Sergey; Marshalov, Dmitry
2010-01-01
The system consists of several identical channels of 1024 MHz bandwidth each. In each channel, the RF band is frequency-translated to the intermediate frequency range 1 - 2 GHz. Each channel consists of two parts: the digitizer and Mark 5C recorder. The digitizer is placed on the antenna close to the corresponding Low-Noise Amplifier output and consists of the analog frequency converter, ADC, and a device for digital processing of the signals using FPGA. In the digitizer the subdigitization on frequency of 2048 MHz is used. For producing narrow-band channels and to interface with existing data acquisition systems, the polyphase filtering with FPGA can be used. Digital signals are re-quantized to 2-bits in the FPGA and are transferred to an input of Mark 5C through a fiber line. The breadboard model of the digitizer is being tested, and the data acquisition system is being designed.
Beam Instrument Development System
DOE Office of Scientific and Technical Information (OSTI.GOV)
DOOLITTLE, LAWRENCE; HUANG, GANG; DU, QIANG
Beam Instrumentation Development System (BIDS) is a collection of common support libraries and modules developed during a series of Low-Level Radio Frequency (LLRF) control and timing/synchronization projects. BIDS includes a collection of Hardware Description Language (HDL) libraries and software libraries. The BIDS can be used for the development of any FPGA-based system, such as LLRF controllers. HDL code in this library is generic and supports common Digital Signal Processing (DSP) functions, FPGA-specific drivers (high-speed serial link wrappers, clock generation, etc.), ADC/DAC drivers, Ethernet MAC implementation, etc.
Development of embedded real-time and high-speed vision platform
NASA Astrophysics Data System (ADS)
Ouyang, Zhenxing; Dong, Yimin; Yang, Hua
2015-12-01
Currently, high-speed vision platforms are widely used in many applications, such as robotics and automation industry. However, a personal computer (PC) whose over-large size is not suitable and applicable in compact systems is an indispensable component for human-computer interaction in traditional high-speed vision platforms. Therefore, this paper develops an embedded real-time and high-speed vision platform, ER-HVP Vision which is able to work completely out of PC. In this new platform, an embedded CPU-based board is designed as substitution for PC and a DSP and FPGA board is developed for implementing image parallel algorithms in FPGA and image sequential algorithms in DSP. Hence, the capability of ER-HVP Vision with size of 320mm x 250mm x 87mm can be presented in more compact condition. Experimental results are also given to indicate that the real-time detection and counting of the moving target at a frame rate of 200 fps at 512 x 512 pixels under the operation of this newly developed vision platform are feasible.
SpaceCube 2.0: An Advanced Hybrid Onboard Data Processor
NASA Technical Reports Server (NTRS)
Lin, Michael; Flatley, Thomas; Godfrey, John; Geist, Alessandro; Espinosa, Daniel; Petrick, David
2011-01-01
The SpaceCube 2.0 is a compact, high performance, low-power onboard processing system that takes advantage of cutting-edge hybrid (CPU/FPGA/DSP) processing elements. The SpaceCube 2.0 design concept includes two commercial Virtex-5 field-programmable gate array (FPGA) parts protected by gradiation hardened by software" technology, and possesses exceptional size, weight, and power characteristics [5x5x7 in., 3.5 lb (approximately equal to 12.7 x 12.7 x 17.8 cm, 1.6 kg) 5-25 W, depending on the application fs required clock rate]. The two Virtex-5 FPGA parts are implemented in a unique back-toback configuration to maximize data transfer and computing performance. Draft computing power specifications for the SpaceCube 2.0 unit include four PowerPC 440s (1100 DMIPS each), 500+ DSP48Es (2x580 GMACS), 100+ LVDS high-speed serial I/Os (1.25 Gbps each), and 2x190 GFLOPS single-precision (65 GFLOPS double-precision) floating point performance. The SpaceCube 2.0 includes PROM memory for CPU boot, health and safety, and basic command and telemetry functionality; RAM memory for program execution; and FLASH/EEPROM memory to store algorithms and application code for the CPU, FPGA, and DSP processing elements. Program execution can be reconfigured in real time and algorithms can be updated, modified, and/or replaced at any point during the mission. Gigabit Ethernet, Spacewire, SATA and highspeed LVDS serial/parallel I/O channels are available for instrument/sensor data ingest, and mission-unique instrument interfaces can be accommodated using a compact PCI (cPCI) expansion card interface. The SpaceCube 2.0 can be utilized in NASA Earth Science, Helio/Astrophysics and Exploration missions, and Department of Defense satellites for onboard data processing. It can also be used in commercial communication and mapping satellites.
Carpenter, Thomas M.; Rashid, M. Wasequr; Ghovanloo, Maysam; Cowell, David M. J.; Freear, Steven; Degertekin, F. Levent
2016-01-01
In real-time catheter based 3D ultrasound imaging applications, gathering data from the transducer arrays is difficult as there is a restriction on cable count due to the diameter of the catheter. Although area and power hungry multiplexing circuits integrated at the catheter tip are used in some applications, these are unsuitable for use in small sized catheters for applications like intracardiac imaging. Furthermore, the length requirement for catheters and limited power available to on-chip cable drivers leads to limited signal strength at the receiver end. In this paper an alternative approach using Analog Time Division Multiplexing (TDM) is presented which addresses the cable restrictions of ultrasound catheters. A novel digital demultiplexing technique is also described which allows for a reduction in the number of analog signal processing stages required. The TDM and digital demultiplexing schemes are demonstrated for an intracardiac imaging system that would operate in the 4 MHz to 11 MHz range. A TDM integrated circuit (IC) with 8:1 multiplexer is interfaced with a fast ADC through a micro-coaxial catheter cable bundle, and processed with an FPGA RTL simulation. Input signals to the TDM IC are recovered with −40 dB crosstalk between channels on the same micro-coax, showing the feasibility of this system for ultrasound imaging applications. PMID:27116738
Digital Fingerprinting of Field Programmable Gate Arrays
2008-03-01
48 vii Page Appendix B . Tranistional Sampling Outputs . . . . . . . . . . . . . . 49 Appendix C. VHDL Entities...cumulative sampling outputs by pin . . . . . . . . . . . 48 B .1. FPGA outputs for Sample 0, Clk 18 . . . . . . . . . . . . . . . 49 B .2. FPGA outputs for...Sample 0, Clk 19 . . . . . . . . . . . . . . . 49 B .3. FPGA outputs for Sample 0, Clk 21 . . . . . . . . . . . . . . . 50 B .4. FPGA outputs for Sample
All-IP-Ethernet architecture for real-time sensor-fusion processing
NASA Astrophysics Data System (ADS)
Hiraki, Kei; Inaba, Mary; Tezuka, Hiroshi; Tomari, Hisanobu; Koizumi, Kenichi; Kondo, Shuya
2016-03-01
Serendipter is a device that distinguishes and selects very rare particles and cells from huge amount of population. We are currently designing and constructing information processing system for a Serendipter. The information processing system for Serendipter is a kind of sensor-fusion system but with much more difficulties: To fulfill these requirements, we adopt All IP based architecture: All IP-Ethernet based data processing system consists of (1) sensor/detector directly output data as IP-Ethernet packet stream, (2) single Ethernet/TCP/IP streams by a L2 100Gbps Ethernet switch, (3) An FPGA board with 100Gbps Ethernet I/F connected to the switch and a Xeon based server. Circuits in the FPGA include 100Gbps Ethernet MAC, buffers and preprocessing, and real-time Deep learning circuits using multi-layer neural networks. Proposed All-IP architecture solves existing problem to construct large-scale sensor-fusion systems.
Handheld hyperspectral imager system for chemical/biological and environmental applications
NASA Astrophysics Data System (ADS)
Hinnrichs, Michele; Piatek, Bob
2004-08-01
A small, hand held, battery operated imaging infrared spectrometer, Sherlock, has been developed by Pacific Advanced Technology and was field tested in early 2003. The Sherlock spectral imaging camera has been designed for remote gas leak detection, however, the architecture of the camera is versatile enough that it can be applied to numerous other applications such as homeland security, chemical/biological agent detection, medical and pharmaceutical applications as well as standard research and development. This paper describes the Sherlock camera, theory of operations, shows current applications and touches on potential future applications for the camera. The Sherlock has an embedded Power PC and performs real-time-image processing function in an embedded FPGA. The camera has a built in LCD display as well as output to a standard monitor, or NTSC display. It has several I/O ports, ethernet, firewire, RS232 and thus can be easily controlled from a remote location. In addition, software upgrades can be performed over the ethernet eliminating the need to send the camera back to the factory for a retrofit. Using the USB port a mouse and key board can be connected and the camera can be used in a laboratory environment as a stand alone imaging spectrometer.
Hand-held hyperspectral imager for chemical/biological and environmental applications
NASA Astrophysics Data System (ADS)
Hinnrichs, Michele; Piatek, Bob
2004-03-01
A small, hand held, battery operated imaging infrared spectrometer, Sherlock, has been developed by Pacific Advanced Technology and was field tested in early 2003. The Sherlock spectral imaging camera has been designed for remote gas leak detection, however, the architecture of the camera is versatile enough that it can be applied to numerous other applications such as homeland security, chemical/biological agent detection, medical and pharmaceutical applications as well as standard research and development. This paper describes the Sherlock camera, theory of operations, shows current applications and touches on potential future applications for the camera. The Sherlock has an embedded Power PC and performs real-time-image processing function in an embedded FPGA. The camera has a built in LCD display as well as output to a standard monitor, or NTSC display. It has several I/O ports, ethernet, firewire, RS232 and thus can be easily controlled from a remote location. In addition, software upgrades can be performed over the ethernet eliminating the need to send the camera back to the factory for a retrofit. Using the USB port a mouse and key board can be connected and the camera can be used in a laboratory environment as a stand alone imaging spectrometer.
Sensor Fusion to Estimate the Depth and Width of the Weld Bead in Real Time in GMAW Processes
Sampaio, Renato Coral; Vargas, José A. R.
2018-01-01
The arc welding process is widely used in industry but its automatic control is limited by the difficulty in measuring the weld bead geometry and closing the control loop on the arc, which has adverse environmental conditions. To address this problem, this work proposes a system to capture the welding variables and send stimuli to the Gas Metal Arc Welding (GMAW) conventional process with a constant voltage power source, which allows weld bead geometry estimation with an open-loop control. Dynamic models of depth and width estimators of the weld bead are implemented based on the fusion of thermographic data, welding current and welding voltage in a multilayer perceptron neural network. The estimators were trained and validated off-line with data from a novel algorithm developed to extract the features of the infrared image, a laser profilometer was implemented to measure the bead dimensions and an image processing algorithm that measures depth by making a longitudinal cut in the weld bead. These estimators are optimized for embedded devices and real-time processing and were implemented on a Field-Programmable Gate Array (FPGA) device. Experiments to collect data, train and validate the estimators are presented and discussed. The results show that the proposed method is useful in industrial and research environments. PMID:29570698
Sensor Fusion to Estimate the Depth and Width of the Weld Bead in Real Time in GMAW Processes.
Bestard, Guillermo Alvarez; Sampaio, Renato Coral; Vargas, José A R; Alfaro, Sadek C Absi
2018-03-23
The arc welding process is widely used in industry but its automatic control is limited by the difficulty in measuring the weld bead geometry and closing the control loop on the arc, which has adverse environmental conditions. To address this problem, this work proposes a system to capture the welding variables and send stimuli to the Gas Metal Arc Welding (GMAW) conventional process with a constant voltage power source, which allows weld bead geometry estimation with an open-loop control. Dynamic models of depth and width estimators of the weld bead are implemented based on the fusion of thermographic data, welding current and welding voltage in a multilayer perceptron neural network. The estimators were trained and validated off-line with data from a novel algorithm developed to extract the features of the infrared image, a laser profilometer was implemented to measure the bead dimensions and an image processing algorithm that measures depth by making a longitudinal cut in the weld bead. These estimators are optimized for embedded devices and real-time processing and were implemented on a Field-Programmable Gate Array (FPGA) device. Experiments to collect data, train and validate the estimators are presented and discussed. The results show that the proposed method is useful in industrial and research environments.
Board Saver for Use with Developmental FPGAs
NASA Technical Reports Server (NTRS)
Berkun, Andrew
2009-01-01
A device denoted a board saver has been developed as a means of reducing wear and tear of a printed-circuit board onto which an antifuse field programmable gate array (FPGA) is to be eventually soldered permanently after a number of design iterations. The need for the board saver or a similar device arises because (1) antifuse-FPGA design iterations are common and (2) repeated soldering and unsoldering of FPGAs on the printed-circuit board to accommodate design iterations can wear out the printed-circuit board. The board saver is basically a solderable/unsolderable FPGA receptacle that is installed temporarily on the printed-circuit board. The board saver is, more specifically, a smaller, square-ring-shaped, printed-circuit board (see figure) that contains half via holes one for each contact pad along its periphery. As initially fabricated, the board saver is a wider ring containing full via holes, but then it is milled along its outer edges, cutting the via holes in half and laterally exposing their interiors. The board saver is positioned in registration with the designated FPGA footprint and each via hole is soldered to the outer portion of the corresponding FPGA contact pad on the first-mentioned printed-circuit board. The via-hole/contact joints can be inspected visually and can be easily unsoldered later. The square hole in the middle of the board saver is sized to accommodate the FPGA, and the thickness of the board saver is the same as that of the FPGA. Hence, when a non-final FPGA is placed in the square hole, the combination of the non-final FPGA and the board saver occupy no more area and thickness than would a final FPGA soldered directly into its designated position on the first-mentioned circuit board. The contact leads of a non-final FPGA are not bent and are soldered, at the top of the board saver, to the corresponding via holes. A non-final FPGA can readily be unsoldered from the board saver and replaced by another one. Once the final FPGA design has been determined, the board saver can be unsoldered from the contact pads on the first-mentioned printed-circuit board and replaced by the final FPGA.
Video sensor architecture for surveillance applications.
Sánchez, Jordi; Benet, Ginés; Simó, José E
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%.
Video Sensor Architecture for Surveillance Applications
Sánchez, Jordi; Benet, Ginés; Simó, José E.
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%. PMID:22438723
Atoche, Alejandro Castillo; Castillo, Javier Vázquez
2012-01-01
A high-speed dual super-systolic core for reconstructive signal processing (SP) operations consists of a double parallel systolic array (SA) machine in which each processing element of the array is also conceptualized as another SA in a bit-level fashion. In this study, we addressed the design of a high-speed dual super-systolic array (SSA) core for the enhancement/reconstruction of remote sensing (RS) imaging of radar/synthetic aperture radar (SAR) sensor systems. The selected reconstructive SP algorithms are efficiently transformed in their parallel representation and then, they are mapped into an efficient high performance embedded computing (HPEC) architecture in reconfigurable Xilinx field programmable gate array (FPGA) platforms. As an implementation test case, the proposed approach was aggregated in a HW/SW co-design scheme in order to solve the nonlinear ill-posed inverse problem of nonparametric estimation of the power spatial spectrum pattern (SSP) from a remotely sensed scene. We show how such dual SSA core, drastically reduces the computational load of complex RS regularization techniques achieving the required real-time operational mode. PMID:22736964
NASA Astrophysics Data System (ADS)
Jridi, Maher; Alfalou, Ayman
2017-05-01
By this paper, the major goal is to investigate the Multi-CPU/FPGA SoC (System on Chip) design flow and to transfer a know-how and skills to rapidly design embedded real-time vision system. Our aim is to show how the use of these devices can be benefit for system level integration since they make possible simultaneous hardware and software development. We take the facial detection and pretreatments as case study since they have a great potential to be used in several applications such as video surveillance, building access control and criminal identification. The designed system use the Xilinx Zedboard platform. The last is the central element of the developed vision system. The video acquisition is performed using either standard webcam connected to the Zedboard via USB interface or several camera IP devices. The visualization of video content and intermediate results are possible with HDMI interface connected to HD display. The treatments embedded in the system are as follow: (i) pre-processing such as edge detection implemented in the ARM and in the reconfigurable logic, (ii) software implementation of motion detection and face detection using either ViolaJones or LBP (Local Binary Pattern), and (iii) application layer to select processing application and to display results in a web page. One uniquely interesting feature of the proposed system is that two functions have been developed to transmit data from and to the VDMA port. With the proposed optimization, the hardware implementation of the Sobel filter takes 27 ms and 76 ms for 640x480, and 720p resolutions, respectively. Hence, with the FPGA implementation, an acceleration of 5 times is obtained which allow the processing of 37 fps and 13 fps for 640x480, and 720p resolutions, respectively.
Onboard Radar Processing Development for Rapid Response Applications
NASA Technical Reports Server (NTRS)
Lou, Yunling; Chien, Steve; Clark, Duane; Doubleday, Josh; Muellerschoen, Ron; Wang, Charles C.
2011-01-01
We are developing onboard processor (OBP) technology to streamline data acquisition on-demand and explore the potential of the L-band SAR instrument onboard the proposed DESDynI mission and UAVSAR for rapid response applications. The technology would enable the observation and use of surface change data over rapidly evolving natural hazards, both as an aid to scientific understanding and to provide timely data to agencies responsible for the management and mitigation of natural disasters. We are adapting complex science algorithms for surface water extent to detect flooding, snow/water/ice classification to assist in transportation/ shipping forecasts, and repeat-pass change detection to detect disturbances. We are near completion of the development of a custom FPGA board to meet the specific memory and processing needs of L-band SAR processor algorithms and high speed interfaces to reformat and route raw radar data to/from the FPGA processor board. We have also developed a high fidelity Matlab model of the SAR processor that is modularized and parameterized for ease to prototype various SAR processor algorithms targeted for the FPGA. We will be testing the OBP and rapid response algorithms with UAVSAR data to determine the fidelity of the products.
Design of extensible meteorological data acquisition system based on FPGA
NASA Astrophysics Data System (ADS)
Zhang, Wen; Liu, Yin-hua; Zhang, Hui-jun; Li, Xiao-hui
2015-02-01
In order to compensate the tropospheric refraction error generated in the process of satellite navigation and positioning. Temperature, humidity and air pressure had to be used in concerned models to calculate the value of this error. While FPGA XC6SLX16 was used as the core processor, the integrated silicon pressure sensor MPX4115A and digital temperature-humidity sensor SHT75 are used as the basic meteorological parameter detection devices. The core processer was used to control the real-time sampling of ADC AD7608 and to acquire the serial output data of SHT75. The data was stored in the BRAM of XC6SLX16 and used to generate standard meteorological parameters in NEMA format. The whole design was based on Altium hardware platform and ISE software platform. The system was described in the VHDL language and schematic diagram to realize the correct detection of temperature, humidity, air pressure. The 8-channel synchronous sampling characteristics of AD7608 and programmable external resources of FPGA laid the foundation for the increasing of analog or digital meteorological element signal. The designed meteorological data acquisition system featured low cost, high performance, multiple expansions.
Design of signal reception and processing system of embedded ultrasonic endoscope
NASA Astrophysics Data System (ADS)
Li, Ming; Yu, Feng; Zhang, Ruiqiang; Li, Yan; Chen, Xiaodong; Yu, Daoyin
2009-11-01
Embedded Ultrasonic Endoscope, based on embedded microprocessor and embedded real-time operating system, sends a micro ultrasonic probe into coelom through the biopsy channel of the Electronic Endoscope to get the fault histology features of digestive organs by rotary scanning, and acquires the pictures of the alimentary canal mucosal surface. At the same time, ultrasonic signals are processed by signal reception and processing system, forming images of the full histology of the digestive organs. Signal Reception and Processing System is an important component of Embedded Ultrasonic Endoscope. However, the traditional design, using multi-level amplifiers and special digital processing circuits to implement signal reception and processing, is no longer satisfying the standards of high-performance, miniaturization and low power requirements that embedded system requires, and as a result of the high noise that multi-level amplifier brought, the extraction of small signal becomes hard. Therefore, this paper presents a method of signal reception and processing based on double variable gain amplifier and FPGA, increasing the flexibility and dynamic range of the Signal Reception and Processing System, improving system noise level, and reducing power consumption. Finally, we set up the embedded experiment system, using a transducer with the center frequency of 8MHz to scan membrane samples, and display the image of ultrasonic echo reflected by each layer of membrane, with a frame rate of 5Hz, verifying the correctness of the system.
Multi-DSP and FPGA based Multi-channel Direct IF/RF Digital receiver for atmospheric radar
NASA Astrophysics Data System (ADS)
Yasodha, Polisetti; Jayaraman, Achuthan; Kamaraj, Pandian; Durga rao, Meka; Thriveni, A.
2016-07-01
Modern phased array radars depend highly on digital signal processing (DSP) to extract the echo signal information and to accomplish reliability along with programmability and flexibility. The advent of ASIC technology has made various digital signal processing steps to be realized in one DSP chip, which can be programmed as per the application and can handle high data rates, to be used in the radar receiver to process the received signal. Further, recent days field programmable gate array (FPGA) chips, which can be re-programmed, also present an opportunity to utilize them to process the radar signal. A multi-channel direct IF/RF digital receiver (MCDRx) is developed at NARL, taking the advantage of high speed ADCs and high performance DSP chips/FPGAs, to be used for atmospheric radars working in HF/VHF bands. Multiple channels facilitate the radar t be operated in multi-receiver modes and also to obtain the wind vector with improved time resolution, without switching the antenna beam. MCDRx has six channels, implemented on a custom built digital board, which is realized using six numbers of ADCs for simultaneous processing of the six input signals, Xilinx vertex5 FPGA and Spartan6 FPGA, and two ADSPTS201 DSP chips, each of which performs one phase of processing. MCDRx unit interfaces with the data storage/display computer via two gigabit ethernet (GbE) links. One of the six channels is used for Doppler beam swinging (DBS) mode and the other five channels are used for multi-receiver mode operations, dedicatedly. Each channel has (i) ADC block, to digitize RF/IF signal, (ii) DDC block for digital down conversion of the digitized signal, (iii) decoding block to decode the phase coded signal, and (iv) coherent integration block for integrating the data preserving phase intact. ADC block consists of Analog devices make AD9467 16-bit ADCs, to digitize the input signal at 80 MSPS. The output of ADC is centered around (80 MHz - input frequency). The digitized data is fed to DDC block, which down converts the data to base-band. The DDC block has NCO, mixer and two chains of Bessel filters (fifth order cascaded integration comb filter, two FIR filters, two half band filters and programmable FIR filters) for in-phase (I) and Quadrature phase (Q) channels. The NCO has 32 bits and is set to match the output frequency of ADC. Further, DDC down samples (decimation) the data and reduces the data rate to 16 MSPS. This data is further decimated and the data rate is reduced down to 4/2/1/0.5/0.25/0.125/0.0625 MSPS for baud lengths 0.25/0.5/1/2/4/8/16 μs respectively. The down sampled data is then fed to decoding block, which performs cross correlation to achieve pulse compression of the binary-phase coded data to obtain better range resolution with maximum possible height coverage. This step improves the signal power by a factor equal to the length of the code. Coherent integration block integrates the decoded data coherently for successive pulses, which improves the signal to noise ratio and reduces the data volume. DDC, decoding and coherent integration blocks are implemented in Xilinx vertex5 FPGA. Till this point, function of all six channels is same for DBS mode and multi-receiver modes. Data from vertex5 FPGA is transferred to PC via GbE-1 interface for multi-modes or to two Analog devices make ADSP-TS201 DSP chips (A and B), via link port for DBS mode. ADSP-TS201 chips perform the normalization, DC removal, windowing, FFT computation and spectral averaging on the data, which is transferred to storage/display PC via GbE-2 interface for real-time data display and data storing. Physical layer of GbE interface is implemented in an external chip (Marvel 88E1111) and MAC layer is implemented internal to vertex5 FPGA. The MCDRx has total 4 GB of DDR2 memory for data storage. Spartan6 FPGA is used for generating timing signals, required for basic operation of the radar and testing of the MCDRx.
Alali, Sanaz; Gribble, Adam; Vitkin, I Alex
2016-03-01
A new polarimetry method is demonstrated to image the entire Mueller matrix of a turbid sample using four photoelastic modulators (PEMs) and a charge coupled device (CCD) camera, with no moving parts. Accurate wide-field imaging is enabled with a field-programmable gate array (FPGA) optical gating technique and an evolutionary algorithm (EA) that optimizes imaging times. This technique accurately and rapidly measured the Mueller matrices of air, polarization elements, and turbid phantoms. The system should prove advantageous for Mueller matrix analysis of turbid samples (e.g., biological tissues) over large fields of view, in less than a second.
NASA Tech Briefs, September 2010
NASA Technical Reports Server (NTRS)
2010-01-01
Topics covered include: Instrument for Measuring Thermal Conductivity of Materials at Low Temperatures; Multi-Axis Accelerometer Calibration System; Pupil Alignment Measuring Technique and Alignment Reference for Instruments or Optical Systems; Autonomous System for Monitoring the Integrity of Composite Fan Housings; A Safe, Self-Calibrating, Wireless System for Measuring Volume of Any Fuel at Non-Horizontal Orientation; Adaptation of the Camera Link Interface for Flight-Instrument Applications; High-Performance CCSDS Encapsulation Service Implementation in FPGA; High-Performance CCSDS AOS Protocol Implementation in FPGA; Advanced Flip Chips in Extreme Temperature Environments; Diffuse-Illumination Systems for Growing Plants; Microwave Plasma Hydrogen Recovery System; Producing Hydrogen by Plasma Pyrolysis of Methane; Self-Deployable Membrane Structures; Reactivation of a Tin-Oxide-Containing Catalys; Functionalization of Single-Wall Carbon Nanotubes by Photo-Oxidation; Miniature Piezoelectric Macro-Mass Balance; Acoustic Liner for Turbomachinery Applications; Metering Gas Strut for Separating Rocket Stages; Large-Flow-Area Flow-Selective Liquid/Gas Separator; Counterflowing Jet Subsystem Design; Water Tank with Capillary Air/Liquid Separation; True Shear Parallel Plate Viscometer; Focusing Diffraction Grating Element with Aberration Control; Universal Millimeter-Wave Radar Front End; Mode Selection for a Single-Frequency Fiber Laser; Qualification and Selection of Flight Diode Lasers for Space Applications; Plenoptic Imager for Automated Surface Navigation; Maglev Facility for Simulating Variable Gravity; Hybrid AlGaN-SiC Avalanche Photodiode for Deep-UV Photon Detection; High-Speed Operation of Interband Cascade Lasers; 3D GeoWall Analysis System for Shuttle External Tank Foreign Object Debris Events; Charge-Spot Model for Electrostatic Forces in Simulation of Fine Particulates; Hidden Statistics Approach to Quantum Simulations; Reconstituted Three-Dimensional Interactive Imaging; Determining Atmospheric-Density Profile of Titan; Digital Microfluidics Sample Analyzer; Radiation Protection Using Carbon Nanotube Derivatives; Process to Selectively Distinguish Viable from Non-Viable Bacterial Cells; and TEAMS Model Analyzer.
A modular and programmable development platform for capsule endoscopy system.
Khan, Tareq Hasan; Shrestha, Ravi; Wahid, Khan A
2014-06-01
The state-of-the-art capsule endoscopy (CE) technology offers painless examination for the patients and the ability to examine the interior of the gastrointestinal tract by a noninvasive procedure for the gastroenterologists. In this work, a modular and flexible CE development system platform consisting of a miniature field programmable gate array (FPGA) based electronic capsule, a microcontroller based portable data recorder unit and computer software is designed and developed. Due to the flexible and reprogrammable nature of the system, various image processing and compression algorithms can be tested in the design without requiring any hardware change. The designed capsule prototype supports various imaging modes including white light imaging (WLI) and narrow band imaging (NBI), and communicates with the data recorder in full duplex fashion, which enables configuring the image size and imaging mode in real time during examination. A low complexity image compressor based on a novel color-space is implemented inside the capsule to reduce the amount of RF transmission data. The data recorder contains graphical LCD for real time image viewing and SD cards for storing image data. Data can be uploaded to a computer or Smartphone by SD card, USB interface or by wireless Bluetooth link. Computer software is developed that decompresses and reconstructs images. The fabricated capsule PCBs have a diameter of 16 mm. An ex-vivo animal testing has also been conducted to validate the results.
Design and implementation of a programming circuit in radiation-hardened FPGA
NASA Astrophysics Data System (ADS)
Lihua, Wu; Xiaowei, Han; Yan, Zhao; Zhongli, Liu; Fang, Yu; Chen, Stanley L.
2011-08-01
We present a novel programming circuit used in our radiation-hardened field programmable gate array (FPGA) chip. This circuit provides the ability to write user-defined configuration data into an FPGA and then read it back. The proposed circuit adopts the direct-access programming point scheme instead of the typical long token shift register chain. It not only saves area but also provides more flexible configuration operations. By configuring the proposed partial configuration control register, our smallest configuration section can be conveniently configured as a single data and a flexible partial configuration can be easily implemented. The hierarchical simulation scheme, optimization of the critical path and the elaborate layout plan make this circuit work well. Also, the radiation hardened by design programming point is introduced. This circuit has been implemented in a static random access memory (SRAM)-based FPGA fabricated by a 0.5 μm partial-depletion silicon-on-insulator CMOS process. The function test results of the fabricated chip indicate that this programming circuit successfully realizes the desired functions in the configuration and read-back. Moreover, the radiation test results indicate that the programming circuit has total dose tolerance of 1 × 105 rad(Si), dose rate survivability of 1.5 × 1011 rad(Si)/s and neutron fluence immunity of 1 × 1014 n/cm2.
Chaos and Cryptography: A new dimension in secure communications
NASA Astrophysics Data System (ADS)
Banerjee, Santo; Kurths, J.
2014-06-01
This issue is a collection of contributions on recent developments and achievements of cryptography and communications using chaos. The various contributions report important and promising results such as synchronization of networks and data transmissions; image cipher; optical and TDMA communications, quantum keys etc. Various experiments and applications such as FPGA, smartphone cipher, semiconductor lasers etc, are also included.
An embedded laser marking controller based on ARM and FPGA processors.
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of the laser information processing technology. The existing laser marking machine based on PC and WINDOWS operating system, are large and inconvenient to move. Still, it cannot work outdoors or in other harsh environments. In order to compensate for the above mentioned disadvantages, this paper proposed an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller controls the galvanometers synchronously and could achieve precise marking.
Field programmable gate array-assigned complex-valued computation and its limits
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernard-Schwarz, Maria, E-mail: maria.bernardschwarz@ni.com; Institute of Applied Physics, TU Wien, Wiedner Hauptstrasse 8, 1040 Wien; Zwick, Wolfgang
We discuss how leveraging Field Programmable Gate Array (FPGA) technology as part of a high performance computing platform reduces latency to meet the demanding real time constraints of a quantum optics simulation. Implementations of complex-valued operations using fixed point numeric on a Virtex-5 FPGA compare favorably to more conventional solutions on a central processing unit. Our investigation explores the performance of multiple fixed point options along with a traditional 64 bits floating point version. With this information, the lowest execution times can be estimated. Relative error is examined to ensure simulation accuracy is maintained.
A control system based on field programmable gate array for papermaking sewage treatment
NASA Astrophysics Data System (ADS)
Zhang, Zi Sheng; Xie, Chang; Qing Xiong, Yan; Liu, Zhi Qiang; Li, Qing
2013-03-01
A sewage treatment control system is designed to improve the efficiency of papermaking wastewater treatment system. The automation control system is based on Field Programmable Gate Array (FPGA), coded with Very-High-Speed Integrate Circuit Hardware Description Language (VHDL), compiled and simulated with Quartus. In order to ensure the stability of the data used in FPGA, the data is collected through temperature sensors, water level sensor and online PH measurement system. The automatic control system is more sensitive, and both the treatment efficiency and processing power are increased. This work provides a new method for sewage treatment control.
Computing Models for FPGA-Based Accelerators
Herbordt, Martin C.; Gu, Yongfeng; VanCourt, Tom; Model, Josh; Sukhwani, Bharat; Chiu, Matt
2011-01-01
Field-programmable gate arrays are widely considered as accelerators for compute-intensive applications. A critical phase of FPGA application development is finding and mapping to the appropriate computing model. FPGA computing enables models with highly flexible fine-grained parallelism and associative operations such as broadcast and collective response. Several case studies demonstrate the effectiveness of using these computing models in developing FPGA applications for molecular modeling. PMID:21603152
NEW EPICS/RTEMS IOC BASED ON ALTERA SOC AT JEFFERSON LAB
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Jianxun; Seaton, Chad; Allison, Trent L.
A new EPICS/RTEMS IOC based on the Altera System-on-Chip (SoC) FPGA is being designed at Jefferson Lab. The Altera SoC FPGA integrates a dual ARM Cortex-A9 Hard Processor System (HPS) consisting of processor, peripherals and memory interfaces tied seamlessly with the FPGA fabric using a high-bandwidth interconnect backbone. The embedded Altera SoC IOC has features of remote network boot via U-Boot from SD card or QSPI Flash, 1Gig Ethernet, 1GB DDR3 SDRAM on HPS, UART serial ports, and ISA bus interface. RTEMS for the ARM processor BSP were built with CEXP shell, which will dynamically load the EPICS applications atmore » runtime. U-Boot is the primary bootloader to remotely load the kernel image into local memory from a DHCP/TFTP server over Ethernet, and automatically run RTEMS and EPICS. The first design of the SoC IOC will be compatible with Jefferson Lab’s current PC104 IOCs, which have been running in CEBAF 10 years. The next design would be mounting in a chassis and connected to a daughter card via standard HSMC connectors. This standard SoC IOC will become the next generation of low-level IOC for the accelerator controls at Jefferson Lab.« less
Effect of data truncation in an implementation of pixel clustering on a custom computing machine
NASA Astrophysics Data System (ADS)
Leeser, Miriam E.; Theiler, James P.; Estlick, Michael; Kitaryeva, Natalya V.; Szymanski, John J.
2000-10-01
We investigate the effect of truncating the precision of hyperspectral image data for the purpose of more efficiently segmenting the image using a variant of k-means clustering. We describe the implementation of the algorithm on field-programmable gate array (FPGA) hardware. Truncating the data to only a few bits per pixel in each spectral channel permits a more compact hardware design, enabling greater parallelism, and ultimately a more rapid execution. It also enables the storage of larger images in the onboard memory. In exchange for faster clustering, however, one trades off the quality of the produced segmentation. We find, however, that the clustering algorithm can tolerate considerable data truncation with little degradation in cluster quality. This robustness to truncated data can be extended by computing the cluster centers to a few more bits of precision than the data. Since there are so many more pixels than centers, the more aggressive data truncation leads to significant gains in the number of pixels that can be stored in memory and processed in hardware concurrently.
MPCM: a hardware coder for super slow motion video sequences
NASA Astrophysics Data System (ADS)
Alcocer, Estefanía; López-Granado, Otoniel; Gutierrez, Roberto; Malumbres, Manuel P.
2013-12-01
In the last decade, the improvements in VLSI levels and image sensor technologies have led to a frenetic rush to provide image sensors with higher resolutions and faster frame rates. As a result, video devices were designed to capture real-time video at high-resolution formats with frame rates reaching 1,000 fps and beyond. These ultrahigh-speed video cameras are widely used in scientific and industrial applications, such as car crash tests, combustion research, materials research and testing, fluid dynamics, and flow visualization that demand real-time video capturing at extremely high frame rates with high-definition formats. Therefore, data storage capability, communication bandwidth, processing time, and power consumption are critical parameters that should be carefully considered in their design. In this paper, we propose a fast FPGA implementation of a simple codec called modulo-pulse code modulation (MPCM) which is able to reduce the bandwidth requirements up to 1.7 times at the same image quality when compared with PCM coding. This allows current high-speed cameras to capture in a continuous manner through a 40-Gbit Ethernet point-to-point access.
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires of k 2 − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
NASA Astrophysics Data System (ADS)
Daudin, L.; Barberet, Ph.; Serani, L.; Moretto, Ph.
2013-07-01
High resolution ion microbeams, usually used to perform elemental mapping, low dose targeted irradiation or ion beam lithography needs a very flexible beam control system. For this purpose, we have developed a dedicated system (called “CRionScan”), on the AIFIRA facility (Applications Interdisciplinaires des Faisceaux d'Ions en Région Aquitaine). It consists of a stand-alone real-time scanning and imaging instrument based on a Compact Reconfigurable Input/Output (Compact RIO) device from National Instruments™. It is based on a real-time controller, a Field Programmable Gate Array (FPGA), input/output modules and Ethernet connectivity. We have implemented a fast and deterministic beam scanning system interfaced with our commercial data acquisition system without any hardware development. CRionScan is built under LabVIEW™ and has been used on AIFIRA's nanobeam line since 2009 (Barberet et al., 2009, 2011) [1,2]. A Graphical User Interface (GUI) embedded in the Compact RIO as a web page is used to control the scanning parameters. In addition, a fast electrostatic beam blanking trigger has been included in the FPGA and high speed counters (15 MHz) have been implemented to perform dose controlled irradiation and on-line images on the GUI. Analog to Digital converters are used for the beam current measurement and in the near future for secondary electrons imaging. Other functionalities have been integrated in this controller like LED lighting using Pulse Width Modulation and a “NIM Wilkinson ADC” data acquisition.
STRS Compliant FPGA Waveform Development
NASA Technical Reports Server (NTRS)
Nappier, Jennifer; Downey, Joseph
2008-01-01
The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. Current standards were researched and new standard interfaces were proposed. The implementation of the proposed standard interfaces on a laboratory breadboard SDR will be presented.
24/7 security system: 60-FPS color EMCCD camera with integral human recognition
NASA Astrophysics Data System (ADS)
Vogelsong, T. L.; Boult, T. E.; Gardner, D. W.; Woodworth, R.; Johnson, R. C.; Heflin, B.
2007-04-01
An advanced surveillance/security system is being developed for unattended 24/7 image acquisition and automated detection, discrimination, and tracking of humans and vehicles. The low-light video camera incorporates an electron multiplying CCD sensor with a programmable on-chip gain of up to 1000:1, providing effective noise levels of less than 1 electron. The EMCCD camera operates in full color mode under sunlit and moonlit conditions, and monochrome under quarter-moonlight to overcast starlight illumination. Sixty frame per second operation and progressive scanning minimizes motion artifacts. The acquired image sequences are processed with FPGA-compatible real-time algorithms, to detect/localize/track targets and reject non-targets due to clutter under a broad range of illumination conditions and viewing angles. The object detectors that are used are trained from actual image data. Detectors have been developed and demonstrated for faces, upright humans, crawling humans, large animals, cars and trucks. Detection and tracking of targets too small for template-based detection is achieved. For face and vehicle targets the results of the detection are passed to secondary processing to extract recognition templates, which are then compared with a database for identification. When combined with pan-tilt-zoom (PTZ) optics, the resulting system provides a reliable wide-area 24/7 surveillance system that avoids the high life-cycle cost of infrared cameras and image intensifiers.
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David
2006-05-01
The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has soon introduced new processing challenges. The price paid for the wealth spatial and spectral information available from hyperspectral sensors is the enormous amounts of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed-up computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, and (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in selection of parallel hyperspectral algorithms for specific applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lovell, Jack, E-mail: jack.lovell@durham.ac.uk; Culham Centre for Fusion Energy, Culham Science Centre, Abingdon, Oxon OX14 3DB; Naylor, Graham
A new resistive bolometer system has been developed for MAST-Upgrade. It will measure radiated power in the new Super-X divertor, with millisecond time resolution, along 16 vertical and 16 horizontal lines of sight. The system uses a Xilinx Zynq-7000 series Field-Programmable Gate Array (FPGA) in the D-TACQ ACQ2106 carrier to perform real time data acquisition and signal processing. The FPGA enables AC-synchronous detection using high performance digital filtering to achieve a high signal-to-noise ratio and will be able to output processed data in real time with millisecond latency. The system has been installed on 8 previously unused channels of themore » JET vertical bolometer system. Initial results suggest good agreement with data from existing vertical channels but with higher bandwidth and signal-to-noise ratio.« less
New Developments in FPGA: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; Label, Kenneth A.; Pellish, Jonathan
2016-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
New Developments in FPGA: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; LaBel, Kenneth; Pellish, Jonathan
2015-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
Tethered Forth system for FPGA applications
NASA Astrophysics Data System (ADS)
Goździkowski, Paweł; Zabołotny, Wojciech M.
2013-10-01
This paper presents the tethered Forth system dedicated for testing and debugging of FPGA based electronic systems. Use of the Forth language allows to interactively develop and run complex testing or debugging routines. The solution is based on a small, 16-bit soft core CPU, used to implement the Forth Virtual Machine. Thanks to the use of the tethered Forth model it is possible to minimize usage of the internal RAM memory in the FPGA. The function of the intelligent terminal, which is an essential part of the tethered Forth system, may be fulfilled by the standard PC computer or by the smartphone. System is implemented in Python (the software for intelligent terminal), and in VHDL (the IP core for FPGA), so it can be easily ported to different hardware platforms. The connection between the terminal and FPGA may be established and disconnected many times without disturbing the state of the FPGA based system. The presented system has been verified in the hardware, and may be used as a tool for debugging, testing and even implementing of control algorithms for FPGA based systems.
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction
NASA Astrophysics Data System (ADS)
Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh
2014-05-01
Compressive Sensing (CS) is a novel scheme, in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and have sluggish performance, which make them impractical for real-time processing applications . The paper presents novel architectures for Orthogonal Matching Pursuit algorithm, one of the popular CS reconstruction algorithms. We show the implementation results of proposed architectures on FPGA, ASIC and on a custom many-core platform. For FPGA and ASIC implementation, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%. Whereas, for the custom many-core platform, efficient parallelization techniques are applied, to reconstruct signals with variant signal lengths of N and sparsity of m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently paralellized by taking advantage of blocked algorithms. For demonstration purpose, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. Implementation on Xilinx Virtex-5 FPGA, requires 27.14 μs to reconstruct the signal using basic OMP. Whereas, with thresholding method it requires 18 μs. ASIC implementation reconstructs the signal in 13 μs. However, our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that compared to the previous published work of the same algorithm and matrix size, proposed architectures for FPGA and ASIC implementations perform 1.3x and 1.8x respectively faster. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
A custom hardware classifier for bruised apple detection in hyperspectral images
NASA Astrophysics Data System (ADS)
Cárdenas, Javier; Figueroa, Miguel; Pezoa, Jorge E.
2015-09-01
We present a custom digital architecture for bruised apple classification using hyperspectral images in the near infrared (NIR) spectrum. The algorithm classifies each pixel in an image into one of three classes: bruised, non-bruised, and background. We extract two 5-element feature vectors for each pixel using only 10 out of the 236 spectral bands provided by the hyperspectral camera, thereby greatly reducing both the requirements of the imager and the computational complexity of the algorithm. We then use two linear-kernel support vector machine (SVM) to classify each pixel. Each SVM was trained with 504 windows of size 17×17-pixel taken from 14 hyperspectral images of 320×320 pixels each, for each class. The architecture then computes the percentage of bruised pixels in each apple in order to adequately classify the fruit. We implemented the architecture on a Xilinx Zynq Z-7010 field-programmable gate array (FPGA) and tested it on images from a NIR N17E push-broom camera with a frame rate of 25 fps, a band-pixel rate of 1.888 MHz, and 236 spectral bands between 900 and 1700 nanometers in laboratory conditions. Using 28-bit fixed-point arithmetic, the circuit accurately discriminates 95.2% of the pixels corresponding to an apple, 81% of the pixels corresponding to a bruised apple, and 96.4% of the background. With the default threshold settings, the highest false positive (FP) for a bruised apple is 18.7%. The circuit operates at the native frame rate of the camera, consumes 67 mW of dynamic power, and uses less than 10% of the logic resources on the FPGA.
Design of a system based on DSP and FPGA for video recording and replaying
NASA Astrophysics Data System (ADS)
Kang, Yan; Wang, Heng
2013-08-01
This paper brings forward a video recording and replaying system with the architecture of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA). The system achieved encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals which are displayed on a monitor during airplanes and ships' navigating. In the architecture, the DSP is a main processor which is used for a large amount of complicated calculation during digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is utilized to implement seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) in the system. This transfer mode can avoid the bottle-neck of the data transfer and simplify the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level matching chips are used to implement Advanced Technology Attachment (ATA) protocol on physical layer of the interface of an Integrated Drive Electronics (IDE) Hard Disk (HD), which has a high speed in data access and does not rely on a computer. Main functions of the logic on the FPGA are described and the screenshots of the behavioral simulation are provided in this paper. In the design of program on the DSP, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM to exert the CPU's high performance on computing without intervention by the CPU and save its time spending. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways and means of acquiring high performance for code are briefly present. The ability of data processing of the system is desirable. And smoothness of the replayed video is acceptable. By right of its design flexibility and reliable operation, the system based on DSP and FPGA for video recording and replaying has a considerable perspective in analysis after the event, simulated exercitation and so forth.
High-Speed Digital Scan Converter for High-Frequency Ultrasound Sector Scanners
Chang, Jin Ho; Yen, Jesse T.; Shung, K. Kirk
2008-01-01
This paper presents a high-speed digital scan converter (DSC) capable of providing more than 400 images per second, which is necessary to examine the activities of the mouse heart whose rate is 5–10 beats per second. To achieve the desired high-speed performance in cost-effective manner, the DSC developed adopts a linear interpolation algorithm in which two nearest samples to each object pixel of a monitor are selected and only angular interpolation is performed. Through computer simulation with the Field II program, its accuracy was investigated by comparing it to that of bilinear interpolation known as the best algorithm in terms of accuracy and processing speed. The simulation results show that the linear interpolation algorithm is capable of providing an acceptable image quality, which means that the difference of the root mean square error (RMSE) values of the linear and bilinear interpolation algorithms is below 1 %, if the sample rate of the envelope samples is at least four times higher than the Nyquist rate for the baseband component of echo signals. The designed DSC was implemented with a single FPGA (Stratix EP1S60F1020C6, Altera Corporation, San Jose, CA) on a DSC board that is a part of a high-speed ultrasound imaging system developed. The temporal and spatial resolutions of the implemented DSC were evaluated by examining its maximum processing time with a time stamp indicating when an image is completely formed and wire phantom testing, respectively. The experimental results show that the implemented DSC is capable of providing images at the rate of 400 images per second with negligible processing error. PMID:18430449
A programmable controller based on CAN field bus embedded microprocessor and FPGA
NASA Astrophysics Data System (ADS)
Cai, Qizhong; Guo, Yifeng; Chen, Wenhei; Wang, Mingtao
2008-10-01
One kind of new programmable controller(PLC) is introduced in this paper. The advanced embedded microprocessor and Field-Programmable Gate Array (FPGA) device are applied in the PLC system. The PLC system structure was presented in this paper. It includes 32 bits Advanced RISC Machines (ARM) embedded microprocessor as control core, FPGA as control arithmetic coprocessor and CAN bus as data communication criteria protocol connected the host controller and its various extension modules. It is detailed given that the circuits and working principle, IiO interface circuit between ARM and FPGA and interface circuit between ARM and FPGA coprocessor. Furthermore the interface circuit diagrams between various modules are written. In addition, it is introduced that ladder chart program how to control the transfer info of control arithmetic part in FPGA coprocessor. The PLC, through nearly two months of operation to meet the design of the basic requirements.
Generating clock signals for a cycle accurate, cycle reproducible FPGA based hardware accelerator
Asaad, Sameth W.; Kapur, Mohit
2016-01-05
A method, system and computer program product are disclosed for generating clock signals for a cycle accurate FPGA based hardware accelerator used to simulate operations of a device-under-test (DUT). In one embodiment, the DUT includes multiple device clocks generating multiple device clock signals at multiple frequencies and at a defined frequency ratio; and the FPG hardware accelerator includes multiple accelerator clocks generating multiple accelerator clock signals to operate the FPGA hardware accelerator to simulate the operations of the DUT. In one embodiment, operations of the DUT are mapped to the FPGA hardware accelerator, and the accelerator clock signals are generated at multiple frequencies and at the defined frequency ratio of the frequencies of the multiple device clocks, to maintain cycle accuracy between the DUT and the FPGA hardware accelerator. In an embodiment, the FPGA hardware accelerator may be used to control the frequencies of the multiple device clocks.
Real-time Astrometry Using Phase Congruency
NASA Astrophysics Data System (ADS)
Lambert, A.; Polo, M.; Tang, Y.
Phase congruency is a computer vision technique that proves to perform well for determining the tracks of optical objects (Flewelling, AMOS 2014). We report on a real-time implementation of this using an FPGA and CMOS Image Sensor, with on-sky data. The lightweight instrument can provide tracking update signals to the mount of the telescope, as well as determine abnormal objects in the scene.
Ripple FPN reduced algorithm based on temporal high-pass filter and hardware implementation
NASA Astrophysics Data System (ADS)
Li, Yiyang; Li, Shuo; Zhang, Zhipeng; Jin, Weiqi; Wu, Lei; Jin, Minglei
2016-11-01
Cooled infrared detector arrays always suffer from undesired Ripple Fixed-Pattern Noise (FPN) when observe the scene of sky. The Ripple Fixed-Pattern Noise seriously affect the imaging quality of thermal imager, especially for small target detection and tracking. It is hard to eliminate the FPN by the Calibration based techniques and the current scene-based nonuniformity algorithms. In this paper, we present a modified space low-pass and temporal high-pass nonuniformity correction algorithm using adaptive time domain threshold (THP&GM). The threshold is designed to significantly reduce ghosting artifacts. We test the algorithm on real infrared in comparison to several previously published methods. This algorithm not only can effectively correct common FPN such as Stripe, but also has obviously advantage compared with the current methods in terms of detail protection and convergence speed, especially for Ripple FPN correction. Furthermore, we display our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA). The hardware implementation of the algorithm based on FPGA has two advantages: (1) low resources consumption, and (2) small hardware delay (less than 20 lines). The hardware has been successfully applied in actual system.
Stego on FPGA: An IWT Approach
Ramalingam, Balakrishnan
2014-01-01
A reconfigurable hardware architecture for the implementation of integer wavelet transform (IWT) based adaptive random image steganography algorithm is proposed. The Haar-IWT was used to separate the subbands namely, LL, LH, HL, and HH, from 8 × 8 pixel blocks and the encrypted secret data is hidden in the LH, HL, and HH blocks using Moore and Hilbert space filling curve (SFC) scan patterns. Either Moore or Hilbert SFC was chosen for hiding the encrypted data in LH, HL, and HH coefficients, whichever produces the lowest mean square error (MSE) and the highest peak signal-to-noise ratio (PSNR). The fixated random walk's verdict of all blocks is registered which is nothing but the furtive key. Our system took 1.6 µs for embedding the data in coefficient blocks and consumed 34% of the logic elements, 22% of the dedicated logic register, and 2% of the embedded multiplier on Cyclone II field programmable gate array (FPGA). PMID:24723794
Colt: an experiment in wormhole run-time reconfiguration
NASA Astrophysics Data System (ADS)
Bittner, Ray; Athanas, Peter M.; Musgrove, Mark
1996-10-01
Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
A high performance hardware implementation image encryption with AES algorithm
NASA Astrophysics Data System (ADS)
Farmani, Ali; Jafari, Mohamad; Miremadi, Seyed Sohrab
2011-06-01
This paper describes implementation of a high-speed encryption algorithm with high throughput for encrypting the image. Therefore, we select a highly secured symmetric key encryption algorithm AES(Advanced Encryption Standard), in order to increase the speed and throughput using pipeline technique in four stages, control unit based on logic gates, optimal design of multiplier blocks in mixcolumn phase and simultaneous production keys and rounds. Such procedure makes AES suitable for fast image encryption. Implementation of a 128-bit AES on FPGA of Altra company has been done and the results are as follow: throughput, 6 Gbps in 471MHz. The time of encrypting in tested image with 32*32 size is 1.15ms.
Implementation of image transmission server system using embedded Linux
NASA Astrophysics Data System (ADS)
Park, Jong-Hyun; Jung, Yeon Sung; Nam, Boo Hee
2005-12-01
In this paper, we performed the implementation of image transmission server system using embedded system that is for the specified object and easy to install and move. Since the embedded system has lower capability than the PC, we have to reduce the quantity of calculation of the baseline JPEG image compression and transmission. We used the Redhat Linux 9.0 OS at the host PC and the target board based on embedded Linux. The image sequences are obtained from the camera attached to the FPGA (Field Programmable Gate Array) board with ALTERA cooperation chip. For effectiveness and avoiding some constraints from the vendor's own, we made the device driver using kernel module.
New Developments in FPGA Devices: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Kenneth; Pellish, Jonathan
2016-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
Digitization of Analog Signals using a Field Programmable Gate Array (FPGA)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilera, Daniel; Rusu, Vadim
The idea of this research is consolidating the electrical components used for capturing data in the Mu2e Tracker. Ideally, an FPGA will serve as the Time-Division Converters (TDC) and Analog-to-Digital Converters (ADC). The TDC is already being carried out by the FPGA, but we are still using off the shelf ADCs. This poster proposes using Low Voltage Differential Signaling as the basis for analog-to-digital conversion using and FPGA.
Performance of PHOTONIS' low light level CMOS imaging sensor for long range observation
NASA Astrophysics Data System (ADS)
Bourree, Loig E.
2014-05-01
Identification of potential threats in low-light conditions through imaging is commonly achieved through closed-circuit television (CCTV) and surveillance cameras by combining the extended near infrared (NIR) response (800-10000nm wavelengths) of the imaging sensor with NIR LED or laser illuminators. Consequently, camera systems typically used for purposes of long-range observation often require high-power lasers in order to generate sufficient photons on targets to acquire detailed images at night. While these systems may adequately identify targets at long-range, the NIR illumination needed to achieve such functionality can easily be detected and therefore may not be suitable for covert applications. In order to reduce dependency on supplemental illumination in low-light conditions, the frame rate of the imaging sensors may be reduced to increase the photon integration time and thus improve the signal to noise ratio of the image. However, this may hinder the camera's ability to image moving objects with high fidelity. In order to address these particular drawbacks, PHOTONIS has developed a CMOS imaging sensor (CIS) with a pixel architecture and geometry designed specifically to overcome these issues in low-light level imaging. By combining this CIS with field programmable gate array (FPGA)-based image processing electronics, PHOTONIS has achieved low-read noise imaging with enhanced signal-to-noise ratio at quarter moon illumination, all at standard video frame rates. The performance of this CIS is discussed herein and compared to other commercially available CMOS and CCD for long-range observation applications.
Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems
NASA Technical Reports Server (NTRS)
Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.
2012-01-01
Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, a FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGAs used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.
An Embedded Laser Marking Controller Based on ARM and FPGA Processors
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of the laser information processing technology. The existing laser marking machine based on PC and WINDOWS operating system, are large and inconvenient to move. Still, it cannot work outdoors or in other harsh environments. In order to compensate for the above mentioned disadvantages, this paper proposed an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller controls the galvanometers synchronously and could achieve precise marking. PMID:24772028
FPGA-based fused smart-sensor for tool-wear area quantitative estimation in CNC machine inserts.
Trejo-Hernandez, Miguel; Osornio-Rios, Roque Alfredo; de Jesus Romero-Troncoso, Rene; Rodriguez-Donate, Carlos; Dominguez-Gonzalez, Aurelio; Herrera-Ruiz, Gilberto
2010-01-01
Manufacturing processes are of great relevance nowadays, when there is a constant claim for better productivity with high quality at low cost. The contribution of this work is the development of a fused smart-sensor, based on FPGA to improve the online quantitative estimation of flank-wear area in CNC machine inserts from the information provided by two primary sensors: the monitoring current output of a servoamplifier, and a 3-axis accelerometer. Results from experimentation show that the fusion of both parameters makes it possible to obtain three times better accuracy when compared with the accuracy obtained from current and vibration signals, individually used.
Hardware Timestamping for an Image Acquisition System Based on FlexRIO and IEEE 1588 v2 Standard
NASA Astrophysics Data System (ADS)
Esquembri, S.; Sanz, D.; Barrera, E.; Ruiz, M.; Bustos, A.; Vega, J.; Castro, R.
2016-02-01
Current fusion devices usually implement distributed acquisition systems for the multiple diagnostics of their experiments. However, each diagnostic is composed by hundreds or even thousands of signals, including images from the vessel interior. These signals and images must be correctly timestamped, because all the information will be analyzed to identify plasma behavior using temporal correlations. For acquisition devices without synchronization mechanisms the timestamp is given by another device with timing capabilities when signaled by the first device. Later, each data should be related with its timestamp, usually via software. This critical action is unfeasible for software applications when sampling rates are high. In order to solve this problem this paper presents the implementation of an image acquisition system with real-time hardware timestamping mechanism. This is synchronized with a master clock using the IEEE 1588 v2 Precision Time Protocol (PTP). Synchronization, image acquisition and processing, and timestamping mechanisms are implemented using Field Programmable Gate Array (FPGA) and a timing card -PTP v2 synchronized. The system has been validated using a camera simulator streaming videos from fusion databases. The developed architecture is fully compatible with ITER Fast Controllers and has been integrated with EPICS to control and monitor the whole system.
SpaceCubeX: A Framework for Evaluating Hybrid Multi-Core CPU FPGA DSP Architectures
NASA Technical Reports Server (NTRS)
Schmidt, Andrew G.; Weisz, Gabriel; French, Matthew; Flatley, Thomas; Villalpando, Carlos Y.
2017-01-01
The SpaceCubeX project is motivated by the need for high performance, modular, and scalable on-board processing to help scientists answer critical 21st century questions about global climate change, air quality, ocean health, and ecosystem dynamics, while adding new capabilities such as low-latency data products for extreme event warnings. These goals translate into on-board processing throughput requirements that are on the order of 100-1,000 more than those of previous Earth Science missions for standard processing, compression, storage, and downlink operations. To study possible future architectures to achieve these performance requirements, the SpaceCubeX project provides an evolvable testbed and framework that enables a focused design space exploration of candidate hybrid CPU/FPGA/DSP processing architectures. The framework includes ArchGen, an architecture generator tool populated with candidate architecture components, performance models, and IP cores, that allows an end user to specify the type, number, and connectivity of a hybrid architecture. The framework requires minimal extensions to integrate new processors, such as the anticipated High Performance Spaceflight Computer (HPSC), reducing time to initiate benchmarking by months. To evaluate the framework, we leverage a wide suite of high performance embedded computing benchmarks and Earth science scenarios to ensure robust architecture characterization. We report on our projects Year 1 efforts and demonstrate the capabilities across four simulation testbed models, a baseline SpaceCube 2.0 system, a dual ARM A9 processor system, a hybrid quad ARM A53 and FPGA system, and a hybrid quad ARM A53 and DSP system.
Fast modular data acquisition system for GEM-2D detector
NASA Astrophysics Data System (ADS)
Kasprowicz, G.; Byszuk, Adrian; Wojeński, A.; Zienkiewicz, P.; Czarski, T.; Chernyshova, M.; Poźniak, K.; Rzadkiewicz, J.; Zabolotny, W.; Juszczyk, B.
2014-11-01
A novel approach to two dimensional Gas Electron Multiplier (GEM) detector readout is presented. Unlike commonly used methods, based on discriminators and analogue FIFOs, the method developed uses simulta- neously sampling high speed ADCs with fast hybrid integrator and advanced FPGA-based processing logic to estimate the energy of every single photon. Such a method is applied to every GEM strip / pixel signal. It is especially useful in case of crystal-based spectrometers for soft X-rays, 2D imaging for plasma tomography and all these applications where energy resolution of every single photon is required. For the purpose of the detector readout, a novel, highly modular and extendable conception of the measurement platform was developed. It is evolution of already deployed measurement system for JET Spectrometer.
A Star Image Extractor for Small Satellites
NASA Astrophysics Data System (ADS)
Yamada, Yoshiyuki; Yamauchi, Masahiro; Gouda, Naoteru; Kobayashi, Yukiyasu; Tsujimoto, Takuji; Yano, Taihei; Suganuma, Masahiro; Nakasuka, Shinichi; Sako, Nobutada; Inamori, Takaya
We have developed a Star Image Extractor (SIE) which works as an on-board real-time image processor. It is a logic circuit written on an FPGA(Field Programmable Gate Array) device. It detects and extracts only an object data from raw image data. SIE will be required with the Nano-JASMINE 1) satellite. Nano-JASMINE is the small astrometry satellite that observes objects in our galaxy. It will be launched in 2010 and needs two years mission period. Nano-JASMINE observes an object with the TDI (Time Delayed Integration) observation mode. TDI is one of operation modes of CCD detector. Data is obtained, by rotating the imaging system including CCD at a rated synchronized with a vertical charge transfer of CCD. Obtained image data is sent through SIE to the Mission-controller.
FPGA-based Upgrade to RITS-6 Control System, Designed with EMP Considerations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harold D. Anderson, John T. Williams
2009-07-01
The existing control system for the RITS-6, a 20-MA 3-MV pulsed-power accelerator located at Sandia National Laboratories, was built as a system of analog switches because the operators needed to be close enough to the machine to hear pulsed-power breakdowns, yet the electromagnetic pulse (EMP) emitted would disable any processor-based solutions. The resulting system requires operators to activate and deactivate a series of 110-V relays manually in a complex order. The machine is sensitive to both the order of operation and the time taken between steps. A mistake in either case would cause a misfire and possible machine damage. Basedmore » on these constraints, a field-programmable gate array (FPGA) was chosen as the core of a proposed upgrade to the control system. An FPGA is a series of logic elements connected during programming. Based on their connections, the elements can mimic primitive logic elements, a process called synthesis. The circuit is static; all paths exist simultaneously and do not depend on a processor. This should make it less sensitive to EMP. By shielding it and using good electromagnetic interference-reduction practices, it should continue to operate well in the electrically noisy environment. The FPGA has two advantages over the existing system. In manual operation mode, the synthesized logic gates keep the operators in sequence. In addition, a clock signal and synthesized countdown circuit provides an automated sequence, with adjustable delays, for quickly executing the time-critical portions of charging and firing. The FPGA is modeled as a set of states, each state being a unique set of values for the output signals. The state is determined by the input signals, and in the automated segment by the value of the synthesized countdown timer, with the default mode placing the system in a safe configuration. Unlike a processor-based system, any system stimulus that results in an abort situation immediately executes a shutdown, with only a tens-of-nanoseconds delay to propagate across the FPGA. This paper discusses the design, installation, and testing of the proposed system upgrade, including failure statistics and modifications to the original design.« less
Development of a real time magnetic island identification system for HL-2A tokamak.
Chen, Chao; Sun, Shan; Ji, Xiaoquan; Yin, Zejie
2017-08-01
A novel real time magnetic island identification system for HL-2A is introduced. The identification method is based on the measurement of Mirnov probes and the equilibrium flux constructed by the equilibrium fit (EFIT) code. The system consists of an analog front board and a digital processing board connected by a shield cable. Four octal-channel analog-to-digital convertors are utilized for 100 KHz simultaneous sampling of all the probes, and the applications of PCI extensions for Instrumentation platform and reflective memory allow the system to receive EFIT results simultaneously. A high performance field programmable gate array (FPGA) is used to realize the real time identification algorithm. Based on the parallel and pipeline processing of the FPGA, the magnetic island structure can be identified with a cycle time of 3 ms during experiments.
Development of a real time magnetic island identification system for HL-2A tokamak
NASA Astrophysics Data System (ADS)
Chen, Chao; Sun, Shan; Ji, Xiaoquan; Yin, Zejie
2017-08-01
A novel real time magnetic island identification system for HL-2A is introduced. The identification method is based on the measurement of Mirnov probes and the equilibrium flux constructed by the equilibrium fit (EFIT) code. The system consists of an analog front board and a digital processing board connected by a shield cable. Four octal-channel analog-to-digital convertors are utilized for 100 KHz simultaneous sampling of all the probes, and the applications of PCI extensions for Instrumentation platform and reflective memory allow the system to receive EFIT results simultaneously. A high performance field programmable gate array (FPGA) is used to realize the real time identification algorithm. Based on the parallel and pipeline processing of the FPGA, the magnetic island structure can be identified with a cycle time of 3 ms during experiments.
Integration of the Reconfigurable Self-Healing eDNA Architecture in an Embedded System
NASA Technical Reports Server (NTRS)
Boesen, Michael Reibel; Keymeulen, Didier; Madsen, Jan; Lu, Thomas; Chao, Tien-Hsin
2011-01-01
In this work we describe the first real world case study for the self-healing eDNA (electronic DNA) architecture by implementing the control and data processing of a Fourier Transform Spectrometer (FTS) on an eDNA prototype. For this purpose the eDNA prototype has been ported from a Xilinx Virtex 5 FPGA to an embedded system consisting of a PowerPC and a Xilinx Virtex 5 FPGA. The FTS instrument features a novel liquid crystal waveguide, which consequently eliminates all moving parts from the instrument. The addition of the eDNA architecture to do the control and data processing has resulted in a highly fault-tolerant FTS instrument. The case study has shown that the early stage prototype of the autonomous self-healing eDNA architecture is expensive in terms of execution time.
NASA Technical Reports Server (NTRS)
Berg, M. D.; Kim, H. S.; Friendlich, M. A.; Perez, C. E.; Seidlick, C. M.; LaBel, K. A.
2011-01-01
We present SEU test and analysis of the Microsemi ProASIC3 FPGA. SEU Probability models are incorporated for device evaluation. Included is a comparison to the RTAXS FPGA illustrating the effectiveness of the overall testing methodology.
Event management for large scale event-driven digital hardware spiking neural networks.
Caron, Louis-Charles; D'Haene, Michiel; Mailhot, Frédéric; Schrauwen, Benjamin; Rouat, Jean
2013-09-01
The interest in brain-like computation has led to the design of a plethora of innovative neuromorphic systems. Individually, spiking neural networks (SNNs), event-driven simulation and digital hardware neuromorphic systems get a lot of attention. Despite the popularity of event-driven SNNs in software, very few digital hardware architectures are found. This is because existing hardware solutions for event management scale badly with the number of events. This paper introduces the structured heap queue, a pipelined digital hardware data structure, and demonstrates its suitability for event management. The structured heap queue scales gracefully with the number of events, allowing the efficient implementation of large scale digital hardware event-driven SNNs. The scaling is linear for memory, logarithmic for logic resources and constant for processing time. The use of the structured heap queue is demonstrated on a field-programmable gate array (FPGA) with an image segmentation experiment and a SNN of 65,536 neurons and 513,184 synapses. Events can be processed at the rate of 1 every 7 clock cycles and a 406×158 pixel image is segmented in 200 ms. Copyright © 2013 Elsevier Ltd. All rights reserved.
Development of signal processing system of avalanche photo diode for space observations by Astro-H
NASA Astrophysics Data System (ADS)
Ohno, M.; Goto, K.; Hanabata, Y.; Takahashi, H.; Fukazawa, Y.; Yoshino, M.; Saito, T.; Nakamori, T.; Kataoka, J.; Sasano, M.; Torii, S.; Uchiyama, H.; Nakazawa, K.; Watanabe, S.; Kokubun, M.; Ohta, M.; Sato, T.; Takahashi, T.; Tajima, H.
2013-01-01
Astro-H is the sixth Japanese X-ray space observatory which will be launched in 2014. Two of onboard instruments of Astro-H, Hard X-ray Imager and Soft Gamma-ray Detector are surrounded by many number of large Bismuth Germanate (Bi4Ge3O12; BGO) scintillators. Optimum readout system of scintillation lights from these BGOs are essential to reduce the background signals and achieve high performance for main detectors because most of gamma-rays from out of field-of-view of main detectors or radio-isotopes produced inside them due to activation can be eliminated by anti-coincidence technique using BGO signals. We apply Avalanche Photo Diode (APD) for light sensor of these BGO detectors since their compactness and high quantum efficiency make it easy to design such large number of BGO detector system. For signal processing from APDs, digital filter and other trigger logics on the Field-Programmable Gate Array (FPGA) is used instead of discrete analog circuits due to limitation of circuit implementation area on spacecraft. For efficient observations, we have to achieve as low threshold of anti-coincidence signal as possible by utilizing the digital filtering. In addition, such anti-coincident signals should be sent to the main detector within 5 μs to make it in time to veto the A-D conversion. Considering this requirement and constraint from logic size of FPGA, we adopt two types of filter, 8 delay taps filter with only 2 bit precision coefficient and 16 delay taps filter with 8 bit precision coefficient. The data after former simple filter provides anti-coincidence signal quickly in orbit, and the latter filter is used for detail analysis after the data is down-linked.
Sivakamasundari, J; Kavitha, G; Sujatha, C M; Ramakrishnan, S
2014-01-01
Diabetic Retinopathy (DR) is a disorder that affects the structure of retinal blood vessels due to long-standing diabetes mellitus. Real-Time mass screening system for DR is vital for timely diagnosis and periodic screening to prevent the patient from severe visual loss. Human retinal fundus images are widely used for an automated segmentation of blood vessel and diagnosis of various blood vessel disorders. In this work, an attempt has been made to perform hardware synthesis of Kirsch template based edge detection for segmentation of blood vessels. This method is implemented using LabVIEW software and is synthesized in field programmable gate array board to yield results in real-time application. The segmentation of blood vessels using Kirsch based edge detection is compared with other edge detection methods such as Sobel, Prewitt and Canny. The texture features such as energy, entropy, contrast, mean, homogeneity and structural feature namely ratio of vessel to vessel free area are obtained from the segmented images. The performance of segmentation is analysed in terms of sensitivity, specificity and accuracy. It is observed from the results that the Kirsch based edge detection technique segmented the edges of blood vessels better than other edge detection techniques. The ratio of vessel to vessel free area classified the normal and DR affected retinal images more significantly than other texture based features. FPGA based hardware synthesis of Kirsch edge detection method is able to differentiate normal and diseased images with high specificity (93%). This automated segmentation of retinal blood vessels system could be used in computer-assisted diagnosis for diabetic retinopathy screening in real-time application.
Research on moving object detection based on frog's eyes
NASA Astrophysics Data System (ADS)
Fu, Hongwei; Li, Dongguang; Zhang, Xinyuan
2008-12-01
On the basis of object's information processing mechanism with frog's eyes, this paper discussed a bionic detection technology which suitable for object's information processing based on frog's vision. First, the bionics detection theory by imitating frog vision is established, it is an parallel processing mechanism which including pick-up and pretreatment of object's information, parallel separating of digital image, parallel processing, and information synthesis. The computer vision detection system is described to detect moving objects which has special color, special shape, the experiment indicates that it can scheme out the detecting result in the certain interfered background can be detected. A moving objects detection electro-model by imitating biologic vision based on frog's eyes is established, the video simulative signal is digital firstly in this system, then the digital signal is parallel separated by FPGA. IN the parallel processing, the video information can be caught, processed and displayed in the same time, the information fusion is taken by DSP HPI ports, in order to transmit the data which processed by DSP. This system can watch the bigger visual field and get higher image resolution than ordinary monitor systems. In summary, simulative experiments for edge detection of moving object with canny algorithm based on this system indicate that this system can detect the edge of moving objects in real time, the feasibility of bionic model was fully demonstrated in the engineering system, and it laid a solid foundation for the future study of detection technology by imitating biologic vision.
ASIC/FPGA Trust Assessment Framework
NASA Technical Reports Server (NTRS)
Berg, Melanie
2018-01-01
NASA Electronic Parts and Packaging (NEPP) is developing a process to be employed in critical applications. The framework assesses levels of Trust and assurance in microelectronic systems. The process is being created with participation from a variety of organizations. We present a synopsis of the framework that includes contributions from The Aerospace Corporation.
FPGA-Based Filterbank Implementation for Parallel Digital Signal Processing
NASA Technical Reports Server (NTRS)
Berner, Stephan; DeLeon, Phillip
1999-01-01
One approach to parallel digital signal processing decomposes a high bandwidth signal into multiple lower bandwidth (rate) signals by an analysis bank. After processing, the subband signals are recombined into a fullband output signal by a synthesis bank. This paper describes an implementation of the analysis and synthesis banks using (Field Programmable Gate Arrays) FPGAs.
Note: Design of FPGA based system identification module with application to atomic force microscopy
NASA Astrophysics Data System (ADS)
Ghosal, Sayan; Pradhan, Sourav; Salapaka, Murti
2018-05-01
The science of system identification is widely utilized in modeling input-output relationships of diverse systems. In this article, we report field programmable gate array (FPGA) based implementation of a real-time system identification algorithm which employs forgetting factors and bias compensation techniques. The FPGA module is employed to estimate the mechanical properties of surfaces of materials at the nano-scale with an atomic force microscope (AFM). The FPGA module is user friendly which can be interfaced with commercially available AFMs. Extensive simulation and experimental results validate the design.
Radiation Hardened 10BASE-T Ethernet Physical Layer (PHY)
NASA Technical Reports Server (NTRS)
Lin, Michael R. (Inventor); Petrick, David J. (Inventor); Ballou, Kevin M. (Inventor); Espinosa, Daniel C. (Inventor); James, Edward F. (Inventor); Kliesner, Matthew A. (Inventor)
2017-01-01
Embodiments may provide a radiation hardened 10BASE-T Ethernet interface circuit suitable for space flight and in compliance with the IEEE 802.3 standard for Ethernet. The various embodiments may provide a 10BASE-T Ethernet interface circuit, comprising a field programmable gate array (FPGA), a transmitter circuit connected to the FPGA, a receiver circuit connected to the FPGA, and a transformer connected to the transmitter circuit and the receiver circuit. In the various embodiments, the FPGA, transmitter circuit, receiver circuit, and transformer may be radiation hardened.
Implementation of a watershed algorithm on FPGAs
NASA Astrophysics Data System (ADS)
Zahirazami, Shahram; Akil, Mohamed
1998-10-01
In this article we present an implementation of a watershed algorithm on a multi-FPGA architecture. This implementation is based on an hierarchical FIFO. A separate FIFO for each gray level. The gray scale value of a pixel is taken for the altitude of the point. In this way we look at the image as a relief. We proceed by a flooding step. It's like as we immerse the relief in a lake. The water begins to come up and when the water of two different catchment basins reach each other, we will construct a separator or a `Watershed'. This approach is data dependent, hence the process time is different for different images. The H-FIFO is used to guarantee the nature of immersion, it means that we need two types of priority. All the points of an altitude `n' are processed before any point of altitude `n + 1'. And inside an altitude water propagates with a constant velocity in all directions from the source. This operator needs two images as input. An original image or it's gradient and the marker image. A classic way to construct the marker image is to build an image of minimal regions. Each minimal region has it's unique label. This label is the color of the water and will be used to see whether two different water touch each other. The algorithm at first fill the hierarchy FIFO with neighbors of all the regions who are not colored. Next it fetches the first pixel from the first non-empty FIFO and treats this pixel. This pixel will take the color of its neighbor, and all the neighbors who are not already in the H-FIFO are put in their correspondent FIFO. The process is over when the H-FIFO is empty. The result is a segmented and labeled image.
FPGA implementation for real-time background subtraction based on Horprasert model.
Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J; Diaz, Javier; Ros, Eduardo
2012-01-01
Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W.
Implementation of a high precision multi-measurement time-to-digital convertor on a Kintex-7 FPGA
NASA Astrophysics Data System (ADS)
Kuang, Jie; Wang, Yonggang; Cao, Qiang; Liu, Chong
2018-05-01
Time-to-digital convertors (TDCs) based on field programmable gate array (FPGA) are becoming more and more popular. Multi-measurement is an effective method to improve TDC precision beyond the cell delay limitation. However, the implementation of TDC with multi-measurement on FPGAs manufactured with 28 nm and more advanced process is facing new challenges. Benefiting from the ones-counter encoding scheme, which was developed in our previous work, we implement a ring oscillator multi-measurement TDC on a Xilinx Kintex-7 FPGA. Using the two TDC channels to measure time-intervals in the range (0 ns-30 ns), the average RMS precision can be improved to 5.76 ps, meanwhile the logic resource usage remains the same with the one-measurement TDC, and the TDC dead time is only 22 ns. The investigation demonstrates that the multi-measurement methods are still available for current main-stream FPGAs. Furthermore, the new implementation in this paper could make the trade-off among the time precision, resource usage and TDC dead time better than ever before.
An FPGA-Based People Detection System
NASA Astrophysics Data System (ADS)
Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.
2005-12-01
This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about[InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at[InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.
FPGA Implementation for Real-Time Background Subtraction Based on Horprasert Model
Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J.; Diaz, Javier; Ros, Eduardo
2012-01-01
Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W. PMID:22368487
Design of FPGA ICA for hyperspectral imaging processing
NASA Astrophysics Data System (ADS)
Nordin, Anis; Hsu, Charles C.; Szu, Harold H.
2001-03-01
The remote sensing problem which uses hyperspectral imaging can be transformed into a blind source separation problem. Using this model, hyperspectral imagery can be de-mixed into sub-pixel spectra which indicate the different material present in the pixel. This can be further used to deduce areas which contain forest, water or biomass, without even knowing the sources which constitute the image. This form of remote sensing allows previously blurred images to show the specific terrain involved in that region. The blind source separation problem can be implemented using an Independent Component Analysis algorithm. The ICA Algorithm has previously been successfully implemented using software packages such as MATLAB, which has a downloadable version of FastICA. The challenge now lies in implementing it in a form of hardware, or firmware in order to improve its computational speed. Hardware implementation also solves insufficient memory problem encountered by software packages like MATLAB when employing ICA for high resolution images and a large number of channels. Here, a pipelined solution of the firmware, realized using FPGAs are drawn out and simulated using C. Since C code can be translated into HDLs or be used directly on the FPGAs, it can be used to simulate its actual implementation in hardware. The simulated results of the program is presented here, where seven channels are used to model the 200 different channels involved in hyperspectral imaging.
Image sensor system with bio-inspired efficient coding and adaptation.
Okuno, Hirotsugu; Yagi, Tetsuya
2012-08-01
We designed and implemented an image sensor system equipped with three bio-inspired coding and adaptation strategies: logarithmic transform, local average subtraction, and feedback gain control. The system comprises a field-programmable gate array (FPGA), a resistive network, and active pixel sensors (APS), whose light intensity-voltage characteristics are controllable. The system employs multiple time-varying reset voltage signals for APS in order to realize multiple logarithmic intensity-voltage characteristics, which are controlled so that the entropy of the output image is maximized. The system also employs local average subtraction and gain control in order to obtain images with an appropriate contrast. The local average is calculated by the resistive network instantaneously. The designed system was successfully used to obtain appropriate images of objects that were subjected to large changes in illumination.
Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene
2010-01-01
Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA).
FPGA-Based Fused Smart-Sensor for Tool-Wear Area Quantitative Estimation in CNC Machine Inserts
Trejo-Hernandez, Miguel; Osornio-Rios, Roque Alfredo; de Jesus Romero-Troncoso, Rene; Rodriguez-Donate, Carlos; Dominguez-Gonzalez, Aurelio; Herrera-Ruiz, Gilberto
2010-01-01
Manufacturing processes are of great relevance nowadays, when there is a constant claim for better productivity with high quality at low cost. The contribution of this work is the development of a fused smart-sensor, based on FPGA to improve the online quantitative estimation of flank-wear area in CNC machine inserts from the information provided by two primary sensors: the monitoring current output of a servoamplifier, and a 3-axis accelerometer. Results from experimentation show that the fusion of both parameters makes it possible to obtain three times better accuracy when compared with the accuracy obtained from current and vibration signals, individually used. PMID:22319304
Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene
2010-01-01
Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA). PMID:22319345
An FPGA-based demodulation system for fiber Bragg grating sensing
NASA Astrophysics Data System (ADS)
Li, Yongqian; He, Haitao; Yao, Guozhen
2010-11-01
This paper introduces the principle of fiber Bragg grating (FBG) sensor, designs and realizes a compact wavelength demodulation system for FBG sensing using a Fabry-Perot (F-P) filter. FPGA is adopted as a main controller to control a D/A converter to produce a sawtooth wave for driving the F-P filter, and to design the data acquisition circuit for collecting the output signals of photoelectric detector. The collected data is processed after transmitting to PC through the data transmission circuit, and then the demodulation of FBG wavelength is completed finally. This compact FBG wavelength demodulation system is expected to have wide applications in on-line monitoring of electric power equipment and large structures.
Multi-channel imaging cytometry with a single detector
NASA Astrophysics Data System (ADS)
Locknar, Sarah; Barton, John; Entwistle, Mark; Carver, Gary; Johnson, Robert
2018-02-01
Multi-channel microscopy and multi-channel flow cytometry generate high bit data streams. Multiple channels (both spectral and spatial) are important in diagnosing diseased tissue and identifying individual cells. Omega Optical has developed techniques for mapping multiple channels into the time domain for detection by a single high gain, high bandwidth detector. This approach is based on pulsed laser excitation and a serial array of optical fibers coated with spectral reflectors such that up to 15 wavelength bins are sequentially detected by a single-element detector within 2.5 μs. Our multichannel microscopy system uses firmware running on dedicated DSP and FPGA chips to synchronize the laser, scanning mirrors, and sampling clock. The signals are digitized by an NI board into 14 bits at 60MHz - allowing for 232 by 174 pixel fields in up to 15 channels with 10x over sampling. Our multi-channel imaging cytometry design adds channels for forward scattering and back scattering to the fluorescence spectral channels. All channels are detected within the 2.5 μs - which is compatible with fast cytometry. Going forward, we plan to digitize at 16 bits with an A-toD chip attached to a custom board. Processing these digital signals in custom firmware would allow an on-board graphics processing unit to display imaging flow cytometry data over configurable scanning line lengths. The scatter channels can be used to trigger data buffering when a cell is present in the beam. This approach enables a low cost mechanically robust imaging cytometer.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Liu, Chong
2016-10-01
Field programmable gate arrays (FPGAs) manufactured with more advanced processing technology have faster carry chains and smaller delay elements, which are favorable for the design of tapped delay line (TDL)-style time-to-digital converters (TDCs) in FPGA. However, new challenges are posed in using them to implement TDCs with a high time precision. In this paper, we propose a bin realignment method and a dual-sampling method for TDC implementation in a Xilinx UltraScale FPGA. The former realigns the disordered time delay taps so that the TDC precision can approach the limit of its delay granularity, while the latter doubles the number of taps in the delay line so that the TDC precision beyond the cell delay limitation can be expected. Two TDC channels were implemented in a Kintex UltraScale FPGA, and the effectiveness of the new methods was evaluated. For fixed time intervals in the range from 0 to 440 ns, the average RMS precision measured by the two TDC channels reaches 5.8 ps using the bin realignment, and it further improves to 3.9 ps by using the dual-sampling method. The time precision has a 5.6% variation in the measured temperature range. Every part of the TDC, including dual-sampling, encoding, and on-line calibration, could run at a 500 MHz clock frequency. The system measurement dead time is only 4 ns.
NASA Astrophysics Data System (ADS)
Castano, R.; Abbey, W. J.; Bekker, D. L.; Cabrol, N. A.; Francis, R.; Manatt, K.; Ortega, K.; Thompson, D. R.; Wagstaff, K.
2013-12-01
TextureCam is an intelligent camera that uses integrated image analysis to classify sediment and rock surfaces into basic visual categories. This onboard image understanding can improve the autonomy of exploration spacecraft during the long periods when they are out of contact with operators. This could increase the number of science activities performed in each command cycle by, for example, autonomously targeting science features of opportunity with narrow field of view remote sensing, identifying clean surfaces for autonomous placement of arm-mounted instruments, or by detecting high value images for prioritized downlink. TextureCam incorporates image understanding directly into embedded hardware with a Field Programmable Gate Array (FPGA). This allows the instrument to perform the classification in real time without taxing the primary spacecraft computing resources. We use a machine learning approach in which operators train a statistical model of surface appearance using examples from previously acquired images. A random forest model extrapolates from these training cases, using the statistics of small image patches to characterize the texture of each pixel independently. Applying this model to each pixel in a new image yields a map of surface units. We deployed a prototype instrument in the Cima Volcanic Fields during a series of experiments in May 2013. We imaged each environment with a tripod-mounted RGB camera connected directly to the FPGA board for real time processing. Our first scenario assessed ground surface cover on open terrain atop a weathered volcanic flow. We performed a transect consisting of 16 forward-facing images collected at 1m intervals. We trained the system to categorize terrain into four classes: sediment, basalt cobbles, basalt pebbles, and basalt with iron oxide weathering. Accuracy rates with regards to the fraction of the actual feature that was labeled correctly by the automated system were calculated. Lower accuracy rates were observed for pebble and iron oxide resulting from the intrinsic ambiguity between these categories and the basalt cobble class. The second scenario classified strata in the exposed layers of a younger lava flow incised by a channel. The instrument classified the section into five layers: the channel bed, sorted volcanic gravel, gravel in a clay matrix, oxidized clay, and massive blocks. Performance was poor (<30% true positives) for the massive block class, since this material was often covered by clay very similar to the matrix below. We disregarded this top layer. The performance on the remaining layers of the column was better than 95%, a level that would significantly improve autonomous targeting with respect to random sampling. Future development will continue to refine the classification algorithms as well as the speed of the data processing hardware. Acknowledgements: The TextureCam project is supported by the NASA Astrobiology Science and Technology Instrument Development program (NNH10ZDA001N-ASTID) and National Park Service permit MOJA-2013-SCI-0011. This work was carried out at the Jet Propulsion Laboratory, California Institute of Technology under a contract with the National Aeronautics and Space Administration. Copyright 2013, California Institute of Technology.
Autonomous Lawnmower using FPGA implementation.
NASA Astrophysics Data System (ADS)
Ahmad, Nabihah; Lokman, Nabill bin; Helmy Abd Wahab, Mohd
2016-11-01
Nowadays, there are various types of robot have been invented for multiple purposes. The robots have the special characteristic that surpass the human ability and could operate in extreme environment which human cannot endure. In this paper, an autonomous robot is built to imitate the characteristic of a human cutting grass. A Field Programmable Gate Array (FPGA) is used to control the movements where all data and information would be processed. Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) is used to describe the hardware using Quartus II software. This robot has the ability of avoiding obstacle using ultrasonic sensor. This robot used two DC motors for its movement. It could include moving forward, backward, and turning left and right. The movement or the path of the automatic lawn mower is based on a path planning technique. Four Global Positioning System (GPS) plot are set to create a boundary. This to ensure that the lawn mower operates within the area given by user. Every action of the lawn mower is controlled by the FPGA DE' Board Cyclone II with the help of the sensor. Furthermore, Sketch Up software was used to design the structure of the lawn mower. The autonomous lawn mower was able to operate efficiently and smoothly return to coordinated paths after passing the obstacle. It uses 25% of total pins available on the board and 31% of total Digital Signal Processing (DSP) blocks.
FIR filters for hardware-based real-time multi-band image blending
NASA Astrophysics Data System (ADS)
Popovic, Vladan; Leblebici, Yusuf
2015-02-01
Creating panoramic images has become a popular feature in modern smart phones, tablets, and digital cameras. A user can create a 360 degree field-of-view photograph from only several images. Quality of the resulting image is related to the number of source images, their brightness, and the used algorithm for their stitching and blending. One of the algorithms that provides excellent results in terms of background color uniformity and reduction of ghosting artifacts is the multi-band blending. The algorithm relies on decomposition of image into multiple frequency bands using dyadic filter bank. Hence, the results are also highly dependant on the used filter bank. In this paper we analyze performance of the FIR filters used for multi-band blending. We present a set of five filters that showed the best results in both literature and our experiments. The set includes Gaussian filter, biorthogonal wavelets, and custom-designed maximally flat and equiripple FIR filters. The presented results of filter comparison are based on several no-reference metrics for image quality. We conclude that 5/3 biorthogonal wavelet produces the best result in average, especially when its short length is considered. Furthermore, we propose a real-time FPGA implementation of the blending algorithm, using 2D non-separable systolic filtering scheme. Its pipeline architecture does not require hardware multipliers and it is able to achieve very high operating frequencies. The implemented system is able to process 91 fps for 1080p (1920×1080) image resolution.
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader
2004-01-01
One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other image processing representative algorithm, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a night project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing technologies (RC) such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low cost compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithm, such as FFT2, image folding and phase diversity for the NASA Solar Viewing Interferometer Prototype (SVIP) using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with the control algorithms pre-hardware optimization, or the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on-board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study greenhouse gases CO2, C2H, H2O, O3, O2, N2O from Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing for control and for robotics missions using vision sensors. It presents a top-level description of technologies required for the design and construction of SVIP and EASI and to advance the spatial-spectral imaging and large-scale space interferometry science and engineering.
A low cost X-ray imaging device based on BPW-34 Si-PIN photodiode
NASA Astrophysics Data System (ADS)
Emirhan, E.; Bayrak, A.; Yücel, E. Barlas; Yücel, M.; Ozben, C. S.
2016-05-01
A low cost X-ray imaging device based on BPW-34 silicon PIN photodiode was designed and produced. X-rays were produced from a CEI OX/70-P dental tube using a custom made ±30 kV power supply. A charge sensitive preamplifier and a shaping amplifier were built for the amplification of small signals produced by photons in the depletion layer of Si-PIN photodiode. A two dimensional position control unit was used for moving the detector in small steps to measure the intensity of X-rays absorbed in the object to be imaged. An Aessent AES220B FPGA module was used for transferring the image data to a computer via USB. Images of various samples were obtained with acceptable image quality despite of the low cost of the device.
Custom FPGA processing for real-time fetal ECG extraction and identification.
Torti, E; Koliopoulos, D; Matraxia, M; Danese, G; Leporati, F
2017-01-01
Monitoring the fetal cardiac activity during pregnancy is of crucial importance for evaluating fetus health. However, there is a lack of automatic and reliable methods for Fetal ECG (FECG) monitoring that can perform this elaboration in real-time. In this paper, we present a hardware architecture, implemented on the Altera Stratix V FPGA, capable of separating the FECG from the maternal ECG and to correctly identify it. We evaluated our system using both synthetic and real tracks acquired from patients beyond the 20th pregnancy week. This work is part of a project aiming at developing a portable system for FECG continuous real-time monitoring. Its characteristics of reduced power consumption, real-time processing capability and reduced size make it suitable to be embedded in the overall system, that is the first proposed exploiting Blind Source Separation with this technology, to the best of our knowledge. Copyright © 2016 Elsevier Ltd. All rights reserved.
FPGA acceleration of rigid-molecule docking codes
Sukhwani, B.; Herbordt, M.C.
2011-01-01
Modelling the interactions of biological molecules, or docking, is critical both to understanding basic life processes and to designing new drugs. The field programmable gate array (FPGA) based acceleration of a recently developed, complex, production docking code is described. The authors found that it is necessary to extend their previous three-dimensional (3D) correlation structure in several ways, most significantly to support simultaneous computation of several correlation functions. The result for small-molecule docking is a 100-fold speed-up of a section of the code that represents over 95% of the original run-time. An additional 2% is accelerated through a previously described method, yielding a total acceleration of 36× over a single core and 10× over a quad-core. This approach is found to be an ideal complement to graphics processing unit (GPU) based docking, which excels in the protein–protein domain. PMID:21857870
FPGA-based protein sequence alignment : A review
NASA Astrophysics Data System (ADS)
Isa, Mohd. Nazrin Md.; Muhsen, Ku Noor Dhaniah Ku; Saiful Nurdin, Dayana; Ahmad, Muhammad Imran; Anuar Zainol Murad, Sohiful; Nizam Mohyar, Shaiful; Harun, Azizi; Hussin, Razaidi
2017-11-01
Sequence alignment have been optimized using several techniques in order to accelerate the computation time to obtain the optimal score by implementing DP-based algorithm into hardware such as FPGA-based platform. During hardware implementation, there will be performance challenges such as the frequent memory access and highly data dependent in computation process. Therefore, investigation in processing element (PE) configuration where involves more on memory access in load or access the data (substitution matrix, query sequence character) and the PE configuration time will be the main focus in this paper. There are various approaches to enhance the PE configuration performance that have been done in previous works such as by using serial configuration chain and parallel configuration chain i.e. the configuration data will be loaded into each PEs sequentially and simultaneously respectively. Some researchers have proven that the performance using parallel configuration chain has optimized both the configuration time and area.
NASA Astrophysics Data System (ADS)
Saponara, Sergio; Donati, Massimiliano; Fanucci, Luca; Odendahl, Maximilian; Leupers, Reiner; Errico, Walter
2013-02-01
The on-board data processing is a vital task for any satellite and spacecraft due to the importance of elaborate the sensing data before sending them to the Earth, in order to exploit effectively the bandwidth to the ground station. In the last years the amount of sensing data collected by scientific and commercial space missions has increased significantly, while the available downlink bandwidth is comparatively stable. The increasing demand of on-board real-time processing capabilities represents one of the critical issues in forthcoming European missions. Faster and faster signal and image processing algorithms are required to accomplish planetary observation, surveillance, Synthetic Aperture Radar imaging and telecommunications. The only available space-qualified Digital Signal Processor (DSP) free of International Traffic in Arms Regulations (ITAR) restrictions faces inadequate performance, thus the development of a next generation European DSP is well known to the space community. The DSPACE space-qualified DSP architecture fills the gap between the computational requirements and the available devices. It leverages a pipelined and massively parallel core based on the Very Long Instruction Word (VLIW) paradigm, with 64 registers and 8 operational units, along with cache memories, memory controllers and SpaceWire interfaces. Both the synthesizable VHDL and the software development tools are generated from the LISA high-level model. A Xilinx-XC7K325T FPGA is chosen to realize a compact PCI demonstrator board. Finally first synthesis results on CMOS standard cell technology (ASIC 180 nm) show an area of around 380 kgates and a peak performance of 1000 MIPS and 750 MFLOPS at 125MHz.
Broad-Bandwidth FPGA-Based Digital Polyphase Spectrometer
NASA Technical Reports Server (NTRS)
Jamot, Robert F.; Monroe, Ryan M.
2012-01-01
With present concern for ecological sustainability ever increasing, it is desirable to model the composition of Earth s upper atmosphere accurately with regards to certain helpful and harmful chemicals, such as greenhouse gases and ozone. The microwave limb sounder (MLS) is an instrument designed to map the global day-to-day concentrations of key atmospheric constituents continuously. One important component in MLS is the spectrometer, which processes the raw data provided by the receivers into frequency-domain information that cannot only be transmitted more efficiently, but also processed directly once received. The present-generation spectrometer is fully analog. The goal is to include a fully digital spectrometer in the next-generation sensor. In a digital spectrometer, incoming analog data must be converted into a digital format, processed through a Fourier transform, and finally accumulated to reduce the impact of input noise. While the final design will be placed on an application specific integrated circuit (ASIC), the building of these chips is prohibitively expensive. To that end, this design was constructed on a field-programmable gate array (FPGA). A family of state-of-the-art digital Fourier transform spectrometers has been developed, with a combination of high bandwidth and fine resolution. Analog signals consisting of radiation emitted by constituents in planetary atmospheres or galactic sources are downconverted and subsequently digitized by a pair of interleaved analog-to-digital converters (ADCs). This 6-Gsps (gigasample per second) digital representation of the analog signal is then processed through an FPGA-based streaming fast Fourier transform (FFT). Digital spectrometers have many advantages over previously used analog spectrometers, especially in terms of accuracy and resolution, both of which are particularly important for the type of scientific questions to be addressed with next-generation radiometers.
Fast Fourier Transform Co-Processor (FFTC)- Towards Embedded GFLOPs
NASA Astrophysics Data System (ADS)
Kuehl, Christopher; Liebstueckel, Uwe; Tejerina, Isaac; Uemminghaus, Michael; Wite, Felix; Kolb, Michael; Suess, Martin; Weigand, Roland
2012-08-01
Many signal processing applications and algorithms perform their operations on the data in the transform domain to gain efficiency. The Fourier Transform Co- Processor has been developed with the aim to offload General Purpose Processors from performing these transformations and therefore to boast the overall performance of a processing module. The IP of the commercial PowerFFT processor has been selected and adapted to meet the constraints of the space environment.In frame of the ESA activity “Fast Fourier Transform DSP Co-processor (FFTC)” (ESTEC/Contract No. 15314/07/NL/LvH/ma) the objectives were the following:Production of prototypes of a space qualified version of the commercial PowerFFT chip called FFTC based on the PowerFFT IP.The development of a stand-alone FFTC Accelerator Board (FTAB) based on the FFTC including the Controller FPGA and SpaceWire Interfaces to verify the FFTC function and performance.The FFTC chip performs its calculations with floating point precision. Stand alone it is capable computing FFTs of up to 1K complex samples in length in only 10μsec. This corresponds to an equivalent processing performance of 4.7 GFlops. In this mode the maximum sustained data throughput reaches 6.4Gbit/s. When connected to up to 4 EDAC protected SDRAM memory banks the FFTC can perform long FFTs with up to 1M complex samples in length or multidimensional FFT- based processing tasks.A Controller FPGA on the FTAB takes care of the SDRAM addressing. The instructions commanded via the Controller FPGA are used to set up the data flow and generate the memory addresses.The presentation will give and overview on the project, including the results of the validation of the FFTC ASIC prototypes.
Fast Fourier Transform Co-processor (FFTC), towards embedded GFLOPs
NASA Astrophysics Data System (ADS)
Kuehl, Christopher; Liebstueckel, Uwe; Tejerina, Isaac; Uemminghaus, Michael; Witte, Felix; Kolb, Michael; Suess, Martin; Weigand, Roland; Kopp, Nicholas
2012-10-01
Many signal processing applications and algorithms perform their operations on the data in the transform domain to gain efficiency. The Fourier Transform Co-Processor has been developed with the aim to offload General Purpose Processors from performing these transformations and therefore to boast the overall performance of a processing module. The IP of the commercial PowerFFT processor has been selected and adapted to meet the constraints of the space environment. In frame of the ESA activity "Fast Fourier Transform DSP Co-processor (FFTC)" (ESTEC/Contract No. 15314/07/NL/LvH/ma) the objectives were the following: • Production of prototypes of a space qualified version of the commercial PowerFFT chip called FFTC based on the PowerFFT IP. • The development of a stand-alone FFTC Accelerator Board (FTAB) based on the FFTC including the Controller FPGA and SpaceWire Interfaces to verify the FFTC function and performance. The FFTC chip performs its calculations with floating point precision. Stand alone it is capable computing FFTs of up to 1K complex samples in length in only 10μsec. This corresponds to an equivalent processing performance of 4.7 GFlops. In this mode the maximum sustained data throughput reaches 6.4Gbit/s. When connected to up to 4 EDAC protected SDRAM memory banks the FFTC can perform long FFTs with up to 1M complex samples in length or multidimensional FFT-based processing tasks. A Controller FPGA on the FTAB takes care of the SDRAM addressing. The instructions commanded via the Controller FPGA are used to set up the data flow and generate the memory addresses. The paper will give an overview on the project, including the results of the validation of the FFTC ASIC prototypes.
NASA Astrophysics Data System (ADS)
De Lorenzo, Danilo; De Momi, Elena; Beretta, Elisa; Cerveri, Pietro; Perona, Franco; Ferrigno, Giancarlo
2009-02-01
Computer Assisted Orthopaedic Surgery (CAOS) systems improve the results and the standardization of surgical interventions. Anatomical landmarks and bone surface detection is straightforward to either register the surgical space with the pre-operative imaging space and to compute biomechanical parameters for prosthesis alignment. Surface points acquisition increases the intervention invasiveness and can be influenced by the soft tissue layer interposition (7-15mm localization errors). This study is aimed at evaluating the accuracy of a custom-made A-mode ultrasound (US) system for non invasive detection of anatomical landmarks and surfaces. A-mode solutions eliminate the necessity of US images segmentation, offers real-time signal processing and requires less invasive equipment. The system consists in a single transducer US probe optically tracked, a pulser/receiver and an FPGA-based board, which is responsible for logic control command generation and for real-time signal processing and three custom-made board (signal acquisition, blanking and synchronization). We propose a new calibration method of the US system. The experimental validation was then performed measuring the length of known-shape polymethylmethacrylate boxes filled with pure water and acquiring bone surface points on a bovine bone phantom covered with soft-tissue mimicking materials. Measurement errors were computed through MR and CT images acquisitions of the phantom. Points acquisition on bone surface with the US system demonstrated lower errors (1.2mm) than standard pointer acquisition (4.2mm).
Infrared small target tracking based on SOPC
NASA Astrophysics Data System (ADS)
Hu, Taotao; Fan, Xiang; Zhang, Yu-Jin; Cheng, Zheng-dong; Zhu, Bin
2011-01-01
The paper presents a low cost FPGA based solution for a real-time infrared small target tracking system. A specialized architecture is presented based on a soft RISC processor capable of running kernel based mean shift tracking algorithm. Mean shift tracking algorithm is realized in NIOS II soft-core with SOPC (System on a Programmable Chip) technology. Though mean shift algorithm is widely used for target tracking, the original mean shift algorithm can not be directly used for infrared small target tracking. As infrared small target only has intensity information, so an improved mean shift algorithm is presented in this paper. How to describe target will determine whether target can be tracked by mean shift algorithm. Because color target can be tracked well by mean shift algorithm, imitating color image expression, spatial component and temporal component are advanced to describe target, which forms pseudo-color image. In order to improve the processing speed parallel technology and pipeline technology are taken. Two RAM are taken to stored images separately by ping-pong technology. A FLASH is used to store mass temp data. The experimental results show that infrared small target is tracked stably in complicated background.
A hybrid scanning mode for fast scanning ion conductance microscopy (SICM) imaging
Zhukov, Alex; Richards, Owen; Ostanin, Victor; Korchev, Yuri; Klenerman, David
2012-01-01
We have developed a new method of controlling the pipette for scanning ion conductance microscopy to obtain high-resolution images faster. The method keeps the pipette close to the surface during a single line scan but does not follow the exact surface topography, which is calculated by using the ion current. Using an FPGA platform we demonstrate this new method on model test samples and then on live cells. This method will be particularly useful to follow changes occurring on relatively flat regions of the cell surface at high spatial and temporal resolutions. PMID:22902298
FPGA based demodulation of laser induced fluorescence in plasmas
NASA Astrophysics Data System (ADS)
Mattingly, Sean W.; Skiff, Fred
2018-04-01
We present a field programmable gate array (FPGA)-based system that counts photons from laser-induced fluorescence (LIF) on a laboratory plasma. This is accomplished with FPGA-based up/down counters that demodulate the data, giving a background-subtracted LIF signal stream that is updated with a new point as each laser amplitude modulation cycle completes. We demonstrate using the FPGA to modulate a laser at 1 MHz and demodulate the resulting LIF data stream. This data stream is used to calculate an LIF-based measurement sampled at 1 MHz of a plasma ion fluctuation spectrum.
NASA Astrophysics Data System (ADS)
Zhang, Xian; Zhou, Binquan; Li, Hong; Zhao, Xinghua; Mu, Weiwei; Wu, Wenfeng
2017-10-01
Navigation technology is crucial to the national defense and military, which can realize the measurement of orientation, positioning, attitude and speed for moving object. Inertial navigation is not only autonomous, real-time, continuous, hidden, undisturbed but also no time-limited and environment-limited. The gyroscope is the core component of the inertial navigation system, whose precision and size are the bottleneck of the performance. However, nuclear magnetic resonance gyroscope is characteristic of the advantage of high precision and small size. Nuclear magnetic resonance gyroscope can meet the urgent needs of high-tech weapons and equipment development of new generation. This paper mainly designs a set of photoelectric signal processing system for nuclear magnetic resonance gyroscope based on FPGA, which process and control the information of detecting laser .The photoelectric signal with high frequency carrier is demodulated by in-phase and quadrature demodulation method. Finally, the processing system of photoelectric signal can compensate the residual magnetism of the shielding barrel and provide the information of nuclear magnetic resonance gyroscope angular velocity.
FPGA in-the-loop simulations of cardiac excitation model under voltage clamp conditions
NASA Astrophysics Data System (ADS)
Othman, Norliza; Adon, Nur Atiqah; Mahmud, Farhanahani
2017-01-01
Voltage clamp technique allows the detection of single channel currents in biological membranes in identifying variety of electrophysiological problems in the cellular level. In this paper, a simulation study of the voltage clamp technique has been presented to analyse current-voltage (I-V) characteristics of ion currents based on Luo-Rudy Phase-I (LR-I) cardiac model by using a Field Programmable Gate Array (FPGA). Nowadays, cardiac models are becoming increasingly complex which can cause a vast amount of time to run the simulation. Thus, a real-time hardware implementation using FPGA could be one of the best solutions for high-performance real-time systems as it provides high configurability and performance, and able to executes in parallel mode operation. For shorter time development while retaining high confidence results, FPGA-based rapid prototyping through HDL Coder from MATLAB software has been used to construct the algorithm for the simulation system. Basically, the HDL Coder is capable to convert the designed MATLAB Simulink blocks into hardware description language (HDL) for the FPGA implementation. As a result, the voltage-clamp fixed-point design of LR-I model has been successfully conducted in MATLAB Simulink and the simulation of the I-V characteristics of the ionic currents has been verified on Xilinx FPGA Virtex-6 XC6VLX240T development board through an FPGA-in-the-loop (FIL) simulation.
NASA Technical Reports Server (NTRS)
Lin, Michael; Petrick, David; Geist, Alessandro; Flatley, Thomas
2012-01-01
This version of the SpaceCube will be a full-fledged, onboard space processing system capable of 2500+ MIPS, and featuring a number of plug-andplay gigabit and standard interfaces, all in a condensed 3x3x3 form factor [less than 10 watts and less than 3 lb (approximately equal to 1.4 kg)]. The main processing engine is the Xilinx SIRF radiation- hardened-by-design Virtex-5 FX-130T field-programmable gate array (FPGA). Even as the SpaceCube 2.0 version (currently under test) is being targeted as the platform of choice for a number of the upcoming Earth Science Decadal Survey missions, GSFC has been contacted by customers who wish to see a system that incorporates key features of the version 2.0 architecture in an even smaller form factor. In order to fulfill that need, the SpaceCube Mini is being designed, and will be a very compact and low-power system. A similar flight system with this combination of small size, low power, low cost, adaptability, and extremely high processing power does not otherwise exist, and the SpaceCube Mini will be of tremendous benefit to GSFC and its partners. The SpaceCube Mini will utilize space-grade components. The primary processing engine of the Mini is the Xilinx Virtex-5 SIRF FX-130T radiation-hardened-by-design FPGA for critical flight applications in high-radiation environments. The Mini can also be equipped with a commercial Xilinx Virtex-5 FPGA with integrated PowerPCs for a low-cost, high-power computing platform for use in the relatively radiation- benign LEOs (low-Earth orbits). In either case, this version of the Space-Cube will weigh less than 3 pounds (.1.4 kg), conform to the CubeSat form-factor (10x10x10 cm), and will be low power (less than 10 watts for typical applications). The SpaceCube Mini will have a radiation-hardened Aeroflex FPGA for configuring and scrubbing the Xilinx FPGA by utilizing the onboard FLASH memory to store the configuration files. The FLASH memory will also be used for storing algorithm and application code for the PowerPCs and the Xilinx FPGA. In addition, it will feature highspeed DDR SDRAM (double data rate synchronous dynamic random-access memory) to store the instructions and data of active applications. This version will also feature SATA-II and Gigabit Ethernet interfaces. Furthermore, there will also be general-purpose, multi-gigabit interfaces. In addition, the system will have dozens of transceivers that can support LVDS (low-voltage differential signaling), RS-422, or SpaceWire. The SpaceCube Mini includes an I/O card that can be customized to meet the needs of each mission. This version of the SpaceCube will be designed so that multiple Minis can be networked together using SpaceWire, Ethernet, or even a custom protocol. Scalability can be provided by networking multiple SpaceCube Minis together. Rigid-Flex technology is being targeted for the construction of the SpaceCube Mini, which will make the extremely compact and low-weight design feasible. The SpaceCube Mini is designed to fit in the compact CubeSat form factor, thus allowing deployment in a new class of missions that the previous SpaceCube versions were not suited for. At the time of this reporting, engineering units should be available in the summer 2012.
C-RED One : the infrared camera using the Saphira e-APD detector
NASA Astrophysics Data System (ADS)
Greffe, Timothée.; Feautrier, Philippe; Gach, Jean-Luc; Stadler, Eric; Clop, Fabien; Lemarchand, Stephane; Boutolleau, David; Baker, Ian
2016-08-01
Name for Person Card: Observatoire de la Côte d'Azur First Light Imaging' C-RED One infrared camera is capable of capturing up to 3500 full frames per second with a sub-electron readout noise and very low background. This breakthrough has been made possible thanks to the use of an e- APD infrared focal plane array which is a real disruptive technology in imagery. C-RED One is an autonomous system with an integrated cooling system and a vacuum regeneration system. It operates its sensor with a wide variety of read out techniques and processes video on-board thanks to an FPGA. We will show its performances and expose its main features. The project leading to this application has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement N° 673944.
Moving Object Detection Using Scanning Camera on a High-Precision Intelligent Holder.
Chen, Shuoyang; Xu, Tingfa; Li, Daqun; Zhang, Jizhou; Jiang, Shenwang
2016-10-21
During the process of moving object detection in an intelligent visual surveillance system, a scenario with complex background is sure to appear. The traditional methods, such as "frame difference" and "optical flow", may not able to deal with the problem very well. In such scenarios, we use a modified algorithm to do the background modeling work. In this paper, we use edge detection to get an edge difference image just to enhance the ability of resistance illumination variation. Then we use a "multi-block temporal-analyzing LBP (Local Binary Pattern)" algorithm to do the segmentation. In the end, a connected component is used to locate the object. We also produce a hardware platform, the core of which consists of the DSP (Digital Signal Processor) and FPGA (Field Programmable Gate Array) platforms and the high-precision intelligent holder.
Parallel ICA and its hardware implementation in hyperspectral image analysis
NASA Astrophysics Data System (ADS)
Du, Hongtao; Qi, Hairong; Peterson, Gregory D.
2004-04-01
Advances in hyperspectral images have dramatically boosted remote sensing applications by providing abundant information using hundreds of contiguous spectral bands. However, the high volume of information also results in excessive computation burden. Since most materials have specific characteristics only at certain bands, a lot of these information is redundant. This property of hyperspectral images has motivated many researchers to study various dimensionality reduction algorithms, including Projection Pursuit (PP), Principal Component Analysis (PCA), wavelet transform, and Independent Component Analysis (ICA), where ICA is one of the most popular techniques. It searches for a linear or nonlinear transformation which minimizes the statistical dependence between spectral bands. Through this process, ICA can eliminate superfluous but retain practical information given only the observations of hyperspectral images. One hurdle of applying ICA in hyperspectral image (HSI) analysis, however, is its long computation time, especially for high volume hyperspectral data sets. Even the most efficient method, FastICA, is a very time-consuming process. In this paper, we present a parallel ICA (pICA) algorithm derived from FastICA. During the unmixing process, pICA divides the estimation of weight matrix into sub-processes which can be conducted in parallel on multiple processors. The decorrelation process is decomposed into the internal decorrelation and the external decorrelation, which perform weight vector decorrelations within individual processors and between cooperative processors, respectively. In order to further improve the performance of pICA, we seek hardware solutions in the implementation of pICA. Until now, there are very few hardware designs for ICA-related processes due to the complicated and iterant computation. This paper discusses capacity limitation of FPGA implementations for pICA in HSI analysis. A synthesis of Application-Specific Integrated Circuit (ASIC) is designed for pICA-based dimensionality reduction in HSI analysis. The pICA design is implemented using standard-height cells and aimed at TSMC 0.18 micron process. During the synthesis procedure, three ICA-related reconfigurable components are developed for the reuse and retargeting purpose. Preliminary results show that the standard-height cell based ASIC synthesis provide an effective solution for pICA and ICA-related processes in HSI analysis.
Real-time track-less Cherenkov ring fitting trigger system based on Graphics Processing Units
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Chiozzi, S.; Cretaro, P.; Cotta Ramusino, A.; Di Lorenzo, S.; Fantechi, R.; Fiorini, M.; Frezza, O.; Gianoli, A.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Neri, I.; Paolucci, P. S.; Pastorelli, E.; Piandani, R.; Piccini, M.; Pontisso, L.; Rossetti, D.; Simula, F.; Sozzi, M.; Vicini, P.
2017-12-01
The parallel computing power of commercial Graphics Processing Units (GPUs) is exploited to perform real-time ring fitting at the lowest trigger level using information coming from the Ring Imaging Cherenkov (RICH) detector of the NA62 experiment at CERN. To this purpose, direct GPU communication with a custom FPGA-based board has been used to reduce the data transmission latency. The GPU-based trigger system is currently integrated in the experimental setup of the RICH detector of the NA62 experiment, in order to reconstruct ring-shaped hit patterns. The ring-fitting algorithm running on GPU is fed with raw RICH data only, with no information coming from other detectors, and is able to provide more complex trigger primitives with respect to the simple photodetector hit multiplicity, resulting in a higher selection efficiency. The performance of the system for multi-ring Cherenkov online reconstruction obtained during the NA62 physics run is presented.
2015-04-01
DDC ) results in more complicated digital (FPGA) processing, yet simplifies the analog design significantly while improving the quality of the...Interleaved CP Cyclic Prefix DAC Digital to Analog Converter DDC Digital Down Converter DDR Double Data Rate DUC Digital Up Converter ENOB Effective
Driver face tracking using semantics-based feature of eyes on single FPGA
NASA Astrophysics Data System (ADS)
Yu, Ying-Hao; Chen, Ji-An; Ting, Yi-Siang; Kwok, Ngaiming
2017-06-01
Tracking driver's face is one of the essentialities for driving safety control. This kind of system is usually designed with complicated algorithms to recognize driver's face by means of powerful computers. The design problem is not only about detecting rate but also from parts damages under rigorous environments by vibration, heat, and humidity. A feasible strategy to counteract these damages is to integrate entire system into a single chip in order to achieve minimum installation dimension, weight, power consumption, and exposure to air. Meanwhile, an extraordinary methodology is also indispensable to overcome the dilemma of low-computing capability and real-time performance on a low-end chip. In this paper, a novel driver face tracking system is proposed by employing semantics-based vague image representation (SVIR) for minimum hardware resource usages on a FPGA, and the real-time performance is also guaranteed at the same time. Our experimental results have indicated that the proposed face tracking system is viable and promising for the smart car design in the future.
A dynamically reconfigurable multi-functional PLL for SRAM-based FPGA in 65nm CMOS technology
NASA Astrophysics Data System (ADS)
Yang, Mingqian; Chen, Lei; Li, Xuewu; Zhang, Yanlong
2018-04-01
Phase-locked loops (PLL) have been widely utilized in FPGA as an important module for clock management. PLL with dynamic reconfiguration capability is always welcomed in FPGA design as it is able to decrease power consumption and simultaneously improve flexibility. In this paper, a multi-functional PLL with dynamic reconfiguration capability for 65nm SRAM-based FPGA is proposed. Firstly, configurable charge pump and loop filter are utilized to optimize the loop bandwidth. Secondly, the PLL incorporates a VCO with dual control voltages to accelerate the adjustment of oscillation frequency. Thirdly, three configurable dividers are presented for flexible frequency synthesis. Lastly, a configuration block with dynamic reconfiguration function is proposed. Simulation results demonstrate that the proposed multi-functional PLL can output clocks with configurable division ratio, phase shift and duty cycle. The PLL can also be dynamically reconfigured without affecting other parts' running or halting the FPGA device.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)
Li, Isaac TS; Shum, Warren; Truong, Kevin
2007-01-01
Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).
Li, Isaac T S; Shum, Warren; Truong, Kevin
2007-06-07
To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.
A high data rate universal lattice decoder on FPGA
NASA Astrophysics Data System (ADS)
Ma, Jing; Huang, Xinming; Kura, Swapna
2005-06-01
This paper presents the architecture design of a high data rate universal lattice decoder for MIMO channels on FPGA platform. A phost strategy based lattice decoding algorithm is modified in this paper to reduce the complexity of the closest lattice point search. The data dependency of the improved algorithm is examined and a parallel and pipeline architecture is developed with the iterative decoding function on FPGA and the division intensive channel matrix preprocessing on DSP. Simulation results demonstrate that the improved lattice decoding algorithm provides better bit error rate and less iteration number compared with the original algorithm. The system prototype of the decoder shows that it supports data rate up to 7Mbit/s on a Virtex2-1000 FPGA, which is about 8 times faster than the original algorithm on FPGA platform and two-orders of magnitude better than its implementation on a DSP platform.
Real-time Implementation of a Dual-Mode Ultrasound Array System: In Vivo Results
Casper, Andrew J.; Liu, Dalong; Ballard, John R.; Ebbini, Emad S.
2013-01-01
A real-time dual-mode ultrasound array (DMUA) system for imaging and therapy is described. The system utilizes a concave (40-mm radius of curvature) 3.5 MHz, 32 element array and modular multi-channel transmitter/receiver. It is capable of operating in a variety of imaging and therapy modes (on transmit) and continuous receive on all array elements even during high-power operation. A signal chain consisting of field-programmable gate arrays (FPGA) and graphical processing units (GPU) is used to enable real-time, software-defined beamforming and image formation. Imaging data, from quality assurance phantoms as well as in vivo small and large animal models, are presented and discussed. Corresponding images obtained using a temporally-synchronized and spatially-aligned diagnostic probe confirm the DMUA’s ability to form anatomically-correct images with sufficient contrast in an extended field of view (FOV) around its geometric center. In addition, high frame rate DMUA data also demonstrate the feasibility of detection and localization of echo changes indicative of cavitation and/or tissue boiling during HIFU exposures with 45 – 50 dB dynamic range. The results also show that the axial and lateral resolution of the DMUA are consistent with its fnumber and bandwidth with well behaved speckle cell characteristics. These results point the way to a theranostic DMUA system capable of quantitative imaging of tissue property changes with high specificity to lesion formation using focused ultrasound. PMID:23708766
Systems and methods for detecting a failure event in a field programmable gate array
NASA Technical Reports Server (NTRS)
Ng, Tak-Kwong (Inventor); Herath, Jeffrey A. (Inventor)
2009-01-01
An embodiment generally relates to a method of self-detecting an error in a field programmable gate array (FPGA). The method includes writing a signature value into a signature memory in the FPGA and determining a conclusion of a configuration refresh operation in the FPGA. The method also includes reading an outcome value from the signature memory.
NASA Technical Reports Server (NTRS)
Wang, Jih-Jong; Cronquist, Brian E.; McGowan, John E.; Katz, Richard B.
1997-01-01
The goals for a radiation hardened (RAD-HARD) and high reliability (HI-REL) field programmable gate array (FPGA) are described. The first qualified manufacturer list (QML) radiation hardened RH1280 and RH1020 were developed. The total radiation dose and single event effects observed on the antifuse FPGA RH1280 are reported on. Tradeoffs and the limitations in the single event upset hardening are discussed.