Roll Angle Estimation Using Thermopiles for a Flight Controlled Mortar
2012-06-01
Using Xilinx’s System Generator, the entire design was implemented at a relatively high level within MATLAB’s Simulink. This allowed VHDL code to...thermopile data with a Recursive Least Squares (RLS) filter implemented on a field programmable gate array (FPGA). These results demonstrate the...accurately estimated by processing the thermopile data with a Recursive Least Squares (RLS) filter implemented on a field programmable gate array (FPGA
Random number generators for large-scale parallel Monte Carlo simulations on FPGA
NASA Astrophysics Data System (ADS)
Lin, Y.; Wang, F.; Liu, B.
2018-05-01
Through parallelization, field programmable gate arrays (FPGAs) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGAs present both new constraints and new opportunities for the implementation of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application-based tests, this study evaluates all four RNGs used in previous FPGA-based MC studies, together with newly proposed FPGA implementations of two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations, a parallel version of the additive lagged Fibonacci generator (Parallel ALFG), is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
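As a rough illustration of the generator family named in this abstract, the following Python sketch shows the basic (serial) additive lagged Fibonacci recurrence x[n] = (x[n-j] + x[n-k]) mod 2^m. The lag pair (5, 17), the 32-bit word size, and the class name `ALFG` are assumptions chosen for the example; the paper's parallel FPGA variant and its parameters are not described in the abstract and are not reproduced here.

```python
from collections import deque


class ALFG:
    """Additive lagged Fibonacci generator (illustrative sketch).

    Recurrence: x[n] = (x[n-j] + x[n-k]) mod 2**m, with 0 < j < k.
    The lags (5, 17) and m = 32 are assumed example values.
    """

    def __init__(self, seed_words, j=5, k=17, m=32):
        if not 0 < j < k:
            raise ValueError("lags must satisfy 0 < j < k")
        if len(seed_words) != k:
            raise ValueError("need exactly k seed words")
        self.j = j
        self.k = k
        self.mask = (1 << m) - 1
        # state[0] is the oldest word x[n-k]; state[k-1] is x[n-1]
        self.state = deque((w & self.mask for w in seed_words), maxlen=k)
        if not any(w & 1 for w in self.state):
            raise ValueError("at least one seed word must be odd")

    def next(self):
        # x[n-k] sits at index 0 and x[n-j] at index k-j
        value = (self.state[0] + self.state[self.k - self.j]) & self.mask
        self.state.append(value)  # maxlen=k discards the oldest word
        return value


gen = ALFG(list(range(1, 18)))
first_two = [gen.next(), gen.next()]  # 1 + 13 and 2 + 14
```

In hardware the state maps naturally onto a small shift register or RAM plus one adder, which is part of why this family is attractive when many independent streams are needed.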
Implementation of total focusing method for phased array ultrasonic imaging on FPGA
NASA Astrophysics Data System (ADS)
Guo, JianQiang; Li, Xi; Gao, Xiaorong; Wang, Zeyong; Zhao, Quanke
2015-02-01
This paper describes a multi-FPGA imaging system dedicated to real-time imaging using the Total Focusing Method (TFM) and Full Matrix Capture (FMC). The system was described entirely in Verilog HDL and implemented on an Altera Stratix IV GX FPGA development board. The algorithm proceeds as follows: establish an image coordinate system and divide it into a grid; calculate the complete acoustic path length from the transmitting array element to the focus point and back to the receiving array element, and transform it into an index value; look up the sound pressure values in ROM by that index and superimpose them to obtain the pixel value of one focus point; and repeat for all focus points to obtain the final image. The imaging results show that this algorithm images defects with a high SNR. Because the FPGA's parallel processing capability provides high-speed performance, the system can provide a fully functional imaging interface with good performance.
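The steps listed in this abstract correspond to a delay-and-sum loop over pixels and transmit/receive pairs. The Python sketch below is a minimal software model under stated assumptions: a linear array lying on the line z = 0, a constant sound speed `c`, a sampling rate `fs`, and nearest-sample indexing. The function name `tfm_image` and the data layout `fmc[tx][rx]` are hypothetical; the paper's ROM-based indexing scheme is not reproduced.

```python
import math


def tfm_image(fmc, element_x, grid_x, grid_z, c, fs):
    """Total Focusing Method by delay-and-sum (illustrative sketch).

    fmc[tx][rx] is the sampled A-scan for transmitter tx and receiver rx.
    element_x holds element positions on the z = 0 line (assumed geometry).
    Returns image[iz][ix] as the magnitude of the summed samples.
    """
    n = len(element_x)
    image = []
    for z in grid_z:
        row = []
        for x in grid_x:
            # one-way distance from every element to this focus point
            dist = [math.hypot(x - ex, z) for ex in element_x]
            total = 0.0
            for tx in range(n):
                for rx in range(n):
                    # transmit path + receive path, converted to a sample index
                    idx = int(round((dist[tx] + dist[rx]) / c * fs))
                    trace = fmc[tx][rx]
                    if idx < len(trace):
                        total += trace[idx]
            row.append(abs(total))
        image.append(row)
    return image
```

Every pixel is independent of every other pixel, which is what makes the method a natural fit for parallel hardware.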
20-GFLOPS QR processor on a Xilinx Virtex-E FPGA
NASA Astrophysics Data System (ADS)
Walke, Richard L.; Smith, Robert W. M.; Lightbody, Gaye
2000-11-01
Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample-rate systems, such as radar and comms, the calculation of adaptive weights is a computationally intensive task that requires highly parallel solutions. For systems where low power consumption and volume are important, the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement the operators as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve a predictable dense layout and high-speed operation. We present results that show that 20 GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry-standard hardware description language VHDL.
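QR decomposition for adaptive weight calculation is commonly built from Givens rotations, folding one data snapshot at a time into an upper-triangular matrix. The Python sketch below shows that row-update step for real-valued data; it is an assumption that the paper's processor follows this general scheme, and the function name `givens_qr_update` is hypothetical. Complex data, forgetting factors and the weight back-substitution are omitted.

```python
import math


def givens_qr_update(R, row):
    """Fold one new data row into the upper-triangular matrix R in place.

    Each Givens rotation zeroes one element of the incoming row against
    the matching diagonal element of R (real-valued sketch).
    """
    n = len(R)
    x = list(row)
    for i in range(n):
        r = math.hypot(R[i][i], x[i])
        if r == 0.0:
            continue  # nothing to rotate in this column
        c, s = R[i][i] / r, x[i] / r
        for j in range(i, n):
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
    return R


# Usage: accumulate the rows of A = [[3, 0], [4, 5]]
R = [[0.0, 0.0], [0.0, 0.0]]
for data_row in ([3.0, 0.0], [4.0, 5.0]):
    givens_qr_update(R, data_row)
```

After both rows, R satisfies R^T R = A^T A, so it equals the triangular factor of A up to row signs. In a linear or systolic array, the rotation for each diagonal element and its application along the row are performed by separate cells working in a pipeline.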
Real-time field programmable gate array architecture for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2001-01-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory, and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented on dedicated very-large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resource utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
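One common way to limit image-memory accesses in a hardware convolver is to read each pixel once in raster order and keep the neighbourhood in line buffers and a small register window. The Python sketch below models that idea in software; it is an assumed, generic structure rather than the paper's architecture, and the function name `stream_convolve` is hypothetical. The mask is applied without flipping (correlation form), and only the valid region is produced.

```python
from collections import deque


def stream_convolve(pixels, width, mask):
    """Raster-order mask filtering that touches each input pixel once.

    pixels: iterable of image values in raster order; width: image width.
    Returns valid-region outputs in raster order (mask is not flipped).
    """
    mh, mw = len(mask), len(mask[0])
    # line buffers: the previous mh-1 image rows, indexed by column
    lines = [[0] * width for _ in range(mh - 1)]
    # register window: mh rows of the last mw columns seen
    window = [deque([0] * mw, maxlen=mw) for _ in range(mh)]
    out = []
    for n, p in enumerate(pixels):
        r, c = divmod(n, width)
        # vertical slice at column c: older rows from the buffers, newest is p
        column = [lines[i][c] for i in range(mh - 1)] + [p]
        # shift the line buffers up by one row at this column
        for i in range(mh - 2):
            lines[i][c] = lines[i + 1][c]
        if mh > 1:
            lines[mh - 2][c] = p
        for i in range(mh):
            window[i].append(column[i])
        # emit once the window lies fully inside the image
        if r >= mh - 1 and c >= mw - 1:
            out.append(sum(mask[i][j] * window[i][j]
                           for i in range(mh) for j in range(mw)))
    return out
```

For a 3x3 mask this needs two line buffers and nine window registers, regardless of image height.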
NASA Astrophysics Data System (ADS)
Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki
At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
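The fixed-point CORDIC algorithm mentioned in this abstract replaces the multiplications of a Givens rotation with shifts and adds. The Python sketch below shows vectoring mode, which rotates a vector onto the x-axis and accumulates the rotation angle. The 16 iterations, the 16 fractional bits, and the names `cordic_vectoring`, `ATAN` and `GAIN` are assumptions for illustration; the paper's complex-valued shared processor is not reproduced.

```python
import math

ITER = 16  # number of micro-rotations (assumed)
FRAC = 16  # fractional bits of the fixed-point format (assumed)

# arctangent lookup table in fixed point: atan(2**-i) * 2**FRAC
ATAN = [int(round(math.atan(2.0 ** -i) * (1 << FRAC))) for i in range(ITER)]

# constant CORDIC gain accumulated over all micro-rotations
GAIN = 1.0
for _i in range(ITER):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * _i))


def cordic_vectoring(x, y):
    """Rotate the fixed-point vector (x, y), x >= 0, onto the x-axis.

    Uses only shifts and adds. Returns (GAIN * magnitude, angle), both
    scaled by 2**FRAC.
    """
    z = 0
    for i in range(ITER):
        if y >= 0:
            # rotate clockwise by atan(2**-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN[i]
        else:
            # rotate counter-clockwise by atan(2**-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN[i]
    return x, z


mag_fx, ang_fx = cordic_vectoring(3 << FRAC, 4 << FRAC)
magnitude = mag_fx / (1 << FRAC) / GAIN  # close to 5
angle = ang_fx / (1 << FRAC)             # close to atan2(4, 3)
```

The word length and iteration count trade accuracy against circuit area, which is the kind of trade-off the abstract says it evaluates.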
Wang, Jinlong; Lu, Mai; Hu, Yanwen; Chen, Xiaoqiang; Pan, Qiangqiang
2015-12-01
The neuron is the basic unit of the biological neural system. The Hodgkin-Huxley (HH) model is one of the most realistic neuron models for describing the electrophysiological characteristics of a neuron. Hardware implementation of neurons could provide new research ideas for the clinical treatment of spinal cord injury, bionics and artificial intelligence. In the present study, a single HH model neuron was implemented in hardware on a Field Programmable Gate Array (FPGA) using DSP Builder technology. The neuron implemented in the FPGA was stimulated by different types of current, the action potential response characteristics were analyzed, and the correlation coefficient between the numerical simulation result and the hardware implementation result was calculated. The results showed that the neuronal action potential response of the FPGA was highly consistent with the numerical simulation result. This work lays the foundation for hardware implementation of neural networks.
Optimized Smith Waterman processor design for breast cancer early diagnosis
NASA Astrophysics Data System (ADS)
Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.
2017-09-01
This paper presents an optimized design of the Processing Element (PE) of a Systolic Array (SA) which implements the affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources in order to increase the number of PEs in the FPGA, giving a higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequences, such as those for Breast Cancer (BC), for early diagnosis. The optimized PE architecture has the smallest PE area, with 15 slices per PE and 776 PEs implemented in the Virtex-6 FPGA.
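For reference, the affine gap Smith Waterman recurrence that each processing element evaluates can be written as a short software model. The Python sketch below computes only the best local alignment score; the scoring values and the function name `smith_waterman_affine` are assumptions for the example, and a gap of length k is charged `gap_open + (k - 1) * gap_extend`. A systolic array evaluates the cells of one anti-diagonal in parallel, whereas this model visits them sequentially.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gap penalties (sketch).

    H: best score of an alignment ending at (i, j)
    E: best score ending with a gap that consumes b[j-1] (horizontal move)
    F: best score ending with a gap that consumes a[i-1] (vertical move)
    """
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # extend an existing gap or open a new one
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # local alignment: scores never drop below zero
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Each cell depends only on its left, upper and upper-left neighbours, so a PE needs just a few registers and comparators; that small footprint is what the PE-area optimization in the abstract targets.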
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA
NASA Astrophysics Data System (ADS)
Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing
Large-scale matrix inversion plays an important role in many applications. However, to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array of processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and a maximum power-performance of 41 can be achieved compared to a Pentium Dual CPU with double SSE threads.
Virtex-5 FPGA Implementation of Advanced Encryption Standard Algorithm
NASA Astrophysics Data System (ADS)
Rais, Muhammad H.; Qasim, Syed M.
2010-06-01
In this paper, we present an implementation of Advanced Encryption Standard (AES) cryptographic algorithm using state-of-the-art Virtex-5 Field Programmable Gate Array (FPGA). The design is coded in Very High Speed Integrated Circuit Hardware Description Language (VHDL). Timing simulation is performed to verify the functionality of the designed circuit. Performance evaluation is also done in terms of throughput and area. The design implemented on Virtex-5 (XC5VLX50FFG676-3) FPGA achieves a maximum throughput of 4.34 Gbps utilizing a total of 399 slices.
FPGA design for constrained energy minimization
NASA Astrophysics Data System (ADS)
Wang, Jianwei; Chang, Chein-I.; Cao, Mang
2004-02-01
The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has also been demonstrated. The main challenge of realizing the CEM in hardware architecture is the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner, which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Array (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is the use of the Coordinate Rotation DIgital Computer (CORDIC) algorithm, which can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, and therefore in FPGA. Since the computation of the inverse of the data correlation matrix involves a series of Givens rotations, the use of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM is studied and its detailed architecture is described.
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; LaBel, Kenneth; Kim, Hak
2014-01-01
An informative session regarding SRAM FPGA basics. Presenting a framework for fault injection techniques applied to Xilinx Field Programmable Gate Arrays (FPGAs). Introduce an overlooked time component that illustrates fault injection is impractical for most real designs as a stand-alone characterization tool. Demonstrate procedures that benefit from fault injection error analysis.
A Brain-Machine-Brain Interface for Rewiring of Cortical Circuitry after Traumatic Brain Injury
2013-09-01
implemented to significantly decrease the IIR system response time, especially when artifacts were highly reproducible in consecutive stimulation...cycles. The proposed system architecture was hardware-implemented on a field-programmable gate array (FPGA) and tested using two sets of prerecorded...its FPGA implementation and testing with prerecorded neural datasets are reported in a manuscript currently in press with the IEEE Transactions on
Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions
Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners
1995-01-01
In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...
NASA Astrophysics Data System (ADS)
Yussup, N.; Ibrahim, M. M.; Lombigit, L.; Rahman, N. A. A.; Zin, M. R. M.
2014-02-01
Typically, a system consists of hardware acting as the controller and software installed on a personal computer (PC). In effective nuclear detection, the hardware comprises the detection setup and the electronics used, while the software consists of analysis tools and a graphical display on the PC. A data acquisition interface is necessary to enable communication between the controller hardware and the PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However, the implementation of USB is complex. This paper describes the implementation of a data acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip, which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host, and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of the FPGA board as the data interface are discussed.
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre-processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System
NASA Astrophysics Data System (ADS)
Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.
2016-03-01
We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators (data rates exceeding a terabyte per second) is becoming common. The ICE system is built around a custom FPGA motherboard that makes use of a Xilinx Kintex-7 FPGA and an ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high-density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high-bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array.
Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio interferometers.
JTRS/SCA and Custom/SDR Waveform Comparison
NASA Technical Reports Server (NTRS)
Oldham, Daniel R.; Scardelletti, Maximilian C.
2007-01-01
This paper compares two waveform implementations generating the same RF signal using the same SDR development system. Both waveforms implement a satellite modem using QPSK modulation at a 1 Mbps data rate with rate-1/2 convolutional encoding. Both waveforms are partitioned the same way across the general purpose processor (GPP) and the field programmable gate array (FPGA). Both waveforms implement the same equivalent set of radio functions on the GPP and FPGA. The GPP implements the majority of the radio functions and the FPGA implements the final digital RF modulator stage. One waveform is implemented directly on the SDR development system and the second waveform is implemented using the JTRS/SCA model. This paper contrasts the amount of resources needed to implement both waveforms and demonstrates the importance of waveform partitioning across the SDR development system.
Abid, Abdulbasit
2013-03-01
This paper presents a thorough discussion of the proposed field-programmable gate array (FPGA) implementation for fringe pattern demodulation using the one-dimensional continuous wavelet transform (1D-CWT) algorithm. This algorithm is also known as wavelet transform profilometry. Initially, the 1D-CWT is programmed using the C programming language and compiled into VHDL using the ImpulseC tool. This VHDL code is implemented on the Altera Cyclone IV GX EP4CGX150DF31C7 FPGA. A fringe pattern image with a size of 512×512 pixels is presented to the FPGA, which processes the image using the 1D-CWT algorithm. The FPGA requires approximately 100 ms to process the image and produce a wrapped phase map. For performance comparison purposes, the 1D-CWT algorithm is programmed using the C language. The C code is then compiled using the Intel compiler version 13.0. The compiled code is run on a state-of-the-art Dell Precision workstation. The time required to process the fringe pattern image is approximately 1 s. In order to further reduce the execution time, the 1D-CWT is reprogrammed using the Intel Integrated Performance Primitives (IPP) Library Version 7.1. The execution time was reduced to approximately 650 ms. This confirms that at least a sixfold speedup was gained using the FPGA implementation over a state-of-the-art workstation that executes a heavily optimized implementation of the 1D-CWT algorithm.
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated with a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC, which is used for high-level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing tasks such as spatial edge detection or image segmentation. The article details the motion algorithm, the sensor architecture, the use of the event-address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Parallel Fixed Point Implementation of a Radial Basis Function Network in an FPGA
de Souza, Alisson C. D.; Fernandes, Marcelo A. C.
2014-01-01
This paper proposes a parallel fixed point radial basis function (RBF) artificial neural network (ANN), implemented in a field programmable gate array (FPGA) trained online with a least mean square (LMS) algorithm. The processing time and occupied area were analyzed for various fixed point formats. The problems of precision of the ANN response for nonlinear classification using the XOR gate and interpolation using the sine function were also analyzed in a hardware implementation. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA. PMID:25268918
FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling.
Kim, Chang-Min; Park, Hyung-Min; Kim, Taesu; Choi, Yoon-Kyung; Lee, Soo-Young
2003-01-01
A field programmable gate array (FPGA) implementation of the independent component analysis (ICA) algorithm is reported for blind signal separation (BSS) and adaptive noise canceling (ANC) in real time. In order to provide enormous computing power for ICA-based algorithms with multipath reverberation, a special digital processor is designed and implemented in FPGA. The chip design fully utilizes a modular concept, and several chips may be put together for complex applications with a large number of noise sources. Experimental results with a fabricated test board are reported for ANC only, BSS only, and simultaneous ANC/BSS, which demonstrate successful speech enhancement in real environments in real time.
NASA Astrophysics Data System (ADS)
Naldi, G.; Bartolini, M.; Mattana, A.; Pupillo, G.; Hickish, J.; Foster, G.; Bianchi, G.; Lingua, A.; Monari, J.; Montebugnoli, S.; Perini, F.; Rusticelli, S.; Schiaffino, M.; Virone, G.; Zarb Adami, K.
In radio astronomy, Field Programmable Gate Array (FPGA) technology is widely used for the implementation of digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption and cost offered by FPGA chips compared to other technologies such as ASIC, GPU and CPU. In recent years, several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments such as an FX correlator, a direct imager, a beamformer and a multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper we present the experience gained in this kind of application.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yonggang, E-mail: wangyg@ustc.edu.cn; Hui, Cong; Liu, Chong; Xu, Chao
2016-04-01
The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving, so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA
Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang
2009-01-01
Background: In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers, including parallel computers and multi-core computers, exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results: RNAalifold shows complicated data dependences, in which the dependence distance is variable and the dependence direction runs across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine-grained hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce the energy table parameter size by 80%. Conclusion: To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981 residues running on a Personal Computer (PC) platform with a Pentium 4 2.6 GHz CPU. PMID:19208138
A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm
Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay
2012-01-01
A design of a systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm used by bioinformatics experts. In contrast to other designs that detect at most one hit per clock cycle, our design applies a Multiple Hits Detection Module, a pipelined systolic array, to search for multiple hits in a single clock cycle. Further, we designed a Hits Combination Block which combines overlapping hits from the systolic array into one hit. These implementations complete the first and second steps of the BLAST architecture and achieve significant speedup compared with previously published architectures. PMID:25969747
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Ormsby, John (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGAs (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing (DSP) functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions, including the capability for reconfiguration when SRAM-based FPGAs are employed, fast parallel implementation of multiple control loops, and implementations that can meet space level radiation tolerance in a compact form factor. Other researchers have presented the notion that a second order digital filter with proportional-integral-derivative (PID) control functionality can be implemented in an FPGA. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. Our goal is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. Meeting our goals requires an alternative compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation-tolerant FPGAs are a feasible option for reaching these goals.
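The remark that PID control can be realized as a second order digital filter can be made concrete with the incremental (velocity) form of a discrete PID law, u[k] = u[k-1] + b0 e[k] + b1 e[k-1] + b2 e[k-2]. The Python sketch below assumes backward-Euler integration and a backward-difference derivative; the class name `DiscretePID` and the coefficient names are hypothetical, and this is not the controller from the report.

```python
class DiscretePID:
    """PID controller written as a second order difference equation.

    u[k] = u[k-1] + b0*e[k] + b1*e[k-1] + b2*e[k-2]
    Coefficients assume a backward-Euler integral and a
    backward-difference derivative with sample period dt.
    """

    def __init__(self, kp, ki, kd, dt):
        self.b0 = kp + ki * dt + kd / dt
        self.b1 = -kp - 2.0 * kd / dt
        self.b2 = kd / dt
        self.e1 = 0.0  # e[k-1]
        self.e2 = 0.0  # e[k-2]
        self.u1 = 0.0  # u[k-1]

    def step(self, error):
        u = self.u1 + self.b0 * error + self.b1 * self.e1 + self.b2 * self.e2
        # shift the delay line for the next sample
        self.e2, self.e1, self.u1 = self.e1, error, u
        return u
```

In this form the controller needs three multipliers, a few adders and three delay registers, which is the kind of fixed structure that maps directly onto FPGA logic.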
Note: Design of FPGA based system identification module with application to atomic force microscopy
NASA Astrophysics Data System (ADS)
Ghosal, Sayan; Pradhan, Sourav; Salapaka, Murti
2018-05-01
The science of system identification is widely utilized in modeling input-output relationships of diverse systems. In this article, we report a field programmable gate array (FPGA) based implementation of a real-time system identification algorithm which employs forgetting factors and bias compensation techniques. The FPGA module is employed to estimate the mechanical properties of surfaces of materials at the nano-scale with an atomic force microscope (AFM). The FPGA module is user friendly and can be interfaced with commercially available AFMs. Extensive simulation and experimental results validate the design.
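As a rough illustration of identification with a forgetting factor, the sketch below implements textbook recursive least squares in NumPy. It is an assumption-laden stand-in: the bias compensation mentioned in the abstract is not included, and the function name, the forgetting factor 0.98 and the initial covariance scale are hypothetical choices.

```python
import numpy as np


def rls_identify(phis, ys, lam=0.98, delta=1e6):
    """Estimate theta in y[k] = phi[k] . theta with exponentially forgotten data.

    phis: (N, n) array of regressors, ys: (N,) array of measurements.
    lam is the forgetting factor (0 < lam <= 1); delta scales the initial covariance.
    """
    n = phis.shape[1]
    theta = np.zeros(n)
    P = delta * np.eye(n)
    for phi, y in zip(phis, ys):
        k = P @ phi / (lam + phi @ P @ phi)        # gain vector
        theta = theta + k * (y - phi @ theta)      # correct with the prediction error
        P = (P - np.outer(k, phi @ P)) / lam       # discount old information
    return theta
```

A forgetting factor below 1 lets the estimate follow slowly changing parameters at the cost of more noise sensitivity.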
An Implementation of Physical Layer Authentication Using Software Radio
2009-07-01
USRP consists of an FPGA responsible for up/down conversions, ADCs and DACs, and various plug-in daughterboards...seen in figure 4, the USRP consists of a USB interface, a field-programmable gate array (FPGA), ADCs and DACs, and daughterboards. The...configuration. In the following, we detail the signal receive path to highlight the design of the hardware.
NASA Astrophysics Data System (ADS)
Nguyen, An Hung; Guillemette, Thomas; Lambert, Andrew J.; Pickering, Mark R.; Garratt, Matthew A.
2017-09-01
Image registration is a fundamental image processing technique. It is used to spatially align two or more images that have been captured at different times, from different sensors, or from different viewpoints. Many algorithms have been proposed for this task, the most common being the well-known Lucas-Kanade (LK) and Horn-Schunck approaches. However, the main limitation of these approaches is the computational complexity required to implement the large number of iterations necessary for successful alignment of the images. Previously, a multi-pass image interpolation algorithm (MP-I2A) was developed to considerably reduce the number of iterations required for successful registration compared with the LK algorithm. This paper develops a kernel-warping algorithm (KWA), a modified version of the MP-I2A, which requires fewer iterations to successfully register two images and less memory space for the field-programmable gate array (FPGA) implementation than the MP-I2A. These reductions increase the feasibility of implementing the proposed algorithm on FPGAs with very limited memory space and other hardware resources. A two-FPGA system, rather than a single-FPGA system, is successfully developed to implement the KWA in order to compensate for the insufficient hardware resources of a single FPGA and to increase the parallel processing ability and scalability of the system.
Design techniques for a stable operation of cryogenic field-programmable gate arrays.
Homulle, Harald; Visser, Stefan; Patra, Bishnu; Charbon, Edoardo
2018-01-01
In this paper, we show how a deep-submicron field-programmable gate array (FPGA) can be operated more stably at extremely low temperatures through special firmware design techniques. Stability at low temperatures is limited through long power supply wires and reduced performance of various printed circuit board components commonly employed at room temperature. Extensive characterization of these components shows that the majority of decoupling capacitor types and voltage regulators are not well behaved at cryogenic temperatures, asking for an ad hoc solution to stabilize the FPGA supply voltage, especially for sensitive applications. Therefore, we have designed a firmware that enforces a constant power consumption, so as to stabilize the supply voltage in the interior of the FPGA. The FPGA is powered with a supply at several meters distance, causing significant resistive voltage drop and thus fluctuations on the local supply voltage. To achieve the stabilization, the variation in digital logic speed, which directly corresponds to changes in supply voltage, is constantly measured and corrected for through a tunable oscillator farm, implemented on the FPGA. The impact of the stabilization technique is demonstrated together with a reconfigurable analog-to-digital converter (ADC), completely implemented in the FPGA fabric and operating at 15 K. The ADC performance can be improved by at most 1.5 bits (effective number of bits) thanks to the more stable supply voltage. The method is versatile and robust, enabling seamless porting to other FPGA families and configurations.
Independent component analysis algorithm FPGA design to perform real-time blind source separation
NASA Astrophysics Data System (ADS)
Meyer-Baese, Uwe; Odom, Crispin; Botella, Guillermo; Meyer-Baese, Anke
2015-05-01
The conditions that arise in the Cocktail Party Problem prevail across many fields, creating a need for Blind Source Separation (BSS). These fields include array processing, communications, medical signal processing, speech processing, wireless communication, audio, acoustics and biomedical engineering. The concept of the cocktail party problem and BSS led to the development of Independent Component Analysis (ICA) algorithms. ICA proves useful for applications needing real time signal processing. The goal of this research was to perform an extensive study on the ability and efficiency of ICA algorithms to perform blind source separation on mixed signals in software and to implement them in hardware with a Field Programmable Gate Array (FPGA). The Algebraic ICA (A-ICA), Fast ICA, and Equivariant Adaptive Separation via Independence (EASI) ICA were examined and compared. The best algorithm was taken to be the one requiring the least complexity and fewest resources while effectively separating mixed sources; this was the EASI algorithm. The EASI ICA was implemented in hardware on an FPGA to analyze its performance in real time.
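For readers unfamiliar with EASI, the following NumPy sketch shows the commonly cited serial update of the separating matrix, W <- W - mu * (y y^T - I + g(y) y^T - y g(y)^T) W. The cubic nonlinearity, the step size and the function names are assumptions made for illustration; the abstract does not state which variant or parameters were used in hardware.

```python
import numpy as np


def easi_step(W, x, mu=0.01, g=lambda y: y ** 3):
    """Apply one EASI update to the separating matrix W for a single sample x."""
    y = W @ x                      # current source estimate
    gy = g(y)                      # element-wise nonlinearity (assumed cubic)
    n = len(y)
    H = np.outer(y, y) - np.eye(n) + np.outer(gy, y) - np.outer(y, gy)
    return W - mu * H @ W


def easi_separate(X, mu=0.01):
    """Sweep once over the columns of X (one mixed sample per column)."""
    W = np.eye(X.shape[0])
    for x in X.T:
        W = easi_step(W, x, mu)
    return W
```

Each update uses only outer products and one matrix product and needs no matrix inversion, which is one reason the rule is attractive for a fixed hardware datapath.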
NASA Astrophysics Data System (ADS)
Bigdeli, Abbas; Biglari-Abhari, Morteza; Salcic, Zoran; Tin Lai, Yat
2006-12-01
A new pipelined systolic array-based (PSA) architecture for matrix inversion is proposed. The pipelined systolic array (PSA) architecture is suitable for FPGA implementations as it efficiently uses available resources of an FPGA. It is scalable for different matrix sizes and as such allows employing parameterisation that makes it suitable for customisation for application-specific needs. This new architecture has an advantage of [InlineEquation not available: see fulltext.] processing element complexity, compared to the [InlineEquation not available: see fulltext.] in other systolic array structures, where the size of the input matrix is given by [InlineEquation not available: see fulltext.]. The use of the PSA architecture for a Kalman filter as an implementation example, which requires different structures for different numbers of states, is illustrated. The resulting precision error is analysed and shown to be negligible.
FPGA-based Trigger System for the Fermilab SeaQuest Experiment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
FPGA-based trigger system for the Fermilab SeaQuest experiment
NASA Astrophysics Data System (ADS)
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; Chang, Ting-Hua; Chang, Wen-Chen; Chen, Yen-Chu; Gilman, Ron; Nakano, Kenichi; Peng, Jen-Chieh; Wang, Su-Yin
2015-12-01
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton-nucleon interactions in a high rate environment. The trigger system consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
FPGA-based Trigger System for the Fermilab SeaQuest Experiment
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; ...
2015-09-10
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)
Li, Isaac TS; Shum, Warren; Truong, Kevin
2007-01-01
Background: To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results: In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then, using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II soft processor. Conclusion: This design of FPGA accelerated hardware offers a promising new direction for improving the computation of genomic database searching. PMID:17555593
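The single-cell score module described above corresponds to the standard Smith-Waterman recurrence, sketched here in plain Python. The scoring values (match +3, mismatch -3, linear gap -2) and the function name are illustrative assumptions, not the parameters of the paper.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row and column stay zero
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell depends only on its upper, left and upper-left neighbours.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

Because a cell depends only on three neighbours, all cells on the same anti-diagonal are independent of each other, which is what allows a hardware grid of cell modules to evaluate many cells at once where software visits them one by one.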
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).
Li, Isaac T S; Shum, Warren; Truong, Kevin
2007-06-07
To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then, using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II soft processor. This design of FPGA accelerated hardware offers a promising new direction for improving the computation of genomic database searching.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used proportional-integral-derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a DSP (Digital Signal Processor) or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. 
An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching this goal.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of Field Programmable Gate Arrays (FPGA's) in the hardware implementation of fast digital signal processing functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used Proportional-Integral-Derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a Digital Signal Processor (DSP) device or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using DSP devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, Pulse Width Modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacecraft.
The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive-control algorithm approaches. Radiation tolerant FPGA's are a feasible option for reaching this goal.
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.
Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan
2017-07-01
In ultrasound image analysis, speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" remains a challenge for speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values when the tissue deformation is large. The major drawback of these methods is their high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs to handle the different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on the FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphics card (GeForce GTX 580).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication and kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29 W to 42 W.
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
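The background-model idea in this abstract can be sketched in NumPy as follows. This is a software illustration only: it uses `numpy.linalg.eigh` in place of the Jacobi diagonalization, a covariance matrix of mean-centred pixels, and a fixed threshold where the paper thresholds dynamically. The function names and the threshold are hypothetical.

```python
import numpy as np


def fit_background(frames, k):
    """Learn a k-dimensional PCA subspace from a stack of background frames."""
    X = np.asarray(frames, dtype=float).reshape(len(frames), -1)
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / len(X)                  # pixel-by-pixel covariance matrix
    _, eigvecs = np.linalg.eigh(cov)          # eigenvalues come back in ascending order
    basis = eigvecs[:, -k:].T                 # keep the k leading eigenvectors as rows
    return mean, basis


def segment(frame, mean, basis, threshold):
    """Mark pixels that the background subspace cannot reconstruct."""
    x = np.asarray(frame, dtype=float).ravel() - mean
    recon = basis.T @ (basis @ x)             # project onto the subspace and back
    return (np.abs(x - recon) > threshold).reshape(np.shape(frame))
```

Pixels that fit the learned subspace reconstruct almost exactly and stay below the threshold, while a moving object leaves a large residual.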
FPGA implementation of self organizing map with digital phase locked loops.
Hikawa, Hiroomi
2005-01-01
The self-organizing map (SOM) has found applicability in a wide range of application areas. Recently, new SOM hardware with phase modulated pulse signals and digital phase-locked loops (DPLLs) has been proposed (Hikawa, 2005). The system uses the DPLL as a computing element since the operation of the DPLL is very similar to that of the SOM's computation. The system also uses the phase of a square waveform to hold the value of each input vector element. This paper discusses the hardware implementation of the DPLL SOM architecture. For effective hardware implementation, some components are redesigned to reduce the circuit size. The proposed SOM architecture is described in VHDL and implemented on a field programmable gate array (FPGA). Its feasibility is verified by experiments. Results show that the proposed SOM implemented on the FPGA has a good quantization capability, and its circuit size is very small.
FPGA Implementation of Heart Rate Monitoring System.
Panigrahy, D; Rakshit, M; Sahu, P K
2016-03-01
This paper describes a field programmable gate array (FPGA) implementation of a system that calculates the heart rate from an electrocardiogram (ECG) signal. After heart rate calculation, tachycardia, bradycardia or normal heart rate can easily be detected. ECG is a diagnostic tool routinely used to assess the electrical activity and muscular function of the heart. Heart rate is calculated by detecting the R peaks in the ECG signal. A portable, continuous heart rate monitoring system for patients using ECG needs dedicated hardware. An FPGA provides easy testability and allows faster implementation and verification of a new design. We have proposed a five-stage methodology using basic VHDL blocks such as addition, multiplication and data conversion (real to fixed point and vice versa). Our proposed heart rate calculation (R-peak detection) method has been validated using 48 first channel ECG records of the MIT-BIH arrhythmia database. It shows an accuracy of 99.84%, a sensitivity of 99.94% and a positive predictive value of 99.89%. Our proposed method outperforms other well-known methods in the case of pathological ECG signals and was successfully implemented in an FPGA.
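To make the link between R peaks and heart rate concrete, here is a deliberately simple NumPy sketch. It is not the five-stage method of the paper: the amplitude threshold, the 0.2 s refractory period, the function names and the conventional 60/100 beats-per-minute cut-offs for bradycardia and tachycardia are all assumptions chosen for illustration.

```python
import numpy as np


def detect_r_peaks(ecg, fs, threshold_ratio=0.6, refractory_s=0.2):
    """Return sample indices of local maxima above a fraction of the signal maximum."""
    ecg = np.asarray(ecg, dtype=float)
    threshold = threshold_ratio * ecg.max()
    refractory = int(refractory_s * fs)       # minimum spacing between beats, in samples
    peaks = []
    for i in range(1, len(ecg) - 1):
        is_local_max = ecg[i] >= ecg[i - 1] and ecg[i] > ecg[i + 1]
        if is_local_max and ecg[i] > threshold:
            if not peaks or i - peaks[-1] > refractory:
                peaks.append(i)
    return np.array(peaks)


def heart_rate_bpm(peaks, fs):
    """Convert the mean R-R interval to beats per minute."""
    rr_seconds = np.diff(peaks) / fs
    return 60.0 / rr_seconds.mean()


def classify(bpm, low=60.0, high=100.0):
    """Label the rate using assumed adult resting thresholds."""
    if bpm < low:
        return "bradycardia"
    if bpm > high:
        return "tachycardia"
    return "normal"
```

A real recording would need filtering and baseline removal before peak picking; the sketch only shows the final arithmetic from peak positions to a rate label.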
Kalinin, Stanislav; Kühnemuth, Ralf; Vardanyan, Hayk; Seidel, Claus A M
2012-09-01
We present a fast hardware photon correlator implemented in a field-programmable gate array (FPGA) combined with a compact confocal fluorescence setup. The correlator has two independent units with a time resolution of 4 ns while utilizing less than 15% of a low-end FPGA. The device directly accepts transistor-transistor logic (TTL) signals from two photon counting detectors and calculates two auto- or cross-correlation curves in real time. Test measurements demonstrate that the performance of our correlator is comparable with the current generation of commercial devices. The sensitivity of the optical setup is identical or even superior to current commercial devices. The FPGA design and the optical setup both allow for a straightforward extension to multi-color applications. This inexpensive and compact solution with a very good performance can serve as a versatile platform for uses in education, applied sciences, and basic research.
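The auto- and cross-correlation curves mentioned above can be illustrated with a small NumPy sketch operating on binned photon counts. It assumes linearly spaced lags and normalization by the global mean count rates, which is a simplification; the function name is hypothetical and the sketch says nothing about how the hardware correlator organizes its lag channels.

```python
import numpy as np


def correlate_counts(a, b, max_lag):
    """Normalized correlation G(lag) = <a(t) b(t + lag)> / (<a> <b>) for lag = 0..max_lag."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = a.mean() * b.mean()
    g = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        # Overlap the first channel with the second channel shifted by `lag` bins.
        g[lag] = np.mean(a[:len(a) - lag] * b[lag:]) / norm
    return g
```

Passing the same array twice gives an autocorrelation curve, and passing the two detector channels gives the cross-correlation; an uncorrelated signal yields values near 1 at all lags.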
A Real-Time System for Lane Detection Based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Xiao, Jing; Li, Shutao; Sun, Bin
2016-12-01
This paper presents a real-time lane detection system comprising edge detection and an improved Hough Transform based lane detection algorithm, together with its hardware implementation on a field programmable gate array (FPGA) and a digital signal processor (DSP). Firstly, gradient amplitude and direction information are combined to extract lane edge information. Then, the information is used to determine the region of interest. Finally, the lanes are extracted by using the improved Hough Transform. The image processing module of the system consists of an FPGA and a DSP. In particular, the algorithms implemented in the FPGA work in a pipeline and process in parallel so that the system can run in real time. In addition, the DSP realizes the lane line extraction and display functions with the improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
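For orientation, the sketch below shows the standard (not the improved) Hough Transform voting step for straight lines in NumPy, using the parameterization rho = x cos(theta) + y sin(theta). The abstract does not describe its improvements, so none are modelled; the function name, the one-pixel rho resolution and the default of 180 angle bins are assumptions.

```python
import numpy as np


def hough_accumulator(edge_points, diag, n_theta=180):
    """Vote in (rho, theta) space for every edge point.

    diag must be at least the largest possible |rho| (e.g. the image diagonal).
    Row index r of the accumulator corresponds to rho = r - diag.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)
    theta_idx = np.arange(n_theta)
    for x, y in edge_points:
        rhos = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, theta_idx] += 1      # one vote per angle for this point
    return acc, thetas
```

Collinear edge points vote for the same (rho, theta) cell, so lane candidates appear as peaks in the accumulator; restricting the votes to a region of interest, as the abstract describes, reduces the number of points processed.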
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.
Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir
2013-01-01
DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive when performed on traditional computational platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. Previous implementations of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimizations over a previous FPGA implementation that improve the overall speed-up achieved while taking into account the price of the platform that was optimized. The optimizations are: (1) the array of processing elements is made to run on a change in input value and not on the clock, eliminating the need for tight clock synchronization; (2) the implementation is unrestrained by the size of the sequences to be aligned; (3) the waiting time required for the sequences to load to the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix that makes it possible to save the diagonal elements to be used in the next pass, in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this implementation achieved a 20-fold performance improvement in terms of cell updates per second (CUPS) over a general-purpose processor (GPP) implementation.
Benrekia, Fayçal; Attari, Mokhtar; Bouhedda, Mounir
2013-01-01
This paper develops a primitive gas recognition system for discriminating between industrial gas species. The system under investigation consists of an array of eight micro-hotplate-based SnO2 thin film gas sensors with different selectivity patterns. The output signals are processed through a signal conditioning and analyzing system. These signals feed a decision-making classifier, which is obtained via a Field Programmable Gate Array (FPGA) with Very High-Speed Integrated Circuit Hardware Description Language. The classifier relies on a multilayer neural network based on a back propagation algorithm with one hidden layer of four neurons and eight neurons at the input and five neurons at the output. The neural network designed after implementation consists of twenty thousand gates. The achieved experimental results seem to show the effectiveness of the proposed classifier, which can discriminate between five industrial gases. PMID:23529119
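The network shape described above (eight inputs, one hidden layer of four neurons, five outputs) can be sketched as a forward pass in NumPy. The sigmoid activation, the random initialization and the function names are assumptions; the abstract does not state the activation function or how the trained weights are represented in hardware.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (an assumed choice)."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(rng):
    """Create weights for an 8-4-5 network with small random values."""
    return {
        "W1": 0.1 * rng.standard_normal((4, 8)), "b1": np.zeros(4),
        "W2": 0.1 * rng.standard_normal((5, 4)), "b2": np.zeros(5),
    }


def mlp_forward(x, params):
    """Map 8 sensor readings to 5 class scores, one per gas."""
    hidden = sigmoid(params["W1"] @ x + params["b1"])
    return sigmoid(params["W2"] @ hidden + params["b2"])


def predict_gas(x, params):
    """Return the index of the highest-scoring output neuron."""
    return int(np.argmax(mlp_forward(x, params)))
```

With this shape the forward pass needs 8 x 4 + 4 x 5 = 52 multiply-accumulate operations per classification, which gives a feel for why such a classifier fits in a modest gate budget.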
A low power, area efficient FPGA based beamforming technique for 1-D CMUT arrays.
Joseph, Bastin; Joseph, Jose; Vanjari, Siva Rama Krishna
2015-08-01
A low power, area efficient digital beamformer targeting a low frequency (2 MHz) 1-D linear Capacitive Micromachined Ultrasonic Transducer (CMUT) array is developed. In designing the beamforming logic, the symmetry of the CMUT array is exploited to reduce area and power consumption. The proposed method is verified in Matlab by clocking an Arbitrary Waveform Generator (AWG). The architecture is implemented on a Xilinx Spartan 3E FPGA kit to check its functionality. The beamforming logic is also implemented for 8, 16, 32, and 64 element CMUTs targeting an Application Specific Integrated Circuit (ASIC) platform at Vdd 1.62 V in UMC 90 nm technology. The proposed architecture consumes significantly less power and area (1.2895 mW and 47134.4 μm² for a 64 element digital beamforming circuit) than the conventional square root based algorithm.
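To illustrate why array symmetry saves work, the sketch below computes on-axis focusing delays with the conventional square-root formula for one half of the aperture and mirrors them for the other half. It is not the paper's architecture. The element pitch (half a wavelength at 2 MHz), the focal depth, and the sound speed are assumed values.

```python
import math


def focusing_delays(n_elements=64, pitch_m=0.385e-3, focus_m=0.03, c_m_per_s=1540.0):
    """Transmit delays (seconds) for an on-axis focus, using array symmetry.

    Assumes an even number of elements centred on the array axis. Only half of
    the square roots are evaluated; the other half is a mirror image.
    """
    half = n_elements // 2
    half_paths = []
    for k in range(half):
        x = (k + 0.5) * pitch_m                    # element centre offset from the axis
        half_paths.append(math.sqrt(focus_m ** 2 + x ** 2))
    longest = half_paths[-1]                       # outermost element fires first
    half_delays = [(longest - p) / c_m_per_s for p in half_paths]
    return half_delays[::-1] + half_delays         # mirror onto the other half


delays = focusing_delays()
```

With 64 elements only 32 path lengths are computed, so the delay-generation logic is halved before any further optimization.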
Energy efficiency analysis and implementation of AES on an FPGA
NASA Astrophysics Data System (ADS)
Kenney, David
The Advanced Encryption Standard (AES) was developed by Joan Daemen and Vincent Rijmen and endorsed by the National Institute of Standards and Technology in 2001. It was designed to replace the aging Data Encryption Standard (DES) and be useful for a wide range of applications with varying throughput, area, power dissipation and energy consumption requirements. Field Programmable Gate Arrays (FPGAs) are flexible and reconfigurable integrated circuits that are useful for many different applications including the implementation of AES. Though they are highly flexible, FPGAs are often less efficient than Application Specific Integrated Circuits (ASICs); they tend to operate slower, take up more space and dissipate more power. There have been many FPGA AES implementations that focus on obtaining high throughput or low area usage, but very little research has been done in the area of low power or energy efficient FPGA based AES; in fact, it is rare for estimates on power dissipation to be made at all. This thesis presents a methodology to evaluate the energy efficiency of FPGA based AES designs and proposes a novel FPGA AES implementation which is highly flexible and energy efficient. The proposed methodology is implemented as part of a novel scripting tool, the AES Energy Analyzer, which is able to fully characterize the power dissipation and energy efficiency of FPGA based AES designs. Additionally, this thesis introduces a new FPGA power reduction technique called Opportunistic Combinational Operand Gating (OCOG) which is used in the proposed energy efficient implementation. The AES Energy Analyzer was able to estimate the power dissipation and energy efficiency of the proposed AES design during its most commonly performed operations. It was found that the proposed implementation consumes less energy per operation than any previous FPGA based AES implementations that included power estimations. Finally, the use of Opportunistic Combinational Operand Gating on an AES cipher was found to reduce its dynamic power consumption by up to 17% when compared to an identical design that did not employ the technique.
Choi, Kang-Il
2016-01-01
This paper proposes a pipelined non-deterministic finite automaton (NFA)-based string matching scheme using field programmable gate array (FPGA) implementation. The characteristics of the NFA such as shared common prefixes and no failure transitions are considered in the proposed scheme. In the implementation of the automaton-based string matching using an FPGA, each state transition is implemented with a look-up table (LUT) for the combinational logic circuit between registers. In addition, multiple state transitions between stages can be performed in a pipelined fashion. In this paper, it is proposed that multiple one-to-one state transitions, called merged state transitions, can be performed with an LUT. By cutting down the number of used LUTs for implementing state transitions, the hardware overhead of combinational logic circuits is greatly reduced in the proposed pipelined NFA-based string matching scheme. PMID:27695114
Kim, HyunJin; Choi, Kang-Il
2016-01-01
This paper proposes a pipelined non-deterministic finite automaton (NFA)-based string matching scheme using field programmable gate array (FPGA) implementation. The characteristics of the NFA such as shared common prefixes and no failure transitions are considered in the proposed scheme. In the implementation of the automaton-based string matching using an FPGA, each state transition is implemented with a look-up table (LUT) for the combinational logic circuit between registers. In addition, multiple state transitions between stages can be performed in a pipelined fashion. In this paper, it is proposed that multiple one-to-one state transitions, called merged state transitions, can be performed with an LUT. By cutting down the number of used LUTs for implementing state transitions, the hardware overhead of combinational logic circuits is greatly reduced in the proposed pipelined NFA-based string matching scheme.
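The two NFA properties named in the abstract, shared common prefixes and no failure transitions, can be sketched in software. The patterns are stored in a trie so that common prefixes share states, and a set of active states is advanced on every input character, with the start state always active. This illustrates the NFA model only. It does not implement the paper's merged state transitions or its LUT mapping, and the function names are hypothetical.

```python
def build_trie(patterns):
    """Build NFA states as a trie so that common prefixes share states."""
    transitions = [{}]                 # state index -> {character: next state}
    accepting = {}                     # accepting state -> matched pattern
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting[state] = pattern
    return transitions, accepting


def nfa_match(text, patterns):
    """Return sorted (start position, pattern) pairs for every occurrence."""
    transitions, accepting = build_trie(patterns)
    active = set()
    hits = []
    for pos, ch in enumerate(text):
        next_active = set()
        # The start state is always active, so no failure transitions are needed.
        for state in active | {0}:
            target = transitions[state].get(ch)
            if target is not None:
                next_active.add(target)
                if target in accepting:
                    word = accepting[target]
                    hits.append((pos - len(word) + 1, word))
        active = next_active
    return sorted(hits)


matches = nfa_match("ushers", ["he", "she", "his", "hers"])
```

In hardware each state would be a register and each transition a small piece of combinational logic, so all active states advance in the same clock cycle.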
High-Speed Current dq PI Controller for Vector Controlled PMSM Drive
Reaz, Mamun Bin Ibne; Rahman, Labonnah Farzana; Chang, Tae Gyu
2014-01-01
A high-speed current controller for a vector controlled permanent magnet synchronous motor (PMSM) is presented. The controller is developed on a modular design for faster calculation and uses a fixed-point proportional-integral (PI) method for improved accuracy. The current dq controller is usually implemented on a digital signal processor (DSP) based computer. However, DSP based solutions are reaching their physical limits, with execution times of a few microseconds. Besides, such digital solutions suffer from high implementation cost. In this research, the overall controller is realized in a field programmable gate array (FPGA). FPGA implementation of the overall control algorithm significantly reduces the execution time, helping to guarantee the steadiness of the motor. An Agilent 16821A Logic Analyzer is employed to validate the results of the design implemented in the FPGA. Experimental results indicate that the proposed current dq PI controller needs only 50 ns of execution time with a 40 MHz clock, which is the lowest computational cycle reported to date. PMID:24574913
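Two points in the abstract lend themselves to a short worked example: 50 ns at 40 MHz corresponds to two clock cycles, and a fixed-point PI update needs only multiplies, shifts, adds, and saturation. The sketch below assumes a Q15 number format, 16-bit outputs, and a 32-bit integrator; the abstract specifies none of these, and the class and function names are hypothetical.

```python
Q = 15  # assumed number of fractional bits (Q15 format)


def to_q(value):
    """Convert a real number to its Q15 integer representation."""
    return int(round(value * (1 << Q)))


def saturate(value, bits):
    """Clamp to the range of a signed two's-complement word."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(low, min(high, value))


class FixedPointPI:
    """Proportional-integral controller using integer arithmetic only."""

    def __init__(self, kp, ki):
        self.kp = to_q(kp)
        self.ki = to_q(ki)
        self.integrator = 0

    def step(self, error_q):
        # Integrate with a wide accumulator, then add the proportional term.
        self.integrator = saturate(self.integrator + ((self.ki * error_q) >> Q), 32)
        output = ((self.kp * error_q) >> Q) + self.integrator
        return saturate(output, 16)


def clock_cycles(execution_time_s, clock_hz):
    return round(execution_time_s * clock_hz)


controller = FixedPointPI(kp=0.5, ki=0.25)
first_output = controller.step(to_q(0.1))
cycles = clock_cycles(50e-9, 40e6)
```

A dq current controller would run one such loop for the d axis and one for the q axis; in an FPGA the two can be evaluated in parallel.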
Field programmable gate array-assigned complex-valued computation and its limits
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernard-Schwarz, Maria, E-mail: maria.bernardschwarz@ni.com; Institute of Applied Physics, TU Wien, Wiedner Hauptstrasse 8, 1040 Wien; Zwick, Wolfgang
We discuss how leveraging Field Programmable Gate Array (FPGA) technology as part of a high performance computing platform reduces latency to meet the demanding real-time constraints of a quantum optics simulation. Implementations of complex-valued operations using fixed-point numerics on a Virtex-5 FPGA compare favorably to more conventional solutions on a central processing unit. Our investigation explores the performance of multiple fixed-point options along with a traditional 64-bit floating-point version. With this information, the lowest execution times can be estimated. Relative error is examined to ensure simulation accuracy is maintained.
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model.
Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid
2014-01-01
A set of techniques for the efficient implementation of a Hodgkin-Huxley-based (H-H) model of a neural network on an FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is the complexity of the H-H model, which limits the network size and the execution speed. However, the basics of the original model cannot be compromised when the effect of synaptic specifications on network behavior is the subject of study. To solve the problem, we used computational techniques such as the CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of the arithmetic circuits. In addition, we employed techniques such as resource sharing to preserve the details of the model and to increase the network size while keeping the network execution speed close to real time with high precision. An implementation of a network of two mini-columns with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristics of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, such as voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. In addition to the inherent properties of FPGAs, such as parallelism and re-configurability, our approach makes the FPGA-based system a suitable candidate for the study of neural control of cognitive robots and systems as well.
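CORDIC evaluates elementary functions with shift-and-add iterations, which is why it suits FPGA arithmetic circuits. The abstract does not say which functions were computed this way (the exponentials in the H-H rate equations would call for the hyperbolic variant). As a minimal sketch of the iteration itself, the code below uses the simpler circular rotation mode to obtain sine and cosine. Floating-point numbers stand in for the fixed-point words a hardware design would use.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Return (sin, cos) of theta using CORDIC rotation mode.

    Valid for |theta| up to roughly 1.74 rad. Each iteration multiplies only
    by 2**-i, which is a right shift in fixed-point hardware.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]   # lookup table
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))                 # pre-scale factor
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        direction = 1.0 if z >= 0 else -1.0
        x, y = x - direction * y * 2.0 ** -i, y + direction * x * 2.0 ** -i
        z -= direction * angles[i]
    return y, x


sin_half, cos_half = cordic_sin_cos(0.5)
```

Each extra iteration adds about one bit of accuracy, so the iteration count trades precision against latency and logic.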
NASA Astrophysics Data System (ADS)
Zou, Liang; Fu, Zhuang; Zhao, YanZheng; Yang, JunYan
2010-07-01
This paper proposes a pipelined circuit architecture implemented in an FPGA, a very large scale integrated circuit (VLSI), which efficiently handles the real-time non-uniformity correction (NUC) algorithm for infrared focal plane arrays (IRFPA). Dual Nios II soft-core processors and a DSP with a 64+ core together constitute the imaging system. Each processor undertakes its own system task and coordinates its work with the others. The system on programmable chip (SOPC) in the FPGA works steadily at a global clock frequency of 96 MHz. The ample timing margin lets the FPGA perform the NUC image pre-processing algorithm with ease, which in turn supports the post image processing in the DSP. This paper also presents a hardware (HW) and software (SW) co-design in the FPGA. The resulting architecture yields a multiprocessor image processing system that satisfies the system's performance requirements.
A novel biomimetic sonarhead using beamforming technology to mimic bat echolocation.
Steckel, Jan; Peremans, Herbert
2012-07-01
A novel biomimetic sonarhead has been developed to allow researchers of bat echolocation behavior and biomimetic sonar to perform experiments with a system similar to the bat's sensory system. The bat's echolocation-related transfer function (ERTF) is implemented using an array of receivers to implement the head-related transfer function (HRTF), and an array of emitters mounted on a cylindrical manifold to implement the emission pattern of the bat. The complete system is controlled by a field-programmable gate array (FPGA) based embedded system connected through a USB interface.
NASA Astrophysics Data System (ADS)
Burri, Samuel; Homulle, Harald; Bruschini, Claudio; Charbon, Edoardo
2016-04-01
LinoSPAD is a reconfigurable camera sensor with a 256×1 CMOS SPAD (single-photon avalanche diode) pixel array connected to a low cost Xilinx Spartan 6 FPGA. The LinoSPAD sensor's line of pixels has a pitch of 24 μm and a 40% fill factor. The FPGA implements an array of 64 TDCs and histogram engines capable of processing up to 8.5 giga-photons per second. The LinoSPAD sensor measures 1.68 mm×6.8 mm and each pixel has a direct digital output to connect to the FPGA. The chip is bonded on a carrier PCB to connect to the FPGA motherboard. The 64 carry chain based TDCs, sampled at 400 MHz, can generate a timestamp every 7.5 ns with a mean time resolution below 25 ps per code. The 64 histogram engines provide time-of-arrival histograms covering up to 50 ns. An alternative mode allows the readout of 28 bit timestamps, which have a range of up to 4.5 ms. Since the FPGA TDCs have considerable non-linearity, we implemented a correction module capable of increasing histogram linearity in real time. The TDC array is interfaced to a computer using a super-speed USB3 link to transfer over 150k histograms per second for the 12.5 ns reference period used in our characterization. After characterization and subsequent programming of the post-processing, we measure an instrument response histogram shorter than 100 ps FWHM using a strong laser pulse with 50 ps FWHM. This timing resolution, combined with the high fill factor, makes the sensor well suited for a wide variety of applications, from fluorescence lifetime microscopy and Raman spectroscopy to 3D time-of-flight.
Dynamically programmable cache
NASA Astrophysics Data System (ADS)
Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas
1998-10-01
Reconfigurable machines have recently been used as co-processors to accelerate the execution of certain algorithms or program subroutines. The problems with this approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory, which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to these problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create multi-level reconfigurable machines. As a result, DPC machines have both higher data accessibility and higher FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implement a multi-context switching (virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations, resulting in faster execution time. In this paper, DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultra1 SPARC station for two different algorithms (convolution and motion estimation).
NASA Astrophysics Data System (ADS)
Kelly, Jamie S.; Bowman, Hiroshi C.; Rao, Vittal S.; Pottinger, Hardy J.
1997-06-01
Implementation issues represent an unfamiliar challenge to most control engineers, and many techniques for controller design ignore these issues outright. Consequently, the design of controllers for smart structural systems usually proceeds without regard for their eventual implementation, thus resulting either in serious performance degradation or in hardware requirements that squander power, complicate integration, and drive up cost. The level of integration assumed by the Smart Patch further exacerbates these difficulties, and any design inefficiency may render the realization of a single-package sensor-controller-actuator system infeasible. The goal of this research is to automate the controller implementation process and to relieve the design engineer of implementation concerns like quantization, computational efficiency, and device selection. We specifically target Field Programmable Gate Arrays (FPGAs) as our hardware platform because these devices are highly flexible, power efficient, and reprogrammable. The current study develops an automated implementation sequence that minimizes hardware requirements while maintaining controller performance. Beginning with a state space representation of the controller, the sequence automatically generates a configuration bitstream for a suitable FPGA implementation. MATLAB functions optimize and simulate the control algorithm before translating it into the VHSIC hardware description language. These functions improve power efficiency and simplify integration in the final implementation by performing a linear transformation that renders the controller computationally friendly. The transformation favors sparse matrices in order to reduce multiply operations and the hardware necessary to support them; simultaneously, the remaining matrix elements take on values that minimize limit cycles and parameter sensitivity. 
The proposed controller design methodology is implemented on a simple cantilever beam test structure using FPGA hardware. The experimental closed loop response is compared with that of an automated FPGA controller implementation. Finally, we explore the integration of FPGA based controllers into a multi-chip module, which we believe represents the next step towards the realization of the Smart Patch.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-01-01
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-12-15
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
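The two approximation techniques mentioned in the abstract can be sketched briefly: a truncated Taylor series for a trigonometric function, and a square root obtained from an inverse square root refined by Newton-Raphson steps. The abstract does not give the series length, the initial guess, or the iteration count. The bit-level initial guess below uses the widely known constant 0x5F3759DF, which is an assumption here and not a detail from the paper.

```python
import struct


def taylor_sin(x):
    """Fifth-order Taylor polynomial for sin(x), accurate for small |x|."""
    return x - x ** 3 / 6.0 + x ** 5 / 120.0


def fast_inv_sqrt(x, iterations=2):
    """Approximate 1/sqrt(x) for positive single-precision x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]     # float32 bit pattern
    bits = 0x5F3759DF - (bits >> 1)                         # assumed initial guess
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)                     # Newton-Raphson step
    return y


def approx_sqrt(x):
    """sqrt(x) = x * (1/sqrt(x)), which avoids a hardware divider."""
    return x * fast_inv_sqrt(x)


root_two = approx_sqrt(2.0)
```

The Newton-Raphson step uses only multiplications and a subtraction, which map directly onto the floating point hardware attached to a soft-core processor.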
NASA Astrophysics Data System (ADS)
Rais, Muhammad H.
2010-06-01
This paper presents a Field Programmable Gate Array (FPGA) implementation of standard and truncated multipliers using the Very High Speed Integrated Circuit Hardware Description Language (VHDL). The truncated multiplier is a good candidate for digital signal processing (DSP) applications such as finite impulse response (FIR) filters and the discrete cosine transform (DCT). A remarkable reduction in FPGA resources, delay, and power can be achieved by using truncated multipliers instead of standard parallel multipliers when the full precision of the standard multiplier is not required. The truncated multipliers show significant improvement compared to standard multipliers. Results show that the anomaly in the Spartan-3 AN average connection and maximum pin delay has been efficiently reduced in the Virtex-4 device.
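A truncated multiplier saves hardware by never forming the least significant columns of the partial-product array. The abstract does not describe the exact truncation or correction scheme, so the sketch below shows the simplest form under stated assumptions: unsigned operands, a fixed number of dropped columns, and no correction constant.

```python
def truncated_multiply(a, b, n_bits=8, dropped_columns=6):
    """Multiply two unsigned n-bit numbers, ignoring low partial-product columns.

    The partial product of bit i of a and bit j of b lies in column i + j.
    Columns below dropped_columns are not generated, so no AND gates or adders
    are needed for them.
    """
    total = 0
    for i in range(n_bits):
        for j in range(n_bits):
            if i + j >= dropped_columns and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total


def worst_case_error(dropped_columns):
    """Largest possible error when dropped_columns <= n_bits."""
    return sum((c + 1) << c for c in range(dropped_columns))


exact = 255 * 255
approx = truncated_multiply(255, 255)
```

For 8-bit operands with six dropped columns, 21 of the 64 partial products disappear, at the cost of an error of at most 321 in a product that can reach 65025.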
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model
Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid
2014-01-01
A set of techniques for the efficient implementation of a Hodgkin-Huxley-based (H-H) model of a neural network on an FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is the complexity of the H-H model, which limits the network size and the execution speed. However, the basics of the original model cannot be compromised when the effect of synaptic specifications on network behavior is the subject of study. To solve the problem, we used computational techniques such as the CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of the arithmetic circuits. In addition, we employed techniques such as resource sharing to preserve the details of the model and to increase the network size while keeping the network execution speed close to real time with high precision. An implementation of a network of two mini-columns with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristics of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, such as voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. In addition to the inherent properties of FPGAs, such as parallelism and re-configurability, our approach makes the FPGA-based system a suitable candidate for the study of neural control of cognitive robots and systems as well. PMID:25484854
FPGA Coprocessor for Accelerated Classification of Images
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.
2008-01-01
An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of recently developed hybrid Virtex-4FX field-programmable gate arrays (FPGAs) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for greater onboard processing capability than previously demonstrated in software. The original C-language program, demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite, implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, cloud, or unclassified. Current onboard processors, such as the one on EO-1, have limited computing power and extremely limited active storage capability, and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program and two newly formulated programs, one for a more capable expanded-linear-kernel SVM algorithm and one for a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.
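A linear-kernel SVM decision reduces to dot products, which is the kind of regular arithmetic that maps well to an FPGA coprocessor. The sketch below is an illustration under assumptions: the one-score-per-class layout, the rule that a pixel is "unclassified" when no score is positive, and all weights are hypothetical, since the article does not describe how the EO-1 classifier combines its decisions.

```python
CLASSES = ["snow", "water", "ice", "land", "cloud"]


def classify_pixel(features, weights, biases):
    """Return the class with the highest linear score w . x + b.

    weights holds one weight vector per class (assumed layout). A pixel whose
    best score is not positive is reported as "unclassified".
    """
    scores = [sum(w * x for w, x in zip(class_weights, features)) + bias
              for class_weights, bias in zip(weights, biases)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return CLASSES[best] if scores[best] > 0 else "unclassified"


# Toy model: each class responds to exactly one of five hypothetical features.
toy_weights = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
toy_biases = [-0.5] * 5

label = classify_pixel([0.0, 1.0, 0.0, 0.0, 0.0], toy_weights, toy_biases)
```

A polynomial kernel would replace each dot product with a power of a dot product, which raises accuracy and arithmetic cost together.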
Implementing a Digital Phasemeter in an FPGA
NASA Technical Reports Server (NTRS)
Rao, Shanti R.
2008-01-01
Firmware for implementing a digital phasemeter within a field-programmable gate array (FPGA) has been devised. In the original application of this firmware, the phase that one seeks to measure is the difference between the phases of two nominally-equal-frequency heterodyne signals generated by two interferometers. In that application, zero-crossing detectors convert the heterodyne signals to trains of rectangular pulses, the two pulse trains are fed to a fringe counter (the major part of the phasemeter) controlled by a clock signal having a frequency greater than the heterodyne frequency, and the fringe counter computes a time-averaged estimate of the difference between the phases of the two pulse trains. The firmware also does the following: Causes the FPGA to compute the frequencies of the input signals; Causes the FPGA to implement an Ethernet (or equivalent) transmitter for readout of phase and frequency values; and Provides data for use in diagnosis of communication failures. The readout rate can be set, by programming, to a value between 250 Hz and 1 kHz. Network addresses can be programmed by the user.
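The measurement principle, counting ticks of a fast clock between the edges of two pulse trains and averaging, can be sketched as follows. This is a simplified software model, not the firmware: the edge timestamps, the pairing of edges, and the function names are assumptions.

```python
def phase_difference_cycles(ref_edges, meas_edges):
    """Time-averaged phase of meas relative to ref, as a fraction of a cycle.

    Each list holds the clock-tick count at successive rising edges of one
    pulse train. Simple averaging is only meaningful when the phase stays
    away from the 0/1 wrap-around point.
    """
    periods = [later - earlier for earlier, later in zip(ref_edges, ref_edges[1:])]
    period = sum(periods) / len(periods)
    offsets = [((m - r) % period) / period for r, m in zip(ref_edges, meas_edges)]
    return sum(offsets) / len(offsets)


def signal_frequency_hz(edges, clock_hz):
    """Estimate signal frequency from the mean spacing of rising edges."""
    ticks = (edges[-1] - edges[0]) / (len(edges) - 1)
    return clock_hz / ticks


ref = [0, 100, 200, 300]
meas = [25, 125, 225, 325]
phase = phase_difference_cycles(ref, meas)
```

The resolution of a single measurement is one clock tick out of one heterodyne period, which is why the clock must run faster than the heterodyne frequency and why averaging helps.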
Real-time implementation of camera positioning algorithm based on FPGA & SOPC
NASA Astrophysics Data System (ADS)
Yang, Mingcao; Qiu, Yuehong
2014-09-01
In recent years, advances in positioning algorithms and FPGAs have made real-time, fast, and accurate camera positioning feasible. Building on a study of embedded hardware and dual-camera positioning systems, this thesis sets up an infrared optical positioning system based on an FPGA and an SOPC system, which enables real-time positioning of marker points in space. The work comprises four parts: (1) A CMOS sensor, driven by FPGA hardware, captures the pixels of three target objects; visible-light LEDs serve as the target points of the instrument. (2) Before the feature point coordinates are extracted, the image is filtered, here with a median filter, so that disturbances from the platform do not affect the result. (3) The marker point coordinates are extracted in FPGA hardware: a new iterative threshold selection method segments the image, the resulting binary image is labeled, and the coordinates of the feature points of the needle are calculated with the center of gravity method. (4) Direct linear transformation (DLT) and the epipolar constraint method are applied to the three-dimensional reconstruction of spatial coordinates from the area-array CMOS system. The SOPC system uses its dual-core structure to run the matching and coordinate operations separately, which increases processing speed.
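Step (3), thresholding followed by the center of gravity method, can be sketched as below. The fixed threshold stands in for the iterative threshold selection described in the abstract, whose details are not given, and the sketch assumes a single marker in the image so that no labeling step is needed.

```python
def binarize(image, threshold):
    """Segment a grayscale image into marker (1) and background (0) pixels."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def centroid(binary_image):
    """Center of gravity (x, y) of all marker pixels, or None if there are none."""
    count = sum_x = sum_y = 0
    for y, row in enumerate(binary_image):
        for x, value in enumerate(row):
            if value:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return sum_x / count, sum_y / count


frame = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
marker_position = centroid(binarize(frame, threshold=5))
```

The three running sums can be updated as pixels stream in, so hardware needs no frame buffer for this step, only three accumulators and one division per marker.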
An FPGA Implementation to Detect Selective Cationic Antibacterial Peptides
Polanco González, Carlos; Nuño Maganda, Marco Aurelio; Arias-Estrada, Miguel; del Rio, Gabriel
2011-01-01
Exhaustive prediction of physicochemical properties of peptide sequences is used in different areas of biological research. One example is the identification of selective cationic antibacterial peptides (SCAPs), which may be used in the treatment of different diseases. Due to the discrete nature of peptide sequences, the physicochemical properties calculation is considered a high-performance computing problem. A competitive solution for this class of problems is to embed algorithms into dedicated hardware. In the present work we present the adaptation, design and implementation of an algorithm for SCAPs prediction into a Field Programmable Gate Array (FPGA) platform. Four physicochemical properties codes useful in the identification of peptide sequences with potential selective antibacterial activity were implemented into an FPGA board. The speed-up gained in a single-copy implementation was up to 108 times compared with a single Intel processor cycle for cycle. The inherent scalability of our design allows for replication of this code into multiple FPGA cards and consequently improvements in speed are possible. Our results show the first embedded SCAPs prediction solution described and constitutes the grounds to efficiently perform the exhaustive analysis of the sequence-physicochemical properties relationship of peptides. PMID:21738652
An Efficient Pipeline Wavefront Phase Recovery for the CAFADIS Camera for Extremely Large Telescopes
Magdaleno, Eduardo; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel
2010-01-01
In this paper we show a fast, specialized hardware implementation of the wavefront phase recovery algorithm using the CAFADIS camera. The CAFADIS camera is a new plenoptic sensor patented by the Universidad de La Laguna (Canary Islands, Spain): international patent PCT/ES2007/000046 (WIPO publication number WO/2007/082975). It can simultaneously measure the wavefront phase and the distance to the light source in a real-time process. The pipeline algorithm is implemented using Field Programmable Gate Arrays (FPGAs). These devices present an architecture capable of handling the sensor output stream using a massively parallel approach, and they are efficient enough to resolve several Adaptive Optics (AO) problems in Extremely Large Telescopes (ELTs) in terms of processing time requirements. The FPGA implementation of the wavefront phase recovery algorithm using the CAFADIS camera is based on the very fast computation of two-dimensional fast Fourier Transforms (FFTs). Thus we have carried out a comparison between our novel FPGA 2D-FFT and other implementations. PMID:22315523
NASA Astrophysics Data System (ADS)
Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.
2008-04-01
The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. 
Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
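The sparse matrix-vector product and the quoted peak figure can be illustrated with a short software sketch. This is not the authors' striped, systolic hardware design: the compressed sparse row (CSR) layout, the function names, and the assumption that each PE completes one multiply and one add per clock cycle are all illustrative choices. The last assumption is at least consistent with the reported numbers, since 8 PEs × 2 operations × 110 MHz = 1.76 GFLOPS.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a sparse matrix stored in CSR form by a dense vector x."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Only the stored nonzeros of this row contribute to the dot product.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def peak_gflops(num_pes, clock_mhz, flops_per_pe_per_cycle=2):
    """Peak rate, ASSUMING one multiply and one add per PE per cycle."""
    return num_pes * flops_per_pe_per_cycle * clock_mhz / 1000.0


# Hypothetical 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
prototype_peak = peak_gflops(num_pes=8, clock_mhz=110)
```

In an iterative solver such as conjugate gradient, `csr_matvec` would be called once per iteration with the same matrix, which is the access pattern the stream-through design is meant to serve.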
NASA Astrophysics Data System (ADS)
Chen, Yuan-Ho
2017-05-01
In this work, we propose a counting-weighted calibration method for field-programmable-gate-array (FPGA)-based time-to-digital converter (TDC) to provide non-linearity calibration for use in positron emission tomography (PET) scanners. To deal with the non-linearity in FPGA, we developed a counting-weighted delay line (CWD) to count the delay time of the delay cells in the TDC in order to reduce the differential non-linearity (DNL) values based on code density counts. The performance of the proposed CWD-TDC with regard to linearity far exceeds that of TDC with a traditional tapped delay line (TDL) architecture, without the need for nonlinearity calibration. When implemented in a Xilinx Virtex-5 FPGA device, the proposed CWD-TDC achieved time resolution of 60 ps with integral non-linearity (INL) and DNL of [-0.54, 0.24] and [-0.66, 0.65] least-significant-bit (LSB), respectively. This is a clear indication of the suitability of the proposed FPGA-based CWD-TDC for use in PET scanners.
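The DNL and INL figures above are derived from code density counts. As a minimal sketch of a generic code density test (not the counting-weighted method itself), the following assumes that hit times are uniformly distributed, so every TDC bin should ideally collect the same number of hits. The function name and the example histogram are hypothetical.

```python
def dnl_inl_from_histogram(counts):
    """Estimate DNL and INL (both in LSB) from per-bin code density counts."""
    total = sum(counts)
    ideal = total / len(counts)  # expected hits per bin for a uniform input
    # DNL: relative deviation of each bin width from the ideal width.
    dnl = [c / ideal - 1.0 for c in counts]
    # INL: running sum of the DNL values up to and including each bin.
    inl = []
    running = 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return dnl, inl


# Hypothetical 4-bin histogram: the first bin is narrow, the second is wide.
counts = [90, 110, 100, 100]
dnl, inl = dnl_inl_from_histogram(counts)
```

A bin that collects fewer hits than the ideal count is narrower than one LSB and shows a negative DNL; the INL accumulates these deviations along the delay line.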
NASA Astrophysics Data System (ADS)
Deliparaschos, Kyriakos M.; Michail, Konstantinos; Zolotas, Argyrios C.; Tzafestas, Spyros G.
2016-05-01
This work presents a field programmable gate array (FPGA)-based embedded software platform coupled with a software-based plant, forming a hardware-in-the-loop (HIL) that is used to validate a systematic sensor selection framework. The systematic sensor selection framework combines multi-objective optimization, linear-quadratic-Gaussian (LQG)-type control, and the nonlinear model of a maglev suspension. A robustness analysis of the closed loop is carried out prior to implementation, supporting the appropriateness of the solution under parametric variation. The analysis also shows that quantization is robust under different controller gains. While the LQG controller is implemented on an FPGA, the physical process is realized in a high-level system modeling environment. FPGA technology enables rapid evaluation of the algorithms and test designs under realistic scenarios, avoiding the heavy time penalty associated with hardware description language (HDL) simulators. The HIL technique facilitates significant speed-up in the required execution time when compared to its software-based counterpart model.
Large-N correlator systems for low frequency radio astronomy
NASA Astrophysics Data System (ADS)
Foster, Griffin
Low frequency radio astronomy has entered a second golden age driven by the development of a new class of large-N interferometric arrays. The low frequency array (LOFAR) and a number of redshifted HI Epoch of Reionization (EoR) arrays are currently undergoing commissioning and regularly observing. Future arrays of unprecedented sensitivity and resolution at low frequencies, such as the square kilometer array (SKA) and the hydrogen epoch of reionization array (HERA), are in development. The combination of advancements in specialized field programmable gate array (FPGA) hardware for signal processing, computing and graphics processing unit (GPU) resources, and new imaging and calibration algorithms has opened up the oft underused radio band below 300 MHz. These interferometric arrays require efficient implementation of digital signal processing (DSP) hardware to compute the baseline correlations. FPGA technology provides an optimal platform to develop new correlators. The significant growth in data rates from these systems requires automated software to reduce the correlations in real time before storing the data products to disk. Low frequency, widefield observations introduce a number of unique calibration and imaging challenges. The efficient implementation of FX correlators using FPGA hardware is presented. Two correlators have been developed, one for the 32 element BEST-2 array at Medicina Observatory and the other for the 96 element LOFAR station at Chilbolton Observatory. In addition, calibration and imaging software has been developed for each system which makes use of the radio interferometry measurement equation (RIME) to derive calibrations. A process for generating sky maps from widefield LOFAR station observations is presented. Shapelets, a method of modelling extended structures such as resolved sources and beam patterns, has been adapted for radio astronomy use to further improve system calibration.
Scaling of computing technology allows for the development of larger correlator systems, which in turn allows for improvements in sensitivity and resolution. This requires new calibration techniques which account for a broad range of systematic effects.
A 7.4 ps FPGA-Based TDC with a 1024-Unit Measurement Matrix
Zhang, Min; Wang, Hai; Liu, Yan
2017-01-01
In this paper, a high-resolution time-to-digital converter (TDC) based on a field programmable gate array (FPGA) device is proposed and tested. During the implementation, a new architecture of TDC is proposed which consists of a measurement matrix with 1024 units. The utilization of routing resources as the delay elements distinguishes the proposed design from other existing designs, which contributes most to the device insensitivity to variations of temperature and voltage. Experimental results suggest that the measurement resolution is 7.4 ps, and the INL (integral nonlinearity) and DNL (differential nonlinearity) are 11.6 ps and 5.5 ps, which indicates that the proposed TDC offers high performance among the available TDCs. Benefitting from the FPGA platform, the proposed TDC has the advantages of easy implementation, low cost, and short development time. PMID:28420121
FPGA in-the-loop simulations of cardiac excitation model under voltage clamp conditions
NASA Astrophysics Data System (ADS)
Othman, Norliza; Adon, Nur Atiqah; Mahmud, Farhanahani
2017-01-01
The voltage clamp technique allows the detection of single channel currents in biological membranes, helping to identify a variety of electrophysiological problems at the cellular level. In this paper, a simulation study of the voltage clamp technique is presented to analyse current-voltage (I-V) characteristics of ion currents based on the Luo-Rudy Phase-I (LR-I) cardiac model by using a Field Programmable Gate Array (FPGA). Cardiac models are becoming increasingly complex, so simulations can take a very long time to run. Thus, a real-time hardware implementation using an FPGA could be one of the best solutions for high-performance real-time systems, as it provides high configurability and performance and is able to execute in parallel. To shorten development time while retaining high-confidence results, FPGA-based rapid prototyping through HDL Coder from MATLAB software has been used to construct the algorithm for the simulation system. HDL Coder is able to convert the designed MATLAB Simulink blocks into hardware description language (HDL) for the FPGA implementation. As a result, the voltage-clamp fixed-point design of the LR-I model has been successfully conducted in MATLAB Simulink, and the simulation of the I-V characteristics of the ionic currents has been verified on a Xilinx FPGA Virtex-6 XC6VLX240T development board through an FPGA-in-the-loop (FIL) simulation.
Implementing a Microcontroller Watchdog with a Field-Programmable Gate Array (FPGA)
NASA Technical Reports Server (NTRS)
Straka, Bartholomew
2013-01-01
Reliability is crucial to safety. Redundancy of important system components greatly enhances reliability and hence safety. Field-Programmable Gate Arrays (FPGAs) are useful for monitoring systems and handling the logic necessary to keep them running with minimal interruption when individual components fail. A complete microcontroller watchdog with logic for failure handling can be implemented in a hardware description language (HDL). HDL-based designs are vendor-independent and can be used on many FPGAs with low overhead.
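The abstract does not describe the watchdog logic itself, so the following is only a behavioral sketch of what such a watchdog commonly does: a counter advances every clock cycle, a heartbeat from the microcontroller clears it, and a reset is asserted when the counter reaches a timeout. The class name, the `timeout_cycles` parameter, and the reset policy are assumptions, and the model is written in Python rather than an HDL.

```python
class Watchdog:
    """Cycle-based watchdog model: asserts reset if no heartbeat arrives in time."""

    def __init__(self, timeout_cycles):
        self.timeout_cycles = timeout_cycles  # hypothetical timeout, in clock cycles
        self.counter = 0
        self.reset_asserted = False

    def tick(self, heartbeat):
        """Advance one clock cycle; heartbeat=True means the microcontroller is alive."""
        if heartbeat:
            # A heartbeat clears the counter and releases any pending reset.
            self.counter = 0
            self.reset_asserted = False
        else:
            self.counter += 1
            if self.counter >= self.timeout_cycles:
                self.reset_asserted = True
        return self.reset_asserted


wd = Watchdog(timeout_cycles=3)
trace = [wd.tick(hb) for hb in [True, False, False, False, True, False]]
```

In the trace above the reset is asserted on the third consecutive cycle without a heartbeat and released again when the next heartbeat arrives.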
Implementation of LSCMA adaptive array terminal for mobile satellite communications
NASA Astrophysics Data System (ADS)
Zhou, Shun; Wang, Huali; Xu, Zhijun
2007-11-01
This paper considers the application of an adaptive array antenna based on the least squares constant modulus algorithm (LSCMA) for interference rejection in mobile SATCOM terminals. A two-element adaptive array scheme is implemented with a combination of ADI TS201S DSP chips and an Altera Stratix II FPGA device, which cooperate to compute the adaptive beamforming. Its interference suppression performance is verified via Matlab simulations. A digital hardware system is implemented to execute the operations of the LSCMA beamforming algorithm, which is represented by an algorithm flowchart. The results of simulations and tests indicate that this scheme can improve the anti-jamming performance of terminals.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Kuang, Jie; Liu, Chong; Cao, Qiang; Li, Deng
2017-03-01
A high performance multi-channel time-to-digital converter (TDC) is implemented in a Xilinx Zynq-7000 field programmable gate array (FPGA). It can be flexibly configured as either 32 TDC channels with 9.9 ps time-interval RMS precision, 16 TDC channels with 6.9 ps RMS precision, or 8 TDC channels with 5.8 ps RMS precision. All TDCs have a 380 M Samples/second measurement throughput and a 2.63 ns measurement dead time. The performance consistency and temperature dependence of TDC channels are also evaluated. Because Zynq-7000 FPGA family integrates a feature-rich dual-core ARM based processing system and 28 nm Xilinx programmable logic in a single device, the realization of high performance TDCs on it will make the platform more widely used in time-measuring related applications.
FPGA implementation of adaptive beamforming in hearing aids.
Samtani, Kartik; Thomas, Jobin; Varma, G Abhinav; Sumam, David S; Deepu, S P
2017-07-01
Beamforming is a spatial filtering technique used in hearing aids to improve target sound reception by reducing interference from other directions. In this paper we propose improvements to an existing architecture for adaptive beamforming with an array of two omnidirectional microphones for hearing aid applications, and implement it on a Xilinx Artix 7 FPGA using VHDL and Xilinx Vivado® 2015.2. Nulls are introduced in particular directions by a combination of two fixed polar patterns. This combination can be adaptively controlled to steer the null in the direction of the noise. The beam patterns and the improvements in SNR values obtained from experiments in a conference room environment are analyzed.
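The idea of steering a null by combining two fixed polar patterns can be sketched numerically. The abstract does not say which patterns are used; the sketch below assumes the common choice of a forward-facing and a backward-facing cardioid, combined as `cf - beta * cb`, where `beta` is the adaptive weight. Under that assumption the null lies at the angle whose cosine is (beta − 1)/(beta + 1). All names are hypothetical.

```python
import math


def pattern(theta, beta):
    """Response at angle theta (radians) of the combined pattern cf - beta * cb."""
    cf = 0.5 * (1.0 + math.cos(theta))  # forward cardioid (ASSUMED fixed pattern)
    cb = 0.5 * (1.0 - math.cos(theta))  # backward cardioid (ASSUMED fixed pattern)
    return cf - beta * cb


def null_angle_deg(beta):
    """Angle (degrees) at which the combined pattern is zero, for beta >= 0."""
    return math.degrees(math.acos((beta - 1.0) / (beta + 1.0)))


def beta_for_null(angle_deg):
    """Weight that places the null at angle_deg (valid for angles other than 0)."""
    c = math.cos(math.radians(angle_deg))
    return (1.0 + c) / (1.0 - c)


beta_120 = beta_for_null(120.0)
response_at_null = pattern(math.radians(120.0), beta_120)
```

With `beta` restricted to the range 0 to 1, the null moves between 180° and 90°, so an adaptive update of this single weight can track a noise source in the rear half-plane while leaving the frontal response at unity.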
Hardware Prototyping of Neural Network based Fetal Electrocardiogram Extraction
NASA Astrophysics Data System (ADS)
Hasan, M. A.; Reaz, M. B. I.
2012-01-01
The aim of this paper is to model the algorithm for Fetal ECG (FECG) extraction from composite abdominal ECG (AECG) using VHDL (Very High Speed Integrated Circuit Hardware Description Language) for FPGA (Field Programmable Gate Array) implementation. Artificial Neural Network that provides efficient and effective ways of separating FECG signal from composite AECG signal has been designed. The proposed method gives an accuracy of 93.7% for R-peak detection in FHR monitoring. The designed VHDL model is synthesized and fitted into Altera's Stratix II EP2S15F484C3 using the Quartus II version 8.0 Web Edition for FPGA implementation.
Applying a Genetic Algorithm to Reconfigurable Hardware
NASA Technical Reports Server (NTRS)
Wells, B. Earl; Weir, John; Trevino, Luis; Patrick, Clint; Steincamp, Jim
2004-01-01
This paper investigates the feasibility of applying genetic algorithms to solve optimization problems that are implemented entirely in reconfigurable hardware. The paper highlights the performance/design space trade-offs that must be understood to effectively implement a standard genetic algorithm within a modern Field Programmable Gate Array (FPGA) reconfigurable hardware environment and presents a case study where this stochastic search technique is applied to standard test-case problems taken from the technical literature. In this research, the targeted FPGA-based platform and high-level design environment was the Starbridge Hypercomputing platform, which incorporates multiple Xilinx Virtex II FPGAs, and the Viva(TM) graphical hardware description language.
Hardware Implementation of Lossless Adaptive and Scalable Hyperspectral Data Compression for Space
NASA Technical Reports Server (NTRS)
Aranki, Nazeeh; Keymeulen, Didier; Bakhshi, Alireza; Klimesh, Matthew
2009-01-01
On-board lossless hyperspectral data compression reduces data volume in order to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware. A modified form of the algorithm that is better suited for data from pushbroom instruments is generally appropriate for flight implementation. A scalable field programmable gate array (FPGA) hardware implementation was developed. The FPGA implementation achieves a throughput performance of 58 Msamples/sec, which can be increased to over 100 Msamples/sec in a parallel implementation that uses twice the hardware resources. This paper describes the hardware implementation of the 'Modified Fast Lossless' compression algorithm on an FPGA. The FPGA implementation targets the current state-of-the-art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform
NASA Technical Reports Server (NTRS)
White, Michael J.
2004-01-01
This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by offloading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an array of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification and then downloaded to the FPGA, which is then itself modified. This paper will discuss the methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
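For reference, the transform being offloaded can be written as a direct software evaluation of the DFT definition, X[k] = Σ x[t]·exp(−2πi·k·t/N). This is a minimal sketch of the textbook O(N²) algorithm, not the FPGA design discussed in the paper; the function name is illustrative.

```python
import cmath


def dft(x):
    """Direct O(N^2) evaluation of the DFT definition for a sequence x."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


constant_spectrum = dft([1.0, 1.0, 1.0, 1.0])  # all energy in bin 0
impulse_spectrum = dft([1.0, 0.0, 0.0, 0.0])   # flat spectrum
```

Each of the N output bins needs N complex multiply-accumulate operations, which is the cost that makes the algorithm attractive to move into parallel hardware.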
Design of an FPGA-based electronic flow regulator (EFR) for spacecraft propulsion system
NASA Astrophysics Data System (ADS)
Manikandan, J.; Jayaraman, M.; Jayachandran, M.
2011-02-01
This paper describes a scheme for electronically regulating the flow of propellant to the thruster from a high-pressure storage tank used in spacecraft applications. Precise flow delivery of propellant to thrusters ensures propulsion system operation at best efficiency by maximizing the propellant and power utilization for the mission. The proposed field programmable gate array (FPGA) based electronic flow regulator (EFR) is used to ensure this precise flow of propellant. This paper presents the hardware and software design of the electronic flow regulator and the implementation of the regulation logic on an FPGA. The motivation for the proposed FPGA-based electronic flow regulation lies in the disadvantages of the conventional approach of using analog circuits. Digital flow regulation improves on its analog equivalent because digital circuits are highly flexible, are less affected by noise, give repeatable accuracy, interface more easily to computers, allow data storage, and have a lower failure rate. FPGAs have certain advantages over ASICs and microprocessors/micro-controllers that motivated us to opt for an FPGA-based electronic flow regulator. Also, because the control algorithm is software, it can be modified without changing the hardware. This scheme is simple enough to adopt for a wide range of applications where the flow is to be regulated for efficient operation. The proposed scheme is based on a space-qualified reconfigurable field programmable gate array (FPGA) and a hybrid micro circuit (HMC). Graphical user interface (GUI) based application software is also developed for debugging, monitoring and controlling the electronic flow regulator from a PC COM port.
Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research
Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M
2008-01-01
Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation. PMID:18412963
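For orientation, the traditional Aho-Corasick algorithm that the paper starts from can be sketched in software as follows. This is the textbook single-automaton version over whole symbols, not the bit-split FPGA organization described above; the function names and the example peptides are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[state][symbol] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognized on entering each state
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: a state's failure link is the longest proper suffix
    # of its path that is also a prefix of some pattern.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


automaton = build_automaton(["MKV", "KVL", "VL"])
hits = search("AMKVLK", automaton)
```

The text is scanned once regardless of how many patterns are loaded, which is why the state machine maps well onto table lookups in on-chip RAM.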
FPGA implementation of motifs-based neuronal network and synchronization analysis
NASA Astrophysics Data System (ADS)
Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao
2016-06-01
Motifs in complex networks play a crucial role in determining brain functions. In this paper, 13 kinds of motifs are implemented with a Field Programmable Gate Array (FPGA) to investigate the relationships between network properties and motif properties. We use a discretization method and a pipelined architecture to construct various motifs with the Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct the synchronization analysis of the motifs as well as the constructed network. We find that the synchronization properties of a motif determine those of the motif-based small-world network, which demonstrates the effectiveness of our proposed hardware simulation platform. By imitating some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace injured nuclei and complete the brain function in the treatment of Parkinson's disease and epilepsy.
Implementation of a high precision multi-measurement time-to-digital convertor on a Kintex-7 FPGA
NASA Astrophysics Data System (ADS)
Kuang, Jie; Wang, Yonggang; Cao, Qiang; Liu, Chong
2018-05-01
Time-to-digital convertors (TDCs) based on field programmable gate arrays (FPGAs) are becoming more and more popular. Multi-measurement is an effective method to improve TDC precision beyond the cell delay limitation. However, the implementation of TDCs with multi-measurement on FPGAs manufactured with 28 nm and more advanced processes faces new challenges. Benefiting from the ones-counter encoding scheme, which was developed in our previous work, we implement a ring oscillator multi-measurement TDC on a Xilinx Kintex-7 FPGA. Using the two TDC channels to measure time-intervals in the range (0 ns-30 ns), the average RMS precision can be improved to 5.76 ps, while the logic resource usage remains the same as that of the one-measurement TDC, and the TDC dead time is only 22 ns. The investigation demonstrates that multi-measurement methods are still applicable to current mainstream FPGAs. Furthermore, the new implementation in this paper achieves a better trade-off among time precision, resource usage and TDC dead time than previous designs.
The dynamical analysis of modified two-compartment neuron model and FPGA implementation
NASA Astrophysics Data System (ADS)
Lin, Qianjin; Wang, Jiang; Yang, Shuangming; Yi, Guosheng; Deng, Bin; Wei, Xile; Yu, Haitao
2017-10-01
The complexity of neural models is increasing with the investigation of larger biological neural networks, more varied ionic channels and more detailed morphologies, and the implementation of a biological neural network is a task with huge computational complexity and power consumption. This paper presents an efficient digital design using piecewise linearization on a field programmable gate array (FPGA) to succinctly implement the reduced two-compartment model, which retains essential features of more complicated models. The design proposes an approximate neuron model which is composed of a set of piecewise linear equations, and it can reproduce different dynamical behaviors to depict the mechanisms of a single neuron model. The consistency of the hardware implementation is verified in terms of dynamical behaviors and bifurcation analysis, and the simulation results, including varied ion channel characteristics, coincide with the biological neuron model with high accuracy. Hardware synthesis on the FPGA demonstrates that the proposed model has reliable performance and lower hardware resource usage compared with the original two-compartment model. These investigations are conducive to the scalability of biological neural networks in reconfigurable large-scale neuromorphic systems.
Goavec-Mérou, G; Chrétien, N; Friedt, J-M; Sandoz, P; Martin, G; Lenczner, M; Ballandras, S
2014-01-01
Vibrating mechanical structure characterization is demonstrated using contactless techniques best suited for mobile and rotating equipment. Fast measurement rates are achieved using Field Programmable Gate Array (FPGA) devices as real-time digital signal processors. Two kinds of algorithms are implemented on FPGA and experimentally validated in the case of the vibrating tuning fork. A first application concerns in-plane displacement detection by vision with sampling rates above 10 kHz, thus reaching frequency ranges above the audio range. A second demonstration concerns pulsed-RADAR cooperative target phase detection and is applied to radiofrequency acoustic transducers used as passive wireless strain gauges. In this case, the 250 ksamples/s refresh rate achieved is limited only by the acoustic sensor design and not by the detection bandwidth. These realizations illustrate the efficiency, interest, and potential of FPGA-based real-time digital signal processing for the contactless interrogation of passive embedded probes with high refresh rates.
A mixed-signal implementation of a polychronous spiking neural network with delay adaptation
Wang, Runchun M.; Hamilton, Tara J.; Tapson, Jonathan C.; van Schaik, André
2014-01-01
We present a mixed-signal implementation of a re-configurable polychronous spiking neural network capable of storing and recalling spatio-temporal patterns. The proposed neural network contains one neuron array and one axon array. Spike Timing Dependent Delay Plasticity is used to fine-tune delays and add dynamics to the network. In our mixed-signal implementation, the neurons and axons have been implemented as both analog and digital circuits. The system thus consists of one FPGA, containing the digital neuron array and the digital axon array, and one analog IC containing the analog neuron array and the analog axon array. The system can be easily configured to use different combinations of each. We present and discuss the experimental results of all combinations of the analog and digital axon arrays and the analog and digital neuron arrays. The test results show that the proposed neural network is capable of successfully recalling more than 85% of stored patterns using both analog and digital circuits. PMID:24672422
High-performance reconfigurable coincidence counting unit based on a field programmable gate array.
Park, Byung Kwon; Kim, Yong-Su; Kwon, Osung; Han, Sang-Wook; Moon, Sung
2015-05-20
We present a high-performance reconfigurable coincidence counting unit (CCU) using a low-end field programmable gate array (FPGA) and peripheral circuits. Because of the flexibility guaranteed by the FPGA program, we can easily change system parameters, such as internal input delays, coincidence configurations, and the coincidence time window. In spite of a low-cost implementation, the proposed CCU architecture outperforms previous ones in many aspects: it has 8 logic inputs and 4 coincidence outputs that can measure up to eight-fold coincidences. The minimum coincidence time window and the maximum input frequency are 0.47 ns and 163 MHz, respectively. The CCU will be useful in various experimental research areas, including the field of quantum optics and quantum information.
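What a two-fold coincidence count means can be shown with a small software model. This is only a sketch: the matching rule (each event is used at most once, and two events coincide when their timestamps differ by no more than the window), the optional input delay on channel B, and the example timestamps are all assumptions and do not describe the CCU's internal logic.

```python
def count_coincidences(times_a, times_b, window_ns, delay_b_ns=0.0):
    """Count two-fold coincidences between two sorted lists of timestamps (ns)."""
    # Apply the (hypothetical) configurable input delay to channel B.
    shifted_b = [t + delay_b_ns for t in times_b]
    i = j = count = 0
    while i < len(times_a) and j < len(shifted_b):
        diff = times_a[i] - shifted_b[j]
        if abs(diff) <= window_ns:
            count += 1  # both events fall within the coincidence window
            i += 1
            j += 1
        elif diff < 0:
            i += 1      # event on A is too early to match anything left on B
        else:
            j += 1      # event on B is too early to match anything left on A
    return count


times_a = [10.0, 50.0, 90.0]
times_b = [10.3, 49.0, 90.4]
n_coinc = count_coincidences(times_a, times_b, window_ns=0.47)
```

Here the first and third pairs are closer than the 0.47 ns window quoted above, while the second pair is 1 ns apart and is rejected.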
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPUs, Graphics Processing Units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with a high-level language without deep FPGA domain knowledge. Benchmarking of an OpenCL-based framework is an effective way of analyzing the performance of a system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using the Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and a Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices.
In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.
FPGA-Based Filterbank Implementation for Parallel Digital Signal Processing
NASA Technical Reports Server (NTRS)
Berner, Stephan; DeLeon, Phillip
1999-01-01
One approach to parallel digital signal processing decomposes a high bandwidth signal into multiple lower bandwidth (rate) signals by an analysis bank. After processing, the subband signals are recombined into a fullband output signal by a synthesis bank. This paper describes an implementation of the analysis and synthesis banks using Field Programmable Gate Arrays (FPGAs).
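The analysis/synthesis structure can be illustrated with the simplest two-channel case. The abstract does not say which filters are used, so this sketch assumes the Haar pair (pairwise average and pairwise difference), chosen only because it gives exact reconstruction in a few lines. The function names are hypothetical and the input length must be even.

```python
def analysis(x):
    """Split x into a low band and a high band, each at half the input rate."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def synthesis(low, high):
    """Recombine the two subband signals into the full-rate signal."""
    y = []
    for l, h in zip(low, high):
        y.extend([l + h, l - h])
    return y


x = [1.0, 3.0, 2.0, 6.0]
low, high = analysis(x)
y = synthesis(low, high)
```

Each subband runs at half the input rate, so the two bands can be processed by separate, slower units before `synthesis` restores the full-rate signal.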
NASA Technical Reports Server (NTRS)
Berg, M.; Kim, H.; Phan, A.; Seidleck, C.; LaBel, K.; Pellish, J.; Campola, M.
2015-01-01
Space applications are complex systems that require intricate trade analyses for optimum implementations. We focus on a subset of the trade process, using classical reliability theory and SEU data, to illustrate appropriate TMR scheme selection.
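The classical reliability argument for triple modular redundancy (TMR) can be sketched as follows. With three independent copies of a module, each working with probability R, and an ideal majority voter, the system works when at least two copies work, giving R_TMR = 3R² − 2R³. The perfect-voter and independence assumptions, the function names, and the example values are illustrative and are not taken from the paper's SEU data.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority voter, as used at the output of a TMR triplet."""
    return (a & b) | (a & c) | (b & c)


def tmr_reliability(r):
    """Reliability of a TMR system with an ideal voter and independent modules."""
    # Probability that at least two of the three modules are working.
    return 3 * r**2 - 2 * r**3


voted = majority_vote(0b1010, 0b1010, 0b0110)  # the third copy has two upset bits
high_r = tmr_reliability(0.9)
low_r = tmr_reliability(0.4)
```

The formula also shows why scheme selection is a trade: TMR improves on a single module only when R is above 0.5, and below that point triplication makes the system less reliable.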
Rapid and highly integrated FPGA-based Shack-Hartmann wavefront sensor for adaptive optics system
NASA Astrophysics Data System (ADS)
Chen, Yi-Pin; Chang, Chia-Yuan; Chen, Shean-Jen
2018-02-01
In this study, a field programmable gate array (FPGA)-based Shack-Hartmann wavefront sensor (SHWS) programmed in LabVIEW can be highly integrated into customized applications such as an adaptive optics system (AOS) for performing real-time wavefront measurement. Further, a Camera Link frame grabber with an embedded FPGA is adopted to increase the speed at which the sensor reacts to wavefront variation, given its advantage of offering the highest data transmission bandwidth. Instead of waiting for a frame image to be captured by the FPGA, the Shack-Hartmann algorithm is implemented as parallel processing blocks so that image data transmission is synchronized with wavefront reconstruction. In addition, we design a mechanism to control the deformable mirror in the same FPGA and verify the Shack-Hartmann sensor speed by controlling the frequency of the deformable mirror's dynamic surface deformation. Currently, this FPGA-based SHWS design achieves a 266 Hz cyclic speed, limited by the camera frame rate, while leaving 40% of the logic slices available for additional flexible design.
Design of a system based on DSP and FPGA for video recording and replaying
NASA Astrophysics Data System (ADS)
Kang, Yan; Wang, Heng
2013-08-01
This paper presents a video recording and replaying system built on a Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA) architecture. The system achieves encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals that are displayed on a monitor during the navigation of airplanes and ships. In the architecture, the DSP is the main processor, used for the large amount of complicated calculation involved in digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, the Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is used to implement a seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) buffer. This transfer mode avoids the data-transfer bottleneck and simplifies the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level-matching chips are used to implement the Advanced Technology Attachment (ATA) protocol on the physical layer of the interface to an Integrated Drive Electronics (IDE) hard disk (HD), which offers high-speed data access and does not rely on a computer. The main functions of the logic on the FPGA are described and screenshots of the behavioral simulation are provided. In the DSP program design, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM without intervention by the CPU, saving CPU time and leaving its computing performance fully available. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways of achieving high code performance are briefly presented. The data processing capability of the system is satisfactory, and the smoothness of the replayed video is acceptable. By virtue of its design flexibility and reliable operation, the DSP- and FPGA-based video recording and replaying system has considerable prospects in post-event analysis, simulated exercises and so forth.
Testing Microshutter Arrays Using Commercial FPGA Hardware
NASA Technical Reports Server (NTRS)
Rapchun, David
2008-01-01
NASA is developing micro-shutter arrays for the Near Infrared Spectrometer (NIRSpec) instrument on the James Webb Space Telescope (JWST). These micro-shutter arrays allow NIRSpec to do Multi Object Spectroscopy, a key part of the mission. Each array consists of 62414 individual 100 x 200 micron shutters. These shutters are magnetically opened and held electrostatically. Individual shutters are then programmatically closed using a simple row/column addressing technique. A common approach to provide these data/clock patterns is to use a Field Programmable Gate Array (FPGA). Such devices require complex VHSIC Hardware Description Language (VHDL) programming and custom electronic hardware. Due to JWST's rapid schedule for the development of the micro-shutters, rapid changes to the FPGA code were required to accommodate new approaches being discovered to optimize array performance. Such rapid changes simply could not be made using conventional VHDL programming. Subsequently, National Instruments introduced an FPGA product that could be programmed through a LabVIEW interface. Because LabVIEW programming is considerably easier than VHDL programming, this method was adopted and brought success. The software/hardware allowed rapid changes to the FPGA code and timely results on new micro-shutter array performance. As a result, the project saved numerous labor hours and money.
An ultra-low cost NMR device with arbitrary pulse programming
NASA Astrophysics Data System (ADS)
Chen, Hsueh-Ying; Kim, Yaewon; Nath, Pulak; Hilty, Christian
2015-06-01
Ultra-low cost, general purpose electronics boards featuring microprocessors or field programmable gate arrays (FPGA) are reaching capabilities sufficient for direct implementation of NMR spectrometers. We demonstrate a spectrometer based on such a board, implemented with a minimal need for the addition of custom electronics and external components. This feature allows such a spectrometer to be readily implemented using typical knowledge present in an NMR laboratory. With FPGA technology, digital tasks are performed with precise timing, without the limitation of predetermined hardware function. In this case, the FPGA is used for programming of arbitrarily timed pulse sequence events, and to digitally generate required frequencies. Data acquired from a 0.53 T permanent magnet serves as a demonstration of the flexibility of pulse programming for diverse experiments. Pulse sequences applied include a spin-lattice relaxation measurement using a pulse train with small-flip angle pulses, and a Carr-Purcell-Meiboom-Gill experiment with phase cycle. Mixing of NMR signals with a digitally generated, 4-step phase-cycled reference frequency is further implemented to achieve sequential quadrature detection. The flexibility in hardware implementation permits tailoring this type of spectrometer for applications such as relaxometry, polarimetry, diffusometry or NMR based magnetometry.
FPGA implementation of neuro-fuzzy system with improved PSO learning.
Karakuzu, Cihan; Karakaya, Fuat; Çavuşlu, Mehmet Ali
2016-07-01
This paper presents the first hardware implementation of neuro-fuzzy system (NFS) with its metaheuristic learning ability on field programmable gate array (FPGA). Metaheuristic learning of NFS for all of its parameters is accomplished by using the improved particle swarm optimization (iPSO). As a second novelty, a new functional approach, which does not require any memory and multiplier usage, is proposed for the Gaussian membership functions of NFS. NFS and its learning using iPSO are implemented on Xilinx Virtex5 xc5vlx110-3ff1153 and efficiency of the proposed implementation tested on two dynamic system identification problems and licence plate detection problem as a practical application. Results indicate that proposed NFS implementation and membership function approximation is as effective as the other approaches available in the literature but requires less hardware resources. Copyright © 2016 Elsevier Ltd. All rights reserved.
Programmable logic controller performance enhancement by field programmable gate array based design.
Patel, Dhruv; Bhatt, Jignesh; Trivedi, Sanjay
2015-01-01
The PLC, the core element of modern automation systems, exhibits limitations such as slow speed and poor scan time because of its serial execution. An improved PLC design using an FPGA is proposed, based on a parallel execution mechanism, to enhance performance and flexibility. ModelSim is used as the simulation platform and VHDL is used to translate, integrate and implement the logic circuit in the FPGA. A Xilinx Spartan kit is used for implementation and testing, and VB is used for GUI development. Salient merits of the design include cost-effectiveness, miniaturization, user-friendliness and simplicity, along with lower power consumption, shorter scan time and higher speed. Various functionalities and applications, such as a typical PLC and an industrial alarm annunciator, have been developed and successfully tested. Results of simulation, design and implementation are reported. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Adaptive Proactive Inhibitory Control for Embedded Real-Time Applications
Yang, Shufan; McGinnity, T. Martin; Wong-Lin, KongFatt
2012-01-01
Psychologists have studied the inhibitory control of voluntary movement for many years. In particular, the countermanding of an impending action has been extensively studied. In this work, we propose a neural mechanism for adaptive inhibitory control in a firing-rate type model based on current findings in animal electrophysiological and human psychophysical experiments. We then implement this model on a field-programmable gate array (FPGA) prototyping system, using dedicated real-time hardware circuitry. Our results show that the FPGA-based implementation can run in real-time while achieving behavioral performance qualitatively suggestive of the animal experiments. Implementing such biological inhibitory control in an embedded device can lead to the development of control systems that may be used in more realistic cognitive robotics or in neural prosthetic systems aiding human movement control. PMID:22701420
Design of time interval generator based on hybrid counting method
NASA Astrophysics Data System (ADS)
Yao, Yuan; Wang, Zhaoqi; Lu, Houbing; Chen, Lian; Jin, Ge
2016-10-01
Time Interval Generators (TIGs) are frequently used for the characterization or timing operations of instruments in particle physics experiments. Though some "off-the-shelf" TIGs can be employed, the need for a custom test system or control system makes TIGs implemented in a programmable device desirable. The feasibility of using Field Programmable Gate Arrays (FPGAs) to implement particle physics instrumentation has been validated in the design of Time-to-Digital Converters (TDCs) for precise time measurement. The FPGA-TDC technique is based on Tapped Delay Line (TDL) architectures, whose delay cells are as short as a few tens of picoseconds. In this case, FPGA-based TIGs with a high-resolution delay step are preferable, allowing customized particle physics instrumentation and other utilities to be implemented on the same FPGA device. A hybrid counting method for designing TIGs with both high resolution and wide range is presented in this paper. The combination of two different counting methods to realize an integrable TIG is described in detail. A specially designed multiplexer for tap selection is introduced in particular. The special structure of the multiplexer is devised to minimize the differences in additional delay caused by the unpredictable routing from different taps to the output. A Kintex-7 FPGA is used for the hybrid counting-based implementation of a TIG, providing a resolution of up to 11 ps and an interval range of up to 8 s.
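One common way to obtain both fine resolution and wide range is to split the requested interval into a coarse part counted in clock cycles and a fine part realized by selecting a delay-line tap. The sketch below illustrates that split only; the paper's actual scheme may differ. The 4 ns clock period is an assumed value, the 11 ps step is taken from the reported resolution, and the function names are hypothetical.

```python
CLOCK_PERIOD_PS = 4_000   # assumed coarse clock period (hypothetical, 250 MHz)
TAP_DELAY_PS = 11         # fine step, taken from the reported 11 ps resolution


def split_interval(interval_ps):
    """Split a requested interval into coarse clock cycles and fine delay taps."""
    coarse_cycles, remainder_ps = divmod(interval_ps, CLOCK_PERIOD_PS)
    fine_taps = round(remainder_ps / TAP_DELAY_PS)   # nearest tap to the remainder
    return coarse_cycles, fine_taps


def generated_interval(coarse_cycles, fine_taps):
    """Interval actually produced by the coarse counter plus the selected tap."""
    return coarse_cycles * CLOCK_PERIOD_PS + fine_taps * TAP_DELAY_PS


requested_ps = 1_000_123
cycles, taps = split_interval(requested_ps)
error_ps = generated_interval(cycles, taps) - requested_ps
```

With this split the counter width sets the range, while the quantization error stays within about half a tap delay.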
Design and implementation of a programming circuit in radiation-hardened FPGA
NASA Astrophysics Data System (ADS)
Lihua, Wu; Xiaowei, Han; Yan, Zhao; Zhongli, Liu; Fang, Yu; Chen, Stanley L.
2011-08-01
We present a novel programming circuit used in our radiation-hardened field programmable gate array (FPGA) chip. This circuit provides the ability to write user-defined configuration data into an FPGA and then read it back. The proposed circuit adopts the direct-access programming point scheme instead of the typical long token shift register chain. It not only saves area but also provides more flexible configuration operations. By configuring the proposed partial configuration control register, our smallest configuration section can be conveniently configured as a single data and a flexible partial configuration can be easily implemented. The hierarchical simulation scheme, optimization of the critical path and the elaborate layout plan make this circuit work well. Also, the radiation hardened by design programming point is introduced. This circuit has been implemented in a static random access memory (SRAM)-based FPGA fabricated by a 0.5 μm partial-depletion silicon-on-insulator CMOS process. The function test results of the fabricated chip indicate that this programming circuit successfully realizes the desired functions in the configuration and read-back. Moreover, the radiation test results indicate that the programming circuit has total dose tolerance of 1 × 10⁵ rad(Si), dose rate survivability of 1.5 × 10¹¹ rad(Si)/s and neutron fluence immunity of 1 × 10¹⁴ n/cm².
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Liu, Chong
2016-10-01
The common solution for a field programmable gate array (FPGA)-based time-to-digital converter (TDC) is to construct a tapped delay line (TDL) for time interpolation to yield a sub-clock time resolution. The granularity and uniformity of the delay elements of the TDL determine the TDC time resolution. In this paper, we propose a dual-sampling TDL architecture and a bin decimation method that make the delay elements as small and uniform as possible, so that the implemented TDCs can achieve a high time resolution beyond the intrinsic cell delay. Two identical, fully hardware-based TDCs were implemented in a Xilinx UltraScale FPGA for performance evaluation. For fixed time intervals in the range from 0 to 440 ns, the average time-interval RMS resolution measured with the two TDCs is 4.2 ps, so the timestamp resolution of a single TDC is derived as 2.97 ps. The maximum hit rate of the TDC is as high as half the system clock rate of the FPGA, namely 250 MHz in our demo prototype. Because the conventional online bin-by-bin calibration is not needed, the implementation of the proposed TDC is straightforward and relatively resource-saving.
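The step from 4.2 ps to 2.97 ps follows if the two TDCs are assumed to contribute independent, equal timing errors, so that their variances add in the interval measurement. The short check below works through that arithmetic; the variable names are illustrative, and the clock-rate line simply restates the "half the system clock rate" remark.

```python
import math

interval_rms_ps = 4.2   # RMS resolution of an interval measured with two TDCs

# Independent, equal errors: sigma_interval**2 = 2 * sigma_single**2
single_tdc_rms_ps = interval_rms_ps / math.sqrt(2)

max_hit_rate_mhz = 250.0
system_clock_mhz = 2 * max_hit_rate_mhz   # hit rate is half the system clock rate
```

Rounded to two decimals, `single_tdc_rms_ps` matches the 2.97 ps quoted in the abstract.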
Real-time implementation of a multispectral mine target detection algorithm
NASA Astrophysics Data System (ADS)
Samson, Joseph W.; Witter, Lester J.; Kenton, Arthur C.; Holloway, John H., Jr.
2003-09-01
Spatial-spectral anomaly detection (the "RX Algorithm") has been exploited on the USMC's Coastal Battlefield Reconnaissance and Analysis (COBRA) Advanced Technology Demonstration (ATD) and several associated technology base studies, and has been found to be a useful method for the automated detection of surface-emplaced antitank land mines in airborne multispectral imagery. RX is a complex image processing algorithm that involves the direct spatial convolution of a target/background mask template over each multispectral image, coupled with a spatially variant background spectral covariance matrix estimation and inversion. The RX throughput on the ATD was about 38X real time using a single Sun UltraSparc system. A goal to demonstrate RX in real-time was begun in FY01. We now report the development and demonstration of a Field Programmable Gate Array (FPGA) solution that achieves a real-time implementation of the RX algorithm at video rates using COBRA ATD data. The approach uses an Annapolis Microsystems Firebird PMC card containing a Xilinx XCV2000E FPGA with over 2,500,000 logic gates and 18MBytes of memory. A prototype system was configured using a Tek Microsystems VME board with dual-PowerPC G4 processors and two PMC slots. The RX algorithm was translated from its C programming implementation into the VHDL language and synthesized into gates that were loaded into the FPGA. The VHDL/synthesizer approach allows key RX parameters to be quickly changed and a new implementation automatically generated. Reprogramming the FPGA is done rapidly and in-circuit. Implementation of the RX algorithm in a single FPGA is a major first step toward achieving real-time land mine detection.
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C-based programming language for developing portable code on different platforms such as CPUs, graphics processing units (GPUs), digital signal processors (DSPs) and field programmable gate arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow in favor of a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes FPGA-based development more accessible to software users as the need for hybrid computing using CPUs and FPGAs increases. It can also significantly reduce hardware development time, as users can evaluate different ideas in a high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and a Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in detail. The Appendix lists the kernel source code.
NASA Astrophysics Data System (ADS)
Sklavos, N.; Selimis, G.; Koufopavlou, O.
2005-01-01
The explosive growth of the internet and consumer demand for mobility have fuelled the exponential growth of wireless communications and networks. Mobile users want access to services and information, from both the internet and personal devices, from a range of locations without the use of a cable medium. IEEE 802.11 is one of the most widely used wireless standards today. The amount of access and mobility in wireless networks requires a security infrastructure that protects communication within the network. The security of this protocol is based on the wired equivalent privacy (WEP) scheme. Currently, all IEEE 802.11 market products support WEP. Recently, however, the 802.11i working group introduced the advanced encryption standard (AES) as the security scheme for future IEEE 802.11 applications. In this paper, the hardware integrations of WEP and AES are studied. A field programmable gate array (FPGA) device has been used as the hardware implementation platform for a fair comparison between the two security schemes. Measurements of the FPGA implementation cost, operating frequency, power consumption and performance are given.
Small Microprocessor for ASIC or FPGA Implementation
NASA Technical Reports Server (NTRS)
Kleyner, Igor; Katz, Richard; Blair-Smith, Hugh
2011-01-01
A small microprocessor, suitable for use in applications in which high reliability is required, was designed to be implemented in either an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). The design is based on commercial microprocessor architecture, making it possible to use available software development tools and thereby to implement the microprocessor at relatively low cost. The design features enhancements, including trapping during execution of illegal instructions. The internal structure of the design yields relatively high performance, with a significant decrease, relative to other microprocessors that perform the same functions, in the number of microcycles needed to execute macroinstructions. The problem meant to be solved in designing this microprocessor was to provide a modest level of computational capability in a general-purpose processor while adding as little as possible to the power demand, size, and weight of a system into which the microprocessor would be incorporated. As designed, this microprocessor consumes very little power and occupies only a small portion of a typical modern ASIC or FPGA. The microprocessor operates at a rate of about 4 million instructions per second with clock frequency of 20 MHz.
Economical Implementation of a Filter Engine in an FPGA
NASA Technical Reports Server (NTRS)
Kowalski, James E.
2009-01-01
A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. 
For these and other applications, it could be advantageous to combine compact FPGA digital filter implementations with other application-specific logic implementations on single integrated-circuit chips. An FPGA could readily be tailored to implement a variety of filters because the filter coefficients would be loaded into memory at startup.
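The resource-sharing idea can be sketched in software as follows. This is a behavioral illustration under stated assumptions, not the patented logic design: a single-input, single-output state-space filter, one shared multiply-accumulate routine (`shared_mac`, a hypothetical name) and a rotating `deque` standing in for the shift register, so that operands are always read from the same position.

```python
from collections import deque


def shared_mac(coeffs, operand_reg):
    """One multiply-accumulate unit: one product per 'clock' from the register head."""
    acc = 0.0
    for coeff in coeffs:
        acc += coeff * operand_reg[0]   # operand always read from the same position
        operand_reg.rotate(-1)          # shift the next element to the head
    return acc                          # a full rotation restores the original order


def filter_step(A, B, C, D, state, u):
    """Compute y = C x + D u and x_next = A x + B u, reusing the same MAC and register."""
    operands = deque(list(state) + [u])            # register holds [x; u]
    y = shared_mac(C + [D], operands)
    next_state = [shared_mac(row + [b], operands) for row, b in zip(A, B)]
    return next_state, y


# First-order low-pass example (assumed coefficients), driven by a unit step
A, B, C, D = [[0.5]], [0.5], [1.0], 0.0
state, outputs = [0.0], []
for _ in range(3):
    state, y = filter_step(A, B, C, D, state, 1.0)
    outputs.append(y)
```

Each call to `shared_mac` takes as many "clock cycles" as there are coefficients, which is why the design needs a clock that is fast relative to the sample rate.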
Field-Programmable Gate Array Computer in Structural Analysis: An Initial Exploration
NASA Technical Reports Server (NTRS)
Singleterry, Robert C., Jr.; Sobieszczanski-Sobieski, Jaroslaw; Brown, Samuel
2002-01-01
This paper reports on an initial assessment of using a Field-Programmable Gate Array (FPGA) computational device as a new tool for solving structural mechanics problems. A FPGA is an assemblage of binary gates arranged in logical blocks that are interconnected via software in a manner dependent on the algorithm being implemented and can be reprogrammed thousands of times per second. In effect, this creates a computer specialized for the problem that automatically exploits all the potential for parallel computing intrinsic in an algorithm. This inherent parallelism is the most important feature of the FPGA computational environment. It is therefore important that if a problem offers a choice of different solution algorithms, an algorithm of a higher degree of inherent parallelism should be selected. It is found that in structural analysis, an 'analog computer' style of programming, which solves problems by direct simulation of the terms in the governing differential equations, yields a more favorable solution algorithm than current solution methods. This style of programming is facilitated by a 'drag-and-drop' graphic programming language that is supplied with the particular type of FPGA computer reported in this paper. Simple examples in structural dynamics and statics illustrate the solution approach used. The FPGA system also allows linear scalability in computing capability. As the problem grows, the number of FPGA chips can be increased with no loss of computing efficiency due to data flow or algorithmic latency that occurs when a single problem is distributed among many conventional processors that operate in parallel. This initial assessment finds the FPGA hardware and software to be in their infancy in regard to the user conveniences; however, they have enormous potential for shrinking the elapsed time of structural analysis solutions if programmed with algorithms that exhibit inherent parallelism and linear scalability. 
This potential warrants further development of FPGA-tailored algorithms for structural analysis.
HDL Based FPGA Interface Library for Data Acquisition and Multipurpose Real Time Algorithms
NASA Astrophysics Data System (ADS)
Fernandes, Ana M.; Pereira, R. C.; Sousa, J.; Batista, A. J. N.; Combo, A.; Carvalho, B. B.; Correia, C. M. B. A.; Varandas, C. A. F.
2011-08-01
The inherent parallelism of the logic resources, the flexibility in its configuration and the performance at high processing frequencies make the field programmable gate array (FPGA) the most suitable device for both real-time algorithm processing and data transfer in instrumentation modules. Moreover, the reconfigurability of these FPGA-based modules enables different applications to be run on the same module. When using a reconfigurable module for various applications, the availability of a common interface library for easier implementation of the algorithms on the FPGA leads to more efficient development. The FPGA configuration is usually specified in a hardware description language (HDL) or another higher-level descriptive language. The critical paths, such as the management of internal hardware clocks, which require deep knowledge of the module behavior, should be implemented in HDL to optimize the timing constraints. The common interface library should include these critical paths, freeing the application designer from hardware complexity and allowing any of the available high-level abstraction languages to be chosen for the algorithm implementation. For this purpose, modular Verilog code was developed for the Virtex 4 FPGA of the in-house Transient Recorder and Processor (TRP) hardware module, based on the Advanced Telecommunications Computing Architecture (ATCA), with eight channels sampling at up to 400 MSamples/s (MSPS). The TRP was designed to perform real-time Pulse Height Analysis (PHA), Pulse Shape Discrimination (PSD) and Pile-Up Rejection (PUR) algorithms at a high count rate (a few Mevent/s). A brief description of this modular code is presented, and examples of its use as an interface with end-user algorithms, including a PHA with PUR, are described.
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
NASA Astrophysics Data System (ADS)
Hamada, Tsuyoshi; Fukushige, Toshiyuki; Kawai, Atsushi; Makino, Junichiro
2000-10-01
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and "traditional" GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter relies on a hardwired pipeline processor specialized to gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions, but also other forms of interactions, such as the van der Waals force, hydrodynamical interactions in the SPH calculation, and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interactions similar to that of GRAPE-3. One pipeline is fitted into a single FPGA chip, operated at 16 MHz clock. Thus, for gravitational interactions, PROGRAPE-1 provided a speed of 0.96 Gflops-equivalent. PROGRAPE will prove to be useful for a wide-range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.
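For orientation, the kind of per-pair calculation a gravitational pipeline evaluates can be sketched as below. This is a plain floating-point illustration with G = 1 and an assumed softening length `eps`; it does not model the reduced-precision number formats of GRAPE hardware, and the function name is hypothetical.

```python
def gravity_pipeline(pos_i, sources, eps=0.0):
    """Sum the softened gravitational acceleration on particle i from all sources.

    pos_i   : (x, y, z) of the particle receiving the force
    sources : iterable of (x, y, z, mass) for the particles exerting the force
    eps     : softening length (assumed parameter); must be > 0 if a source
              coincides with pos_i, otherwise the division is by zero
    """
    ax = ay = az = 0.0
    for xj, yj, zj, mj in sources:          # one interaction per pipeline "clock"
        dx, dy, dz = xj - pos_i[0], yj - pos_i[1], zj - pos_i[2]
        r2 = dx * dx + dy * dy + dz * dz + eps * eps
        inv_r3 = r2 ** -1.5
        ax += mj * dx * inv_r3
        ay += mj * dy * inv_r3
        az += mj * dz * inv_r3
    return ax, ay, az


# A single source of mass 4 at distance 2 along x gives |a| = 4 / 2**2 = 1
accel = gravity_pipeline((0.0, 0.0, 0.0), [(2.0, 0.0, 0.0, 4.0)])
```

On a reconfigurable machine, the body of the loop is the part that would be swapped out to evaluate a different interaction law.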
Optimized FPGA Implementation of the Thyroid Hormone Secretion Mechanism Using CAD Tools.
Alghazo, Jaafar M
2017-02-01
The goal of this paper is to implement the secretion mechanism of the Thyroid Hormone (TH), based on bio-mathematical differential equations (DEs), on an FPGA chip. A Hardware Description Language (HDL) is used to develop a behavioral model of the mechanism derived from the DEs. The Thyroid Hormone secretion mechanism is simulated with the interaction of the related stimulating and inhibiting hormones. Synthesis of the simulation is done with the aid of CAD tools and downloaded onto a Field Programmable Gate Array (FPGA) chip. The chip output shows behavior identical to that of the designed algorithm in simulation. It is concluded that the chip mimics the Thyroid Hormone secretion mechanism. The chip, operating in real time, is a computer-independent, stand-alone system.
Systems and methods for detecting a failure event in a field programmable gate array
NASA Technical Reports Server (NTRS)
Ng, Tak-Kwong (Inventor); Herath, Jeffrey A. (Inventor)
2009-01-01
An embodiment generally relates to a method of self-detecting an error in a field programmable gate array (FPGA). The method includes writing a signature value into a signature memory in the FPGA and determining a conclusion of a configuration refresh operation in the FPGA. The method also includes reading an outcome value from the signature memory.
CoNNeCT Baseband Processor Module
NASA Technical Reports Server (NTRS)
Yamamoto, Clifford K; Jedrey, Thomas C.; Gutrich, Daniel G.; Goodpasture, Richard L.
2011-01-01
A document describes the CoNNeCT Baseband Processor Module (BPM) based on an updated processor, memory technology, and field-programmable gate arrays (FPGAs). The BPM was developed from a requirement to provide sufficient computing power and memory storage to conduct experiments for a Software Defined Radio (SDR) to be implemented. The flight SDR uses the AT697 SPARC processor with on-chip data and instruction cache. The non-volatile memory has been increased from a 20-Mbit EEPROM (electrically erasable programmable read only memory) to a 4-Gbit Flash, managed by the RTAX2000 Housekeeper, allowing more programs and FPGA bit-files to be stored. The volatile memory has been increased from a 20-Mbit SRAM (static random access memory) to a 1.25-Gbit SDRAM (synchronous dynamic random access memory), providing additional memory space for more complex operating systems and programs to be executed on the SPARC. All memory is EDAC (error detection and correction) protected, while the SPARC processor implements fault protection via TMR (triple modular redundancy) architecture. Further capability over prior BPM designs includes the addition of a second FPGA to implement features beyond the resources of a single FPGA. Both FPGAs are implemented with Xilinx Virtex-II and are interconnected by a 96-bit bus to facilitate data exchange. Dedicated 1.25-Gbit SDRAMs are wired to each Xilinx FPGA to accommodate high rate data buffering for SDR applications as well as independent SpaceWire interfaces. The RTAX2000 manages scrub and configuration of each Xilinx.
Field-programmable gate array-controlled sweep velocity-locked laser pulse generator
NASA Astrophysics Data System (ADS)
Chen, Zhen; Hefferman, Gerald; Wei, Tao
2017-05-01
A field-programmable gate array (FPGA)-controlled sweep velocity-locked laser pulse generator (SV-LLPG) design based on an all-digital phase-locked loop (ADPLL) is proposed. A distributed feedback laser with modulated injection current was used as a swept-frequency laser source. An open-loop predistortion modulation waveform was calibrated using a feedback iteration method to initially improve frequency sweep linearity. An ADPLL control system was then implemented using an FPGA to lock the output of a Mach-Zehnder interferometer that was directly proportional to laser sweep velocity to an on-board system clock. Using this system, linearly chirped laser pulses with a sweep bandwidth of 111.16 GHz were demonstrated. Further testing evaluating the sensing utility of the system was conducted. In this test, the SV-LLPG served as the swept laser source of an optical frequency-domain reflectometry system used to interrogate a subterahertz range fiber structure (sub-THz-FS) array. A static strain test was then conducted and linear sensor results were observed.
NASA Astrophysics Data System (ADS)
Jackson, Christopher Robert
"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other was developed using the graphics processing unit (GPU) of an NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
Optimization on fixed low latency implementation of the GBT core in FPGA
Chen, K.; Chen, H.; Wu, W.; ...
2017-07-11
In the upgrade of the ATLAS experiment, the front-end electronics components are subjected to a large radiation background. Meanwhile high speed optical links are required for the data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project are designed by CERN to support the 4.8 Gbps line rate bidirectional high-speed data transmission which is called GBT link. In the ATLAS upgrade, besides the link with on-detector, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end, correspondingly for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launches the GBT-FPGA project to provide examples in different types of FPGA. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system is used to interface the front end electronics of several ATLAS subsystems. The GBT link is used between them, to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations on the GBT-FPGA IP core are introduced, to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support of both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. Finally, the system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
Optimization on fixed low latency implementation of the GBT core in FPGA
NASA Astrophysics Data System (ADS)
Chen, K.; Chen, H.; Wu, W.; Xu, H.; Yao, L.
2017-07-01
In the upgrade of ATLAS experiment [1], the front-end electronics components are subjected to a large radiation background. Meanwhile high speed optical links are required for the data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project are designed by CERN to support the 4.8 Gbps line rate bidirectional high-speed data transmission which is called GBT link [2]. In the ATLAS upgrade, besides the link with on-detector, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end, correspondingly for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launches the GBT-FPGA project to provide examples in different types of FPGA [3]. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system [4, 5] is used to interface the front-end electronics of several ATLAS subsystems. The GBT link is used between them, to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations on the GBT-FPGA IP core are introduced, to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support of both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. The system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
NASA Astrophysics Data System (ADS)
Yu, Shi Jing; Fajeau, Emma; Liu, Lin Qiao; Jones, David J.; Madison, Kirk W.
2018-02-01
In this work, we address the advantages, limitations, and technical subtleties of employing field programmable gate array (FPGA)-based digital servos for high-bandwidth feedback control of lasers in atomic, molecular, and optical physics experiments. Specifically, we provide the results of benchmark performance tests in experimental setups including noise, bandwidth, and dynamic range for two digital servos built with low and mid-range priced FPGA development platforms. The digital servo results are compared to results obtained from a commercially available state-of-the-art analog servo using the same plant for control (intensity stabilization). The digital servos have feedback bandwidths of 2.5 MHz, limited by the total signal latency, and we demonstrate improvements beyond the transfer function offered by the analog servo including a three-pole filter and a two-pole filter with phase compensation to suppress resonances. We also discuss limitations of our FPGA-servo implementation and general considerations when designing and using digital servos.
Diagnostic layer integration in FPGA-based pipeline measurement systems for HEP experiments
NASA Astrophysics Data System (ADS)
Pozniak, Krzysztof T.
2007-08-01
Integrated triggering and data acquisition systems for high energy physics experiments may be considered as fast, multichannel, synchronous, distributed, pipeline measurement systems. A considerable extension of functional, technological and monitoring demands, which has recently been imposed on them, forced a common usage of large field-programmable gate array (FPGA), digital signal processing-enhanced matrices and fast optical transmission for their realization. This paper discusses modelling, design, realization and testing of pipeline measurement systems. A distribution of synchronous data stream flows is considered in the network. A general functional structure of a single network node is presented. A suggested, novel block structure of the node model facilitates full implementation in the FPGA chip, circuit standardization and parametrization, as well as integration of functional and diagnostic layers. A general method for pipeline system design was derived. This method is based on a unified model of the synchronous data network node. A few examples of practically realized, FPGA-based, pipeline measurement systems were presented. The described systems were applied in ZEUS and CMS.
A Radiation Dosimeter Concept for the Lunar Surface Environment
NASA Technical Reports Server (NTRS)
Adams, James H.; Christl, Mark J.; Watts, John; Kuznetsov, Eugeny N.; Parnell, Thomas A.; Pendleton, Geoff N.
2007-01-01
A novel silicon detector configuration for radiation dose measurements in an environment where solar energetic particles are of most concern is described. The dosimeter would also measure the dose from galactic cosmic rays. In the lunar environment a large range in particle flux and ionization density must be measured and converted to dose equivalent. This could be accomplished with a thick (e.g. 2 mm) silicon detector segmented into cubic volume elements ("voxels") followed by a second, thin monolithic silicon detector. The electronics needed to implement this detector concept include analog signal processors (ASIC) and a field programmable gate array (FPGA) for data accumulation and conversion to linear energy transfer (LET) spectra and to dose-equivalent (Sievert). Currently available commercial ASICs and FPGAs are suitable for implementing the analog and digital systems.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
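As a rough software illustration of what a weight vector updating unit has to compute, the following is a minimal sketch of one Generalized Hebbian Algorithm step (Sanger's rule). The function name `gha_step`, the learning-rate parameter `eta`, and the list-based data layout are assumptions made for illustration; they are not taken from the hardware design described in the abstract.

```python
def gha_step(weights, x, eta):
    """Apply one Generalized Hebbian Algorithm (Sanger's rule) update.

    weights: list of m weight vectors, each a list of n floats
    x:       input vector, a list of n floats
    eta:     learning rate (hypothetical parameter name)
    Returns a new list of updated weight vectors.
    """
    # Outputs y_j = w_j . x, computed with the weights before the update.
    y = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]

    updated = []
    recon = [0.0] * len(x)  # running sum of y_k * w_k for k <= j
    for w, yj in zip(weights, y):
        recon = [r + yj * wi for r, wi in zip(recon, w)]
        # delta w_j = eta * y_j * (x - sum_{k<=j} y_k * w_k)
        updated.append([wi + eta * yj * (xi - ri)
                        for wi, xi, ri in zip(w, x, recon)])
    return updated
```

Because every weight vector goes through the same multiply-accumulate pattern, a single datapath can in principle be time-shared across the vectors, which is consistent with the circuit sharing the abstract mentions.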
Bio-Inspired Controller on an FPGA Applied to Closed-Loop Diaphragmatic Stimulation
Zbrzeski, Adeline; Bornat, Yannick; Hillen, Brian; Siu, Ricardo; Abbas, James; Jung, Ranu; Renaud, Sylvie
2016-01-01
Cervical spinal cord injury can disrupt connections between the brain respiratory network and the respiratory muscles which can lead to partial or complete loss of ventilatory control and require ventilatory assistance. Unlike current open-loop technology, a closed-loop diaphragmatic pacing system could overcome the drawbacks of manual titration as well as respond to changing ventilation requirements. We present an original bio-inspired assistive technology for real-time ventilation assistance, implemented in a digital configurable Field Programmable Gate Array (FPGA). The bio-inspired controller, which is a spiking neural network (SNN) inspired by the medullary respiratory network, is as robust as a classic controller while having a flexible, low-power and low-cost hardware design. The system was simulated in MATLAB with FPGA-specific constraints and tested with a computational model of rat breathing; the model reproduced experimentally collected respiratory data in eupneic animals. The open-loop version of the bio-inspired controller was implemented on the FPGA. Electrical test bench characterizations confirmed the system functionality. Open- and closed-loop paradigms were simulated to test the FPGA system real-time behavior using the rat computational model. The closed-loop system monitors breathing and changes in respiratory demands to drive diaphragmatic stimulation. The simulated results inform future acute animal experiments and constitute the first step toward the development of a neuromorphic, adaptive, compact, low-power, implantable device. The bio-inspired hardware design optimizes the FPGA resource and time costs while harnessing the computational power of spike-based neuromorphic hardware. Its real-time feature makes it suitable for in vivo applications. PMID:27378844
A FPGA Implementation of the CAR-FAC Cochlear Model
Xu, Ying; Thakur, Chetan S.; Singh, Ram K.; Hamilton, Tara Julia; Wang, Runchun M.; van Schaik, André
2018-01-01
This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on. PMID:29692700
An 18-ps TDC using timing adjustment and bin realignment methods in a Cyclone-IV FPGA
NASA Astrophysics Data System (ADS)
Cao, Guiping; Xia, Haojie; Dong, Ning
2018-05-01
The method commonly used to produce a field-programmable gate array (FPGA)-based time-to-digital converter (TDC) creates a tapped delay line (TDL) for time interpolation to yield high time precision. We conduct timing adjustment and bin realignment to implement a TDC in the Altera Cyclone-IV FPGA. The former tunes the carry look-up table (LUT) cell delay by changing the LUT's function through low-level primitives according to timing analysis results, while the latter realigns bins according to the timing result obtained by timing adjustment so as to create a uniform TDL with bins of equivalent width. The differential nonlinearity and time resolution can be improved by realigning the bins. After calibration, the TDC has an 18 ps root-mean-square timing resolution and a 45 ps least-significant bit resolution.
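The abstract does not say how the calibration step is carried out. As a hedged illustration only, the sketch below shows the widely used code density test, in which hits uncorrelated with the clock are histogrammed per bin and each bin width is taken as proportional to its hit count. This is a generic technique swapped in for illustration, not the timing adjustment or bin realignment method of the paper; the function name and the `period_ps` parameter are hypothetical.

```python
def code_density_calibration(hits, period_ps):
    """Estimate TDL bin widths from a histogram of random-hit counts.

    hits:      list of hit counts, one per delay-line bin
    period_ps: time span covered by the delay line, in picoseconds
    Returns (widths, dnl, centres), all lists with one entry per bin.
    """
    total = sum(hits)
    # A bin that collects more random hits is proportionally wider.
    widths = [period_ps * h / total for h in hits]
    lsb = period_ps / len(hits)           # ideal (average) bin width
    dnl = [w / lsb - 1.0 for w in widths]  # differential nonlinearity in LSB
    centres = []
    edge = 0.0
    for w in widths:
        centres.append(edge + w / 2.0)     # time assigned to this code
        edge += w
    return widths, dnl, centres
```

The `centres` list can then serve as a lookup table that converts a raw bin code into a calibrated time stamp.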
Embedded System Implementation on FPGA System With μCLinux OS
NASA Astrophysics Data System (ADS)
Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna
2011-02-01
Embedded systems are taking on more complicated tasks as the processors involved become more powerful. Embedded systems have been widely used in many areas such as industry, automotive, medical imaging, communications, speech recognition and computer vision. The complexity of today's hardware and software requirements calls for a flexible system that allows further enhancement of a design without adding new hardware. Otherwise, any change in the system design means that the processor itself needs to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using a Field Programmable Gate Array (FPGA). A softcore processor, the NIOS II 32-bit RISC, which is the microprocessor core, was utilized in the FPGA system together with the embedded operating system (OS), μClinux. In this paper, an example of a web server is explained and demonstrated.
NEPP Update of Independent Single Event Upset Field Programmable Gate Array Testing
NASA Technical Reports Server (NTRS)
Berg, Melanie; Label, Kenneth; Campola, Michael; Pellish, Jonathan
2017-01-01
This presentation provides a NASA Electronic Parts and Packaging (NEPP) Program update of independent Single Event Upset (SEU) Field Programmable Gate Array (FPGA) testing including FPGA test guidelines, Microsemi RTG4 heavy-ion results, Xilinx Kintex-UltraScale heavy-ion results, Xilinx UltraScale+ single event effect (SEE) test plans, development of a new methodology for characterizing SEU system response, and NEPP involvement with FPGA security and trust.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Learn, Mark Walter
Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not available to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.
NASA Technical Reports Server (NTRS)
Gregory, Kyle J.; Hill, Joanne E. (Editor); Black, J. Kevin; Baumgartner, Wayne H.; Jahoda, Keith
2016-01-01
A fundamental challenge in a spaceborne application of a gas-based Time Projection Chamber (TPC) for observation of X-ray polarization is handling the large amount of data collected. The TPC polarimeter described uses the APV-25 Application Specific Integrated Circuit (ASIC) to read out a strip detector. Two-dimensional photoelectron track images are created with a time projection technique and used to determine the polarization of the incident X-rays. The detector produces a 128x30 pixel image per photon interaction with each pixel registering 12 bits of collected charge. This creates challenging requirements for data storage and downlink bandwidth with only a modest incidence of photons and can have a significant impact on the overall mission cost. An approach is described for locating and isolating the photoelectron track within the detector image, yielding a much smaller data product, typically between 8x8 pixels and 20x20 pixels. This approach is implemented using a Microsemi RT-ProASIC3-3000 Field-Programmable Gate Array (FPGA), clocked at 20 MHz and utilizing 10.7k logic gates (14% of FPGA), 20 Block RAMs (17% of FPGA), and no external RAM. Results will be presented, demonstrating successful photoelectron track cluster detection with minimal impact to detector dead-time.
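To make the data reduction concrete, the sketch below crops a detector image to the bounding box of pixels above a threshold and compares raw sizes. The abstract does not describe the actual cluster-detection logic, so the simple threshold-plus-margin rule, the function names, and the `threshold` and `margin` parameters are all assumptions for illustration.

```python
def crop_track(image, threshold, margin=1):
    """Return (row0, col0, window) bounding the pixels above threshold.

    image: list of rows, each a list of integer charge values
    Returns None if no pixel exceeds the threshold.
    """
    hits = [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v > threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    # Grow the box by the margin, clamped to the image edges.
    r0 = max(min(rows) - margin, 0)
    r1 = min(max(rows) + margin, len(image) - 1)
    c0 = max(min(cols) - margin, 0)
    c1 = min(max(cols) + margin, len(image[0]) - 1)
    window = [row[c0:c1 + 1] for row in image[r0:r1 + 1]]
    return r0, c0, window


def raw_bits(width, height, bits_per_pixel=12):
    """Raw size in bits of a width x height image."""
    return width * height * bits_per_pixel
```

With the figures quoted in the abstract, a full 128x30 frame at 12 bits per pixel is 46,080 bits, while a 20x20 window is 4,800 bits and an 8x8 window is 768 bits, so the cropped product is roughly 10 to 60 times smaller before any header overhead.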
Implementation of a Virtual Microphone Array to Obtain High Resolution Acoustic Images
Izquierdo, Alberto; Suárez, Luis; Suárez, David
2017-01-01
Using arrays with digital MEMS (Micro-Electro-Mechanical System) microphones and FPGA-based (Field Programmable Gate Array) acquisition/processing systems allows building systems with hundreds of sensors at a reduced cost. The problem arises when systems with thousands of sensors are needed. This work analyzes the implementation and performance of a virtual array with 6400 (80 × 80) MEMS microphones. This virtual array is implemented by changing the position of a physical array of 64 (8 × 8) microphones in a grid with 10 × 10 positions, using a 2D positioning system. This virtual array obtains an array spatial aperture of 1 × 1 m². Based on the SODAR (SOund Detection And Ranging) principle, the measured beampattern and the focusing capacity of the virtual array have been analyzed, since beamforming algorithms assume to be working with spherical waves, due to the large dimensions of the array in comparison with the distance between the target (a mannequin) and the array. Finally, the acoustic images of the mannequin, obtained for different frequency and range values, have been obtained, showing high angular resolutions and the possibility to identify different parts of the body of the mannequin. PMID:29295485
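The counting behind the virtual array (64 microphones at 100 positions giving 6400 virtual sensors) can be sketched as follows. The abstract does not state how the physical array positions are laid out relative to one another, so this sketch assumes the simplest case, in which each grid step shifts the array by its full width so that the positions tile without overlap. The function name and the `pitch` parameter are hypothetical.

```python
def virtual_positions(phys=8, grid=10, pitch=1.0):
    """Enumerate sensor coordinates of a virtual array built by tiling.

    phys:  microphones per side of the physical array (8 in the abstract)
    grid:  positioning-system steps per side (10 in the abstract)
    pitch: sensor spacing in arbitrary units (hypothetical parameter)
    Assumes each grid step shifts the array by its full width (tiling).
    """
    positions = []
    for gy in range(grid):
        for gx in range(grid):
            for j in range(phys):
                for i in range(phys):
                    positions.append(((gx * phys + i) * pitch,
                                      (gy * phys + j) * pitch))
    return positions
```

Each measurement at one grid position contributes one 8 × 8 tile of the final 80 × 80 layout, which is why the acquisitions at different positions must be referenced to a common, repeatable source signal before beamforming.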
Dynamically Reconfigurable Systolic Array Accelerator
NASA Technical Reports Server (NTRS)
Dasu, Aravind; Barnes, Robert
2012-01-01
A polymorphic systolic array framework has been developed that works in conjunction with an embedded microprocessor on a field-programmable gate array (FPGA), which allows for dynamic and complementary scaling of acceleration levels of two algorithms active concurrently on the FPGA. Use is made of systolic arrays and a hardware-software co-design to obtain an efficient multi-application acceleration system. The flexible and simple framework allows hosting of a broader range of algorithms, and is extendable to more complex applications in the area of aerospace embedded systems. FPGA chips can be responsive to real-time demands for changing application needs, but only if the electronic fabric can respond fast enough. This systolic array framework allows for rapid partial and dynamic reconfiguration of the chip in response to the real-time needs of scalability, and adaptability of executables.
Radiation-Hardened Solid-State Drive
NASA Technical Reports Server (NTRS)
Sheldon, Douglas J.
2010-01-01
A method is provided for a radiation-hardened (rad-hard) solid-state drive for space mission memory applications by combining rad-hard and commercial off-the-shelf (COTS) non-volatile memories (NVMs) into a hybrid architecture. The architecture is controlled by a rad-hard ASIC (application specific integrated circuit) or an FPGA (field programmable gate array). Specific error handling and data management protocols are developed for use in a rad-hard environment. The rad-hard memories are smaller in overall memory density, but are used to control and manage radiation-induced errors in the main, and much larger density, non-rad-hard COTS memory devices. Small amounts of rad-hard memory are used as error buffers and temporary caches for radiation-induced errors in the large COTS memories. The rad-hard ASIC/FPGA implements a variety of error-handling protocols to manage these radiation-induced errors. The large COTS memory is triplicated for protection, and CRC-based counters are calculated for sub-areas in each COTS NVM array. These counters are stored in the rad-hard non-volatile memory. Through monitoring, rewriting, regeneration, triplication, and long-term storage, radiation-induced errors in the large NV memory are managed. The rad-hard ASIC/FPGA also interfaces with the external computer buses.
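The combination of triplication and a separately stored CRC can be sketched in software as below. This is an illustration under stated assumptions, not the protocol of the described drive: a plain CRC-32 value stands in for the "CRC-based counters", the three COTS copies are modelled as byte strings, and the function names `majority_vote`, `area_crc`, and `scrub` are hypothetical.

```python
import zlib


def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority across three equal-length byte strings."""
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


def area_crc(data):
    """CRC-32 of one sub-area; assumed to be kept in rad-hard memory."""
    return zlib.crc32(data)


def scrub(copies, stored_crc):
    """Vote the three copies of a sub-area and check the result.

    copies:     list of three byte strings (the triplicated COTS data)
    stored_crc: reference CRC held in the rad-hard memory
    Returns (voted_data, ok). When ok is True the copies are rewritten
    in place with the voted data, repairing single-copy upsets.
    """
    voted = majority_vote(*copies)
    ok = area_crc(voted) == stored_crc
    if ok:
        copies[:] = [voted, voted, voted]
    return voted, ok
```

The vote alone cannot detect the case where two copies are corrupted in the same bit; checking the voted data against a reference stored in the small rad-hard memory flags that case so it can be handled by another recovery path.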
NASA Technical Reports Server (NTRS)
Allen, Gregory; Edmonds, Larry D.; Swift, Gary; Carmichael, Carl; Tseng, Chen Wei; Heldt, Kevin; Anderson, Scott Arlo; Coe, Michael
2010-01-01
We present a test methodology for estimating system error rates of Field Programmable Gate Arrays (FPGAs) mitigated with Triple Modular Redundancy (TMR). The test methodology is founded in a mathematical model, which is also presented. Accelerator data from 90 nm Xilinx Military/Aerospace grade FPGA are shown to fit the model. Fault injection (FI) results are discussed and related to the test data. Design implementation and the corresponding impact of multiple bit upset (MBU) are also discussed.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Xiao, Yong; Cheng, Xinyi; Li, Deng; Wang, Liwei
2016-02-01
For the continuous crystal-based positron emission tomography (PET) detector built in our lab, a maximum likelihood algorithm adapted for implementation on a field programmable gate array (FPGA) is proposed to estimate the three-dimensional (3D) coordinate of interaction position with the single-end detected scintillation light response. The row-sum and column-sum readout scheme organizes the 64 channels of photomultiplier (PMT) into eight row signals and eight column signals to be read out for X- and Y-coordinate estimation independently. Using reference events irradiated at a known oblique angle, the probability density function (PDF) for each depth-of-interaction (DOI) segment is generated, by which the reference events in perpendicular irradiation are assigned to DOI segments for generating the PDFs for X and Y estimation in each DOI layer. Evaluated with experimental data, the algorithm achieves an average X resolution of 1.69 mm along the central X-axis, and DOI resolution of 3.70 mm over the whole thickness (0-10 mm) of crystal. The performance improvements from 2D estimation to the 3D algorithm are also presented. Benefiting from abundant resources of FPGA and a hierarchical storage arrangement, the whole algorithm can be implemented into a middle-scale FPGA. By a parallel structure in pipelines, the 3D position estimator on the FPGA can achieve a processing throughput of 15 M events/s, which is sufficient for the requirement of real-time PET imaging.
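The readout reduction and the table-driven estimate can be sketched as follows. The row and column sums follow directly from the abstract (64 channels reduced to eight row and eight column signals). The likelihood function is an assumption: the abstract only says PDFs are generated from reference events, so a per-signal Gaussian with a stored mean and sigma is used here purely for illustration, and the names `row_column_sums` and `ml_position` are hypothetical.

```python
import math


def row_column_sums(channels):
    """Reduce an 8 x 8 grid of PMT amplitudes to 8 row and 8 column sums."""
    rows = [sum(r) for r in channels]
    cols = [sum(c) for c in zip(*channels)]
    return rows, cols


def ml_position(signals, table):
    """Pick the candidate position that best explains the signals.

    signals: measured row (or column) sums
    table:   dict mapping candidate position -> list of (mean, sigma)
             pairs, one per signal (assumed Gaussian PDFs)
    """
    def log_likelihood(params):
        return sum(-math.log(s) - 0.5 * ((v - m) / s) ** 2
                   for v, (m, s) in zip(signals, params))
    return max(table, key=lambda pos: log_likelihood(table[pos]))
```

Because X uses only the column signals and Y only the row signals, the two searches are independent and can run side by side, which suits a pipelined parallel implementation.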
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Li, Deng; Lu, Xiaoming; Cheng, Xinyi; Wang, Liwei
2014-10-01
Continuous crystal-based positron emission tomography (PET) detectors could be an ideal alternative for current high-resolution pixelated PET detectors if the issues of high performance γ interaction position estimation and its real-time implementation are solved. Unfortunately, existing position estimators are not very feasible for implementation on field-programmable gate array (FPGA). In this paper, we propose a new self-organizing map neural network-based nearest neighbor (SOM-NN) positioning scheme aiming not only at providing high performance, but also at being realistic for FPGA implementation. Benefitting from the SOM feature mapping mechanism, the large set of input reference events at each calibration position is approximated by a small set of prototypes, and the computation of the nearest neighbor searching for unknown events is largely reduced. Using our experimental data, the scheme was evaluated, optimized and compared with the smoothed k-NN method. The spatial resolutions of full-width-at-half-maximum (FWHM) of both methods averaged over the center axis of the detector were obtained as 1.87 ±0.17 mm and 1.92 ±0.09 mm, respectively. The test results show that the SOM-NN scheme has an equivalent positioning performance with the smoothed k-NN method, but the amount of computation is only about one-tenth of the smoothed k-NN method. In addition, the algorithm structure of the SOM-NN scheme is more feasible for implementation on FPGA. It has the potential to realize real-time position estimation on an FPGA with a high-event processing throughput.
Remote hardware-reconfigurable robotic camera
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.
2001-10-01
In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.
NASA Technical Reports Server (NTRS)
Allen, Gregory
2011-01-01
The NEPP Reconfigurable Field-Programmable Gate Array (FPGA) task has been charged to evaluate reconfigurable FPGA technologies for use in space. Under this task, the Xilinx single-event-immune, reconfigurable FPGA (SIRF) XQR5VFX130 device was evaluated for SEE. Additionally, the Altera Stratix-IV and SiliconBlue iCE65 were screened for single-event latchup (SEL).
Controller for the Electronically Scanned Thinned Array Radiometer (ESTAR) instrument
NASA Technical Reports Server (NTRS)
Zomberg, Brian G.; Chren, William A., Jr.
1994-01-01
A prototype controller for the ESTAR (electronically scanned thinned array radiometer) instrument has been designed and tested. It manages the operation of the digital data subsystem (DDS) and its communication with the Small Explorer data system (SEDS). Among the data processing tasks that it coordinates are FEM data acquisition, noise removal, phase alignment and correlation. Its control functions include instrument calibration and testing of two critical subsystems, the output data formatter and Walsh function generator. It is implemented in a Xilinx XC3064PC84-100 field programmable gate array (FPGA) and has a maximum clocking frequency of 10 MHz.
Digital Fingerprinting of Field Programmable Gate Arrays
2008-03-01
Appendix B. Transitional Sampling Outputs ... Appendix C. VHDL Entities ... cumulative sampling outputs by pin ... B.1. FPGA outputs for Sample 0, Clk 18 ... B.2. FPGA outputs for Sample 0, Clk 19 ... B.3. FPGA outputs for Sample 0, Clk 21 ... B.4. FPGA outputs for Sample
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Liu, Chong
2016-10-01
Field programmable gate arrays (FPGAs) manufactured with more advanced processing technology have faster carry chains and smaller delay elements, which are favorable for the design of tapped delay line (TDL)-style time-to-digital converters (TDCs) in FPGA. However, new challenges are posed in using them to implement TDCs with a high time precision. In this paper, we propose a bin realignment method and a dual-sampling method for TDC implementation in a Xilinx UltraScale FPGA. The former realigns the disordered time delay taps so that the TDC precision can approach the limit of its delay granularity, while the latter doubles the number of taps in the delay line so that the TDC precision beyond the cell delay limitation can be expected. Two TDC channels were implemented in a Kintex UltraScale FPGA, and the effectiveness of the new methods was evaluated. For fixed time intervals in the range from 0 to 440 ns, the average RMS precision measured by the two TDC channels reaches 5.8 ps using the bin realignment, and it further improves to 3.9 ps by using the dual-sampling method. The time precision has a 5.6% variation in the measured temperature range. Every part of the TDC, including dual-sampling, encoding, and on-line calibration, could run at a 500 MHz clock frequency. The system measurement dead time is only 4 ns.
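The abstract does not say how the on-line calibration works. A common choice for tapped-delay-line TDCs is a code density test, in which each bin's share of uniformly distributed hits is taken as its share of the clock period. The sketch below assumes that approach; the function name and the hit counts are hypothetical.

```python
def code_density_calibration(hit_counts, clock_period_ps):
    """Estimate each bin's width and centre time from a code density histogram.

    Assumes hits are uniformly distributed over one clock period, so a bin's
    share of the hits equals its share of the period.
    """
    total = sum(hit_counts)
    widths = [clock_period_ps * count / total for count in hit_counts]
    centres = []
    elapsed = 0.0
    for width in widths:
        centres.append(elapsed + width / 2.0)  # report the middle of the bin
        elapsed += width
    return widths, centres


# Hypothetical histogram for a 4-tap line clocked at 500 MHz (2000 ps period).
widths, centres = code_density_calibration([10, 30, 20, 40], 2000.0)
```

A real delay line has hundreds of taps; four are used here only to keep the arithmetic easy to follow.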
Single Event Testing on Complex Devices: Test Like You Fly versus Test-Specific Design Structures
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Kenneth A.
2014-01-01
We present a framework for evaluating complex digital systems targeted for harsh radiation environments such as space. Focus is limited to analyzing the single event upset (SEU) susceptibility of designs implemented inside Field Programmable Gate Array (FPGA) devices. Tradeoffs are provided between application-specific versus test-specific test structures.
Heavy-Ion Microbeam Fault Injection into SRAM-Based FPGA Implementations of Cryptographic Circuits
NASA Astrophysics Data System (ADS)
Li, Huiyun; Du, Guanghua; Shao, Cuiping; Dai, Liang; Xu, Guoqing; Guo, Jinlong
2015-06-01
Transistors hit by heavy ions may conduct transiently, thereby introducing transient logic errors. Attackers can exploit these abnormal behaviors and extract sensitive information from the electronic devices. This paper demonstrates an ion irradiation fault injection attack experiment into a cryptographic field-programmable gate-array (FPGA) circuit. The experiment proved that the commercial FPGA chip is vulnerable to low-linear energy transfer carbon irradiation, and the attack can cause the leakage of secret key bits. A statistical model is established to estimate the possibility of an effective fault injection attack on cryptographic integrated circuits. The model incorporates the effects from temporal, spatial, and logical probability of an effective attack on the cryptographic circuits. The rate of successful attack calculated from the model conforms well to the experimental results. This quantitative success rate model can help evaluate security risk for designers as well as for the third-party assessment organizations.
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA's ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based on a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, which is defined in terms of Vin, the primary-side DC bus voltage; Vout, the secondary-side DC bus voltage; and n, the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex-5 LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis and, with a 40 MHz base clock, supports a modulation update rate of 40 MHz, with user-settable switching frequencies and synchronized control loop update rates of tens of kHz.
Embedded algorithms within an FPGA-based system to process nonlinear time series data
NASA Astrophysics Data System (ADS)
Jones, Jonathan D.; Pei, Jin-Song; Tull, Monte P.
2008-03-01
This paper presents some preliminary results of an ongoing project. A pattern classification algorithm is being developed and embedded into a Field-Programmable Gate Array (FPGA) and microprocessor-based data processing core in this project. The goal is to enable and optimize the functionality of onboard data processing of nonlinear, nonstationary data for smart wireless sensing in structural health monitoring. Compared with traditional microprocessor-based systems, fast growing FPGA technology offers a more powerful, efficient, and flexible hardware platform including on-site (field-programmable) reconfiguration capability of hardware. An existing nonlinear identification algorithm is used as the baseline in this study. The implementation within a hardware-based system is presented in this paper, detailing the design requirements, validation, tradeoffs, optimization, and challenges in embedding this algorithm. An off-the-shelf high-level abstraction tool along with the Matlab/Simulink environment is utilized to program the FPGA, rather than coding the hardware description language (HDL) manually. The implementation is validated by comparing the simulation results with those from Matlab. In particular, the Hilbert Transform is embedded into the FPGA hardware and applied to the baseline algorithm as the centerpiece in processing nonlinear time histories and extracting instantaneous features of nonstationary dynamic data. The selection of proper numerical methods for the hardware execution of the selected identification algorithm and consideration of the fixed-point representation are elaborated. Other challenges include the issues of the timing in the hardware execution cycle of the design, resource consumption, approximation accuracy, and user flexibility of input data types limited by the simplicity of this preliminary design. 
Future work includes making an FPGA and microprocessor operate together to embed a further developed algorithm that yields better computational and power efficiency.
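The role of the Hilbert Transform in extracting instantaneous features can be illustrated in software. The sketch below is a floating-point illustration only, not the fixed-point FPGA design described above: it builds the analytic signal with a naive DFT and reads off instantaneous amplitude and frequency. All function names and the test tone are assumptions made for the example.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + j*H{x}: zero the negative frequencies, double the positive."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        gain[k] = 2.0
    spectrum = dft(x)
    return dft([s * g for s, g in zip(spectrum, gain)], inverse=True)


def instantaneous_features(x, fs):
    """Instantaneous amplitude and frequency (Hz) from the analytic signal."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    phase = [cmath.phase(v) for v in z]
    frequency = []
    for a, b in zip(phase, phase[1:]):
        step = (b - a + math.pi) % (2 * math.pi) - math.pi  # unwrap the jump
        frequency.append(step * fs / (2 * math.pi))
    return amplitude, frequency


fs = 64.0  # hypothetical sample rate in Hz
tone = [math.cos(2 * math.pi * 8.0 * t / fs) for t in range(64)]
amp, freq = instantaneous_features(tone, fs)
```

For this stationary 8 Hz tone the amplitude stays at 1 and the frequency at 8 Hz; for nonstationary data both traces vary with time, which is what makes them useful as features.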
The Use of Field Programmable Gate Arrays (FPGA) in Small Satellite Communication Systems
NASA Technical Reports Server (NTRS)
Varnavas, Kosta; Sims, William Herbert; Casas, Joseph
2015-01-01
This paper describes the use of digital Field Programmable Gate Arrays (FPGA) to advance the state of the art in software defined radio (SDR) transponder design for the emerging SmallSat and CubeSat industry and to provide advances for NASA as described in the TA05 Communication and Navigation Roadmap (Ref 4). Software defined radios have been in use for a long time. A typical SDR implementation uses a processor and software to implement all the functions of filtering, carrier recovery, error correction, framing, etc. Even with modern high speed and low power digital signal processors, high speed memories, and efficient coding, the compute-intensive nature of digital filters, error correction and other algorithms is too much for modern processors to make efficient use of the available bandwidth to the ground. By using FPGAs, these compute-intensive tasks can be done in a parallel, pipelined fashion that uses every clock cycle more efficiently, significantly increasing throughput while maintaining low power. These methods will implement digital radios with significant data rates in the X and Ka bands. Using these state-of-the-art technologies, unprecedented uplink and downlink capabilities can be achieved in a 1/2 U sized telemetry system. Additionally, modern FPGAs have embedded processing systems, such as ARM cores, integrated inside the FPGA, allowing mundane tasks such as parameter commanding to occur easily and flexibly. Potential partners include other NASA centers, industry and the DOD. These assets are associated with small satellite demonstration flights, LEO and deep space applications. MSFC currently has an SDR transponder test-bed using Hardware-in-the-Loop techniques to evaluate and improve SDR technologies.
NASA Astrophysics Data System (ADS)
Jorge, L. S.; Bonifacio, D. A. B.; DeWitt, Don; Miyaoka, R. S.
2016-12-01
Continuous scintillator-based detectors have been considered a competitive and cheaper alternative to highly pixelated discrete crystal positron emission tomography (PET) detectors, despite the need for algorithms to estimate 3D gamma interaction position. In this work, we report on the implementation of a positioning algorithm to estimate the 3D interaction position in a continuous crystal PET detector using a Field Programmable Gate Array (FPGA). The evaluated method is the Statistics-Based Processing (SBP) technique that requires light response function and event position characterization. An algorithm has been implemented using the Verilog language and evaluated using a data acquisition board that contains an Altera Stratix III FPGA. The 3D SBP algorithm was previously successfully implemented on a Stratix II FPGA using simulated data and a different module design. In this work, improvements were made to the FPGA coding of the 3D positioning algorithm, reducing the total memory usage to around 34%. Further, the algorithm was evaluated using experimental data from a continuous miniature crystal element (cMiCE) detector module. Using our new implementation, average FWHM (Full Width at Half Maximum) for the whole block is 1.71±0.01 mm, 1.70±0.01 mm and 1.632±0.005 mm for x, y and z directions, respectively. Using a pipelined architecture, the FPGA is able to process 245,000 events per second for interactions inside the central area of the detector, which represents 64% of the total block area. The weighted average of the event rate by regional area (corner, border and central regions) is about 198,000 events per second. This event rate is greater than the maximum expected coincidence rate for any given detector module in future PET systems using the cMiCE detector design.
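The statistics-based positioning idea can be sketched as a maximum-likelihood search over a characterization table. The example below assumes independent Gaussian channel statistics and a tiny hypothetical look-up table of per-position means and standard deviations; it is an illustration in software, not the Verilog implementation reported above.

```python
import math


def sbp_estimate(measured, lut):
    """Return the tabulated position with the highest Gaussian log-likelihood.

    lut maps a position tuple to (means, sigmas), one entry per sensor channel.
    """
    best_pos, best_ll = None, -math.inf
    for pos, (means, sigmas) in lut.items():
        ll = 0.0
        for m, mu, sd in zip(measured, means, sigmas):
            ll -= (m - mu) ** 2 / (2.0 * sd ** 2) + math.log(sd)
        if ll > best_ll:
            best_pos, best_ll = pos, ll
    return best_pos


# Hypothetical characterization data: 3 positions, 4 sensor channels.
lut = {
    (0, 0, 0): ([10.0, 5.0, 2.0, 1.0], [1.0, 1.0, 1.0, 1.0]),
    (1, 0, 0): ([5.0, 10.0, 5.0, 2.0], [1.0, 1.0, 1.0, 1.0]),
    (2, 0, 0): ([2.0, 5.0, 10.0, 5.0], [1.0, 1.0, 1.0, 1.0]),
}
position = sbp_estimate([5.2, 9.7, 4.9, 2.3], lut)
```

An exhaustive search like this grows with the table size, which is one reason memory use and pipelining matter in a hardware version.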
STRS Compliant FPGA Waveform Development
NASA Technical Reports Server (NTRS)
Nappier, Jennifer; Downey, Joseph; Mortensen, Dale
2008-01-01
The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. The extension of STRS to the SSP hardware will promote easier waveform reconfiguration and reuse. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside an FPGA. An FPGA-based transmit waveform implementation of the proposed standard interfaces on a laboratory breadboard SDR will be discussed.
A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations
Mak, Terrence S. T.; Rachmuth, Guy; Lam, Kai-Pui; Poon, Chi-Sang
2008-01-01
Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. Field Programmable Gate Array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to conserve resource consumption. Performances of the various FPGA design approaches are compared theoretically and experimentally in corresponding implementations of the AMPA and NMDA synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory economic solution as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired. PMID:17190033
Self-Adaptive System based on Field Programmable Gate Array for Extreme Temperature Electronics
NASA Technical Reports Server (NTRS)
Keymeulen, Didier; Zebulum, Ricardo; Rajeshuni, Ramesham; Stoica, Adrian; Katkoori, Srinivas; Graves, Sharon; Novak, Frank; Antill, Charles
2006-01-01
In this work, we report the implementation of a self-adaptive system using a field programmable gate array (FPGA) and data converters. The self-adaptive system can autonomously recover the lost functionality of a reconfigurable analog array (RAA) integrated circuit (IC) [3]. Both the RAA IC and the self-adaptive system operate in extreme temperatures (from 120 °C down to -180 °C). The RAA IC consists of reconfigurable analog blocks interconnected by several switches and programmable by bias voltages. It implements filters/amplifiers with bandwidth up to 20 MHz. The self-adaptive system controls the RAA IC and is realized on Commercial-Off-The-Shelf (COTS) parts. It implements a basic compensation algorithm that corrects an RAA IC in less than a few milliseconds. Experimental results for the cold temperature environment (down to -180 °C) demonstrate the feasibility of this approach.
Field programmable gate arrays: Evaluation report for space-flight application
NASA Technical Reports Server (NTRS)
Sandoe, Mike; Davarpanah, Mike; Soliman, Kamal; Suszko, Steven; Mackey, Susan
1992-01-01
Field Programmable Gate Arrays, commonly called FPGAs, are the newer generation of field programmable devices and offer more flexibility in the logic modules they incorporate and in how they are interconnected. The flexibility, the number of logic building blocks available, and the high gate densities achievable are why users find FPGAs attractive. These attributes are important in reducing product development costs and shortening the development cycle. The aerospace community is interested in incorporating this new generation of field programmable technology in space applications. To this end, a consortium was formed to evaluate the quality, reliability, and radiation performance of FPGAs. This report presents the test results on FPGA parts provided by ACTEL Corporation.
Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene
2010-01-01
Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA). PMID:22319345
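The processing chain named above (oversampling, averaging decimation, finite differences) can be sketched in a few lines. The sample rates, the decimation factor and the synthetic encoder trajectory below are assumptions for illustration; the FIR filtering and interpolation stages are omitted.

```python
def average_decimate(samples, factor):
    """Average each block of `factor` oversampled values into one output."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]


def finite_difference(values, dt):
    """First-order difference quotient between consecutive samples."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


fs_over = 1000.0   # hypothetical oversampling rate in Hz
factor = 10        # hypothetical decimation factor
# Synthetic encoder angle theta(t) = t**2 rad, i.e. constant 2 rad/s^2.
position = [(n / fs_over) ** 2 for n in range(200)]

decimated = average_decimate(position, factor)
dt = factor / fs_over
velocity = finite_difference(decimated, dt)
acceleration = finite_difference(velocity, dt)
```

Averaging before differencing matters because each difference quotient divides by a small time step, which amplifies any noise left on the position samples.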
High-speed polarization sensitive optical coherence tomography for retinal diagnostics
NASA Astrophysics Data System (ADS)
Yin, Biwei; Wang, Bingqing; Vemishetty, Kalyanramu; Nagle, Jim; Liu, Shuang; Wang, Tianyi; Rylander, Henry G., III; Milner, Thomas E.
2012-01-01
We report design and construction of an FPGA-based high-speed swept-source polarization-sensitive optical coherence tomography (SS-PS-OCT) system for clinical retinal imaging. Clinical application of the SS-PS-OCT system is accurate measurement and display of thickness, phase retardation and birefringence maps of the retinal nerve fiber layer (RNFL) in human subjects for early detection of glaucoma. The FPGA-based SS-PS-OCT system provides three incident polarization states on the eye and uses a bulk-optic polarization sensitive balanced detection module to record two orthogonal interference fringe signals. Interference fringe signals and relative phase retardation between two orthogonal polarization states are used to obtain Stokes vectors of light returning from each RNFL depth. We implement a Levenberg-Marquardt algorithm on a Field Programmable Gate Array (FPGA) to compute accurate phase retardation and birefringence maps. For each retinal scan, a three-state Levenberg-Marquardt nonlinear algorithm is applied to 360 clusters, each consisting of 100 A-scans, to determine accurate maps of phase retardation and birefringence in less than 1 second after patient measurement, allowing real-time clinical imaging, a speedup of more than 300 times over previous implementations. We report application of the FPGA-based SS-PS-OCT system for real-time clinical imaging of patients enrolled in a clinical study at the Eye Institute of Austin and Duke Eye Center.
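The structure of a Levenberg-Marquardt iteration can be shown on a small stand-in problem. The sketch below fits a two-parameter exponential decay, which is an assumption chosen for brevity and is not the three-state polarization model used in the system above; the damping schedule (divide or multiply by 10) is likewise one common convention.

```python
import math


def model(params, t):
    a, b = params
    return a * math.exp(-b * t)


def cost(params, ts, ys):
    return sum((y - model(params, t)) ** 2 for t, y in zip(ts, ys))


def levenberg_marquardt(ts, ys, params, lam=1e-2, iterations=200):
    """Damped Gauss-Newton steps for the two-parameter model above."""
    current = cost(params, ts, ys)
    for _ in range(iterations):
        a, b = params
        h11 = h12 = h22 = g1 = g2 = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-b * t)
            j1, j2 = e, -a * t * e          # d(model)/da, d(model)/db
            r = y - a * e                   # residual
            h11 += j1 * j1
            h12 += j1 * j2
            h22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        a11, a22 = h11 * (1.0 + lam), h22 * (1.0 + lam)  # damped diagonal
        det = a11 * a22 - h12 * h12
        step = ((g1 * a22 - h12 * g2) / det, (a11 * g2 - h12 * g1) / det)
        trial = (a + step[0], b + step[1])
        trial_cost = cost(trial, ts, ys)
        if trial_cost < current:            # accept and trust the model more
            params, current, lam = trial, trial_cost, lam / 10.0
        else:                               # reject and damp harder
            lam *= 10.0
    return params


ts = [0.5 * k for k in range(10)]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]   # noise-free synthetic data
fit = levenberg_marquardt(ts, ys, (1.0, 1.0))
```

Each cluster of A-scans in the system above is an independent fit of this kind, which is what makes the workload a natural match for parallel hardware.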
A Fixed Point VHDL Component Library for a High Efficiency Reconfigurable Radio Design Methodology
NASA Technical Reports Server (NTRS)
Hoy, Scott D.; Figueiredo, Marco A.
2006-01-01
Advances in Field Programmable Gate Array (FPGA) technologies enable the implementation of reconfigurable radio systems for both ground and space applications. The development of such systems challenges the current design paradigms and requires more robust design techniques to meet the increased system complexity. Among these techniques is the development of component libraries to reduce design cycle time and to improve design verification, consequently increasing the overall efficiency of the project development process while increasing design success rates and reducing engineering costs. This paper describes the reconfigurable radio component library developed at the Software Defined Radio Applications Research Center (SARC) at Goddard Space Flight Center (GSFC) Microwave and Communications Branch (Code 567). The library is a set of fixed-point VHDL components that link the Digital Signal Processing (DSP) simulation environment with the FPGA design tools. This provides a direct synthesis path based on the latest developments of the VHDL tools as proposed by the IEEE VHDL 2004, which allows for the simulation and synthesis of fixed-point math operations while maintaining bit and cycle accuracy. The VHDL Fixed Point Reconfigurable Radio Component library does not require the use of the FPGA vendor-specific automatic component generators and provides a generic path from high level DSP simulations implemented in Mathworks Simulink to any FPGA device. The access to the components' synthesizable source code provides full design verification capability:
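Bit accuracy in fixed-point math means that a software model and the hardware produce the same integers. The following sketch models signed two's-complement arithmetic with saturation and truncating multiplication; the Q1.15 format and the helper names are assumptions for illustration and are not taken from the library described above.

```python
def saturate(value, total_bits):
    """Clamp an integer to the signed two's-complement range."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, value))


def to_fixed(x, frac_bits, total_bits):
    """Quantize a real number to a fixed-point integer with saturation."""
    return saturate(int(round(x * (1 << frac_bits))), total_bits)


def fixed_mul(a, b, frac_bits, total_bits):
    """Full-precision product, then an arithmetic right shift (truncation)."""
    return saturate((a * b) >> frac_bits, total_bits)


def to_float(q, frac_bits):
    return q / (1 << frac_bits)


FRAC, BITS = 15, 16                 # hypothetical Q1.15 format
a = to_fixed(0.5, FRAC, BITS)       # 16384
b = to_fixed(-0.25, FRAC, BITS)     # -8192
p = fixed_mul(a, b, FRAC, BITS)     # -4096, i.e. -0.125
```

Note that 1.0 is not representable in Q1.15, so `to_fixed(1.0, FRAC, BITS)` saturates to 32767; choices like this are what a bit-accurate simulation has to reproduce exactly.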
Clock and carrier recovery in high-speed coherent optical communication systems
NASA Astrophysics Data System (ADS)
Amado, Sofia B.; Ferreira, Ricardo; Costa, Pedro S.; Guiomar, Fernando P.; Ziaie, Somayeh; Teixeira, António L.; Muga, Nelson J.; Pinto, Armando N.
2014-08-01
In this paper, implementations of clock and carrier recovery in the digital domain are analyzed. Hardware implementation details, resource estimates and real-time results are presented. Analog-to-Digital Converters (ADC) operating at 1.25 GSa/s and a Virtex-6 Field-Programmable Gate Array (FPGA) have been used, allowing the implementation of a real-time Quadrature Phase Shift Keying (QPSK) system operating at 1.25 Gb/s. Real-time operation is successfully demonstrated over 80 km of Standard Single Mode Fiber (SSMF).
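The abstract does not name the carrier recovery algorithm. One widely used feedforward option for QPSK is the fourth-power (Viterbi-Viterbi) estimator, sketched here as an assumption: raising each sample to the fourth power removes the data modulation and leaves four times the phase offset.

```python
import cmath
import math


def estimate_phase_qpsk(samples):
    """Fourth-power phase estimate, valid for offsets within (-pi/4, pi/4)."""
    acc = sum(s ** 4 for s in samples)
    # Ideal QPSK points at odd multiples of pi/4 satisfy s**4 = -|s|**4,
    # so negating the sum leaves only the 4x phase offset.
    return cmath.phase(-acc) / 4.0


def derotate(samples, phase):
    """Remove the estimated phase offset from every sample."""
    return [s * cmath.exp(-1j * phase) for s in samples]


pattern = [(1, 1), (-1, 1), (-1, -1), (1, -1), (1, 1), (-1, -1)]
ideal = [complex(i, q) / math.sqrt(2) for i, q in pattern]
offset = 0.2                                   # hypothetical offset in radians
received = [s * cmath.exp(1j * offset) for s in ideal]

estimate = estimate_phase_qpsk(received)
recovered = derotate(received, estimate)
```

The estimator has a four-fold phase ambiguity, so practical systems pair it with differential coding or pilot symbols.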
Ripple FPN reduced algorithm based on temporal high-pass filter and hardware implementation
NASA Astrophysics Data System (ADS)
Li, Yiyang; Li, Shuo; Zhang, Zhipeng; Jin, Weiqi; Wu, Lei; Jin, Minglei
2016-11-01
Cooled infrared detector arrays always suffer from undesired Ripple Fixed-Pattern Noise (FPN) when observing sky scenes. The Ripple Fixed-Pattern Noise seriously affects the imaging quality of a thermal imager, especially for small target detection and tracking. It is hard to eliminate this FPN with calibration-based techniques or with current scene-based nonuniformity correction algorithms. In this paper, we present a modified spatial low-pass and temporal high-pass nonuniformity correction algorithm using an adaptive time domain threshold (THP&GM). The threshold is designed to significantly reduce ghosting artifacts. We test the algorithm on real infrared imagery in comparison to several previously published methods. The algorithm not only effectively corrects common FPN such as stripe noise, but also has a clear advantage over current methods in terms of detail protection and convergence speed, especially for Ripple FPN correction. Furthermore, we demonstrate our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA). The FPGA-based hardware implementation of the algorithm has two advantages: (1) low resource consumption, and (2) small hardware delay (less than 20 lines). The hardware has been successfully applied in an actual system.
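The temporal high-pass part of such a correction can be sketched as a per-pixel recursive low-pass estimate that is subtracted from each frame. The version below is a simplified illustration: it omits the spatial low-pass stage, and the fixed threshold gate is an assumed stand-in for the adaptive threshold described above.

```python
def temporal_highpass_nuc(frames, time_constant, threshold):
    """Subtract a per-pixel running low-pass estimate from every frame.

    frames is a list of frames, each a flat list of pixel values.
    """
    lowpass = list(frames[0])            # per-pixel offset estimate
    corrected = []
    for frame in frames:
        out = []
        for i, x in enumerate(frame):
            diff = x - lowpass[i]
            if abs(diff) < threshold:    # gate: skip updates on large changes
                lowpass[i] += diff / time_constant
            out.append(x - lowpass[i])
        corrected.append(out)
    return corrected


offsets = [3.0, -2.0]                    # hypothetical fixed-pattern offsets
scene = [1.0 if n % 2 == 0 else -1.0 for n in range(200)]
frames = [[s + o for o in offsets] for s in scene]
corrected = temporal_highpass_nuc(frames, time_constant=16.0, threshold=10.0)
```

In this toy case the two raw pixels differ by 5 counts in every frame, while the corrected pixels agree. Skipping updates when the change is large is what limits ghosting behind moving objects.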
TOT measurement implemented in FPGA TDC
NASA Astrophysics Data System (ADS)
Fan, Huan-Huan; Cao, Ping; Liu, Shu-Bin; An, Qi
2015-11-01
Time measurement plays a crucial role in particle identification in high energy physics experiments. With increasingly demanding physics goals and the development of electronics, modern time measurement systems need to meet the requirement of excellent resolution as well as high integrity. Based on Field Programmable Gate Arrays (FPGAs), FPGA time-to-digital converters (TDCs) have become one of the most mature and prominent time measurement methods in recent years. To correct the time-walk effect caused by leading-edge timing, a time-over-threshold (TOT) measurement should be added to the FPGA TDC. TOT can be obtained by measuring the interval between the signal leading and trailing edges. Unfortunately, a traditional TDC can recognize only one kind of signal edge, the leading or the trailing. Generally, to measure the interval, two TDC channels need to be used at the same time, one for leading, the other for trailing. However, this method unavoidably increases the amount of FPGA resources used and reduces the TDC's integrity. This paper presents one method of TOT measurement implemented in a Xilinx Virtex-5 FPGA. In this method, TOT measurement can be achieved using only one TDC input channel. The consumed resources and time resolution can both be guaranteed. Testing shows that this TDC can achieve resolution better than 15 ps for leading edge measurement and 37 ps for TOT measurement. Furthermore, the TDC measurement dead time is about two clock cycles, which makes it good for applications with higher physics event rates. Supported by National Natural Science Foundation of China (11079003, 10979003)
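Once both edges are time-stamped on one channel, computing TOT reduces to pairing each leading edge with the next trailing edge. The sketch below assumes a hypothetical edge record format of (timestamp in ps, "L" or "T"); it does not model the TDC itself.

```python
def pair_edges(edges):
    """Pair each leading edge with the next trailing edge on one channel.

    Returns a list of (leading_time_ps, tot_ps) tuples. A trailing edge with
    no pending leading edge is ignored.
    """
    hits = []
    leading = None
    for timestamp, kind in edges:
        if kind == "L":
            leading = timestamp
        elif kind == "T" and leading is not None:
            hits.append((leading, timestamp - leading))
            leading = None
    return hits


# Hypothetical edge stream from a single TDC input channel.
edges = [(1000, "L"), (1450, "T"), (5000, "L"), (5230, "T")]
hits = pair_edges(edges)
```

The resulting TOT values serve as a proxy for pulse amplitude, which is what a time-walk correction of the leading-edge time would be indexed by.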
NASA Technical Reports Server (NTRS)
Howard, J. W.; Kim, H.; Berg, M.; LaBel, K. A.; Stansberry, S.; Friendlich, M.; Irwin, T.
2006-01-01
A viewgraph presentation on the development of a low cost, high speed tester reconfigurable Field Programmable Gate Array (FPGA) is shown. The topics include: 1) Introduction; 2) Objectives; 3) Tester Descriptions; 4) Tester Validations and Demonstrations; 5) Future Work; and 6) Summary.
NASA Astrophysics Data System (ADS)
HUSEJKO, Michal; EVANS, John; RASTEIRO DA SILVA, Jose Carlos
2015-12-01
High-Level Synthesis (HLS) for Field-Programmable Gate Array (FPGA) programming is becoming a practical alternative to the well-established VHDL and Verilog languages. This paper describes a case study in the use of HLS tools to design FPGA-based data acquisition systems (DAQ). We will present the implementation of the CERN CMS detector ECAL Data Concentrator Card (DCC) functionality in HLS and lessons learned from using the HLS design flow. The DCC functionality and a definition of the initial system-level performance requirements (latency, bandwidth, and throughput) will be presented. We will describe how its packet-processing, control-centric algorithm was implemented with the VHDL and Verilog languages. We will then show how the HLS flow could speed up design-space exploration by providing loose coupling between function interface design and function algorithm implementation. We conclude with results of real-life hardware tests performed with the HLS flow-generated design with a DCC Tester system.
A New FPGA Architecture of FAST and BRIEF Algorithm for On-Board Corner Detection and Matching.
Huang, Jingjin; Zhou, Guoqing; Zhou, Xiang; Zhang, Rongting
2018-03-28
Although some researchers have proposed Field Programmable Gate Array (FPGA) architectures for the Features from Accelerated Segment Test (FAST) and Binary Robust Independent Elementary Features (BRIEF) algorithms, these traditional architectures do not consider image data storage, so no image data can be reused by follow-up algorithms. This paper proposes a new FPGA architecture that considers the reuse of sub-image data. In the proposed architecture, a remainder-based method is first designed for reading the sub-image, and a FAST detector and a BRIEF descriptor are combined for corner detection and matching. Six pairs of satellite images with different textures, located in the Mentougou district, Beijing, China, are used to evaluate the performance of the proposed architecture. The ModelSim simulation results show that: (i) the proposed architecture is effective for sub-image reading from DDR3 at a minimum cost; (ii) the FPGA implementation is correct and efficient for corner detection and matching: the average matching rates for natural areas and artificial areas are approximately 67% and 83%, respectively, which are close to those obtained on a PC, and the FPGA processing speed is approximately 31 and 2.5 times faster than PC processing and GPU processing, respectively.
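BRIEF descriptors are bit strings, so matching reduces to finding the candidate with the smallest Hamming distance, an operation that maps well onto XOR and population-count logic. The sketch below uses 8-bit descriptors for readability (real BRIEF descriptors are much longer); the function names and the distance limit are assumptions.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(left, right, max_distance):
    """Nearest-neighbour matching; returns (left_index, right_index, distance)."""
    matches = []
    for i, d in enumerate(left):
        j, dist = min(
            ((j, hamming(d, e)) for j, e in enumerate(right)),
            key=lambda pair: pair[1],
        )
        if dist <= max_distance:
            matches.append((i, j, dist))
    return matches


# Hypothetical 8-bit descriptors from two images.
left = [0b10110010, 0b01001101]
right = [0b01001111, 0b10110011, 0b11111111]
matches = match_descriptors(left, right, max_distance=2)
```

A matching rate such as the 67% and 83% figures quoted above would then be the fraction of these matches that are geometrically correct.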
FPGA Sequencer for Radar Altimeter Applications
NASA Technical Reports Server (NTRS)
Berkun, Andrew C.; Pollard, Brian D.; Chen, Curtis W.
2011-01-01
A sequencer for a radar altimeter provides accurate altitude information for a reliable soft landing of the Mars Science Laboratory (MSL). This is a field-programmable gate array (FPGA)-only implementation. A table loaded externally into the FPGA controls timing, processing, and decision structures. The radar is memory-less and does not use previous acquisitions to assist in the current acquisition. All cycles complete in exactly 50 milliseconds, regardless of range or whether a target was found. A RAM (random access memory) within the FPGA holds instructions for up to 15 sets. For each set, timing is run, echoes are processed, and a comparison is made. If a target is seen, more detailed processing is run on that set. If no target is seen, the next set is tried. When all sets have been run, the FPGA terminates and waits for the next 50-millisecond event. This setup simplifies testing and improves reliability. A single Virtex chip does the work of an entire assembly. Output products require minor processing to become range and velocity. This technology is the heart of the Terminal Descent Sensor, which is an integral part of the Entry, Descent, and Landing system for MSL. In addition, it is a strong candidate for manned landings on Mars or the Moon.
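The table-driven control flow described above can be sketched as a loop over instruction sets. This is an illustration under assumptions: the field names and stand-in functions are hypothetical, and stopping at the first set that shows a target is one reading of the abstract, which does not say whether later sets are skipped.

```python
def run_cycle(instruction_sets, process_echo, detail_process):
    """Run one cycle over the table of instruction sets.

    Returns (index, product) for the first set that shows a target, or
    (None, None) when no set does.
    """
    if len(instruction_sets) > 15:
        raise ValueError("the RAM described holds at most 15 sets")
    for index, params in enumerate(instruction_sets):
        power, threshold = process_echo(params)
        if power > threshold:                      # comparison step
            return index, detail_process(params)   # target seen
    return None, None                              # no target in any set


# Hypothetical stand-ins for the hardware stages.
def fake_echo(params):
    return params["echo_power"], params["threshold"]


def fake_detail(params):
    return {"range_gate": params["range_gate"]}


table = [
    {"range_gate": 1, "echo_power": 0.2, "threshold": 1.0},
    {"range_gate": 2, "echo_power": 3.5, "threshold": 1.0},
    {"range_gate": 3, "echo_power": 0.1, "threshold": 1.0},
]
hit_index, product = run_cycle(table, fake_echo, fake_detail)
```

Because the loop keeps no state between calls, each cycle is independent of the last, which mirrors the memory-less behaviour that the abstract credits with simpler testing.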
Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam
2013-01-01
Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness. PMID:23867746
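For orientation, the standard Hough voting scheme that the parallel version builds on can be sketched as follows. This is the textbook sequential algorithm, not the PHT architecture proposed in the paper; the angular resolution and the test points are assumptions.

```python
import math


def hough_peak(points, theta_steps=180):
    """Vote in (theta, rho) space and return the strongest line.

    Returns (theta_degrees, rho, votes) for rho = x*cos(theta) + y*sin(theta).
    """
    votes = {}
    for x, y in points:
        for k in range(theta_steps):   # each angle is independent of the others
            theta = math.pi * k / theta_steps
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(k, rho)] = votes.get((k, rho), 0) + 1
    (k, rho), count = max(votes.items(), key=lambda item: item[1])
    return k * 180.0 / theta_steps, rho, count


# Hypothetical edge pixels lying on the vertical line x = 5.
points = [(5, y) for y in range(100)]
theta_deg, rho, count = hough_peak(points)
```

The inner loop over angles has no data dependencies between iterations, which is the property that angle-level parallelism in hardware exploits.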
Fly-By-Light/Power-By-Wire Fault-Tolerant Fiber-Optic Backplane
NASA Technical Reports Server (NTRS)
Malekpour, Mahyar R.
2002-01-01
The design and development of a fault-tolerant fiber-optic backplane to demonstrate the feasibility of such an architecture is presented. The simulation results of test cases on the backplane in the event of induced faults are presented, and the fault recovery capability of the architecture is demonstrated. The architecture was designed, developed, and implemented using the Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (VHDL). The architecture was synthesized and implemented in hardware using Field Programmable Gate Arrays (FPGA) on multiple prototype boards.
Computing Models for FPGA-Based Accelerators
Herbordt, Martin C.; Gu, Yongfeng; VanCourt, Tom; Model, Josh; Sukhwani, Bharat; Chiu, Matt
2011-01-01
Field-programmable gate arrays are widely considered as accelerators for compute-intensive applications. A critical phase of FPGA application development is finding and mapping to the appropriate computing model. FPGA computing enables models with highly flexible fine-grained parallelism and associative operations such as broadcast and collective response. Several case studies demonstrate the effectiveness of using these computing models in developing FPGA applications for molecular modeling. PMID:21603152
Logic design and implementation of FPGA for a high frame rate ultrasound imaging system
NASA Astrophysics Data System (ADS)
Liu, Anjun; Wang, Jing; Lu, Jian-Yu
2002-05-01
Recently, a method has been developed for high frame rate medical imaging [Jian-yu Lu, "2D and 3D high frame rate imaging with limited diffraction beams," IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44(4), 839-856 (1997)]. To realize this method, a complicated system [multiple-channel simultaneous data acquisition, large memory in each channel for storing up to 16 seconds of data at 40 MHz and 12-bit resolution, time-variable-gain (TGC) control, Doppler imaging, harmonic imaging, as well as coded transmissions] is designed. Due to the complexity of the system, a field programmable gate array (FPGA) (Xilinx Spartan II) is used. In this presentation, the design and implementation of the FPGA for the system will be reported. This includes the synchronous dynamic random access memory (SDRAM) controller and other system controllers, time sharing for auto-refresh of SDRAMs to reduce peak power, transmission and imaging modality selections, ECG data acquisition and synchronization, a 160 MHz delay locked loop (DLL) for accurate timing, and data transfer via either a parallel port or a PCI bus for post image processing. [Work supported in part by Grant 5RO1 HL60301 from NIH.]
Floating-Point Units and Algorithms for field-programmable gate arrays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Underwood, Keith D.; Hemmert, K. Scott
2005-11-01
The software that we are attempting to copyright is a package of floating-point unit descriptions and example algorithm implementations using those units for use in FPGAs. The floating-point units are best-in-class implementations of add, multiply, divide, and square root floating-point operations. The algorithm implementations are sample (not highly flexible) implementations of FFT, matrix multiply, matrix vector multiply, and dot product. Together, one could think of the collection as an implementation of parts of the BLAS library or something similar to the FFTW packages (without the flexibility) for FPGAs. Results from this work have been published multiple times and we are working on a publication to discuss the techniques we use to implement the floating-point units. For some more background, FPGAs are programmable hardware. "Programs" for this hardware are typically created using a hardware description language (examples include Verilog, VHDL, and JHDL). Our floating-point unit descriptions are written in JHDL, which allows them to include placement constraints that make them highly optimized relative to some other implementations of floating-point units. Many vendors (Nallatech from the UK, SRC Computers in the US) have similar implementations, but our implementations seem to be somewhat higher performance. Our algorithm implementations are written in VHDL and models of the floating-point units are provided in VHDL as well. FPGA "programs" make multiple "calls" (hardware instantiations) to libraries of intellectual property (IP), such as the floating-point unit library described here. These programs are then compiled using a tool called a synthesizer (such as a tool from Synplicity, Inc.). The compiled file is a netlist of gates and flip-flops. This netlist is then mapped to a particular type of FPGA by a mapper and then a place-and-route tool. These tools assign the gates in the netlist to specific locations on the specific type of FPGA chip used and construct the required routes between them. The result is a "bitstream" that is analogous to a compiled binary. The bitstream is loaded into the FPGA to create a specific hardware configuration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Szadkowski, Zbigniew
2015-07-01
The paper presents the first results from the trigger based on the Discrete Cosine Transform (DCT) operating in the new Front-End Boards with Cyclone V FPGA deployed in 8 test surface detectors in the Pierre Auger Engineering Array. The patterns of the ADC traces generated by very inclined showers were obtained from the Auger database and from the CORSIKA simulation package, followed by the Auger Offline reconstruction platform, which gives predicted digitized signal profiles. Simulations for many variants of the initial angle of the shower, initialization depth in the atmosphere, type of particle and its initial energy gave the boundaries of the DCT coefficients that are then used for on-line pattern recognition in the FPGA. Preliminary results have proven the approach right. We registered several showers triggered by the DCT for 120 MSps and 160 MSps. (authors)
RHrFPGA Radiation-Hardened Re-programmable Field-Programmable Gate Array
NASA Technical Reports Server (NTRS)
Sanders, A. B.; LaBel, K. A.; McCabe, J. F.; Gardner, G. A.; Lintz, J.; Ross, C.; Golke, K.; Burns, B.; Carts, M. A.; Kim, H. S.
2004-01-01
Viewgraphs on the development of the Radiation-Hardened Re-programmable Field-Programmable Gate Array (RHrFPGA) are presented. The topics include: 1) Radiation Test Suite; 2) Testing Interface; 3) Test Configuration; 4) Facilities; 5) Test Programs; 6) Test Procedure; and 7) Test Results. A summary of heavy ion and proton testing is also included.
ERIC Educational Resources Information Center
Meyer-Base, U.; Vera, A.; Meyer-Base, A.; Pattichis, M. S.; Perry, R. J.
2010-01-01
In this paper, an innovative educational approach to introducing undergraduates to both digital signal processing (DSP) and field programmable gate array (FPGA)-based design in a one-semester course and laboratory is described. While both DSP and FPGA-based courses are currently present in different curricula, this integrated approach reduces the…
Reconfigurable signal processor designs for advanced digital array radar systems
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan (Rockee); Yu, Xining
2017-05-01
The new challenges originating from Digital Array Radar (DAR) demand a new generation of reconfigurable backend processors in the system. The new FPGA devices can support much higher speed, more bandwidth and more processing capability for the needs of the digital Line Replaceable Unit (LRU). This study focuses on using the latest Altera and Xilinx devices in an adaptive beamforming processor. The field reprogrammable RF devices from Analog Devices are used as analog front end transceivers. Different from other existing Software-Defined Radio transceivers on the market, this processor is designed for distributed adaptive beamforming in a networked environment. The following aspects of the novel radar processor will be presented: (1) a new system-on-chip architecture based on Altera's devices and an adaptive processing module, especially for adaptive beamforming and pulse compression; (2) successful implementation of generation 2 serial RapidIO data links on FPGA, which support the VITA-49 radio packet format for large distributed DAR processing; (3) demonstration of the feasibility and capabilities of the processor in a Micro-TCA based, SRIO switching backplane to support multichannel beamforming in real-time; (4) application of this processor in ongoing radar system development projects, including OU's dual-polarized digital array radar, the planned new cylindrical array radars, and future airborne radars.
Dynamically Reconfigurable Systolic Array Accelerators
NASA Technical Reports Server (NTRS)
Dasu, Aravind (Inventor); Barnes, Robert C. (Inventor)
2014-01-01
A polymorphic systolic array framework works in conjunction with an embedded microprocessor on an FPGA and allows for dynamic and complementary scaling of the acceleration levels of two algorithms active concurrently on the FPGA. Use is made of systolic arrays and hardware-software co-design to obtain an efficient multi-application acceleration system. The flexible and simple framework allows hosting of a broader range of algorithms and is extendable to more complex applications in the area of aerospace embedded systems.
1998-04-01
selected is statistically based on the total number of faults and the failure rate distribution in the system under test. The fault set is also...implemented the BPM and system level emulation consolidation logic as well as statistics counters for cache misses and various bus transactions. These...instruction F22 Advanced Tactical Fighter FET Field Effect Transistor FF Flip-Flop FM Failures/Million hours FPGA Field Programmable Gate Array GET
NASA Astrophysics Data System (ADS)
Wang, Su-Yin; Wu, Jinyuan; Yao, Shi-Hong; Chang, Wen-Chen
2014-12-01
We developed a field-programmable gate array (FPGA) TDC module for the tracking detectors of the Fermilab SeaQuest (E906) experiment, including drift chambers, proportional tubes, and hodoscopes. This 64-channel TDC module had a 6U VMEbus form factor and was equipped with a low-power, radiation-hardened Microsemi ProASIC3 Flash-based FPGA. The design of the new FPGA firmware (Run2-TDC) aimed to reduce the data volume and data acquisition (DAQ) deadtime. The firmware digitized multiple input hits of both polarities while allowing users to turn on a multiple-hit elimination logic to remove after-pulses in the wire chambers and proportional tubes. A scaler was implemented in the firmware to allow for recording the number of hits in each channel. The TDC resolution was determined by an internal cell delay of 450 ps. A measurement precision of 200 ps was achieved. We used five kinds of tests to ensure the qualification of 93 TDCs in mass production. We utilized the external wave union launcher in our test to improve the TDC's measurement precision and also to illustrate how to construct the Wave Union TDC using an existing multi-hit TDC without modifying its firmware. Measurement precision was improved by a factor of about two (108 ps) based on the four-edge wave union. Better measurement precision (69 ps) was achieved by combining the approaches of Wave Union TDC and multiple-channel ganging.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters.
Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Serrano, Alejandro; Godoy, Jorge; Martínez-Álvarez, Antonio; Villagra, Jorge
2017-11-11
Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle.
NASA Astrophysics Data System (ADS)
Mellal, Idir; Laghrouche, Mourad; Bui, Hung Tien
2017-04-01
This paper describes a non-invasive system for respiratory monitoring using a Micro Electro Mechanical Systems (MEMS) flow sensor and an IMU (Inertial Measurement Unit) accelerometer. The designed system is intended to be wearable and used in a hospital or at home to assist people with respiratory disorders. To ensure the accuracy of our system, we proposed a calibration method based on an ANN (Artificial Neural Network) to compensate for the temperature drift of the silicon flow sensor. The sigmoid activation functions used in the ANN model were computed with the CORDIC (COordinate Rotation DIgital Computer) algorithm. This algorithm was also used to estimate the tilt angle in body position. The design was implemented on a reconfigurable FPGA platform.
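As a rough illustration of how CORDIC can recover a tilt angle from two accelerometer components, the sketch below implements the standard circular vectoring mode in Python. It is a generic textbook version under stated assumptions: the abstract does not say which CORDIC mode, word length or axis convention the authors used, and the name `cordic_atan2` is hypothetical. Floating point is used for readability; hardware would use fixed-point shifts and adds.

```python
import math

# Elementary rotation angles atan(2^-i); in hardware these sit in a small ROM.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(16)]

def cordic_atan2(y, x):
    """Vectoring-mode CORDIC: drive y towards zero and accumulate the angle.

    Valid for x > 0 (right half-plane). Each step needs only a shift
    (modelled here as a multiply by 2^-i) and additions.
    """
    angle = 0.0
    for i, step_angle in enumerate(ATAN_TABLE):
        shift = 2.0 ** -i
        if y > 0:
            # Rotate clockwise by atan(2^-i).
            x, y, angle = x + y * shift, y - x * shift, angle + step_angle
        else:
            # Rotate counter-clockwise by atan(2^-i).
            x, y, angle = x - y * shift, y + x * shift, angle - step_angle
    return angle

# Tilt from two gravity components (illustrative values for a 30 degree tilt).
tilt_rad = cordic_atan2(0.5, math.sqrt(3) / 2)
```

With 16 iterations the residual angle error is bounded by roughly atan(2^-15), about 3e-5 rad, so the iteration count directly trades hardware latency against angular precision.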
Reconfigurable Gabor Filter For Fingerprint Recognition Using FPGA Verilog
NASA Astrophysics Data System (ADS)
Rosshidi, H. T.; Hadi, A. R.
2009-06-01
This paper presents the implementation of a Gabor filter for fingerprint recognition using Verilog HDL. This work demonstrates the application of the Gabor filter technique to enhance the fingerprint image. The incoming signal, in the form of image pixels, is convolved with the Gabor filter to define the ridge and valley regions of the fingerprint. This is done with a real-time convolver based on a Field Programmable Gate Array (FPGA) that performs the convolution operation. The main characteristic of the proposed approach is the use of memory to store the incoming image pixels and the coefficients of the Gabor filter before the convolution takes place. The result is the signal convolved with the Gabor coefficients.
NASA Technical Reports Server (NTRS)
Arnold, Jeffrey M.; Buell, Duncan A.; Kleinfelder, Walter J.
1993-01-01
Splash 2 is an attached processor system for Sun SPARC 2 workstations that uses Xilinx 4010 Field Programmable Gate Arrays (FPGAs) as its processing elements. The purpose of this paper is to describe Splash 2. The predecessor system, Splash 1, was designed to be used as a systolic processing system. Although it was very successful in that mode, there were many other applications that were not systolic, but which were successful, nonetheless, on Splash 1, or that were not implemented successfully due to one or more architectural limitations, most notably I/O bandwidth and interprocessor communication. Although other uses to increase computational performance have been found for the Xilinx FPGAs that are Splash's processing elements, Splash is unique in its goal to be programmable in a general sense.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fernandes, Ana; Pereira, Rita C.; Sousa, Jorge
The Instituto de Plasmas e Fusao Nuclear (IPFN) has developed dedicated re-configurable modules based on field programmable gate array (FPGA) devices for several nuclear fusion machines worldwide. Moreover, new Advanced Telecommunication Computing Architecture (ATCA) based modules developed by IPFN are already included in the ITER catalogue. One of the requirements for re-configurable modules operating in future nuclear environments including ITER is the remote update capability. Accordingly, this work presents an alternative method for FPGA remote programming to be implemented in new ATCA based re-configurable modules. FPGAs are volatile devices and their programming code is usually stored in dedicated flash memories for proper configuration during module power-on. The presented method is capable of storing new FPGA codes in Serial Peripheral Interface (SPI) flash memories using the PCIexpress (PCIe) network established on the ATCA back-plane, linking data acquisition endpoints and the data switch blades. The method is based on the Xilinx Quick Boot application note, adapted to the PCIe protocol and ATCA based modules. (authors)
NASA Astrophysics Data System (ADS)
Oztekin, Halit; Temurtas, Feyzullah; Gulbag, Ali
Arithmetic and Logic Unit (ALU) design is one of the important topics in the Computer Architecture and Organization course in Computer and Electrical Engineering departments. Some ALU designs used as educational tools are non-modular in nature. As programmable logic technology has developed rapidly, it has become feasible to implement an ALU design based on a Field Programmable Gate Array (FPGA) in this course. In this paper, we adopt a modular approach to FPGA-based ALU design. All the modules in the ALU design are realized as schematics on Altera's Cyclone II development board. Under this model, the ALU is divided into four distinct modules: an arithmetic unit (excluding multiplication and division operations), a logic unit, a multiplication unit and a division unit. Because the approach is modular, the user can easily design an ALU of any size. This approach was then applied to the microcomputer architecture design named BZK.SAU.FPGA10.0 in place of its current ALU.
NASA Astrophysics Data System (ADS)
Xu, Zhipeng; Wei, Jun; Li, Jianwei; Zhou, Qianting
2010-11-01
An image spectrometer of a spatial remote sensing satellite requires the shortwave band from 2.1 μm to 3 μm, which is one of the most important bands in remote sensing. We designed an infrared sub-system of the image spectrometer using a homemade 640x1 InGaAs shortwave infrared sensor working in a focal plane array (FPA) system, which requires high uniformity and a low level of dark current. The working temperature should be -15 ± 0.2 °C. This paper studies the noise model of the FPA system, investigates the relationship between temperature and dark current noise, and adopts an incremental PID algorithm to generate a PWM wave in order to control the temperature of the sensor. The FPGA design is composed of four modules, all coded in VHDL and implemented in an APA300 FPGA device. Experiments show that the intelligent temperature control system succeeds in controlling the temperature of the sensor.
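The incremental (velocity-form) PID mentioned in the abstract can be sketched as follows. This is the generic textbook form driving a PWM duty cycle, written in Python for readability; the class name, gains and the 0..1 duty-cycle clamp are assumptions for illustration and are not taken from the paper's VHDL design.

```python
class IncrementalPID:
    """Velocity-form PID: each update adds a correction to the previous output.

    delta_u = Kp*(e[k] - e[k-1]) + Ki*e[k] + Kd*(e[k] - 2*e[k-1] + e[k-2])
    """

    def __init__(self, kp, ki, kd, out_min=0.0, out_max=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.e1 = 0.0      # error at step k-1
        self.e2 = 0.0      # error at step k-2
        self.output = 0.0  # PWM duty cycle (assumed range 0..1)

    def update(self, setpoint, measured):
        e = setpoint - measured
        delta = (self.kp * (e - self.e1)
                 + self.ki * e
                 + self.kd * (e - 2.0 * self.e1 + self.e2))
        # Clamp to the valid duty-cycle range; no separate integrator state
        # exists, so there is nothing else to unwind after saturation.
        self.output = min(max(self.output + delta, self.out_min), self.out_max)
        self.e2, self.e1 = self.e1, e
        return self.output

# Hypothetical gains; a constant error of 1.0 ramps the duty cycle upwards.
pid = IncrementalPID(kp=0.5, ki=0.1, kd=0.0)
duties = [pid.update(1.0, 0.0) for _ in range(6)]
```

The incremental form only stores the two previous errors and the last output, which keeps the register count small in a hardware implementation.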
SpaceCube: A Family of Reconfigurable Hybrid On-Board Science Data Processors
NASA Technical Reports Server (NTRS)
Flatley, Thomas P.
2015-01-01
SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, event/feature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected events/features in the instrument data stream.
STRS Compliant FPGA Waveform Development
NASA Technical Reports Server (NTRS)
Nappier, Jennifer; Downey, Joseph
2008-01-01
The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. Current standards were researched and new standard interfaces were proposed. The implementation of the proposed standard interfaces on a laboratory breadboard SDR will be presented.
FPGA-Based Optical Cavity Phase Stabilization for Coherent Pulse Stacking
Xu, Yilun; Wilcox, Russell; Byrd, John; ...
2017-11-20
Coherent pulse stacking (CPS) is a new time-domain coherent addition technique that stacks several optical pulses into a single output pulse, enabling high pulse energy from fiber lasers. We develop a robust, scalable, and distributed digital control system with firmware and software integration for algorithms, to support the CPS application. We model CPS as a digital filter in the Z domain and implement a pulse-pattern-based cavity phase detection algorithm on a field-programmable gate array (FPGA). A two-stage (2+1 cavities) 15-pulse stacking system achieves an 11.0 peak-power enhancement factor. Each optical cavity is fed back at 1.5 kHz, and stabilized at an individually-prescribed round-trip phase with 0.7 deg and 2.1 deg rms phase errors for Stages 1 and 2, respectively. Optical cavity phase control with nanometer accuracy ensures 1.2% intensity stability of the stacked pulse over 12 h. The FPGA-based feedback control system can be scaled to large numbers of optical cavities.
FPGA-Based Optical Cavity Phase Stabilization for Coherent Pulse Stacking
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Yilun; Wilcox, Russell; Byrd, John
Coherent pulse stacking (CPS) is a new time-domain coherent addition technique that stacks several optical pulses into a single output pulse, enabling high pulse energy from fiber lasers. We develop a robust, scalable, and distributed digital control system with firmware and software integration for algorithms, to support the CPS application. We model CPS as a digital filter in the Z domain and implement a pulse-pattern-based cavity phase detection algorithm on a field-programmable gate array (FPGA). A two-stage (2+1 cavities) 15-pulse stacking system achieves an 11.0 peak-power enhancement factor. Each optical cavity is fed back at 1.5 kHz, and stabilized at an individually-prescribed round-trip phase with 0.7 deg and 2.1 deg rms phase errors for Stages 1 and 2, respectively. Optical cavity phase control with nanometer accuracy ensures 1.2% intensity stability of the stacked pulse over 12 h. The FPGA-based feedback control system can be scaled to large numbers of optical cavities.
Design for Review - Applying Lessons Learned to Improve the FPGA Review Process
NASA Technical Reports Server (NTRS)
Figueiredo, Marco A.; Li, Kenneth E.
2014-01-01
Flight Field Programmable Gate Array (FPGA) designs are required to be independently reviewed. This paper provides recommendations to Flight FPGA designers to properly prepare their designs for review in order to facilitate the review process, and reduce the impact of the review time in the overall project schedule.
NASA Technical Reports Server (NTRS)
Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.
2017-01-01
Today's launch vehicles' complex electronic and avionics systems heavily utilize Field Programmable Gate Array (FPGA) integrated circuits (ICs) for their superb speed and reconfiguration capabilities. Consequently, FPGAs are prevalent ICs in communication protocols such as MIL-STD-1553B and in control signal commands such as solenoid valve actuations. This paper will identify reliability concerns and high level guidelines to estimate FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming
2011-11-01
Laser speckle imaging (LSI) is a noninvasive and full-field optical imaging technique which produces two-dimensional blood flow maps of tissues from the raw laser speckle images captured by a CCD camera without scanning. We present a hardware-friendly algorithm for the real-time processing of laser speckle imaging. The algorithm is developed and optimized specifically for LSI processing in the field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in FPGA. The pipeline processing scheme and parallel computing architecture are introduced into the design of this LSI hardware processor. When the LSI hardware processor is implemented in the FPGA running at the maximum frequency of 130 MHz, up to 85 raw images with the resolution of 640×480 pixels can be processed per second. Meanwhile, we also present a system on chip (SOC) solution for LSI processing by integrating the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution also can be used to produce an application specific integrated circuit for LSI processing.
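The quantity usually computed in LSI is the spatial speckle contrast K = sigma / mean over a small sliding window. The Python sketch below shows that computation using only a running sum and a running sum of squares, a form that maps naturally onto pipelined hardware. It is a minimal illustration under stated assumptions: the abstract does not spell out the authors' hardware-friendly algorithm, and the function name, the window size and the use of the population variance are choices made here.

```python
import math

def speckle_contrast(image, window=5):
    """Spatial speckle contrast K = sigma / mean for each window x window block.

    `image` is a list of rows of pixel intensities. The variance is computed
    as E[I^2] - E[I]^2 (population variance), so only two accumulators are
    needed per window.
    """
    h, w = len(image), len(image[0])
    n = window * window
    out = []
    for r in range(h - window + 1):
        row = []
        for c in range(w - window + 1):
            s = 0   # sum of intensities in the window
            s2 = 0  # sum of squared intensities in the window
            for dr in range(window):
                for dc in range(window):
                    v = image[r + dr][c + dc]
                    s += v
                    s2 += v * v
            mean = s / n
            var = max(s2 / n - mean * mean, 0.0)  # guard against rounding
            row.append(math.sqrt(var) / mean if mean > 0 else 0.0)
        out.append(row)
    return out

# A perfectly uniform frame has zero contrast everywhere.
flat = [[100] * 8 for _ in range(8)]
k_map = speckle_contrast(flat, window=5)
```

Lower contrast values correspond to more blurring of the speckle pattern by moving scatterers, which is how the contrast map is related to blood flow.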
V&V Plan for FPGA-based ESF-CCS Using System Engineering Approach.
NASA Astrophysics Data System (ADS)
Maerani, Restu; Mayaka, Joyce; El Akrat, Mohamed; Cheon, Jung Jae
2018-02-01
Instrumentation and Control (I&C) systems play an important role in maintaining the safety of Nuclear Power Plant (NPP) operation. However, most current I&C safety systems are based on Programmable Logic Controller (PLC) hardware, which is difficult to verify and validate, and is susceptible to software common cause failure. Therefore, a plan for the replacement of the PLC-based safety systems, such as the Engineered Safety Feature - Component Control System (ESF-CCS), with Field Programmable Gate Arrays (FPGA) is needed. By using a systems engineering approach, which ensures traceability in every phase of the life cycle, from system requirements and design implementation to verification and validation, the system development is guaranteed to be in line with the regulatory requirements. The verification process will ensure that the customer's and stakeholders' needs are satisfied in a high quality, trustworthy, cost efficient and schedule compliant manner throughout a system's entire life cycle. The benefit of the V&V plan is to ensure that the FPGA-based ESF-CCS is correctly built, and that the measured performance indicators give positive feedback on the question "are we doing the right thing?" during the re-engineering process of the FPGA-based ESF-CCS.
2017-03-20
sub-array, which is based on all-pass filters (APFs) is realized using 130 nm CMOS technology. Approximate-discrete Fourier transform (a-DFT...fixed beams are directed at known directions [9]. The proposed approximate-discrete Fourier transform (a-DFT) based multi-beamformer [9] yields L...to digital conversion daughter board. occurs in the discrete time domain (in ROACH-2 FPGA platform) following signal digitization (see Figs. 1(d) and
An acceleration framework for synthetic aperture radar algorithms
NASA Astrophysics Data System (ADS)
Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.
2017-04-01
Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR) are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
Subnanosecond time-to-digital converter implemented in a Kintex-7 FPGA
NASA Astrophysics Data System (ADS)
Sano, Y.; Horii, Y.; Ikeno, M.; Sasaki, O.; Tomoto, M.; Uchida, T.
2017-12-01
Time-to-digital converters (TDCs) are used in various fields, including high-energy physics. One advantage of implementing TDCs in field-programmable gate arrays (FPGAs) is the flexibility on the modification of the logics, which is useful to cope with the changes in the experimental conditions. Recent FPGAs make it possible to implement TDCs with a time resolution less than 10 ps. On the other hand, various drift chambers require a time resolution of O(0.1) ns, and a simple and easy-to-implement TDC is useful for a robust operation. Herein an eight-channel TDC with a variable bin size down to 0.28 ns is implemented in a Xilinx Kintex-7 FPGA and tested. The TDC is based on a multisampling scheme with quad phase clocks synchronised with an external reference clock. Calibration of the bin size is unnecessary if a stable reference clock is available, which is common in high-energy physics experiments. Depending on the channel, the standard deviation of the differential nonlinearity for a 0.28 ns bin size is 0.13-0.31. The performance has a negligible dependence on the temperature. The power consumption and the potential to extend the number of channels are also discussed.
Introduction to FPGA Devices and The Challenges for Critical Application - A User's Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Kenneth
2015-01-01
This presentation is an introduction to Field Programmable Gate Array (FPGA) devices and the challenges of critical application including: safety, reliability, availability, recoverability, and security.
Design of an FPGA-Based Algorithm for Real-Time Solutions of Statistics-Based Positioning
DeWitt, Don; Johnson-Williams, Nathan G.; Miyaoka, Robert S.; Li, Xiaoli; Lockhart, Cate; Lewellen, Tom K.; Hauck, Scott
2010-01-01
We report on the implementation of an algorithm and hardware platform to allow real-time processing of the statistics-based positioning (SBP) method for continuous miniature crystal element (cMiCE) detectors. The SBP method allows an intrinsic spatial resolution of ~1.6 mm FWHM to be achieved using our cMiCE design. Previous SBP solutions have required a postprocessing procedure due to the computation and memory intensive nature of SBP. This new implementation takes advantage of a combination of algebraic simplifications, conversion to fixed-point math, and a hierarchical search technique to greatly accelerate the algorithm. For the presented seven stage, 127 × 127 bin LUT implementation, these algorithm improvements result in a reduction from > 7 × 10^6 floating-point operations per event for an exhaustive search to < 5 × 10^3 integer operations per event. Simulations show nearly identical FWHM positioning resolution for this accelerated SBP solution, and positioning differences of < 0.1 mm from the exhaustive search solution. A pipelined field programmable gate array (FPGA) implementation of this optimized algorithm is able to process events in excess of 250 K events per second, which is greater than the maximum expected coincidence rate for an individual detector. In contrast with all detectors being processed at a centralized host, as in the current system, a separate FPGA is available at each detector, thus dividing the computational load. These methods allow SBP results to be calculated in real-time and to be presented to the image generation components in real-time. A hardware implementation has been developed using a commercially available prototype board. PMID:21197135
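The saving from a hierarchical search over an exhaustive one can be illustrated with a generic coarse-to-fine grid search. The Python sketch below is not the authors' seven-stage LUT scheme; it is a simplified stand-in that assumes a single-peaked score surface, uses six halving stages, and relies on a hypothetical `score(x, y)` callback in place of the SBP likelihood.

```python
def hierarchical_search(score, size=127):
    """Coarse-to-fine maximisation of score(x, y) on a size x size grid.

    At each stage the 3 x 3 neighbourhood around the current best bin is
    evaluated at the current step, then the step is halved.
    Returns ((x, y), number_of_score_evaluations).
    """
    cx = cy = size // 2          # start at the centre bin (63 for size 127)
    step = (size + 1) // 4       # 32 for size 127, then 16, 8, 4, 2, 1
    evaluations = 0
    while step >= 1:
        best = None
        for dx in (-step, 0, step):
            for dy in (-step, 0, step):
                x, y = cx + dx, cy + dy
                if 0 <= x < size and 0 <= y < size:
                    evaluations += 1
                    s = score(x, y)
                    if best is None or s > best[0]:
                        best = (s, x, y)
        _, cx, cy = best
        step //= 2
    return (cx, cy), evaluations

# Toy single-peaked score with its maximum at bin (100, 17).
def toy_score(x, y):
    return -((x - 100) ** 2 + (y - 17) ** 2)

position, n_eval = hierarchical_search(toy_score)
```

In this toy case the search needs at most 54 score evaluations, compared with 127 × 127 = 16129 for an exhaustive scan, which mirrors the kind of reduction in operations per event that the abstract reports.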
NASA Astrophysics Data System (ADS)
Saxena, Shefali; Hawari, Ayman I.
2017-07-01
Digital signal processing techniques have been widely used in radiation spectrometry to provide improved stability and performance with compact physical size over the traditional analog signal processing. In this paper, field-programmable gate array (FPGA)-based adaptive digital pulse shaping techniques are investigated for real-time signal processing. National Instruments (NI) NI 5761 14-bit, 250-MS/s adaptor module is used for digitizing high-purity germanium (HPGe) detector's preamplifier pulses. Digital pulse processing algorithms are implemented on the NI PXIe-7975R reconfigurable FPGA (Kintex-7) using the LabVIEW FPGA module. Based on the time separation between successive input pulses, the adaptive shaping algorithm selects the optimum shaping parameters (rise time and flattop time of trapezoid-shaping filter) for each incoming signal. A digital Sallen-Key low-pass filter is implemented to enhance signal-to-noise ratio and reduce baseline drifting in trapezoid shaping. A recursive trapezoid-shaping filter algorithm is employed for pole-zero compensation of exponentially decayed (with two-decay constants) preamplifier pulses of an HPGe detector. It allows extraction of pulse height information at the beginning of each pulse, thereby reducing the pulse pileup and increasing throughput. The algorithms for RC-CR2 timing filter, baseline restoration, pile-up rejection, and pulse height determination are digitally implemented for radiation spectroscopy. Traditionally, at high-count-rate conditions, a shorter shaping time is preferred to achieve high throughput, which deteriorates energy resolution. In this paper, experimental results are presented for varying count-rate and pulse shaping conditions. Using adaptive shaping, increased throughput is accepted while preserving the energy resolution observed using the longer shaping times.
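As a rough illustration of recursive trapezoid shaping, the sketch below follows the widely used recursive form for a single-exponential input pulse. It is not the two-decay-constant or adaptive variant described in the abstract; the function name `trapezoid_shape` and the parameters `k` (rise time in samples), `l` (rise plus flat-top time in samples) and `tau` (decay constant in samples) are assumptions for illustration.

```python
import math


def trapezoid_shape(v, k, l, tau):
    """Recursive trapezoidal shaper for an exponentially decaying pulse.

    v   : list of preamplifier samples (baseline assumed to be zero)
    k   : rise time of the trapezoid, in samples
    l   : rise time plus flat-top time, in samples (l >= k)
    tau : decay constant of the input pulse, in samples
    The output is normalised so the flat top equals the pulse amplitude.
    """
    m = 1.0 / (math.exp(1.0 / tau) - 1.0)   # pole-zero compensation factor

    def at(i):
        return v[i] if i >= 0 else 0.0

    p = 0.0   # running sum of the delayed differences
    s = 0.0   # output accumulator
    out = []
    for n in range(len(v)):
        d = at(n) - at(n - k) - at(n - l) + at(n - k - l)
        p += d
        s += p + m * d
        out.append(s / (k * (m + 1.0)))
    return out


if __name__ == "__main__":
    pulse = [3.0 * math.exp(-n / 50.0) for n in range(200)]
    shaped = trapezoid_shape(pulse, k=10, l=15, tau=50.0)
    print(shaped[9:15])   # flat-top samples, close to the amplitude 3.0
```

Only additions, subtractions, delays and one constant multiplication are needed per sample, which is why this form maps well onto FPGA logic.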
Neuron array with plastic synapses and programmable dendrites.
Ramakrishnan, Shubha; Wunderlich, Richard; Hasler, Jennifer; George, Suma
2013-10-01
We describe a novel neuromorphic chip architecture that models neurons for efficient computation. Traditional architectures of neuron array chips consist of large scale systems that are interfaced with AER for implementing intra- or inter-chip connectivity. We present a chip that uses AER for inter-chip communication but uses fast, reconfigurable FPGA-style routing with local memory for intra-chip connectivity. We model neurons with biologically realistic channel models, synapses and dendrites. This chip is suitable for small-scale network simulations and can also be used for sequence detection, utilizing directional selectivity properties of dendrites, ultimately for use in word recognition.
2005-12-01
Upsets in SRAM FPGAs,” Military and Aerospace Applications of Programmable Logic Devices, September 2002. 8. Wakerly, John F. “Microcomputer...change. The goal of the Configurable Fault Tolerant Processor (CFTP) Project is to explore, develop and demonstrate the applicability of using off-the...develop and demonstrate the applicability of using commercial-off-the-shelf (COTS) Field Programmable Gate Arrays (FPGA) in the design of
High-speed line-scan camera with digital time delay integration
NASA Astrophysics Data System (ADS)
Bodenstorfer, Ernst; Fürtler, Johannes; Brodersen, Jörg; Mayer, Konrad J.; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert
2007-02-01
Dealing with high-speed image acquisition and processing systems, the speed of operation is often limited by the amount of available light, due to short exposure times. Therefore, high-speed applications often use line-scan cameras, based on charge-coupled device (CCD) sensors with time delayed integration (TDI). Synchronous shift and accumulation of photoelectric charges on the CCD chip - according to the objects' movement - result in a longer effective exposure time without introducing additional motion blur. This paper presents a high-speed color line-scan camera based on a commercial complementary metal oxide semiconductor (CMOS) area image sensor with a Bayer filter matrix and a field programmable gate array (FPGA). The camera implements a digital equivalent to the TDI effect exploited with CCD cameras. The proposed design benefits from the high frame rates of CMOS sensors and from the possibility of arbitrarily addressing the rows of the sensor's pixel array. For the digital TDI just a small number of rows are read out from the area sensor which are then shifted and accumulated according to the movement of the inspected objects. This paper gives a detailed description of the digital TDI algorithm implemented on the FPGA. Relevant aspects for the practical application are discussed and key features of the camera are listed.
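The shift-and-accumulate step of a digital TDI can be sketched as follows. This is an illustration under simplifying assumptions, not the camera's implementation: the object is assumed to move exactly one sensor row per frame, colour (Bayer) handling is ignored, and the name `digital_tdi` and the frame layout are hypothetical.

```python
def digital_tdi(frames, stages):
    """Digital time delay integration by shift-and-accumulate.

    frames[t][r] is the pixel row read from sensor row r at frame t.
    Assumption: the object advances one sensor row per frame, so object
    line j is seen by row r at frame j + r.
    Returns one accumulated line per fully covered object line.
    """
    lines = []
    for j in range(len(frames) - stages + 1):
        acc = [0] * len(frames[j][0])
        for r in range(stages):
            row = frames[j + r][r]           # same object line, later frame
            acc = [a + b for a, b in zip(acc, row)]
        lines.append(acc)
    return lines


if __name__ == "__main__":
    scene = [[1, 2], [3, 4], [5, 6], [7, 8]]   # object lines passing the sensor
    stages = 3
    blank = [0, 0]
    # Build what each of the 3 sensor rows sees at each frame.
    frames = [[scene[t - r] if t - r >= 0 else blank for r in range(stages)]
              for t in range(len(scene))]
    print(digital_tdi(frames, stages))
```

Each output line is the sum of `stages` exposures of the same object line, which lengthens the effective exposure without adding motion blur as long as the one-row-per-frame assumption holds.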
Digitization of Analog Signals using a Field Programmable Gate Array (FPGA)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilera, Daniel; Rusu, Vadim
The idea of this research is to consolidate the electrical components used for capturing data in the Mu2e Tracker. Ideally, an FPGA will serve as both the Time-to-Digital Converters (TDC) and the Analog-to-Digital Converters (ADC). The TDC function is already carried out by the FPGA, but we are still using off-the-shelf ADCs. This poster proposes using Low Voltage Differential Signaling as the basis for analog-to-digital conversion using an FPGA.
Design and implementation of the NaI(Tl)/CsI(Na) detectors output signal generator
NASA Astrophysics Data System (ADS)
Zhou, Xu; Liu, Cong-Zhan; Zhao, Jian-Ling; Zhang, Fei; Zhang, Yi-Fei; Li, Zheng-Wei; Zhang, Shuo; Li, Xu-Fang; Lu, Xue-Feng; Xu, Zhen-Ling; Lu, Fang-Jun
2014-02-01
We designed and implemented a signal generator that can simulate the output of the NaI(Tl)/CsI(Na) detectors' pre-amplifier onboard the Hard X-ray Modulation Telescope (HXMT). Using an FPGA (Field Programmable Gate Array) programmed in VHDL and adding a random component, we produced a double exponential random pulse signal generator. The statistical distribution of the signal amplitude is programmable. The time intervals between adjacent signals follow a negative exponential distribution.
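A software model of such a generator might look like the sketch below. It is illustrative only: the double exponential pulse shape with separate rise and decay constants, the function names, and all parameter values are assumptions, not details taken from the paper. Exponentially distributed intervals between pulses correspond to Poisson-distributed arrivals.

```python
import math
import random


def double_exp_pulse(t, amplitude, tau_rise, tau_decay):
    """Double exponential pulse shape; zero before the pulse starts."""
    if t < 0:
        return 0.0
    return amplitude * (math.exp(-t / tau_decay) - math.exp(-t / tau_rise))


def arrival_times(rate, count, rng):
    """Pulse start times whose spacing follows an exponential distribution."""
    t = 0.0
    times = []
    for _ in range(count):
        t += rng.expovariate(rate)   # mean interval is 1 / rate
        times.append(t)
    return times


if __name__ == "__main__":
    rng = random.Random(1)           # fixed seed for repeatability
    starts = arrival_times(rate=2.0, count=5, rng=rng)
    for t0 in starts:
        # Sample each pulse 0.1 time units after its start.
        print(round(t0, 3), round(double_exp_pulse(0.1, 1.0, 0.05, 1.0), 4))
```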
NASA Technical Reports Server (NTRS)
Al Hassan, Mohammad; Novack, Steven D.; Hatfield, Glen S.; Britton, Paul
2017-01-01
Today's launch vehicles' complex electronic and avionic systems heavily utilize the Field Programmable Gate Array (FPGA) integrated circuit (IC). FPGAs are prevalent ICs in communication protocols such as MIL-STD-1553B, and in control signal commands such as solenoid/servo valve actuations. This paper will demonstrate guidelines to estimate FPGA failure rates for a launch vehicle; the guidelines will account for hardware, firmware, and radiation-induced failures. The hardware contribution of the approach accounts for physical failures of the IC, FPGA memory and clock. The firmware portion will provide guidelines on the high-level FPGA programming language and ways to account for software/code reliability growth. The radiation portion will provide guidelines on environment susceptibility as well as guidelines on tailoring other launch vehicle programs' historical data to a specific launch vehicle.
An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed
NASA Astrophysics Data System (ADS)
Kim, H.; Choi, Y.; Yang, Y.
In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for an adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of a Shack-Hartmann sensor (SHS) and deformable mirror (DM), a tip-tilt sensor (TTS), a tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM of 277 actuators with Fried geometry, requiring an SPS with high speed parallel computing capability. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instruments (NI) real time computer and five FPGA boards based on the state-of-the-art Xilinx Kintex 7 FPGA. Programming is done in NI's LabVIEW environment, providing flexibility when applying different algorithms for WFE correction. It also provides a faster programming and debugging environment compared to conventional ones. One of the five FPGAs is assigned to measure the TTS and calculate control signals for the TTM, while the remaining four are used to receive the SHS signal and calculate the slopes for each subaperture and the correction signal for the DM. With the parallel processing capabilities of the SPS, an overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
Neuromorphic Hardware Architecture Using the Neural Engineering Framework for Pattern Recognition.
Wang, Runchun; Thakur, Chetan Singh; Cohen, Gregory; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, Andre
2017-06-01
We present a hardware architecture that uses the neural engineering framework (NEF) to implement large-scale neural networks on field programmable gate arrays (FPGAs) for performing massively parallel real-time pattern recognition. NEF is a framework that is capable of synthesising large-scale cognitive systems from subnetworks and we have previously presented an FPGA implementation of the NEF that successfully performs nonlinear mathematical computations. That work was developed based on a compact digital neural core, which consists of 64 neurons that are instantiated by a single physical neuron using a time-multiplexing approach. We have now scaled this approach up to build a pattern recognition system by combining identical neural cores together. As a proof of concept, we have developed a handwritten digit recognition system using the MNIST database and achieved a recognition rate of 96.55%. The system is implemented on a state-of-the-art FPGA and can process 5.12 million digits per second. The architecture and hardware optimisations presented offer high-speed and resource-efficient means for performing high-speed, neuromorphic, and massively parallel pattern recognition and classification tasks.
Yang, Fan; Paindavoine, M
2003-01-01
This paper describes a real time vision system that allows us to localize faces in video sequences and verify their identity. These processes are image processing techniques based on the radial basis function (RBF) neural network approach. The robustness of this system has been evaluated quantitatively on eight video sequences. We have adapted our model for an application of face recognition using the Olivetti Research Laboratory (ORL), Cambridge, UK, database so as to compare the performance against other systems. We also describe three hardware implementations of our model on embedded systems based on the field programmable gate array (FPGA), zero instruction set computer (ZISC) chips, and digital signal processor (DSP) TMS320C62, respectively. We analyze the algorithm complexity and present results of hardware implementations in terms of the resources used and processing speed. The success rates of face tracking and identity verification are 92% (FPGA), 85% (ZISC), and 98.2% (DSP), respectively. For the three embedded systems, the processing speeds for images of size 288 × 352 are 14 images/s, 25 images/s, and 4.8 images/s, respectively.
Free-running ADC- and FPGA-based signal processing method for brain PET using GAPD arrays
NASA Astrophysics Data System (ADS)
Hu, Wei; Choi, Yong; Hong, Key Jo; Kang, Jihoon; Jung, Jin Ho; Huh, Youn Suk; Lim, Hyun Keong; Kim, Sang Su; Kim, Byung-Tae; Chung, Yonghyun
2012-02-01
Currently, for most photomultiplier tube (PMT)-based PET systems, constant fraction discriminators (CFD) and time to digital converters (TDC) have been employed to detect gamma ray signal arrival time, whereas Anger logic circuits and peak detection analog-to-digital converters (ADCs) have been implemented to acquire position and energy information of detected events. Compared with PMTs, Geiger-mode avalanche photodiodes (GAPDs) have a variety of advantages, such as compactness, low bias voltage requirement and MRI compatibility. Furthermore, the individual read-out method using a GAPD array coupled 1:1 with an array scintillator can provide better image uniformity than can be achieved using PMT and Anger logic circuits. Recently, a brain PET using 72 GAPD arrays (4×4 array, pixel size: 3 mm×3 mm) coupled 1:1 with LYSO scintillators (4×4 array, pixel size: 3 mm×3 mm×20 mm) has been developed for simultaneous PET/MRI imaging in our laboratory. Eighteen 64:1 position decoder circuits (PDCs) were used to reduce GAPD channel number and three off-the-shelf free-running ADC and field programmable gate array (FPGA) combined data acquisition (DAQ) cards were used for data acquisition and processing. In this study, a free-running ADC- and FPGA-based signal processing method was developed for the detection of gamma ray signal arrival time, energy and position information all together for each GAPD channel. For the method developed herein, three DAQ cards continuously acquired 18 channels of pre-amplified analog gamma ray signals and 108-bit digital addresses from 18 PDCs. In the FPGA, the digitized gamma ray pulses and digital addresses were processed to generate data packages containing pulse arrival time, baseline value, energy value and GAPD channel ID. Finally, these data packages were saved to a 128 Mbyte on-board synchronous dynamic random access memory (SDRAM) and then transferred to a host computer for coincidence sorting and image reconstruction.
In order to evaluate the functionality of the developed signal processing method, energy and timing resolutions for brain PET were measured via the placement of a 6 μCi 22Na point source at the center of the PET scanner. Furthermore, the PET image of the hot rod phantom (rod diameter: from 2.5 mm to 6.5 mm) with activity of 1 mCi was simulated, and then an image acquisition experiment was performed using the brain PET. Measured average energy resolution for 1152 GAPD channels and system timing resolution were 19.5% (FWHM%) and 2.7 ns (FWHM), respectively. With regard to the acquisition of the hot rod phantom image, rods could be resolved down to a diameter of 2.5 mm, which was similar to simulated results. The experimental results demonstrated that the signal processing method developed herein was successfully implemented for brain PET. This reduced the complexity, cost and development time of the PET system relative to normal PET electronics, and it will be useful for the development of high-performance investigational PET systems.
An improved non-uniformity correction algorithm and its hardware implementation on FPGA
NASA Astrophysics Data System (ADS)
Rong, Shenghui; Zhou, Huixin; Wen, Zhigang; Qin, Hanlin; Qian, Kun; Cheng, Kuanhong
2017-09-01
The non-uniformity of Infrared Focal Plane Arrays (IRFPA) severely degrades infrared image quality. An effective non-uniformity correction (NUC) algorithm is necessary for an IRFPA imaging and application system. However, traditional scene-based NUC algorithms suffer from image blurring and artificial ghosting. In addition, few effective hardware platforms have been proposed to implement the corresponding NUC algorithms. Thus, this paper proposes an improved neural-network based NUC algorithm that uses the guided image filter and a projection-based motion detection algorithm. First, the guided image filter is utilized to obtain an accurate desired image and so decrease the artificial ghosting. Then a projection-based motion detection algorithm is utilized to determine whether the correction coefficients should be updated or not. In this way the problem of image blurring can be overcome. Finally, an FPGA-based hardware design is introduced to realize the proposed NUC algorithm. Real and simulated infrared image sequences are utilized to verify the performance of the proposed algorithm. Experimental results indicate that the proposed NUC algorithm can effectively eliminate the fixed pattern noise with less image blurring and artificial ghosting. The proposed hardware design takes fewer logic elements in the FPGA and spends fewer clock cycles to process one frame of image.
NASA Astrophysics Data System (ADS)
Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza
2013-01-01
Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters
Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Godoy, Jorge; Martínez-Álvarez, Antonio
2017-01-01
Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle. PMID:29137137
Design of area array CCD image acquisition and display system based on FPGA
NASA Astrophysics Data System (ADS)
Li, Lei; Zhang, Ning; Li, Tianting; Pan, Yue; Dai, Yuming
2014-09-01
With the development of science and technology, the CCD (charge-coupled device) has been widely applied in various fields and plays an important role in modern sensing systems; a real-time image acquisition and display scheme based on a CCD device is therefore of great significance. This paper introduces an FPGA-based image data acquisition and display system for an area array CCD. Several key technical challenges of the system are analyzed and solutions are put forward. The FPGA works as the core processing unit of the system and controls the overall timing sequence. The ICX285AL area array CCD image sensor produced by Sony Corporation is used in the system. The FPGA drives the area array CCD, and an analog front end (AFE) then processes the CCD image signal, including amplification, filtering, noise elimination, and correlated double sampling (CDS). An AD9945 produced by ADI Corporation converts the analog signal to a digital signal. A Camera Link high-speed data transmission circuit was developed, the PC-end image acquisition software was completed, and real-time display of images was realized. Practical testing indicates that the system is stable and reliable in image acquisition and control, and that the indicators meet the actual project requirements.
NASA Technical Reports Server (NTRS)
Wang, Jih-Jong; Cronquist, Brian E.; McGowan, John E.; Katz, Richard B.
1997-01-01
The goals for a radiation hardened (RAD-HARD) and high reliability (HI-REL) field programmable gate array (FPGA) are described. The first qualified manufacturer list (QML) radiation hardened RH1280 and RH1020 were developed. The total radiation dose and single event effects observed on the antifuse FPGA RH1280 are reported on. Tradeoffs and the limitations in the single event upset hardening are discussed.
An innovative telescope control system architecture for SST-GATE telescopes at the CTA Observatory
NASA Astrophysics Data System (ADS)
Fasola, Gilles; Mignot, Shan; Laporte, Philippe; Abchiche, Abdel; Buchholtz, Gilles; Jégouzo, Isabelle
2014-07-01
SST-GATE (Small Size Telescope - GAmma-ray Telescope Elements) is a 4-metre telescope designed as a prototype for the Small Size Telescopes (SST) of the Cherenkov Telescope Array (CTA), a major facility for the very high energy gamma-ray astronomy of the next three decades. In this 100-telescope array there will be 70 SSTs, involving a design with an industrial view aiming at long-term service, low maintenance effort and reduced costs. More than a prototype, SST-GATE is also a fully functional telescope that shall be usable by scientists and students at the Observatoire de Meudon for 30 years. The Telescope Control System (TCS) is designed to work either as an element of a large array driven by an array controller or in a stand-alone mode with a remote workstation. Hence it is built to be autonomous with versatile interfacing; as an example, pointing and tracking —the main functions of the telescope— are managed onboard, including astronomical transformations, geometrical transformations (e.g. telescope bending model) and drive control. The core hardware is a CompactRIO (cRIO) featuring a real-time operating system and an FPGA. In this paper, we present an overview of the current status of the TCS. We especially focus on three items: the pointing computation implemented in the FPGA of the cRIO —using CORDIC algorithms— since it enables an optimisation of the hardware resources; data flow management based on OPCUA with its specific implementation on the cRIO; and the use of an EtherCAT field-bus for its ability to provide real-time data exchanges with the sensors and actuators distributed throughout the telescope.
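For readers unfamiliar with CORDIC, the sketch below shows the basic rotation-mode iteration for sine and cosine. It is a generic floating-point illustration, not the pointing computation of the TCS; the name `cordic_sin_cos` and the iteration count are assumptions. In an FPGA the multiplications by 2^-i become bit shifts on fixed-point values and the arctangent table is stored in a small ROM.

```python
import math


def cordic_sin_cos(angle, iterations=24):
    """Rotation-mode CORDIC for angles in [-pi/2, pi/2] (radians).

    Returns (sin(angle), cos(angle)) using only add, subtract and
    scale-by-power-of-two steps inside the loop.
    """
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of the micro-rotations, compensated up front.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate towards z == 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]
    return y, x


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```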
Scholze, Stefan; Schiefer, Stefan; Partzsch, Johannes; Hartmann, Stephan; Mayr, Christian Georg; Höppner, Sebastian; Eisenreich, Holger; Henker, Stephan; Vogginger, Bernhard; Schüffny, Rene
2011-01-01
State-of-the-art large-scale neuromorphic systems require sophisticated spike event communication between units of the neural network. We present a high-speed communication infrastructure for a waferscale neuromorphic system, based on application-specific neuromorphic communication ICs in a field programmable gate array (FPGA)-maintained environment. The ICs implement configurable axonal delays, as required for certain types of dynamic processing or for emulating spike-based learning among distant cortical areas. Measurements are presented which show the efficacy of these delays in influencing behavior of neuromorphic benchmarks. The specialized, dedicated address-event-representation communication in most current systems requires separate, low-bandwidth configuration channels. In contrast, the configuration of the waferscale neuromorphic system is also handled by the digital packet-based pulse channel, which transmits configuration data at the full bandwidth otherwise used for pulse transmission. The overall so-called pulse communication subgroup (ICs and FPGA) delivers a factor 25–50 more event transmission rate than other current neuromorphic communication infrastructures. PMID:22016720
Design and implementation of an optical Gaussian noise generator
NASA Astrophysics Data System (ADS)
Zão, Leonardo; Loss, Gustavo; Coelho, Rosângela
2009-08-01
A design of a fast and accurate optical Gaussian noise generator is proposed and demonstrated. The noise sample generation is based on the Box-Muller algorithm. The functions were implemented on a high-speed Altera Stratix EP1S25 field-programmable gate array (FPGA) development kit. It enabled the generation of 150 million 16-bit noise samples per second. The Gaussian noise generator required only 7.4% of the FPGA logic elements, 1.2% of the RAM memory, 0.04% of the ROM memory, and a laser source. The optical pulses were generated by a laser source externally modulated by the data bit samples using the frequency-shift keying technique. The accuracy of the noise samples was evaluated for different sequence sizes and confidence intervals. The noise sample pattern was validated by the Bhattacharyya distance (Bd) and the autocorrelation function. The results showed that the proposed design of the optical Gaussian noise generator is very promising for evaluating the performance of optical communications channels with very low bit-error-rate values.
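The Box-Muller transform itself is short enough to sketch. The version below is a plain floating-point software illustration, not the fixed-point FPGA design from the paper; the function name `box_muller` is an assumption. It maps two independent uniform samples to two independent standard normal samples.

```python
import math
import random


def box_muller(rng):
    """Return two independent standard normal samples from two uniforms."""
    u1 = 1.0 - rng.random()      # in (0, 1], avoids log(0)
    u2 = rng.random()            # in [0, 1)
    radius = math.sqrt(-2.0 * math.log(u1))
    phase = 2.0 * math.pi * u2
    return radius * math.cos(phase), radius * math.sin(phase)


if __name__ == "__main__":
    rng = random.Random(7)       # fixed seed for repeatability
    samples = []
    for _ in range(10000):
        samples.extend(box_muller(rng))
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    print("mean", mean, "variance", var)
```

A hardware version typically replaces the logarithm, square root and trigonometric functions with lookup tables or piecewise approximations, which is where ROM and logic usage figures such as those quoted above come from.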
NASA Astrophysics Data System (ADS)
Bostrom, G.; Atkinson, D.; Rice, A.
2015-04-01
Cavity ringdown spectroscopy (CRDS) uses the exponential decay constant of light exiting a high-finesse resonance cavity to determine analyte concentration, typically via absorption. We present a high-throughput data acquisition system that determines the decay constant in near real time using the discrete Fourier transform algorithm on a field programmable gate array (FPGA). A commercially available, high-speed, high-resolution, analog-to-digital converter evaluation board system is used as the platform for the system, after minor hardware and software modifications. The system outputs decay constants at maximum rate of 4.4 kHz using an 8192-point fast Fourier transform by processing the intensity decay signal between ringdown events. We present the details of the system, including the modifications required to adapt the evaluation board to accurately process the exponential waveform. We also demonstrate the performance of the system, both stand-alone and incorporated into our existing CRDS system. Details of FPGA, microcontroller, and circuitry modifications are provided in the Appendix and computer code is available upon request from the authors.
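One way a decay constant can be recovered from a single Fourier component is sketched below. It is an idealised illustration that assumes a noise-free, zero-baseline, uniformly sampled exponential; it is not the authors' FPGA implementation, and the name `decay_constant_from_dft` and the choice of bin `k` are assumptions. For samples x[n] = A·r^n with r = exp(-dt/τ), the DFT at bin k is proportional to 1/(1 − r·e^(−iθ)) with θ = 2πk/N, so the ratio of imaginary to real part determines r.

```python
import cmath
import math


def decay_constant_from_dft(samples, dt, k=1):
    """Estimate tau of x[n] = A * exp(-n * dt / tau) from one DFT bin.

    With theta = 2*pi*k/N and q = -Im(X_k) / Re(X_k):
        q = r*sin(theta) / (1 - r*cos(theta)),  r = exp(-dt / tau)
    which is solved for r and then for tau.
    """
    n = len(samples)
    theta = 2.0 * math.pi * k / n
    x_k = sum(s * cmath.exp(-1j * theta * i) for i, s in enumerate(samples))
    q = -x_k.imag / x_k.real
    r = q / (math.sin(theta) + q * math.cos(theta))
    return -dt / math.log(r)


if __name__ == "__main__":
    tau_true, dt = 2.0e-6, 1.0e-8
    signal = [0.8 * math.exp(-i * dt / tau_true) for i in range(1024)]
    print(decay_constant_from_dft(signal, dt))
```

The estimate does not depend on the amplitude A, since it cancels in the ratio. A real ringdown trace would also need baseline and noise handling, which this sketch omits.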
Hu, Chang-Hong; Xu, Xiao-Chen; Cannata, Jonathan M; Yen, Jesse T; Shung, K Kirk
2006-02-01
A real-time digital beamformer for high-frequency (>20 MHz) linear ultrasonic arrays has been developed. The system can handle up to 64-element linear array transducers and excite 16 channels and receive simultaneously at 100 MHz sampling frequency with 8-bit precision. Radio frequency (RF) signals are digitized, delayed, and summed through a real-time digital beamformer, which is implemented using a field programmable gate array (FPGA). Using fractional delay filters, fine delays as small as 2 ns can be implemented. A frame rate of 30 frames per second is achieved. Wire phantom (20 μm tungsten) images were obtained and -6 dB axial and lateral widths were measured. The results showed that a 30 MHz, 48-element array with a pitch of 100 μm produced a -6 dB width of 68 μm in the axial and 370 μm in the lateral direction at 6.4 mm range. Images from an excised rabbit eye sample also were acquired, and fine anatomical structures, such as the cornea and lens, were resolved.
Low-Cost Space Hardware and Software
NASA Technical Reports Server (NTRS)
Shea, Bradley Franklin
2013-01-01
The goal of this project is to demonstrate and support the overall vision of NASA's Rocket University (RocketU) through the design of an electrical power system (EPS) monitor for implementation on RUBICS (Rocket University Broad Initiatives CubeSat), through the support for the CHREC (Center for High-Performance Reconfigurable Computing) Space Processor, and through FPGA (Field Programmable Gate Array) design. RocketU will continue to provide low-cost innovations even with continuous cuts to the budget.
Field-programmable gate array implementation of an all-digital IEEE 802.15.4-compliant transceiver
NASA Astrophysics Data System (ADS)
Cornetta, Gianluca; Touhafi, Abdellah; Santos, David J.; Vázquez, José M.
2010-12-01
An architecture for a low-cost, low-complexity digital transceiver is presented in this article. The proposed architecture targets the IEEE 802.15.4 standard for short-range wireless personal area networks and has been implemented as a synthesisable VHDL register transfer level description. The system has been evaluated and tested using a Xilinx 90 nm Virtex-4 field-programmable gate array as the target technology. Bit error rate (BER) and error vector magnitude (EVM) have been used as the figures of merit for modem performance. Simulations show that the recommended minimum BER is achieved at Eb/N0 = 8.7 dB, whereas the EVM is 19.5%. The implemented device occupies 10% of the target FPGA and has a normalised maximum power consumption of 44 mW in transmit mode and 53 mW in receive mode.
A shared synapse architecture for efficient FPGA implementation of autoencoders.
Suzuki, Akihiro; Morie, Takashi; Tamukoh, Hakaru
2018-01-01
This paper proposes a shared synapse architecture for autoencoders (AEs), and implements an AE with the proposed architecture as a digital circuit on a field-programmable gate array (FPGA). In the proposed architecture, the values of the synapse weights are shared between the synapses of an input and a hidden layer, and between the synapses of a hidden and an output layer. This architecture utilizes less of the limited resources of an FPGA than an architecture which does not share the synapse weights, and reduces the amount of synapse modules used by half. For the proposed circuit to be implemented into various types of AEs, it utilizes three kinds of parameters; one to change the number of layers' units, one to change the bit width of an internal value, and a learning rate. By altering a network configuration using these parameters, the proposed architecture can be used to construct a stacked AE. The proposed circuits are logically synthesized, and the number of their resources is determined. Our experimental results show that single and stacked AE circuits utilizing the proposed shared synapse architecture operate as regular AEs and as regular stacked AEs. The scalability of the proposed circuit and the relationship between the bit widths and the learning results are also determined. The clock cycles of the proposed circuits are formulated, and this formula is used to estimate the theoretical performance of the circuit when the circuit is used to construct arbitrary networks. PMID:29543909
The characterization and application of a low resource FPGA-based time to digital converter
NASA Astrophysics Data System (ADS)
Balla, Alessandro; Mario Beretta, Matteo; Ciambrone, Paolo; Gatta, Maurizio; Gonnella, Francesco; Iafolla, Lorenzo; Mascolo, Matteo; Messi, Roberto; Moricciani, Dario; Riondino, Domenico
2014-03-01
Time to Digital Converters (TDCs) are very common devices in particle physics experiments. Many "off-the-shelf" TDCs can be employed, but the need for a custom DAta acQuisition (DAQ) system makes TDCs implemented on Field-Programmable Gate Arrays (FPGAs) desirable. Most of the architectures developed so far are based on tapped delay lines with precision down to 10 ps, obtained at the cost of high FPGA resource usage and non-linearity issues that must be managed. Often such precision is not necessary; in this case TDC architectures with low resource occupancy are preferable, allowing the implementation of data processing systems and other utilities on the same device. In order to reconstruct γγ physics events tagged with the High Energy Tagger (HET) in KLOE-2 (K LOng Experiment 2), we need to measure the Time Of Flight (TOF) of the electrons and positrons from the KLOE-2 Interaction Point (IP) to our tagging stations (11 m apart). The required resolution must be better than the bunch spacing (2.7 ns). We have developed and implemented on a Xilinx Virtex-5 FPGA a 32 channel TDC with a precision of 255 ps and low non-linearity effects, along with an embedded data acquisition system and the interface to the online FARM of KLOE-2. The TDC is based on a low resource occupancy technique: the 4×Oversampling technique, which in this work is pushed to its best resolution and whose performance was exhaustively measured.
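The oversampling idea can be illustrated with an idealized model in which the hit signal is sampled on several equally spaced phases of one clock, so the time bin is the clock period divided by the number of phases. The Python sketch below is only such a model; the 4 ns clock period and the function name are assumptions for illustration and are not taken from the paper.

```python
def tdc_timestamp(hit_time_ns, clk_period_ns=4.0, phases=4):
    """Idealized multi-phase sampling TDC.

    Returns (coarse, fine, time_ns): the coarse clock count, the index of
    the clock phase whose bin contains the hit, and the quantized time.
    clk_period_ns is a hypothetical value chosen for this example.
    """
    lsb = clk_period_ns / phases          # width of one fine time bin
    bins = int(hit_time_ns // lsb)        # fine bin containing the hit
    coarse, fine = divmod(bins, phases)   # split into clock count and phase
    return coarse, fine, bins * lsb


def time_of_flight(start_ns, stop_ns, **kwargs):
    """Difference of two quantized timestamps."""
    return tdc_timestamp(stop_ns, **kwargs)[2] - tdc_timestamp(start_ns, **kwargs)[2]
```

In this model the quantization error of a single timestamp is always smaller than one bin, here 1 ns.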
NASA Astrophysics Data System (ADS)
Jin, Minglei; Jin, Weiqi; Li, Yiyang; Li, Shuo
2015-08-01
In this paper, we propose a novel scene-based non-uniformity correction algorithm for infrared image processing: a temporal high-pass non-uniformity correction algorithm based on grayscale mapping (THP and GM). The main sources of non-uniformity are: (1) detector fabrication inaccuracies; (2) non-linearity and variations in the read-out electronics; and (3) optical path effects. Non-uniformity is reduced by non-uniformity correction (NUC) algorithms. The NUC algorithms are often divided into calibration-based non-uniformity correction (CBNUC) algorithms and scene-based non-uniformity correction (SBNUC) algorithms. Because non-uniformity drifts over time, CBNUC algorithms must be repeated by inserting a uniform radiation source into the view, which SBNUC algorithms do not need, so the SBNUC algorithm becomes an essential part of an infrared imaging system. The poor robustness of SBNUC algorithms often leads to two defects: artifacts and over-correction. Meanwhile, because of their complicated calculation process and large storage consumption, hardware implementation of SBNUC algorithms is difficult, especially on a Field Programmable Gate Array (FPGA) platform. The THP and GM algorithm proposed in this paper can eliminate the non-uniformity without causing these defects. The FPGA-only hardware implementation of the algorithm has two advantages: (1) low resource consumption, and (2) small hardware delay (less than 20 lines). It can be transplanted to a variety of infrared detectors equipped with an FPGA image processing module, and it can reduce both stripe non-uniformity and ripple non-uniformity.
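For orientation, the classic temporal high-pass correction on which such methods build can be sketched in a few lines of Python: each pixel keeps a recursive low-pass estimate of its own history, and the corrected output is the input minus that estimate. This sketch covers only the textbook temporal high-pass step; the grayscale-mapping stage is not described in the abstract and is not reproduced here, and the function name and the time constant `m` are assumptions.

```python
def thp_nuc(frames, m=8):
    """Classic per-pixel temporal high-pass correction.

    frames: list of frames, each a flat list of pixel values.
    m: time constant of the recursive low-pass filter (hypothetical value).
    """
    low = list(frames[0])  # per-pixel low-pass estimate, seeded with frame 0
    corrected = []
    for frame in frames:
        # f_n = f_{n-1} + (x_n - f_{n-1}) / m
        low = [l + (x - l) / m for l, x in zip(low, frame)]
        # y_n = x_n - f_n removes the slowly varying (fixed-pattern) part
        corrected.append([x - l for x, l in zip(frame, low)])
    return corrected
```

Because the filter cannot tell a fixed-pattern offset from a motionless scene, static scene content is removed as well, which is one way the over-correction mentioned in the abstract can arise.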
NASA Technical Reports Server (NTRS)
Wade, Randall S.; Jones, Bailey
2009-01-01
A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), reads back and verifies that code, reloads the code if an error is detected, and monitors the performance of the FPGA for errors in the presence of radiation. The program consists mainly of a set of VHDL files (wherein "VHDL" signifies "VHSIC Hardware Description Language" and "VHSIC" signifies "very-high-speed integrated circuit").
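The load, read back, verify, and reload cycle described here can be sketched in Python against a stand-in device. The `FakeFpga` class and the `scrub_once` function are hypothetical names invented for this illustration; real configuration access depends on the device and its configuration interface, and the program described above is written in VHDL.

```python
class FakeFpga:
    """Stand-in for a device; it only stores the configuration bytes."""

    def __init__(self):
        self.memory = b""

    def load(self, bitstream):
        self.memory = bytes(bitstream)

    def readback(self):
        return self.memory


def scrub_once(fpga, golden):
    """Read back the configuration and reload it if it differs.

    Returns True if an error was detected and the code was reloaded.
    """
    if fpga.readback() != golden:
        fpga.load(golden)
        return True
    return False
```

Calling `scrub_once` periodically gives a simple monitoring loop; counting how often it returns True gives a rough upset tally for a test run.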
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choong, W. -S.; Abu-Nimeh, F.; Moses, W. W.
Here, we present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, which allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. In conclusion, this digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.
Hardware Implementation of Lossless Adaptive Compression of Data From a Hyperspectral Imager
NASA Technical Reports Server (NTRS)
Keymeulen, Didlier; Aranki, Nazeeh I.; Klimesh, Matthew A.; Bakhshi, Alireza
2012-01-01
Efficient onboard data compression can reduce the data volume from hyperspectral imagers on NASA and DoD spacecraft in order to return as much imagery as possible through constrained downlink channels. Lossless compression is important for signature extraction, object recognition, and feature classification capabilities. To provide onboard data compression, a hardware implementation of a lossless hyperspectral compression algorithm was developed using a field programmable gate array (FPGA). The underlying algorithm is the Fast Lossless (FL) compression algorithm reported in Fast Lossless Compression of Multispectral-Image Data (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), p. 26 with the modification reported in Lossless, Multi-Spectral Data Compressor for Improved Compression for Pushbroom-Type Instruments (NPO-45473), NASA Tech Briefs, Vol. 32, No. 7 (July 2008) p. 63, which provides improved compression performance for data from pushbroom-type imagers. An FPGA implementation of the unmodified FL algorithm was previously developed and reported in Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System (NPO-46867), NASA Tech Briefs, Vol. 36, No. 5 (May 2012) p. 42. The essence of the FL algorithm is adaptive linear predictive compression using the sign algorithm for filter adaptation. The FL compressor achieves a combination of low complexity and compression effectiveness that exceeds that of state-of-the-art techniques currently in use. The modification changes the predictor structure to tolerate differences in sensitivity of different detector elements, as occurs in pushbroom-type imagers, which are suitable for spacecraft use. The FPGA implementation offers a low-cost, flexible solution compared to a traditional ASIC (application specific integrated circuit) and can be integrated as an intellectual property (IP) core for part of, e.g., a design that manages the instrument interface.
The FPGA implementation was benchmarked on the Xilinx Virtex IV LX25 device, and ported to a Xilinx prototype board. The current implementation has a critical path of 29.5 ns, which dictated a clock speed of 33 MHz. The critical path delay is end-to-end measurement between the uncompressed input data and the output compression data stream. The implementation compresses one sample every clock cycle, which results in a speed of 33 Msample/s. The implementation has a rather low device use of the Xilinx Virtex IV LX25, making the total power consumption of the implementation about 1.27 W.
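The phrase "adaptive linear predictive compression using the sign algorithm" can be made concrete with a small Python sketch. It is a generic sign-error LMS predictor operating on a one-dimensional integer sequence, not the FL algorithm itself; the class and function names, the predictor order, and the step size `mu` are assumptions, and the entropy coding of the residuals is omitted.

```python
def sign(v):
    return (v > 0) - (v < 0)


class SignPredictor:
    """Linear predictor whose weights adapt with the sign of the error."""

    def __init__(self, order=2, mu=1e-4):
        self.w = [0.0] * order
        self.hist = [0] * order   # most recent samples, newest first
        self.mu = mu

    def predict(self):
        return round(sum(w * h for w, h in zip(self.w, self.hist)))

    def update(self, sample, error):
        s = sign(error)
        # Sign algorithm: move each weight by mu * sign(error) * input
        self.w = [w + self.mu * s * h for w, h in zip(self.w, self.hist)]
        self.hist = [sample] + self.hist[:-1]


def compress(samples, **kwargs):
    p = SignPredictor(**kwargs)
    residuals = []
    for x in samples:
        e = x - p.predict()       # only the prediction error is stored
        residuals.append(e)
        p.update(x, e)
    return residuals


def decompress(residuals, **kwargs):
    p = SignPredictor(**kwargs)   # decoder mirrors the encoder state
    samples = []
    for e in residuals:
        x = p.predict() + e
        samples.append(x)
        p.update(x, e)
    return samples
```

Because the decoder repeats exactly the same predictions and weight updates as the encoder, the round trip is lossless regardless of how well the predictor has converged.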
New Developments in FPGA: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; Label, Kenneth A.; Pellish, Jonathan
2016-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
New Developments in FPGA: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; LaBel, Kenneth; Pellish, Jonathan
2015-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
VHDL Descriptions for the FPGA Implementation of PWL-Function-Based Multi-Scroll Chaotic Oscillators
2016-01-01
Nowadays, chaos generators are an attractive field for research and the challenge is their realization for the development of engineering applications. For more than three decades, chaotic oscillators have been designed using discrete electronic devices, and very few with integrated circuit technology; in this work we propose the use of field-programmable gate arrays (FPGAs) for fast prototyping. FPGA-based applications require that one be expert in programming with the very-high-speed integrated circuits hardware description language (VHDL). We therefore detail the VHDL descriptions of chaos generators for fast prototyping from high-level programming using Python. The case studies are three kinds of chaos generators based on piecewise-linear (PWL) functions that can be systematically augmented to generate even and odd numbers of scrolls. We introduce new algorithms for the VHDL description of PWL functions such as saturated function series, negative slopes and sawtooth. The generated VHDL code is portable, reusable and open source, ready to be synthesized in an FPGA. Finally, we show experimental results for observing 2, 10 and 30-scroll attractors. PMID:27997930
Tlelo-Cuautle, Esteban; Quintas-Valles, Antonio de Jesus; de la Fraga, Luis Gerardo; Rangel-Magdaleno, Jose de Jesus
2016-01-01
Nowadays, chaos generators are an attractive field for research and the challenge is their realization for the development of engineering applications. For more than three decades, chaotic oscillators have been designed using discrete electronic devices, and very few with integrated circuit technology; in this work we propose the use of field-programmable gate arrays (FPGAs) for fast prototyping. FPGA-based applications require that one be expert in programming with the very-high-speed integrated circuits hardware description language (VHDL). We therefore detail the VHDL descriptions of chaos generators for fast prototyping from high-level programming using Python. The case studies are three kinds of chaos generators based on piecewise-linear (PWL) functions that can be systematically augmented to generate even and odd numbers of scrolls. We introduce new algorithms for the VHDL description of PWL functions such as saturated function series, negative slopes and sawtooth. The generated VHDL code is portable, reusable and open source, ready to be synthesized in an FPGA. Finally, we show experimental results for observing 2, 10 and 30-scroll attractors.
Real-time digital signal processing in multiphoton and time-resolved microscopy
NASA Astrophysics Data System (ADS)
Wilson, Jesse W.; Warren, Warren S.; Fischer, Martin C.
2016-03-01
The use of multiphoton interactions in biological tissue for imaging contrast requires highly sensitive optical measurements. These often involve signal processing and filtering steps between the photodetector and the data acquisition device, such as photon counting and lock-in amplification. These steps can be implemented as real-time digital signal processing (DSP) elements on field-programmable gate array (FPGA) devices, an approach that affords much greater flexibility than commercial photon counting or lock-in devices. We will present progress toward developing two new FPGA-based DSP devices for multiphoton and time-resolved microscopy applications. The first is a high-speed multiharmonic lock-in amplifier for transient absorption microscopy, which is being developed for real-time analysis of the intensity-dependence of melanin, with applications in vivo and ex vivo (noninvasive histopathology of melanoma and pigmented lesions). The second device is a kHz lock-in amplifier running on a low cost (50-200) development platform. It is our hope that these FPGA-based DSP devices will enable new, high-speed, low-cost applications in multiphoton and time-resolved microscopy.
Systems-on-chip approach for real-time simulation of wheel-rail contact laws
NASA Astrophysics Data System (ADS)
Mei, T. X.; Zhou, Y. J.
2013-04-01
This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
FPGA based L-band pulse Doppler radar design and implementation
NASA Astrophysics Data System (ADS)
Savci, Kubilay
As its name implies, RADAR (Radio Detection and Ranging) is an electromagnetic sensor used for detecting and locating targets from their return signals. Radar systems radiate electromagnetic energy from the antenna, part of which is intercepted by an object. Objects reradiate a portion of this energy, which is captured by the radar receiver. The received signal is then processed for information extraction. Radar systems are widely used for surveillance, air security, navigation, weather hazard detection, as well as remote sensing applications. In this work, an FPGA based L-band Pulse Doppler radar prototype, which is used for target detection, localization and velocity calculation, has been built and a general-purpose Pulse Doppler radar processor has been developed. This radar is a ground based stationary monopulse radar, which transmits a short pulse with a certain pulse repetition frequency (PRF). Return signals from the target are processed and information about their location and velocity is extracted. Discrete components are used for the transmitter and receiver chain. The hardware solution is based on the Xilinx Virtex-6 ML605 FPGA board, which is responsible for the control of the radar system and the digital signal processing of the received signal, involving Constant False Alarm Rate (CFAR) detection and Pulse Doppler processing. The algorithm is implemented in MATLAB/SIMULINK using the Xilinx System Generator for DSP tool. The field programmable gate array (FPGA) implementation of the radar system provides the flexibility of changing parameters such as the PRF and pulse length, so it can be used with different radar configurations as well. A VHDL design has been developed for a 1 Gbit Ethernet connection to transfer the digitized return signal and detection results to a PC. A-Scope software has been developed in the C# programming language to display time domain radar signals and detection results on the PC. Data are processed both in the FPGA chip and on the PC.
The FPGA uses fixed-point arithmetic because it is fast and reduces resource requirements, consuming less hardware than floating-point arithmetic. The software uses floating-point arithmetic, which ensures precision in processing at the expense of speed. The functionality of the radar system has been validated experimentally in the field with a moving car, and the submodules have been validated with synthetic data simulated in MATLAB.
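The abstract does not say which CFAR variant is used. As an illustration of the general principle, the Python sketch below implements the common cell-averaging form, in which each cell is compared with a threshold derived from the mean of nearby training cells. The function name and the guard, training, and scale parameters are assumptions chosen for the example.

```python
def ca_cfar(power, guard=1, train=4, scale=4.0):
    """Cell-averaging CFAR over a list of power samples.

    guard: cells skipped on each side of the cell under test
    train: training cells used on each side to estimate the noise level
    scale: threshold multiplier (sets the false-alarm rate)
    Returns the indices of the cells declared as detections.
    """
    detections = []
    n = len(power)
    for i in range(n):
        cells = []
        for k in range(guard + 1, guard + train + 1):
            if i - k >= 0:
                cells.append(power[i - k])
            if i + k < n:
                cells.append(power[i + k])
        # Threshold adapts to the local noise estimate
        if cells and power[i] > scale * sum(cells) / len(cells):
            detections.append(i)
    return detections
```

A fixed-point hardware version would typically replace the division by a shift, for example by choosing the number of training cells as a power of two.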
NASA Technical Reports Server (NTRS)
Roosta, Ramin; Wang, Xinchen; Sadigursky, Michael; Tracton, Phil
2004-01-01
Field Programmable Gate Arrays (FPGA) have played increasingly important roles in military and aerospace applications. Xilinx SRAM-based FPGAs have been extensively used in commercial applications. They have been used less frequently in space flight applications due to their susceptibility to single-event upsets. Reliability of these devices in space applications is a concern that has not been addressed. The objective of this project is to design a fully programmable hardware/software platform that allows (but is not limited to) comprehensive static/dynamic burn-in test of Virtex-II 3000 FPGAs, at speed test and SEU test. Conventional methods test very few discrete AC parameters (primarily switching) of a given integrated circuit. This approach will test any possible configuration of the FPGA and any associated performance parameters. It allows complete or partial re-programming of the FPGA and verification of the program by using read back followed by dynamic test. Designers have full control over which functional elements of the FPGA to stress. They can completely simulate all possible types of configurations/functions. Another benefit of this platform is that it allows collecting information on elevation of the junction temperature as a function of gate utilization, operating frequency and functionality. A software tool has been implemented to demonstrate the various features of the system. The software consists of three major parts: the parallel interface driver, main system procedure and a graphical user interface (GUI).
Radiation Hardened 10BASE-T Ethernet Physical Layer (PHY)
NASA Technical Reports Server (NTRS)
Lin, Michael R. (Inventor); Petrick, David J. (Inventor); Ballou, Kevin M. (Inventor); Espinosa, Daniel C. (Inventor); James, Edward F. (Inventor); Kliesner, Matthew A. (Inventor)
2017-01-01
Embodiments may provide a radiation hardened 10BASE-T Ethernet interface circuit suitable for space flight and in compliance with the IEEE 802.3 standard for Ethernet. The various embodiments may provide a 10BASE-T Ethernet interface circuit, comprising a field programmable gate array (FPGA), a transmitter circuit connected to the FPGA, a receiver circuit connected to the FPGA, and a transformer connected to the transmitter circuit and the receiver circuit. In the various embodiments, the FPGA, transmitter circuit, receiver circuit, and transformer may be radiation hardened.
Hardware realization of an SVM algorithm implemented in FPGAs
NASA Astrophysics Data System (ADS)
Wiśniewski, Remigiusz; Bazydło, Grzegorz; Szcześniak, Paweł
2017-08-01
The paper proposes a technique for the hardware realization of space vector modulation (SVM) of state function switching in a matrix converter (MC), oriented toward implementation in a single field programmable gate array (FPGA). In an MC, the SVM method is based on the instantaneous space-vector representation of input currents and output voltages. Traditional computation algorithms usually rely on digital signal processors (DSPs), which must handle a large number of power transistors (18 transistors and 18 independent PWM outputs) and "non-standard positions of control pulses" during the switching sequence. Recently, hardware implementations have become popular, since operations may be executed much faster and more efficiently owing to the nature of digital devices (especially concurrency). In the paper, we propose a hardware algorithm for SVM computation. In contrast to the existing techniques, the presented solution applies the COordinate Rotation DIgital Computer (CORDIC) method to compute the trigonometric operations. Furthermore, the arithmetic modules (that is, sub-devices) used for intermediate calculations, such as code converters and sector selectors (for output voltages and input current), are presented in detail. The proposed technique has been implemented as a design described in the Verilog hardware description language. Preliminary results of a logic implementation targeting a Xilinx FPGA (specifically, a low-cost device from the Artix-7 family) are also presented.
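CORDIC computes sine and cosine with shift-and-add steps only, which is why it suits FPGA logic. The Python sketch below shows the standard rotation-mode iteration in floating point for readability; a hardware version would use fixed-point integers and bit shifts. The function name and the iteration count are assumptions, and this is the textbook algorithm, not the authors' Verilog design.

```python
import math


def cordic_sin_cos(angle, iterations=16):
    """Rotation-mode CORDIC; angle in radians with |angle| <= pi/2.

    Returns (sin(angle), cos(angle)) approximated with `iterations` steps.
    """
    # Elementary rotation angles atan(2^-i), stored in a small table
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of the micro-rotations, compensated up front
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0       # rotate toward z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x
```

Each extra iteration adds roughly one bit of accuracy, so the iteration count trades logic resources and latency against precision.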
NASA Astrophysics Data System (ADS)
Yang, Shuangming; Wei, Xile; Deng, Bin; Liu, Chen; Li, Huiyan; Wang, Jiang
2018-03-01
Balancing the biological plausibility of dynamical activities against computational efficiency is one of the challenging problems in computational neuroscience and neural system engineering. This paper proposes a set of efficient methods for the hardware realization of the conductance-based neuron model with relevant dynamics, aiming to reproduce the biological behaviors with a low-cost implementation on a digital programmable platform; the methods can be applied to a wide range of conductance-based neuron models. Modified GP neuron models for efficient hardware implementation are presented to reproduce reliable pallidal dynamics, which decode the information of basal ganglia and regulate the movement disorder related voluntary activities. Implementation results on a field-programmable gate array (FPGA) demonstrate that the proposed techniques and models can reduce the resource cost significantly and reproduce the biological dynamics accurately. In addition, the biological behaviors with weak network coupling are explored on the proposed platform, and a theoretical analysis is made to investigate the biological characteristics of the structured pallidal oscillator and network. The implementation techniques provide an essential step towards large-scale neural networks for exploring dynamical mechanisms in real time. Furthermore, the proposed methodology makes the FPGA-based system a powerful platform for the investigation of neurodegenerative diseases and real-time control of bio-inspired neuro-robotics.
Moving Horizon Estimation on a Chip
2014-06-26
description, e.g. VHDL or Verilog, for FPGA implementation. Especially for those whose main expertise is in control system design, writing algorithms in C...ditional Kalman Filter (KF) where recursive solution is available. We developed various MHE designs and implemented them on the Xilinx Zynq ZC702 FPGA...practical deployment of the MHE technology. 2.2 Implementation of MHE on FPGA The next paper demonstrated the feasibility of implementing MHE algo
SWARM: A 32 GHz Correlator and VLBI Beamformer for the Submillimeter Array
NASA Astrophysics Data System (ADS)
Primiani, Rurik A.; Young, Kenneth H.; Young, André; Patel, Nimesh; Wilson, Robert W.; Vertatschitsch, Laura; Chitwood, Billie B.; Srinivasan, Ranjani; MacMahon, David; Weintroub, Jonathan
2016-03-01
A 32 GHz bandwidth VLBI capable correlator and phased array has been designed and deployed at the Smithsonian Astrophysical Observatory’s Submillimeter Array (SMA). The SMA Wideband Astronomical ROACH2 Machine (SWARM) integrates two instruments: a correlator with 140 kHz spectral resolution across its full 32 GHz band, used for connected interferometric observations, and a phased array summer used when the SMA participates as a station in the Event Horizon Telescope (EHT) very long baseline interferometry (VLBI) array. For each SWARM quadrant, Reconfigurable Open Architecture Computing Hardware (ROACH2) units shared under open-source from the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER) are equipped with a pair of ultra-fast analog-to-digital converters (ADCs), a field programmable gate array (FPGA) processor, and eight 10 Gigabit Ethernet (GbE) ports. A VLBI data recorder interface designated the SWARM digital back end, or SDBE, is implemented with a ninth ROACH2 per quadrant, feeding four Mark6 VLBI recorders with an aggregate recording rate of 64 Gbps. This paper describes the design and implementation of SWARM, as well as its deployment at SMA with reference to verification and science data.
Generation of Custom DSP Transform IP Cores: Case Study Walsh-Hadamard Transform
2002-09-01
mathematics and hardware design What I know: Finite state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing...state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing Adaptive filter theory … A math guy A hardware engineer...Synthesis Technology Libary Bit-width (8) HF factor (1,2,3,6) VF factor (1,2,4, ... 32) Xilinx FPGA Place&Route Xilinx FPGA Place&Route Performance
On-chip visual perception of motion: a bio-inspired connectionist model on FPGA.
Torres-Huitzil, César; Girau, Bernard; Castellanos-Sánchez, Claudio
2005-01-01
Visual motion provides useful information for understanding the dynamics of a scene and allows intelligent systems to interact with their environment. Motion computation is usually constrained by real-time requirements that call for the design and implementation of specific hardware architectures. In this paper, the design of a hardware architecture for a bio-inspired neural model for motion estimation is presented. The motion estimation is based on a strongly localized bio-inspired connectionist model with a particular adaptation of spatio-temporal Gabor-like filtering. The architecture consists of three main modules that perform spatial, temporal, and excitatory-inhibitory connectionist processing. The biomimetic architecture is modeled, simulated and validated in VHDL. The synthesis results on a Field Programmable Gate Array (FPGA) device show the potential achievement of real-time performance at an affordable silicon area.
NASA Astrophysics Data System (ADS)
Wei, ZHANG; Tongyu, WU; Bowen, ZHENG; Shiping, LI; Yipo, ZHANG; Zejie, YIN
2018-04-01
A new neutron-gamma discriminator based on the support vector machine (SVM) method is proposed to improve the performance of the time-of-flight neutron spectrometer. The neutron detector is an EJ-299-33 plastic scintillator with pulse-shape discrimination (PSD) property. The SVM algorithm is implemented in a field programmable gate array (FPGA) to carry out real-time sifting of neutrons in neutron-gamma mixed radiation fields. This study compares the performance of the pulse gradient analysis method and the SVM method. The results show that the SVM discriminator provides a better discrimination accuracy of 99.1%. The accuracy and performance of the FPGA-based SVM discriminator have been evaluated in experiments. It achieves a figure of merit of 1.30.
Wilson, Jesse W.; Park, Jong Kang; Warren, Warren S.
2015-01-01
The lock-in amplifier is a critical component in many different types of experiments, because of its ability to reduce spurious or environmental noise components by restricting detection to a single frequency and phase. One example application is pump-probe microscopy, a multiphoton technique that leverages excited-state dynamics for imaging contrast. With this application in mind, we present here the design and implementation of a high-speed lock-in amplifier on the field-programmable gate array (FPGA) coprocessor of a data acquisition board. The most important advantage is the inherent ability to filter signals based on more complex modulation patterns. As an example, we use the flexibility of the FPGA approach to enable a novel pump-probe detection scheme based on spread-spectrum communications techniques. PMID:25832238
Wilson, Jesse W; Park, Jong Kang; Warren, Warren S; Fischer, Martin C
2015-03-01
The lock-in amplifier is a critical component in many different types of experiments, because of its ability to reduce spurious or environmental noise components by restricting detection to a single frequency and phase. One example application is pump-probe microscopy, a multiphoton technique that leverages excited-state dynamics for imaging contrast. With this application in mind, we present here the design and implementation of a high-speed lock-in amplifier on the field-programmable gate array (FPGA) coprocessor of a data acquisition board. The most important advantage is the inherent ability to filter signals based on more complex modulation patterns. As an example, we use the flexibility of the FPGA approach to enable a novel pump-probe detection scheme based on spread-spectrum communications techniques.
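The principle of restricting detection to a single frequency and phase can be shown with a short Python sketch of a digital lock-in: the input is multiplied by in-phase and quadrature references and the products are averaged. This is a generic textbook model, not the authors' FPGA design; the function name and the use of a plain average as the low-pass filter are assumptions.

```python
import math


def lock_in(signal, f_ref, f_sample):
    """Digital lock-in detection of a sampled signal.

    Returns (amplitude, phase) of the component A*cos(2*pi*f_ref*t + phase).
    The average over the whole record acts as the low-pass filter.
    """
    n = len(signal)
    i_sum = q_sum = 0.0
    for k, s in enumerate(signal):
        ref_phase = 2.0 * math.pi * f_ref * k / f_sample
        i_sum += s * math.cos(ref_phase)    # in-phase mixer
        q_sum -= s * math.sin(ref_phase)    # quadrature mixer
    i_avg = 2.0 * i_sum / n
    q_avg = 2.0 * q_sum / n
    return math.hypot(i_avg, q_avg), math.atan2(q_avg, i_avg)
```

Replacing the sinusoidal reference with another known modulation pattern follows the same multiply-and-average structure, which is the kind of flexibility the abstract attributes to the FPGA approach.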
Trigger design for a gamma ray detector of HIRFL-ETF
NASA Astrophysics Data System (ADS)
Du, Zhong-Wei; Su, Hong; Qian, Yi; Kong, Jie
2013-10-01
The Gamma Ray Array Detector (GRAD) is one subsystem of HIRFL-ETF (the External Target Facility (ETF) of the Heavy Ion Research Facility in Lanzhou (HIRFL)). It is capable of measuring the energy of gamma-rays with 1024 CsI scintillators in in-beam nuclear experiments. The GRAD trigger should select the valid events and reject the data from the scintillators which are not hit by the gamma-ray. The GRAD trigger has been developed based on Field Programmable Gate Arrays (FPGAs) and the PXI interface. It makes prompt trigger decisions to select valid events by processing the hit signals from the 1024 CsI scintillators. According to the physical requirements, the GRAD trigger module supplies 12-bit trigger information for the global trigger system of ETF and supplies a trigger signal for the data acquisition (DAQ) system of GRAD. In addition, the GRAD trigger generates trigger data that are packed and transmitted to the host computer via the PXI bus to be saved for off-line analysis. The trigger processing is implemented in the front-end electronics of GRAD and one FPGA of the GRAD trigger module. The logic of PXI transmission and reconfiguration is implemented in another FPGA of the GRAD trigger module. During the gamma-ray experiments, the GRAD trigger performed reliably and efficiently, and it satisfies the physical requirements.
Hardware Implementation of a Bilateral Subtraction Filter
NASA Technical Reports Server (NTRS)
Huertas, Andres; Watson, Robert; Villalpando, Carlos; Goldberg, Steven
2009-01-01
A bilateral subtraction filter has been implemented as a hardware module in the form of a field-programmable gate array (FPGA). In general, a bilateral subtraction filter is a key subsystem of a high-quality stereoscopic machine vision system that utilizes images that are large and/or dense. Bilateral subtraction filters have been implemented in software on general-purpose computers, but the processing speeds attainable in this way even on computers containing the fastest processors are insufficient for real-time applications. The present FPGA bilateral subtraction filter is intended to accelerate processing to real-time speed and to be a prototype of a link in a stereoscopic-machine- vision processing chain, now under development, that would process large and/or dense images in real time and would be implemented in an FPGA. In terms that are necessarily oversimplified for the sake of brevity, a bilateral subtraction filter is a smoothing, edge-preserving filter for suppressing low-frequency noise. The filter operation amounts to replacing the value for each pixel with a weighted average of the values of that pixel and the neighboring pixels in a predefined neighborhood or window (e.g., a 9 9 window). The filter weights depend partly on pixel values and partly on the window size. The present FPGA implementation of a bilateral subtraction filter utilizes a 9 9 window. This implementation was designed to take advantage of the ability to do many of the component computations in parallel pipelines to enable processing of image data at the rate at which they are generated. The filter can be considered to be divided into the following parts (see figure): a) An image pixel pipeline with a 9 9- pixel window generator, b) An array of processing elements; c) An adder tree; d) A smoothing-and-delaying unit; and e) A subtraction unit. After each 9 9 window is created, the affected pixel data are fed to the processing elements. 
Each processing element is fed the pixel value for its position in the window as well as the pixel value for the central pixel of the window. The absolute difference between these two pixel values is calculated and used as an address in a lookup table. Each processing element has a lookup table, unique for its position in the window, containing the weight coefficients for the Gaussian function for that position. The pixel value is multiplied by the weight, and the outputs of the processing element are the weight and the product of pixel value and weight. The products and weights are fed to the adder tree. The sum of the products and the sum of the weights are fed to the divider, which computes the sum of products divided by the sum of weights. The output of the divider is denoted the bilateral smoothed image. The smoothing function is a simple weighted average computed over a 3 × 3 subwindow centered in the 9 × 9 window. After smoothing, the image is delayed by an additional amount of time needed to match the processing time for computing the bilateral smoothed image. The bilateral smoothed image is then subtracted from the 3 × 3 smoothed image to produce the final output. The prototype filter as implemented in a commercially available FPGA processes one pixel per clock cycle. Operation at a clock speed of 66 MHz has been demonstrated, and results of a static timing analysis have been interpreted as suggesting that the clock speed could be increased to as much as 100 MHz.
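The data path described in this abstract (per-position lookup tables addressed by the absolute pixel difference, sum of products divided by sum of weights, then subtraction from a 3 × 3 smoothed image) can be sketched in software as follows. This is only an illustrative model, not the authors' hardware design: the Gaussian widths `sigma_s` and `sigma_r`, the uniform 3 × 3 box average, the clamped border handling, and all function names are assumptions introduced here.

```python
import math

def clamp(v, lo, hi):
    """Clamp an index into [lo, hi] (assumed border handling)."""
    return max(lo, min(hi, v))

def bilateral_subtract(img, radius=4, sigma_s=3.0, sigma_r=20.0):
    """img is a list of rows of integers in 0..255.
    Returns (3x3 smoothed image) - (bilateral smoothed image)."""
    h, w = len(img), len(img[0])
    # One lookup table per window position, addressed by |pixel - center|.
    luts = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
            luts[(dy, dx)] = [spatial * math.exp(-(d * d) / (2 * sigma_r ** 2))
                              for d in range(256)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            center = img[y][x]
            sum_w = 0.0   # adder tree output: sum of weights
            sum_p = 0.0   # adder tree output: sum of weight * pixel
            for (dy, dx), lut in luts.items():
                p = img[clamp(y + dy, 0, h - 1)][clamp(x + dx, 0, w - 1)]
                wgt = lut[abs(p - center)]
                sum_w += wgt
                sum_p += wgt * p
            bilateral = sum_p / sum_w          # divider
            # Assumed uniform weights for the 3x3 smoothing subwindow.
            box = sum(img[clamp(y + dy, 0, h - 1)][clamp(x + dx, 0, w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
            row.append(box - bilateral)        # subtraction unit
        out.append(row)
    return out
```

With `radius=4` the window is 9 × 9, matching the abstract. On a constant image both smoothed images equal the input, so the output is zero everywhere, which makes a convenient sanity check.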
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, K.; Chen, H.; Wu, W.
In the upgrade of the ATLAS experiment, the front-end electronics components are subjected to a large radiation background. Meanwhile, high-speed optical links are required for data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project were designed by CERN to support bidirectional high-speed data transmission at a 4.8 Gbps line rate, which is called the GBT link. In the ATLAS upgrade, besides the link with the on-detector electronics, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end; correspondingly, for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launched the GBT-FPGA project to provide examples in different types of FPGA. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system is used to interface the front-end electronics of several ATLAS subsystems. The GBT link is used between them to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations of the GBT-FPGA IP core are introduced to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support for both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. Finally, the system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Maximum Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
New Developments in FPGA Devices: SEUs and Fail-Safe Strategies from the NASA Goddard Perspective
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Kenneth; Pellish, Jonathan
2016-01-01
It has been shown that, when exposed to radiation environments, each Field Programmable Gate Array (FPGA) device has unique error signatures. Subsequently, fail-safe and mitigation strategies will differ per FPGA type. In this session several design approaches for safe systems will be presented. It will also explore the benefits and limitations of several mitigation techniques. The intention of the presentation is to provide information regarding FPGA types, their susceptibilities, and proven fail-safe strategies; so that users can select appropriate mitigation and perform the required trade for system insertion. The presentation will describe three types of FPGA devices and their susceptibilities in radiation environments.
Digitally Controlled Slot Coupled Patch Array
NASA Technical Reports Server (NTRS)
D'Arista, Thomas; Pauly, Jerry
2010-01-01
A four-element array conformed to a singly curved conducting surface has been demonstrated to provide 2 dB axial ratio of 14 percent, while maintaining VSWR (voltage standing wave ratio) of 2:1 and gain of 13 dBiC. The array is digitally controlled and can be scanned with the LMS Adaptive Algorithm using the power spectrum as the objective, as well as the Direction of Arrival (DoA) of the beam to set the amplitude of the power spectrum. The total height of the array above the conducting surface is 1.5 inches (3.8 cm). A uniquely configured microstrip-coupled aperture over a conducting surface produced supergain characteristics, achieving 12.5 dBiC across the 2-to-2.13-GHz and 2.2-to-2.3-GHz frequency bands. This design is optimized to retain VSWR and axial ratio across the band as well. The four elements are uniquely configured with respect to one another for performance enhancement, and the appropriate phase excitation to each element for scan can be found either by analytical beam synthesis using the genetic algorithm with the measured or simulated far field radiation pattern, or an adaptive algorithm implemented with the digitized signal. The commercially available tuners and field-programmable gate array (FPGA) boards utilized required precise phase coherent configuration control, and with custom code developed by Nokomis, Inc., were shown to be fully functional in a two-channel configuration controlled by FPGA boards. A four-channel tuner configuration and oscilloscope configuration were also demonstrated although algorithm post-processing was required.
IF digitization receiver of wideband digital array radar test-bed
NASA Astrophysics Data System (ADS)
Li, Weixing; Zhang, Yue; Lin, Jianzhi; Chen, Zengping
2014-10-01
In this paper, an X-band, 8-element wideband digital array radar (DAR) test-bed is presented, which makes use of a novel digital backend coupled with a highly integrated, multi-channel intermediate frequency (IF) digital receiver. Radar returns are received by the broadband antenna and then down-converted to an IF of 0.6 GHz to 3.0 GHz. Four band-pass filters are applied in the front-end to divide the IF returns into four frequency bands with an instantaneous bandwidth of 500 MHz. Every four array elements share one digital receiver, which is the focus of this paper. The digital receivers are designed in a compact and flexible manner to meet the demands of the DAR system. Each receiver consists of a four-channel ADC, a high-performance FPGA, four DDR3 chips and two optical transceivers. With a sampling rate of up to 1.2 GHz per channel, the ADC is capable of directly sampling the IF returns of four array elements at 10 bits. In addition to serving as FIFO and controller, the onboard FPGA is also utilized for the implementation of various real-time algorithms such as DDC and channel calibration. Data is converted to a bit stream and transferred through two low-overhead, high-data-rate, multi-channel optical transceivers. Key technologies such as channel calibration and wideband DOA are studied with measured data obtained in the experiments to illustrate the functionality of the system.
A CMOS ASIC Design for SiPM Arrays
Dey, Samrat; Banks, Lushon; Chen, Shaw-Pin; Xu, Wenbin; Lewellen, Thomas K.; Miyaoka, Robert S.; Rudell, Jacques C.
2012-01-01
Our lab has previously reported on novel board-level readout electronics for an 8×8 silicon photomultiplier (SiPM) array featuring a row/column summation technique to reduce the hardware requirements for signal processing. We are taking the next step by implementing a monolithic CMOS chip which is based on the row-column architecture. In addition, this paper explores the option of using diagonal summation as well as calibration to compensate for temperature and process variations. A timing pickoff signal that aligns all of the positioning (spatial channel) pulses in the array is also described. The ASIC design is targeted to be scalable with the detector size and flexible to accommodate detectors from different vendors. This paper focuses on circuit implementation issues associated with the design of the ASIC to interface our Phase II MiCES FPGA board with a SiPM array. Moreover, a discussion is provided of strategies to eventually integrate all the analog and mixed-signal electronics with the SiPM, on either a single-silicon substrate or multi-chip module (MCM). PMID:24825923
Wide Tuning Capability for Spacecraft Transponders
NASA Technical Reports Server (NTRS)
Lux, James; Mysoor, Narayan; Shah, Biren; Cook, Brian; Smith, Scott
2007-01-01
A document presents additional information on the means of implementing a capability for wide tuning of microwave receiver and transmitter frequencies in the development reported in the immediately preceding article, VCO PLL Frequency Synthesizers for Spacecraft Transponders (NPO-42909). The reference frequency for a PLL-based frequency synthesizer is derived from a numerically controlled oscillator (NCO) implemented in digital logic, such that almost any reference frequency can be derived from a fixed crystal reference oscillator with microhertz precision. The frequency of the NCO is adjusted to track the received signal, then used to create another NCO frequency used to synthesize the transmitted signal coherent with, and at a specified frequency ratio to, the received signal. The frequencies can be changed, even during operation, through suitable digital programming. The NCOs and the related tracking loops and coherent turnaround logic are implemented in a field-programmable gate array (FPGA). The interface between the analog microwave receiver and transmitter circuits and the FPGA includes analog-to-digital and digital-to-analog converters, the sampling rates of which are chosen to minimize spurious signals and otherwise optimize performance. Several mixers and filters are used to properly route various signals.
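The microhertz tuning precision mentioned above follows from how a phase-accumulator NCO works: the output frequency is the tuning word times the clock frequency divided by 2^N, so the step size is f_clk / 2^N. The sketch below illustrates this relationship. The 48-bit accumulator width and the 100 MHz clock are assumptions chosen for illustration; the abstract does not state either value.

```python
ACC_BITS = 48              # assumed accumulator width (not given in the abstract)
F_CLK = 100_000_000        # assumed clock frequency in Hz (not given in the abstract)

def tuning_word(f_out_hz):
    """Nearest integer tuning word for the requested output frequency."""
    return round(f_out_hz * (1 << ACC_BITS) / F_CLK)

def actual_frequency(ftw):
    """Frequency actually produced by a given tuning word."""
    return ftw * F_CLK / (1 << ACC_BITS)

def nco_phase(ftw, n_samples):
    """Accumulator value on each clock tick; it wraps modulo 2**ACC_BITS."""
    mask = (1 << ACC_BITS) - 1
    acc = 0
    phases = []
    for _ in range(n_samples):
        phases.append(acc)
        acc = (acc + ftw) & mask
    return phases

# Frequency step between adjacent tuning words, in Hz.
RESOLUTION_HZ = F_CLK / (1 << ACC_BITS)
```

Under these assumed values the step is below one microhertz, and a tuning word of `1 << 46` wraps the accumulator every four clock ticks, i.e. it produces f_clk / 4.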
FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG
2014-06-01
is normalized to π. The proposed burst-mode architecture is written in VHDL and verified using Modelsim. The VHDL design is implemented on a Xilinx...Document Number: SET 2014-0043 412TW-PA-14298 FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG June 2014 Final Report Test...To) 9/11 -- 8/14 4. TITLE AND SUBTITLE FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG 5a. CONTRACT NUMBER: W900KK-11-C-0032 5b
FPGA based demodulation of laser induced fluorescence in plasmas
NASA Astrophysics Data System (ADS)
Mattingly, Sean W.; Skiff, Fred
2018-04-01
We present a field programmable gate array (FPGA)-based system that counts photons from laser-induced fluorescence (LIF) on a laboratory plasma. This is accomplished with FPGA-based up/down counters that demodulate the data, giving a background-subtracted LIF signal stream that is updated with a new point as each laser amplitude modulation cycle completes. We demonstrate using the FPGA to modulate a laser at 1 MHz and demodulate the resulting LIF data stream. This data stream is used to calculate an LIF-based measurement sampled at 1 MHz of a plasma ion fluctuation spectrum.
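The up/down counting scheme described above can be modeled in a few lines: counts are added while the laser is on (signal plus background) and subtracted while it is off (background only), and one background-subtracted value is emitted per completed modulation cycle. This is a simplified sketch; the function name, the fixed 50% duty cycle, and the per-tick count input are assumptions, not details from the paper.

```python
def updown_demodulate(counts, half_period):
    """counts: photon counts per clock tick.
    The first half_period ticks of each cycle are 'laser on',
    the next half_period ticks are 'laser off' (assumed 50% duty cycle).
    Returns one background-subtracted value per completed cycle."""
    period = 2 * half_period
    out = []
    counter = 0
    for i, c in enumerate(counts):
        if i % period < half_period:
            counter += c      # laser on: count up (LIF + background)
        else:
            counter -= c      # laser off: count down (background only)
        if i % period == period - 1:
            out.append(counter)   # cycle complete: emit and reset
            counter = 0
    return out
```

For example, `updown_demodulate([5, 6, 2, 1, 7, 7, 3, 3], 2)` returns `[8, 8]`: each cycle contributes the on-phase total minus the off-phase total. An incomplete trailing cycle is not emitted.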
A novel approach to Hough Transform for implementation in fast triggers
NASA Astrophysics Data System (ADS)
Pozzobon, Nicola; Montecassiano, Fabio; Zotto, Pierluigi
2016-10-01
Telescopes of position sensitive detectors are common layouts in charged particle tracking, and programmable logic devices, such as FPGAs, represent a viable choice for the real-time reconstruction of track segments in such detector arrays. A compact implementation of the Hough Transform for fast triggers in High Energy Physics, exploiting a parameter reduction method, is proposed, targeting the reduction of the needed storage or computing resources in current or near-future state-of-the-art FPGA devices, while retaining high resolution over a wide range of track parameters. The proposed approach is compared to a Standard Hough Transform with particular emphasis on their application to muon detectors. In both cases, an original readout implementation is modeled.
Implementation of High Speed Distributed Data Acquisition System
NASA Astrophysics Data System (ADS)
Raju, Anju P.; Sekhar, Ambika
2012-09-01
This paper introduces a high speed distributed data acquisition system based on a field programmable gate array (FPGA). The aim is to develop a "distributed" data acquisition interface. The development of instruments such as personal computers and engineering workstations based on "standard" platforms is the motivation behind this effort. Using standard platforms as the controlling unit allows independence in hardware from a particular vendor and hardware platform. The distributed approach also has advantages from a functional point of view: acquisition resources become available to multiple instruments; the acquisition front-end can be physically remote from the rest of the instrument. The high speed data acquisition system transmits data quickly to a remote computer system through an Ethernet interface. The data is acquired through 16 analog input channels. The input data commands are multiplexed and digitized, and the data is then stored in a 1K buffer for each input channel. The main control unit in this design is a 16-bit processor implemented in the FPGA. This 16-bit processor is used to set up and initialize the data source and the Ethernet controller, as well as to control the flow of data from the memory element to the NIC. Using this processor, the different configuration registers in the Ethernet controller can be initialized and controlled easily. The data packets are then sent to the remote PC through the Ethernet interface. The main advantages of using an FPGA as the standard platform are its flexibility, low power consumption, short design duration, fast time to market, programmability and high density. The main advantages of using the AX88796 Ethernet controller over others are its non-PCI interface, the presence of embedded SRAM where the transmit and receive buffers are located, and its high-performance SRAM-like interface. The paper introduces the implementation of the distributed data acquisition system on an FPGA using VHDL. The main advantages of this system are high accuracy, high speed, and real-time monitoring.
Wirtz, Sebastian F; Cunha, Adauto P A; Labusch, Marc; Marzun, Galina; Barcikowski, Stephan; Söffker, Dirk
2018-06-01
Today, the demand for continuous monitoring of valuable or safety critical equipment is increasing in many industrial applications due to safety and economical requirements. Therefore, reliable in-situ measurement techniques are required, for instance in Structural Health Monitoring (SHM) as well as process monitoring and control. Here, current challenges are related to the processing of sensor data with a high data rate and low latency. In particular, measurement and analyses of Acoustic Emission (AE) are widely used for passive, in-situ inspection. Advantages of AE are related to its sensitivity to different micro-mechanical mechanisms on the material level. However, online processing of AE waveforms is computationally demanding. The related equipment is typically bulky, expensive, and not well suited for permanent installation. The contribution of this paper is the development of a Field Programmable Gate Array (FPGA)-based measurement system using the ZedBoard development kit with a Zynq-7000 system on chip for embedded implementation of suitable online processing algorithms. This platform comprises a dual-core Advanced Reduced Instruction Set Computer Machine (ARM) architecture running a Linux operating system and FPGA fabric. An FPGA-based hardware implementation of the discrete wavelet transform is realized to accelerate processing of the AE measurements. Key features of the system are low cost, small form factor, and low energy consumption, which make it suitable to serve as a field-deployed measurement and control device. For verification of the functionality, a novel automatically realized adjustment of the working distance during pulsed laser ablation in liquids is established as an example. A sample rate of 5 MHz is achieved at 16 bit resolution.
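To make the discrete wavelet transform step concrete, one decomposition level can be sketched as a pair of filters followed by downsampling by two. The abstract does not say which wavelet the hardware uses; the Haar wavelet in its averaging form is assumed here purely because it is the simplest case, and the function names are hypothetical.

```python
def haar_dwt_level(x):
    """One level of an (assumed) Haar DWT on an even-length sequence.
    Returns (approximation, detail) coefficient lists, each half as long."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt_level(approx, detail):
    """Inverse of haar_dwt_level: rebuilds the original samples."""
    x = []
    for a, d in zip(approx, detail):
        x.extend((a + d, a - d))
    return x
```

For the input `[4, 2, 5, 5]` this gives the approximation `[3.0, 5.0]` and the detail `[1.0, 0.0]`, and the inverse restores the input. A multi-level transform would reapply `haar_dwt_level` to the approximation output.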
Real-time windowing in imaging radar using FPGA technique
NASA Astrophysics Data System (ADS)
Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique
2005-02-01
Imaging radar uses high-frequency electromagnetic waves reflected from objects to estimate their parameters. Pulse compression is a standard signal processing technique used to minimize the peak transmission power, maximize the SNR, and improve resolution. Pulse compression is usually achieved with a matched filter. The side-lobe level in imaging radar can be reduced by applying a weighting (window) function. Many weighting functions are well known and widely used in signal processing applications: Hamming, Hanning, Blackman, Chebyshev, Blackman-Harris, Kaiser-Bessel, etc. Field Programmable Gate Arrays (FPGAs) offer great benefits such as instantaneous implementation, dynamic reconfiguration, design, and field programmability. This reconfigurability makes FPGAs a better solution than custom-made integrated circuits. This work aims to demonstrate a reasonably flexible implementation of a linear FM signal and pulse compression using Matlab, Simulink, and System Generator. Using an FPGA and the software mentioned, we propose a pulse compression design on FPGA that uses classical and novel window techniques to reduce the side-lobe level. This increases the ability to detect small or closely spaced targets in imaging radar. The FPGA's capacity for parallel real-time processing makes it possible to realize the proposed algorithms. The paper also presents experimental results of the proposed windowing procedure in a marine radar with the following parameters: the signal is linear FM (chirp); the frequency deviation DF is 9.375 MHz; the pulse width T is 3.2 μs; the matched filter has 800 taps; the sampling frequency is 253.125 MHz (253.125 × 10^6 Hz). Side-lobe levels were reduced in real time, permitting better resolution of small targets.
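The effect of windowing on pulse-compression side lobes can be illustrated with a small software model. The sketch below uses the chirp deviation (9.375 MHz) and pulse width (3.2 μs) quoted in the abstract, but the 50 MHz sample rate, the choice of a Hamming window, and all function names are assumptions made to keep the example short; it is not the authors' System Generator design.

```python
import cmath
import math

def lfm_chirp(n, bandwidth, fs):
    """Complex baseband linear-FM pulse of n samples, centered in time."""
    pulse_width = n / fs
    rate = bandwidth / pulse_width            # chirp rate in Hz/s
    return [cmath.exp(1j * math.pi * rate * ((i - (n - 1) / 2) / fs) ** 2)
            for i in range(n)]

def hamming(n):
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def compress(signal, window):
    """Correlate the pulse with a windowed copy of itself (all lags)."""
    n = len(signal)
    ref = [w * s for w, s in zip(window, signal)]
    out = []
    for lag in range(-(n - 1), n):
        acc = 0j
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += signal[j] * ref[i].conjugate()
        out.append(abs(acc))
    return out

def peak_sidelobe_db(mag):
    """Highest level outside the main lobe, relative to the peak, in dB."""
    p = max(range(len(mag)), key=mag.__getitem__)
    r = p
    while r + 1 < len(mag) and mag[r + 1] < mag[r]:
        r += 1                                # walk down to the first minimum
    l = p
    while l - 1 >= 0 and mag[l - 1] < mag[l]:
        l -= 1
    side = max(mag[:l] + mag[r + 1:])
    return 20 * math.log10(side / mag[p])

N = 160                                       # 3.2 us at the assumed 50 MHz rate
pulse = lfm_chirp(N, 9.375e6, 50e6)
psl_rect = peak_sidelobe_db(compress(pulse, [1.0] * N))
psl_hamming = peak_sidelobe_db(compress(pulse, hamming(N)))
```

Comparing `psl_hamming` with `psl_rect` shows the trade the abstract refers to: the weighted filter lowers the side lobes at the cost of a wider main lobe.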
FPGA wavelet processor design using language for instruction-set architectures (LISA)
NASA Astrophysics Data System (ADS)
Meyer-Bäse, Uwe; Vera, Alonzo; Rao, Suhasini; Lenk, Karl; Pattichis, Marios
2007-04-01
The design of a microprocessor is a long, tedious, and error-prone task that typically consists of three design phases: architecture exploration, software design (assembler, linker, loader, profiler), and architecture implementation (RTL generation for FPGA or cell-based ASIC) and verification. The Language for Instruction-Set Architectures (LISA) allows a microprocessor to be modeled not only from its instruction set but also from an architecture description including pipelining behavior, which provides design and development tool consistency over all levels of the design. To explore the capability of the LISA processor design platform, a.k.a. CoWare Processor Designer, we present in this paper three microprocessor designs that implement an 8/8 wavelet transform processor of the kind typically used in today's FBI fingerprint compression scheme. We have designed a 3-stage pipelined 16-bit RISC processor (NanoBlaze). Although RISC μPs are usually considered "fast" processors due to design concepts like constant instruction word size, deep pipelines and many general purpose registers, it turns out that DSP operations consume essential processing time in a RISC processor. In a second step we have used design principles from programmable digital signal processors (PDSP) to improve the throughput of the DWT processor. A multiply-accumulate operation along with an indirect addressing operation were the key to achieving higher throughput. A further improvement is possible with today's FPGA technology. Today's FPGAs offer a large number of embedded array multipliers, and it is now feasible to design a "true" vector processor (TVP). A multiplication of two vectors can be done in just one clock cycle with our TVP, and a complete scalar product in two clock cycles. Code profiling and Xilinx FPGA ISE synthesis results are provided that demonstrate the essential improvement that a TVP has compared with traditional RISC or PDSP designs.
NASA Astrophysics Data System (ADS)
Mandal, Swagata; Saini, Jogender; Zabołotny, Wojciech M.; Sau, Suman; Chakrabarti, Amlan; Chattopadhyay, Subhasis
2017-03-01
Due to the dramatic increase of data volume in modern high energy physics (HEP) experiments, a robust high-speed data acquisition (DAQ) system is very much needed to gather the data generated during different nuclear interactions. As the DAQ works in a harsh radiation environment, there is a fair chance of data corruption due to various energetic particles such as alpha particles, beta particles, or neutrons. Hence, a major challenge in the development of a DAQ for a HEP experiment is to establish an error-resilient communication system between front-end sensors or detectors and back-end data processing computing nodes. Here, we have implemented the DAQ using a field-programmable gate array (FPGA) due to some of its inherent advantages over the application-specific integrated circuit. A novel orthogonal concatenated code and cyclic redundancy check (CRC) have been used to mitigate the effects of data corruption in the user data. Scrubbing with a 32-b CRC has been used against errors in the configuration memory of the FPGA. Data from front-end sensors will reach the back-end processing nodes through multiple stages that may add an uncertain amount of delay to the different data packets. We have also proposed a novel memory management algorithm that helps to process the data at the back-end computing nodes by removing the added path delays. To the best of our knowledge, the proposed FPGA-based DAQ utilizing an optical link with channel coding and efficient memory management modules can be considered the first of its kind. Performance estimation of the implemented DAQ system is done based on resource utilization, bit error rate, efficiency, and robustness to radiation.
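The role of the 32-bit CRC in detecting corrupted data can be shown with a short sketch: a checksum is appended to each frame and recomputed on receipt, and any single flipped bit changes the result. The abstract does not specify the CRC polynomial or frame layout; the example assumes the common CRC-32 provided by Python's `zlib.crc32` and a hypothetical frame format with the checksum in the last four bytes.

```python
import zlib

def protect(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 (assumed polynomial: the one used by zlib)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored tag."""
    payload, tag = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == tag

def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Model a single-event upset by inverting one bit of the frame."""
    corrupted = bytearray(frame)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)
```

A frame produced by `protect` passes `check`, while the same frame after `flip_bit` fails it. A CRC only detects the error; correcting it is the job of the channel code or, for configuration memory, of the scrubbing logic that rewrites the affected data.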
The Design and Development of the SMEX-Lite Power System
NASA Technical Reports Server (NTRS)
Rakow, Glenn P.; Schnurr, Richard G., Jr.; Solly, Michael A.
1998-01-01
This paper describes the design and development of a 250W orbit average electrical power system electronic Power Node and software for use in Low Earth Orbit missions. The mass of the Power Node is 3.6 Kg (8 lb.). The dimensions of the Power Node are 30cm x 26cm x 7.9cm (11 in. x 10.25 in. x 3.1 in.). The design was realized using software, Field Programmable Gate Array (FPGA) digital logic and surface mount technology. The design is generic enough to reduce the non-recurring engineering for different mission configurations. The Power Node independently charges one to five low cost, 22-cell 4 AH D-cell battery packs. The battery charging algorithms are executed in the power software to reduce the mass and size of the power electronics. The Power Node implements a peak-power tracking algorithm using an innovative hardware/software approach. The power software task is hosted on the spacecraft processor. The power software task generates a MIL-STD-1553 command packet to update the Power Node control settings. The settings for the battery voltage and current limits, as well as the minimum solar array voltage used to implement peak power tracking, are contained in this packet. Several advanced topologies are used in the Power Node. These include synchronous rectification in the bus regulators, average current control in the battery chargers and quasi-resonant converters for the Field Effect Transistor (FET) drive electronics. Lastly, the main bus regulator uses a feed-forward topology with the PWM implemented in an FPGA.
Preliminary Study of Image Reconstruction Algorithm on a Digital Signal Processor
2014-03-01
5.2 Comparison of CPU-GPU, CPU-FPGA, and CPU-DSP Designs The work for implementing VHDL description of the back-projection algorithm on a physical...FPGA was not complete. Hence, the DSP implementation results are compared with the simulated results for the VHDL design. Simulating VHDL provides an...rather than at the software level. Depending on an application’s characteristics, FPGA implementations can provide a significant performance
Particle Identification on an FPGA Accelerated Compute Platform for the LHCb Upgrade
NASA Astrophysics Data System (ADS)
Fäerber, Christian; Schwemmer, Rainer; Machen, Jonathan; Neufeld, Niko
2017-07-01
The current LHCb readout system will be upgraded in 2018 to a “triggerless” readout of the entire detector at the Large Hadron Collider collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100 from currently 500 Gb/s up to 40 Tb/s. The event filter farm will preanalyze the data and will select the events on an event by event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, and the reason why different new technologies are considered and have to be investigated for the different parts of the system. For the usage in the event building farm or in the event filter farm (trigger), an experimental field programmable gate array (FPGA) accelerated computing platform is considered and, therefore, tested. FPGA compute accelerators are used more and more in standard servers such as for Microsoft Bing search or Baidu search. The platform we use hosts a general Intel CPU and a high-performance FPGA linked via the high-speed Intel QuickPath Interconnect. An accelerator is implemented on the FPGA. It is very likely that these platforms, which are built, in general, for high-performance computing, are also very interesting for the high-energy physics community. First, the performance results of smaller test cases performed at the beginning are presented. Afterward, a part of the existing LHCb RICH particle identification is tested and is ported to the experimental FPGA accelerated platform. We have compared the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm, which is running on the Xeon-FPGA compute accelerator platform.
Encoders for block-circulant LDPC codes
NASA Technical Reports Server (NTRS)
Andrews, Kenneth; Dolinar, Sam; Thorpe, Jeremy
2005-01-01
In this paper, we present two encoding methods for block-circulant LDPC codes. The first is an iterative encoding method based on the erasure decoding algorithm, and the computations required are well organized due to the block-circulant structure of the parity check matrix. The second method uses block-circulant generator matrices, and the encoders are very similar to those for recursive convolutional codes. Some encoders of the second type have been implemented in a small Field Programmable Gate Array (FPGA) and operate at 100 Msymbols/second.
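The second encoding method above relies on the fact that multiplying by a circulant matrix needs only the matrix's first row and a cyclic shift register. The sketch below shows this for a single systematic block [I | C] over GF(2). The 5-bit first row is a made-up toy value, not a code from the paper, and real block-circulant LDPC generators combine many such blocks.

```python
def encode_block(message, first_row):
    """Systematic encoding with one circulant block C (toy example).
    Row i of C is first_row cyclically shifted right by i positions,
    so parity = XOR of the shifted rows selected by the message bits."""
    n = len(first_row)
    if len(message) != n:
        raise ValueError("message and circulant size must match")
    parity = [0] * n
    reg = list(first_row)                 # shift register holding row i
    for bit in message:
        if bit:
            parity = [p ^ r for p, r in zip(parity, reg)]
        reg = [reg[-1]] + reg[:-1]        # cyclic right shift for next row
    return list(message) + parity

FIRST_ROW = [1, 1, 0, 1, 0]               # hypothetical circulant first row
```

With this toy row, the message `[1, 0, 0, 0, 0]` yields parity `[1, 1, 0, 1, 0]` and `[0, 1, 0, 0, 0]` yields the shifted row `[0, 1, 1, 0, 1]`. Because the code is linear, the parity for `[1, 1, 0, 0, 0]` is their XOR, `[1, 0, 1, 1, 1]`.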
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wojahn, Christopher K.
2015-10-20
This HDL code (hereafter referred to as "software") implements circuitry in Xilinx Virtex-5QV Field Programmable Gate Array (FPGA) hardware. This software allows the device to self-check the consistency of its own configuration memory for radiation-induced errors. The software then provides the capability to correct any single-bit errors detected in the memory using the device's inherent circuitry, or reload corrupted memory frames when larger errors occur that cannot be corrected with the device's built-in error correction and detection scheme.
2016-05-01
A9 CPU and 15 W for the i7 CPU. A method of accelerating this computation is by using a customized hardware unit called a field-programmable gate...implementation of custom logic to accelerate computational workloads. This FPGA fabric, in addition to the standard programmable logic, contains 220...chip; field-programmable gate array Daniel Gebhardt
Field-Programmable Gate Array-based fluxgate magnetometer with digital integration
NASA Astrophysics Data System (ADS)
Butta, Mattia; Janosek, Michal; Ripka, Pavel
2010-05-01
In this paper, a digital magnetometer based on a printed circuit board fluxgate is presented. The fluxgate is pulse excited and the signal is extracted by gate integration. We investigate the possibility of performing integration on very narrow gates (typically 500 ns) by using digital techniques. The magnetometer is based on a field-programmable gate array (FPGA) card: we show the advantages and disadvantages of digitizing the fluxgate output voltage with the analog-to-digital converter on the FPGA card, as well as of digitization performed by an external digitizer. Because the gate is very narrow, it is shown that a magnetometer entirely based on an FPGA card is preferable, because it avoids noise due to trigger instability. Both open-loop and feedback operating modes are described and the achieved results are presented.
Performance of the Fully Digital FPGA-Based Front-End Electronics for the GALILEO Array
NASA Astrophysics Data System (ADS)
Barrientos, D.; Bellato, M.; Bazzacco, D.; Bortolato, D.; Cocconi, P.; Gadea, A.; González, V.; Gulmini, M.; Isocrate, R.; Mengoni, D.; Pullia, A.; Recchia, F.; Rosso, D.; Sanchis, E.; Toniolo, N.; Ur, C. A.; Valiente-Dobón, J. J.
2015-12-01
In this work we present the architecture and results of a fully digital Front End Electronics (FEE) read out system developed for the GALILEO array. The FEE system, developed in collaboration with the Advanced Gamma Tracking Array (AGATA) collaboration, is composed of three main blocks: preamplifiers, digitizers and preprocessing electronics. The slow control system contains a custom Linux driver, a dynamic library and a server implementing network services. This work presents the first results of the digital FEE system coupled with a GALILEO germanium detector, which has demonstrated the capability to achieve an energy resolution of 1.53‰ at an energy of 1.33 MeV, similar to the one obtained with a conventional analog system. While keeping a good performance in terms of energy resolution, digital electronics will make it possible to instrument the full GALILEO array with a versatile system with high integration and low power consumption and costs.
NASA Astrophysics Data System (ADS)
Abdolmohammadi, Hamid Reza; Khalaf, Abdul Jalil M.; Panahi, Shirin; Rajagopal, Karthikeyan; Pham, Viet-Thanh; Jafari, Sajad
2018-06-01
Nowadays, designing chaotic systems with hidden attractor is one of the most interesting topics in nonlinear dynamics and chaos. In this paper, a new 4D chaotic system is proposed. This new chaotic system has no equilibria, and so it belongs to the category of systems with hidden attractors. Dynamical features of this system are investigated with the help of its state-space portraits, bifurcation diagram, Lyapunov exponents diagram, and basin of attraction. Also a hardware realisation of this system is proposed by using field programmable gate arrays (FPGA). In addition, an electronic circuit design for the chaotic system is introduced.
Design of barrier bucket kicker control system
NASA Astrophysics Data System (ADS)
Ni, Fa-Fu; Wang, Yan-Yu; Yin, Jun; Zhou, De-Tai; Shen, Guo-Dong; Zheng, Yang-De.; Zhang, Jian-Chuan; Yin, Jia; Bai, Xiao; Ma, Xiao-Li
2018-05-01
The Heavy-Ion Research Facility in Lanzhou (HIRFL) contains two synchrotrons: the main cooler storage ring (CSRm) and the experimental cooler storage ring (CSRe). Beams are extracted from CSRm, and injected into CSRe. To apply the Barrier Bucket (BB) method on the CSRe beam accumulation, a new BB technology based kicker control system was designed and implemented. The controller of the system is implemented using an Advanced Reduced Instruction Set Computer (RISC) Machine (ARM) chip and a field-programmable gate array (FPGA) chip. Within the architecture, ARM is responsible for data presetting and floating-point arithmetic processing. The FPGA computes the RF phase point of the two rings and offers more accurate control of the time delay. An online preliminary experiment on HIRFL was also designed to verify the functionalities of the control system. The result shows that the reference trigger point of two different sinusoidal RF signals for an arbitrary phase point was acquired with a matched phase error below 1° (approximately 2.1 ns), and that a step delay time better than 2 ns was realized.
DeepX: Deep Learning Accelerator for Restricted Boltzmann Machine Artificial Neural Networks.
Kim, Lok-Won
2018-05-01
Although there have been many decades of research and commercial presence on high performance general purpose processors, there are still many applications that require fully customized hardware architectures for further computational acceleration. Recently, deep learning has been successfully used to learn in a wide variety of applications, but its heavy computational demand has considerably limited its practical applications. This paper proposes a fully pipelined acceleration architecture to alleviate the high computational demand of restricted Boltzmann machine (RBM) artificial neural networks (ANNs). The implemented RBM ANN accelerator (integrating network size, using 128 input cases per batch, and running at a 303-MHz clock frequency) integrated in a state-of-the-art field-programmable gate array (FPGA) (Xilinx Virtex 7 XC7V-2000T) provides a computational performance of 301-billion connection-updates-per-second and about 193 times higher performance than a software solution running on general purpose processors. Most importantly, the architecture enables over 4 times (12 times in batch learning) higher performance compared with a previous work when both are implemented in an FPGA device (XC2VP70).
A front-end readout Detector Board for the OpenPET electronics system
NASA Astrophysics Data System (ADS)
Choong, W.-S.; Abu-Nimeh, F.; Moses, W. W.; Peng, Q.; Vu, C. Q.; Wu, J.-Y.
2015-08-01
We present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, which allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. This digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.
A front-end readout Detector Board for the OpenPET electronics system
Choong, W. -S.; Abu-Nimeh, F.; Moses, W. W.; ...
2015-08-12
Here, we present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, which allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. In conclusion, this digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.
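The two per-channel quantities named in the abstract, pulse height and a leading-edge time stamp, can be illustrated on a digitized pulse. This is only a software sketch under stated assumptions: on the actual board the discriminator is a separate circuit feeding a TDC, whereas here the threshold crossing is found on ADC samples, and the function names, the 4-sample baseline window and the sample values are hypothetical.

```python
def leading_edge_index(samples, threshold):
    """Return the index of the first sample that rises through the threshold, or None."""
    for k in range(1, len(samples)):
        if samples[k - 1] < threshold <= samples[k]:
            return k
    return None


def pulse_height(samples, baseline_samples=4):
    """Peak value minus a baseline estimated from the first few (pre-pulse) samples."""
    baseline = sum(samples[:baseline_samples]) / baseline_samples
    return max(samples) - baseline


# Hypothetical digitized pulse in ADC counts.
pulse = [2, 2, 2, 2, 10, 40, 90, 60, 30, 10]
edge = leading_edge_index(pulse, threshold=25)
height = pulse_height(pulse)
print(edge, height)
```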
Full image-processing pipeline in field-programmable gate array for a small endoscopic camera
NASA Astrophysics Data System (ADS)
Mostafa, Sheikh Shanawaz; Sousa, L. Natércia; Ferreira, Nuno Fábio; Sousa, Ricardo M.; Santos, Joao; Wäny, Martin; Morgado-Dias, F.
2017-01-01
Endoscopy is an imaging procedure used for diagnosis as well as for some surgical purposes. The camera used for the endoscopy should be small and able to produce a good quality image or video, to reduce discomfort of the patients, and to increase the efficiency of the medical team. To achieve these fundamental goals, a small endoscopy camera with a footprint of 1 mm×1 mm×1.65 mm is used. Due to the physical properties of the sensors and human vision system limitations, different image-processing algorithms, such as noise reduction, demosaicking, and gamma correction, among others, are needed to faithfully reproduce the image or video. A full image-processing pipeline is implemented using a field-programmable gate array (FPGA) to accomplish a high frame rate of 60 fps with minimum processing delay. Along with this, a viewer has also been developed to display and control the image-processing pipeline. The control and data transfer are done by a USB 3.0 end point in the computer. The full developed system achieves real-time processing of the image and fits in a Xilinx Spartan-6LX150 FPGA.
Tethered Forth system for FPGA applications
NASA Astrophysics Data System (ADS)
Goździkowski, Paweł; Zabołotny, Wojciech M.
2013-10-01
This paper presents a tethered Forth system dedicated to testing and debugging of FPGA based electronic systems. Use of the Forth language makes it possible to interactively develop and run complex testing or debugging routines. The solution is based on a small, 16-bit soft core CPU, used to implement the Forth Virtual Machine. Thanks to the use of the tethered Forth model it is possible to minimize usage of the internal RAM in the FPGA. The function of the intelligent terminal, which is an essential part of the tethered Forth system, may be fulfilled by a standard PC or by a smartphone. The system is implemented in Python (the software for the intelligent terminal), and in VHDL (the IP core for the FPGA), so it can be easily ported to different hardware platforms. The connection between the terminal and FPGA may be established and disconnected many times without disturbing the state of the FPGA based system. The presented system has been verified in hardware, and may be used as a tool for debugging, testing and even implementing control algorithms for FPGA based systems.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms
NASA Astrophysics Data System (ADS)
Du, Hongtao; Qi, Hairong; Szu, Harold H.
2005-03-01
Recent advances in intelligence technology have boosted the development of micro-Unmanned Air Vehicles (UAVs) including Silver Fox, Shadow, and Scan Eagle for various surveillance and reconnaissance applications. These affordable and reusable devices have to fit a series of size, weight, and power constraints. Cameras used on such micro-UAVs are therefore mounted directly at a fixed angle without any motion-compensated gimbals. This mounting scheme has resulted in the so-called jitter effect in which jitter is defined as sub-pixel or small amplitude vibrations. The jitter blur caused by the jitter effect needs to be corrected before any other processing algorithms can be practically applied. Jitter restoration has been solved by various optimization techniques, including Wiener approximation, maximum a-posteriori probability (MAP), etc. However, these algorithms normally assume a spatial-invariant blur model that is not the case with jitter blur. Szu et al. developed a smart real-time algorithm based on auto-regression (AR) with its natural generalization of unsupervised artificial neural network (ANN) learning to achieve restoration accuracy at the sub-pixel level. This algorithm resembles the capability of the human visual system, in which an agreement between the pair of eyes indicates "signal", otherwise, the jitter noise. Using this non-statistical method, for each single pixel, a deterministic blind source separation (BSS) process can then be carried out independently based on a deterministic minimum of the Helmholtz free energy with a generalization of Shannon's information theory applied to open dynamic systems. From a hardware implementation point of view, the process of jitter restoration of an image using Szu's algorithm can be optimized by pixel-based parallelization.
In our previous work, a parallelly structured independent component analysis (ICA) algorithm has been implemented on both Field Programmable Gate Array (FPGA) and Application-Specific Integrated Circuit (ASIC) using standard-height cells. ICA is an algorithm that can solve BSS problems by carrying out the all-order statistical, decorrelation-based transforms, in which an assumption that neighborhood pixels share the same but unknown mixing matrix A is made. In this paper, we continue our investigation on the design challenges of firmware approaches to smart algorithms. We think two levels of parallelization can be explored, including pixel-based parallelization and the parallelization of the restoration algorithm performed at each pixel. This paper focuses on the latter and we use ICA as an example to explain the design and implementation methods. It is well known that the capacity constraints of single FPGA have limited the implementation of many complex algorithms including ICA. Using the reconfigurability of FPGA, we show, in this paper, how to manipulate the FPGA-based system to provide extra computing power for the parallelized ICA algorithm with limited FPGA resources. The synthesis aiming at the pilchard re-configurable FPGA platform is reported. The pilchard board is embedded with single Xilinx VIRTEX 1000E FPGA and transfers data directly to CPU on the 64-bit memory bus at the maximum frequency of 133MHz. Both the feasibility performance evaluations and experimental results validate the effectiveness and practicality of this synthesis, which can be extended to the spatial-variant jitter restoration for micro-UAV deployment.
Accuracy and Resolution Analysis of a Direct Resistive Sensor Array to FPGA Interface
Oballe-Peinado, Óscar; Vidal-Verdú, Fernando; Sánchez-Durán, José A.; Castellanos-Ramos, Julián; Hidalgo-López, José A.
2016-01-01
Resistive sensor arrays are formed by a large number of individual sensors which are distributed in different ways. This paper proposes a direct connection between an FPGA and a resistive array distributed in M rows and N columns, without the need of analog-to-digital converters to obtain resistance values in the sensor and where the conditioning circuit is reduced to the use of a capacitor in each of the columns of the matrix. The circuit allows parallel measurements of the N resistors which form each of the rows of the array, eliminating the resistive crosstalk which is typical of these circuits. This is achieved by an addressing technique which does not require external elements to the FPGA. Although the typical resistive crosstalk between resistors which are measured simultaneously is eliminated, other elements that have an impact on the measurement of discharge times appear in the proposed architecture and, therefore, affect the uncertainty in resistance value measurements; these elements need to be studied. Finally, the performance of different calibration techniques is assessed experimentally on a discrete resistor array, obtaining for a new model of calibration, a maximum relative error of 0.066% in a range of resistor values which correspond to a tactile sensor. PMID:26840321
Accuracy and Resolution Analysis of a Direct Resistive Sensor Array to FPGA Interface.
Oballe-Peinado, Óscar; Vidal-Verdú, Fernando; Sánchez-Durán, José A; Castellanos-Ramos, Julián; Hidalgo-López, José A
2016-02-01
Resistive sensor arrays are formed by a large number of individual sensors which are distributed in different ways. This paper proposes a direct connection between an FPGA and a resistive array distributed in M rows and N columns, without the need of analog-to-digital converters to obtain resistance values in the sensor and where the conditioning circuit is reduced to the use of a capacitor in each of the columns of the matrix. The circuit allows parallel measurements of the N resistors which form each of the rows of the array, eliminating the resistive crosstalk which is typical of these circuits. This is achieved by an addressing technique which does not require external elements to the FPGA. Although the typical resistive crosstalk between resistors which are measured simultaneously is eliminated, other elements that have an impact on the measurement of discharge times appear in the proposed architecture and, therefore, affect the uncertainty in resistance value measurements; these elements need to be studied. Finally, the performance of different calibration techniques is assessed experimentally on a discrete resistor array, obtaining for a new model of calibration, a maximum relative error of 0.066% in a range of resistor values which correspond to a tactile sensor.
A fast-locking all-digital delay-locked loop for phase/delay generation in an FPGA
NASA Astrophysics Data System (ADS)
Zhujia, Chen; Haigang, Yang; Fei, Liu; Yu, Wang
2011-10-01
A fast-locking all-digital delay-locked loop (ADDLL) is proposed for the DDR SDRAM controller interface in a field programmable gate array (FPGA). The ADDLL performs a 90° phase-shift so that the data strobe (DQS) can enlarge the data valid window in order to minimize skew. In order to further reduce the locking time and to prevent the harmonic locking problem, a time-to-digital converter (TDC) is proposed. A duty cycle corrector (DCC) is also designed in the ADDLL to adjust the output duty cycle to 50%. The ADDLL, implemented in a commercial 0.13 μm CMOS process, occupies a total of 0.017 mm² of active area. Measurement results show that the ADDLL has an operating frequency range of 75 to 350 MHz and a total delay resolution of 15 ps. The time interval error (TIE) of the proposed circuit is 60.7 ps.
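A short worked calculation relates the 90° phase shift to the reported numbers. It is a sketch under one assumption that the abstract does not state: that the 15 ps delay resolution acts as a uniform step of the delay line. The function names are hypothetical.

```python
def quarter_period_delay_ps(clock_hz):
    """Delay in picoseconds that corresponds to a 90 degree shift of a clock at clock_hz."""
    return 1e12 / (4.0 * clock_hz)


def delay_steps(clock_hz, resolution_ps=15.0):
    """Number of delay steps closest to a quarter period (uniform step size assumed)."""
    return round(quarter_period_delay_ps(clock_hz) / resolution_ps)


# The two ends of the reported 75 to 350 MHz operating range.
for clock_hz in (75e6, 350e6):
    print(clock_hz, quarter_period_delay_ps(clock_hz), delay_steps(clock_hz))
```

Under that assumption the quarter-period delay ranges from about 3.33 ns at 75 MHz down to about 0.71 ns at 350 MHz, so the delay line would have to cover roughly 48 to 222 steps.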
A digitalized silicon microgyroscope based on embedded FPGA.
Xia, Dunzhu; Yu, Cheng; Wang, Yuliang
2012-09-27
This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with a symmetrical and decoupled structure. The schematic blocks of the overall system consist of a high precision analog front-end interface, a high-speed 18-bit analog-to-digital converter, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit is implemented with an automatic gain control (AGC) loop and a software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module, based on varying step least mean square demodulation (LMSD), is addressed in detail. The algorithms are simulated with Simulink and DSPbuilder tools, and the results are in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system.
A Digitalized Silicon Microgyroscope Based on Embedded FPGA
Xia, Dunzhu; Yu, Cheng; Wang, Yuliang
2012-01-01
This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with a symmetrical and decoupled structure. The schematic blocks of the overall system consist of a high precision analog front-end interface, a high-speed 18-bit analog-to-digital converter, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit is implemented with an automatic gain control (AGC) loop and a software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module, based on varying step least mean square demodulation (LMSD), is addressed in detail. The algorithms are simulated with Simulink and DSPbuilder tools, and the results are in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system. PMID:23201990
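The CORDIC algorithm mentioned above computes rotations with only shifts, adds and a small table of arctangents, which is why it suits FPGA phase-locked loops. The following is a generic rotation-mode sketch in floating point for readability, not the authors' fixed-point design; the function name, the 16 iterations and the test angle are assumptions.

```python
import math


def cordic_rotate(x, y, angle_rad, iterations=16):
    """Rotate the vector (x, y) by angle_rad with CORDIC in rotation mode.

    Converges for |angle_rad| up to about 1.74 rad. Multiplying by 2**-i stands in
    for the right shift that a fixed-point hardware version would use.
    """
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]  # arctangent lookup table
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))  # accumulated CORDIC gain
    z = angle_rad  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate in the direction that reduces the residual
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x / gain, y / gain  # remove the constant gain


# Rotating the unit vector (1, 0) yields approximations of (cos, sin) of the angle.
cos_approx, sin_approx = cordic_rotate(1.0, 0.0, math.pi / 6)
print(cos_approx, sin_approx)
```

In a hardware version the gain is a constant for a fixed iteration count, so it is normally folded into the initial vector instead of being divided out at the end.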
A Gigabit-per-Second Ka-Band Demonstration Using a Reconfigurable FPGA Modulator
NASA Technical Reports Server (NTRS)
Lee, Dennis; Gray, Andrew A.; Kang, Edward C.; Tsou, Haiping; Lay, Norman E.; Fong, Wai; Fisher, Dave; Hoy, Scott
2005-01-01
Gigabit-per-second communications have been a desired target for future NASA Earth science missions, and for potential manned lunar missions. Frequency bandwidth at S-band and X-band is typically insufficient to support missions at these high data rates. In this paper, we present the results of a 1 Gbps 32-QAM end-to-end experiment at Ka-band using a reconfigurable Field Programmable Gate Array (FPGA) baseband modulator board. Bit error rate measurements of the received signal using a software receiver demonstrate the feasibility of using ultra-high data rates at Ka-band, although results indicate that error correcting coding and/or modulator predistortion must be implemented in addition. Also, results of the demonstration validate the low-cost, MOS-based reconfigurable modulator approach taken to development of a high rate modulator, as opposed to more expensive ASIC or pure analog approaches.
Real-Time Phase Correction Based on FPGA in the Beam Position and Phase Measurement System
NASA Astrophysics Data System (ADS)
Gao, Xingshun; Zhao, Lei; Liu, Jinxin; Jiang, Zouyi; Hu, Xiaofang; Liu, Shubin; An, Qi
2016-12-01
A fully digital beam position and phase measurement (BPPM) system was designed for the linear accelerator (LINAC) in Accelerator Driven Sub-critical System (ADS) in China. Phase information is obtained from the summed signals from four pick-ups of the Beam Position Monitor (BPM). Considering that the delay variations of different analog circuit channels would introduce phase measurement errors, we propose a new method to tune the digital waveforms of four channels before summation and achieve real-time error correction. The process is based on the vector rotation method and implemented within one single Field Programmable Gate Array (FPGA) device. Tests were conducted to evaluate this correction method and the results indicate that a phase correction precision better than ± 0.3° over the dynamic range from -60 dBm to 0 dBm is achieved.
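The idea of rotating each channel before summation can be sketched with I/Q vectors. This is an illustration under assumptions, not the published design: the per-channel phase offsets are taken as already known from calibration, the signal is represented as one I/Q pair per pick-up, and all names and numbers are hypothetical.

```python
import math


def rotate_iq(i, q, phase_rad):
    """Rotate the I/Q vector (i, q) by phase_rad."""
    c, s = math.cos(phase_rad), math.sin(phase_rad)
    return i * c - q * s, i * s + q * c


def corrected_sum(channels, phase_offsets_rad):
    """Undo each channel's known phase offset by rotation, then sum the channels."""
    total_i = total_q = 0.0
    for (i, q), offset in zip(channels, phase_offsets_rad):
        ci, cq = rotate_iq(i, q, -offset)
        total_i += ci
        total_q += cq
    return total_i, total_q


def phase_deg(i, q):
    """Phase of the I/Q vector in degrees."""
    return math.degrees(math.atan2(q, i))


# Hypothetical case: true beam phase 30 degrees, four pick-ups with different channel delays.
true_phase = math.radians(30.0)
offsets = [math.radians(v) for v in (0.0, 2.0, -3.0, 5.0)]
channels = [(math.cos(true_phase + o), math.sin(true_phase + o)) for o in offsets]

raw_phase = phase_deg(sum(c[0] for c in channels), sum(c[1] for c in channels))
fixed_phase = phase_deg(*corrected_sum(channels, offsets))
print(raw_phase, fixed_phase)
```

Without the rotation, the phase of the summed signal is pulled away from the true value by the channel offsets; with it, the four vectors add coherently.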
Multiple Embedded Processors for Fault-Tolerant Computing
NASA Technical Reports Server (NTRS)
Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy
2005-01-01
A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
NASA Technical Reports Server (NTRS)
Berg, Melanie D.; LaBel, Kenneth A.
2018-01-01
The following are updated or new subjects added to the FPGA SEE Test Guidelines manual: academic versus mission specific device evaluation, single event latch-up (SEL) test and analysis, SEE response visibility enhancement during radiation testing, mitigation evaluation (embedded and user-implemented), unreliable design and its effects on SEE data, testing flushable architectures versus non-flushable architectures, intellectual property core (IP Core) test and evaluation (addresses embedded and user-inserted), heavy-ion energy and linear energy transfer (LET) selection, proton versus heavy-ion testing, fault injection, mean fluence to failure analysis, and mission specific system-level single event upset (SEU) response prediction. Most sections within the guidelines manual provide information regarding best practices for test structure and test system development. The scope of this manual addresses academic versus mission specific device evaluation and visibility enhancement in IP Core testing.
A visually guided collision warning system with a neuromorphic architecture.
Okuno, Hirotsugu; Yagi, Tetsuya
2008-12-01
We have designed a visually guided collision warning system with a neuromorphic architecture, employing an algorithm inspired by the visual nervous system of locusts. The system was implemented with mixed analog-digital integrated circuits consisting of an analog resistive network and field-programmable gate array (FPGA) circuits. The resistive network processes the interaction between the laterally spreading excitatory and inhibitory signals instantaneously, which is essential for real-time computation of collision avoidance with a low power consumption and a compact hardware. The system responded selectively to approaching objects of simulated movie images at close range. The system was, however, confronted with serious noise problems due to the vibratory ego-motion, when it was installed in a mobile miniature car. To overcome this problem, we developed the algorithm, which is also installable in FPGA circuits, in order for the system to respond robustly during the ego-motion.
Stego on FPGA: An IWT Approach
Ramalingam, Balakrishnan
2014-01-01
A reconfigurable hardware architecture for the implementation of integer wavelet transform (IWT) based adaptive random image steganography algorithm is proposed. The Haar-IWT was used to separate the subbands namely, LL, LH, HL, and HH, from 8 × 8 pixel blocks and the encrypted secret data is hidden in the LH, HL, and HH blocks using Moore and Hilbert space filling curve (SFC) scan patterns. Either Moore or Hilbert SFC was chosen for hiding the encrypted data in LH, HL, and HH coefficients, whichever produces the lowest mean square error (MSE) and the highest peak signal-to-noise ratio (PSNR). The fixated random walk's verdict of all blocks is registered which is nothing but the furtive key. Our system took 1.6 µs for embedding the data in coefficient blocks and consumed 34% of the logic elements, 22% of the dedicated logic register, and 2% of the embedded multiplier on Cyclone II field programmable gate array (FPGA). PMID:24723794
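The subband split described above can be illustrated with one level of the integer Haar transform written as a lifting step (the S-transform), which is exactly invertible in integer arithmetic. This is a generic sketch, not the paper's hardware design: the function names are hypothetical, and the LH/HL labels follow one common convention (row filter first, then column filter) that may differ from the authors'.

```python
def haar_pair(a, b):
    """Integer Haar lifting step: returns (floor average, difference) of two samples."""
    d = a - b
    s = b + (d >> 1)  # equals floor((a + b) / 2) without overflow-prone addition
    return s, d


def haar_pair_inverse(s, d):
    """Exact inverse of haar_pair."""
    b = s - (d >> 1)
    return d + b, b


def haar_1d(values):
    """Transform consecutive pairs of a sequence; returns (low band, high band)."""
    pairs = [haar_pair(values[k], values[k + 1]) for k in range(0, len(values), 2)]
    return [s for s, _ in pairs], [d for _, d in pairs]


def transform_columns(half):
    """Apply haar_1d down every column of a row-major matrix."""
    cols = [haar_1d(list(col)) for col in zip(*half)]
    low = [list(row) for row in zip(*(c[0] for c in cols))]
    high = [list(row) for row in zip(*(c[1] for c in cols))]
    return low, high


def haar_2d(block):
    """One-level 2-D integer Haar transform of a square block with even side length."""
    row_results = [haar_1d(row) for row in block]
    low_rows = [lo for lo, _ in row_results]
    high_rows = [hi for _, hi in row_results]
    ll, lh = transform_columns(low_rows)
    hl, hh = transform_columns(high_rows)
    return ll, lh, hl, hh


# A flat 8 x 8 block has all of its energy in LL; the detail subbands are zero.
flat = [[100] * 8 for _ in range(8)]
ll, lh, hl, hh = haar_2d(flat)
print(ll[0], lh[0], hl[0], hh[0])
```

Because every step maps integers to integers and is exactly invertible, data hidden in the LH, HL and HH coefficients can be recovered without rounding loss, which is the usual reason for choosing an integer transform in steganography.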
Achieving High Performance with FPGA-Based Computing
Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug
2011-01-01
Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088
Extending the BEAGLE library to a multi-FPGA platform.
Jin, Zheming; Bakos, Jason D
2013-01-19
Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. 
To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
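The throughput figure quoted in the abstract follows from a bandwidth-bound model, which can be reproduced in a few lines. The numbers are taken from the abstract; the function names are hypothetical and the model is only the simple product stated in the text.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of I/O."""
    return flops / bytes_moved


def sustained_gflops(intensity, peak_bandwidth_gb_s, memory_efficiency):
    """Bandwidth-bound throughput: intensity x peak bandwidth x memory efficiency."""
    return intensity * peak_bandwidth_gb_s * memory_efficiency


intensity = arithmetic_intensity(130, 64)            # 130 flops per 64 bytes of I/O
throughput = sustained_gflops(intensity, 76.8, 0.5)  # 76.8 GB/s peak, about 50% efficiency
print(round(intensity, 2), round(throughput, 1))
```

The model makes the design trade-off explicit: at a fixed arithmetic intensity, throughput scales only with memory bandwidth and memory efficiency, which is why the abstract emphasizes both.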
NASA Astrophysics Data System (ADS)
Lai, Qiang; Zhao, Xiao-Wen; Rajagopal, Karthikeyan; Xu, Guanghui; Akgul, Akif; Guleryuz, Emre
2018-01-01
This paper considers the generation of multi-butterfly chaotic attractors from a generalised Sprott C system with multiple non-hyperbolic equilibria. The system is constructed by introducing an additional variable whose derivative has a switching function to the Sprott C system. It is numerically found that the system creates two-, three-, four-, five-butterfly attractors and any other multi-butterfly attractors. First, the dynamic analyses of multi-butterfly chaotic attractors are presented. Secondly, the field programmable gate array implementation, electronic circuit realisation and random number generator are done with the multi-butterfly chaotic attractors.
Removal of anti-Stokes emission background in STED microscopy by FPGA-based synchronous detection
NASA Astrophysics Data System (ADS)
Castello, M.; Tortarolo, G.; Coto Hernández, I.; Deguchi, T.; Diaspro, A.; Vicidomini, G.
2017-05-01
In stimulated emission depletion (STED) microscopy, the role of the STED beam is to de-excite, via stimulated emission, the fluorophores that have been previously excited by the excitation beam. This condition, together with specific beam intensity distributions, allows obtaining true sub-diffraction spatial resolution images. However, if the STED beam has a non-negligible probability to excite the fluorophores, a strong fluorescent background signal (anti-Stokes emission) reduces the effective resolution. For STED scanning microscopy, different synchronous detection methods have been proposed to remove this anti-Stokes emission background and recover the resolution. However, every method works only for a specific STED microscopy implementation. Here we present a user-friendly synchronous detection method compatible with any STED scanning microscope. It exploits a data acquisition (DAQ) card based on a field-programmable gate array (FPGA), which is increasingly used in STED microscopy. In essence, the FPGA-based DAQ card synchronizes the fluorescent signal registration, the beam deflection, and the excitation beam interruption, providing a fully automatic pixel-by-pixel synchronous detection method. We validate the proposed method in both continuous wave and pulsed STED microscope systems.
NASA Astrophysics Data System (ADS)
Kassem, A.; Sawan, M.; Boukadoum, M.; Haidar, A.
2005-12-01
We are concerned with the design, implementation, and validation of a perception SoC based on an ultrasonic array of sensors. The proposed SoC is dedicated to ultrasonic echography applications. A rapid prototyping platform is used to implement and validate the new architecture of the digital signal processing (DSP) core. The proposed DSP core efficiently integrates all of the necessary ultrasonic B-mode processing modules. It includes digital beamforming, quadrature demodulation of RF signals, digital filtering, and envelope detection of the received signals. This system handles 128 scan lines and 6400 samples per scan line with a[InlineEquation not available: see fulltext.] angle of view span. The design uses a minimum size lookup memory to store the initial scan information. Rapid prototyping using an ARM/FPGA combination is used to validate the operation of the described system. This system offers significant advantages of portability and a rapid time to market.
Qualification Strategies of Field Programmable Gate Arrays (FPGAs) for Space Application
NASA Technical Reports Server (NTRS)
Sheldon, Douglas; Schone, Harald
2005-01-01
This viewgraph document reviews the issues involved in using Field Programmable Gate Arrays (FPGAs) in space applications, and some of the strategies for qualifying the FPGA. Qualification and risk management of such complex systems require new approaches. A matrix approach to qualification is presented that: complements historical specifications; highlights the importance of device physics as a cornerstone of qualification; provides levels of risk management that expressly document trade-offs; and stresses the role of the FPGA vendor as a team member in the development of modern spacecraft.
Radiation Tolerant, FPGA-Based SmallSat Computer System
NASA Technical Reports Server (NTRS)
LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew
2015-01-01
The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
Field Programmable Gate Array for Implementation of Redundant Advanced Digital Feedback Control
NASA Technical Reports Server (NTRS)
King, K. D.
2003-01-01
The goal of this effort was to develop a digital motor controller using field programmable gate arrays (FPGAs). This is a more rugged approach than a conventional microprocessor digital controller. FPGAs typically have higher radiation (rad) tolerance than both the microprocessor and memory required for a conventional digital controller. Furthermore, FPGAs can typically operate at higher speeds. (While speed is usually not an issue for motor controllers, it can be for other system controllers.) Other than motor power, only a 3.3-V digital power supply was used in the controller; no analog bias supplies were used. Since most of the circuit was implemented in the FPGA, no additional parts were needed other than the power transistors to drive the motor. The benefits that FPGAs provide over conventional designs (lower power and fewer parts) allow for smaller packaging and reduced weight and cost.
A programmable controller based on CAN field bus embedded microprocessor and FPGA
NASA Astrophysics Data System (ADS)
Cai, Qizhong; Guo, Yifeng; Chen, Wenhei; Wang, Mingtao
2008-10-01
A new kind of programmable controller (PLC) is introduced in this paper. An advanced embedded microprocessor and a Field-Programmable Gate Array (FPGA) device are applied in the PLC system. The PLC system structure is presented. It includes a 32-bit Advanced RISC Machines (ARM) embedded microprocessor as the control core, an FPGA as the control arithmetic coprocessor, and a CAN bus as the data communication protocol connecting the host controller and its various extension modules. The circuits and working principles are described in detail, including the I/O interface circuit between the ARM and the FPGA and the interface circuit between the ARM and the FPGA coprocessor. The interface circuit diagrams between the various modules are also given. In addition, the paper describes how the ladder diagram program controls the transfer of information in the control arithmetic part of the FPGA coprocessor. After nearly two months of operation, the PLC meets the basic design requirements.
NASA Technical Reports Server (NTRS)
Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.
2017-01-01
Field Programmable Gate Array (FPGA) integrated circuits (ICs) are among the key electronic components in the complex avionic systems of today's sophisticated launch and space vehicles, largely due to their superb reprogrammable and reconfigurable capabilities combined with relatively low non-recurring engineering (NRE) costs and short design cycles. Consequently, FPGAs are prevalent ICs in communication protocols and control signal commands. This paper will identify reliability concerns and high-level guidelines to estimate FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation-induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high-level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
High altitude subsonic parachute field programmable gate array
NASA Technical Reports Server (NTRS)
Kowalski, James E.; Gromov, Konstantin; Konefat, Edward H.
2005-01-01
This paper describes a rapid, top-down, requirements-driven design of an FPGA used in an Earth qualification test program for a new Mars subsonic parachute. The FPGA is used to process and control storage of telemetry data from multiple sensors throughout the launch, ascent, deployment and descent phases of the subsonic parachute test.
Li, Zong-Tao; Wu, Tie-Jun; Lin, Can-Long; Ma, Long-Hua
2011-01-01
A new generalized optimum strapdown algorithm with coning and sculling compensation is presented, in which the position, velocity and attitude updating operations are carried out in a single-speed structure: all computations are executed at a single updating rate that is sufficiently high to accurately account for high-frequency angular rate and acceleration rectification effects. Unlike existing algorithms, the updating rates of the coning and sculling compensations are independent of the number of gyro incremental angle samples and the number of accelerometer incremental velocity samples. When the output sampling rate of the inertial sensors remains constant, this algorithm allows the updating rate of the coning and sculling compensation to be increased while using more gyro incremental angle and accelerometer incremental velocity samples, in order to improve the accuracy of the system. Then, in order to implement the new strapdown algorithm in a single FPGA chip, a parallel version of the algorithm is designed and its computational complexity is analyzed. The performance of the proposed parallel strapdown algorithm is tested on the Xilinx ISE 12.3 software platform and the FPGA device XC6VLX550T hardware platform on the basis of some fighter data. The results show that, relative to existing implementations on a DSP platform, this parallel strapdown algorithm on the FPGA platform greatly decreases the execution time, meeting the real-time and high-precision requirements of the system in a highly dynamic environment. PMID:22164058
Design of transient light signal simulator based on FPGA
NASA Astrophysics Data System (ADS)
Kang, Jing; Chen, Rong-li; Wang, Hong
2014-11-01
A design scheme for a transient light signal simulator based on a Field Programmable Gate Array (FPGA) is proposed in this paper. Based on the characteristics of transient light signals and measured feature points of optical intensity signals, a fitted curve was created in MATLAB. The wave data was then stored in a programmed memory chip, the AT29C1024, using a SUPERPRO programmer. The control logic was realized inside one EP3C16 FPGA chip. Data readout, data stream cache and a constant current buck regulator for powering high-brightness LEDs were all controlled by the FPGA. A 12-bit multiplying CMOS digital-to-analog converter (DAC), the DAC7545, and an OPA277 amplifier were used to convert digital signals to voltage signals. A voltage-controlled current source consisting of an NPN transistor and an operational amplifier controlled the dimming of the LED array to simulate the transient light signal. The LM3405A, a 1 A constant current buck regulator for powering LEDs, was used to simulate the strong background signal in space. Experimental results showed that the scheme can stably satisfy the design requirements of a transient light signal simulator.
Real-time FPGA architectures for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2000-03-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resource utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
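The idea of reading each pixel from image memory only once can be sketched in software with two line buffers and a 3x3 window of registers. The 3x3 mask size, the buffer layout and the function name below are assumptions for illustration, not details taken from the paper, and the mask is applied without flipping (correlation form).

```python
def convolve3x3_stream(image, mask):
    """Streaming 3x3 mask operation over the valid region of `image`.

    Each pixel is fetched once; two line buffers hold the previous two rows
    and a 3x3 window plays the role of the FPGA register array.
    """
    height, width = len(image), len(image[0])
    line1 = [0] * width  # row r-1
    line2 = [0] * width  # row r-2
    out = []
    for r in range(height):
        window = [[0, 0, 0] for _ in range(3)]
        out_row = []
        for c in range(width):
            pixel = image[r][c]  # the only access to image memory
            column = [line2[c], line1[c], pixel]
            # Shift the window one column left and load the new column.
            for k in range(3):
                window[k] = [window[k][1], window[k][2], column[k]]
            # Update the line buffers for the next row.
            line2[c] = line1[c]
            line1[c] = pixel
            if r >= 2 and c >= 2:
                out_row.append(sum(mask[i][j] * window[i][j]
                                   for i in range(3) for j in range(3)))
        if r >= 2:
            out.append(out_row)
    return out


image = [[4 * r + c for c in range(4)] for r in range(4)]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = convolve3x3_stream(image, box)
print(result)
```

In an FPGA the nine multiply-accumulate operations in the inner sum would run in parallel each clock cycle; here they are evaluated sequentially only to keep the sketch short.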
Efficient Digital Implementation of The Sigmoidal Function For Artificial Neural Network
NASA Astrophysics Data System (ADS)
Pratap, Rana; Subadra, M.
2011-10-01
An efficient piecewise linear approximation of a nonlinear function (PLAN) is proposed. It uses a design in the Simulink environment to perform a direct transformation from X to Y, where X is the input and Y is the approximated sigmoidal output. This PLAN is then used at the outputs of an artificial neural network to perform the nonlinear approximation. This paper proposes a method to implement different approximations of the sigmoid function in FPGA (Field Programmable Gate Array) circuits. The major benefit of the proposed method lies in the possibility of designing neural networks by means of predefined block systems created in the System Generator environment, and of creating higher-level design tools for implementing neural networks in logic circuits.
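A minimal sketch of a piecewise linear sigmoid follows. The abstract does not state its breakpoints or slopes, so the values below are an assumption: they are the segment coefficients commonly cited for the PLAN approximation in the literature. They are convenient in hardware because the slopes 0.25, 0.125 and 0.03125 are powers of two and can be realized as bit shifts.

```python
import math


def plan_sigmoid(x):
    """Piecewise linear approximation of 1 / (1 + exp(-x)).

    Segment coefficients are assumed (commonly cited PLAN values), not
    taken from the abstract. Negative inputs use the symmetry
    sigmoid(-x) = 1 - sigmoid(x).
    """
    a = abs(x)
    if a >= 5.0:
        y = 1.0
    elif a >= 2.375:
        y = 0.03125 * a + 0.84375  # slope 2**-5
    elif a >= 1.0:
        y = 0.125 * a + 0.625      # slope 2**-3
    else:
        y = 0.25 * a + 0.5         # slope 2**-2
    return y if x >= 0 else 1.0 - y


def exact_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Worst-case absolute error on a grid from -8 to 8 in steps of 0.01.
grid = [i / 100 for i in range(-800, 801)]
max_error = max(abs(plan_sigmoid(x) - exact_sigmoid(x)) for x in grid)
print(max_error)
```

With these assumed coefficients the worst-case error on the grid stays below 0.02, which gives a feel for the accuracy cost of replacing the exponential with shifts and adds.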
Fast and Adaptive Lossless On-Board Hyperspectral Data Compression System for Space Applications
NASA Technical Reports Server (NTRS)
Aranki, Nazeeh; Bakhshi, Alireza; Keymeulen, Didier; Klimesh, Matthew
2009-01-01
Efficient on-board lossless hyperspectral data compression reduces the data volume necessary to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware, which makes it practical for flight implementations of pushbroom instruments. A prototype of the compressor (and decompressor) of the algorithm is available in software, but this implementation may not meet speed and real-time requirements of some space applications. Hardware acceleration provides performance improvements of 10x-100x vs. the software implementation (about 1M samples/sec on a Pentium IV machine). This paper describes a hardware implementation of the JPL-developed 'Fast Lossless' compression algorithm on a Field Programmable Gate Array (FPGA). The FPGA implementation targets the current state-of-the-art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.
Moreno-Tapia, Sandra Veronica; Vera-Salas, Luis Alberto; Osornio-Rios, Roque Alfredo; Dominguez-Gonzalez, Aurelio; Stiharu, Ion; de Jesus Romero-Troncoso, Rene
2010-01-01
Computer numerically controlled (CNC) machines have evolved to adapt to increasing technological and industrial requirements. To cover these needs, new generation machines have to perform monitoring strategies by incorporating multiple sensors. Since online processing of the variables is essential in most applications, the use of smart sensors is necessary. The contribution of this work is the development of a wireless network platform of reconfigurable smart sensors for CNC machine applications complying with the measurement requirements of new generation CNC machines. Four different smart sensors are put under test in the network and their corresponding signal processing techniques are implemented in a Field Programmable Gate Array (FPGA)-based sensor node. PMID:22163602
Adaptive Controller for Compact Fourier Transform Spectrometer with Space Applications
NASA Astrophysics Data System (ADS)
Keymeulen, D.; Yiu, P.; Berisford, D. F.; Hand, K. P.; Carlson, R. W.; Conroy, M.
2014-12-01
Here we present noise mitigation techniques developed as part of an adaptive controller for a very compact Compositional InfraRed Interferometric Spectrometer (CIRIS) implemented on a stand-alone field programmable gate array (FPGA) architecture with emphasis on space applications in high radiation environments such as Europa. CIRIS is a novel take on traditional Fourier Transform Spectrometers (FTS) and replaces linearly moving mirrors (characteristic of Michelson interferometers) with a constant-velocity rotating refractor to variably phase shift and alter the path length of incoming light. The design eschews a monochromatic reference laser typically used for sampling clock generation and instead utilizes constant time-sampling via internally generated clocks. This allows for a compact and robust device, making it ideal for spaceborne measurements in the near-IR to thermal-IR band (2-12 µm) on planetary exploration missions. The instrument's embedded microcontroller is implemented on a VIRTEX-5 FPGA and a PowerPC with the aim of sampling the instrument's detector and optical rotary encoder in order to construct interferograms. Subsequent onboard signal processing provides spectral immunity from the noise effects introduced by the compact design's removal of a reference laser and by the radiation encountered during space flight to destinations such as Europa. A variety of signal processing techniques including resampling, radiation peak removal, Fast Fourier Transform (FFT), spectral feature alignment, dispersion correction and calibration processes are applied to compose the sample spectrum in real-time with signal-to-noise-ratio (SNR) performance comparable to laser-based FTS designs in radiation-free environments. The instrument's FPGA controller is demonstrated with the FTS to characterize its noise mitigation techniques and highlight its suitability for implementation in space systems.
XMOS XC-2 Development Board for Mechanical Control and Data Collection
NASA Technical Reports Server (NTRS)
Jarnot, Robert F.; Bowden, William J.
2011-01-01
The scanning microwave limb sounder (SMLS) will use technological improvements in low-noise mixers to provide precise data on the Earth's atmospheric composition with high spatial resolution. This project focuses on the design and implementation of a real-time control system needed for airborne engineering tests of the SMLS. The system must coordinate the actuation of optical components using four motors with encoder readback, while collecting synchronized telemetric data from a GPS receiver and 3-axis gyrometric system. A graphical user interface for testing the control system was also designed using Python. Although the system could have been implemented with an FPGA (field-programmable gate array)-based setup, a processor development kit manufactured by XMOS was chosen. The XMOS architecture allows parallel execution of multiple tasks on separate threads, making it ideal for this application. It is easily programmed using XC (a subset of C). The necessary communication interfaces were implemented in software, including Ethernet, with significant cost and time reduction compared to an FPGA-based approach. A simple approach to control the chopper, calibration mirror, and gimbal for the airborne SMLS was needed. The XMOS board allows for multiple threads and real-time data acquisition. The XC-2 development kit is an attractive choice for synchronized, real-time, event-driven applications. The XMOS is based on the transputer microprocessor architecture developed for parallel computing, which is being revamped in this new platform. The XMOS device has multiple cores capable of running parallel applications on separate threads. The threads communicate with each other via user-defined channels capable of transmitting data within the device. XMOS provides a C-based development environment using XC, which eliminates the need for custom tool kits associated with FPGA programming. The XC-2 has four cores and the necessary hardware for Ethernet I/O.
NASA Astrophysics Data System (ADS)
Alvear, Andrés; Finger, Ricardo; Fuentes, Roberto; Sapunar, Raúl; Geelen, Tom; Curotto, Franco; Rodríguez, Rafael; Monasterio, David; Reyes, Nicolás; Mena, Patricio; Bronfman, Leonardo
2016-07-01
The capacity of Field Programmable Gate Arrays (FPGAs) and the speed of Analog to Digital Converters (ADCs) have increased greatly in the last decade. Nowadays we can find one million or more logic blocks (slices) as well as several thousand arithmetic units (ALUs/DSP) available on a single FPGA chip. We can also commercially procure ADC chips reaching 10 GSPS, with a resolution of 8 bits or more. This unprecedented power of computing hardware has allowed the digitization of signal processes traditionally performed by analog components. In radio astronomy, the clearest example has been the development of digital sideband separating receivers which, by replacing the IF hybrid and calibrating the system imbalances, have exhibited a sideband rejection above 40 dB; this is 20 to 30 dB higher than traditional analog sideband separating (2SB) receivers. In Rodriguez et al. [1] and Finger et al. [2], we have demonstrated very high digital sideband separation at 3 mm and 1 mm wavelengths, using laboratory setups. Here we show the first implementation of this technique with a 3 mm receiver integrated into a telescope, where the calibration was performed by quasi-optical injection of the test tone in front of the Cassegrain antenna. We also report progress in digital polarization synthesis, particularly in the implementation of a calibrated Digital Ortho-Mode Transducer (DOMT) based on the Morgan et al. proof of concept [3]. They showed off-line synthesis of polarization with isolation higher than 40 dB. We plan to implement a digital polarimeter in a real-time FPGA-based (ROACH-2) platform, to show ultra-pure polarization isolation in a non-stop integrating spectrometer.
Configurable test bed design for nanosats to qualify commercial and customized integrated circuits
NASA Astrophysics Data System (ADS)
Guareschi, W.; Azambuja, J.; Kastensmidt, F.; Reis, R.; Durao, O.; Schuch, N.; Dessbesel, G.
The use of small satellites has increased substantially in recent years due to the reduced cost of their development and launch, as well as to the flexibility offered by commercial components. The test bed is a platform that allows components to be evaluated and tested in space. It is a flexible platform, which can be adjusted to a wide range of components and interfaces. This work proposes the design and implementation of a test bed suitable for test and evaluation of commercial circuits used in nanosatellites. The development of such a platform allows developers to reduce the effort spent on the integration of components and therefore speed up the overall system development time. The proposed test bed is a configurable platform implemented using a Field Programmable Gate Array (FPGA) that controls the communication protocols and connections to the devices under test. The Flash-based ProASIC3E FPGA from Microsemi is used as a control system. This adaptive system enables the control of new payloads and softcores for test and validation in space. Thus, the integration can be easily performed through configuration parameters. The design is intended to be modular: each component connected to the test bed can have a specific interface programmed using a hardware description language (HDL). The data of each component is stored in embedded memories. Each component has its own memory space, and the size of the allocated memory can also be configured. The data transfer priority can be set and packaging can be added to the logic when needed. Communication with peripheral devices and with the Onboard Computer (OBC) is done through the pre-implemented protocols, such as I2C (Inter-Integrated Circuit), SPI (Serial Peripheral Interface) and external memory control. In loco primary tests demonstrated the control system's functionality.
The commercial ProASIC3E FPGA family is not space-flight qualified, but tests under Total Ionizing Dose (TID) have shown its robustness up to 25 krads (Si). With respect to protons and heavy ions, flash-based FPGAs provide immunity to configuration loss and low susceptibility to bit-flips in flash memory. In this first version of the test bed, two components are connected to the controller FPGA: a commercial magnetometer and a hardened test chip. The embedded FPGA implements a Single Event Effects (SEE) hardened microprocessor and a few other soft cores to be used in space. This test bed will be used in the NanoSatC-BR1, the first Brazilian CubeSat, scheduled to be launched in mid-2013.
A reconfigurable cryogenic platform for the classical control of quantum processors
NASA Astrophysics Data System (ADS)
Homulle, Harald; Visser, Stefan; Patra, Bishnu; Ferrari, Giorgio; Prati, Enrico; Sebastiano, Fabio; Charbon, Edoardo
2017-04-01
The implementation of a classical control infrastructure for large-scale quantum computers is challenging due to the need for integration and processing time, which is constrained by coherence time. We propose a cryogenic reconfigurable platform as the heart of the control infrastructure implementing the digital error-correction control loop. The platform is implemented on a field-programmable gate array (FPGA) that supports the functionality required by several qubit technologies and that can operate close to the physical qubits over a temperature range from 4 K to 300 K. This work focuses on the extensive characterization of the electronic platform over this temperature range. All major FPGA building blocks (such as look-up tables (LUTs), carry chains (CARRY4), mixed-mode clock manager (MMCM), phase-locked loop (PLL), block random access memory, and IDELAY2 (programmable delay element)) operate correctly and the logic speed is very stable. The logic speed of LUTs and CARRY4 changes less than 5%, whereas the jitter of MMCM and PLL clock managers is reduced by 20%. The stability is finally demonstrated by operating an integrated 1.2 GSa/s analog-to-digital converter (ADC) with a relatively stable performance over temperature. The ADC's effective number of bits drops from 6 to 4.5 bits when operating at 15 K.
A hardware implementation of the discrete Pascal transform for image processing
NASA Astrophysics Data System (ADS)
Goodman, Thomas J.; Aburdene, Maurice F.
2006-02-01
The discrete Pascal transform is a polynomial transform with applications in pattern recognition, digital filtering, and digital image processing. It already has been shown that the Pascal transform matrix can be decomposed into a product of binary matrices. Such a factorization leads to a fast and efficient hardware implementation without the use of multipliers, which consume large amounts of hardware. We recently developed a field-programmable gate array (FPGA) implementation to compute the Pascal transform. Our goal was to demonstrate the computational efficiency of the transform while keeping hardware requirements at a minimum. Images are uploaded into memory from a remote computer prior to processing, and the transform coefficients can be offloaded from the FPGA board for analysis. Design techniques like as-soon-as-possible scheduling and adder sharing allowed us to develop a fast and efficient system. An eight-point, one-dimensional transform completes in 13 clock cycles and requires only four adders. An 8x8 two-dimensional transform completes in 240 cycles and requires only a top-level controller in addition to the one-dimensional transform hardware. Finally, through minor modifications to the controller, the transform operations can be pipelined to achieve 100% utilization of the four adders, allowing one eight-point transform to complete every seven clock cycles.
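To make the multiplier-free claim concrete, the sketch below assumes the commonly cited definition of the discrete Pascal transform, y_i = sum over j of (-1)^j C(i, j) x_j, and computes it in two ways: directly from that definition, and by repeated differencing, which needs only subtractions. The differencing scheme is an illustrative alternative, not the authors' binary-matrix factorization, and the function names are hypothetical.

```python
from math import comb


def pascal_transform_direct(x):
    """Reference version: multiply by the signed Pascal matrix."""
    n = len(x)
    return [sum((-1) ** j * comb(i, j) * x[j] for j in range(i + 1))
            for i in range(n)]


def pascal_transform_adders(x):
    """Multiplier-free version using repeated differences.

    After i passes of d[j] <- d[j] - d[j + 1], the first element equals
    the i-th transform coefficient.
    """
    d = list(x)
    out = []
    for _ in range(len(x)):
        out.append(d[0])
        d = [d[j] - d[j + 1] for j in range(len(d) - 1)]
    return out


samples = [3, 1, 4, 1, 5, 9, 2, 6]  # eight-point input, as in the abstract
coeffs = pascal_transform_adders(samples)
print(coeffs)
# Under the assumed definition the transform is its own inverse.
print(pascal_transform_direct(coeffs) == samples)
```

The software version uses a shrinking list of subtractions per pass; a hardware design would instead share a small number of adders across passes, which is the trade-off the abstract describes.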
Multi-DSP and FPGA based Multi-channel Direct IF/RF Digital receiver for atmospheric radar
NASA Astrophysics Data System (ADS)
Yasodha, Polisetti; Jayaraman, Achuthan; Kamaraj, Pandian; Durga rao, Meka; Thriveni, A.
2016-07-01
Modern phased array radars depend heavily on digital signal processing (DSP) to extract the echo signal information and to achieve reliability along with programmability and flexibility. The advent of ASIC technology has allowed various digital signal processing steps to be realized in one DSP chip, which can be programmed for the application, can handle high data rates, and can be used in the radar receiver to process the received signal. Further, field programmable gate array (FPGA) chips, which can be reprogrammed, now also present an opportunity to process the radar signal. A multi-channel direct IF/RF digital receiver (MCDRx) has been developed at NARL, taking advantage of high-speed ADCs and high-performance DSP chips/FPGAs, for use in atmospheric radars working in the HF/VHF bands. Multiple channels allow the radar to be operated in multi-receiver modes and also allow the wind vector to be obtained with improved time resolution, without switching the antenna beam. MCDRx has six channels, implemented on a custom-built digital board, which is realized using six ADCs for simultaneous processing of the six input signals, a Xilinx Virtex-5 FPGA and a Spartan-6 FPGA, and two ADSP-TS201 DSP chips, each of which performs one phase of processing. The MCDRx unit interfaces with the data storage/display computer via two gigabit Ethernet (GbE) links. One of the six channels is dedicated to Doppler beam swinging (DBS) mode and the other five channels are dedicated to multi-receiver mode operations. Each channel has (i) an ADC block, to digitize the RF/IF signal, (ii) a DDC block for digital down conversion of the digitized signal, (iii) a decoding block to decode the phase-coded signal, and (iv) a coherent integration block for integrating the data while keeping the phase intact. The ADC block consists of Analog Devices AD9467 16-bit ADCs, which digitize the input signal at 80 MSPS. The output of the ADC is centered around (80 MHz - input frequency).
The digitized data is fed to the DDC block, which down-converts the data to baseband. The DDC block has an NCO, a mixer and two chains of Bessel filters (a fifth-order cascaded integrator-comb filter, two FIR filters, two half-band filters and programmable FIR filters) for the in-phase (I) and quadrature-phase (Q) channels. The NCO has 32 bits and is set to match the output frequency of the ADC. Further, the DDC downsamples (decimates) the data and reduces the data rate to 16 MSPS. This data is further decimated and the data rate is reduced to 4/2/1/0.5/0.25/0.125/0.0625 MSPS for baud lengths of 0.25/0.5/1/2/4/8/16 μs respectively. The downsampled data is then fed to the decoding block, which performs cross-correlation to achieve pulse compression of the binary-phase coded data, giving better range resolution with the maximum possible height coverage. This step improves the signal power by a factor equal to the length of the code. The coherent integration block integrates the decoded data coherently over successive pulses, which improves the signal-to-noise ratio and reduces the data volume. The DDC, decoding and coherent integration blocks are implemented in the Xilinx Virtex-5 FPGA. Up to this point, the function of all six channels is the same for DBS mode and multi-receiver modes. Data from the Virtex-5 FPGA is transferred to the PC via the GbE-1 interface for multi-receiver modes, or to two Analog Devices ADSP-TS201 DSP chips (A and B) via a link port for DBS mode. The ADSP-TS201 chips perform normalization, DC removal, windowing, FFT computation and spectral averaging on the data, which is transferred to the storage/display PC via the GbE-2 interface for real-time data display and storage. The physical layer of the GbE interface is implemented in an external chip (Marvell 88E1111) and the MAC layer is implemented inside the Virtex-5 FPGA. The MCDRx has a total of 4 GB of DDR2 memory for data storage. The Spartan-6 FPGA is used to generate the timing signals required for basic operation of the radar and for testing the MCDRx.
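The decoding and coherent integration steps described above can be sketched as follows. The 13-bit Barker code, the synthetic single-target echo and all function names are assumptions for illustration; the abstract does not say which binary phase code the radar transmits.

```python
# 13-bit Barker code, used here only as an example of a binary phase code
# (assumption: the actual transmit code is not given in the abstract).
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def decode(samples, code):
    """Pulse compression: cross-correlate complex samples with the code."""
    n = len(code)
    return [sum(samples[k + i] * code[i] for i in range(n))
            for k in range(len(samples) - n + 1)]


def coherent_integrate(pulses):
    """Add the same range gate across pulses as complex numbers (phase kept)."""
    return [sum(gates) for gates in zip(*pulses)]


def make_pulse(echo_gate, amplitude, length=32):
    """Synthetic noise-free return: one coded echo starting at `echo_gate`."""
    pulse = [0j] * length
    for i, chip in enumerate(BARKER13):
        pulse[echo_gate + i] += amplitude * chip
    return pulse


decoded = [decode(make_pulse(5, 0.5 + 0.5j), BARKER13) for _ in range(4)]
integrated = coherent_integrate(decoded)
peak_gate = max(range(len(integrated)), key=lambda g: abs(integrated[g]))
print(peak_gate, integrated[peak_gate])
```

In this noise-free example the decoded peak amplitude is 13 times the echo amplitude, and integrating four pulses multiplies it by four again. With noise present, the noise adds incoherently, which is where the signal-to-noise improvement comes from.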
A novel pipeline based FPGA implementation of a genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2014-05-01
To solve problems for which an analytical solution is not available, more and more bio-inspired computation techniques have been applied in recent years. One efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong have higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). The GA performs several processes on the individuals of the population to produce a new population, as in biological evolution. To provide a high speed solution, pipeline-based FPGA hardware implementations are used, with an n-stage pipeline for an n-phase genetic algorithm. The FPGA pipeline implementations are constrained by the different execution times of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique that modifies the crossover step by using non-identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children), not only one as in the classical GA. We analyze the contribution of this method to reducing the execution time in asynchronous and synchronous pipelines, and also the possibility of a cheaper FPGA implementation by using smaller populations. The full hardware architecture for an FPGA implementation on our target ALTERA development card is presented and analyzed.
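The idea of two parents producing two children in a single crossover step can be illustrated in software. The sketch below uses ordinary single-point crossover and keeps both complementary offspring; the abstract does not specify the exact "non-identical twins" operator, so the function name and the choice of single-point crossover are assumptions.

```python
import random


def twin_crossover(parent_a, parent_b, rng):
    """Single-point crossover that returns both complementary children.

    Assumption: a plain single-point cut stands in for the paper's twin
    operator, which the abstract does not detail.
    """
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cut = rng.randrange(1, len(parent_a))      # cut strictly inside the chromosome
    child_1 = parent_a[:cut] + parent_b[cut:]  # head of A, tail of B
    child_2 = parent_b[:cut] + parent_a[cut:]  # head of B, tail of A
    return child_1, child_2


rng = random.Random(0)                         # seeded for repeatability
mother = [1, 1, 1, 1, 1, 1, 1, 1]
father = [0, 0, 0, 0, 0, 0, 0, 0]
twin_1, twin_2 = twin_crossover(mother, father, rng)
print(twin_1, twin_2)
```

Because each crossover yields two chromosomes instead of one, a population of a given size can be refilled with half as many crossover operations, which is one way a shorter stage time or a smaller population could arise.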
Schaefer, R T; MacAskill, J A; Mojarradi, M; Chutjian, A; Darrach, M R; Madzunkov, S M; Shortt, B J
2008-09-01
Reported herein is development of a quadrupole mass spectrometer controller (MSC) with integrated radio frequency (rf) power supply and mass spectrometer drive electronics. Advances have been made in terms of the physical size and power consumption of the MSC, while simultaneously making improvements in frequency stability, total harmonic distortion, and spectral purity. The rf power supply portion of the MSC is based on a series-resonant LC tank, where the capacitive load is the mass spectrometer itself, and the inductor is a solenoid or toroid, with various core materials. The MSC drive electronics is based on a field programmable gate array (FPGA), with serial peripheral interface for analog-to-digital and digital-to-analog converter support, and RS232/RS422 communications interfaces. The MSC offers spectral quality comparable to, or exceeding, that of conventional rf power supplies used in commercially available mass spectrometers; and as well an inherent flexibility, via the FPGA implementation, for a variety of tasks that includes proportional-integral derivative closed-loop feedback and control of rf, rf amplitude, and mass spectrometer sensitivity. Also provided are dc offsets and resonant dipole excitation for mass selective accumulation in applications involving quadrupole ion traps; rf phase locking and phase shifting for external loading of a quadrupole ion trap; and multichannel scaling of acquired mass spectra. The functionality of the MSC is task specific, and is easily modified by simply loading FPGA registers or reprogramming FPGA firmware.
Design of a Ferroelectric Programmable Logic Gate Array
NASA Technical Reports Server (NTRS)
MacLeod, Todd C.; Ho, Fat Duen
2003-01-01
A programmable logic gate array has been designed utilizing ferroelectric field effect transistors. The design has only a small number of gates, but this could be scaled up to a more useful size. Using FFET's in a logic array gives several advantages. First, it allows real-time programmability to the array to give high speed reconfiguration. It also allows the array to be configured nearly an unlimited number of times, unlike a FLASH FPGA. Finally, the Ferroelectric Programmable Logic Gate Array (FPLGA) can be implemented using a smaller number of transistors because of the inherent logic characteristics of an FFET. The device was only designed and modeled using Spice models of the circuit, including the FFET. The actual device was not produced. The design consists of a small array of NAND and NOR logic gates. Other gates could easily be produced. They are linked by FFET's that control the logic flow. Timing and logic tables have been produced showing the array can produce a variety of logic combinations at a real time usable speed. This device could be a prototype for a device that could be put into imbedded systems that need the high speed of hardware implementation of logic and the complexity to need to change the logic algorithm. Because of the non-volatile nature of the FFET, it would also be useful in situations that needed to program a logic array once and use it repeatedly after the power has been shut off.
A control system based on field programmable gate array for papermaking sewage treatment
NASA Astrophysics Data System (ADS)
Zhang, Zi Sheng; Xie, Chang; Qing Xiong, Yan; Liu, Zhi Qiang; Li, Qing
2013-03-01
A sewage treatment control system is designed to improve the efficiency of a papermaking wastewater treatment system. The automation control system is based on a Field Programmable Gate Array (FPGA), coded in Very-High-Speed Integrated Circuit Hardware Description Language (VHDL), and compiled and simulated with Quartus. In order to ensure the stability of the data used in the FPGA, the data is collected through temperature sensors, a water level sensor and an online pH measurement system. The automatic control system is more sensitive, and both the treatment efficiency and processing power are increased. This work provides a new method for sewage treatment control.
Cycle accurate and cycle reproducible memory for an FPGA based hardware accelerator
Asaad, Sameh W.; Kapur, Mohit
2016-03-15
A method, system and computer program product are disclosed for using a Field Programmable Gate Array (FPGA) to simulate operations of a device under test (DUT). The DUT includes a device memory having a number of input ports, and the FPGA is associated with a target memory having a second number of input ports, the second number being less than the first number. In one embodiment, a given set of inputs is applied to the device memory at a frequency Fd and in a defined cycle of time, and the given set of inputs is applied to the target memory at a frequency Ft. Ft is greater than Fd and cycle accuracy is maintained between the device memory and the target memory. In an embodiment, a cycle accurate model of the DUT memory is created by separating the DUT memory interface protocol from the target memory storage array.
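One way to read the relation between Ft and Fd is time multiplexing: the target memory serves all of the device memory's ports by taking several fast accesses per slow device cycle. The sketch below models that reading in software. The class, the request format and the read-before-write ordering within a device cycle are all assumptions made for illustration, not details taken from the patent abstract.

```python
from math import ceil


def min_clock_ratio(device_ports, target_ports):
    """Smallest Ft/Fd that lets the target serve every device port each device cycle."""
    return ceil(device_ports / target_ports)


class TimeMultiplexedMemory:
    """Emulates a many-port device memory on a target memory with fewer ports."""

    def __init__(self, device_ports, target_ports=1):
        self.device_ports = device_ports
        self.target_ports = target_ports
        self.store = {}
        self.target_cycles = 0

    def device_cycle(self, requests):
        """Serve one device cycle; requests are ('r', addr) or ('w', addr, value).

        Assumption: reads return the contents from before this cycle's writes.
        """
        if len(requests) > self.device_ports:
            raise ValueError("more requests than device ports")
        snapshot = dict(self.store)
        results = []
        for start in range(0, len(requests), self.target_ports):
            self.target_cycles += 1            # one fast target-memory cycle
            for req in requests[start:start + self.target_ports]:
                if req[0] == "r":
                    results.append(snapshot.get(req[1], 0))
                else:
                    self.store[req[1]] = req[2]
                    results.append(None)
        return results


mem = TimeMultiplexedMemory(device_ports=4, target_ports=1)
print(mem.device_cycle([("w", 0, 7), ("r", 0), ("w", 1, 9), ("r", 1)]))
print(mem.device_cycle([("r", 0), ("r", 1)]))
print("target cycles used:", mem.target_cycles)
```

Seen from the device side, every port gets its answer within the same device cycle, which is the sense in which cycle accuracy is preserved in this simplified model.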
A FPGA-based architecture for real-time image matching
NASA Astrophysics Data System (ADS)
Wang, Jianhui; Zhong, Sheng; Xu, Wenhui; Zhang, Weijun; Cao, Zhiguo
2013-10-01
Image matching is a fundamental task in computer vision. It is used to establish correspondence between two images taken at different viewpoint or different time from the same scene. However, its large computational complexity has been a challenge to most embedded systems. This paper proposes a single FPGA-based image matching system, which consists of SIFT feature detection, BRIEF descriptor extraction and BRIEF matching. It optimizes the FPGA architecture for the SIFT feature detection to reduce the FPGA resources utilization. Moreover, we implement BRIEF description and matching on FPGA also. The proposed system can implement image matching at 30fps (frame per second) for 1280x720 images. Its processing speed can meet the demand of most real-life computer vision applications.
Implementation of Multispectral Image Classification on a Remote Adaptive Computer
NASA Technical Reports Server (NTRS)
Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna
1999-01-01
As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms is justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of magnitude performance increase over microprocessor based systems. The automatic classification of spaceborne multispectral images is an example of a computation intensive application that can benefit from implementation on an FPGA-based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm implemented on a typical general-purpose computer.
A Low Cost Matching Motion Estimation Sensor Based on the NIOS II Microprocessor
González, Diego; Botella, Guillermo; Meyer-Baese, Uwe; García, Carlos; Sanz, Concepción; Prieto-Matías, Manuel; Tirado, Francisco
2012-01-01
This work presents the implementation of a matching-based motion estimation sensor on a Field Programmable Gate Array (FPGA) and NIOS II microprocessor applying a C to Hardware (C2H) acceleration paradigm. The design, which involves several matching algorithms, is mapped using Very Large Scale Integration (VLSI) technology. These algorithms, as well as the hardware implementation, are presented here together with an extensive analysis of the resources needed and the throughput obtained. The developed low-cost system is practical for real-time throughput and reduced power consumption and is useful in robotic applications, such as tracking, navigation using an unmanned vehicle, or as part of a more complex system. PMID:23201989
Implementation of image transmission server system using embedded Linux
NASA Astrophysics Data System (ADS)
Park, Jong-Hyun; Jung, Yeon Sung; Nam, Boo Hee
2005-12-01
In this paper, we implement an image transmission server system using an embedded system that is built for a specific purpose and is easy to install and move. Since the embedded system has lower capability than a PC, we have to reduce the amount of computation in the baseline JPEG image compression and transmission. We used the Redhat Linux 9.0 OS on the host PC and a target board based on embedded Linux. The image sequences are obtained from a camera attached to an FPGA (Field Programmable Gate Array) board with an ALTERA Corporation chip. For efficiency, and to avoid some constraints imposed by the vendor's own driver, we wrote the device driver as a kernel module.
NASA Astrophysics Data System (ADS)
Ahmad, Nabihah; Rifen, A. Aminurdin M.; Helmy Abd Wahab, Mohd
2016-11-01
An Automated Teller Machine (ATM) is an electronic banking outlet that allows bank customers to complete banking transactions without the aid of any bank official or teller. Several problems are associated with the use of ATM cards, such as card cloning, card damage, card expiry, card skimming, the cost of issuance and maintenance, and access to customer accounts by third parties. The aim of this project is to free the user from the card by replacing it with a biometric security system that accesses the bank account using the Advanced Encryption Standard (AES) algorithm. The project is implemented on a Field Programmable Gate Array (FPGA) DE2-115 board with a Cyclone IV device, a fingerprint scanner, and a Multi-Touch Liquid Crystal Display (LCD) Second Edition (MTL2), using Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL). This project uses 128-bit AES, with a throughput of around 19.016 Gbps and a utilization of around 520 slices. This design offers secure banking transactions with low area and high performance, and is well suited to space-restricted environments with small amounts of RAM or ROM where either encryption or decryption is performed.
NASA Technical Reports Server (NTRS)
Frank, Andreas O.; Twombly, I. Alexander; Barth, Timothy J.; Smith, Jeffrey D.; Dalton, Bonnie P. (Technical Monitor)
2001-01-01
We have applied the linear elastic finite element method to compute haptic force feedback and domain deformations of soft tissue models for use in virtual reality simulators. Our results show that, for virtual object models of high-resolution 3D data (>10,000 nodes), haptic real time computations (>500 Hz) are not currently possible using traditional methods. Current research efforts are focused in the following areas: 1) efficient implementation of fully adaptive multi-resolution methods and 2) multi-resolution methods with specialized basis functions to capture the singularity at the haptic interface (point loading). To achieve real time computations, we propose parallel processing of a Jacobi preconditioned conjugate gradient method applied to a reduced system of equations resulting from surface domain decomposition. This can effectively be achieved using reconfigurable computing systems such as field programmable gate arrays (FPGA), thereby providing a flexible solution that allows for new FPGA implementations as improved algorithms become available. The resulting soft tissue simulation system would meet NASA Virtual Glovebox requirements and, at the same time, provide a generalized simulation engine for any immersive environment application, such as biomedical/surgical procedures or interactive scientific applications.
Empirical Mode Decomposition and Neural Networks on FPGA for Fault Diagnosis in Induction Motors
Garcia-Perez, Arturo; Osornio-Rios, Roque Alfredo; Romero-Troncoso, Rene de Jesus
2014-01-01
Nowadays, many industrial applications require online systems that combine several processing techniques in order to offer solutions to complex problems as the case of detection and classification of multiple faults in induction motors. In this work, a novel digital structure to implement the empirical mode decomposition (EMD) for processing nonstationary and nonlinear signals using the full spline-cubic function is presented; besides, it is combined with an adaptive linear network (ADALINE)-based frequency estimator and a feed forward neural network (FFNN)-based classifier to provide an intelligent methodology for the automatic diagnosis during the startup transient of motor faults such as: one and two broken rotor bars, bearing defects, and unbalance. Moreover, the overall methodology implementation into a field-programmable gate array (FPGA) allows an online and real-time operation, thanks to its parallelism and high-performance capabilities as a system-on-a-chip (SoC) solution. The detection and classification results show the effectiveness of the proposed fused techniques; besides, the high precision and minimum resource usage of the developed digital structures make them a suitable and low-cost solution for this and many other industrial applications. PMID:24678281
Empirical mode decomposition and neural networks on FPGA for fault diagnosis in induction motors.
Camarena-Martinez, David; Valtierra-Rodriguez, Martin; Garcia-Perez, Arturo; Osornio-Rios, Roque Alfredo; Romero-Troncoso, Rene de Jesus
2014-01-01
Nowadays, many industrial applications require online systems that combine several processing techniques in order to offer solutions to complex problems as the case of detection and classification of multiple faults in induction motors. In this work, a novel digital structure to implement the empirical mode decomposition (EMD) for processing nonstationary and nonlinear signals using the full spline-cubic function is presented; besides, it is combined with an adaptive linear network (ADALINE)-based frequency estimator and a feed forward neural network (FFNN)-based classifier to provide an intelligent methodology for the automatic diagnosis during the startup transient of motor faults such as: one and two broken rotor bars, bearing defects, and unbalance. Moreover, the overall methodology implementation into a field-programmable gate array (FPGA) allows an online and real-time operation, thanks to its parallelism and high-performance capabilities as a system-on-a-chip (SoC) solution. The detection and classification results show the effectiveness of the proposed fused techniques; besides, the high precision and minimum resource usage of the developed digital structures make them a suitable and low-cost solution for this and many other industrial applications.
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.
Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph
2013-08-01
Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
NASA Astrophysics Data System (ADS)
Song, Z.; Wang, Y.; Kuang, J.
2018-05-01
Field Programmable Gate Arrays (FPGAs) made with 28 nm and more advanced process technology have great potential for the implementation of high precision time-to-digital converters (TDCs), because the delay cells in the tapped delay line (TDL) used for time interpolation are getting smaller and smaller. However, the bubble problems in the TDL status are becoming more complicated, which makes it difficult to achieve TDCs with a high time precision on these chips. In this paper, we propose a novel decomposition encoding scheme, which not only solves the bubble problem easily, but also has a high encoding efficiency. With this scheme, the potential of these chips to realize TDCs can be fully exploited. In a Xilinx Kintex-7 FPGA chip, we implemented a TDC system with 256 TDC channels, which doubles the number of TDC channels that our previous technique could achieve. The performance of all these TDC channels is evaluated. The average RMS time precision among them is 10.23 ps in the time-interval measurement range of (0–10 ns), and their measurement throughput reaches 277 M measurements per second.
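To make the bubble problem concrete, the sketch below compares two simple encoders for a TDL snapshot: a naive one that reports the position of the first 0, and a ones-counting encoder that is insensitive to the order of the taps. Ones counting is a commonly used bubble-tolerant technique and is shown only as an illustration; it is not the decomposition encoding scheme proposed in the abstract, and the function names are hypothetical.

```python
def encode_first_zero(taps):
    """Naive encoder: index of the first 0 in the thermometer code."""
    for index, bit in enumerate(taps):
        if bit == 0:
            return index
    return len(taps)


def encode_ones_count(taps):
    """Bubble-tolerant encoder: count the 1s, ignoring where they sit."""
    return sum(taps)


ideal = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]     # clean thermometer code
bubbled = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]   # two neighbouring taps swapped

for name, taps in (("ideal", ideal), ("bubbled", bubbled)):
    print(name, encode_first_zero(taps), encode_ones_count(taps))
```

On the bubbled snapshot the naive encoder is off by one tap while the ones count is unchanged, which is why bubble handling matters more as delay cells shrink and tap ordering becomes less regular.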
Design and FPGA implementation for MAC layer of Ethernet PON
NASA Astrophysics Data System (ADS)
Zhu, Zengxi; Lin, Rujian; Chen, Jian; Ye, Jiajun; Chen, Xinqiao
2004-04-01
Ethernet passive optical network (EPON), which combines low cost, high bandwidth and support for multiple services, appears to be one of the best candidates for the next-generation access network. The work of standardizing EPON as a solution for the access network is still underway in the IEEE802.3ah Ethernet in the first mile (EFM) task force. The final release is expected in 2004. Up to now, there has been no standard application specific integrated circuit (ASIC) chip available which fulfills the functions of the media access control (MAC) layer of EPON. The MAC layer in an EPON system has many functions, such as point-to-point emulation (P2PE), Ethernet MAC functionality, multi-point control protocol (MPCP), network operation, administration and maintenance (OAM) and link security. To implement the functions mentioned above, an embedded real-time operating system (RTOS) and a flexible programmable logic device (PLD) with an embedded processor are used. The software and hardware functions in the MAC layer are realized by programming the embedded microprocessor and the field programmable gate array (FPGA). Finally, some experimental results are given in this paper. The method stated here can provide a valuable reference for developing an EPON MAC layer ASIC.
NASA Astrophysics Data System (ADS)
Won, Jun Yeon; Ko, Guen Bae; Lee, Jae Sung
2016-10-01
In this paper, we propose a fully time-based multiplexing and readout method that uses the principle of the global positioning system. Time-based multiplexing simplifies the multiplexing circuits, because only the innate traces that connect the signal pins of the silicon photomultiplier (SiPM) channels to the readout channels are used as the multiplexing circuit. Every SiPM channel is connected to a delay grid that consists of the traces on a printed circuit board, and the inherent transit times from each SiPM channel to the readout channels encode the position information uniquely. Thus, the position of each SiPM can be identified using time difference of arrival (TDOA) measurements. The proposed multiplexing also allows the readout circuit to be simplified by using a time-to-digital converter (TDC) implemented in a field-programmable gate array (FPGA), where the time-over-threshold (ToT) is used to extract the energy information after multiplexing. In order to verify the proposed multiplexing method, we built a positron emission tomography (PET) detector that consisted of an array of 4 × 4 LGSO crystals, each with a dimension of 3 × 3 × 20 mm³, and one-to-one coupled SiPM channels. We first employed a waveform sampler as an initial study, and then replaced the waveform sampler with an FPGA-TDC to further simplify the readout circuits. The 16 crystals were clearly resolved using only the time information obtained from the four readout channels. The coincidence resolving times (CRTs) were 382 and 406 ps FWHM when using the waveform sampler and the FPGA-TDC, respectively. The proposed simple multiplexing and readout methods can be useful for time-of-flight (TOF) PET scanners.
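The position decoding from arrival-time differences can be sketched as a lookup against expected TDOA signatures. The geometry below is entirely hypothetical: it places the four readout channels at the corners of a 4 × 4 grid and assumes a fixed 100 ps trace delay per grid step, neither of which is stated in the abstract. Only the counts (16 SiPMs, four readout channels) come from the text.

```python
GRID = 4                      # 4 x 4 SiPM array, as in the abstract
STEP_DELAY_PS = 100.0         # hypothetical trace delay per grid step
CORNERS = [(0, 0), (0, GRID - 1), (GRID - 1, 0), (GRID - 1, GRID - 1)]


def transit_times(row, col):
    """Hypothetical transit time from SiPM (row, col) to each readout channel."""
    return [STEP_DELAY_PS * (abs(row - r) + abs(col - c)) for r, c in CORNERS]


def tdoa(arrivals):
    """Arrival-time differences relative to the first readout channel."""
    return [t - arrivals[0] for t in arrivals[1:]]


# Expected TDOA signature of every SiPM position in this toy geometry.
SIGNATURES = {
    (r, c): tdoa(transit_times(r, c)) for r in range(GRID) for c in range(GRID)
}


def decode_position(arrivals):
    """Return the SiPM whose expected TDOA signature is nearest to the measurement."""
    measured = tdoa(arrivals)
    return min(
        SIGNATURES,
        key=lambda pos: sum((m - s) ** 2 for m, s in zip(measured, SIGNATURES[pos])),
    )


event_time_ps = 12345.0                       # unknown absolute time cancels out
jitter_ps = [10.0, -15.0, 20.0, -5.0]         # small per-channel timing error
arrivals = [event_time_ps + t + j for t, j in zip(transit_times(2, 1), jitter_ps)]
print(decode_position(arrivals))
```

As in satellite positioning, only differences between arrival times are used, so the unknown absolute event time drops out and four timing channels are enough to separate all 16 positions in this toy layout.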
Timing Results Using an FPGA-Based TDC with Large Arrays of 144 SiPMs
NASA Astrophysics Data System (ADS)
Aguilar, A.; González, A. J.; Torres, J.; García-Olcina, R.; Martos, J.; Soret, J.; Conde, P.; Hernández, L.; Sánchez, F.; Benlloch, J. M.
2015-02-01
Silicon photomultipliers (SiPMs) have become an alternative to traditional tubes due to several features. However, their implementation to form large arrays is still a challenge especially due to their relatively high intrinsic noise, depending on the chosen readout. In this contribution, two modules composed of 12 ×12 SiPMs with an area of roughly 50 mm×50 mm are used in coincidence. Coincidence resolving time (CRT) results with a field-programmable gate array, in combination with a time to digital converter, are shown as a function of both the sensor bias voltage and the digitizer threshold. The dependence of the CRT on the sensor matrix temperature, the amount of SiPM active area and the crystal type is also analyzed. Measurements carried out with a crystal array of 2 mm pixel size and 10 mm height have shown time resolutions for the entire 288 SiPM two-detector set-up as good as 800 ps full width at half maximum (FWHM).
Tang, Wenming; Liu, Guixiong; Li, Yuzhong; Tan, Daji
2017-01-01
High data transmission efficiency is a key requirement for an ultrasonic phased array with multi-group ultrasonic sensors. Here, a novel FIFOs scheduling algorithm was proposed and the data transmission efficiency with hardware technology was improved. This algorithm includes FIFOs as caches for the ultrasonic scanning data obtained from the sensors with the output data in a bandwidth-sharing way, on the basis of which an optimal length ratio of all the FIFOs is achieved, allowing the reading operations to be switched among all the FIFOs without time slot waiting. Therefore, this algorithm enhances the utilization ratio of the reading bandwidth resources so as to obtain higher efficiency than the traditional scheduling algorithms. The reliability and validity of the algorithm are substantiated after its implementation in the field programmable gate array (FPGA) technology, and the bandwidth utilization ratio and the real-time performance of the ultrasonic phased array are enhanced. PMID:29035345
Pipelined CPU Design with FPGA in Teaching Computer Architecture
ERIC Educational Resources Information Center
Lee, Jong Hyuk; Lee, Seung Eun; Yu, Heon Chang; Suh, Taeweon
2012-01-01
This paper presents a pipelined CPU design project with a field programmable gate array (FPGA) system in a computer architecture course. The class project is a five-stage pipelined 32-bit MIPS design with experiments on the Altera DE2 board. For proper scheduling, milestones were set every one or two weeks to help students complete the project on…
Zhang, Zhen; Ma, Cheng; Zhu, Rong
2017-08-23
Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.
Zhang, Zhen; Zhu, Rong
2017-01-01
Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas. PMID:28832522
HALO: a reconfigurable image enhancement and multisensor fusion system
NASA Astrophysics Data System (ADS)
Wu, F.; Hickman, D. L.; Parker, Steve J.
2014-06-01
Contemporary high definition (HD) cameras and affordable infrared (IR) imagers are set to dramatically improve the effectiveness of security, surveillance and military vision systems. However, the quality of imagery is often compromised by camera shake, or poor scene visibility due to inadequate illumination or bad atmospheric conditions. A versatile vision processing system called HALO™ is presented that can address these issues, by providing flexible image processing functionality on a low size, weight and power (SWaP) platform. Example processing functions include video distortion correction, stabilisation, multi-sensor fusion and image contrast enhancement (ICE). The system is based around an all-programmable system-on-a-chip (SoC), which combines the computational power of a field-programmable gate array (FPGA) with the flexibility of a CPU. The FPGA accelerates computationally intensive real-time processes, whereas the CPU provides management and decision making functions that can automatically reconfigure the platform based on user input and scene content. These capabilities enable a HALO™ equipped reconnaissance or surveillance system to operate in poor visibility, providing potentially critical operational advantages in visually complex and challenging usage scenarios. The choice of an FPGA based SoC is discussed, and the HALO™ architecture and its implementation are described. The capabilities of image distortion correction, stabilisation, fusion and ICE are illustrated using laboratory and trials data.
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
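The timing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes the 27 MHz sensor clock ticks once per pixel slot (blanking included), which the abstract does not state; the variable names are illustrative.

```python
PIXEL_CLOCK_HZ = 27_000_000   # sensor clock from the abstract
FRAME_RATE_FPS = 75           # video frame rate from the abstract
HW_LATENCY_MS = 13            # labeling + feature extraction latency
CPU_LATENCY_MS = 2            # distance/angle computation on the MicroBlaze
CPU_CLOCK_HZ = 100_000_000    # MicroBlaze clock from the abstract

# Clock ticks available per frame (assumed to include blanking intervals).
clocks_per_frame = PIXEL_CLOCK_HZ // FRAME_RATE_FPS

# Time between frames, and the hardware latency expressed in frame periods.
frame_period_ms = 1000 / FRAME_RATE_FPS
hw_latency_in_frames = HW_LATENCY_MS / frame_period_ms

# Processor cycles spent on the distance and angle computation.
cpu_cycles = CPU_LATENCY_MS * CPU_CLOCK_HZ // 1000

print("clock ticks per frame:", clocks_per_frame)
print("frame period (ms):", round(frame_period_ms, 2))
print("hardware latency in frame periods:", round(hw_latency_in_frames, 3))
print("MicroBlaze cycles per result:", cpu_cycles)
```

Under these assumptions the 13 ms hardware latency is just under one frame period, and the 2 ms software step fits comfortably between frames.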
FPGA-based gating and logic for multichannel single photon counting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pooser, Raphael C; Earl, Dennis Duncan; Evans, Philip G
2012-01-01
We present results characterizing multichannel InGaAs single photon detectors utilizing gated passive quenching circuits (GPQC), self-differencing techniques, and field programmable gate array (FPGA)-based logic for both diode gating and coincidence counting. Utilizing FPGAs for the diode gating frontend and the logic counting backend has the advantage of low cost compared to custom built logic circuits and current off-the-shelf detector technology. Further, FPGA logic counters have been shown to work well in quantum key distribution (QKD) test beds. Our setup combines multiple independent detector channels in a reconfigurable manner via an FPGA backend and post processing in order to perform coincidence measurements between any two or more detector channels simultaneously. Using this method, states from a multi-photon polarization entangled source are detected and characterized via coincidence counting on the FPGA. Photon detection events are also processed by the quantum information toolkit for application testing (QITKAT).
Toward an Ultralow-Power Onboard Processor for Tongue Drive System
Viseh, Sina; Ghovanloo, Maysam; Mohsenin, Tinoosh
2015-01-01
The Tongue Drive System (TDS) is a new unobtrusive, wireless, and wearable assistive device that allows for real-time tracking of the voluntary tongue motion in the oral space for communication, control, and navigation applications. The latest TDS prototype appears as a wireless headphone and has been tested in human subject trials. However, the robustness of the external TDS (eTDS) in real-life outdoor conditions may not meet safety regulations because of the limited mechanical stability of the headset. The intraoral TDS (iTDS), which is in the shape of a dental retainer, firmly clasps to the upper teeth and resists sensor misplacement. However, the iTDS has more restrictions on its dimensions, limiting the battery size and consequently requiring a considerable reduction in its power consumption to operate over an extended period of two days on a single charge. In this brief, we propose an ultralow-power local processor for the TDS that performs all signal processing on the transmitter side, following the sensors. Assuming that the TDS user issues one command/s on average, implementing the computational engine reduces the data volume that needs to be wirelessly transmitted to a PC or smartphone by a factor of 1500, from 12 kb/s to ~8 b/s. The proposed design is implemented on an ultralow-power IGLOO nano field-programmable gate array (FPGA) and is tested on an AGLN250 prototype board. According to our post-place-and-route results, implementing the engine on the FPGA significantly drops the required data transmission, while an application-specific integrated circuit (ASIC) implementation in a 65-nm CMOS results in a 15× power saving compared to the FPGA solution and occupies a 0.02-mm² footprint. As a result, the power consumption and size of the iTDS will be significantly reduced through the use of a much smaller rechargeable battery. Moreover, the system can operate longer following every recharge, improving the iTDS usability. PMID:26185489
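The data-volume claim in this abstract can be checked with one line of arithmetic. The sketch below uses only the two rates quoted above (12 kb/s raw, about 8 b/s after on-board processing); the helper name is hypothetical and not taken from the paper.

```python
# Rates quoted in the abstract; the function name is a hypothetical helper.
RAW_RATE_BPS = 12_000    # 12 kb/s of raw sensor data
COMMAND_RATE_BPS = 8     # ~8 b/s once commands are decoded before transmission


def reduction_factor(raw_bps: float, reduced_bps: float) -> float:
    """Ratio by which the wirelessly transmitted data volume shrinks."""
    return raw_bps / reduced_bps


if __name__ == "__main__":
    print(reduction_factor(RAW_RATE_BPS, COMMAND_RATE_BPS))
```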
Toward an Ultralow-Power Onboard Processor for Tongue Drive System.
Viseh, Sina; Ghovanloo, Maysam; Mohsenin, Tinoosh
2015-02-01
The Tongue Drive System (TDS) is a new unobtrusive, wireless, and wearable assistive device that allows for real-time tracking of the voluntary tongue motion in the oral space for communication, control, and navigation applications. The latest TDS prototype appears as a wireless headphone and has been tested in human subject trials. However, the robustness of the external TDS (eTDS) in real-life outdoor conditions may not meet safety regulations because of the limited mechanical stability of the headset. The intraoral TDS (iTDS), which is in the shape of a dental retainer, firmly clasps to the upper teeth and resists sensor misplacement. However, the iTDS has more restrictions on its dimensions, limiting the battery size and consequently requiring a considerable reduction in its power consumption to operate over an extended period of two days on a single charge. In this brief, we propose an ultralow-power local processor for the TDS that performs all signal processing on the transmitter side, following the sensors. Assuming that the TDS user issues one command/s on average, implementing the computational engine reduces the data volume that needs to be wirelessly transmitted to a PC or smartphone by a factor of 1500, from 12 kb/s to ~8 b/s. The proposed design is implemented on an ultralow-power IGLOO nano field-programmable gate array (FPGA) and is tested on an AGLN250 prototype board. According to our post-place-and-route results, implementing the engine on the FPGA significantly drops the required data transmission, while an application-specific integrated circuit (ASIC) implementation in a 65-nm CMOS results in a 15× power saving compared to the FPGA solution and occupies a 0.02-mm² footprint. As a result, the power consumption and size of the iTDS will be significantly reduced through the use of a much smaller rechargeable battery. Moreover, the system can operate longer following every recharge, improving the iTDS usability.
High density, multi-range analog output Versa Module Europa board for control system applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singh, Kundan, E-mail: kundan@iuac.res.in; Das, Ajit Lal
2014-01-15
A new VMEDAC64, a 12-bit, 64-channel digital-to-analog converter in the form of a Versa Module Europa (VME) module featuring 64 analog voltage outputs with user-selectable multiple ranges, has been developed for control system applications at Inter University Accelerator Centre. The FPGA (Field Programmable Gate Array) is the module's core, i.e., it implements the DAC control logic and the complex VMEbus slave interface logic. The VMEbus slave interface and DAC control logic are completely designed and implemented on a single FPGA chip to achieve a high density of 64 channels in a single-width VME module; this reduces the module count in control system applications and hence the power consumption and cost of the overall system. One of our early design goals was to develop the VME interface such that it can be easily integrated with the peripheral devices and satisfy the timing specifications of the VME standard. The modular design of this module reduces the amount of time required to develop other custom modules for the control system. The VME slave interface is written as a single component inside the FPGA, which will be used as a basic building block for any VMEbus interface project. The module offers multiple output voltage ranges depending upon the requirement. The output voltage range can be reduced or expanded by writing range selection bits in the control register. The module has a programmable refresh rate, and by default the hold capacitors in the sample-and-hold circuit for each channel are charged periodically every 7.040 ms (i.e., update frequency 284 Hz). Each channel has a software-controlled output switch which disconnects the analog output from the field. The modularity of the firmware design on the FPGA makes debugging very easy. On-board DC/DC converters are incorporated to provide an isolated power supply for the analog section of the board.
Design of an MR image processing module on an FPGA chip
NASA Astrophysics Data System (ADS)
Li, Limin; Wyrwicz, Alice M.
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments.
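The idea of running a 2D FFT without a physical matrix transposition can be illustrated in software. The abstract does not describe the authors' address generation unit, so the sketch below is only an assumption about the general technique: the image stays in one flat row-major buffer, and the column pass reads and writes it with a stride of n instead of transposing. All function names are hypothetical, and the radix-2 FFT requires n to be a power of two.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def addresses(start, stride, count):
    """Stand-in for an address generator: linear addresses of one row or column."""
    return [start + i * stride for i in range(count)]


def fft2d_flat(buf, n):
    """2D FFT of an n x n image stored row-major in buf, computed in place."""
    # Row pass: consecutive addresses (stride 1).
    for r in range(n):
        idx = addresses(r * n, 1, n)
        for i, v in zip(idx, fft1d([buf[i] for i in idx])):
            buf[i] = v
    # Column pass: same buffer, stride n, so no transposed copy is ever built.
    for c in range(n):
        idx = addresses(c, n, n)
        for i, v in zip(idx, fft1d([buf[i] for i in idx])):
            buf[i] = v
    return buf
```

In hardware the stride pattern would be produced by counters rather than Python lists, but the access order is the same.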
A firmware-defined digital direct-sampling NMR spectrometer for condensed matter physics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pikulski, M., E-mail: marekp@ethz.ch; Shiroka, T.; Ott, H.-R.
2014-09-15
We report on the design and implementation of a new digital, broad-band nuclear magnetic resonance (NMR) spectrometer suitable for probing condensed matter. The spectrometer uses direct sampling in both transmission and reception. It relies on a single, commercially-available signal processing device with a user-accessible field-programmable gate array (FPGA). Its functions are defined exclusively by the FPGA firmware and the application software. Besides allowing for fast replication, flexibility, and extensibility, our software-based solution preserves the option to reuse the components for other projects. The device operates up to 400 MHz without, and up to 800 MHz with undersampling, respectively. Digital down-conversion with ±10 MHz passband is provided on the receiver side. The system supports high repetition rates and has virtually no intrinsic dead time. We describe briefly how the spectrometer integrates into the experimental setup and present test data which demonstrates that its performance is competitive with that of conventional designs.
Design of an MR image processing module on an FPGA chip
Li, Limin; Wyrwicz, Alice M.
2015-01-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. PMID:25909646
A firmware-defined digital direct-sampling NMR spectrometer for condensed matter physics.
Pikulski, M; Shiroka, T; Ott, H-R; Mesot, J
2014-09-01
We report on the design and implementation of a new digital, broad-band nuclear magnetic resonance (NMR) spectrometer suitable for probing condensed matter. The spectrometer uses direct sampling in both transmission and reception. It relies on a single, commercially-available signal processing device with a user-accessible field-programmable gate array (FPGA). Its functions are defined exclusively by the FPGA firmware and the application software. Besides allowing for fast replication, flexibility, and extensibility, our software-based solution preserves the option to reuse the components for other projects. The device operates up to 400 MHz without, and up to 800 MHz with undersampling, respectively. Digital down-conversion with ±10 MHz passband is provided on the receiver side. The system supports high repetition rates and has virtually no intrinsic dead time. We describe briefly how the spectrometer integrates into the experimental setup and present test data which demonstrates that its performance is competitive with that of conventional designs.
Invited Article: Digital beam-forming imaging riometer systems
NASA Astrophysics Data System (ADS)
Honary, Farideh; Marple, Steve R.; Barratt, Keith; Chapman, Peter; Grill, Martin; Nielsen, Erling
2011-03-01
The design and operation of a new generation of digital imaging riometer systems developed by Lancaster University are presented. At the heart of the digital imaging riometer is a field-programmable gate array (FPGA), which is used for the digital signal processing and digital beam forming, completely replacing the analog Butler matrices which have been used in previous designs. The reconfigurable nature of the FPGA has been exploited to produce tools for remote system testing and diagnosis which have proven extremely useful for operation in remote locations such as the Arctic and Antarctic. Different FPGA programs enable different instrument configurations, including a 4 × 4 antenna filled array (producing 4 × 4 beams), an 8 × 8 antenna filled array (producing 7 × 7 beams), and a Mills cross system utilizing 63 antennas producing 556 usable beams. The concept of using a Mills cross antenna array for riometry has been successfully demonstrated for the first time. The digital beam forming has been validated by comparing the received signal power from cosmic radio sources with results predicted from the theoretical beam radiation pattern. The performances of four digital imaging riometer systems are compared against each other and a traditional imaging riometer utilizing analog Butler matrices. The comparison shows that digital imaging riometer systems, with independent receivers for each antenna, can obtain much better measurement precision for filled arrays or much higher spatial resolution for the Mills cross configuration when compared to existing imaging riometer systems.
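Digital beam forming, as opposed to an analog Butler matrix, amounts to multiplying each antenna's complex sample by a phase weight and summing. The sketch below is a generic narrowband phase-shift beamformer for a one-dimensional uniform array; it is an illustrative assumption, not the Lancaster design, whose arrays are two-dimensional. Element count, spacing, and angles are hypothetical.

```python
import cmath
import math


def steering_weights(num_elements, spacing_wavelengths, steer_deg):
    """Phase weights that align a plane wave arriving from steer_deg."""
    step = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(steer_deg))
    return [cmath.exp(-1j * step * n) for n in range(num_elements)]


def plane_wave(num_elements, spacing_wavelengths, arrival_deg):
    """Unit-amplitude complex samples seen by each element for one arrival angle."""
    step = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(arrival_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def beam_power(samples, weights):
    """Power of the weighted sum, i.e. the output of one formed beam."""
    total = sum(w * x for w, x in zip(weights, samples))
    return abs(total) ** 2
```

Because the weights are just numbers in memory, a different set of beams only needs a different weight table, which is the flexibility the abstract attributes to reprogramming the FPGA.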
Estimating the circuit delay of FPGA with a transfer learning method
NASA Astrophysics Data System (ADS)
Cui, Xiuhai; Liu, Datong; Peng, Yu; Peng, Xiyuan
2017-10-01
As the functionality of field programmable gate arrays (FPGAs) has increased, the FPGA has become an on-chip system platform. Because of this growing complexity, estimating FPGA circuit delay is very challenging. To address this problem, we propose a transfer learning estimation delay (TLED) method that simplifies delay estimation for FPGAs of different speed grades. FPGAs of the same type but different speed grades come from the same process and layout, so their delays are correlated. Therefore, one speed grade is chosen as the basic training sample in this paper. Training samples for the other speed grades are obtained from the basic training samples through transfer learning. At the same time, we also select a few target FPGA samples as training samples. A general predictive model is trained on these samples, so that a single estimation model can be used to estimate circuit delay for FPGAs of different speed grades. The TLED framework includes three phases: 1) Building a basic circuit delay library, which includes multipliers, adders, shifters, and so on. These circuits are used to train and build the predictive model. 2) Selecting, through comparative experiments among different algorithms, the random forest algorithm to train the predictive model. 3) Predicting the target circuit delay with the predictive model. The Artix-7, Kintex-7, and Virtex-7 are selected for the experiments. Each of them includes the -1, -2, -2L, and -3 speed grades. The experiments show that the delay estimation accuracy score is more than 92% with the TLED method. This result shows that TLED is a feasible, efficient, and effective delay assessment method, especially in the high-level synthesis stage of the FPGA tool flow.
Implementation of the 2-D Wavelet Transform into FPGA for Image
NASA Astrophysics Data System (ADS)
León, M.; Barba, L.; Vargas, L.; Torres, C. O.
2011-01-01
This paper presents a hardware implementation of the two-dimensional discrete wavelet transform algorithm on an FPGA, using the Daubechies filter family of order 2 (db2). The decomposition algorithm of this transform is designed and simulated in the hardware description language VHDL and is implemented in a programmable logic device (FPGA), the XC3S1200E from the Xilinx Spartan IIIE family, taking advantage of the parallelism these devices offer and the processing speeds they can reach. The architecture is evaluated using input images of different sizes. This implementation is done with the aim of developing a future image encryption hardware system that uses the wavelet transform for information security.
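For readers unfamiliar with the db2 decomposition, a software reference is a useful companion to a hardware design. The sketch below performs one level of a separable 2D transform with the standard four-tap db2 filters; periodic boundary extension and the sub-band naming are assumptions made here, since the abstract does not say how the VHDL design treats image borders.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Standard db2 low-pass analysis taps and the matching high-pass taps.
LO = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
HI = [LO[3], -LO[2], LO[1], -LO[0]]


def dwt1d(x):
    """One db2 level on an even-length sequence, with periodic wrap-around."""
    n = len(x)
    approx = [sum(LO[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    detail = [sum(HI[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return approx, detail


def transpose(m):
    return [list(col) for col in zip(*m)]


def filter_columns(m):
    """Apply dwt1d down every column; returns (low, high) halves as row lists."""
    lo, hi = [], []
    for col in transpose(m):
        a, d = dwt1d(col)
        lo.append(a)
        hi.append(d)
    return transpose(lo), transpose(hi)


def dwt2d(img):
    """One decomposition level: rows first, then columns. Returns LL, LH, HL, HH."""
    lo_rows, hi_rows = [], []
    for row in img:
        a, d = dwt1d(row)
        lo_rows.append(a)
        hi_rows.append(d)
    ll, lh = filter_columns(lo_rows)   # low-pass horizontally
    hl, hh = filter_columns(hi_rows)   # high-pass horizontally
    return ll, lh, hl, hh
```

With periodic extension the transform is orthogonal, so the total energy of the four sub-bands equals that of the input, which gives a simple check for a hardware testbench.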
Digital hardware implementation of a stochastic two-dimensional neuron model.
Grassia, F; Kohno, T; Levi, T
2016-11-01
This study explores the feasibility of stochastic neuron simulation in digital systems (FPGA), which realizes an implementation of a two-dimensional neuron model. The stochasticity is added by a source of current noise in the silicon neuron using an Ornstein-Uhlenbeck process. This approach uses digital computation to emulate individual neuron behavior using fixed point arithmetic operation. The neuron model's computations are performed in arithmetic pipelines. It was designed in VHDL language and simulated prior to mapping in the FPGA. The experimental results confirmed the validity of the developed stochastic FPGA implementation, which makes the implementation of the silicon neuron more biologically plausible for future hybrid experiments.
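An Ornstein-Uhlenbeck noise source in fixed-point arithmetic can be sketched with an Euler step, x[n+1] = x[n] + θ·dt·(μ − x[n]) + σ·√dt·ξ. The code below is a minimal illustration under assumptions: the Q16.16 format, the parameter values, and the use of Python's software Gaussian generator are all hypothetical choices, and the paper's actual word lengths and noise generator are not given in the abstract.

```python
import random

FRAC_BITS = 16            # assumed Q16.16 format, not taken from the paper
ONE = 1 << FRAC_BITS


def to_fixed(value):
    return int(round(value * ONE))


def from_fixed(q):
    return q / ONE


def fixed_mul(a, b):
    """Multiply two fixed-point numbers and rescale (arithmetic shift)."""
    return (a * b) >> FRAC_BITS


def ou_step(x_q, mu_q, theta_dt_q, sigma_sqrt_dt_q, gauss_q):
    """One Euler step of the OU process, using integer operations only."""
    drift = fixed_mul(theta_dt_q, mu_q - x_q)
    diffusion = fixed_mul(sigma_sqrt_dt_q, gauss_q)
    return x_q + drift + diffusion


def simulate(steps, mu, theta, sigma, dt, seed=0):
    rng = random.Random(seed)   # a hardware design would use an on-chip generator
    mu_q = to_fixed(mu)
    theta_dt_q = to_fixed(theta * dt)
    sigma_sqrt_dt_q = to_fixed(sigma * dt ** 0.5)
    x_q = 0
    trace = []
    for _ in range(steps):
        gauss_q = to_fixed(rng.gauss(0.0, 1.0))
        x_q = ou_step(x_q, mu_q, theta_dt_q, sigma_sqrt_dt_q, gauss_q)
        trace.append(from_fixed(x_q))
    return trace
```

Setting sigma to zero turns the process into a plain exponential relaxation toward mu, which is a convenient way to check the drift path separately from the noise path.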
NASA Technical Reports Server (NTRS)
Berg, Melanie; LaBel, Ken
2007-01-01
This viewgraph presentation reviews the selection of the optimum Field Programmable Gate Arrays (FPGA) for space missions. Included in this review is a discussion on differentiating amongst various FPGAs, cost analysis of the various options, the investigation of radiation effects, an expansion of the evaluation criteria, and the application of the evaluation criteria to the selection process.
ERIC Educational Resources Information Center
Zumel, P.; Fernandez, C.; Sanz, M.; Lazaro, A.; Barrado, A.
2011-01-01
In this paper, a short introductory course to introduce field-programmable gate array (FPGA)-based digital control of dc/dc switching power converters is presented. Digital control based on specific hardware has been at the leading edge of low-medium power dc/dc switching converters in recent years. Besides industry's interest in this topic, from…
NASA Technical Reports Server (NTRS)
Morfopoulos, Arin C.; Pham, Thang D.
2013-01-01
JPL has produced a series of FPGA (field programmable gate array) vision algorithms that were written with custom interfaces to get data in and out of each vision module. Each module has unique requirements on the data interface, and further vision modules are continually being developed, each with their own custom interfaces. Each memory module had also been designed for direct access to memory or to another memory module.
FPGA and USB based control board for quantum random number generator
NASA Astrophysics Data System (ADS)
Wang, Jian; Wan, Xu; Zhang, Hong-Fei; Gao, Yuan; Chen, Teng-Yun; Liang, Hao
2009-09-01
The design and implementation of an FPGA- and USB-based control board for quantum experiments are discussed. The use of a quantum true random number generator, control logic in the FPGA, and communication with a computer through the USB protocol are proposed in this paper. Programmable controlled signal input and output ports are implemented. Error detection for the data frame header and frame length is designed. This board has been used successfully in our decoy-state based quantum key distribution (QKD) system.
Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System
NASA Technical Reports Server (NTRS)
Aranki, Nazeeh I.; Keymeulen, Didier; Kimesh, Matthew A.
2012-01-01
Modern hyperspectral imaging systems are able to acquire far more data than can be downlinked from a spacecraft. Onboard data compression helps to alleviate this problem, but requires a system capable of power efficiency and high throughput. Software solutions have limited throughput performance and are power-hungry. Dedicated hardware solutions can provide both high throughput and power efficiency, while taking the load off of the main processor. Thus a hardware compression system was developed. The implementation uses a field-programmable gate array (FPGA). The implementation is based on the fast lossless (FL) compression algorithm reported in Fast Lossless Compression of Multispectral-Image Data (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), page 26, which achieves excellent compression performance and has low complexity. This algorithm performs predictive compression using an adaptive filtering method, and uses adaptive Golomb coding. The implementation also packetizes the coded data. The FL algorithm is well suited for implementation in hardware. In the FPGA implementation, one sample is compressed every clock cycle, which makes for a fast and practical realtime solution for space applications. Benefits of this implementation are: 1) The underlying algorithm achieves a combination of low complexity and compression effectiveness that exceeds that of techniques currently in use. 2) The algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. 3) Hardware acceleration provides a throughput improvement of 10 to 100 times vs. the software implementation. A prototype of the compressor is available in software, but it runs at a speed that does not meet spacecraft requirements. The hardware implementation targets the Xilinx Virtex IV FPGAs, and makes the use of this compressor practical for Earth satellites as well as beyond-Earth missions with hyperspectral instruments.
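The entropy-coding stage mentioned above can be illustrated with a Golomb-Rice code, the power-of-two special case of Golomb coding. This is a simplified sketch with a fixed parameter k and a zigzag mapping for signed prediction residuals; the FL algorithm adapts its coding parameter and uses its own predictor, neither of which is reproduced here. All function names are hypothetical.

```python
def zigzag(residual):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2,... -> 0,1,2,3,..."""
    return 2 * residual if residual >= 0 else -2 * residual - 1


def unzigzag(value):
    return value >> 1 if value % 2 == 0 else -((value + 1) >> 1)


def rice_encode(value, k):
    """Unary quotient (ones terminated by a zero) followed by k remainder bits."""
    quotient = value >> k
    bits = "1" * quotient + "0"
    if k:
        bits += format(value & ((1 << k) - 1), f"0{k}b")
    return bits


def rice_decode(bits, k):
    """Decode one codeword from the start of bits; returns (value, bits consumed)."""
    quotient = 0
    while bits[quotient] == "1":
        quotient += 1
    pos = quotient + 1
    remainder = int(bits[pos:pos + k], 2) if k else 0
    return (quotient << k) | remainder, pos + k
```

Small residuals produce short codewords, which is why the quality of the predictor largely determines the compression ratio.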
FPGA implementation of image dehazing algorithm for real time applications
NASA Astrophysics Data System (ADS)
Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.
2017-09-01
Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomenon, which makes the problem nontrivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation systems, etc. However, these applications require low latency from the preprocessing block. In this work, the single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although the conventional single image dark channel prior algorithm yields impressive results, it is computationally expensive. Moreover, a two stage image dehazing architecture is introduced, wherein the dark channel and airlight are estimated in the first stage, whereas the transmission map and intensity restoration are computed in the next stage. The algorithm is implemented using Xilinx Vivado software and validated using a Xilinx zc702 development board, which contains an Artix7 equivalent Field Programmable Gate Array (FPGA) and an ARM Cortex A9 dual core processor. Additionally, a high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second at an image resolution of 1920x1080, which is suitable for real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
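The stages named in the abstract (dark channel, airlight, transmission map, intensity restoration) follow the conventional dark channel prior formulation. The sketch below shows that conventional version, not the authors' modified one; the patch radius, the weight omega = 0.95, the floor t_min = 0.1, and the assumption that the airlight is already known are illustrative choices. Images are lists of rows of (r, g, b) tuples with values in [0, 1].

```python
def dark_channel(img, radius=1):
    """Minimum over colour channels, then minimum over a square neighbourhood."""
    h, w = len(img), len(img[0])
    per_pixel_min = [[min(px) for px in row] for row in img]
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            out[y][x] = min(per_pixel_min[j][i] for j in ys for i in xs)
    return out


def transmission(img, airlight, omega=0.95, radius=1):
    """t = 1 - omega * dark_channel(I / A)."""
    norm = [[tuple(c / a for c, a in zip(px, airlight)) for px in row] for row in img]
    return [[1.0 - omega * d for d in row] for row in dark_channel(norm, radius)]


def restore(img, airlight, t, t_min=0.1):
    """J = (I - A) / max(t, t_min) + A, applied per channel."""
    return [
        [tuple((c - a) / max(t[y][x], t_min) + a for c, a in zip(px, airlight))
         for x, px in enumerate(row)]
        for y, row in enumerate(img)
    ]
```

The neighbourhood minimum is the expensive step in software; in a streaming hardware design it is typically built from line buffers and comparators.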
FPGA implementation of sparse matrix algorithm for information retrieval
NASA Astrophysics Data System (ADS)
Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio
2005-06-01
Text information retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper, a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing can accommodate frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned to the data. In this work we implemented a standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR), which has been shown to be more efficient in terms of storage space requirements and query-processing time than the other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves a substantial efficiency gain over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
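The Compressed Sparse Row layout and the matrix-vector product used for query processing can be sketched in a few lines. This is a generic software reference for the data structure, with hypothetical function names; it says nothing about how the authors partition the work across FPGA processing units. In the retrieval setting, rows would correspond to documents, columns to terms, and the vector to the query.

```python
def to_csr(dense):
    """Convert a dense row-major matrix into (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))   # end of this row inside values/col_idx
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x, touching only the stored non-zero entries."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y
```

Each output row depends only on its own slice of the arrays, so rows can be processed independently, which is the parallelism a hardware implementation can exploit.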
High-speed multiple sequence alignment on a reconfigurable platform.
Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf
2006-01-01
Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
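A linear systolic array suits dynamic-programming distance computations because all cells on one anti-diagonal of the score matrix depend only on earlier anti-diagonals and can be updated in the same clock cycle. The sketch below makes that order explicit in software. Unit-cost edit distance is used as a stand-in; the abstract does not specify the scoring scheme, so the cost model and function name are assumptions.

```python
def edit_distance_wavefront(a, b):
    """Unit-cost edit distance, filled anti-diagonal by anti-diagonal."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    # Cells with i + j == diag are mutually independent: in a systolic array
    # each would be handled by a different processing element in one step.
    for diag in range(2, m + n + 1):
        for i in range(max(1, diag - n), min(m, diag - 1) + 1):
            j = diag - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[m][n]
```

The number of wavefront steps is m + n − 1 rather than m × n, which is where the speedup over a sequential fill comes from when enough processing elements are available.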
High-Precision Pulse Generator
NASA Technical Reports Server (NTRS)
Katz, Richard; Kleyner, Igor
2011-01-01
A document discusses a pulse generator with subnanosecond resolution implemented with a low-cost field-programmable gate array (FPGA) at low power levels. The method used exploits the fast carry chains of certain FPGAs. Prototypes have been built and tested in both Actel AX and Xilinx Virtex 4 technologies. In-flight calibration or control can be performed by using a similar and related technique as a time interval measurement circuit by measuring a period of the stable oscillator, as the delays through the fast carry chains will vary as a result of manufacturing variances as well as the result of environmental conditions (voltage, aging, temperature, and radiation).
A Hardware Platform for Tuning of MEMS Devices Using Closed-Loop Frequency Response
NASA Technical Reports Server (NTRS)
Ferguson, Michael I.; MacDonald, Eric; Foor, David
2005-01-01
We report on the development of a hardware platform for integrated tuning and closed-loop operation of MEMS gyroscopes. The platform was developed and tested for the second generation JPL/Boeing Post-Resonator MEMS gyroscope. The control of this device is implemented through a digital design on a Field Programmable Gate Array (FPGA). A software interface allows the user to configure, calibrate, and tune the bias voltages on the micro-gyro. The interface easily transitions to an embedded solution that allows for the miniaturization of the system to a single chip.
A FPGA embedded web server for remote monitoring and control of smart sensors networks.
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2013-12-27
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces: the web server, which acts as an Internet interface, and the other, which controls the network. This protocol is widely used to connect smart sensors, actuators, and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although it can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology.
A FPGA Embedded Web Server for Remote Monitoring and Control of Smart Sensors Networks
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2014-01-01
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces: the web server, which acts as an Internet interface, and the other, which controls the network. This protocol is widely used to connect smart sensors, actuators, and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although it can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology. PMID:24379047
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
Chao, Chun-Tang; Maneetien, Nopadon; Wang, Chi-Jo; Chiou, Juing-Shian
2014-01-01
This paper presents the design and evaluation of the hardware circuit for electronic stethoscopes with heart sound cancellation capabilities using field programmable gate arrays (FPGAs). The adaptive line enhancer (ALE) was adopted as the filtering methodology to reduce heart sound attributes from the breath sounds obtained via the electronic stethoscope pickup. FPGAs were utilized to implement the ALE functions in hardware to achieve near real-time breath sound processing. We believe that such an implementation is unprecedented and crucial toward a truly useful, standalone medical device in outpatient clinic settings. The implementation evaluation with one Altera Cyclone II-EP2C70F89 shows that the proposed ALE used 45% of the chip's resources. Experiments with the proposed prototype were made using a DE2-70 emulation board with recorded body signals obtained from online medical archives. Clear suppression was observed in our experiments from both the frequency domain and time domain perspectives.
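An adaptive line enhancer predicts the current sample from a delayed copy of the same signal, so the predictable (quasi-periodic) part ends up in the filter output and the rest in the error signal. The sketch below uses a plain LMS update; the delay, tap count, and step size are hypothetical values, and the paper's actual adaptation rule and word lengths are not stated in the abstract. In the stethoscope application the error output would carry the breath sounds with the heart-sound component reduced.

```python
def adaptive_line_enhancer(x, delay, taps, mu):
    """LMS-based ALE. Returns (predicted, error) sequences of len(x)."""
    w = [0.0] * taps
    predicted, error = [], []
    for n in range(len(x)):
        # Reference vector: the input delayed by `delay` samples (zero before start).
        ref = [x[n - delay - k] if n - delay - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wi * ri for wi, ri in zip(w, ref))
        e = x[n] - y
        # LMS weight update: steepest descent on the instantaneous squared error.
        w = [wi + 2.0 * mu * e * ri for wi, ri in zip(w, ref)]
        predicted.append(y)
        error.append(e)
    return predicted, error
```

The delay decorrelates the broadband component between the input and the reference, which is why only the periodic component is predicted.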
Extending the BEAGLE library to a multi-FPGA platform
2013-01-01
Background Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein’s pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein’s pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. Results The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform’s peak memory bandwidth and the implementation’s memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE’s CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE’s GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. 
Conclusions The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor. PMID:23331707
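The throughput figure quoted in the Results can be reproduced from the numbers in the abstract. The short Python sketch below only restates that arithmetic (intensity × peak bandwidth × memory efficiency); the constant and function names are illustrative and do not come from the paper or from the BEAGLE API.

```python
# Figures quoted in the abstract above.
FLOPS_PER_BLOCK = 130        # floating-point operations per I/O block
BYTES_PER_BLOCK = 64         # bytes of I/O per block
PEAK_BANDWIDTH_GBS = 76.8    # platform peak memory bandwidth in GB/s
MEMORY_EFFICIENCY = 0.5      # "approximately 50%"


def arithmetic_intensity(flops, nbytes):
    """Floating-point operations performed per byte of I/O."""
    return flops / nbytes


def throughput_gflops(intensity, peak_bw_gbs, efficiency):
    """Bandwidth-bound throughput: intensity x peak bandwidth x efficiency."""
    return intensity * peak_bw_gbs * efficiency


intensity = arithmetic_intensity(FLOPS_PER_BLOCK, BYTES_PER_BLOCK)
gflops = throughput_gflops(intensity, PEAK_BANDWIDTH_GBS, MEMORY_EFFICIENCY)
print(f"{intensity:.2f} ops/byte, {gflops:.0f} Gflops")
```

Because the kernel is bandwidth bound, the estimate scales linearly with memory efficiency, which is why the Conclusions stress memory efficiency as a design goal.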
An IO block array in a radiation-hardened SOI SRAM-based FPGA
NASA Astrophysics Data System (ADS)
Yan, Zhao; Lihua, Wu; Xiaowei, Han; Yan, Li; Qianli, Zhang; Liang, Chen; Guoquan, Zhang; Jianzhong, Li; Bo, Yang; Jiantou, Gao; Jian, Wang; Ming, Li; Guizhai, Liu; Feng, Zhang; Xufeng, Guo; Kai, Zhao; Chen, Stanley L.; Fang, Yu; Zhongli, Liu
2012-01-01
We present an input/output block (IOB) array used in the radiation-hardened SRAM-based field-programmable gate array (FPGA) VS1000, which is designed and fabricated with a 0.5 μm partially depleted silicon-on-insulator (SOI) logic process at the CETC 58th Institute. Corresponding with the characteristics of the FPGA, each IOB includes a local routing pool and two IO cells composed of a signal path circuit, configurable input/output buffers and an ESD protection network. A boundary-scan path circuit can be used between the programmable buffers and the input/output circuit or as a transparent circuit when the IOB is applied in different modes. Programmable IO buffers can be used at TTL/CMOS standard levels. The local routing pool enhances the flexibility and routability of the connection between the IOB array and the core logic. Radiation-hardened designs, including A-type and H-type body-tied transistors and special D-type registers, improve the anti-radiation performance. The ESD protection network, which provides a high-impulse discharge path on a pad, prevents the breakdown of the core logic caused by the immense current. These design strategies facilitate the design of FPGAs with different capacities or architectures to form a series of FPGAs. The functionality and performance of the IOB array are verified by a functional test. The radiation test indicates that the proposed VS1000 chip with an IOB array has a total dose tolerance of 100 krad(Si), a dose survivability rate of 1.5 × 10^11 rad(Si)/s, and a neutron fluence immunity of 1 × 10^14 n/cm^2.
A digital-receiver for the Murchison Widefield Array
NASA Astrophysics Data System (ADS)
Prabu, Thiagaraj; Srivani, K. S.; Roshi, D. Anish; Kamini, P. A.; Madhavi, S.; Emrich, David; Crosse, Brian; Williams, Andrew J.; Waterson, Mark; Deshpande, Avinash A.; Shankar, N. Udaya; Subrahmanyan, Ravi; Briggs, Frank H.; Goeke, Robert F.; Tingay, Steven J.; Johnston-Hollitt, Melanie; R, Gopalakrishna M.; Morgan, Edward H.; Pathikulangara, Joseph; Bunton, John D.; Hampson, Grant; Williams, Christopher; Ord, Stephen M.; Wayth, Randall B.; Kumar, Deepak; Morales, Miguel F.; deSouza, Ludi; Kratzenberg, Eric; Pallot, D.; McWhirter, Russell; Hazelton, Bryna J.; Arcus, Wayne; Barnes, David G.; Bernardi, Gianni; Booler, T.; Bowman, Judd D.; Cappallo, Roger J.; Corey, Brian E.; Greenhill, Lincoln J.; Herne, David; Hewitt, Jacqueline N.; Kaplan, David L.; Kasper, Justin C.; Kincaid, Barton B.; Koenig, Ronald; Lonsdale, Colin J.; Lynch, Mervyn J.; Mitchell, Daniel A.; Oberoi, Divya; Remillard, Ronald A.; Rogers, Alan E.; Salah, Joseph E.; Sault, Robert J.; Stevens, Jamie B.; Tremblay, S.; Webster, Rachel L.; Whitney, Alan R.; Wyithe, Stuart B.
2015-03-01
An FPGA-based digital-receiver has been developed for a low-frequency imaging radio interferometer, the Murchison Widefield Array (MWA). The MWA, located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia, consists of 128 dual-polarized aperture-array elements (tiles) operating between 80 and 300 MHz, with a total processed bandwidth of 30.72 MHz for each polarization. Radio-frequency signals from the tiles are amplified and band limited using analog signal conditioning units; sampled and channelized by digital-receivers. The signals from eight tiles are processed by a single digital-receiver, thus requiring 16 digital-receivers for the MWA. The main function of the digital-receivers is to digitize the broad-band signals from each tile, channelize them to form the sky-band, and transport it through optical fibers to a centrally located correlator for further processing. The digital-receiver firmware also implements functions to measure the signal power, perform power equalization across the band, detect interference-like events, and invoke diagnostic modes. The digital-receiver is controlled by high-level programs running on a single-board-computer. This paper presents the digital-receiver design, implementation, current status, and plans for future enhancements.
Fast semivariogram computation using FPGA architectures
NASA Astrophysics Data System (ADS)
Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang
2015-02-01
The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real-time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semivariance, γ(h), is defined as half of the expected squared difference of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n^2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments.
Computational speedup is measured with respect to a Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices.
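The definition of the semivariance given in the abstract can be illustrated in software. The Python sketch below is a simplified assumption rather than the paper's FPGA design: it pairs pixels only along rows and columns at an integer lag h, whereas a full isotropic computation would consider every pixel pair at distance h. The function names are hypothetical.

```python
def semivariance(image, h):
    """Half the mean squared difference over pixel pairs h apart along rows or columns."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + h < cols:                      # horizontal pair at lag h
                total += (image[r][c] - image[r][c + h]) ** 2
                count += 1
            if r + h < rows:                      # vertical pair at lag h
                total += (image[r][c] - image[r + h][c]) ** 2
                count += 1
    if count == 0:
        raise ValueError("lag exceeds image size")
    return total / (2 * count)


def semivariogram(image, max_lag):
    """Semivariance for every integer lag from 1 to max_lag."""
    return {h: semivariance(image, h) for h in range(1, max_lag + 1)}


# Toy 4x4 image whose intensity grows by one per column.
ramp = [[c for c in range(4)] for _ in range(4)]
print(semivariogram(ramp, 2))
```

Each pair contributes independently to the sum, which is the property that lets replicated hardware units process pixel pairs concurrently.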
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which leads to more highly integrated systems. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, needs to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems such as task distribution, node communication, and system verification are discussed.
A single FPGA-based portable ultrasound imaging system for point-of-care applications.
Kim, Gi-Duck; Yoon, Changhan; Kye, Sang-Bum; Lee, Youngbae; Kang, Jeeun; Yoo, Yangmo; Song, Tai-kyong
2012-07-01
We present a cost-effective portable ultrasound system based on a single field-programmable gate array (FPGA) for point-of-care applications. In the portable ultrasound system developed, all the ultrasound signal and image processing modules, including an effective 32-channel receive beamformer with pseudo-dynamic focusing, are embedded in an FPGA chip. For overall system control, a mobile processor running Linux at 667 MHz is used. The scan-converted ultrasound image data from the FPGA are directly transferred to the system controller via external direct memory access without a video processing unit. The portable ultrasound system developed can provide real-time B-mode imaging with a maximum frame rate of 30, and it has a battery life of approximately 1.5 h. These results indicate that the single FPGA-based portable ultrasound system developed is able to meet the processing requirements in medical ultrasound imaging while providing improved flexibility for adapting to emerging POC applications.
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.
Ly, Daniel Le; Chow, Paul
2010-11-01
Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
NASA Astrophysics Data System (ADS)
Poinsot, Audrey; Yang, Fan; Brost, Vincent
2011-02-01
Including multiple sources of information in personal identity recognition and verification gives the opportunity to greatly improve performance. We propose a contactless biometric system that combines two modalities: palmprint and face. Hardware implementations are proposed on the Texas Instruments Digital Signal Processor and Xilinx Field-Programmable Gate Array (FPGA) platforms. The algorithmic chain consists of preprocessing (which includes palm extraction from hand images), Gabor feature extraction, comparison by Hamming distance, and score fusion. Fusion possibilities are discussed and tested first using a bimodal database of 130 subjects that we designed (uB database), and then two common public biometric databases (AR for face and PolyU for palmprint). High performance has been obtained for recognition and verification purposes: a recognition rate of 97.49% with the AR-PolyU database and an equal error rate of 1.10% on the uB database using only two training samples per subject have been obtained. Hardware results demonstrate that preprocessing can easily be performed during the acquisition phase, and multimodal biometric recognition can be treated almost instantly (0.4 ms on FPGA). We show the feasibility of a robust and efficient multimodal hardware biometric system that offers several advantages, such as user-friendliness and flexibility.
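The last two stages of the chain described above, comparison by Hamming distance and score fusion, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the binary feature codes are toy values, and the weighted-sum fusion rule with its `palm_weight` parameter is a hypothetical choice, since the abstract does not say which fusion rule was retained.

```python
def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length binary codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def normalized_hamming(code_a, code_b):
    """Hamming distance scaled to the range [0, 1]."""
    return hamming_distance(code_a, code_b) / len(code_a)


def fuse_scores(palm_score, face_score, palm_weight=0.5):
    """Weighted-sum fusion of two distance scores (assumed rule, not the paper's)."""
    return palm_weight * palm_score + (1.0 - palm_weight) * face_score


palm = normalized_hamming([1, 0, 1, 1], [1, 1, 0, 1])
face = normalized_hamming([0, 0, 1, 1], [0, 0, 1, 0])
print(palm, face, fuse_scores(palm, face))
```

In hardware, the Hamming distance reduces to an XOR followed by a population count, which is one reason binary codes suit FPGA matching.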
An FPGA Platform for Real-Time Simulation of Spiking Neuronal Networks
Pani, Danilo; Meloni, Paolo; Tuveri, Giuseppe; Palumbo, Francesca; Massobrio, Paolo; Raffo, Luigi
2017-01-01
In recent years, the idea of dynamically interfacing biological neurons with artificial ones has become more and more urgent. This is essentially due to the design of innovative neuroprostheses where biological cell assemblies of the brain can be substituted by artificial ones. For closed-loop experiments with biological neuronal networks interfaced with in silico modeled networks, several technological challenges need to be faced, from the low-level interfacing between the living tissue and the computational model to the implementation of the latter in a suitable form for real-time processing. Field programmable gate arrays (FPGAs) can improve flexibility when simple neuronal models are required, obtaining good accuracy, real-time performance, and the possibility to create a hybrid system without any custom hardware, just programming the hardware to achieve the required functionality. In this paper, this possibility is explored presenting a modular and efficient FPGA design of an in silico spiking neural network exploiting the Izhikevich model. The proposed system, prototypically implemented on a Xilinx Virtex 6 device, is able to simulate a fully connected network counting up to 1,440 neurons, in real-time, at a sampling rate of 10 kHz, which is reasonable for small to medium scale extra-cellular closed-loop experiments. PMID:28293163
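For readers unfamiliar with the Izhikevich model, the Python sketch below integrates a single neuron with forward Euler at a 0.1 ms step, matching the 10 kHz sampling rate quoted above. It is a software illustration only, not the paper's FPGA design: the parameter values are the commonly used regular-spiking defaults, and the input currents are arbitrary assumptions.

```python
def simulate_izhikevich(current, steps, dt_ms=0.1, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Return spike times (ms) of one Izhikevich neuron driven by a constant current."""
    v = c            # membrane potential (mV)
    u = b * v        # recovery variable
    spikes = []
    for step in range(steps):
        # Izhikevich dynamics: v' = 0.04 v^2 + 5 v + 140 - u + I, u' = a (b v - u)
        v += dt_ms * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt_ms * a * (b * v - u)
        if v >= 30.0:                 # spike: record the time and reset the state
            spikes.append(step * dt_ms)
            v = c
            u += d
    return spikes


driven = simulate_izhikevich(current=10.0, steps=10_000)   # 1 s of simulated time
silent = simulate_izhikevich(current=0.0, steps=10_000)
print(len(driven), len(silent))
```

Each update needs only a handful of multiplications and additions per neuron per time step, which makes the model a natural fit for fixed-point pipelines.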
Espinal, Andres; Rostro-Gonzalez, Horacio; Carpio, Martin; Guerra-Hernandez, Erick I.; Ornelas-Rodriguez, Manuel; Sotelo-Figueroa, Marco
2016-01-01
This paper presents a method to design Spiking Central Pattern Generators (SCPGs) to achieve locomotion at different frequencies on legged robots. It is validated through embedding its designs into a Field-Programmable Gate Array (FPGA) and implemented on a real hexapod robot. The SCPGs are automatically designed by means of a Christiansen Grammar Evolution (CGE)-based methodology. The CGE performs a solution for the configuration (synaptic weights and connections) for each neuron in the SCPG. This is carried out through the indirect representation of candidate solutions that evolve to replicate a specific spike train according to a locomotion pattern (gait) by measuring the similarity between the spike trains and the SPIKE distance to lead the search to a correct configuration. By using this evolutionary approach, several SCPG design specifications can be explicitly added into the SPIKE distance-based fitness function, such as looking for Spiking Neural Networks (SNNs) with minimal connectivity or a Central Pattern Generator (CPG) able to generate different locomotion gaits only by changing the initial input stimuli. The SCPG designs have been successfully implemented on a Spartan 6 FPGA board and a real time validation on a 12 Degrees Of Freedom (DOFs) hexapod robot is presented. PMID:27516737
Real-time plasma control based on the ISTTOK tomography diagnostic
NASA Astrophysics Data System (ADS)
Carvalho, P. J.; Carvalho, B. B.; Neto, A.; Coelho, R.; Fernandes, H.; Sousa, J.; Varandas, C.; Chávez-Alarcón, E.; Herrera-Velázquez, J. J. E.
2008-10-01
The presently available processing power in generic processing units (GPUs) combined with state-of-the-art programmable logic devices benefits the implementation of complex, real-time driven, data processing algorithms for plasma diagnostics. A tomographic reconstruction diagnostic has been developed for the ISTTOK tokamak, based on three linear pinhole cameras each with ten lines of sight. The plasma emissivity in a poloidal cross section is computed locally on a submillisecond time scale, using a Fourier-Bessel algorithm, allowing the use of the output signals for active plasma position control. The data acquisition and reconstruction (DAR) system is based on ATCA technology and consists of one acquisition board with integrated field programmable gate array (FPGA) capabilities and a dual-core Pentium module running real-time application interface (RTAI) Linux. In this paper, the DAR real-time firmware/software implementation is presented, based on (i) front-end digital processing in the FPGA; (ii) a device driver specially developed for the board which enables streaming data acquisition to the host GPU; and (iii) a fast reconstruction algorithm running in Linux RTAI. This system behaves as a module of the central ISTTOK control and data acquisition system (FIRESIGNAL). Preliminary results of the above experimental setup are presented and a performance benchmarking against the magnetic coil diagnostic is shown.
Solving the corner-turning problem for large interferometers
NASA Astrophysics Data System (ADS)
Lutomirski, Andrew; Tegmark, Max; Sanchez, Nevada J.; Stein, Leo C.; Urry, W. Lynn; Zaldarriaga, Matias
2011-01-01
The so-called corner-turning problem is a major bottleneck for radio telescopes with large numbers of antennas. The problem is essentially that of rapidly transposing a matrix that is too large to store on one single device; in radio interferometry, it occurs because data from each antenna need to be routed to an array of processors each of which will handle a limited portion of the data (say, a frequency range) but requires input from each antenna. We present a low-cost solution allowing the correlator to transpose its data in real time, without contending for bandwidth, via a butterfly network requiring neither additional RAM memory nor expensive general-purpose switching hardware. We discuss possible implementations of this using FPGA, CMOS, analog logic and optical technology, and conclude that the corner-turner cost can be small even for upcoming massive radio arrays.
NASA Astrophysics Data System (ADS)
García, Aday; Santos, Lucana; López, Sebastián.; Callicó, Gustavo M.; Lopez, Jose F.; Sarmiento, Roberto
2014-05-01
Efficient onboard satellite hyperspectral image compression represents a necessity and a challenge for current and future space missions. Therefore, it is mandatory to provide hardware implementations for this type of algorithm in order to meet the constraints required for onboard compression. In this work, we implement the Lossy Compression for Exomars (LCE) algorithm on an FPGA by means of high-level synthesis (HLS) in order to shorten the design cycle. Specifically, we use the CatapultC HLS tool to obtain a VHDL description of the LCE algorithm from C-language specifications. Two different approaches are followed for HLS: on the one hand, introducing the whole C-language description in CatapultC and, on the other hand, splitting the C-language description into functional modules to be implemented independently with CatapultC, connecting and controlling them by an RTL description code without HLS. In both cases the goal is to obtain an FPGA implementation. We explain the several changes applied to the original C-language source code in order to optimize the results obtained by CatapultC for both approaches. Experimental results show low area occupancy of less than 15% for an SRAM-based Virtex-5 FPGA and a maximum frequency above 80 MHz. Additionally, the LCE compressor was implemented on an RTAX2000S antifuse-based FPGA, showing an area occupancy of 75% and a frequency around 53 MHz. All these serve to demonstrate that the LCE algorithm can be efficiently executed on an FPGA onboard a satellite. A comparison between both implementation approaches is also provided. The performance of the algorithm is finally compared with implementations on other technologies, specifically a graphics processing unit (GPU) and a single-threaded CPU.
An embedded face-classification system for infrared images on an FPGA
NASA Astrophysics Data System (ADS)
Soto, Javier E.; Figueroa, Miguel
2014-10-01
We present a face-classification architecture for long-wave infrared (IR) images implemented on a Field Programmable Gate Array (FPGA). The circuit is fast, compact and low power, can recognize faces in real time and can be embedded in a larger image-processing and computer vision system operating locally on an IR camera. The algorithm uses Local Binary Patterns (LBP) to perform feature extraction on each IR image. First, each pixel in the image is represented as an LBP pattern that encodes the similarity between the pixel and its neighbors. Uniform LBP codes are then used to reduce the number of patterns to 59 while preserving more than 90% of the information contained in the original LBP representation. Then, the image is divided into 64 non-overlapping regions, and each region is represented as a 59-bin histogram of patterns. Finally, the algorithm concatenates all 64 regions to create a 3,776-bin spatially enhanced histogram. We reduce the dimensionality of this histogram using Linear Discriminant Analysis (LDA), which improves clustering and enables us to store an entire database of 53 subjects on-chip. During classification, the circuit applies LBP and LDA to each incoming IR image in real time, and compares the resulting feature vector to each pattern stored in the local database using the Manhattan distance. We implemented the circuit on a Xilinx Artix-7 XC7A100T FPGA and tested it with the UCHThermalFace database, which consists of 28 images of 81 × 150 pixels for each of 53 subjects in indoor and outdoor conditions. The circuit achieves a 98.6% hit ratio, trained with 16 images and tested with 12 images of each subject in the database. Using a 100 MHz clock, the circuit classifies 8,230 images per second, and consumes only 309 mW.
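The bin counts quoted above follow from the definition of uniform LBP codes: an 8-bit code is uniform when its circular bit string has at most two 0/1 transitions, and all remaining codes share one extra bin. The Python sketch below checks that count and shows the Manhattan distance used for matching. The thresholding rule in `lbp_code` (neighbour greater than or equal to the centre) is the common textbook convention and is an assumption here, as are all function names.

```python
def transitions(pattern, bits=8):
    """Count 0/1 transitions around the circular bit pattern."""
    count = 0
    for i in range(bits):
        cur = (pattern >> i) & 1
        nxt = (pattern >> ((i + 1) % bits)) & 1
        count += cur != nxt
    return count


def uniform_lbp_table(bits=8):
    """Map each LBP code to a histogram bin; non-uniform codes share the last bin."""
    uniform = [p for p in range(1 << bits) if transitions(p, bits) <= 2]
    index = {p: i for i, p in enumerate(uniform)}
    shared_bin = len(uniform)
    return [index.get(p, shared_bin) for p in range(1 << bits)], shared_bin + 1


def lbp_code(window):
    """8-neighbour LBP code of the centre pixel of a 3x3 window."""
    center = window[1][1]
    neighbours = [window[0][0], window[0][1], window[0][2], window[1][2],
                  window[2][2], window[2][1], window[2][0], window[1][0]]
    return sum((n >= center) << i for i, n in enumerate(neighbours))


def manhattan(u, v):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


table, n_bins = uniform_lbp_table()
print(n_bins, 64 * n_bins)   # bins per region, bins in the concatenated histogram
```

The Manhattan distance needs only subtractions, absolute values and additions, so it avoids the multipliers a Euclidean distance would require.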
FPGA-Based X-Ray Detection and Measurement for an X-Ray Polarimeter
NASA Technical Reports Server (NTRS)
Gregory, Kyle; Hill, Joanne; Black, Kevin; Baumgartner, Wayne
2013-01-01
This technology enables detection and measurement of x-rays in an x-ray polarimeter using a field-programmable gate array (FPGA). The technology was developed for the Gravitational and Extreme Magnetism Small Explorer (GEMS) mission. It performs precision energy and timing measurements, as well as rejection of non-x-ray events. It enables the GEMS polarimeter to detect precisely when an event has taken place so that additional measurements can be made. The technology also enables this function to be performed in an FPGA using limited resources so that mass and power can be minimized while reliability for a space application is maximized and precise real-time operation is achieved. This design requires a low-noise, charge-sensitive preamplifier; a high-speed analog-to-digital converter (ADC); and an x-ray detector with a cathode terminal. It functions by computing a sum of differences for time-samples whose difference exceeds a programmable threshold. A state machine advances through states as a programmable number of consecutive samples exceeds or fails to exceed this threshold. The pulse height is recorded as the accumulated sum. The track length is also measured based on the time from the start to the end of accumulation. For track lengths longer than a certain length, the algorithm estimates the barycenter of charge deposit by comparing the accumulator value at the midpoint to the final accumulator value. The design also employs a number of techniques for rejecting background events. This innovation enables the function to be performed in space where it can operate autonomously with a rapid response time. This implementation combines advantages of computing system-based approaches with those of pure analog approaches. The result is an implementation that is highly reliable, performs in real-time, rejects background events, and consumes minimal power.
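The thresholded sum-of-differences scheme described above can be approximated in software. The Python sketch below is a loose interpretation, not the GEMS firmware: the parameters `n_start` and `n_stop` stand in for the "programmable number of consecutive samples", the two-state logic is an assumed simplification of the state machine, and the barycenter estimate and background rejection are left out.

```python
def detect_pulse(samples, threshold, n_start=2, n_stop=2):
    """Return (pulse_height, track_length_in_samples), or None if no event is found."""
    accumulating = False
    height = 0        # accumulated sum of differences above threshold
    hits = 0          # consecutive qualifying differences while idle
    misses = 0        # consecutive non-qualifying differences while accumulating
    start = end = None
    for i in range(1, len(samples)):
        diff = samples[i] - samples[i - 1]
        above = diff > threshold
        if not accumulating:
            if above:
                if hits == 0:
                    start = i
                hits += 1
                height += diff
                if hits >= n_start:       # enough consecutive hits: event confirmed
                    accumulating = True
                    end = i
            else:                          # isolated glitch: discard tentative sum
                hits, height, start = 0, 0, None
        else:
            if above:
                height += diff
                end = i
                misses = 0
            else:
                misses += 1
                if misses >= n_stop:      # enough consecutive misses: event over
                    break
    if not accumulating:
        return None
    return height, end - start + 1


print(detect_pulse([0, 0, 1, 5, 9, 12, 12, 12, 12], threshold=2))
```

Requiring several consecutive qualifying samples before accumulation starts is what lets the sketch ignore single-sample glitches.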
NASA Astrophysics Data System (ADS)
Passas, Georgios; Freear, Steven; Fawcett, Darren
2010-01-01
Space-time coding (STC) is an important milestone in modern wireless communications. In this technique, multiple copies of the same signal are transmitted through different antennas (space) and different symbol periods (time), to improve the robustness of a wireless system by increasing its diversity gain. STCs are channel coding algorithms that can be readily implemented on a field programmable gate array (FPGA) device. This work provides some figures for the amount of required FPGA hardware resources, the speed at which the algorithms can operate and the power consumption requirements of a space-time block code (STBC) encoder. Seven encoder very high-speed integrated circuit hardware description language (VHDL) designs have been coded, synthesised and tested. Each design realises a complex orthogonal space-time block code with a different transmission matrix. All VHDL designs are parameterisable in terms of sample precision. Precisions ranging from 4 bits to 32 bits have been synthesised. Alamouti's STBC encoder design [Alamouti, S.M. (1998), 'A Simple Transmit Diversity Technique for Wireless Communications', IEEE Journal on Selected Areas in Communications, 16:55-108.] proved to be the best trade-off, since it is on average 3.2 times smaller, 1.5 times faster and requires slightly less power than the next best trade-off in the comparison, which is a 3/4-rate full-diversity 3Tx-antenna STBC.
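As an illustration of what such an encoder computes, the Python sketch below implements the Alamouti two-antenna mapping: for each symbol pair (s1, s2), the antennas send (s1, s2) in the first symbol period and (-s2*, s1*) in the second. It is a behavioural model with hypothetical function names, not a rendering of the VHDL designs evaluated in the paper.

```python
def alamouti_encode(symbols):
    """Map a symbol stream onto two antennas using the Alamouti scheme.

    Returns (antenna1, antenna2), each a list with one entry per symbol period.
    """
    if len(symbols) % 2:
        raise ValueError("Alamouti encoding needs an even number of symbols")
    antenna1, antenna2 = [], []
    for i in range(0, len(symbols), 2):
        s1, s2 = symbols[i], symbols[i + 1]
        antenna1 += [s1, -s2.conjugate()]   # periods 1 and 2 on antenna 1
        antenna2 += [s2, s1.conjugate()]    # periods 1 and 2 on antenna 2
    return antenna1, antenna2


def inner_product(x, y):
    """Complex inner product of two equal-length sequences."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


tx1, tx2 = alamouti_encode([1 + 1j, 1 - 1j])
print(tx1, tx2, inner_product(tx1, tx2))
```

The zero inner product between the two antenna streams is the orthogonality property that allows simple linear decoding at the receiver. The encoder itself needs only negation and conjugation, which is consistent with its small hardware footprint.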
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2013-01-01
Typically, commercial sensor nodes are equipped with MCUs clocked at a low frequency (i.e., within the 4–12 MHz range). Consequently, executing cryptographic algorithms in those MCUs generally requires a huge amount of time. In this respect, the required energy consumption can be higher than using a separate accelerator based on a Field-programmable Gate Array (FPGA) that is switched on when needed. In this manuscript, we present the design of a cryptographic accelerator suitable for an FPGA-based sensor node and compliant with the IEEE802.15.4 standard. All the embedded resources of the target platform (Xilinx Artix-7) have been maximized in order to provide a cost-effective solution. Moreover, we have added key negotiation capabilities to the IEEE 802.15.4 security suite based on Elliptic Curve Cryptography (ECC). Our results suggest that tailored accelerators based on FPGA can behave better in terms of energy than contemporary software solutions for motes, such as the TinyECC and NanoECC libraries. In this regard, a point multiplication (PM) can be performed between 8.58- and 15.4-times faster, 3.40- to 23.59-times faster (Elliptic Curve Diffie-Hellman, ECDH) and between 5.45- and 34.26-times faster (Elliptic Curve Integrated Encryption Scheme, ECIES). Moreover, the energy consumption was also improved with a factor of 8.96 (PM). PMID:23899936
Autonomous Lawnmower using FPGA implementation.
NASA Astrophysics Data System (ADS)
Ahmad, Nabihah; Lokman, Nabill bin; Helmy Abd Wahab, Mohd
2016-11-01
Nowadays, various types of robots have been invented for multiple purposes. Robots have special characteristics that surpass human ability, and they can operate in extreme environments that humans cannot endure. In this paper, an autonomous robot is built to imitate a human cutting grass. A Field Programmable Gate Array (FPGA) is used to control the movements and to process all data and information. Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) is used to describe the hardware using Quartus II software. The robot avoids obstacles using an ultrasonic sensor and uses two DC motors for its movement, which includes moving forward and backward and turning left and right. The path of the automatic lawn mower is based on a path planning technique. Four Global Positioning System (GPS) plots are set to create a boundary. This ensures that the lawn mower operates within the area given by the user. Every action of the lawn mower is controlled by the FPGA DE' Board Cyclone II with the help of the sensor. Furthermore, Sketch Up software was used to design the structure of the lawn mower. The autonomous lawn mower was able to operate efficiently and smoothly return to coordinated paths after passing the obstacle. It uses 25% of total pins available on the board and 31% of total Digital Signal Processing (DSP) blocks.
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2013-07-29
Typically, commercial sensor nodes are equipped with MCUs clocked at a low frequency (i.e., within the 4-12 MHz range). Consequently, executing cryptographic algorithms in those MCUs generally requires a huge amount of time. In this respect, the required energy consumption can be higher than using a separate accelerator based on a Field-programmable Gate Array (FPGA) that is switched on when needed. In this manuscript, we present the design of a cryptographic accelerator suitable for an FPGA-based sensor node and compliant with the IEEE802.15.4 standard. All the embedded resources of the target platform (Xilinx Artix-7) have been maximized in order to provide a cost-effective solution. Moreover, we have added key negotiation capabilities to the IEEE 802.15.4 security suite based on Elliptic Curve Cryptography (ECC). Our results suggest that tailored accelerators based on FPGA can behave better in terms of energy than contemporary software solutions for motes, such as the TinyECC and NanoECC libraries. In this regard, a point multiplication (PM) can be performed between 8.58- and 15.4-times faster, 3.40- to 23.59-times faster (Elliptic Curve Diffie-Hellman, ECDH) and between 5.45- and 34.26-times faster (Elliptic Curve Integrated Encryption Scheme, ECIES). Moreover, the energy consumption was also improved with a factor of 8.96 (PM).
Field Programmable Gate Array Control of Power Systems in Graduate Student Laboratories
2008-03-01
NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA THESIS Approved for public release; distribution is unlimited FIELD PROGRAMMABLE...REPORT TYPE AND DATES COVERED Master’s Thesis 4. TITLE AND SUBTITLE Field Programmable Gate Array Control of Power Systems in Graduate Student...Electronics curriculum track is the development of a design center that explores Field Programmable Gate Array (FPGA) control of power electronics
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe
2017-01-01
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l’information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N-th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work. PMID:28718788
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe; Thom, Christian
2017-07-18
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l'information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N-th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work.
Fast particles identification in programmable form at level-0 trigger by means of the 3D-Flow system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crosetto, Dario B.
1998-10-30
The 3D-Flow Processor system is a new, technology-independent concept in very fast, real-time system architectures. Based on either an FPGA or an ASIC implementation, it can address, in a fully programmable manner, applications where commercially available processors would fail because of throughput requirements. Possible applications include filtering algorithms (pattern recognition) from the input of multiple sensors, as well as moving any input validated by these filtering algorithms to a single output channel. Both operations can easily be implemented on a 3D-Flow system to achieve a real-time processing system with a very short lag time. This system can be built either with off-the-shelf FPGAs or, for higher data rates, with CMOS chips containing 4 to 16 processors each. The basic building block of the system, a 3D-Flow processor, has been successfully designed in VHDL code written in "Generic HDL" (mostly made of reusable blocks that are synthesizable in different technologies, or FPGAs), to produce a netlist for a four-processor ASIC featuring 0.35 micron CBA (Cell Based Array) technology at 3.3 Volts, 884 mW power dissipation at 60 MHz and 63.75 mm sq. die size. The same VHDL code has been targeted to three FPGA manufacturers (Altera EPF10K250A, ORCA-Lucent Technologies OR3T165 and Xilinx XCV1000). A complete set of software tools, the 3D-Flow System Manager, equally applicable to ASIC or FPGA implementations, has been produced to provide full system simulation, application development, real-time monitoring, and run-time fault recovery. Today's technology can accommodate 16 processors per chip in a medium size die, at a cost per processor of less than $5 based on the current silicon die/size technology cost.
FPGA-based Klystron linearization implementations in scope of ILC
Omet, M.; Michizono, S.; Matsumoto, T.; ...
2015-01-23
We report the development and implementation of four FPGA-based predistortion-type klystron linearization algorithms. Klystron linearization is essential for the realization of ILC, since it is required to operate the klystrons 7% in power below their saturation. The work presented was performed in international collaborations at the Fermi National Accelerator Laboratory (FNAL), USA and the Deutsches Elektronen Synchrotron (DESY), Germany. With the newly developed algorithms, the generation of correction factors on the FPGA was improved compared to past algorithms, avoiding quantization and decreasing memory requirements. At FNAL, three algorithms were tested at the Advanced Superconducting Test Accelerator (ASTA), demonstrating a successful implementation for one algorithm and a proof of principle for two algorithms. Furthermore, the functionality of the algorithm implemented at DESY was demonstrated successfully in a simulation.
Current Status of The Low Frequency All Sky Monitor
NASA Astrophysics Data System (ADS)
Dartez, Louis; Creighton, Teviet; Jenet, Fredrick; Dolch, Timothy; Boehler, Keith; Bres, Luis; Cole, Brent; Luo, Jing; Miller, Rossina; Murray, James; Reyes, Alex; Rivera, Jesse
2018-01-01
The Low Frequency All Sky Monitor (LoFASM) is a distributed array of cross-dipole antennas that are sensitive to radio frequencies from 10 to 88 MHz. LoFASM consists of antennas and front end electronics that were originally developed for the Long Wavelength Array by the U.S. Naval Research Lab, the University of New Mexico, Virginia Tech, and the Jet Propulsion Laboratory. LoFASM, funded by the U.S. Department of Defense, will initially consist of 4 stations, each consisting of 12 dual-polarization dipole antenna stands. The primary science goals of LoFASM will be the detection and study of low-frequency radio transients, a high priority science goal as deemed by the National Research Council’s ASTRO2010 decadal survey. The data acquisition system for the LoFASM antenna array uses Field Programmable Gate Array (FPGA) technology to implement a real time full Stokes spectrometer and data recorder. This poster presents an overview of the LoFASM Radio Telescope as well as the status of data analysis of initial commissioning observations.
NASA Astrophysics Data System (ADS)
Isaak, S.; Bull, S.; Pitter, M. C.; Harrison, Ian.
2011-05-01
This paper reports on the development of a SPAD device, fabricated in a UMC 0.18 μm CMOS process, and its subsequent use in an actively quenched single photon counting imaging system. A low-doped p- guard ring (t-well layer) encircles the active area to prevent premature reverse breakdown. The array is a 16×1 parallel output SPAD array, which comprises an actively quenched SPAD circuit in each pixel, with the current value being set by an external resistor RRef = 300 kΩ. In the SPAD I-V response, ID was found to increase slowly until VBD was reached at excess bias voltage Ve = 11.03 V, and then to increase rapidly due to avalanche multiplication. Digital circuitry to control the SPAD array and perform the necessary data processing was designed in VHDL and implemented on an FPGA chip. At room temperature, the dark count was found to be approximately 13 kHz for most of the 16 SPAD pixels and the dead time was estimated to be 40 ns.
Atoche, Alejandro Castillo; Castillo, Javier Vázquez
2012-01-01
A high-speed dual super-systolic core for reconstructive signal processing (SP) operations consists of a double parallel systolic array (SA) machine in which each processing element of the array is also conceptualized as another SA in a bit-level fashion. In this study, we addressed the design of a high-speed dual super-systolic array (SSA) core for the enhancement/reconstruction of remote sensing (RS) imaging of radar/synthetic aperture radar (SAR) sensor systems. The selected reconstructive SP algorithms are efficiently transformed in their parallel representation and then, they are mapped into an efficient high performance embedded computing (HPEC) architecture in reconfigurable Xilinx field programmable gate array (FPGA) platforms. As an implementation test case, the proposed approach was aggregated in a HW/SW co-design scheme in order to solve the nonlinear ill-posed inverse problem of nonparametric estimation of the power spatial spectrum pattern (SSP) from a remotely sensed scene. We show how such dual SSA core, drastically reduces the computational load of complex RS regularization techniques achieving the required real-time operational mode. PMID:22736964
NASA Astrophysics Data System (ADS)
Villar, Xabier; Piso, Daniel; Bruguera, Javier D.
2014-02-01
This paper presents an FPGA implementation of a previously published algorithm for the reconstruction of cosmic rays' trajectories and the determination of the time of arrival and velocity of the particles. The accuracy and precision issues of the algorithm have been analyzed to propose a suitable implementation. Thus, a 32-bit fixed-point format has been used for the representation of the data values. Moreover, the dependencies among the different operations have been taken into account to obtain a highly parallel and efficient hardware implementation. The final hardware architecture requires 18 cycles to process every particle, and has been exhaustively simulated to validate all the design decisions. The architecture has been mapped over different commercial FPGAs, with a frequency of operation ranging from 300 MHz to 1.3 GHz, depending on the FPGA being used. Consequently, the number of particle trajectories processed per second is between 16 million and 72 million. The high number of particle trajectories calculated per second shows that the proposed FPGA implementation might be used also in high rate environments such as those found in particle and nuclear physics experiments.
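The throughput range quoted above follows directly from the cycle count and the clock range. A quick check, assuming one trajectory is completed every 18 clock cycles with no extra stalls (an assumption; the abstract does not describe the pipeline in more detail):

```python
# Sanity check of the throughput figures quoted in the abstract.
# Assumption: one particle trajectory is completed every 18 clock cycles.
CYCLES_PER_PARTICLE = 18


def particles_per_second(clock_hz: float) -> float:
    """Return trajectories processed per second at a given clock frequency."""
    return clock_hz / CYCLES_PER_PARTICLE


low = particles_per_second(300e6)   # slowest clock quoted (300 MHz)
high = particles_per_second(1.3e9)  # fastest clock quoted (1.3 GHz)
print(f"{low / 1e6:.1f} M to {high / 1e6:.1f} M trajectories per second")
```

Both ends land within the "16 million to 72 million" range stated in the abstract once the fractional part is dropped.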
HIFU Monitoring and Control with Dual-Mode Ultrasound Arrays
NASA Astrophysics Data System (ADS)
Casper, Andrew Jacob
The biological effects of high-intensity focused ultrasound (HIFU) have been known and studied for decades. HIFU has been shown capable of treating a wide variety of diseases and disorders. However, despite its demonstrated potential, HIFU has been slow to gain clinical acceptance. This is due, in part, to the difficulty associated with robustly monitoring and controlling the delivery of the HIFU energy. The non-invasive nature of the surgery makes the assessment of treatment progression difficult, leading to long treatment times and a significant risk of under treatment. This thesis research develops new techniques and systems for robustly monitoring HIFU therapies for the safe and efficacious delivery of the intended treatment. Systems and algorithms were developed for the two most common modes of HIFU delivery systems: single-element and phased array applicators. Delivering HIFU with a single element transducer is a widely used technique in HIFU therapies. The simplicity of a single element offers many benefits in terms of cost and overall system complexity. Typical monitoring schemes rely on an external device (e.g. diagnostic ultrasound or MRI) to assess the progression of therapy. The research presented in this thesis explores using the same element to both deliver and monitor the HIFU therapy. The use of a dual-mode ultrasound transducer (DMUT) required the development of an FPGA based single-channel arbitrary waveform generator and high-speed data acquisition unit. Data collected from initial uncontrolled ablations led to the development of monitoring and control algorithms which were implemented directly on the FPGA. Close integration between the data acquisition and arbitrary waveform units allowed for fast, low latency control over the ablation process. Results are presented that demonstrate control of HIFU therapies over a broad range of intensities and in multiple in vitro tissues. 
The second area of investigation expands the DMUT research to an ultrasound phased-array. The phased-array allows for electronic steering of the HIFU focus and imaging of the acoustic medium. Investigating the dual-mode ultrasound array (DMUA) required the design and construction of a novel ultrasound-guided focused ultrasound (USgFUS) platform. The platform consisted of custom hardware designed for the unique requirements of operating a phased-array in both therapeutic and imaging modes. The platform also required the development of FPGA based signal processing and GPU based beamforming algorithms for online monitoring of the therapy process. The results presented in this thesis represent the first demonstration of a real-time USgFUS platform based around a DMUA. Experimental imaging and therapy results from series of animal experiments, including a 12 animal GLP study, are presented. In addition, in vitro control results, which build upon the DMUT work, are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobrek, Miljko; Albright, Austin P
This paper presents an FPGA implementation of the Reed-Solomon decoder for use in IEEE 802.16 WiMAX systems. The decoder is based on the RS(255,239) code, and is additionally shortened and punctured according to the WiMAX specifications. A Simulink model based on the Sysgen library of Xilinx blocks was used for simulation and hardware implementation. Finally, simulation results and hardware implementation performance are presented.
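For orientation, the basic parameters of the RS(255,239) mother code can be derived from n and k alone. The sketch below uses the standard Reed-Solomon relations; the `shortened` helper and the example length of 100 message symbols are hypothetical illustrations, not WiMAX profiles taken from the paper, and puncturing is not modeled.

```python
# Parameters of the RS(255,239) mother code named in the abstract.
N = 255  # codeword length in symbols
K = 239  # message length in symbols

parity_symbols = N - K                    # redundancy added by the encoder
correctable_errors = parity_symbols // 2  # t = (n - k) / 2 symbol errors
symbol_bits = (N + 1).bit_length() - 1    # 255 = 2**8 - 1, so 8-bit symbols


def shortened(k_short: int) -> tuple:
    """Return (n, k) after shortening to k_short message symbols.

    Shortening treats the dropped message symbols as zeros, so the
    number of parity symbols is unchanged.
    """
    if not 0 < k_short <= K:
        raise ValueError("k_short must be between 1 and K")
    return (k_short + parity_symbols, k_short)


print(parity_symbols, correctable_errors, symbol_bits)
print(shortened(100))  # hypothetical shortened length, for illustration only
```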
Development of an FPGA-based multipoint laser pyroshock measurement system for explosive bolts
NASA Astrophysics Data System (ADS)
Abbas, Syed Haider; Jang, Jae-Kyeong; Lee, Jung-Ryul; Kim, Zaeill
2016-07-01
Pyroshock can cause failure to the objective of an aerospace structure by damaging its sensitive electronic equipment, which is responsible for performing decisive operations. A pyroshock is the high intensity shock wave that is generated when a pyrotechnic device is explosively triggered to separate, release, or activate structural subsystems of an aerospace architecture. Pyroshock measurement plays an important role in experimental simulations to understand the characteristics of pyroshock on the host structure. This paper presents a technology to measure a pyroshock wave at multiple points using laser Doppler vibrometers (LDVs). These LDVs detect the pyroshock wave generated due to an explosive-based pyrotechnical event. Field programmable gate array (FPGA) based data acquisition is used in the study to acquire pyroshock signals simultaneously from multiple channels. This paper describes the complete system design for multipoint pyroshock measurement. The firmware architecture for the implementation of multichannel data acquisition on an FPGA-based development board is also discussed. An experiment using explosive bolts was configured to test the reliability of the system. Pyroshock was generated using explosive excitation on a 22-mm-thick steel plate. Three LDVs were deployed to capture the pyroshock wave at different points. The pyroshocks captured were displayed as acceleration plots. The results showed that our system effectively captured the pyroshock wave with a peak-to-peak magnitude of 303 741 g. The contribution of this paper is a specialized architecture of firmware design programmed in FPGA for data acquisition of large amount of multichannel pyroshock data. The advantages of the developed system are the near-field, multipoint, non-contact, and remote measurement of a pyroshock wave, which is dangerous and expensive to produce in aerospace pyrotechnic tests.
A Timing Synchronizer System for Beam Test Setups Requiring Galvanic Isolation
NASA Astrophysics Data System (ADS)
Meder, Lukas Dominik; Emschermann, David; Frühauf, Jochen; Müller, Walter F. J.; Becker, Jürgen
2017-07-01
In beam test setups, detector elements are analyzed together with a readout composed of frontend electronics (FEE) and usually a layer of field-programmable gate arrays (FPGAs). In this scenario the FEE is often directly connected to both the detector and the FPGA layer, which in many cases requires sharing the ground potentials of these layers. This setup can become problematic if parts of the detector need to be operated at different high-voltage potentials, since all of the FPGA boards need to receive a common clock and timing reference to synchronize the readout. Thus, in the context of the compressed baryonic matter experiment, a versatile timing synchronizer (TS) system was designed that provides galvanically isolated timing distribution links over twisted-pair cables. As an electrical interface, the so-called timing data processing board FPGA mezzanine card was created for mounting onto FPGA-based advanced mezzanine cards for mTCA.4 crates. The FPGA logic of the TS system connects to this card and can be monitored and controlled through IPBus slow-control links. Evaluations show that the system is capable of stably synchronizing the FPGA boards of a beam test setup integrated into a hierarchical TS network.
Instrumentation and control of harmonic oscillators via a single-board microprocessor-FPGA device.
Picone, Rico A R; Davis, Solomon; Devine, Cameron; Garbini, Joseph L; Sidles, John A
2017-04-01
We report the development of an instrumentation and control system instantiated on a microprocessor-field programmable gate array (FPGA) device for a harmonic oscillator comprising a portion of a magnetic resonance force microscope. The specific advantages of the system are that it minimizes computation, increases maintainability, and reduces the technical barrier required to enter the experimental field of magnetic resonance force microscopy. Heterodyne digital control and measurement yields computational advantages. A single microprocessor-FPGA device improves system maintainability by using a single programming language. The system presented requires significantly less technical expertise to instantiate than the instrumentation of previous systems, yet integrity of performance is retained and demonstrated with experimental data.
Instrumentation and control of harmonic oscillators via a single-board microprocessor-FPGA device
NASA Astrophysics Data System (ADS)
Picone, Rico A. R.; Davis, Solomon; Devine, Cameron; Garbini, Joseph L.; Sidles, John A.
2017-04-01
We report the development of an instrumentation and control system instantiated on a microprocessor-field programmable gate array (FPGA) device for a harmonic oscillator comprising a portion of a magnetic resonance force microscope. The specific advantages of the system are that it minimizes computation, increases maintainability, and reduces the technical barrier required to enter the experimental field of magnetic resonance force microscopy. Heterodyne digital control and measurement yields computational advantages. A single microprocessor-FPGA device improves system maintainability by using a single programming language. The system presented requires significantly less technical expertise to instantiate than the instrumentation of previous systems, yet integrity of performance is retained and demonstrated with experimental data.
Selected issues of the universal communication environment implementation for CII standard
NASA Astrophysics Data System (ADS)
Zagoździńska, Agnieszka; Poźniak, Krzysztof T.; Drabik, Paweł K.
2011-10-01
The contemporary FPGA market offers a wide assortment of structures, integrated development environments, and boards from different producers. This variety allows resources to be fitted to the requirements of the individual designer. Projects need to be standardized to be useful in research laboratories equipped with tools from different producers. The proposed solution is CII standardization of VHDL components. This paper contains the specification of the universal communication environment for the CII standard. The link can be used in different FPGA structures. Implementation of the link enables object-oriented VHDL programming with the use of CII standardization. The whole environment comprises an FPGA environment and PC software. The paper describes selected issues of the FPGA environment, including some specific solutions that enable the environment to be used in structures from different producers. The flexibility of data transmissions of different sizes with the use of CII is presented. The specified tool gives the opportunity to make full use of the variety of FPGA structures and to design faster and more effectively.
Tuple spaces in hardware for accelerated implicit routing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Zachary Kent; Tripp, Justin
2010-12-01
Organizing and optimizing data objects on networks with support for data migration and failing nodes is a complicated problem to handle as systems grow. The goal of this work is to demonstrate that high levels of speedup can be achieved by moving responsibility for finding, fetching, and staging data into an FPGA-based network card. We present a system for implicit routing of data via FPGA-based network cards. In this system, data structures are requested by name, and the network of FPGAs finds the data within the network and relays the structure to the requester. This is achieved through successive examination of hardware hash tables implemented in the FPGA. By avoiding software stacks between nodes, the data is quickly fetched entirely through FPGA-FPGA interaction. The performance of this system is orders of magnitude faster than software implementations due to the improved speed of the hash tables and lowered latency between the network nodes.
Radiation effects in reconfigurable FPGAs
NASA Astrophysics Data System (ADS)
Quinn, Heather
2017-04-01
Field-programmable gate arrays (FPGAs) are co-processing hardware used in image and signal processing. FPGAs are programmed with custom implementations of an algorithm. These algorithms are highly parallel hardware designs that are faster than software implementations. This flexibility and speed have made FPGAs attractive for many space programs that need in situ, high-speed signal processing for data categorization and data compression. Most commercial FPGAs are affected by the space radiation environment, though. Problems with TID have restricted the use of flash-based FPGAs. Static random access memory based FPGAs must be mitigated to suppress errors from single-event upsets. This paper provides a review of radiation effects issues in reconfigurable FPGAs and discusses methods for mitigating these problems. With careful design it is possible to use these components effectively and resiliently.
A CDMA system implementation with dimming control for visible light communication
NASA Astrophysics Data System (ADS)
Chen, Danyang; Wang, Jianping; Jin, Jianli; Lu, Huimin; Feng, Lifang
2018-04-01
Visible light communication (VLC), which uses solid-state lighting to transmit information, has become a complementary technology to wireless radio communication. As a realistic multiple access scheme for VLC systems, code division multiple access (CDMA) has attracted increasing attention in recent years. In this paper, we address and implement an improved CDMA scheme for VLC systems. The simulation results reveal that the improved CDMA scheme not only supports multi-user transmission but also maintains the dimming value at about 50% and enhances the system efficiency. It can also realize flexible dimming control by adjusting some parameters of the system structure, which rarely affects the system BER performance. A real-time experimental VLC system with the improved CDMA scheme is implemented on a field programmable gate array (FPGA) and reaches a good BER performance.
Design and implementation of a high performance network security processor
NASA Astrophysics Data System (ADS)
Wang, Haixin; Bai, Guoqiang; Chen, Hongyi
2010-03-01
The last few years have seen significant progress in the field of application-specific processors. One example is network security processors (NSPs) that perform various cryptographic operations specified by network security protocols and help to offload the computation-intensive burdens from network processors (NPs). This article presents a high performance NSP system architecture implementation intended for both internet protocol security (IPSec) and secure socket layer (SSL) protocol acceleration, which are widely employed in virtual private network (VPN) and e-commerce applications. The efficient dual one-way pipelined data transfer skeleton and optimised integration scheme of the heterogeneous parallel crypto engine arrays lead to a Gbps rate NSP, which is programmable with domain specific descriptor-based instructions. The descriptor-based control flow fragments large data packets and distributes them to the crypto engine arrays, which fully utilises the parallel computation resources and improves the overall system data throughput. A prototyping platform for this NSP design is implemented with a Xilinx XC3S5000 based FPGA chip set. Results show that the design gives a peak throughput for the IPSec ESP tunnel mode of 2.85 Gbps with over 2100 full SSL handshakes per second at a clock rate of 95 MHz.
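The peak throughput and clock rate quoted above imply how many bits the design must retire per clock cycle in aggregate. This is a derived figure, not one reported in the abstract, and it assumes the 2.85 Gbps peak is sustained at the 95 MHz clock:

```python
# Aggregate bits processed per clock cycle, derived from the quoted figures.
# Assumption: the 2.85 Gbps peak is measured at the 95 MHz clock rate.
peak_throughput_bps = 2.85e9  # IPSec ESP tunnel mode peak, from the abstract
clock_hz = 95e6               # clock rate, from the abstract

bits_per_cycle = peak_throughput_bps / clock_hz
print(f"{bits_per_cycle:.1f} bits per clock cycle across all crypto engines")
```

A figure of this size at a sub-100 MHz clock is only reachable with several engines working in parallel, which is consistent with the parallel crypto engine arrays the abstract describes.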
Super-Resolution in Plenoptic Cameras Using FPGAs
Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime
2014-01-01
Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable gate array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes. PMID:24841246
FPGA-accelerated algorithm for the regular expression matching system
NASA Astrophysics Data System (ADS)
Russek, P.; Wiatr, K.
2015-01-01
This article describes an algorithm to support a regular expressions matching system. The goal was to achieve an attractive performance system with low energy consumption. The basic idea of the algorithm comes from a concept of the Bloom filter. It starts from the extraction of static sub-strings for strings of regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and the central processing unit (CPU). The pipelined custom processor architecture is proposed and a software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in field programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example of target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
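The Bloom-filter idea mentioned above can be sketched in software as a cheap pre-filter: fixed-width static sub-strings are stored in the filter, every window of the input is tested against it, and only the surviving offsets are handed to a full regular expression check. The class, the hash construction, and the `prefilter` helper below are illustrative assumptions, not the architecture from the article.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative sizes)."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits                  # must be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(item))


def prefilter(data, bloom, width):
    """Yield offsets whose width-byte window may match a stored sub-string.

    False positives are possible; false negatives are not.
    """
    for offset in range(len(data) - width + 1):
        if data[offset:offset + width] in bloom:
            yield offset


bloom = BloomFilter()
bloom.add(b"EVIL")  # hypothetical static sub-string taken from a pattern
candidates = list(prefilter(b"xxEVILyy", bloom, 4))
print(candidates)
```

In a split like the one the article describes, the window test is the part that suits custom hardware, while the rarer full verification of candidates can stay on the CPU.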
Design of an MR image processing module on an FPGA chip.
Li, Limin; Wyrwicz, Alice M
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128×128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. Copyright © 2015 Elsevier Inc. All rights reserved.
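One way to picture a 2D FFT without an explicit matrix transposition is to keep the image in a flat buffer and let the address pattern change between passes: stride 1 for rows, stride equal to the row length for columns. The sketch below illustrates that addressing idea in software only; it is not the authors' address generation unit, and the function names are hypothetical. Sizes must be powers of two.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_strided(buf, start, stride, n):
    """FFT the n samples at addresses start, start+stride, ... in place."""
    line = [buf[start + i * stride] for i in range(n)]
    for i, value in enumerate(fft(line)):
        buf[start + i * stride] = value


def fft2_flat(buf, rows, cols):
    """2D FFT of a row-major flat buffer without transposing it."""
    for r in range(rows):
        fft_strided(buf, r * cols, 1, cols)   # row pass: unit stride
    for c in range(cols):
        fft_strided(buf, c, cols, rows)       # column pass: stride = cols
    return buf


image = [0j] * 16
image[0] = 1 + 0j            # unit impulse at the origin of a 4x4 image
spectrum = fft2_flat(image, 4, 4)
print(spectrum[:4])
```

The impulse input is a convenient check because its 2D spectrum is flat: every output sample equals 1.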
Super-resolution in plenoptic cameras using FPGAs.
Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime
2014-05-16
Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable gate array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes.
An FPGA computing demo core for space charge simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Jinyuan; Huang, Yifei; /Fermilab
2009-01-01
In accelerator physics, space charge simulation requires a large amount of computing power. In a particle system, each calculation requires time- and resource-consuming operations such as multiplications, divisions, and square roots. Because of the flexibility of field programmable gate arrays (FPGAs), we implemented this task with efficient use of the available computing resources and completely eliminated non-calculating operations that are indispensable in regular micro-processors (e.g. instruction fetch, instruction decoding, etc.). We designed and tested a 16-bit demo core for computing Coulomb's force in an Altera Cyclone II FPGA device. To save resources, the inverse square-root cube operation in our design is computed using a memory look-up table addressed with the nine to ten most significant non-zero bits. At a 200 MHz internal clock, our demo core reaches a throughput of 200 M pairs/s/core, faster than a typical 2 GHz micro-processor by about a factor of 10. Temperature and power consumption of FPGAs were also lower than those of micro-processors. Fast and convenient, FPGAs can serve as alternatives to time-consuming micro-processors for space charge simulation.
García, Gabriel J.; Jara, Carlos A.; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M.; Torres, Fernando
2014-01-01
The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field. PMID:24691100
NASA Astrophysics Data System (ADS)
Finger, R.; Curotto, F.; Fuentes, R.; Duan, R.; Bronfman, L.; Li, D.
2018-02-01
Radio Frequency Interference (RFI) is a growing concern in the radio astronomy community. Single-dish telescopes are particularly susceptible to RFI. Several methods have been developed to cope with RF-polluted environments, based on flagging, excision, and real-time blanking, among others. All these methods produce some degree of data loss or require assumptions to be made on the astronomical signal. We report the development of a real-time, digital adaptive filter implemented on a Field Programmable Gate Array (FPGA) capable of processing 4096 spectral channels in 1 GHz of instantaneous bandwidth. The filter is able to cancel a broad range of interference signals and quickly adapt to changes in the RFI source, minimizing the data loss without any assumption on the astronomical or interfering signal properties. The speed of convergence (for a decrease to 1%) was measured to be 208.1 μs for a broadband noise-like RFI signal and 125.5 μs for a multiple-carrier RFI signal recorded at the FAST radio telescope.
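The abstract above does not name the adaptation rule used in the filter. As a minimal sketch of the general idea of adaptive interference cancellation, the following assumes a least-mean-squares (LMS) update and a separate reference input that observes only the interference; the function name `lms_cancel`, the tap count, and the step size `mu` are illustrative choices, not details taken from the paper.

```python
import math

def lms_cancel(primary, reference, num_taps=4, mu=0.05):
    """Remove from `primary` whatever can be predicted from `reference`.

    Assumption: a plain LMS update; the paper's actual algorithm is not stated.
    Returns the cleaned samples (the filter error) and the final tap weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent reference samples
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]    # shift in the newest reference sample
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate            # what remains after cancellation
        weights = [w + mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights

# Toy data: the primary channel contains only a scaled copy of the interference.
n = 2000
reference = [math.sin(0.3 * k) for k in range(n)]
primary = [0.8 * r for r in reference]
cleaned, weights = lms_cancel(primary, reference)
residual_power = sum(e * e for e in cleaned[-200:]) / 200
```

In this toy case the residual power in the last 200 samples falls far below the input interference power once the weights have converged. A real spectrometer would run one such canceller per channel in fixed-point arithmetic.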
Volumetric visualization algorithm development for an FPGA-based custom computing machine
NASA Astrophysics Data System (ADS)
Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim
1998-05-01
Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos
2011-01-01
General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
Monitoring system for testing the radiation hardness of a KINTEX-7 FPGA
NASA Astrophysics Data System (ADS)
Cojocariu, L. N.; Placinta, V. M.; Dumitru, L.
2016-03-01
A much more efficient Ring Imaging Cherenkov sub-detector system will be rebuilt in the second long shutdown of the Large Hadron Collider for the LHCb experiment. Radiation-hard electronic components together with Commercial Off-The-Shelf ones will be used in the new Cherenkov photon detection system architecture. An irradiation program was foreseen to determine the radiation tolerance of the new electronic devices, including a Field Programmable Gate Array from the Xilinx Kintex-7 family. An automated test bench for online monitoring of the XC7K70T Kintex-7 device operation in radiation conditions was designed and implemented by the LHCb Romanian group.
Moving target detection for frequency agility radar by sparse reconstruction
NASA Astrophysics Data System (ADS)
Quan, Yinghui; Li, YaChao; Wu, Yaojun; Ran, Lei; Xing, Mengdao; Liu, Mengqi
2016-09-01
Frequency agility radar, with randomly varied carrier frequency from pulse to pulse, exhibits superior performance against electromagnetic interference compared to conventional fixed carrier frequency pulse-Doppler radar. A novel moving target detection (MTD) method is proposed for estimating the target's velocity in frequency agility radar, based on pulses within a coherent processing interval and using sparse reconstruction. A hardware implementation of the orthogonal matching pursuit algorithm is executed on a Xilinx Virtex-7 Field Programmable Gate Array (FPGA) to perform sparse optimization. Finally, a series of experiments are performed to evaluate the performance of the proposed MTD method for frequency agility radar systems.
FPGA-based real time processing of the Plenoptic Wavefront Sensor
NASA Astrophysics Data System (ADS)
Rodríguez-Ramos, L. F.; Marín, Y.; Díaz, J. J.; Piqueras, J.; García-Jiménez, J.; Rodríguez-Ramos, J. M.
The plenoptic wavefront sensor combines measurements at pupil and image planes in order to obtain wavefront information simultaneously from different points of view, and is capable of sampling the volume above the telescope to extract the tomographic information of the atmospheric turbulence. The advantages of this sensor are presented elsewhere at this conference (José M. Rodríguez-Ramos et al). This paper will concentrate on the processing required for pupil plane phase recovery, and its computation in real time using FPGAs (Field Programmable Gate Arrays). This technology eases the implementation of massively parallel processing and allows tailoring the system to the requirements, maintaining flexibility, speed and cost figures.
NASA Astrophysics Data System (ADS)
Cruz Jiménez, Miriam Guadalupe; Meyer Baese, Uwe; Jovanovic Dolecek, Gordana
2017-12-01
New theoretical lower bounds for the number of operators needed in fixed-point constant multiplication blocks are presented. The multipliers are constructed with the shift-and-add approach, where every arithmetic operation is pipelined, and with the generalization that n-input pipelined additions/subtractions are allowed, along with pure pipelining registers. These lower bounds, tighter than the state-of-the-art theoretical limits, are particularly useful in early design stages for a quick assessment in the hardware utilization of low-cost constant multiplication blocks implemented in the newest families of field programmable gate array (FPGA) integrated circuits.
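To make the shift-and-add approach concrete, here is a small sketch that multiplies by a positive integer constant using only shifts, additions and subtractions. It assumes a canonical signed digit (non-adjacent form) recoding, which is one common construction and not necessarily the one analysed in the paper; the resulting operator count is a simple achievable figure, not the lower bound the abstract refers to. All function names are hypothetical.

```python
def csd_digits(constant):
    """Signed digits (-1, 0, +1) of a positive integer, least significant first,
    in non-adjacent form (no two neighbouring digits are both nonzero)."""
    digits = []
    c = constant
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)    # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits

def multiply_shift_add(x, constant):
    """Compute x * constant with shifts, adds and subtracts only."""
    result = 0
    for shift, d in enumerate(csd_digits(constant)):
        if d == 1:
            result += x << shift
        elif d == -1:
            result -= x << shift
    return result

def adder_count(constant):
    """Two-input adders/subtractors used by this straightforward construction."""
    nonzero = sum(1 for d in csd_digits(constant) if d != 0)
    return max(nonzero - 1, 0)
```

For example, the constant 7 is recoded as 8 - 1, so `multiply_shift_add(x, 7)` needs a single subtractor, whereas the plain binary form 4 + 2 + 1 would need two adders.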
A Realization of Theoretical Maximum Performance in IPSec on Gigabit Ethernet
NASA Astrophysics Data System (ADS)
Onuki, Atsushi; Takeuchi, Kiyofumi; Inada, Toru; Tokiniwa, Yasuhisa; Ushirozawa, Shinobu
This paper describes an IPSec (IP Security) VPN system and how it attains the theoretical maximum performance on Gigabit Ethernet. The conventional system is implemented in software and has several bottlenecks that must be overcome to reach the theoretical maximum performance on Gigabit Ethernet. We therefore propose an IPSec VPN system with an FPGA (Field Programmable Gate Array)-based hardware architecture, which transmits packets by pipelined flow processing and has six parallel encryption and authentication engines. We show that our system attains the theoretical maximum performance for short packets, which has been difficult to achieve until now.
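The reason short packets are the hard case can be seen from a small calculation of the Gigabit Ethernet line-rate limit. The sketch below uses the standard Ethernet framing overheads (8 bytes of preamble and start-of-frame delimiter, 12 bytes of inter-frame gap); it does not model the extra bytes that IPSec encapsulation adds, and the function names are illustrative.

```python
LINE_RATE_BPS = 1_000_000_000   # Gigabit Ethernet
PREAMBLE_AND_SFD = 8            # bytes sent before every frame
INTERFRAME_GAP = 12             # mandatory idle bytes between frames

def max_frames_per_second(frame_bytes):
    """Upper bound on frames per second for a given frame size (including FCS)."""
    wire_bytes = frame_bytes + PREAMBLE_AND_SFD + INTERFRAME_GAP
    return LINE_RATE_BPS / (wire_bytes * 8)

def time_budget_ns(frame_bytes):
    """Time available to process one frame at line rate, in nanoseconds."""
    return 1e9 / max_frames_per_second(frame_bytes)

short_rate = max_frames_per_second(64)     # minimum-size frames
long_rate = max_frames_per_second(1518)    # maximum-size untagged frames
```

At the minimum frame size of 64 bytes the link carries roughly 1.49 million frames per second, leaving about 672 ns per packet for encryption and authentication, compared with roughly 81 thousand frames per second for 1518-byte frames.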
NASA Astrophysics Data System (ADS)
Deng, B.; Xiao, L.; Zhao, X.; Baker, E.; Gong, D.; Guo, D.; He, H.; Hou, S.; Liu, C.; Liu, T.; Sun, Q.; Thomas, J.; Wang, J.; Xiang, A. C.; Yang, D.; Ye, J.; Zhou, W.
2018-05-01
Two optical data link data transmission Application Specific Integrated Circuits (ASICs), the baseline and its backup, have been designed for the ATLAS Liquid Argon (LAr) Calorimeter Phase-I trigger upgrade. The latency of each ASIC and that of its corresponding receiver implemented in a back-end Field-Programmable Gate Array (FPGA) are critical specifications. In this paper, we present the latency measurements and simulation of two ASICs. The measurement results indicate that both ASICs achieve their design goals and meet the latency specifications. The consistency between the simulation and measurements validates the ASIC latency characterization.
A Practical, Hardware Friendly MMSE Detector for MIMO-OFDM-Based Systems
NASA Astrophysics Data System (ADS)
Kim, Hun Seok; Zhu, Weijun; Bhatia, Jatin; Mohammed, Karim; Shah, Anish; Daneshrad, Babak
2008-12-01
Design and implementation of a highly optimized MIMO (multiple-input multiple-output) detector requires co-optimization of the algorithm with the underlying hardware architecture. Special attention must be paid to application requirements such as throughput, latency, and resource constraints. In this work, we focus on a highly optimized matrix inversion free [InlineEquation not available: see fulltext.] MMSE (minimum mean square error) MIMO detector implementation. The work has resulted in a real-time field-programmable gate array (FPGA)-based implementation on a Xilinx Virtex-2 6000 using only 9003 logic slices, 66 multipliers, and 24 Block RAMs (less than 33% of the overall resources of this part). The design delivers over 420 Mbps sustained throughput with a small 2.77-microsecond latency. The designed [InlineEquation not available: see fulltext.] linear MMSE MIMO detector is capable of complying with the proposed IEEE 802.11n standard.
FPGA Based Adaptive Rate and Manifold Pattern Projection for Structured Light 3D Camera System
Lee, Sukhan
2018-01-01
The quality of the captured point cloud and the scanning speed of a structured light 3D camera system depend on its ability to handle object surfaces with large reflectance variation, traded off against the number of patterns that must be projected. In this paper, we propose and implement a flexible embedded framework that is capable of triggering the camera once or multiple times to capture single or multiple projections within a single camera exposure setting. This allows the 3D camera system to synchronize the camera and projector even for mismatched frame rates, so that the system can project different types of patterns for applications with different scan speeds. The system can thus capture a high-quality 3D point cloud even for surfaces with large reflectance variation while achieving a high scan speed. The proposed framework is implemented on a Field Programmable Gate Array (FPGA), where the camera trigger is adaptively generated in such a way that the position and the number of triggers are automatically determined according to camera exposure settings. In other words, the projection frequency is adaptive to different scanning applications without altering the architecture. In addition, the proposed framework is unique as it does not require any external memory for storage because pattern pixels are generated in real-time, which minimizes the complexity and size of the application-specific integrated circuit (ASIC) design and implementation. PMID:29642506
NASA Astrophysics Data System (ADS)
Butkowski, Łukasz; Vogel, Vladimir; Schlarb, Holger; Szabatin, Jerzy
2017-06-01
The driving engine of the superconducting accelerator of the European X-ray free electron laser (XFEL) is a set of 27 radio frequency (RF) stations. Each of the underground RF stations consists of a multibeam horizontal klystron that can provide up to 10 MW of power at 1.3 GHz. Klystrons are sensitive devices with a limited lifetime and a high mean time between failures. In real operation, the lifetime of the tube can be significantly reduced because of failures. The special fast protection klystron lifetime management (KLM) system has been developed to minimize the influence of service conditions on the lifetime of klystrons. The main task of this system is to detect, as quickly as possible, all events that can destroy the tube, and to switch off the driving RF signal or the high voltage. Detection of events is based on a comparison of the value of the real signal obtained at the system output with the value estimated on the basis of a high-power RF amplifier model and input signals. The KLM system has been realized in a field-programmable gate array (FPGA) and implemented at the XFEL. Implementation is based on the standard low-level RF micro telecommunications computing architecture (MTCA.4 or xTCA). The main part of the paper focuses on an estimation of the klystron model and the implementation of KLM in FPGA. The results of the performance of the KLM system will also be presented.
Implementation of 4-way Superscalar Hash MIPS Processor Using FPGA
NASA Astrophysics Data System (ADS)
Sahib Omran, Safaa; Fouad Jumma, Laith
2018-05-01
Due to rapid advances in personal and wireless communication systems, data security has become an increasingly important subject. Security becomes more complicated still when next-generation system requirements and real-time computation speed are considered. Hash functions are among the most essential cryptographic primitives and are used in many areas of signature authentication and communication integrity. These functions produce a fixed-size unique fingerprint, or hash value, of a message of arbitrary length. In this paper, Secure Hash Algorithms (SHA) of types SHA-1, SHA-2 (SHA-224, SHA-256) and SHA-3 (BLAKE) are implemented on a Field-Programmable Gate Array (FPGA) in a processor structure. The design is described and implemented using the VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL). Since the logical operations of these hash types (SHA-1, SHA-224, SHA-256 and SHA-3) are 32 bits wide, a superscalar hash Microprocessor without Interlocked Pipelines (MIPS) processor is designed with only the few instructions required to invoke the desired hash algorithms. When the four hash algorithms are executed sequentially on the designed processor, the total time required is approximately 342 μs, with a throughput of 4.8 Mbps, while the time required to execute the same four hash algorithms on the designed four-way superscalar processor is reduced to 237 μs, with the throughput improved to 5.1 Mbps.
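The fixed-size fingerprint property described above can be illustrated in software with Python's standard `hashlib` module. This is only a reference-style sketch of what the hash functions compute, not a model of the processor in the paper; BLAKE is omitted because the standard library does not provide the original BLAKE function, and the helper name `digest_info` is hypothetical.

```python
import hashlib

def digest_info(algorithm, message):
    """Return (digest length in bits, hex digest) for a hashlib algorithm name."""
    digest = hashlib.new(algorithm, message).digest()
    return len(digest) * 8, digest.hex()

# Messages of very different lengths map to digests of the same size.
short_message = b"abc"
long_message = b"abc" * 10_000

results = {
    name: (digest_info(name, short_message), digest_info(name, long_message))
    for name in ("sha1", "sha224", "sha256")
}
```

Each entry in `results` holds two digests of identical length (160, 224 and 256 bits respectively) even though the second message is ten thousand times longer than the first.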
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures
Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang
2016-01-01
Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n²). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. 
The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
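The semi-variance definition in the abstract, half the mean squared difference of pixel values at lag h, can be sketched in a few lines. The version below is a simplification chosen for brevity: it only pairs pixels that are exactly `lag` apart along a row or a column, whereas a full isotropic computation considers every pixel pair at that distance. The function name and the toy image are illustrative assumptions.

```python
def semivariance(image, lag):
    """Half the mean squared difference over pixel pairs `lag` apart.

    Simplifying assumption: only horizontal and vertical pairs are used.
    `image` is a list of equal-length rows of numbers.
    """
    rows, cols = len(image), len(image[0])
    total, pairs = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + lag < cols:                      # horizontal neighbour
                total += (image[r][c] - image[r][c + lag]) ** 2
                pairs += 1
            if r + lag < rows:                      # vertical neighbour
                total += (image[r][c] - image[r + lag][c]) ** 2
                pairs += 1
    return total / (2 * pairs) if pairs else 0.0

# Toy 3x3 window with a linear intensity ramp (pixel value = row + column).
ramp = [[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4]]
gamma_1 = semivariance(ramp, 1)
gamma_2 = semivariance(ramp, 2)
```

Each pixel pair contributes independently to the sum, which is why the computation lends itself to replicated processing units working on pairs concurrently.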
A Test Methodology for Determining Space-Readiness of Xilinx SRAM-Based FPGA Designs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather M; Graham, Paul S; Morgan, Keith S
2008-01-01
Using reconfigurable, static random-access memory (SRAM) based field-programmable gate arrays (FPGAs) for space-based computation has been an exciting area of research for the past decade. Since both the circuit and the circuit's state are stored in radiation-tolerant memory, both could be altered by the harsh space radiation environment. Both the circuit and the circuit's state can be protected by triple-modular redundancy (TMR), but applying TMR to FPGA user designs is often an error-prone process. Faulty application of TMR could cause the FPGA user circuit to output incorrect data. This paper will describe a three-tiered methodology for testing FPGA user designs for space-readiness. We will describe the standard approach to testing FPGA user designs using a particle accelerator, as well as two methods using fault injection and a modeling tool. While accelerator testing is the current 'gold standard' for pre-launch testing, we believe the use of fault injection and modeling tools allows for easy, cheap and uniform access for discovering errors early in the design process.
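The protection that TMR provides, and a toy form of fault injection, can be sketched as a bitwise two-of-three majority vote over three copies of a value. This is a software illustration only and does not represent the test methodology or tools described in the paper; the helper names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)

def inject_bit_flip(word, bit):
    """Model a single-event upset by flipping one bit of `word`."""
    return word ^ (1 << bit)

golden = 0b10110010

# One upset in one replica is masked by the voter.
single_fault = majority_vote(golden, inject_bit_flip(golden, 3), golden)

# The same bit upset in two replicas defeats the voter.
double_fault = majority_vote(inject_bit_flip(golden, 3),
                             inject_bit_flip(golden, 3),
                             golden)
```

Here `single_fault` equals the golden value while `double_fault` does not, which is the kind of difference a fault-injection campaign looks for when TMR has been applied incorrectly or incompletely.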
Intelligent FPGA Data Acquisition Framework
NASA Astrophysics Data System (ADS)
Bai, Yunpeng; Gaisbauer, Dominic; Huber, Stefan; Konorov, Igor; Levit, Dmytro; Steffen, Dominik; Paul, Stephan
2017-06-01
In this paper, we present the field programmable gate array (FPGA)-based framework intelligent FPGA data acquisition (IFDAQ), which is used for the development of DAQ systems for detectors in high-energy physics. The framework supports Xilinx FPGAs and provides a collection of IP cores written in very high speed integrated circuit hardware description language (VHDL), which use the common interconnect interface. The IP core library offers functionality required for the development of the full DAQ chain. The library consists of Serializer/Deserializer (SERDES)-based time-to-digital conversion channels, an interface to a multichannel 80-MS/s 10-b analog-digital conversion, data transmission, and synchronization protocol between FPGAs, event builder, and slow control. The functionality is distributed among FPGA modules built in the AMC form factor: front end and data concentrator. This modular design also helps to scale and adapt the DAQ system to the needs of the particular experiment. The first application of the IFDAQ framework is the upgrade of the read-out electronics for the drift chambers and the electromagnetic calorimeters (ECALs) of the COMPASS experiment at CERN. The framework will be presented and discussed in the context of this paper.
Analog Module Architecture for Space-Qualified Field-Programmable Mixed-Signal Arrays
NASA Technical Reports Server (NTRS)
Edwards, R. Timothy; Strohbehn, Kim; Jaskulek, Steven E.; Katz, Richard
1999-01-01
Spacecraft require all manner of both digital and analog circuits. Onboard digital systems are constructed almost exclusively from field-programmable gate array (FPGA) circuits providing numerous advantages over discrete design including high integration density, high reliability, fast turn-around design cycle time, lower mass, volume, and power consumption, and lower parts acquisition and flight qualification costs. Analog and mixed-signal circuits perform tasks ranging from housekeeping to signal conditioning and processing. These circuits are painstakingly designed and built using discrete components due to a lack of options for field-programmability. FPAA (Field-Programmable Analog Array) and FPMA (Field-Programmable Mixed-signal Array) parts exist but not in radiation-tolerant technology and not necessarily in an architecture optimal for the design of analog circuits for spaceflight applications. This paper outlines an architecture proposed for an FPAA fabricated in an existing commercial digital CMOS process used to make radiation-tolerant antifuse-based FPGA devices. The primary concerns are the impact of the technology and the overall array architecture on the flexibility of programming, the bandwidth available for high-speed analog circuits, and the accuracy of the components for high-performance applications.
Hierarchical MFMO Circuit Modules for an Energy-Efficient SDR DBF
NASA Astrophysics Data System (ADS)
Mar, Jeich; Kuo, Chi-Cheng; Wu, Shin-Ru; Lin, You-Rong
The hierarchical multi-function matrix operation (MFMO) circuit modules are designed using the coordinate rotation digital computer (CORDIC) algorithm to realize the intensive computation of matrix operations. The paper emphasizes that the designed hierarchical MFMO circuit modules can be used to develop a power-efficient software-defined radio (SDR) digital beamformer (DBF). The formulas of the processing time for the scalable MFMO circuit modules implemented in a field programmable gate array (FPGA) are derived to allocate the proper logic resources for the hardware reconfiguration. The hierarchical MFMO circuit modules are scalable to the changing number of array branches employed for the SDR DBF to achieve the purpose of power saving. The efficient reuse of the common MFMO circuit modules in the SDR DBF can also lead to energy reduction. Finally, the power dissipation and reconfiguration function in the different modes of the SDR DBF are observed from the experimental results.
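For readers unfamiliar with CORDIC, the following sketch shows its rotation mode, which rotates a vector through a sequence of fixed micro-rotations whose tangents are powers of two. It uses floating point for readability; hardware would use fixed-point shifts and a small arctangent table. It illustrates the basic algorithm only and is not the MFMO module design from the paper; the function name and iteration count are assumptions.

```python
import math

def cordic_rotate(x, y, angle, iterations=24):
    """Rotate (x, y) by `angle` radians using CORDIC micro-rotations.

    Converges for |angle| up to about 1.74 rad. Each step multiplies only by
    2**-i (a shift in hardware) and adds or subtracts.
    """
    # Accumulated gain of all micro-rotations, removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle                                # remaining angle to rotate
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward zero residual
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain

# Rotating the unit vector (1, 0) yields (cos(angle), sin(angle)).
cos_val, sin_val = cordic_rotate(1.0, 0.0, 0.6)
```

Because a plane rotation is the building block of Givens-style matrix operations, the same shift-and-add datapath can be reused across several matrix functions.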
NASA Astrophysics Data System (ADS)
Szadkowski, Zbigniew
2016-06-01
The paper presents first results from the Front-End Board (FEB) with the biggest Cyclone® V E FPGA 5CEFA9F31I7N, supporting 8 channels sampled at up to 250 MSps with 14-bit resolution. The sampling rate considered for the planned upgrade of the Pierre Auger surface detector (SD) array is 120 MSps; however, the FEB has been developed with external anti-aliasing filters to keep maximal flexibility. Six channels are targeted to the SD, and the remaining two to other experiments such as the Auger Engineering Radio Array and additional muon counters. More channels and higher sampling rates generate larger registered events. We used the standard radio channel for transmission from the detectors to the Central Data Acquisition Station (CDAS) to avoid, for now, a significant modification of the software on both sides, the detector and the CDAS (planned in the future for a final design). Several variants of the FPGA code were tested for 120, 160, 200 and even 240 MSps DAQ. Tests confirmed the stability and reliability of the FEB design in real pampas conditions, with more than 40°C daily temperature variation and strong sun exposure, with a limited power budget from only a single solar panel. Seven FEBs have been deployed in a hexagon of test detectors on a dedicated Engineering Array.
FPGA Implementation of Stereo Disparity with High Throughput for Mobility Applications
NASA Technical Reports Server (NTRS)
Villalpando, Carlos Y.; Morfopolous, Arin; Matthies, Larry; Goldberg, Steven
2011-01-01
High speed stereo vision can allow unmanned robotic systems to navigate safely in unstructured terrain, but the computational cost can exceed the capacity of typical embedded CPUs. In this paper, we describe an end-to-end stereo computation co-processing system optimized for fast throughput that has been implemented on a single Virtex 4 LX160 FPGA. This system is capable of operating on images from a 1024 x 768 3CCD (true RGB) camera pair at 15 Hz. Data enters the FPGA directly from the cameras via Camera Link and is rectified, pre-filtered and converted into a disparity image all within the FPGA, incurring no CPU load. Once complete, a rectified image and the final disparity image are read out over the PCI bus, for a bandwidth cost of 68 MB/sec. Within the FPGA there are 4 distinct algorithms: Camera Link capture, Bilinear rectification, Bilateral subtraction pre-filtering and the Sum of Absolute Difference (SAD) disparity. Each module will be described in brief along with the data flow and control logic for the system. The system has been successfully fielded upon the Carnegie Mellon University's National Robotics Engineering Center (NREC) Crusher system during extensive field trials in 2007 and 2008 and is being implemented for other surface mobility systems at JPL.
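As a rough illustration of the Sum of Absolute Difference (SAD) disparity stage named above, the sketch below matches one rectified scanline against the other. The window size, search range, and function name are assumptions for illustration and are not taken from the described system, which operates on full images in FPGA logic.

```python
def sad_disparity_row(left, right, window=1, max_disp=4):
    """Per-pixel disparity for one rectified scanline using SAD matching."""
    n = len(left)
    disparities = [0] * n
    # Skip borders where the window or the search range would leave the row.
    for x in range(window + max_disp, n - window):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            # Cost of matching the left window at x to the right window at x - d.
            cost = sum(abs(left[x + k] - right[x + k - d])
                       for k in range(-window, window + 1))
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities

right_row = [5, 1, 8, 3, 9, 2, 7, 4, 6, 0, 8, 1, 5, 3, 9, 2]
left_row = [0, 0] + right_row[:-2]   # same scene shifted by two pixels
disp = sad_disparity_row(left_row, right_row)
```

For this synthetic pair, the interior pixels should report a disparity of 2, matching the two-pixel shift used to build the left row.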
Parallel algorithm for computation of second-order sequential best rotations
NASA Astrophysics Data System (ADS)
Redif, Soydan; Kasap, Server
2013-12-01
Algorithms for computing an approximate polynomial matrix eigenvalue decomposition of para-Hermitian systems have emerged as a powerful, generic signal processing tool. A technique that has shown much success in this regard is the sequential best rotation (SBR2) algorithm. Proposed is a scheme for parallelising SBR2 with a view to exploiting the modern architectural features and inherent parallelism of field-programmable gate array (FPGA) technology. Experiments show that the proposed scheme can achieve low execution times while requiring minimal FPGA resources.
Combine Flash-Based FPGA TID and Long-Term Retention Reliabilities Through VT Shift
NASA Astrophysics Data System (ADS)
Wang, Jih-Jong; Rezzak, Nadia; Dsilva, Durwyn; Xue, Fengliang; Samiee, Salim; Singaraju, Pavan; Jia, James; Nguyen, Victor; Hawley, Frank; Hamdy, Esmat
2016-08-01
Reliability test results of data retention and total ionizing dose (TID) in 65 nm Flash-based field programmable gate array (FPGA) are presented. Long-chain inverter design is recommended for reliability evaluation because it is the worst case design for both effects. Based on preliminary test data, both issues are unified and modeled by one natural decay equation. The relative contributions of TID induced threshold-voltage shift and retention mechanisms are evaluated by analyzing test data.
Integration of multi-interface conversion channel using FPGA for modular photonic network
NASA Astrophysics Data System (ADS)
Janicki, Tomasz; Pozniak, Krzysztof T.; Romaniuk, Ryszard S.
2010-09-01
The article discusses the integration of different types of interfaces with FPGA circuits using a reconfigurable communication platform. The solution has been implemented in practice in a single node of a distributed measurement system. The construction of the communication platform is presented along with selected hardware modules, described in VHDL and implemented in FPGA circuits. A graphical user interface (GUI) that allows a user to control the operation of the system is described. The final part of the article introduces selected practical solutions. The whole measurement system resides on a multi-gigabit optical network. The optical network construction is highly modular, reconfigurable and scalable.
A real-time n/γ digital pulse shape discriminator based on FPGA.
Li, Shiping; Xu, Xiufeng; Cao, Hongrui; Yuan, Guoliang; Yang, Qingwei; Yin, Zejie
2013-02-01
An FPGA-based real-time digital pulse shape discriminator has been employed to distinguish between neutrons (n) and gammas (γ) in the Neutron Flux Monitor (NFM) for the International Thermonuclear Experimental Reactor (ITER). The discriminator takes advantage of the parallel and pipelined processing capabilities of the Field Programmable Gate Array (FPGA) to carry out real-time sifting of neutrons in n/γ mixed radiation fields, and uses rise time and amplitude inspection simultaneously as the discrimination algorithm to obtain good n/γ separation. Experimental results are presented which show that this discriminator can meet the anticipated goals of the NFM, with excellent discrimination quality and zero dead time. Copyright © 2012 Elsevier Ltd. All rights reserved.
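The two quantities the discriminator inspects, rise time and amplitude, can be sketched for a single digitized pulse as follows. This is a hypothetical software illustration: the 10% and 90% levels and the function name are assumptions, and the decision thresholds that separate n from γ are detector-specific and not given in the abstract.

```python
def rise_time_and_amplitude(pulse, lo=0.1, hi=0.9):
    """Return (rise_time_in_samples, amplitude) of a baseline-subtracted pulse."""
    amplitude = max(pulse)
    peak_index = pulse.index(amplitude)
    # First leading-edge sample at or above each fractional level.
    t_lo = next(i for i in range(peak_index + 1) if pulse[i] >= lo * amplitude)
    t_hi = next(i for i in range(peak_index + 1) if pulse[i] >= hi * amplitude)
    return t_hi - t_lo, amplitude

fast_pulse = [0, 5, 10, 8, 4, 1]
slow_pulse = [0, 1, 2, 4, 6, 8, 10, 7, 3]
fast_features = rise_time_and_amplitude(fast_pulse)
slow_features = rise_time_and_amplitude(slow_pulse)
```

A classifier would then compare both features against calibrated thresholds; in an FPGA the two measurements can run in parallel on the sample stream.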
NASA Astrophysics Data System (ADS)
Passas, Georgios; Freear, Steven; Fawcett, Darren
2010-08-01
Orthogonal frequency division multiplexing (OFDM)-based feed-forward space-time trellis code (FFSTTC) encoders can be synthesised as very high speed integrated circuit hardware description language (VHDL) designs. Evaluation of their FPGA implementation can lead to conclusions that help a designer to decide the optimum implementation, given the encoder structural parameters. VLSI architectures based on 1-bit multipliers and look-up tables (LUTs) are compared in terms of FPGA slices and block RAMs (area), as well as in terms of minimum clock period (speed). Area and speed graphs versus encoder memory order are provided for quadrature phase shift keying (QPSK) and 8 phase shift keying (8-PSK) modulation and two transmit antennas, revealing best implementation under these conditions. The effect of number of modulation bits and transmit antennas on the encoder implementation complexity is also investigated.
High-Performance CCSDS Encapsulation Service Implementation in FPGA
NASA Technical Reports Server (NTRS)
Clare, Loren P.; Torgerson, Jordan L.; Pang, Jackson
2010-01-01
The Consultative Committee for Space Data Systems (CCSDS) Encapsulation Service is a convergence layer between lower-layer space data link framing protocols, such as CCSDS Advanced Orbiting System (AOS), and higher-layer networking protocols, such as CFDP (CCSDS File Delivery Protocol) and Internet Protocol Extension (IPE). CCSDS Encapsulation Service is considered part of the data link layer. The CCSDS AOS implementation is described in the preceding article. Recent advancement in RF modem technology has allowed multi-megabit transmission over space links. With this increase in data rate, the CCSDS Encapsulation Service needs to be optimized to both reduce energy consumption and operate at a high rate. CCSDS Encapsulation Service has been implemented as an intellectual property core so that the aforementioned problems are solved by way of operating the CCSDS Encapsulation Service inside an FPGA. The CCSDS Encapsulation Service in FPGA implementation consists of both packetizing and de-packetizing features.
RFI Risk Reduction Activities Using New Goddard Digital Radiometry Capabilities
NASA Technical Reports Server (NTRS)
Bradley, Damon; Kim, Ed; Young, Peter; Miles, Lynn; Wong, Mark; Morris, Joel
2012-01-01
The Goddard Radio-Frequency Explorer (GREX) is the latest fast-sampling radiometer digital back-end processor that will be used for radiometry and radio-frequency interference (RFI) surveying at Goddard Space Flight Center. The system is compact and deployable, with a mass of about 40 kilograms. It is intended to be flown on aircraft. GREX is compatible with almost any aircraft, including P-3, Twin Otter, C-23, C-130, G3, and G5 types. At a minimum, the system can function as a clone of the Soil Moisture Active Passive (SMAP) ground-based development unit [1], or can be a completely independent system that is interfaced to any radiometer, provided that frequency shifting to GREX's intermediate frequency is performed prior to sampling. If the radiometer RF is less than 200 MHz, then the band can be sampled and acquired directly by the system. A key feature of GREX is its ability to sample two polarization channels simultaneously at up to 400 MSPS, with 14-bit resolution each. The sampled signals can be recorded continuously to a 23 TB solid-state RAID storage array. Data captures can be analyzed offline using the supercomputing facilities at Goddard Space Flight Center. In addition, various Field Programmable Gate Array (FPGA)-amenable radiometer signal processing and RFI detection algorithms can be implemented directly on the GREX system because it includes a high-capacity Xilinx Virtex-5 FPGA prototyping system that is user customizable.
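The figures quoted above can be related with a little arithmetic. The sketch below assumes tightly packed 14-bit samples, decimal terabytes, and no recording overhead; none of these are stated in the abstract, so the recording time is only an upper-bound estimate.

```python
SAMPLE_RATE_HZ = 400_000_000     # 400 MSPS per channel (from the abstract)
CHANNELS = 2                     # two polarization channels
BITS_PER_SAMPLE = 14             # assumed stored without padding
STORAGE_BYTES = 23 * 10**12      # 23 TB, assumed decimal terabytes

# Nyquist limit: the highest frequency that can be sampled directly.
nyquist_hz = SAMPLE_RATE_HZ / 2

# Raw data rate and the continuous recording time it would allow.
bytes_per_second = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 8
record_seconds = STORAGE_BYTES / bytes_per_second
record_hours = record_seconds / 3600
```

The 200 MHz Nyquist limit agrees with the direct-sampling condition in the text. Under the stated assumptions the raw stream is 1.4 GB/s, which would fill the array in roughly four and a half hours.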
Evolutionary Based Techniques for Fault Tolerant Field Programmable Gate Arrays
NASA Technical Reports Server (NTRS)
Larchev, Gregory V.; Lohn, Jason D.
2006-01-01
The use of SRAM-based Field Programmable Gate Arrays (FPGAs) is becoming more and more prevalent in space applications. Commercial-grade FPGAs are potentially susceptible to permanently debilitating Single-Event Latchups (SELs). Repair methods based on Evolutionary Algorithms may be applied to FPGA circuits to enable successful fault recovery. This paper presents the experimental results of applying such methods to repair four commonly used circuits (quadrature decoder, 3-by-3-bit multiplier, 3-by-3-bit adder, 4-to-7 decoder) into which a number of simulated faults have been introduced. The results suggest that evolutionary repair techniques can improve the process of fault recovery when used instead of or as a supplement to Triple Modular Redundancy (TMR), which is currently the predominant method for mitigating FPGA faults.
NASA Astrophysics Data System (ADS)
Abdelazim, S.; Santoro, D.; Arend, M.; Moshary, F.; Ahmed, S.
2011-11-01
A field deployable all-fiber eye-safe Coherent Doppler LIDAR is being developed at the Optical Remote Sensing Lab at the City College of New York (CCNY) and is designed to monitor wind fields autonomously and continuously in urban settings. Data acquisition is accomplished by sampling lidar return signals at 400 MHz and performing onboard processing using field programmable gate arrays (FPGAs). The FPGA is programmed to accumulate signal information that is used to calculate the power spectrum of the atmospherically back scattered signal. The advantage of using FPGA is that signal processing will be performed at the hardware level, reducing the load on the host computer and allowing for 100% return signal processing. An experimental setup measured wind speeds at ranges of up to 3 km.
Optically Programmable Field Programmable Gate Arrays (FPGA) Systems
2004-01-01
VCSEL requires placing the array far enough as to overlap the entire footprint of the signal beam in order to record the hologram. Therefore, these...hologram that self-focuses, due to phase-conjugation, on the array of detectors in the chip. [Figure: VCSEL array, optical memory and logic, with dimensions 10 mm, 10 mm, 18 mm and 16 mm]...the VCSEL array, the chip and the optical material, and the requirements they have to meet for their use in the OPGA system. Section
An Analysis of Offset, Gain, and Phase Corrections in Analog to Digital Converters
NASA Astrophysics Data System (ADS)
Cody, Devin; Ford, John
2015-01-01
Many high-speed analog to digital converters (ADCs) use interwoven ADCs to greatly boost their sample rate. This interwoven architecture can introduce problems if the low speed ADCs do not have identical outputs. These errors are manifested as phantom frequencies that appear in the digitized signal although they never existed in the analog domain. Through the application of offset, gain, and phase (OGP) corrections to the ADC, this problem can be reduced. Here we report on an implementation of such a correction in a high speed ADC chip used for radio astronomy. While the corrections could not be implemented in the ADCs themselves, a partial solution was devised and implemented digitally inside of a signal processing field programmable gate array (FPGA). Positive results to contrived situations are shown, and null results are presented for implementation in an ADC083000 card with minimal error. Lastly, we discuss the implications of this method as well as its mathematical basis.
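To make the offset and gain part of an OGP correction concrete, here is a minimal sketch that equalizes the sub-converters of an interleaved capture in software. It is not the authors' FPGA implementation. It assumes every core sees statistically identical input over the capture, uses mean absolute deviation as a simple amplitude estimate, and leaves out the phase (timing skew) correction entirely.

```python
def correct_offset_gain(samples, cores=4):
    """Equalize per-core offset and gain of a time-interleaved ADC capture."""
    lanes = [samples[c::cores] for c in range(cores)]
    offsets = [sum(lane) / len(lane) for lane in lanes]
    # Mean absolute deviation as a per-core amplitude estimate (must be nonzero).
    gains = [sum(abs(v - o) for v in lane) / len(lane)
             for lane, o in zip(lanes, offsets)]
    ref_offset = sum(offsets) / cores
    ref_gain = sum(gains) / cores
    corrected = list(samples)
    for c in range(cores):
        for i in range(c, len(samples), cores):
            # Remove this core's offset, rescale to the common gain, re-center.
            corrected[i] = (samples[i] - offsets[c]) * ref_gain / gains[c] + ref_offset
    return corrected

# Two-core example: the second core has gain 1.5 and offset 0.2.
ideal = [1.0, 1.0, -1.0, -1.0] * 8
distorted = [v if i % 2 == 0 else 1.5 * v + 0.2 for i, v in enumerate(ideal)]
fixed = correct_offset_gain(distorted, cores=2)
```

After correction, samples that were equal in the ideal signal are equal again across the two cores, which removes the spurious tones that the mismatch would otherwise create.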
Chao, Chun-Tang
2014-01-01
This paper presents the design and evaluation of the hardware circuit for electronic stethoscopes with heart sound cancellation capabilities using field programmable gate arrays (FPGAs). The adaptive line enhancer (ALE) was adopted as the filtering methodology to reduce heart sound attributes from the breath sounds obtained via the electronic stethoscope pickup. FPGAs were utilized to implement the ALE functions in hardware to achieve near real-time breath sound processing. We believe that such an implementation is unprecedented and crucial toward a truly useful, standalone medical device in outpatient clinic settings. The implementation evaluation with one Altera Cyclone II–EP2C70F89 shows that the proposed ALE used 45% of the chip's resources. Experiments with the proposed prototype were made using a DE2-70 emulation board with recorded body signals obtained from online medical archives. Clear suppression was observed in our experiments from both the frequency domain and time domain perspectives. PMID:24790573
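A generic adaptive line enhancer can be sketched with a delayed-input LMS predictor, as below. The tap count, delay, and step size are illustrative assumptions and do not come from the paper's hardware design. The predictor output tracks the predictable part of the input, and the prediction error carries whatever the filter cannot predict.

```python
import math

def adaptive_line_enhancer(x, taps=8, delay=1, mu=0.01):
    """LMS-based ALE: returns (predicted, error) sequences for input x."""
    w = [0.0] * taps
    predicted, error = [], []
    for n in range(len(x)):
        # Delayed reference vector; samples before the start are treated as 0.
        ref = [x[n - delay - k] if n - delay - k >= 0 else 0.0
               for k in range(taps)]
        y = sum(wk * rk for wk, rk in zip(w, ref))
        e = x[n] - y
        # LMS weight update driven by the prediction error.
        w = [wk + 2 * mu * e * rk for wk, rk in zip(w, ref)]
        predicted.append(y)
        error.append(e)
    return predicted, error

tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]
pred, err = adaptive_line_enhancer(tone)
early_energy = sum(e * e for e in err[:200])
late_energy = sum(e * e for e in err[-200:])
```

For a pure tone the error energy should collapse once the weights converge. In a stethoscope setting, the split between the two outputs would depend on how periodic each sound source is, which this sketch does not model.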
NASA Astrophysics Data System (ADS)
Qiu, Mo; Yu, Simin; Wen, Yuqiong; Lü, Jinhu; He, Jianbin; Lin, Zhuosheng
In this paper, a novel design methodology and its FPGA hardware implementation for a universal chaotic signal generator are proposed via the Verilog HDL fixed-point algorithm and state machine control. According to continuous-time or discrete-time chaotic equations, a Verilog HDL fixed-point algorithm and its corresponding digital system are first designed. In the FPGA hardware platform, each operation step of the Verilog HDL fixed-point algorithm is then controlled by a state machine. The generality of this method is that any given chaotic equation can be decomposed into four basic operation procedures, i.e. nonlinear function calculation, iterative sequence operation, right shifting and ceiling of the iterative values, and output of the chaotic iterative sequences, each of which corresponds to a single state under state machine control. Compared with the Verilog HDL floating-point algorithm, the Verilog HDL fixed-point algorithm can save FPGA hardware resources and improve operation efficiency. FPGA-based hardware experimental results validate the feasibility and reliability of the proposed approach.
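The four procedures listed above can be illustrated with an integer-only iteration of the logistic map, used here purely as a stand-in chaotic equation. The Q16 format and all names are assumptions. Note that this sketch truncates on the right shift, whereas the abstract mentions a ceiling operation.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS   # fixed-point representation of 1.0 in Q16

def logistic_fixed(x0, r, steps):
    """Iterate x <- r*x*(1-x) using integer-only Q16 arithmetic."""
    x = int(x0 * ONE)
    r_q = int(r * ONE)
    out = []
    for _ in range(steps):
        # Nonlinear function: x*(1-x), shifted right to return to Q16.
        nonlinear = (x * (ONE - x)) >> FRAC_BITS
        # Iteration step: multiply by r and shift right again (truncating).
        x = (r_q * nonlinear) >> FRAC_BITS
        # Output step: emit the new state.
        out.append(x)
    return out

settled = logistic_fixed(0.3, 2.5, 50)[-1] / ONE
chaotic = logistic_fixed(0.3, 3.9, 200)
```

With r = 2.5 the map is not chaotic and settles near 1 - 1/r = 0.6, which gives a convenient check on the scaling. With r = 3.9 the sequence stays inside the unit interval while wandering irregularly.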
NASA Technical Reports Server (NTRS)
McGuffey, Alex; Berg, Melanie; Pellish, Jonathan
2010-01-01
Field programmable gate arrays (FPGAs) are used in every space application. Currently, most space flight applications use radiation hardened (RH) FPGAs, which are very expensive. There is a desire to use cheaper, commercial off the shelf (COTS) reprogrammable FPGAs, which are more susceptible to radiation effects known as single-event effects (SEE). The RH parts have SEE and total ionizing dose (TID) hardened elements pre-integrated into the part. This means that the designer does not need to implement any hardening techniques while configuring the device. The COTS parts, on the other hand, must be hardened by design to achieve any form of mitigation. The design techniques this project examines concern the use of localized triple modular redundancy (LTMR) and distributed triple modular redundancy (DTMR). LTMR triples every flip flop in the device architecture while DTMR triples everything except for the global routes (clocks, resets, and enables). The testing was performed on a ProASIC3E FPGA at the Texas A&M cyclotron facility. Two design architectures were used: shift registers and counters, both with LTMR and DTMR mitigation techniques. The test results prove that DTMR is more effective at reducing SEE than LTMR. We also determined that there was not a significant difference between the use of shift registers and counters for test purposes. More testing is required to obtain additional linear energy transfer values for each architecture and mitigation technique in order to determine the most cost-effective method of SEE mitigation.
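Both LTMR and DTMR rely on a majority voter over three redundant copies; they differ in how much of the design is triplicated. The following sketch shows only the voting step on integer-valued register copies. The names and the injected upset are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three redundant register copies."""
    return (a & b) | (a & c) | (b & c)

# Three copies of a 4-bit register; a simulated upset flips one bit in copy c.
copy_a = 0b1010
copy_b = 0b1010
copy_c = 0b1010 ^ 0b1000
voted = majority(copy_a, copy_b, copy_c)
```

A single corrupted copy is outvoted bit by bit, so the voted value still equals the original register contents. Upsets in two copies at the same bit position would defeat the vote.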
FPGA for Power Control of MSL Avionics
NASA Technical Reports Server (NTRS)
Wang, Duo; Burke, Gary R.
2011-01-01
A PLGT FPGA (Field Programmable Gate Array) is included in the LCC (Load Control Card), GID (Guidance Interface & Drivers), TMC (Telemetry Multiplexer Card), and PFC (Pyro Firing Card) boards of the Mars Science Laboratory (MSL) spacecraft. (PLGT stands for PFC, LCC, GID, and TMC.) It provides the interface between the backside bus and the power drivers on these boards. The LCC drives power switches to switch power loads, and also relays. The GID drives the thrusters and latch valves, as well as having the star-tracker and Sun-sensor interface. The PFC drives pyros, and the TMC receives digital and analog telemetry. The FPGA is implemented both in Xilinx (Spartan 3-400) and in Actel (RTSX72SU, ASX72S). The Xilinx Spartan 3 part is used for the breadboard, the Actel ASX part is used for the EM (Engineer Module), and the pin-compatible, radiation-hardened RTSX part is used for final EM and flight. The MSL spacecraft uses a FC (Flight Computer) to control power loads, relays, thrusters, latch valves, Sun-sensor, and star-tracker, and to read telemetry such as temperature. Commands are sent over a 1553 bus to the MREU (Multi-Mission System Architecture Platform Remote Engineering Unit). The MREU resends over a remote serial command bus c-bus to the LCC, GID, TMC, and PFC. The MREU also sends out telemetry addresses via a remote serial telemetry address bus to the LCC, GID, TMC, and PFC, and the status is returned over the remote serial telemetry data bus.
NASA Astrophysics Data System (ADS)
Cobos Arribas, Pedro; Monasterio Huelin Macia, Felix
2003-04-01
An FPGA-based hardware implementation of the Santos-Victor optical flow algorithm, useful in robot guidance applications, is described in this paper. The system contains an ALTERA FPGA (20K100), an interface with a digital camera, three VRAM memories to hold the input data, and some output memories (a VRAM and an EDO) to hold the results. The system has been used previously to develop and test other vision algorithms, such as image compression and optical flow calculation with differential and correlation methods. The designed system allows the digital camera, or the FPGA output (the results of the algorithms), to be connected to a PC through its FireWire or USB port. The problems encountered on this occasion have motivated the adoption of another hardware structure for certain vision algorithms with special requirements that need very intensive processing.
High-performance camera module for fast quality inspection in industrial printing applications
NASA Astrophysics Data System (ADS)
Fürtler, Johannes; Bodenstorfer, Ernst; Mayer, Konrad J.; Brodersen, Jörg; Heiss, Dorothea; Penz, Harald; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert
2007-02-01
Today, printing products which must meet highest quality standards, e.g., banknotes, stamps, or vouchers, are automatically checked by optical inspection systems. Typically, the examination of fine details of the print or security features demands images taken from various perspectives, with different spectral sensitivity (visible, infrared, ultraviolet), and with high resolution. Consequently, the inspection system is equipped with several cameras and has to cope with an enormous data rate to be processed in real-time. Hence, it is desirable to move image processing tasks into the camera to reduce the amount of data which has to be transferred to the (central) image processing system. The idea is to transfer relevant information only, i.e., features of the image instead of the raw image data from the sensor. These features are then further processed. In this paper a color line-scan camera for line rates up to 100 kHz is presented. The camera is based on a commercial CMOS (complementary metal oxide semiconductor) area image sensor and a field programmable gate array (FPGA). It implements extraction of image features which are well suited to detect print flaws like blotches of ink, color smears, splashes, spots and scratches. The camera design and several image processing methods implemented on the FPGA are described, including flat field correction, compensation of geometric distortions, color transformation, as well as decimation and neighborhood operations.
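Of the processing steps listed, flat field correction is the simplest to illustrate. The sketch below applies the common two-point form, using a dark reference and a uniformly lit reference per pixel. The abstract does not specify the camera's actual formula, so the function, its parameters, and the clamping are assumptions.

```python
def flat_field_correct(raw, dark, flat, target=255.0):
    """Two-point flat field correction for one line of pixel values."""
    corrected = []
    for r, d, f in zip(raw, dark, flat):
        span = f - d
        # Normalize each pixel by its own dark-to-flat response range.
        value = (r - d) * target / span if span > 0 else 0.0
        # Clamp to the valid output range.
        corrected.append(min(max(value, 0.0), target))
    return corrected

# Two pixels with different sensitivity viewing the same mid-gray surface.
line = flat_field_correct(raw=[110, 60], dark=[10, 10], flat=[210, 110])
```

Both pixels map to the same output level even though their raw values differ, which is the purpose of the correction. In a line-scan camera the dark and flat references would typically be stored per pixel column.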
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2017-01-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft’s FPGA deployment in its Bing search engine and Intel’s $16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems, like Apache Spark and Hadoop, to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7× to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2016-10-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's $16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems, like Apache Spark and Hadoop, to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7× to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
Pérez Suárez, Santiago T.; Travieso González, Carlos M.; Alonso Hernández, Jesús B.
2013-01-01
This article presents a design methodology for designing an artificial neural network as an equalizer for a binary signal. Firstly, the system is modelled in floating point format using Matlab. Afterward, the design is described for a Field Programmable Gate Array (FPGA) using fixed point format. The FPGA design is based on the System Generator from Xilinx, which is a design tool over Simulink of Matlab. System Generator allows one to design in a fast and flexible way. It uses low level details of the circuits and the functionality of the system can be fully tested. System Generator can be used to check the architecture and to analyse the effect of the number of bits on the system performance. Finally the System Generator design is compiled for the Xilinx Integrated System Environment (ISE) and the system is described using a hardware description language. In ISE the circuits are managed with high level details and physical performances are obtained. In the Conclusions section, some modifications are proposed to improve the methodology and to ensure portability across FPGA manufacturers.
Brusati, M.; Camplani, A.; Cannon, M.; ...
2017-02-20
SRAM-based Field Programmable Gate Array (FPGA) logic devices are very attractive in applications where high data throughput is needed, such as the latest generation of High Energy Physics (HEP) experiments. FPGAs have been rarely used in such experiments because of their sensitivity to radiation. The present paper proposes a mitigation approach applied to commercial FPGA devices to meet the reliability requirements for the front-end electronics of the Liquid Argon (LAr) electromagnetic calorimeter of the ATLAS experiment, located at CERN. Particular attention will be devoted to define a proper mitigation scheme of the multi-gigabit transceivers embedded in the FPGA, which is a critical part of the LAr data acquisition chain. A demonstrator board is being developed to validate the proposed methodology. Mitigation techniques such as Triple Modular Redundancy (TMR) and scrubbing will be used to increase the robustness of the design and to maximize the fault tolerance from Single-Event Upsets (SEUs).
Software-based high-level synthesis design of FPGA beamformers for synthetic aperture imaging.
Amaro, Joao; Yiu, Billy Y S; Falcao, Gabriel; Gomes, Marco A C; Yu, Alfred C H
2015-05-01
Field-programmable gate arrays (FPGAs) can potentially be configured as beamforming platforms for ultrasound imaging, but a long design time and skilled expertise in hardware programming are typically required. In this article, we present a novel approach to the efficient design of FPGA beamformers for synthetic aperture (SA) imaging via the use of software-based high-level synthesis techniques. Software kernels (coded in OpenCL) were first developed to stage-wise handle SA beamforming operations, and their corresponding FPGA logic circuitry was emulated through a high-level synthesis framework. After design space analysis, the fine-tuned OpenCL kernels were compiled into register transfer level descriptions to configure an FPGA as a beamformer module. The processing performance of this beamformer was assessed through a series of offline emulation experiments that sought to derive beamformed images from SA channel-domain raw data (40-MHz sampling rate, 12-bit resolution). With 128 channels, our FPGA-based SA beamformer can achieve 41 frames per second (fps) processing throughput (3.44 × 10^8 pixels per second for frame size of 256 × 256 pixels) at 31.5 W power consumption (1.30 fps/W power efficiency). It utilized 86.9% of the FPGA fabric and operated at a 196.5 MHz clock frequency (after optimization). Based on these findings, we anticipate that FPGA and high-level synthesis can together foster rapid prototyping of real-time ultrasound processor modules at low power consumption budgets.
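The reported throughput figures can be cross-checked with simple arithmetic. The reading of the pixel-rate ratio as the number of low-resolution images compounded per frame is an interpretation offered here as an assumption; the abstract does not state it.

```python
FPS = 41.0                  # reported frame rate
POWER_W = 31.5              # reported power consumption
FRAME_PIXELS = 256 * 256    # reported frame size
PIXEL_RATE = 3.44e8         # reported pixels per second

# Power efficiency, to compare with the reported 1.30 fps/W.
fps_per_watt = FPS / POWER_W

# Pixels in finished frames per second, and how much larger the quoted rate is.
frame_pixel_rate = FPS * FRAME_PIXELS
ratio = PIXEL_RATE / frame_pixel_rate
```

The efficiency comes out at about 1.30 fps/W, as reported. The quoted pixel rate is about 128 times the finished-frame pixel rate, which would be consistent with each frame being formed from roughly 128 intermediate images, although that is only one possible explanation.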
Board Saver for Use with Developmental FPGAs
NASA Technical Reports Server (NTRS)
Berkun, Andrew
2009-01-01
A device denoted a board saver has been developed as a means of reducing wear and tear of a printed-circuit board onto which an antifuse field programmable gate array (FPGA) is to be eventually soldered permanently after a number of design iterations. The need for the board saver or a similar device arises because (1) antifuse-FPGA design iterations are common and (2) repeated soldering and unsoldering of FPGAs on the printed-circuit board to accommodate design iterations can wear out the printed-circuit board. The board saver is basically a solderable/unsolderable FPGA receptacle that is installed temporarily on the printed-circuit board. The board saver is, more specifically, a smaller, square-ring-shaped, printed-circuit board (see figure) that contains half via holes, one for each contact pad along its periphery. As initially fabricated, the board saver is a wider ring containing full via holes, but then it is milled along its outer edges, cutting the via holes in half and laterally exposing their interiors. The board saver is positioned in registration with the designated FPGA footprint and each via hole is soldered to the outer portion of the corresponding FPGA contact pad on the first-mentioned printed-circuit board. The via-hole/contact joints can be inspected visually and can be easily unsoldered later. The square hole in the middle of the board saver is sized to accommodate the FPGA, and the thickness of the board saver is the same as that of the FPGA. Hence, when a non-final FPGA is placed in the square hole, the combination of the non-final FPGA and the board saver occupy no more area and thickness than would a final FPGA soldered directly into its designated position on the first-mentioned circuit board. The contact leads of a non-final FPGA are not bent and are soldered, at the top of the board saver, to the corresponding via holes. A non-final FPGA can readily be unsoldered from the board saver and replaced by another one.
Once the final FPGA design has been determined, the board saver can be unsoldered from the contact pads on the first-mentioned printed-circuit board and replaced by the final FPGA.
NASA Astrophysics Data System (ADS)
Yang, C.; Zheng, W.; Zhang, M.; Yuan, T.; Zhuang, G.; Pan, Y.
2016-06-01
Measurement and control of the plasma in real-time are critical for advanced Tokamak operation. This requires high speed real-time data acquisition and processing. ITER has designed the Fast Plant System Controllers (FPSC) for these purposes. At the J-TEXT Tokamak, a real-time data acquisition and processing framework has been designed and implemented using standard ITER FPSC technologies. The main hardware components of this framework are an Industrial Personal Computer (IPC) with a real-time system and FPGA-based FlexRIO devices. With FlexRIO devices, data can be processed by the FPGA in real-time before they are passed to the CPU. The software elements are based on a real-time framework which runs under Red Hat Enterprise Linux MRG-R and uses the Experimental Physics and Industrial Control System (EPICS) for monitoring and configuration. This makes the framework conform to ITER FPSC standard technology. With this framework, any kind of data acquisition and processing FlexRIO FPGA program can be configured with an FPSC. An application using the framework has been implemented for the polarimeter-interferometer diagnostic system on J-TEXT. The application is able to extract phase-shift information from the intermediate frequency signal produced by the polarimeter-interferometer diagnostic system and calculate the plasma density profile in real-time. Different algorithm implementations on the FlexRIO FPGA are compared in the paper.
FPGA implementation of current-sharing strategy for parallel-connected SEPICs
NASA Astrophysics Data System (ADS)
Ezhilarasi, A.; Ramaswamy, M.
2016-01-01
This work develops an equal current-sharing algorithm for a number of single-ended primary inductance converters (SEPICs) connected in parallel. The methodology involves developing a state-space model to predict the condition for the existence of a stable equilibrium portrait. A variable structure controller guides the trajectory so as to circumvent the circuit non-linearities and achieve stable performance over a preferred operating range. The design yields acceptable servo and regulatory characteristics and the desired time response, and ensures regulation of the load voltage. The simulation results, validated through a field programmable gate array-based prototype, illustrate its suitability for present-day applications.
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing
Bravo, Ignacio; Baliñas, Javier; Gardel, Alfredo; Lázaro, José L.; Espinosa, Felipe; García, Jorge
2011-01-01
This article describes an image processing system based on an intelligent ad-hoc camera, whose two principal elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families make it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely. PMID:22163739
A generic FPGA-based detector readout and real-time image processing board
NASA Astrophysics Data System (ADS)
Sarpotdar, Mayuresh; Mathew, Joice; Safonova, Margarita; Murthy, Jayant
2016-07-01
For space-based astronomical observations, it is important to have a mechanism to capture the digital output from the standard detector for further on-board analysis and storage. We have developed a generic (application-wise) field-programmable gate array (FPGA) board to interface with an image sensor, a method to generate the clocks required to read the image data from the sensor, and a real-time image processor system (on-chip) which can be used for various image processing tasks. The FPGA board is applied as the image processor board in the Lunar Ultraviolet Cosmic Imager (LUCI) and a star sensor (StarSense), instruments developed by our group. In this paper, we discuss the various design considerations for this board and its applications in future balloon and possible space flights.
A SEU-Hard Flip-Flop for Antifuse FPGAs
NASA Technical Reports Server (NTRS)
Katz, R.; Wang, J. J.; McCollum, J.; Cronquist, B.; Chan, R.; Yu, D.; Kleyner, I.; Day, John H. (Technical Monitor)
2001-01-01
A single event upset (SEU)-hardened flip-flop has been designed and developed for antifuse Field Programmable Gate Array (FPGA) application. Design and application issues, testability, test methods, simulation, and results are discussed.
Real-time biomimetic Central Pattern Generators in an FPGA for hybrid experiments
Ambroise, Matthieu; Levi, Timothée; Joucla, Sébastien; Yvert, Blaise; Saïghi, Sylvain
2013-01-01
This investigation of the leech heartbeat neural network system led to the development of a low resources, real-time, biomimetic digital hardware for use in hybrid experiments. The leech heartbeat neural network is one of the simplest central pattern generators (CPG). In biology, CPG provide the rhythmic bursts of spikes that form the basis for all muscle contraction orders (heartbeat) and locomotion (walking, running, etc.). The leech neural network system was previously investigated and this CPG formalized in the Hodgkin–Huxley neural model (HH), the most complex devised to date. However, the resources required for a neural model are proportional to its complexity. In response to this issue, this article describes a biomimetic implementation of a network of 240 CPGs in an FPGA (Field Programmable Gate Array), using a simple model (Izhikevich) and proposes a new synapse model: activity-dependent depression synapse. The network implementation architecture operates on a single computation core. This digital system works in real-time, requires few resources, and has the same bursting activity behavior as the complex model. The implementation of this CPG was initially validated by comparing it with a simulation of the complex model. Its activity was then matched with pharmacological data from the rat spinal cord activity. This digital system opens the way for future hybrid experiments and represents an important step toward hybridization of biological tissue and artificial neural networks. This CPG network is also likely to be useful for mimicking the locomotion activity of various animals and developing hybrid experiments for neuroprosthesis development. PMID:24319408
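The simple neuron model named in the abstract above can be illustrated with a short software sketch. What follows is not the authors' fixed-point FPGA design and does not include their synapse model; it is a floating-point, forward-Euler integration of a single Izhikevich neuron. The parameter values (a, b, c, d), the input current and the time step are assumptions taken from the commonly quoted "regular spiking" setting, not from this abstract.

```python
def simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0, current=10.0,
                        dt=0.5, duration_ms=1000.0):
    """Integrate one Izhikevich neuron with forward Euler.

    Returns the list of spike times in milliseconds.
    All default parameter values are illustrative assumptions.
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spikes = []
    steps = int(duration_ms / dt)
    for step in range(steps):
        if v >= 30.0:
            # Spike detected: record it, then apply the model's reset rule.
            spikes.append(step * dt)
            v = c
            u += d
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
    return spikes


spike_times = simulate_izhikevich()          # constant drive: tonic spiking
silent = simulate_izhikevich(current=0.0)    # no drive: neuron stays at rest
```

Each update needs only a handful of multiplications and additions, which is consistent with the abstract's point that a simple model requires far fewer resources than a Hodgkin-Huxley formulation.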
A FPGA implementation for linearly unmixing a hyperspectral image using OpenCL
NASA Astrophysics Data System (ADS)
Guerra, Raúl; López, Sebastián.; Sarmiento, Roberto
2017-10-01
Hyperspectral imaging systems provide images in which single pixels have information from across the electromagnetic spectrum of the scene under analysis. These systems divide the spectrum into many contiguous channels, which may even lie outside the visible part of the spectrum. The main advantage of hyperspectral imaging technology is that certain objects leave unique fingerprints in the electromagnetic spectrum, known as spectral signatures, which make it possible to distinguish between different materials that may look the same in a traditional RGB image. Accordingly, the most important hyperspectral imaging applications are related to distinguishing or identifying materials in a particular scene. In hyperspectral imaging applications under real-time constraints, the huge amount of information provided by the hyperspectral sensors has to be rapidly processed and analysed. For this purpose, parallel hardware devices, such as Field Programmable Gate Arrays (FPGAs), are typically used. However, developing hardware applications typically requires expertise in the specific targeted device, as well as in the tools and methodologies which can be used to implement the desired algorithms in the specific device. In this scenario, the Open Computing Language (OpenCL) emerges as a very interesting solution in which a single high-level synthesis design language can be used to efficiently develop applications in multiple and different hardware devices. In this work, the Fast Algorithm for Linearly Unmixing Hyperspectral Images (FUN) has been implemented on a Bitware Stratix V Altera FPGA using OpenCL. The obtained results demonstrate the suitability of OpenCL as a viable design methodology for quickly creating efficient FPGA designs for real-time hyperspectral imaging applications.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubois, David H; Dubois, Andrew J; Boorman, Thomas M
2009-01-01
This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture™ in conjunction with x86 Opteron™ processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Improved On-Chip Measurement of Delay in an FPGA or ASIC
NASA Technical Reports Server (NTRS)
Chen, Yuan; Burke, Gary; Sheldon, Douglas
2007-01-01
An improved design has been devised for on-chip circuitry for measuring the delay through a chain of combinational logic elements in a field-programmable gate array (FPGA) or application-specific integrated circuit (ASIC). In the improved design, the delay chain does not include input and output buffers and is not configured as an oscillator. Instead, the delay chain is made part of the signal chain of an on-chip pulse generator. The duration of the pulse is measured on-chip and taken to equal the delay.
A Reconfigurable Communications System for Small Spacecraft
NASA Technical Reports Server (NTRS)
Chu, Pong P.; Kifle, Muli
2004-01-01
Two trends of NASA missions are the use of multiple small spacecraft and the development of an integrated space network. To achieve these goals, a robust and agile communications system is needed. Advancements in field programmable gate array (FPGA) technology have made it possible to incorporate major communication and network functionalities in FPGA chips; thus this technology has great potential as the basis for a reconfigurable communications system. This report discusses the requirements of future space communications, reviews relevant issues, and proposes a methodology to design and construct a reconfigurable communications system for small scientific spacecraft.
NASA Astrophysics Data System (ADS)
Faerber, Christian
2017-10-01
The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to read out the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collisions for later storage. Designing the architecture of a computing farm that can process this amount of data as efficiently as possible is a challenging task, and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector, more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). For the LHCb upgrade, the use of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is also being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link, which for this platform is a QPI link. On the FPGA an accelerator is implemented. The system used is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Another FPGA accelerator, the Nallatech 385 PCIe accelerator with the same Stratix V FPGA, was also tested for performance.
The results show that the Intel Xeon/FPGA platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community.
A Control System and Streaming DAQ Platform with Image-Based Trigger for X-ray Imaging
NASA Astrophysics Data System (ADS)
Stevanovic, Uros; Caselle, Michele; Cecilia, Angelica; Chilingaryan, Suren; Farago, Tomas; Gasilov, Sergey; Herth, Armin; Kopmann, Andreas; Vogelgesang, Matthias; Balzer, Matthias; Baumbach, Tilo; Weber, Marc
2015-06-01
High-speed X-ray imaging applications play a crucial role for non-destructive investigations of the dynamics in material science and biology. On-line data analysis is necessary for quality assurance and data-driven feedback, leading to a more efficient use of beam time and increased data quality. In this article we present a smart camera platform with embedded Field Programmable Gate Array (FPGA) processing that is able to stream and process data continuously in real-time. The setup consists of a Complementary Metal-Oxide-Semiconductor (CMOS) sensor, an FPGA readout card, and a readout computer. It is seamlessly integrated in a new custom experiment control system called Concert that provides a more efficient way of operating a beamline by integrating device control, experiment process control, and data analysis. The potential of the embedded processing is demonstrated by implementing an image-based trigger. It records the temporal evolution of physical events with increased speed while maintaining the full field of view. The complete data acquisition system, with Concert and the smart camera platform, was successfully integrated and used for fast X-ray imaging experiments at KIT's synchrotron radiation facility ANKA.
A hybrid intelligent controller for a twin rotor MIMO system and its hardware implementation.
Juang, Jih-Gau; Liu, Wen-Kai; Lin, Ren-Wei
2011-10-01
This paper presents a fuzzy PID control scheme with a real-valued genetic algorithm (RGA) to a setpoint control problem. The objective of this paper is to control a twin rotor MIMO system (TRMS) to move quickly and accurately to the desired attitudes, both the pitch angle and the azimuth angle in a cross-coupled condition. A fuzzy compensator is applied to the PID controller. The proposed control structure includes four PID controllers with independent inputs in 2-DOF. In order to reduce total error and control energy, all parameters of the controller are obtained by a RGA with the system performance index as a fitness function. The system performance index utilized the integral of time multiplied by the square error criterion (ITSE) to build a suitable fitness function in the RGA. A new method for RGA to solve more than 10 parameters in the control scheme is investigated. For real-time control, Xilinx Spartan II SP200 FPGA (Field Programmable Gate Array) is employed to construct a hardware-in-the-loop system through writing VHDL on this FPGA. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.
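The ITSE criterion mentioned in the abstract above is the integral of time multiplied by the squared error, ITSE = ∫ t·e(t)² dt. A minimal sketch of how such an index could be evaluated from sampled error data is given below, assuming trapezoidal integration; the function name and the sample values are illustrative and are not taken from the paper.

```python
def itse(times, errors):
    """Approximate ITSE = integral of t * e(t)^2 dt with the trapezoidal rule.

    `times` and `errors` are equal-length sequences of sample instants and
    tracking errors (setpoint minus measured output).
    """
    if len(times) != len(errors):
        raise ValueError("times and errors must have the same length")
    total = 0.0
    for k in range(1, len(times)):
        f_prev = times[k - 1] * errors[k - 1] ** 2
        f_curr = times[k] * errors[k] ** 2
        total += 0.5 * (f_prev + f_curr) * (times[k] - times[k - 1])
    return total


# A constant unit error on [0, 2] gives the integral of t dt, which is 2.
constant_error = itse([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])
# The same error magnitude costs nothing at t = 0 but is penalised later on.
early_error = itse([0.0, 1.0, 2.0], [1.0, 0.0, 0.0])
late_error = itse([0.0, 1.0, 2.0], [0.0, 0.0, 1.0])
```

Because the weight grows with time, a genetic algorithm that minimises this value favours controller parameters that remove the error quickly, even at the price of a larger initial transient.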
Design and construction of a high frame rate imaging system
NASA Astrophysics Data System (ADS)
Wang, Jing; Waugaman, John L.; Liu, Anjun; Lu, Jian-Yu
2002-05-01
A new high frame rate imaging method has been developed recently [Jian-yu Lu, "2D and 3D high frame rate imaging with limited diffraction beams," IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44, 839-856 (1997)]. This method may have a clinical application for imaging of fast moving objects such as human hearts, velocity vector imaging, and low-speckle imaging. To implement the method, an imaging system has been designed. The system consists of one main printed circuit board (PCB) and 16 channel boards (each channel board contains 8 channels), in addition to a set-top box for connections to a personal computer (PC), a front panel board for user control and message display, and a power control and distribution board. The main board contains a field programmable gate array (FPGA) and controls all channels (each channel also has an FPGA). We will report the analog and digital circuit design and simulations, multilayer PCB designs with commercial software (Protel 99), PCB signal integrity testing and system RFI/EMI shielding, and the assembly and construction of the entire system. [Work supported in part by Grant 5RO1 HL60301 from NIH.]
Embedded Streaming Deep Neural Networks Accelerator With Applications.
Dundar, Aysegul; Jin, Jonghoon; Martini, Berin; Culurciello, Eugenio
2017-07-01
Deep convolutional neural networks (DCNNs) have become a very powerful tool in visual perception. DCNNs have applications in autonomous robots, security systems, mobile phones, and automobiles, where high throughput of the feedforward evaluation phase and power efficiency are important. Because of this increased usage, many field-programmable gate array (FPGA)-based accelerators have been proposed. In this paper, we present an optimized streaming method for DCNNs' hardware accelerator on an embedded platform. The streaming method acts as a compiler, transforming a high-level representation of DCNNs into operation codes to execute applications in a hardware accelerator. The proposed method utilizes maximum computational resources available based on a novel-scheduled routing topology that combines data reuse and data concatenation. It is tested with a hardware accelerator implemented on the Xilinx Kintex-7 XC7K325T FPGA. The system fully explores weight-level and node-level parallelizations of DCNNs and achieves a peak performance of 247 G-ops while consuming less than 4 W of power. We test our system with applications on object classification and object detection in real-world scenarios. Our results indicate high-performance efficiency, outperforming all other presented platforms while running these applications.
A flexible FPGA based QDC and TDC for the HADES and the CBM calorimeters
NASA Astrophysics Data System (ADS)
Rost, A.; Galatyuk, T.; Koenig, W.; Michel, J.; Pietraszko, J.; Skott, P.; Traxler, M.
2017-02-01
A Charge-to-Digital-Converter (QDC) and Time-to-Digital-Converter (TDC) based on a commercial FPGA (Field Programmable Gate Array) was developed to read out PMT signals of the planned HADES electromagnetic calorimeter (ECAL) at GSI Helmholtzzentrum für Schwerionenforschung GmbH (Darmstadt, Germany). The main idea is to convert the charge measurement of a detector signal into a time measurement, where the charge is encoded in the width of a digital pulse, while the arrival time information is encoded in the leading edge time of the pulse. The PaDiWa-AMPS prototype front-end board for the TRB3 (General Purpose Trigger and Readout Board—version 3) which implements this conversion method was developed and qualified. The already well established TRB3 platform provides the needed precise time measurements and serves as a data acquisition system. We present the read-out concept and the performance of the prototype boards in laboratory and also under beam conditions. First steps have been completed in order to adapt this concept to SiPM signals of the hadron calorimeter in the CBM experiment at the planned FAIR facility (Darmstadt).
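The conversion idea in the abstract above (charge encoded in the pulse width, arrival time in the leading edge) can be sketched on the read-out side as follows. This is only an illustration: the linear width-to-charge calibration and its two constants are hypothetical, as are the function and field names; the abstract does not state the actual calibration used with the PaDiWa-AMPS board.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    arrival_ns: float   # leading-edge time of the digital pulse
    charge_pc: float    # charge reconstructed from the pulse width


def decode_pulse(leading_ns, trailing_ns, gain_pc_per_ns=0.5, offset_ns=2.0):
    """Turn one pair of TDC edge times into an arrival time and a charge.

    Assumes (hypothetically) a linear calibration:
        charge = gain_pc_per_ns * (width - offset_ns), clipped at zero.
    """
    width_ns = trailing_ns - leading_ns
    if width_ns <= 0.0:
        raise ValueError("trailing edge must come after the leading edge")
    charge = max(0.0, (width_ns - offset_ns) * gain_pc_per_ns)
    return Hit(arrival_ns=leading_ns, charge_pc=charge)


hit = decode_pulse(leading_ns=100.0, trailing_ns=112.0)
```

With this scheme a single TDC channel delivers both quantities, which is why precise time measurement on the TRB3 platform is sufficient for the charge measurement as well.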
Soft-core processor study for node-based architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Houten, Jonathan Roger; Jarosz, Jason P.; Welch, Benjamin James
2008-09-01
Node-based architecture (NBA) designs for future satellite projects hold the promise of decreasing system development time and costs, size, weight, and power and positioning the laboratory to address other emerging mission opportunities quickly. Reconfigurable Field Programmable Gate Array (FPGA) based modules will comprise the core of several of the NBA nodes. Microprocessing capabilities will be necessary with varying degrees of mission-specific performance requirements on these nodes. To enable the flexibility of these reconfigurable nodes, it is advantageous to incorporate the microprocessor into the FPGA itself, either as a hard-core processor built into the FPGA or as a soft-core processor built out of FPGA elements. This document describes the evaluation of three reconfigurable FPGA based processors for use in future NBA systems: two soft cores (MicroBlaze and non-fault-tolerant LEON) and one hard core (PowerPC 405). Two standard performance benchmark applications were developed for each processor. The first, Dhrystone, is a fixed-point operation metric. The second, Whetstone, is a floating-point operation metric. Several trials were run at varying code locations, loop counts, processor speeds, and cache configurations. FPGA resource utilization was recorded for each configuration. Cache configurations impacted the results greatly; for optimal processor efficiency it is necessary to enable caches on the processors. Processor caches carry a penalty; cache error mitigation is necessary when operating in a radiation environment.
Low-cost and high-speed optical mark reader based on an intelligent line camera
NASA Astrophysics Data System (ADS)
Hussmann, Stephan; Chan, Leona; Fung, Celine; Albrecht, Martin
2003-08-01
Optical Mark Recognition (OMR) is thoroughly reliable and highly efficient provided that high standards are maintained at both the planning and implementation stages. It is necessary to ensure that OMR forms are designed with due attention to data integrity checks, that the best use is made of features built into the OMR, that the integrity of the data used is checked before the data is processed, and that data is validated before it is processed. This paper describes the design and implementation of an OMR prototype system for marking multiple-choice tests automatically. Parameter testing is carried out before the platform and the multiple-choice answer sheet are designed. Position recognition and position verification methods have been developed and implemented in an intelligent line scan camera. The position recognition process is implemented in a Field Programmable Gate Array (FPGA), whereas the verification process is implemented in a micro-controller. The verified results are then sent to the Graphical User Interface (GUI) for answer checking and statistical analysis. At the end of the paper the proposed OMR system is compared with commercially available systems.
FPGA based hardware optimized implementation of signal processing system for LFM pulsed radar
NASA Astrophysics Data System (ADS)
Azim, Noor ul; Jun, Wang
2016-11-01
Signal processing is one of the main parts of any radar system. Different signal processing algorithms are used to extract information about parameters such as the range, speed and direction of a target in the field of radar communication. This paper presents LFM (Linear Frequency Modulation) pulsed radar signal processing algorithms which are used to improve target detection and range resolution and to estimate the speed of a target. Firstly, these algorithms are simulated in MATLAB to verify the concept and theory. After the conceptual verification in MATLAB, the simulation is converted into a hardware implementation on a Xilinx FPGA. The chosen FPGA is a Xilinx Virtex-6 (XC6VLX75T). For the hardware implementation, pipeline optimization is adopted and other factors are also considered to optimize resource use. The algorithms this work focuses on for improving target detection, range resolution and speed estimation are hardware-optimized, fast-convolution-based pulse compression and pulse Doppler processing.
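Fast-convolution pulse compression, as named in the abstract above, multiplies the spectrum of the received signal by the conjugate spectrum of the transmitted chirp and transforms back. The sketch below illustrates the principle in plain Python with a small recursive radix-2 FFT. It is not the authors' pipelined hardware design, and the chirp length, bandwidth, sample rate and echo delay are assumed values chosen only for the example.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def lfm_chirp(n_samples, bandwidth, sample_rate):
    """Complex baseband LFM pulse sweeping from -bandwidth/2 to +bandwidth/2."""
    duration = n_samples / sample_rate
    rate = bandwidth / duration                    # chirp rate in Hz/s
    return [cmath.exp(1j * math.pi * rate * (i / sample_rate - duration / 2) ** 2)
            for i in range(n_samples)]


def pulse_compress(received, reference):
    """Matched filtering in the frequency domain (fast convolution)."""
    n = 1
    while n < len(received) + len(reference) - 1:  # pad to avoid wrap-around
        n *= 2
    rx = fft(list(received) + [0j] * (n - len(received)))
    ref = fft(list(reference) + [0j] * (n - len(reference)))
    product = [a * b.conjugate() for a, b in zip(rx, ref)]
    return [v / n for v in fft(product, inverse=True)]


chirp = lfm_chirp(64, bandwidth=20e6, sample_rate=50e6)
delay = 100                                        # echo delay in samples
received = [0j] * 256
for i, s in enumerate(chirp):
    received[delay + i] = s
compressed = pulse_compress(received, chirp)
peak_index = max(range(len(compressed)), key=lambda m: abs(compressed[m]))
```

The compressed output peaks at the sample corresponding to the echo delay, with a height equal to the pulse energy; the 64-sample pulse is thereby concentrated into a narrow peak, which is what improves range resolution.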
Windowing technique in FM radar realized by FPGA for better target resolution
NASA Astrophysics Data System (ADS)
Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique; Kravchenko, Victor F.
2006-09-01
Remote sensing systems such as SAR usually apply FM signals to resolve closely spaced targets (objects) and improve SNR. The main drawback of pulse compression of an FM radar signal is that it can add range side-lobes to reflectivity measurements. Weighting-window processing in the time domain can significantly decrease the side-lobe level (SLL) of the output radar signal, which makes it possible to resolve small or low-power targets that are masked by powerful ones. Classical windows such as Hamming, Hanning, Blackman-Harris, Kaiser-Bessel, Dolph-Chebyshev, Gauss, etc. are usually used in window processing. In addition to the classical ones, here we also use a novel class of windows based on atomic function (AF) theory. To compare simulation and experimental results we applied standard parameters such as the coefficient of amplification, maximum side-lobe level, width of the main lobe, etc. In this paper we also propose implementing the compression-windowing model at the hardware level on a Field Programmable Gate Array (FPGA), which offers benefits such as instantaneous implementation, dynamic reconfiguration, design, and field programmability. The pulse compression design on FPGA was investigated with classical and novel window techniques to reduce the SLL in the absence and presence of noise. The paper presents simulated and experimental examples of detection of small or closely spaced targets in the imaging radar. The paper also presents experimental hardware results of windowing in FM radar, demonstrating resolution of several targets for the classical rectangular, Hamming, and Kaiser-Bessel windows and some novel ones: Up(x), fup 4(x)•D 3(x), fup 6(x)•G 3(x), etc. We conclude that windows based on AFs reduce the SLL more than classical windows, both with and without noise and at increasing distance from the main lobe.
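The effect of a weighting window on the side-lobe level can be illustrated with a small sketch that compares the rectangular window with the classical Hamming window. The atomic-function windows studied in the paper are not reproduced here; the window length, the frequency grid and the helper names are assumptions made for the example.

```python
import cmath
import math


def hamming(n):
    """Classical symmetric Hamming window with n taps."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def peak_sidelobe_db(window, n_freq=1024):
    """Peak side-lobe level of the window spectrum in dB relative to the main lobe."""
    mags = []
    for k in range(n_freq // 2):
        s = sum(w * cmath.exp(-2j * math.pi * k * i / n_freq)
                for i, w in enumerate(window))
        mags.append(abs(s))
    # Walk down the main lobe until the magnitude starts to rise again.
    k = 1
    while k < len(mags) - 1 and mags[k] <= mags[k - 1]:
        k += 1
    return 20.0 * math.log10(max(mags[k:]) / mags[0])


n_taps = 32
rect_db = peak_sidelobe_db([1.0] * n_taps)       # about -13 dB
hamming_db = peak_sidelobe_db(hamming(n_taps))   # far lower side lobes
```

The lower side lobes come at the cost of a wider main lobe, which is why the abstract lists both the maximum side-lobe level and the main-lobe width among the comparison parameters.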
A hybrid short read mapping accelerator
2013-01-01
Background The rapid growth of short read datasets poses a new challenge to the short read mapping problem in terms of sensitivity and execution speed. Existing methods often use a restrictive error model for computing the alignments to improve speed, whereas more flexible error models are generally too slow for large-scale applications. A number of short read mapping software tools have been proposed. However, designs based on hardware are relatively rare. Field programmable gate arrays (FPGAs) have been successfully used in a number of specific application areas, such as the DSP and communications domains due to their outstanding parallel data processing capabilities, making them a competitive platform to solve problems that are “inherently parallel”. Results We present a hybrid system for short read mapping utilizing both FPGA-based hardware and CPU-based software. The computation intensive alignment and the seed generation operations are mapped onto an FPGA. We present a computationally efficient, parallel block-wise alignment structure (Align Core) to approximate the conventional dynamic programming algorithm. The performance is compared to the multi-threaded CPU-based GASSST and BWA software implementations. For single-end alignment, our hybrid system achieves faster processing speed than GASSST (with a similar sensitivity) and BWA (with a higher sensitivity); for pair-end alignment, our design achieves a slightly worse sensitivity than that of BWA but has a higher processing speed. Conclusions This paper shows that our hybrid system can effectively accelerate the mapping of short reads to a reference genome based on the seed-and-extend approach. The performance comparison to the GASSST and BWA software implementations under different conditions shows that our hybrid design achieves a high degree of sensitivity and requires less overall execution time with only modest FPGA resource utilization. 
Our hybrid system design also shows that the performance bottleneck for the short read mapping problem can be changed from the alignment stage to the seed generation stage, which provides an additional requirement for the future development of short read aligners. PMID:23441908
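The seed-and-extend approach referred to in the abstract above can be sketched in a few lines. This is a deliberately simplified software illustration, not the paper's Align Core: seeds are exact k-mers looked up in a hash index, and the extension step merely counts mismatches without gaps instead of approximating dynamic programming. The reference string, the read and all names are made up for the example.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k, max_mismatches=2):
    """Return (start, mismatches) of the best gap-free placement, or None."""
    best = None
    seen = set()
    for offset in range(len(read) - k + 1):
        # Seed generation: exact k-mer lookup.
        for hit in index.get(read[offset:offset + k], ()):
            start = hit - offset
            if start < 0 or start + len(read) > len(reference) or start in seen:
                continue
            seen.add(start)
            # Extension: compare the whole read against the candidate location.
            segment = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, segment))
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best


reference = "ACGTTGCAAGGCTTACCGATAGGCTA"
seed_index = build_seed_index(reference, k=4)
read = reference[8:13] + "G" + reference[14:20]   # one substituted base
placement = map_read(read, reference, seed_index, k=4)
exact = map_read(reference[8:20], reference, seed_index, k=4)
```

Seed lookups are cheap while every candidate location triggers an extension, so accelerating the extension step shifts the bottleneck toward seed generation, in line with the abstract's closing observation.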
An FPGA-based reconfigurable DDC algorithm
NASA Astrophysics Data System (ADS)
Juszczyk, B.; Kasprowicz, G.
2016-09-01
This paper describes the implementation of a reconfigurable digital down converter in an FPGA structure. The system is designed to work with quadrature signals. One of the main criteria of the project was to provide a wide range of reconfiguration in order to cover a wide range of applications. Potential applications include software defined radio receivers, passive noise radars and measurement data compression. This document contains a general system overview, a short description of the hardware used in the project and the gateware implementation.
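A digital down converter for quadrature (complex) input typically mixes the signal with a numerically controlled oscillator, low-pass filters it and reduces the sample rate. The sketch below shows that signal flow in software only; it is not the gateware from the paper. The boxcar (integrate-and-dump) filter is an assumed stand-in for whatever filter chain the authors use, and the frequencies and decimation factor are example values.

```python
import cmath
import math


def digital_down_convert(samples, center_freq, sample_rate, decimation):
    """Shift `center_freq` to 0 Hz, average blocks of samples, and decimate.

    samples     : complex (I + jQ) input samples
    center_freq : frequency to move to baseband, in Hz
    decimation  : number of input samples per output sample
    """
    # Mixer: multiply by a complex oscillator running at -center_freq.
    mixed = [s * cmath.exp(-2j * math.pi * center_freq * n / sample_rate)
             for n, s in enumerate(samples)]
    # Boxcar low-pass filter combined with decimation (integrate and dump).
    out = []
    for start in range(0, len(mixed) - decimation + 1, decimation):
        block = mixed[start:start + decimation]
        out.append(sum(block) / decimation)
    return out


fs = 1_000_000.0
n_samples = 800
in_band = [cmath.exp(2j * math.pi * 250_000.0 * i / fs) for i in range(n_samples)]
out_band = [cmath.exp(2j * math.pi * 375_000.0 * i / fs) for i in range(n_samples)]
kept = digital_down_convert(in_band, 250_000.0, fs, decimation=8)
rejected = digital_down_convert(out_band, 250_000.0, fs, decimation=8)
```

A tone at the selected centre frequency comes out as a constant, while a tone offset by exactly the output sample rate falls on a null of the boxcar filter and is suppressed. Making the centre frequency and decimation factor run-time parameters is one simple form of the reconfigurability the abstract describes.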
Novel intelligent real-time position tracking system using FPGA and fuzzy logic.
Soares dos Santos, Marco P; Ferreira, J A F
2014-03-01
The main aim of this paper is to test if FPGAs are able to achieve better position tracking performance than software-based soft real-time platforms. For comparison purposes, the same controller design was implemented in these architectures. A Multi-state Fuzzy Logic controller (FLC) was implemented both in a Xilinx® Virtex-II FPGA (XC2v1000) and in a soft real-time platform NI CompactRIO®-9002. The same sampling time was used. The comparative tests were conducted using a servo-pneumatic actuation system. Steady-state errors lower than 4 μm were reached for an arbitrary vertical positioning of a 6.2 kg mass when the controller was embedded into the FPGA platform. Performance gains up to 16 times in the steady-state error, up to 27 times in the overshoot and up to 19.5 times in the settling time were achieved by using the FPGA-based controller over the software-based FLC controller. © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Berdychowski, Piotr P.; Zabolotny, Wojciech M.
2010-09-01
The main goal of the C to VHDL compiler project is to make the FPGA platform more accessible for scientists and software developers. The FPGA platform offers the unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide a sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for an efficient, heterogeneous computing environment. The current industry standard in development of digital systems on the FPGA platform is based on HDLs. Although very effective and expressive in the hands of hardware development specialists, these languages require specific knowledge and experience, out of reach for most scientists and software programmers. The C to VHDL compiler project attempts to remedy that by creating an application that derives an initial VHDL description of a digital system (for further compilation and synthesis) from a purely algorithmic description in the C programming language. This idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many previous years with some new, unique improvements.
NASA Astrophysics Data System (ADS)
Alqasemi, Umar; Li, Hai; Aguirre, Andres; Zhu, Quing
2011-03-01
Co-registering ultrasound (US) and photoacoustic (PA) imaging is a logical extension to conventional ultrasound because both modalities provide complementary information of tumor morphology, tumor vasculature and hypoxia for cancer detection and characterization. In addition, both modalities are capable of providing real-time images for clinical applications. In this paper, a Field Programmable Gate Array (FPGA) and Digital Signal Processor (DSP) module-based real-time US/PA imaging system is presented. The system provides real-time US/PA data acquisition and image display for up to 5 fps* using the currently implemented DSP board. It can be upgraded to 15 fps, which is the maximum pulse repetition rate of the used laser, by implementing an advanced DSP module. Additionally, the photoacoustic RF data for each frame is saved for further off-line processing. The system frontend consists of eight 16-channel modules made of commercial and customized circuits. Each 16-channel module consists of two commercial 8-channel receiving circuitry boards and one FPGA board from Analog Devices. Each receiving board contains an IC† that combines. 8-channel low-noise amplifiers, variable-gain amplifiers, anti-aliasing filters, and ADC's‡ in a single chip with sampling frequency of 40MHz. The FPGA board captures the LVDSξ Double Data Rate (DDR) digital output of the receiving board and performs data conditioning and subbeamforming. A customized 16-channel transmission circuitry is connected to the two receiving boards for US pulseecho (PE) mode data acquisition. A DSP module uses External Memory Interface (EMIF) to interface with the eight 16-channel modules through a customized adaptor board. The DSP transfers either sub-beamformed data (US pulse-echo mode or PAI imaging mode) or raw data from FPGA boards to its DDR-2 memory through the EMIF link, then it performs additional processing, after that, it transfer the data to the PC** for further image processing. 
The PC code performs image processing including demodulation, beam envelope detection and scan conversion. Additionally, the PC code pre-calculates the delay coefficients used for transmission focusing and receiving dynamic focusing for different types of transducers to speed up the imaging process. To further speed up the imaging process, a multi-threading technique is implemented to allow formation of the previous image frame and acquisition of the next one simultaneously. The system is also capable of semi-real-time automated SO2 imaging at 10 seconds per frame by changing the wavelength knob of the laser automatically using a stepper motor controlled by the system. Initial in vivo experiments were performed on animal tumors to map out their vasculature and hypoxia level, which were superimposed on co-registered US images. The real-time system allows capturing co-registered US/PA images free of motion artifacts and also provides dynamic information when contrast agents are used.
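The pre-calculation of receive dynamic-focusing delays mentioned above can be illustrated with a minimal Python sketch. It is not the authors' PC code: the speed of sound, the array geometry, the on-axis focus, and the function name `receive_delay_table` are all assumptions made for illustration. Only the 40 MHz sampling frequency is taken from the abstract.

```python
import math

SPEED_OF_SOUND_M_S = 1540.0   # assumed soft-tissue value, not from the abstract
SAMPLING_RATE_HZ = 40e6       # the 40 MHz ADC rate quoted in the abstract


def receive_delay_table(element_x_m, depths_m):
    """Return delays[d][e] in samples for a focus on the array axis (x = 0).

    Each delay is the extra path from the focal point back to element e,
    relative to the element closest to the axis, rounded to whole samples.
    """
    table = []
    for z in depths_m:
        paths = [math.hypot(x, z) for x in element_x_m]
        shortest = min(paths)
        table.append([
            round((p - shortest) / SPEED_OF_SOUND_M_S * SAMPLING_RATE_HZ)
            for p in paths
        ])
    return table


# Hypothetical 16-element linear array with 0.3 mm pitch, centred on x = 0.
pitch = 0.3e-3
elements = [(i - 7.5) * pitch for i in range(16)]
delays = receive_delay_table(elements, depths_m=[10e-3, 20e-3, 40e-3])
```

Computing the table once per transducer type, rather than per frame, is the kind of saving the abstract attributes to pre-calculation. The outer elements need larger delays at shallow depths than at deep ones, which is why the focus is called dynamic.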
SpaceCube 2.0: An Advanced Hybrid Onboard Data Processor
NASA Technical Reports Server (NTRS)
Lin, Michael; Flatley, Thomas; Godfrey, John; Geist, Alessandro; Espinosa, Daniel; Petrick, David
2011-01-01
The SpaceCube 2.0 is a compact, high performance, low-power onboard processing system that takes advantage of cutting-edge hybrid (CPU/FPGA/DSP) processing elements. The SpaceCube 2.0 design concept includes two commercial Virtex-5 field-programmable gate array (FPGA) parts protected by "radiation hardened by software" technology, and possesses exceptional size, weight, and power characteristics [5x5x7 in., 3.5 lb (approximately equal to 12.7 x 12.7 x 17.8 cm, 1.6 kg), 5-25 W, depending on the application's required clock rate]. The two Virtex-5 FPGA parts are implemented in a unique back-to-back configuration to maximize data transfer and computing performance. Draft computing power specifications for the SpaceCube 2.0 unit include four PowerPC 440s (1100 DMIPS each), 500+ DSP48Es (2x580 GMACS), 100+ LVDS high-speed serial I/Os (1.25 Gbps each), and 2x190 GFLOPS single-precision (65 GFLOPS double-precision) floating point performance. The SpaceCube 2.0 includes PROM memory for CPU boot, health and safety, and basic command and telemetry functionality; RAM memory for program execution; and FLASH/EEPROM memory to store algorithms and application code for the CPU, FPGA, and DSP processing elements. Program execution can be reconfigured in real time and algorithms can be updated, modified, and/or replaced at any point during the mission. Gigabit Ethernet, SpaceWire, SATA and high-speed LVDS serial/parallel I/O channels are available for instrument/sensor data ingest, and mission-unique instrument interfaces can be accommodated using a compact PCI (cPCI) expansion card interface. The SpaceCube 2.0 can be utilized in NASA Earth Science, Helio/Astrophysics and Exploration missions, and Department of Defense satellites for onboard data processing. It can also be used in commercial communication and mapping satellites.
Scintillation-Hardened GPS Receiver
NASA Technical Reports Server (NTRS)
Stephens, Donald R.
2015-01-01
CommLargo, Inc., has developed a scintillation-hardened Global Positioning System (GPS) receiver that improves reliability for low-orbit missions and complies with NASA's Space Telecommunications Radio System (STRS) architecture standards. A software-defined radio (SDR) implementation allows a single hardware element to function as either a conventional radio or as a GPS receiver, providing backup and redundancy for platforms such as the International Space Station (ISS) and high-value remote sensing platforms. The innovation's flexible SDR implementation reduces cost, weight, and power requirements. Scintillation hardening improves mission reliability and variability. In Phase I, CommLargo refactored an open-source GPS software package with Kalman filter-based tracking loops to improve performance during scintillation and also demonstrated improved navigation during a geomagnetic storm. In Phase II, the company generated a new field-programmable gate array (FPGA)-based GPS waveform to demonstrate on NASA's Space Communication and Navigation (SCaN) test bed.
Development of a front end controller/heap manager for PHENIX
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ericson, M.N.; Allen, M.D.; Musrock, M.S.
1996-12-31
A controller/heap manager has been designed for applicability to all detector subsystem types of PHENIX. The heap manager performs all functions associated with front end electronics control including ADC and analog memory control, data collection, command interpretation and execution, and data packet forming and communication. Interfaces to the unit consist of a timing and control bus, a serial bus, a parallel data bus, and a trigger interface. The topology developed is modular so that many functional blocks are identical for a number of subsystem types. Programmability is maximized through the use of flexible modular functions and implementation using field programmable gate arrays (FPGAs). Details of unit design and functionality will be discussed with particular detail given to subsystems having analog memory-based front end electronics. In addition, mode control, serial functions, and FPGA implementation details will be presented.
On-line remote monitoring of radioactive waste repositories
NASA Astrophysics Data System (ADS)
Calì, Claudio; Cosentino, Luigi; Litrico, Pietro; Pappalardo, Alfio; Scirè, Carlotta; Scirè, Sergio; Vecchio, Gianfranco; Finocchiaro, Paolo; Alfieri, Severino; Mariani, Annamaria
2014-12-01
A low-cost array of modular sensors for online monitoring of radioactive waste was developed at INFN-LNS. We implemented a new kind of gamma counter, based on Silicon PhotoMultipliers and scintillating fibers, that behaves like a cheap scintillating Geiger-Muller counter. It can be placed in the shape of a fine grid around each single waste drum in a repository. Front-end electronics and an FPGA-based counting system were developed to handle the field data, also implementing data transmission, a graphical user interface and a data storage system. A test of four sensors in a real radwaste storage site was performed with promising results. Following the tests, an agreement was signed between INFN and Sogin for the joint development and installation of a prototype DMNR (Detector Mesh for Nuclear Repository) system inside the Garigliano radwaste repository in Sessa Aurunca (CE, Italy). This development is currently under way, with the installation foreseen within 2014.
Three-phase Four-leg Inverter LabVIEW FPGA Control Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The use of cRIO and sbRIO for power electronics control has developed over the last few years to include control of three-phase inverters. Most three-phase inverter topologies include three switching legs. The addition of a fourth leg to natively generate the neutral connection allows the inverter to serve single-phase loads in a microgrid or stand-alone power system and to balance the three-phase voltages in the presence of significant load imbalance. However, the control of a four-leg inverter is much more complex. In particular, instead of standard two-dimensional space vector modulation (SVM), the inverter requires three-dimensional space vector modulation (3D-SVM). The candidate software implements complete control algorithms in LabVIEW FPGA for a three-phase four-leg inverter. The software includes feedback control loops, three-dimensional space vector modulation gate-drive algorithms, advanced alarm handling capabilities, contactor control, power measurements, and debugging and tuning tools. The feedback control loops allow inverter operation in AC voltage control, AC current control, or DC bus voltage control modes based on external mode selection by a user or supervisory controller. The software includes the ability to synchronize its AC output to the grid or other voltage source before connection.
The software also includes provisions to allow inverter operation in parallel with other voltage regulating devices on the AC or DC buses. This flexibility allows the inverter to operate as a stand-alone voltage source, connected to the grid, or in parallel with other controllable voltage sources as part of a microgrid or remote power system. In addition, as the inverter is expected to operate under severely unbalanced conditions, the software includes algorithms to accurately compute real and reactive power for each phase based on definitions provided in IEEE Standard 1459: IEEE Standard Definitions for the Measurement of Electric Power Quantities Under Sinusoidal, Nonsinusoidal, Balanced, or Unbalanced Conditions. Finally, the software includes code to output analog signals for debugging and for tuning of control loops. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, user-settable switching frequencies and synchronized control loop update rates of tens of kHz, and a reference waveform generation, including Phase Lock Loop (PLL), update rate of 100 kHz.
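The per-phase power computation mentioned above can be illustrated with a small Python sketch. It is not the LabVIEW FPGA code, whose algorithms the abstract does not describe. It uses the single-phase quantities commonly associated with IEEE Standard 1459: active power P as the mean of instantaneous power, apparent power S as the product of rms voltage and rms current, and nonactive power N = sqrt(S^2 - P^2). The function name `phase_power` and the test waveform are assumptions made for illustration.

```python
import math


def phase_power(v_samples, i_samples):
    """Return (P, S, N) for one phase from samples spanning whole cycles.

    P = mean(v*i) is active power, S = Vrms*Irms is apparent power, and
    N = sqrt(S^2 - P^2) is nonactive power, which equals the magnitude of
    reactive power only when both waveforms are purely sinusoidal.
    """
    n = len(v_samples)
    p = sum(v * i for v, i in zip(v_samples, i_samples)) / n
    v_rms = math.sqrt(sum(v * v for v in v_samples) / n)
    i_rms = math.sqrt(sum(i * i for i in i_samples) / n)
    s = v_rms * i_rms
    nonactive = math.sqrt(max(s * s - p * p, 0.0))
    return p, s, nonactive


# Hypothetical test waveform: 120 V rms, 10 A rms, current lagging by 30 degrees.
N_SAMPLES = 1000
lag = math.radians(30)
v = [120 * math.sqrt(2) * math.sin(2 * math.pi * k / N_SAMPLES)
     for k in range(N_SAMPLES)]
i = [10 * math.sqrt(2) * math.sin(2 * math.pi * k / N_SAMPLES - lag)
     for k in range(N_SAMPLES)]
p, s, n = phase_power(v, i)
```

Evaluating each phase separately, rather than assuming a balanced three-phase set, is what makes this style of calculation usable under the unbalanced loading the abstract describes.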
FPGA-Based Reconfigurable Processor for Ultrafast Interlaced Ultrasound and Photoacoustic Imaging
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2016-01-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models. PMID:22828830
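The delay-and-sum step that the abstract assigns to the FPGA can be sketched in a few lines of Python. This is only an illustration of the general technique with integer sample delays. The function name `delay_and_sum`, the three-channel test record, and the zero-padding at the end of each record are assumptions, not details of the authors' hardware design.

```python
def delay_and_sum(channels, delays):
    """Sum channel sample lists after shifting each by its integer delay.

    channels: list of equal-length sample lists, one per receive channel.
    delays:   non-negative integer delay (in samples) per channel.
    Output sample n is the sum over channels of channel[n + delay];
    samples that would fall past the end of a record are treated as zero.
    """
    length = len(channels[0])
    out = [0] * length
    for samples, d in zip(channels, delays):
        for n in range(length - d):
            out[n] += samples[n + d]
    return out


# Hypothetical 3-channel record: the same echo arrives 0, 1 and 2 samples late.
channels = [
    [0, 0, 5, 0, 0, 0],
    [0, 0, 0, 5, 0, 0],
    [0, 0, 0, 0, 5, 0],
]
beam = delay_and_sum(channels, delays=[0, 1, 2])
```

With the delays matched to the arrival times, the three echoes add coherently into a single output sample. Because each channel contributes independently to the sum, the per-channel work can be carried out in parallel, which fits the parallel memory access described in the abstract.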
FPGA-based reconfigurable processor for ultrafast interlaced ultrasound and photoacoustic imaging.
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2012-07-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models.
Berger, Andrew J; Page, Michael R; Jacob, Jan; Young, Justin R; Lewis, Jim; Wenzel, Lothar; Bhallamudi, Vidya P; Johnston-Halperin, Ezekiel; Pelekhov, Denis V; Hammel, P Chris
2014-12-01
Understanding the complex properties of electronic and spintronic devices at the micro- and nano-scale is a topic of intense current interest as it becomes increasingly important for scientific progress and technological applications. In operando characterization of such devices by scanning probe techniques is particularly well-suited for the microscopic study of these properties. We have developed a scanning probe microscope (SPM) which is capable of both standard force imaging (atomic, magnetic, electrostatic) and simultaneous electrical transport measurements. We utilize flexible and inexpensive FPGA (field-programmable gate array) hardware and a custom software framework developed in National Instruments' LabVIEW environment to perform the various aspects of microscope operation and device measurement. The FPGA-based approach enables sensitive, real-time cantilever frequency-shift detection. Using this system, we demonstrate electrostatic force microscopy of an electrically biased graphene field-effect transistor device. The combination of SPM and electrical transport also enables imaging of the transport response to a localized perturbation provided by the scanned cantilever tip. Facilitated by the broad presence of LabVIEW in the experimental sciences and the openness of our software solution, our system permits a wide variety of combined scanning and transport measurements by providing standardized interfaces and flexible access to all aspects of a measurement (input and output signals, and processed data). Our system also enables precise control of timing (synchronization of scanning and transport operations) and implementation of sophisticated feedback protocols, and thus should be broadly interesting and useful to practitioners in the field.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berger, Andrew J., E-mail: berger.156@osu.edu; Page, Michael R.; Young, Justin R.
Understanding the complex properties of electronic and spintronic devices at the micro- and nano-scale is a topic of intense current interest as it becomes increasingly important for scientific progress and technological applications. In operando characterization of such devices by scanning probe techniques is particularly well-suited for the microscopic study of these properties. We have developed a scanning probe microscope (SPM) which is capable of both standard force imaging (atomic, magnetic, electrostatic) and simultaneous electrical transport measurements. We utilize flexible and inexpensive FPGA (field-programmable gate array) hardware and a custom software framework developed in National Instruments' LabVIEW environment to perform the various aspects of microscope operation and device measurement. The FPGA-based approach enables sensitive, real-time cantilever frequency-shift detection. Using this system, we demonstrate electrostatic force microscopy of an electrically biased graphene field-effect transistor device. The combination of SPM and electrical transport also enables imaging of the transport response to a localized perturbation provided by the scanned cantilever tip. Facilitated by the broad presence of LabVIEW in the experimental sciences and the openness of our software solution, our system permits a wide variety of combined scanning and transport measurements by providing standardized interfaces and flexible access to all aspects of a measurement (input and output signals, and processed data). Our system also enables precise control of timing (synchronization of scanning and transport operations) and implementation of sophisticated feedback protocols, and thus should be broadly interesting and useful to practitioners in the field.