high-speed fpga implementation: Topics by Science.gov

Sample records for high-speed fpga implementation

An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed

NASA Astrophysics Data System (ADS)

Kim, H.; Choi, Y.; Yang, Y.

In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of Shack-Hartmann sensor (SHS) and deformable mirror (DM), tip-tilt sensor (TTS), tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM 277 actuators with Fried geometry, requiring high speed parallel computing capability SPS. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instrument's (NI's) real time computer and five FPGA boards based on state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabView environment, providing flexibility when applying different algorithms for WFE correction. It also facilitates faster programming and debugging environment as compared to conventional ones. One of the five FPGA's is assigned to measure TTS and calculate control signals for TTM, while the rest four are used to receive SHS signal, calculate slops for each subaperture and correction signal for DM. With this parallel processing capabilities of the SPS the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
FPGA Implementation of Stereo Disparity with High Throughput for Mobility Applications

NASA Technical Reports Server (NTRS)

Villalpando, Carlos Y.; Morfopolous, Arin; Matthies, Larry; Goldberg, Steven

2011-01-01

High speed stereo vision can allow unmanned robotic systems to navigate safely in unstructured terrain, but the computational cost can exceed the capacity of typical embedded CPUs. In this paper, we describe an end-to-end stereo computation co-processing system optimized for fast throughput that has been implemented on a single Virtex 4 LX160 FPGA. This system is capable of operating on images from a 1024 x 768 3CCD (true RGB) camera pair at 15 Hz. Data enters the FPGA directly from the cameras via Camera Link and is rectified, pre-filtered and converted into a disparity image all within the FPGA, incurring no CPU load. Once complete, a rectified image and the final disparity image are read out over the PCI bus, for a bandwidth cost of 68 MB/sec. Within the FPGA there are 4 distinct algorithms: Camera Link capture, Bilinear rectification, Bilateral subtraction pre-filtering and the Sum of Absolute Difference (SAD) disparity. Each module will be described in brief along with the data flow and control logic for the system. The system has been successfully fielded upon the Carnegie Mellon University's National Robotics Engineering Center (NREC) Crusher system during extensive field trials in 2007 and 2008 and is being implemented for other surface mobility systems at JPL.
Implementation of High Speed Distributed Data Acquisition System

NASA Astrophysics Data System (ADS)

Raju, Anju P.; Sekhar, Ambika

2012-09-01

This paper introduces a high speed distributed data acquisition system based on a field programmable gate array (FPGA). The aim is to develop a "distributed" data acquisition interface. The development of instruments such as personal computers and engineering workstations based on "standard" platforms is the motivation behind this effort. Using standard platforms as the controlling unit allows independence in hardware from a particular vendor and hardware platform. The distributed approach also has advantages from a functional point of view: acquisition resources become available to multiple instruments; the acquisition front-end can be physically remote from the rest of the instrument. High speed data acquisition system transmits data faster to a remote computer system through Ethernet interface. The data is acquired through 16 analog input channels. The input data commands are multiplexed and digitized and then the data is stored in 1K buffer for each input channel. The main control unit in this design is the 16 bit processor implemented in the FPGA. This 16 bit processor is used to set up and initialize the data source and the Ethernet controller, as well as control the flow of data from the memory element to the NIC. Using this processor we can initialize and control the different configuration registers in the Ethernet controller in a easy manner. Then these data packets are sending to the remote PC through the Ethernet interface. The main advantages of the using FPGA as standard platform are its flexibility, low power consumption, short design duration, fast time to market, programmability and high density. The main advantages of using Ethernet controller AX88796 over others are its non PCI interface, the presence of embedded SRAM where transmit and reception buffers are located and high-performance SRAM-like interface. The paper introduces the implementation of the distributed data acquisition using FPGA by VHDL. The main advantages of this system are high
A CMOS high speed imaging system design based on FPGA

NASA Astrophysics Data System (ADS)

Tang, Hong; Wang, Huawei; Cao, Jianzhong; Qiao, Mingrui

2015-10-01

CMOS sensors have more advantages than traditional CCD sensors. The imaging system based on CMOS has become a hot spot in research and development. In order to achieve the real-time data acquisition and high-speed transmission, we design a high-speed CMOS imaging system on account of FPGA. The core control chip of this system is XC6SL75T and we take advantages of CameraLink interface and AM41V4 CMOS image sensors to transmit and acquire image data. AM41V4 is a 4 Megapixel High speed 500 frames per second CMOS image sensor with global shutter and 4/3" optical format. The sensor uses column parallel A/D converters to digitize the images. The CameraLink interface adopts DS90CR287 and it can convert 28 bits of LVCMOS/LVTTL data into four LVDS data stream. The reflected light of objects is photographed by the CMOS detectors. CMOS sensors convert the light to electronic signals and then send them to FPGA. FPGA processes data it received and transmits them to upper computer which has acquisition cards through CameraLink interface configured as full models. Then PC will store, visualize and process images later. The structure and principle of the system are both explained in this paper and this paper introduces the hardware and software design of the system. FPGA introduces the driven clock of CMOS. The data in CMOS is converted to LVDS signals and then transmitted to the data acquisition cards. After simulation, the paper presents a row transfer timing sequence of CMOS. The system realized real-time image acquisition and external controls.
VIRTEX-5 Fpga Implementation of Advanced Encryption Standard Algorithm

NASA Astrophysics Data System (ADS)

Rais, Muhammad H.; Qasim, Syed M.

2010-06-01

In this paper, we present an implementation of Advanced Encryption Standard (AES) cryptographic algorithm using state-of-the-art Virtex-5 Field Programmable Gate Array (FPGA). The design is coded in Very High Speed Integrated Circuit Hardware Description Language (VHDL). Timing simulation is performed to verify the functionality of the designed circuit. Performance evaluation is also done in terms of throughput and area. The design implemented on Virtex-5 (XC5VLX50FFG676-3) FPGA achieves a maximum throughput of 4.34 Gbps utilizing a total of 399 slices.
FPGA Flash Memory High Speed Data Acquisition

NASA Technical Reports Server (NTRS)

Gonzalez, April

2013-01-01

The purpose of this research is to design and implement a VHDL ONFI Controller module for a Modular Instrumentation System. The goal of the Modular Instrumentation System will be to have a low power device that will store data and send the data at a low speed to a processor. The benefit of such a system will give an advantage over other purchased binary IP due to the capability of allowing NASA to re-use and modify the memory controller module. To accomplish the performance criteria of a low power system, an in house auxiliary board (Flash/ADC board), FPGA development kit, debug board, and modular instrumentation board will be jointly used for the data acquisition. The Flash/ADC board contains four, 1 MSPS, input channel signals and an Open NAND Flash memory module with an analog to digital converter. The ADC, data bits, and control line signals from the board are sent to an Microsemi/Actel FPGA development kit for VHDL programming of the flash memory WRITE, READ, READ STATUS, ERASE, and RESET operation waveforms using Libero software. The debug board will be used for verification of the analog input signal and be able to communicate via serial interface with the module instrumentation. The scope of the new controller module was to find and develop an ONFI controller with the debug board layout designed and completed for manufacture. Successful flash memory operation waveform test routines were completed, simulated, and tested to work on the FPGA board. Through connection of the Flash/ADC board with the FPGA, it was found that the device specifications were not being meet with Vdd reaching half of its voltage. Further testing showed that it was the manufactured Flash/ADC board that contained a misalignment with the ONFI memory module traces. The errors proved to be too great to fix in the time limit set for the project.
An FPGA Implementation to Detect Selective Cationic Antibacterial Peptides

PubMed Central

Polanco González, Carlos; Nuño Maganda, Marco Aurelio; Arias-Estrada, Miguel; del Rio, Gabriel

2011-01-01

Exhaustive prediction of physicochemical properties of peptide sequences is used in different areas of biological research. One example is the identification of selective cationic antibacterial peptides (SCAPs), which may be used in the treatment of different diseases. Due to the discrete nature of peptide sequences, the physicochemical properties calculation is considered a high-performance computing problem. A competitive solution for this class of problems is to embed algorithms into dedicated hardware. In the present work we present the adaptation, design and implementation of an algorithm for SCAPs prediction into a Field Programmable Gate Array (FPGA) platform. Four physicochemical properties codes useful in the identification of peptide sequences with potential selective antibacterial activity were implemented into an FPGA board. The speed-up gained in a single-copy implementation was up to 108 times compared with a single Intel processor cycle for cycle. The inherent scalability of our design allows for replication of this code into multiple FPGA cards and consequently improvements in speed are possible. Our results show the first embedded SCAPs prediction solution described and constitutes the grounds to efficiently perform the exhaustive analysis of the sequence-physicochemical properties relationship of peptides. PMID:21738652
The Application of Virtex-II Pro FPGA in High-Speed Image Processing Technology of Robot Vision Sensor

NASA Astrophysics Data System (ADS)

Ren, Y. J.; Zhu, J. G.; Yang, X. Y.; Ye, S. H.

2006-10-01

The Virtex-II Pro FPGA is applied to the vision sensor tracking system of IRB2400 robot. The hardware platform, which undertakes the task of improving SNR and compressing data, is constructed by using the high-speed image processing of FPGA. The lower level image-processing algorithm is realized by combining the FPGA frame and the embedded CPU. The velocity of image processing is accelerated due to the introduction of FPGA and CPU. The usage of the embedded CPU makes it easily to realize the logic design of interface. Some key techniques are presented in the text, such as read-write process, template matching, convolution, and some modules are simulated too. In the end, the compare among the modules using this design, using the PC computer and using the DSP, is carried out. Because the high-speed image processing system core is a chip of FPGA, the function of which can renew conveniently, therefore, to a degree, the measure system is intelligent.
High-Performance CCSDS Encapsulation Service Implementation in FPGA

NASA Technical Reports Server (NTRS)

Clare, Loren P.; Torgerson, Jordan L.; Pang, Jackson

2010-01-01

The Consultative Committee for Space Data Systems (CCSDS) Encapsulation Service is a convergence layer between lower-layer space data link framing protocols, such as CCSDS Advanced Orbiting System (AOS), and higher-layer networking protocols, such as CFDP (CCSDS File Delivery Protocol) and Internet Protocol Extension (IPE). CCSDS Encapsulation Service is considered part of the data link layer. The CCSDS AOS implementation is described in the preceding article. Recent advancement in RF modem technology has allowed multi-megabit transmission over space links. With this increase in data rate, the CCSDS Encapsulation Service needs to be optimized to both reduce energy consumption and operate at a high rate. CCSDS Encapsulation Service has been implemented as an intellectual property core so that the aforementioned problems are solved by way of operating the CCSDS Encapsulation Service inside an FPGA. The CCSDS En capsula tion Service in FPGA implementation consists of both packetizing and de-packetizing features
An FPGA-Based High-Speed Error Resilient Data Aggregation and Control for High Energy Physics Experiment

NASA Astrophysics Data System (ADS)

Mandal, Swagata; Saini, Jogender; Zabołotny, Wojciech M.; Sau, Suman; Chakrabarti, Amlan; Chattopadhyay, Subhasis

2017-03-01

Due to the dramatic increase of data volume in modern high energy physics (HEP) experiments, a robust high-speed data acquisition (DAQ) system is very much needed to gather the data generated during different nuclear interactions. As the DAQ works under harsh radiation environment, there is a fair chance of data corruption due to various energetic particles like alpha, beta, or neutron. Hence, a major challenge in the development of DAQ in the HEP experiment is to establish an error resilient communication system between front-end sensors or detectors and back-end data processing computing nodes. Here, we have implemented the DAQ using field-programmable gate array (FPGA) due to some of its inherent advantages over the application-specific integrated circuit. A novel orthogonal concatenated code and cyclic redundancy check (CRC) have been used to mitigate the effects of data corruption in the user data. Scrubbing with a 32-b CRC has been used against error in the configuration memory of FPGA. Data from front-end sensors will reach to the back-end processing nodes through multiple stages that may add an uncertain amount of delay to the different data packets. We have also proposed a novel memory management algorithm that helps to process the data at the back-end computing nodes removing the added path delays. To the best of our knowledge, the proposed FPGA-based DAQ utilizing optical link with channel coding and efficient memory management modules can be considered as first of its kind. Performance estimation of the implemented DAQ system is done based on resource utilization, bit error rate, efficiency, and robustness to radiation.
A novel pipeline based FPGA implementation of a genetic algorithm

NASA Astrophysics Data System (ADS)

Thirer, Nonel

2014-05-01

To solve problems when an analytical solution is not available, more and more bio-inspired computation techniques have been applied in the last years. Thus, an efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong has higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). GA performs several processes with the population individuals to produce a new population, like in the biological evolution. To provide a high speed solution, pipelined based FPGA hardware implementations are used, with a nstages pipeline for a n-phases genetic algorithm. The FPGA pipeline implementations are constraints by the different execution time of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique to modify the crossover step by using non identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children) not only one as in classical GA. We analyze the contribution of this method to reduce the execution time in the asynchronous and synchronous pipelines and also the possibility to a cheaper FPGA implementation, by using smaller populations. The full hardware architecture for a FPGA implementation to our target ALTERA development card is presented and analyzed.
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model

PubMed Central

Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid

2014-01-01

A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well. PMID:25484854
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model.

PubMed

Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid

2014-01-01

A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well.
High speed CMOS acquisition system based on FPGA embedded image processing for electro-optical measurements

NASA Astrophysics Data System (ADS)

Rosu-Hamzescu, Mihnea; Polonschii, Cristina; Oprea, Sergiu; Popescu, Dragos; David, Sorin; Bratu, Dumitru; Gheorghiu, Eugen

2018-06-01

Electro-optical measurements, i.e., optical waveguides and plasmonic based electrochemical impedance spectroscopy (P-EIS), are based on the sensitive dependence of refractive index of electro-optical sensors on surface charge density, modulated by an AC electrical field applied to the sensor surface. Recently, P-EIS has emerged as a new analytical tool that can resolve local impedance with high, optical spatial resolution, without using microelectrodes. This study describes a high speed image acquisition and processing system for electro-optical measurements, based on a high speed complementary metal-oxide semiconductor (CMOS) sensor and a field-programmable gate array (FPGA) board. The FPGA is used to configure CMOS parameters, as well as to receive and locally process the acquired images by performing Fourier analysis for each pixel, deriving the real and imaginary parts of the Fourier coefficients for the AC field frequencies. An AC field generator, for single or multi-sine signals, is synchronized with the high speed acquisition system for phase measurements. The system was successfully used for real-time angle-resolved electro-plasmonic measurements from 30 Hz up to 10 kHz, providing results consistent to ones obtained by a conventional electrical impedance approach. The system was able to detect amplitude variations with a relative variation of ±1%, even for rather low sampling rates per period (i.e., 8 samples per period). The PC (personal computer) acquisition and control software allows synchronized acquisition for multiple FPGA boards, making it also suitable for simultaneous angle-resolved P-EIS imaging.
Rapid and highly integrated FPGA-based Shack-Hartmann wavefront sensor for adaptive optics system

NASA Astrophysics Data System (ADS)

Chen, Yi-Pin; Chang, Chia-Yuan; Chen, Shean-Jen

2018-02-01

In this study, a field programmable gate array (FPGA)-based Shack-Hartmann wavefront sensor (SHWS) programmed on LabVIEW can be highly integrated into customized applications such as adaptive optics system (AOS) for performing real-time wavefront measurement. Further, a Camera Link frame grabber embedded with FPGA is adopted to enhance the sensor speed reacting to variation considering its advantage of the highest data transmission bandwidth. Instead of waiting for a frame image to be captured by the FPGA, the Shack-Hartmann algorithm are implemented in parallel processing blocks design and let the image data transmission synchronize with the wavefront reconstruction. On the other hand, we design a mechanism to control the deformable mirror in the same FPGA and verify the Shack-Hartmann sensor speed by controlling the frequency of the deformable mirror dynamic surface deformation. Currently, this FPGAbead SHWS design can achieve a 266 Hz cyclic speed limited by the camera frame rate as well as leaves 40% logic slices for additionally flexible design.
Implementation of total focusing method for phased array ultrasonic imaging on FPGA

NASA Astrophysics Data System (ADS)

Guo, JianQiang; Li, Xi; Gao, Xiaorong; Wang, Zeyong; Zhao, Quanke

2015-02-01

This paper describes a multi-FPGA imaging system dedicated for the real-time imaging using the Total Focusing Method (TFM) and Full Matrix Capture (FMC). The system was entirely described using Verilog HDL language and implemented on Altera Stratix IV GX FPGA development board. The whole algorithm process is to: establish a coordinate system of image and divide it into grids; calculate the complete acoustic distance of array element between transmitting array element and receiving array element, and transform it into index value; then index the sound pressure values from ROM and superimpose sound pressure values to get pixel value of one focus point; and calculate the pixel values of all focus points to get the final imaging. The imaging result shows that this algorithm has high SNR of defect imaging. And FPGA with parallel processing capability can provide high speed performance, so this system can provide the imaging interface, with complete function and good performance.
Optimization on fixed low latency implementation of the GBT core in FPGA

DOE PAGES

Chen, K.; Chen, H.; Wu, W.; ...

2017-07-11

We present that in the upgrade of ATLAS experiment, the front-end electronics components are subjected to a large radiation background. Meanwhile high speed optical links are required for the data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project are designed by CERN to support the 4.8 Gbps line rate bidirectional high-speed data transmission which is called GBT link. In the ATLAS upgrade, besides the link with on-detector, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end, correspondingly for the off-detector electronics, themore » GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launches the GBT-FPGA project to provide examples in different types of FPGA. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system is used to interface the front end electronics of several ATLAS subsystems. The GBT link is used between them, to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations on the GBT-FPGA IP core are introduced, to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support of both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. Finally, the system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.« less
Optimization on fixed low latency implementation of the GBT core in FPGA

NASA Astrophysics Data System (ADS)

Chen, K.; Chen, H.; Wu, W.; Xu, H.; Yao, L.

2017-07-01

In the upgrade of ATLAS experiment [1], the front-end electronics components are subjected to a large radiation background. Meanwhile high speed optical links are required for the data transmission between the on-detector and off-detector electronics. The GBT architecture and the Versatile Link (VL) project are designed by CERN to support the 4.8 Gbps line rate bidirectional high-speed data transmission which is called GBT link [2]. In the ATLAS upgrade, besides the link with on-detector, the GBT link is also used between different off-detector systems. The GBTX ASIC is designed for the on-detector front-end, correspondingly for the off-detector electronics, the GBT architecture is implemented in Field Programmable Gate Arrays (FPGA). CERN launches the GBT-FPGA project to provide examples in different types of FPGA [3]. In the ATLAS upgrade framework, the Front-End LInk eXchange (FELIX) system [4, 5] is used to interface the front-end electronics of several ATLAS subsystems. The GBT link is used between them, to transfer the detector data and the timing, trigger, control and monitoring information. The trigger signal distributed in the down-link from FELIX to the front-end requires a fixed and low latency. In this paper, several optimizations on the GBT-FPGA IP core are introduced, to achieve a lower fixed latency. For FELIX, a common firmware will be used to interface different front-ends with support of both GBT modes: the forward error correction mode and the wide mode. The modified GBT-FPGA core has the ability to switch between the GBT modes without FPGA reprogramming. The system clock distribution of the multi-channel FELIX firmware is also discussed in this paper.
Implementation of the 2-D Wavelet Transform into FPGA for Image

NASA Astrophysics Data System (ADS)

León, M.; Barba, L.; Vargas, L.; Torres, C. O.

2011-01-01

This paper presents a hardware system implementation of the of discrete wavelet transform algoritm in two dimensions for FPGA, using the Daubechies filter family of order 2 (db2). The decomposition algorithm of this transform is designed and simulated with the Hardware Description Language VHDL and is implemented in a programmable logic device (FPGA) XC3S1200E reference, Spartan IIIE family, by Xilinx, take advantage the parallels properties of these gives us and speeds processing that can reach them. The architecture is evaluated using images input of different sizes. This implementation is done with the aim of developing a future images encryption hardware system using wavelet transform for security information.
FPGA based hardware optimized implementation of signal processing system for LFM pulsed radar

NASA Astrophysics Data System (ADS)

Azim, Noor ul; Jun, Wang

2016-11-01

Signal processing is one of the main parts of any radar system. Different signal processing algorithms are used to extract information about different parameters like range, speed, direction etc, of a target in the field of radar communication. This paper presents LFM (Linear Frequency Modulation) pulsed radar signal processing algorithms which are used to improve target detection, range resolution and to estimate the speed of a target. Firstly, these algorithms are simulated in MATLAB to verify the concept and theory. After the conceptual verification in MATLAB, the simulation is converted into implementation on hardware using Xilinx FPGA. Chosen FPGA is Xilinx Virtex-6 (XC6LVX75T). For hardware implementation pipeline optimization is adopted and also other factors are considered for resources optimization in the process of implementation. Focusing algorithms in this work for improving target detection, range resolution and speed estimation are hardware optimized fast convolution processing based pulse compression and pulse Doppler processing.

Energy efficiency analysis and implementation of AES on an FPGA

NASA Astrophysics Data System (ADS)

Kenney, David

The Advanced Encryption Standard (AES) was developed by Joan Daemen and Vincent Rjimen and endorsed by the National Institute of Standards and Technology in 2001. It was designed to replace the aging Data Encryption Standard (DES) and be useful for a wide range of applications with varying throughput, area, power dissipation and energy consumption requirements. Field Programmable Gate Arrays (FPGAs) are flexible and reconfigurable integrated circuits that are useful for many different applications including the implementation of AES. Though they are highly flexible, FPGAs are often less efficient than Application Specific Integrated Circuits (ASICs); they tend to operate slower, take up more space and dissipate more power. There have been many FPGA AES implementations that focus on obtaining high throughput or low area usage, but very little research done in the area of low power or energy efficient FPGA based AES; in fact, it is rare for estimates on power dissipation to be made at all. This thesis presents a methodology to evaluate the energy efficiency of FPGA based AES designs and proposes a novel FPGA AES implementation which is highly flexible and energy efficient. The proposed methodology is implemented as part of a novel scripting tool, the AES Energy Analyzer, which is able to fully characterize the power dissipation and energy efficiency of FPGA based AES designs. Additionally, this thesis introduces a new FPGA power reduction technique called Opportunistic Combinational Operand Gating (OCOG) which is used in the proposed energy efficient implementation. The AES Energy Analyzer was able to estimate the power dissipation and energy efficiency of the proposed AES design during its most commonly performed operations. It was found that the proposed implementation consumes less energy per operation than any previous FPGA based AES implementations that included power estimations. Finally, the use of Opportunistic Combinational Operand Gating on an AES cipher
Real-time implementation of camera positioning algorithm based on FPGA & SOPC

NASA Astrophysics Data System (ADS)

Yang, Mingcao; Qiu, Yuehong

2014-09-01

In recent years, with the development of positioning algorithm and FPGA, to achieve the camera positioning based on real-time implementation, rapidity, accuracy of FPGA has become a possibility by way of in-depth study of embedded hardware and dual camera positioning system, this thesis set up an infrared optical positioning system based on FPGA and SOPC system, which enables real-time positioning to mark points in space. Thesis completion include: (1) uses a CMOS sensor to extract the pixel of three objects with total feet, implemented through FPGA hardware driver, visible-light LED, used here as the target point of the instrument. (2) prior to extraction of the feature point coordinates, the image needs to be filtered to avoid affecting the physical properties of the system to bring the platform, where the median filtering. (3) Coordinate signs point to FPGA hardware circuit extraction, a new iterative threshold selection method for segmentation of images. Binary image is then segmented image tags, which calculates the coordinates of the feature points of the needle through the center of gravity method. (4) direct linear transformation (DLT) and extreme constraints method is applied to three-dimensional reconstruction of the plane array CMOS system space coordinates. using SOPC system on a chip here, taking advantage of dual-core computing systems, which let match and coordinate operations separately, thus increase processing speed.
High frequency signal acquisition and control system based on DSP+FPGA

NASA Astrophysics Data System (ADS)

Liu, Xiao-qi; Zhang, Da-zhi; Yin, Ya-dong

2017-10-01

This paper introduces a design and implementation of high frequency signal acquisition and control system based on DSP + FPGA. The system supports internal/external clock and internal/external trigger sampling. It has a maximum sampling rate of 400MBPS and has a 1.4GHz input bandwidth for the ADC. Data can be collected continuously or periodically in systems and they are stored in DDR2. At the same time, the system also supports real-time acquisition, the collected data after digital frequency conversion and Cascaded Integrator-Comb (CIC) filtering, which then be sent to the CPCI bus through the high-speed DSP, can be assigned to the fiber board for subsequent processing. The system integrates signal acquisition and pre-processing functions, which uses high-speed A/D, high-speed DSP and FPGA mixed technology and has a wide range of uses in data acquisition and recording. In the signal processing, the system can be seamlessly connected to the dedicated processor board. The system has the advantages of multi-selectivity, good scalability and so on, which satisfies the different requirements of different signals in different projects.
High speed FPGA-based Phasemeter for the far-infrared laser interferometers on EAST

NASA Astrophysics Data System (ADS)

Yao, Y.; Liu, H.; Zou, Z.; Li, W.; Lian, H.; Jie, Y.

2017-12-01

The far-infrared laser-based HCN interferometer and POlarimeter/INTerferometer\\break (POINT) system are important diagnostics for plasma density measurement on EAST tokamak. Both HCN and POINT provide high spatial and temporal resolution of electron density measurement and used for plasma density feedback control. The density is calculated by measuring the real-time phase difference between the reference beams and the probe beams. For long-pulse operations on EAST, the calculation of density has to meet the requirements of Real-Time and high precision. In this paper, a Phasemeter for far-infrared laser-based interferometers will be introduced. The FPGA-based Phasemeter leverages fast ADCs to obtain the three-frequency signals from VDI planar-diode Mixers, and realizes digital filters and an FFT algorithm in FPGA to provide real-time, high precision electron density output. Implementation of the Phasemeter will be helpful for the future plasma real-time feedback control in long-pulse discharge.
Random number generators for large-scale parallel Monte Carlo simulations on FPGA

NASA Astrophysics Data System (ADS)

Lin, Y.; Wang, F.; Liu, B.

2018-05-01

Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
Research and implementation of SATA protocol link layer based on FPGA

NASA Astrophysics Data System (ADS)

Liu, Wen-long; Liu, Xue-bin; Qiang, Si-miao; Yan, Peng; Wen, Zhi-gang; Kong, Liang; Liu, Yong-zheng

2018-02-01

In order to solve the problem high-performance real-time, high-speed the image data storage generated by the detector. In this thesis, it choose an suitable portable image storage hard disk of SATA interface, it is relative to the existing storage media. It has a large capacity, high transfer rate, inexpensive, power-down data which is not lost, and many other advantages. This paper focuses on the link layer of the protocol, analysis the implementation process of SATA2.0 protocol, and build state machines. Then analyzes the characteristics resources of Kintex-7 FPGA family, builds state machines according to the agreement, write Verilog implement link layer modules, and run the simulation test. Finally, the test is on the Kintex-7 development board platform. It meets the requirements SATA2.0 protocol basically.
Resource and Performance Evaluations of Fixed Point QRD-RLS Systolic Array through FPGA Implementation

NASA Astrophysics Data System (ADS)

Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki

At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
High-Speed Scanning Interferometer Using CMOS Image Sensor and FPGA Based on Multifrequency Phase-Tracking Detection

NASA Technical Reports Server (NTRS)

Ohara, Tetsuo

2012-01-01

A sub-aperture stitching optical interferometer can provide a cost-effective solution for an in situ metrology tool for large optics; however, the currently available technologies are not suitable for high-speed and real-time continuous scan. NanoWave s SPPE (Scanning Probe Position Encoder) has been proven to exhibit excellent stability and sub-nanometer precision with a large dynamic range. This same technology can transform many optical interferometers into real-time subnanometer precision tools with only minor modification. The proposed field-programmable gate array (FPGA) signal processing concept, coupled with a new-generation, high-speed, mega-pixel CMOS (complementary metal-oxide semiconductor) image sensor, enables high speed (>1 m/s) and real-time continuous surface profiling that is insensitive to variation of pixel sensitivity and/or optical transmission/reflection. This is especially useful for large optics surface profiling.
Development of a Low-Cost and High-speed Single Event Effects Testers based on Reconfigurable Field Programmable Gate Arrays (FPGA)

NASA Technical Reports Server (NTRS)

Howard, J. W.; Kim, H.; Berg, M.; LaBel, K. A.; Stansberry, S.; Friendlich, M.; Irwin, T.

2006-01-01

A viewgraph presentation on the development of a low cost, high speed tester reconfigurable Field Programmable Gata Array (FPGA) is shown. The topics include: 1) Introduction; 2) Objectives; 3) Tester Descriptions; 4) Tester Validations and Demonstrations; 5) Future Work; and 6) Summary.
High-Speed Current dq PI Controller for Vector Controlled PMSM Drive

PubMed Central

Reaz, Mamun Bin Ibne; Rahman, Labonnah Farzana; Chang, Tae Gyu

2014-01-01

High-speed current controller for vector controlled permanent magnet synchronous motor (PMSM) is presented. The controller is developed based on modular design for faster calculation and uses fixed-point proportional-integral (PI) method for improved accuracy. Current dq controller is usually implemented in digital signal processor (DSP) based computer. However, DSP based solutions are reaching their physical limits, which are few microseconds. Besides, digital solutions suffer from high implementation cost. In this research, the overall controller is realizing in field programmable gate array (FPGA). FPGA implementation of the overall controlling algorithm will certainly trim down the execution time significantly to guarantee the steadiness of the motor. Agilent 16821A Logic Analyzer is employed to validate the result of the implemented design in FPGA. Experimental results indicate that the proposed current dq PI controller needs only 50 ns of execution time in 40 MHz clock, which is the lowest computational cycle for the era. PMID:24574913
TOT measurement implemented in FPGA TDC

NASA Astrophysics Data System (ADS)

Fan, Huan-Huan; Cao, Ping; Liu, Shu-Bin; An, Qi

2015-11-01

Time measurement plays a crucial role for the purpose of particle identification in high energy physics experiments. With increasingly demanding physics goals and the development of electronics, modern time measurement systems need to meet the requirement of excellent resolution specification as well as high integrity. Based on Field Programmable Gate Arrays (FPGAs), FPGA time-to-digital converters (TDCs) have become one of the most mature and prominent time measurement methods in recent years. For correcting the time-walk effect caused by leading timing, a time-over-threshold (TOT) measurement should be added to the FPGA TDC. TOT can be obtained by measuring the interval between the signal leading and trailing edges. Unfortunately, a traditional TDC can recognize only one kind of signal edge, the leading or the trailing. Generally, to measure the interval, two TDC channels need to be used at the same time, one for leading, the other for trailing. However, this method unavoidably increases the amount of FPGA resources used and reduces the TDC's integrity. This paper presents one method of TOT measurement implemented in a Xilinx Virtex-5 FPGA. In this method, TOT measurement can be achieved using only one TDC input channel. The consumed resources and time resolution can both be guaranteed. Testing shows that this TDC can achieve resolution better than 15ps for leading edge measurement and 37 ps for TOT measurement. Furthermore, the TDC measurement dead time is about two clock cycles, which makes it good for applications with higher physics event rates. Supported by National Natural Science Foundation of China (11079003, 10979003)
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

PubMed Central

Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao

2012-01-01

This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
Pulse-coupled neural network implementation in FPGA

NASA Astrophysics Data System (ADS)

Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael

1998-03-01

Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre- processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
High-speed line-scan camera with digital time delay integration

NASA Astrophysics Data System (ADS)

Bodenstorfer, Ernst; Fürtler, Johannes; Brodersen, Jörg; Mayer, Konrad J.; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert

2007-02-01

Dealing with high-speed image acquisition and processing systems, the speed of operation is often limited by the amount of available light, due to short exposure times. Therefore, high-speed applications often use line-scan cameras, based on charge-coupled device (CCD) sensors with time delayed integration (TDI). Synchronous shift and accumulation of photoelectric charges on the CCD chip - according to the objects' movement - result in a longer effective exposure time without introducing additional motion blur. This paper presents a high-speed color line-scan camera based on a commercial complementary metal oxide semiconductor (CMOS) area image sensor with a Bayer filter matrix and a field programmable gate array (FPGA). The camera implements a digital equivalent to the TDI effect exploited with CCD cameras. The proposed design benefits from the high frame rates of CMOS sensors and from the possibility of arbitrarily addressing the rows of the sensor's pixel array. For the digital TDI just a small number of rows are read out from the area sensor which are then shifted and accumulated according to the movement of the inspected objects. This paper gives a detailed description of the digital TDI algorithm implemented on the FPGA. Relevant aspects for the practical application are discussed and key features of the camera are listed.
FPGA Implementation of Heart Rate Monitoring System.

PubMed

Panigrahy, D; Rakshit, M; Sahu, P K

2016-03-01

This paper describes a field programmable gate array (FPGA) implementation of a system that calculates the heart rate from Electrocardiogram (ECG) signal. After heart rate calculation, tachycardia, bradycardia or normal heart rate can easily be detected. ECG is a diagnosis tool routinely used to access the electrical activities and muscular function of the heart. Heart rate is calculated by detecting the R peaks from the ECG signal. To provide a portable and the continuous heart rate monitoring system for patients using ECG, needs a dedicated hardware. FPGA provides easy testability, allows faster implementation and verification option for implementing a new design. We have proposed a five-stage based methodology by using basic VHDL blocks like addition, multiplication and data conversion (real to the fixed point and vice-versa). Our proposed heart rate calculation (R-peak detection) method has been validated, using 48 first channel ECG records of the MIT-BIH arrhythmia database. It shows an accuracy of 99.84%, the sensitivity of 99.94% and the positive predictive value of 99.89%. Our proposed method outperforms other well-known methods in case of pathological ECG signals and successfully implemented in FPGA.
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.

PubMed

Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan

2017-07-01

In ultrasound image analysis, the speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" still remains as a challenge for the speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values, when the tissue deformation is large. The major drawback of these methods is the high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs on handling different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphic card (GeForce GTX 580).
Economical Implementation of a Filter Engine in an FPGA

NASA Technical Reports Server (NTRS)

Kowalski, James E.

2009-01-01

A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. For these and other applications, it could be
High speed fault tolerant secure communication for muon chamber using FPGA based GBTx emulator

NASA Astrophysics Data System (ADS)

Sau, Suman; Mandal, Swagata; Saini, Jogender; Chakrabarti, Amlan; Chattopadhyay, Subhasis

2015-12-01

The Compressed Baryonic Matter (CBM) experiment is a part of the Facility for Antiproton and Ion Research (FAIR) in Darmstadt at the GSI. The CBM experiment will investigate the highly compressed nuclear matter using nucleus-nucleus collisions. This experiment will examine lieavy-ion collisions in fixed target geometry and will be able to measure hadrons, electrons and muons. CBM requires precise time synchronization, compact hardware, radiation tolerance, self-triggered front-end electronics, efficient data aggregation schemes and capability to handle high data rate (up to several TB/s). As a part of the implementation of read out chain of Muon Cliamber(MUCH) [1] in India, we have tried to implement FPGA based emulator of GBTx in India. GBTx is a radiation tolerant ASIC that can be used to implement multipurpose high speed bidirectional optical links for high-energy physics (HEP) experiments and is developed by CERN. GBTx will be used in highly irradiated area and more prone to be affected by multi bit error. To mitigate this effect instead of single bit error correcting RS code we have used two bit error correcting (15, 7) BCH code. It will increase the redundancy which in turn increases the reliability of the coded data. So the coded data will be less prone to be affected by noise due to radiation. The data will go from detector to PC through multiple nodes through the communication channel. The computing resources are connected to a network which can be accessed by authorized person to prevent unauthorized data access which might happen by compromising the network security. Thus data encryption is essential. In order to make the data communication secure, advanced encryption standard [2] (AES - a symmetric key cryptography) and RSA [3], [4] (asymmetric key cryptography) are used after the channel coding. We have implemented GBTx emulator on two Xilinx Kintex-7 boards (KC705). One will act as transmitter and other will act as receiver and they are connected
FPGA Based High Speed Data Acquisition System for Electrical Impedance Tomography

PubMed Central

Khan, S; Borsic, A; Manwaring, Preston; Hartov, Alexander; Halter, Ryan

2014-01-01

Electrical Impedance Tomography (EIT) systems are used to image tissue bio-impedance. EIT provides a number of features making it attractive for use as a medical imaging device including the ability to image fast physiological processes (>60 Hz), to meet a range of clinical imaging needs through varying electrode geometries and configurations, to impart only non-ionizing radiation to a patient, and to map the significant electrical property contrasts present between numerous benign and pathological tissues. To leverage these potential advantages for medical imaging, we developed a modular 32 channel data acquisition (DAQ) system using National Instruments’ PXI chassis, along with FPGA, ADC, Signal Generator and Timing and Synchronization modules. To achieve high frame rates, signal demodulation and spectral characteristics of higher order harmonics were computed using dedicated FFT-hardware built into the FPGA module. By offloading the computing onto FPGA, we were able to achieve a reduction in throughput required between the FPGA and PC by a factor of 32:1. A custom designed analog front end (AFE) was used to interface electrodes with our system. Our system is wideband, and capable of acquiring data for input signal frequencies ranging from 100 Hz to 12 MHz. The modular design of both the hardware and software will allow this system to be flexibly configured for the particular clinical application. PMID:24729790
FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG

DTIC Science & Technology

2014-06-01

is normalized to π. The proposed burst-mode architecture is written in VHDL and verified using Modelsim. The VHDL design is implemented on a Xilinx...Document Number: SET 2014-0043 412TW-PA-14298 FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG June 2014 Final Report Test...To) 9/11 -- 8/14 4. TITLE AND SUBTITLE FPGA Implementation of Burst-Mode Synchronization for SOQSPK-TG 5a. CONTRACT NUMBER: W900KK-11-C-0032 5b

Small Microprocessor for ASIC or FPGA Implementation

NASA Technical Reports Server (NTRS)

Kleyner, Igor; Katz, Richard; Blair-Smith, Hugh

2011-01-01

A small microprocessor, suitable for use in applications in which high reliability is required, was designed to be implemented in either an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). The design is based on commercial microprocessor architecture, making it possible to use available software development tools and thereby to implement the microprocessor at relatively low cost. The design features enhancements, including trapping during execution of illegal instructions. The internal structure of the design yields relatively high performance, with a significant decrease, relative to other microprocessors that perform the same functions, in the number of microcycles needed to execute macroinstructions. The problem meant to be solved in designing this microprocessor was to provide a modest level of computational capability in a general-purpose processor while adding as little as possible to the power demand, size, and weight of a system into which the microprocessor would be incorporated. As designed, this microprocessor consumes very little power and occupies only a small portion of a typical modern ASIC or FPGA. The microprocessor operates at a rate of about 4 million instructions per second with clock frequency of 20 MHz.
Implementation of real-time nonuniformity correction with multiple NUC tables using FPGA in an uncooled imaging system

NASA Astrophysics Data System (ADS)

Oh, Gyong Jin; Kim, Lyang-June; Sheen, Sue-Ho; Koo, Gyou-Phyo; Jin, Sang-Hun; Yeo, Bo-Yeon; Lee, Jong-Ho

2009-05-01

This paper presents a real time implementation of Non Uniformity Correction (NUC). Two point correction and one point correction with shutter were carried out in an uncooled imaging system which will be applied to a missile application. To design a small, light weight and high speed imaging system for a missile system, SoPC (System On a Programmable Chip) which comprises of FPGA and soft core (Micro-blaze) was used. Real time NUC and generation of control signals are implemented using FPGA. Also, three different NUC tables were made to make the operating time shorter and to reduce the power consumption in a large range of environment temperature. The imaging system consists of optics and four electronics boards which are detector interface board, Analog to Digital converter board, Detector signal generation board and Power supply board. To evaluate the imaging system, NETD was measured. The NETD was less than 160mK in three different environment temperatures.
Computer vision camera with embedded FPGA processing

NASA Astrophysics Data System (ADS)

Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

2000-03-01

Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
20-GFLOPS QR processor on a Xilinx Virtex-E FPGA

NASA Astrophysics Data System (ADS)

Walke, Richard L.; Smith, Robert W. M.; Lightbody, Gaye

2000-11-01

Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample rate systems, such as radar and comms, the calculation of adaptive weights is a very computational task that requires highly parallel solutions. For systems where low power consumption and volume are important the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement them as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve predictable dense layout and high-speed operation. We present results that show that 20GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR-processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry standard hardware description language VHDL.
Development of embedded real-time and high-speed vision platform

NASA Astrophysics Data System (ADS)

Ouyang, Zhenxing; Dong, Yimin; Yang, Hua

2015-12-01

Currently, high-speed vision platforms are widely used in many applications, such as robotics and automation industry. However, a personal computer (PC) whose over-large size is not suitable and applicable in compact systems is an indispensable component for human-computer interaction in traditional high-speed vision platforms. Therefore, this paper develops an embedded real-time and high-speed vision platform, ER-HVP Vision which is able to work completely out of PC. In this new platform, an embedded CPU-based board is designed as substitution for PC and a DSP and FPGA board is developed for implementing image parallel algorithms in FPGA and image sequential algorithms in DSP. Hence, the capability of ER-HVP Vision with size of 320mm x 250mm x 87mm can be presented in more compact condition. Experimental results are also given to indicate that the real-time detection and counting of the moving target at a frame rate of 200 fps at 512 x 512 pixels under the operation of this newly developed vision platform are feasible.
Implementing a Digital Phasemeter in an FPGA

NASA Technical Reports Server (NTRS)

Rao, Shanti R.

2008-01-01

Firmware for implementing a digital phasemeter within a field-programmable gate array (FPGA) has been devised. In the original application of this firmware, the phase that one seeks to measure is the difference between the phases of two nominally-equal-frequency heterodyne signals generated by two interferometers. In that application, zero-crossing detectors convert the heterodyne signals to trains of rectangular pulses, the two pulse trains are fed to a fringe counter (the major part of the phasemeter) controlled by a clock signal having a frequency greater than the heterodyne frequency, and the fringe counter computes a time-averaged estimate of the difference between the phases of the two pulse trains. The firmware also does the following: Causes the FPGA to compute the frequencies of the input signals; Causes the FPGA to implement an Ethernet (or equivalent) transmitter for readout of phase and frequency values; and Provides data for use in diagnosis of communication failures. The readout rate can be set, by programming, to a value between 250 Hz and 1 kHz. Network addresses can be programmed by the user.
Photoelectric radar servo control system based on ARM+FPGA

NASA Astrophysics Data System (ADS)

Wu, Kaixuan; Zhang, Yue; Li, Yeqiu; Dai, Qin; Yao, Jun

2016-01-01

In order to get smaller, faster, and more responsive requirements of the photoelectric radar servo control system. We propose a set of core ARM + FPGA architecture servo controller. Parallel processing capability of FPGA to be used for the encoder feedback data, PWM carrier modulation, A, B code decoding processing and so on; Utilizing the advantage of imaging design in ARM Embedded systems achieves high-speed implementation of the PID algorithm. After the actual experiment, the closed-loop speed of response of the system cycles up to 2000 times/s, in the case of excellent precision turntable shaft, using a PID algorithm to achieve the servo position control with the accuracy of + -1 encoder input code. Firstly, This article carry on in-depth study of the embedded servo control system hardware to determine the ARM and FPGA chip as the main chip with systems based on a pre-measured target required to achieve performance requirements, this article based on ARM chip used Samsung S3C2440 chip of ARM7 architecture , the FPGA chip is chosen xilinx's XC3S400 . ARM and FPGA communicate by using SPI bus, the advantage of using SPI bus is saving a lot of pins for easy system upgrades required thereafter. The system gets the speed datas through the photoelectric-encoder that transports the datas to the FPGA, Then the system transmits the datas through the FPGA to ARM, transforms speed datas into the corresponding position and velocity data in a timely manner, prepares the corresponding PWM wave to control motor rotation by making comparison between the position data and the velocity data setted in advance . According to the system requirements to draw the schematics of the photoelectric radar servo control system and PCB board to produce specially. Secondly, using PID algorithm to control the servo system, the datas of speed obtained from photoelectric-encoder is calculated position data and speed data via high-speed digital PID algorithm and coordinate models. Finally, a
Area, speed and power measurements of FPGA-based complex orthogonal space-time block code channel encoders

NASA Astrophysics Data System (ADS)

Passas, Georgios; Freear, Steven; Fawcett, Darren

2010-01-01

Space-time coding (STC) is an important milestone in modern wireless communications. In this technique, more copies of the same signal are transmitted through different antennas (space) and different symbol periods (time), to improve the robustness of a wireless system by increasing its diversity gain. STCs are channel coding algorithms that can be readily implemented on a field programmable gate array (FPGA) device. This work provides some figures for the amount of required FPGA hardware resources, the speed that the algorithms can operate and the power consumption requirements of a space-time block code (STBC) encoder. Seven encoder very high-speed integrated circuit hardware description language (VHDL) designs have been coded, synthesised and tested. Each design realises a complex orthogonal space-time block code with a different transmission matrix. All VHDL designs are parameterisable in terms of sample precision. Precisions ranging from 4 bits to 32 bits have been synthesised. Alamouti's STBC encoder design [Alamouti, S.M. (1998), 'A Simple Transmit Diversity Technique for Wireless Communications', IEEE Journal on Selected Areas in Communications, 16:55-108.] proved to be the best trade-off, since it is on average 3.2 times smaller, 1.5 times faster and requires slightly less power than the next best trade-off in the comparison, which is a 3/4-rate full-diversity 3Tx-antenna STBC.
Autonomous Lawnmower using FPGA implementation.

NASA Astrophysics Data System (ADS)

Ahmad, Nabihah; Lokman, Nabill bin; Helmy Abd Wahab, Mohd

2016-11-01

Nowadays, there are various types of robot have been invented for multiple purposes. The robots have the special characteristic that surpass the human ability and could operate in extreme environment which human cannot endure. In this paper, an autonomous robot is built to imitate the characteristic of a human cutting grass. A Field Programmable Gate Array (FPGA) is used to control the movements where all data and information would be processed. Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) is used to describe the hardware using Quartus II software. This robot has the ability of avoiding obstacle using ultrasonic sensor. This robot used two DC motors for its movement. It could include moving forward, backward, and turning left and right. The movement or the path of the automatic lawn mower is based on a path planning technique. Four Global Positioning System (GPS) plot are set to create a boundary. This to ensure that the lawn mower operates within the area given by user. Every action of the lawn mower is controlled by the FPGA DE' Board Cyclone II with the help of the sensor. Furthermore, Sketch Up software was used to design the structure of the lawn mower. The autonomous lawn mower was able to operate efficiently and smoothly return to coordinated paths after passing the obstacle. It uses 25% of total pins available on the board and 31% of total Digital Signal Processing (DSP) blocks.
Theory and implementation of a very high throughput true random number generator in field programmable gate array

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Yonggang, E-mail: wangyg@ustc.edu.cn; Hui, Cong; Liu, Chong

The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving,more » so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.« less
Theory and implementation of a very high throughput true random number generator in field programmable gate array.

PubMed

Wang, Yonggang; Hui, Cong; Liu, Chong; Xu, Chao

2016-04-01

The contribution of this paper is proposing a new entropy extraction mechanism based on sampling phase jitter in ring oscillators to make a high throughput true random number generator in a field programmable gate array (FPGA) practical. Starting from experimental observation and analysis of the entropy source in FPGA, a multi-phase sampling method is exploited to harvest the clock jitter with a maximum entropy and fast sampling speed. This parametrized design is implemented in a Xilinx Artix-7 FPGA, where the carry chains in the FPGA are explored to realize the precise phase shifting. The generator circuit is simple and resource-saving, so that multiple generation channels can run in parallel to scale the output throughput for specific applications. The prototype integrates 64 circuit units in the FPGA to provide a total output throughput of 7.68 Gbps, which meets the requirement of current high-speed quantum key distribution systems. The randomness evaluation, as well as its robustness to ambient temperature, confirms that the new method in a purely digital fashion can provide high-speed high-quality random bit sequences for a variety of embedded applications.
High-Speed On-Board Data Processing for Science Instruments

NASA Technical Reports Server (NTRS)

Beyon, Jeffrey Y.; Ng, Tak-Kwong; Lin, Bing; Hu, Yongxiang; Harrison, Wallace

2014-01-01

A new development of on-board data processing platform has been in progress at NASA Langley Research Center since April, 2012, and the overall review of such work is presented in this paper. The project is called High-Speed On-Board Data Processing for Science Instruments (HOPS) and focuses on a high-speed scalable data processing platform for three particular National Research Council's Decadal Survey missions such as Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS), Aerosol-Cloud-Ecosystems (ACE), and Doppler Aerosol Wind Lidar (DAWN) 3-D Winds. HOPS utilizes advanced general purpose computing with Field Programmable Gate Array (FPGA) based algorithm implementation techniques. The significance of HOPS is to enable high speed on-board data processing for current and future science missions with its reconfigurable and scalable data processing platform. A single HOPS processing board is expected to provide approximately 66 times faster data processing speed for ASCENDS, more than 70% reduction in both power and weight, and about two orders of cost reduction compared to the state-of-the-art (SOA) on-board data processing system. Such benchmark predictions are based on the data when HOPS was originally proposed in August, 2011. The details of these improvement measures are also presented. The two facets of HOPS development are identifying the most computationally intensive algorithm segments of each mission and implementing them in a FPGA-based data processing board. A general introduction of such facets is also the purpose of this paper.
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA

NASA Astrophysics Data System (ADS)

Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing

Large-scale matrix inversion play an important role in many applications. However to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and the maximum power-performance of 41 can be achieved compare to Pentium Dual CPU with double SSE threads.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.

PubMed

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2017-03-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ ( h ), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h . Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O ( n 2 ) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures

PubMed Central

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2016-01-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O (n2) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation
Fpga based L-band pulse doppler radar design and implementation

NASA Astrophysics Data System (ADS)

Savci, Kubilay

As its name implies RADAR (Radio Detection and Ranging) is an electromagnetic sensor used for detection and locating targets from their return signals. Radar systems propagate electromagnetic energy, from the antenna which is in part intercepted by an object. Objects reradiate a portion of energy which is captured by the radar receiver. The received signal is then processed for information extraction. Radar systems are widely used for surveillance, air security, navigation, weather hazard detection, as well as remote sensing applications. In this work, an FPGA based L-band Pulse Doppler radar prototype, which is used for target detection, localization and velocity calculation has been built and a general-purpose Pulse Doppler radar processor has been developed. This radar is a ground based stationary monopulse radar, which transmits a short pulse with a certain pulse repetition frequency (PRF). Return signals from the target are processed and information about their location and velocity is extracted. Discrete components are used for the transmitter and receiver chain. The hardware solution is based on Xilinx Virtex-6 ML605 FPGA board, responsible for the control of the radar system and the digital signal processing of the received signal, which involves Constant False Alarm Rate (CFAR) detection and Pulse Doppler processing. The algorithm is implemented in MATLAB/SIMULINK using the Xilinx System Generator for DSP tool. The field programmable gate arrays (FPGA) implementation of the radar system provides the flexibility of changing parameters such as the PRF and pulse length therefore it can be used with different radar configurations as well. A VHDL design has been developed for 1Gbit Ethernet connection to transfer digitized return signal and detection results to PC. An A-Scope software has been developed with C# programming language to display time domain radar signals and detection results on PC. Data are processed both in FPGA chip and on PC. FPGA uses fixed
A FPGA-based architecture for real-time image matching

NASA Astrophysics Data System (ADS)

Wang, Jianhui; Zhong, Sheng; Xu, Wenhui; Zhang, Weijun; Cao, Zhiguo

2013-10-01

Image matching is a fundamental task in computer vision. It is used to establish correspondence between two images taken at different viewpoint or different time from the same scene. However, its large computational complexity has been a challenge to most embedded systems. This paper proposes a single FPGA-based image matching system, which consists of SIFT feature detection, BRIEF descriptor extraction and BRIEF matching. It optimizes the FPGA architecture for the SIFT feature detection to reduce the FPGA resources utilization. Moreover, we implement BRIEF description and matching on FPGA also. The proposed system can implement image matching at 30fps (frame per second) for 1280x720 images. Its processing speed can meet the demand of most real-life computer vision applications.
A space-efficient quantum computer simulator suitable for high-speed FPGA implementation

NASA Astrophysics Data System (ADS)

Frank, Michael P.; Oniciuc, Liviu; Meyer-Baese, Uwe H.; Chiorescu, Irinel

2009-05-01

Conventional vector-based simulators for quantum computers are quite limited in the size of the quantum circuits they can handle, due to the worst-case exponential growth of even sparse representations of the full quantum state vector as a function of the number of quantum operations applied. However, this exponential-space requirement can be avoided by using general space-time tradeoffs long known to complexity theorists, which can be appropriately optimized for this particular problem in a way that also illustrates some interesting reformulations of quantum mechanics. In this paper, we describe the design and empirical space/time complexity measurements of a working software prototype of a quantum computer simulator that avoids excessive space requirements. Due to its space-efficiency, this design is well-suited to embedding in single-chip environments, permitting especially fast execution that avoids access latencies to main memory. We plan to prototype our design on a standard FPGA development board.
Design of light-small high-speed image data processing system

NASA Astrophysics Data System (ADS)

Yang, Jinbao; Feng, Xue; Li, Fei

2015-10-01

A light-small high speed image data processing system was designed in order to meet the request of image data processing in aerospace. System was constructed of FPGA, DSP and MCU (Micro-controller), implementing a video compress of 3 million pixels@15frames and real-time return of compressed image to the upper system. Programmable characteristic of FPGA, high performance image compress IC and configurable MCU were made best use to improve integration. Besides, hard-soft board design was introduced and PCB layout was optimized. At last, system achieved miniaturization, light-weight and fast heat dispersion. Experiments show that, system's multifunction was designed correctly and worked stably. In conclusion, system can be widely used in the area of light-small imaging.
Logic design and implementation of FPGA for a high frame rate ultrasound imaging system

NASA Astrophysics Data System (ADS)

Liu, Anjun; Wang, Jing; Lu, Jian-Yu

2002-05-01

Recently, a method has been developed for high frame rate medical imaging [Jian-yu Lu, ``2D and 3D high frame rate imaging with limited diffraction beams,'' IEEE Trans. Ultrason. Ferroelectr. Freq. Control 44(4), 839-856 (1997)]. To realize this method, a complicated system [multiple-channel simultaneous data acquisition, large memory in each channel for storing up to 16 seconds of data at 40 MHz and 12-bit resolution, time-variable-gain (TGC) control, Doppler imaging, harmonic imaging, as well as coded transmissions] is designed. Due to the complexity of the system, field programmable gate array (FPGA) (Xilinx Spartn II) is used. In this presentation, the design and implementation of the FPGA for the system will be reported. This includes the synchronous dynamic random access memory (SDRAM) controller and other system controllers, time sharing for auto-refresh of SDRAMs to reduce peak power, transmission and imaging modality selections, ECG data acquisition and synchronization, 160 MHz delay locked loop (DLL) for accurate timing, and data transfer via either a parallel port or a PCI bus for post image processing. [Work supported in part by Grant 5RO1 HL60301 from NIH.

A high data rate universal lattice decoder on FPGA

NASA Astrophysics Data System (ADS)

Ma, Jing; Huang, Xinming; Kura, Swapna

2005-06-01

This paper presents the architecture design of a high data rate universal lattice decoder for MIMO channels on FPGA platform. A phost strategy based lattice decoding algorithm is modified in this paper to reduce the complexity of the closest lattice point search. The data dependency of the improved algorithm is examined and a parallel and pipeline architecture is developed with the iterative decoding function on FPGA and the division intensive channel matrix preprocessing on DSP. Simulation results demonstrate that the improved lattice decoding algorithm provides better bit error rate and less iteration number compared with the original algorithm. The system prototype of the decoder shows that it supports data rate up to 7Mbit/s on a Virtex2-1000 FPGA, which is about 8 times faster than the original algorithm on FPGA platform and two-orders of magnitude better than its implementation on a DSP platform.
FPGA-based Klystron linearization implementations in scope of ILC

DOE PAGES

Omet, M.; Michizono, S.; Matsumoto, T.; ...

2015-01-23

We report the development and implementation of four FPGA-based predistortion-type klystron linearization algorithms. Klystron linearization is essential for the realization of ILC, since it is required to operate the klystrons 7% in power below their saturation. The work presented was performed in international collaborations at the Fermi National Accelerator Laboratory (FNAL), USA and the Deutsches Elektronen Synchrotron (DESY), Germany. With the newly developed algorithms, the generation of correction factors on the FPGA was improved compared to past algorithms, avoiding quantization and decreasing memory requirements. At FNAL, three algorithms were tested at the Advanced Superconducting Test Accelerator (ASTA), demonstrating a successfulmore » implementation for one algorithm and a proof of principle for two algorithms. Furthermore, the functionality of the algorithm implemented at DESY was demonstrated successfully in a simulation.« less
Estimating the circuit delay of FPGA with a transfer learning method

NASA Astrophysics Data System (ADS)

Cui, Xiuhai; Liu, Datong; Peng, Yu; Peng, Xiyuan

2017-10-01

With the increase of FPGA (Field Programmable Gate Array, FPGA) functionality, FPGA has become an on-chip system platform. Due to increase the complexity of FPGA, estimating the delay of FPGA is a very challenge work. To solve the problems, we propose a transfer learning estimation delay (TLED) method to simplify the delay estimation of different speed grade FPGA. In fact, the same style different speed grade FPGA comes from the same process and layout. The delay has some correlation among different speed grade FPGA. Therefore, one kind of speed grade FPGA is chosen as a basic training sample in this paper. Other training samples of different speed grade can get from the basic training samples through of transfer learning. At the same time, we also select a few target FPGA samples as training samples. A general predictive model is trained by these samples. Thus one kind of estimation model is used to estimate different speed grade FPGA circuit delay. The framework of TRED includes three phases: 1) Building a basic circuit delay library which includes multipliers, adders, shifters, and so on. These circuits are used to train and build the predictive model. 2) By contrasting experiments among different algorithms, the forest random algorithm is selected to train predictive model. 3) The target circuit delay is predicted by the predictive model. The Artix-7, Kintex-7, and Virtex-7 are selected to do experiments. Each of them includes -1, -2, -2l, and -3 different speed grade. The experiments show the delay estimation accuracy score is more than 92% with the TLED method. This result shows that the TLED method is a feasible delay assessment method, especially in the high-level synthesis stage of FPGA tool, which is an efficient and effective delay assessment method.
High-speed polarization sensitive optical coherence tomography for retinal diagnostics

NASA Astrophysics Data System (ADS)

Yin, Biwei; Wang, Bingqing; Vemishetty, Kalyanramu; Nagle, Jim; Liu, Shuang; Wang, Tianyi; Rylander, Henry G., III; Milner, Thomas E.

2012-01-01

We report design and construction of an FPGA-based high-speed swept-source polarization-sensitive optical coherence tomography (SS-PS-OCT) system for clinical retinal imaging. Clinical application of the SS-PS-OCT system is accurate measurement and display of thickness, phase retardation and birefringence maps of the retinal nerve fiber layer (RNFL) in human subjects for early detection of glaucoma. The FPGA-based SS-PS-OCT system provides three incident polarization states on the eye and uses a bulk-optic polarization sensitive balanced detection module to record two orthogonal interference fringe signals. Interference fringe signals and relative phase retardation between two orthogonal polarization states are used to obtain Stokes vectors of light returning from each RNFL depth. We implement a Levenberg-Marquardt algorithm on a Field Programmable Gate Array (FPGA) to compute accurate phase retardation and birefringence maps. For each retinal scan, a three-state Levenberg-Marquardt nonlinear algorithm is applied to 360 clusters each consisting of 100 A-scans to determine accurate maps of phase retardation and birefringence in less than 1 second after patient measurement allowing real-time clinical imaging-a speedup of more than 300 times over previous implementations. We report application of the FPGA-based SS-PS-OCT system for real-time clinical imaging of patients enrolled in a clinical study at the Eye Institute of Austin and Duke Eye Center.
A high performance hardware implementation image encryption with AES algorithm

NASA Astrophysics Data System (ADS)

Farmani, Ali; Jafari, Mohamad; Miremadi, Seyed Sohrab

2011-06-01

This paper describes implementation of a high-speed encryption algorithm with high throughput for encrypting the image. Therefore, we select a highly secured symmetric key encryption algorithm AES(Advanced Encryption Standard), in order to increase the speed and throughput using pipeline technique in four stages, control unit based on logic gates, optimal design of multiplier blocks in mixcolumn phase and simultaneous production keys and rounds. Such procedure makes AES suitable for fast image encryption. Implementation of a 128-bit AES on FPGA of Altra company has been done and the results are as follow: throughput, 6 Gbps in 471MHz. The time of encrypting in tested image with 32*32 size is 1.15ms.
Implementation of 4-way Superscalar Hash MIPS Processor Using FPGA

NASA Astrophysics Data System (ADS)

Sahib Omran, Safaa; Fouad Jumma, Laith

2018-05-01

Due to the quick advancements in the personal communications systems and wireless communications, giving data security has turned into a more essential subject. This security idea turns into a more confounded subject when next-generation system requirements and constant calculation speed are considered in real-time. Hash functions are among the most essential cryptographic primitives and utilized as a part of the many fields of signature authentication and communication integrity. These functions are utilized to acquire a settled size unique fingerprint or hash value of an arbitrary length of message. In this paper, Secure Hash Algorithms (SHA) of types SHA-1, SHA-2 (SHA-224, SHA-256) and SHA-3 (BLAKE) are implemented on Field-Programmable Gate Array (FPGA) in a processor structure. The design is described and implemented using a hardware description language, namely VHSIC “Very High Speed Integrated Circuit” Hardware Description Language (VHDL). Since the logical operation of the hash types of (SHA-1, SHA-224, SHA-256 and SHA-3) are 32-bits, so a Superscalar Hash Microprocessor without Interlocked Pipelines (MIPS) processor are designed with only few instructions that were required in invoking the desired Hash algorithms, when the four types of hash algorithms executed sequentially using the designed processor, the total time required equal to approximately 342 us, with a throughput of 4.8 Mbps while the required to execute the same four hash algorithms using the designed four-way superscalar is reduced to 237 us with improved the throughput to 5.1 Mbps.
Single software platform used for high speed data transfer implementation in a 65k pixel camera working in single photon counting mode

NASA Astrophysics Data System (ADS)

Maj, P.; Kasiński, K.; Gryboś, P.; Szczygieł, R.; Kozioł, A.

2015-12-01

Integrated circuits designed for specific applications generally use non-standard communication methods. Hybrid pixel detector readout electronics produces a huge amount of data as a result of number of frames per seconds. The data needs to be transmitted to a higher level system without limiting the ASIC's capabilities. Nowadays, the Camera Link interface is still one of the fastest communication methods, allowing transmission speeds up to 800 MB/s. In order to communicate between a higher level system and the ASIC with a dedicated protocol, an FPGA with dedicated code is required. The configuration data is received from the PC and written to the ASIC. At the same time, the same FPGA should be able to transmit the data from the ASIC to the PC at the very high speed. The camera should be an embedded system enabling autonomous operation and self-monitoring. In the presented solution, at least three different hardware platforms are used—FPGA, microprocessor with real-time operating system and the PC with end-user software. We present the use of a single software platform for high speed data transfer from 65k pixel camera to the personal computer.
High speed true random number generator with a new structure of coarse-tuning PDL in FPGA

NASA Astrophysics Data System (ADS)

Fang, Hongzhen; Wang, Pengjun; Cheng, Xu; Zhou, Keji

2018-03-01

A metastability-based TRNG (true random number generator) is presented in this paper, and implemented in FPGA. The metastable state of a D flip-flop is tunable through a two-stage PDL (programmable delay line). With the proposed coarse-tuning PDL structure, the TRNG core does not require extra placement and routing to ensure its entropy. Furthermore, the core needs fewer stages of coarse-tuning PDL at higher operating frequency, and thus saves more resources in FPGA. The designed TRNG achieves 25 Mbps @ 100 MHz throughput after proper post-processing, which is several times higher than other previous TRNGs based on FPGA. Moreover, the robustness of the system is enhanced with the adoption of a feedback system. The quality of the designed TRNG is verified by NIST (National Institute of Standards and Technology) and also accepted by class P1 of the AIS-20/31 test suite. Project supported by the S&T Plan of Zhejiang Provincial Science and Technology Department (No. 2016C31078), the National Natural Science Foundation of China (Nos. 61574041, 61474068, 61234002), and the K.C. Wong Magna Fund in Ningbo University, China.
Hardware Design and Implementation of Fixed-Width Standard and Truncated 4×4, 6×6, 8×8 and 12×12-BIT Multipliers Using Fpga

NASA Astrophysics Data System (ADS)

Rais, Muhammad H.

2010-06-01

This paper presents Field Programmable Gate Array (FPGA) implementation of standard and truncated multipliers using Very High Speed Integrated Circuit Hardware Description Language (VHDL). Truncated multiplier is a good candidate for digital signal processing (DSP) applications such as finite impulse response (FIR) and discrete cosine transform (DCT). Remarkable reduction in FPGA resources, delay, and power can be achieved using truncated multipliers instead of standard parallel multipliers when the full precision of the standard multiplier is not required. The truncated multipliers show significant improvement as compared to standard multipliers. Results show that the anomaly in Spartan-3 AN average connection and maximum pin delay have been efficiently reduced in Virtex-4 device.
The effect of structural design parameters on FPGA-based feed-forward space-time trellis coding-orthogonal frequency division multiplexing channel encoders

NASA Astrophysics Data System (ADS)

Passas, Georgios; Freear, Steven; Fawcett, Darren

2010-08-01

Orthogonal frequency division multiplexing (OFDM)-based feed-forward space-time trellis code (FFSTTC) encoders can be synthesised as very high speed integrated circuit hardware description language (VHDL) designs. Evaluation of their FPGA implementation can lead to conclusions that help a designer to decide the optimum implementation, given the encoder structural parameters. VLSI architectures based on 1-bit multipliers and look-up tables (LUTs) are compared in terms of FPGA slices and block RAMs (area), as well as in terms of minimum clock period (speed). Area and speed graphs versus encoder memory order are provided for quadrature phase shift keying (QPSK) and 8 phase shift keying (8-PSK) modulation and two transmit antennas, revealing best implementation under these conditions. The effect of number of modulation bits and transmit antennas on the encoder implementation complexity is also investigated.
FPGA Boot Loader and Scrubber

NASA Technical Reports Server (NTRS)

Wade, Randall S.; Jones, Bailey

2009-01-01

A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), reads back and verifies that code, reloads the code if an error is detected, and monitors the performance of the FPGA for errors in the presence of radiation. The program consists mainly of a set of VHDL files (wherein "VHDL" signifies "VHSIC Hardware Description Language" and "VHSIC" signifies "very-high-speed integrated circuit").
FPGA Implementation of an Efficient Algorithm for the Calculation of Charged Particle Trajectories in Cosmic Ray Detectors

NASA Astrophysics Data System (ADS)

Villar, Xabier; Piso, Daniel; Bruguera, Javier D.

2014-02-01

This paper presents an FPGA implementation of an algorithm, previously published, for the the reconstruction of cosmic rays' trajectories and the determination of the time of arrival and velocity of the particles. The accuracy and precision issues of the algorithm have been analyzed to propose a suitable implementation. Thus, a 32-bit fixed-point format has been used for the representation of the data values. Moreover, the dependencies among the different operations have been taken into account to obtain a highly parallel and efficient hardware implementation. The final hardware architecture requires 18 cycles to process every particle, and has been exhaustively simulated to validate all the design decisions. The architecture has been mapped over different commercial FPGAs, with a frequency of operation ranging from 300 MHz to 1.3 GHz, depending on the FPGA being used. Consequently, the number of particle trajectories processed per second is between 16 million and 72 million. The high number of particle trajectories calculated per second shows that the proposed FPGA implementation might be used also in high rate environments such as those found in particle and nuclear physics experiments.
A novel FPGA-programmable switch matrix interconnection element in quantum-dot cellular automata

NASA Astrophysics Data System (ADS)

Hashemi, Sara; Rahimi Azghadi, Mostafa; Zakerolhosseini, Ali; Navi, Keivan

2015-04-01

The Quantum-dot cellular automata (QCA) is a novel nanotechnology, promising extra low-power, extremely dense and very high-speed structure for the construction of logical circuits at a nanoscale. In this paper, initially previous works on QCA-based FPGA's routing elements are investigated, and then an efficient, symmetric and reliable QCA programmable switch matrix (PSM) interconnection element is introduced. This element has a simple structure and offers a complete routing capability. It is implemented using a bottom-up design approach that starts from a dense and high-speed 2:1 multiplexer and utilise it to build the target PSM interconnection element. In this study, simulations of the proposed circuits are carried out using QCAdesigner, a layout and simulation tool for QCA circuits. The results demonstrate high efficiency of the proposed designs in QCA-based FPGA routing.
PCIE interface design for high-speed image storage system based on SSD

NASA Astrophysics Data System (ADS)

Wang, Shiming

2015-02-01

This paper proposes and implements a standard interface of miniaturized high-speed image storage system, which combines PowerPC with FPGA and utilizes PCIE bus as the high speed switching channel. Attached to the PowerPC, mSATA interface SSD(Solid State Drive) realizes RAID3 array storage. At the same time, a high-speed real-time image compression patent IP core also can be embedded in FPGA, which is in the leading domestic level with compression rate and image quality, making that the system can record higher image data rate or achieve longer recording time. The notebook memory card buckle type design is used in the mSATA interface SSD, which make it possible to complete the replacement in 5 seconds just using single hand, thus the total length of repeated recordings is increased. MSI (Message Signaled Interrupts) interruption guarantees the stability and reliability of continuous DMA transmission. Furthermore, only through the gigabit network, the remote display, control and upload to backup function can be realized. According to an optional 25 frame/s or 30 frame/s, upload speeds can be up to more than 84 MB/s. Compared with the existing FLASH array high-speed memory systems, it has higher degree of modularity, better stability and higher efficiency on development, maintenance and upgrading. Its data access rate is up to 300MB/s, realizing the high speed image storage system miniaturization, standardization and modularization, thus it is fit for image acquisition, storage and real-time transmission to server on mobile equipment.
Implementation of a pulse coupled neural network in FPGA.

PubMed

Waldemark, J; Millberg, M; Lindblad, T; Waldemark, K; Becanovic, V

2000-06-01

The Pulse Coupled neural network, PCNN, is a biologically inspired neural net and it can be used in various image analysis applications, e.g. time-critical applications in the field of image pre-processing like segmentation, filtering, etc. a VHDL implementation of the PCNN targeting FPGA was undertaken and the results presented here. The implementation contains many interesting features. By pipelining the PCNN structure a very high throughput of 55 million neuron iterations per second could be achieved. By making the coefficients re-configurable during operation, a complete recognition system could be implemented on one, or maybe two, chip(s). Reconsidering the ranges and resolutions of the constants may save a lot of hardware, since the higher resolution requires larger multipliers, adders, memories etc.
Implementation of a high precision multi-measurement time-to-digital convertor on a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Kuang, Jie; Wang, Yonggang; Cao, Qiang; Liu, Chong

2018-05-01

Time-to-digital convertors (TDCs) based on field programmable gate array (FPGA) are becoming more and more popular. Multi-measurement is an effective method to improve TDC precision beyond the cell delay limitation. However, the implementation of TDC with multi-measurement on FPGAs manufactured with 28 nm and more advanced process is facing new challenges. Benefiting from the ones-counter encoding scheme, which was developed in our previous work, we implement a ring oscillator multi-measurement TDC on a Xilinx Kintex-7 FPGA. Using the two TDC channels to measure time-intervals in the range (0 ns-30 ns), the average RMS precision can be improved to 5.76 ps, meanwhile the logic resource usage remains the same with the one-measurement TDC, and the TDC dead time is only 22 ns. The investigation demonstrates that the multi-measurement methods are still available for current main-stream FPGAs. Furthermore, the new implementation in this paper could make the trade-off among the time precision, resource usage and TDC dead time better than ever before.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

PubMed

Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

2013-01-01

DNA sequence alignment is a cardinal process in computational biology but also is much expensive computationally when performing through traditional computational platforms like CPU. Of many off the shelf platforms explored for speeding up the computation process, FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make FPGA as the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimization over previous FPGA implementation that increases the overall speed-up achieved as well as the price incurred by the platform that was optimized. The optimizations are (1) The array of processing elements is made to run on change in input value and not on clock, so eliminating the need for tight clock synchronization, (2) the implementation is unrestrained by the size of the sequences to be aligned, (3) the waiting time required for the sequences to load to FPGA is reduced to the minimum possible and (4) an efficient method is devised to store the output matrix that make possible to save the diagonal elements to be used in next pass, in parallel with the computation of output matrix. Implemented on Spartan3 FPGA, this implementation achieved 20 times performance improvement in terms of CUPS over GPP implementation.
The dynamical analysis of modified two-compartment neuron model and FPGA implementation

NASA Astrophysics Data System (ADS)

Lin, Qianjin; Wang, Jiang; Yang, Shuangming; Yi, Guosheng; Deng, Bin; Wei, Xile; Yu, Haitao

2017-10-01

The complexity of neural models is increasing with the investigation of larger biological neural network, more various ionic channels and more detailed morphologies, and the implementation of biological neural network is a task with huge computational complexity and power consumption. This paper presents an efficient digital design using piecewise linearization on field programmable gate array (FPGA), to succinctly implement the reduced two-compartment model which retains essential features of more complicated models. The design proposes an approximate neuron model which is composed of a set of piecewise linear equations, and it can reproduce different dynamical behaviors to depict the mechanisms of a single neuron model. The consistency of hardware implementation is verified in terms of dynamical behaviors and bifurcation analysis, and the simulation results including varied ion channel characteristics coincide with the biological neuron model with a high accuracy. Hardware synthesis on FPGA demonstrates that the proposed model has reliable performance and lower hardware resource compared with the original two-compartment model. These investigations are conducive to scalability of biological neural network in reconfigurable large-scale neuromorphic system.
Parallel Hough Transform-Based Straight Line Detection and Its FPGA Implementation in Embedded Vision

PubMed Central

Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam

2013-01-01

Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness. PMID:23867746
Parallel Hough Transform-based straight line detection and its FPGA implementation in embedded vision.

PubMed

Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam

2013-07-17

Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.

FPGA implementation of self organizing map with digital phase locked loops.

PubMed

Hikawa, Hiroomi

2005-01-01

The self-organizing map (SOM) has found applicability in a wide range of application areas. Recently new SOM hardware with phase modulated pulse signal and digital phase-locked loops (DPLLs) has been proposed (Hikawa, 2005). The system uses the DPLL as a computing element since the operation of the DPLL is very similar to that of SOM's computation. The system also uses square waveform phase to hold the value of the each input vector element. This paper discuss the hardware implementation of the DPLL SOM architecture. For effective hardware implementation, some components are redesigned to reduce the circuit size. The proposed SOM architecture is described in VHDL and implemented on field programmable gate array (FPGA). Its feasibility is verified by experiments. Results show that the proposed SOM implemented on the FPGA has a good quantization capability, and its circuit size very small.
FPGA Implementation of Metastability-Based True Random Number Generator

NASA Astrophysics Data System (ADS)

Hata, Hisashi; Ichikawa, Shuichi

True random number generators (TRNGs) are important as a basis for computer security. Though there are some TRNGs composed of analog circuit, the use of digital circuits is desired for the application of TRNGs to logic LSIs. Some of the digital TRNGs utilize jitter in free-running ring oscillators as a source of entropy, which consume large power. Another type of TRNG exploits the metastability of a latch to generate entropy. Although this kind of TRNG has been mostly implemented with full-custom LSI technology, this study presents an implementation based on common FPGA technology. Our TRNG is comprised of logic gates only, and can be integrated in any kind of logic LSI. The RS latch in our TRNG is implemented as a hard-macro to guarantee the quality of randomness by minimizing the signal skew and load imbalance of internal nodes. To improve the quality and throughput, the output of 64-256 latches are XOR'ed. The derived design was verified on a Xilinx Virtex-4 FPGA (XC4VFX20), and passed NIST statistical test suite without post-processing. Our TRNG with 256 latches occupies 580 slices, while achieving 12.5Mbps throughput.
High-Speed Digital Scan Converter for High-Frequency Ultrasound Sector Scanners

PubMed Central

Chang, Jin Ho; Yen, Jesse T.; Shung, K. Kirk

2008-01-01

This paper presents a high-speed digital scan converter (DSC) capable of providing more than 400 images per second, which is necessary to examine the activities of the mouse heart whose rate is 5–10 beats per second. To achieve the desired high-speed performance in cost-effective manner, the DSC developed adopts a linear interpolation algorithm in which two nearest samples to each object pixel of a monitor are selected and only angular interpolation is performed. Through computer simulation with the Field II program, its accuracy was investigated by comparing it to that of bilinear interpolation known as the best algorithm in terms of accuracy and processing speed. The simulation results show that the linear interpolation algorithm is capable of providing an acceptable image quality, which means that the difference of the root mean square error (RMSE) values of the linear and bilinear interpolation algorithms is below 1 %, if the sample rate of the envelope samples is at least four times higher than the Nyquist rate for the baseband component of echo signals. The designed DSC was implemented with a single FPGA (Stratix EP1S60F1020C6, Altera Corporation, San Jose, CA) on a DSC board that is a part of a high-speed ultrasound imaging system developed. The temporal and spatial resolutions of the implemented DSC were evaluated by examining its maximum processing time with a time stamp indicating when an image is completely formed and wire phantom testing, respectively. The experimental results show that the implemented DSC is capable of providing images at the rate of 400 images per second with negligible processing error. PMID:18430449
FPGA-Based Efficient Hardware/Software Co-Design for Industrial Systems with Consideration of Output Selection

NASA Astrophysics Data System (ADS)

Deliparaschos, Kyriakos M.; Michail, Konstantinos; Zolotas, Argyrios C.; Tzafestas, Spyros G.

2016-05-01

This work presents a field programmable gate array (FPGA)-based embedded software platform coupled with a software-based plant, forming a hardware-in-the-loop (HIL) that is used to validate a systematic sensor selection framework. The systematic sensor selection framework combines multi-objective optimization, linear-quadratic-Gaussian (LQG)-type control, and the nonlinear model of a maglev suspension. A robustness analysis of the closed-loop is followed (prior to implementation) supporting the appropriateness of the solution under parametric variation. The analysis also shows that quantization is robust under different controller gains. While the LQG controller is implemented on an FPGA, the physical process is realized in a high-level system modeling environment. FPGA technology enables rapid evaluation of the algorithms and test designs under realistic scenarios avoiding heavy time penalty associated with hardware description language (HDL) simulators. The HIL technique facilitates significant speed-up in the required execution time when compared to its software-based counterpart model.
High-speed real-time OFDM transmission based on FPGA

NASA Astrophysics Data System (ADS)

Xiao, Xin; Li, Fan; Yu, Jianjun

2016-02-01

In this paper, we review our recent research progresses on real-time orthogonal frequency division multiplexing (OFDM) transmission based on FPGA. We successfully demonstrated four-channel wavelength-division multiplexing (WDM) 256.51Gb/s 16-ary quadrature amplitude modulation (16QAM)-OFDM signal transmission system for short-reach optical amplifier free inter-connection with real-time reception. Four optical carriers are modulated by four different 16QAM-OFDM signals via 10G-class direct modulation lasers (DMLs). We achieved highest capacity real-time reception optical OFDM signal transmission over 2.4-km SMF with the bit-error ratio (BER) under soft-decision forward error correction (SD-FEC) limitation of 2.4×10-2. In order to achieve higher spectrum efficiency (SE), we demonstrate 4-channel high level QAM-OFDM transmission over 20-km SMF-28 with real-time reception. 58.72-Gb/s 256QAM-OFDM and 56.4-Gb/s 128QAM-OFDM signal transmission within 25-GHz grid is achieved with the BER under 2.4×10-2 and real-time reception.
Parallel Fixed Point Implementation of a Radial Basis Function Network in an FPGA

PubMed Central

de Souza, Alisson C. D.; Fernandes, Marcelo A. C.

2014-01-01

This paper proposes a parallel fixed point radial basis function (RBF) artificial neural network (ANN), implemented in a field programmable gate array (FPGA) trained online with a least mean square (LMS) algorithm. The processing time and occupied area were analyzed for various fixed point formats. The problems of precision of the ANN response for nonlinear classification using the XOR gate and interpolation using the sine function were also analyzed in a hardware implementation. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA. PMID:25268918
AES Cardless Automatic Teller Machine (ATM) Biometric Security System Design Using FPGA Implementation

NASA Astrophysics Data System (ADS)

Ahmad, Nabihah; Rifen, A. Aminurdin M.; Helmy Abd Wahab, Mohd

2016-11-01

Automated Teller Machine (ATM) is an electronic banking outlet that allows bank customers to complete a banking transactions without the aid of any bank official or teller. Several problems are associated with the use of ATM card such card cloning, card damaging, card expiring, cast skimming, cost of issuance and maintenance and accessing customer account by third parties. The aim of this project is to give a freedom to the user by changing the card to biometric security system to access the bank account using Advanced Encryption Standard (AES) algorithm. The project is implemented using Field Programmable Gate Array (FPGA) DE2-115 board with Cyclone IV device, fingerprint scanner, and Multi-Touch Liquid Crystal Display (LCD) Second Edition (MTL2) using Very High Speed Integrated Circuit Hardware (VHSIC) Description Language (VHDL). This project used 128-bits AES for recommend the device with the throughput around 19.016Gbps and utilized around 520 slices. This design offers a secure banking transaction with a low rea and high performance and very suited for restricted space environments for small amounts of RAM or ROM where either encryption or decryption is performed.
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark
A digitalized silicon microgyroscope based on embedded FPGA.

PubMed

Xia, Dunzhu; Yu, Cheng; Wang, Yuliang

2012-09-27

This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system.
A Digitalized Silicon Microgyroscope Based on Embedded FPGA

PubMed Central

Xia, Dunzhu; Yu, Cheng; Wang, Yuliang

2012-01-01

This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system. PMID:23201990
FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling.

PubMed

Kim, Chang-Min; Park, Hyung-Min; Kim, Taesu; Choi, Yoon-Kyung; Lee, Soo-Young

2003-01-01

An field programmable gate array (FPGA) implementation of independent component analysis (ICA) algorithm is reported for blind signal separation (BSS) and adaptive noise canceling (ANC) in real time. In order to provide enormous computing power for ICA-based algorithms with multipath reverberation, a special digital processor is designed and implemented in FPGA. The chip design fully utilizes modular concept and several chips may be put together for complex applications with a large number of noise sources. Experimental results with a fabricated test board are reported for ANC only, BSS only, and simultaneous ANC/BSS, which demonstrates successful speech enhancement in real environments in real time.
VHDL Descriptions for the FPGA Implementation of PWL-Function-Based Multi-Scroll Chaotic Oscillators

PubMed Central

2016-01-01

Nowadays, chaos generators are an attractive field for research and the challenge is their realization for the development of engineering applications. From more than three decades ago, chaotic oscillators have been designed using discrete electronic devices, very few with integrated circuit technology, and in this work we propose the use of field-programmable gate arrays (FPGAs) for fast prototyping. FPGA-based applications require that one be expert on programming with very-high-speed integrated circuits hardware description language (VHDL). In this manner, we detail the VHDL descriptions of chaos generators for fast prototyping from high-level programming using Python. The cases of study are three kinds of chaos generators based on piecewise-linear (PWL) functions that can be systematically augmented to generate even and odd number of scrolls. We introduce new algorithms for the VHDL description of PWL functions like saturated functions series, negative slopes and sawtooth. The generated VHDL-code is portable, reusable and open source to be synthesized in an FPGA. Finally, we show experimental results for observing 2, 10 and 30-scroll attractors. PMID:27997930
VHDL Descriptions for the FPGA Implementation of PWL-Function-Based Multi-Scroll Chaotic Oscillators.

PubMed

Tlelo-Cuautle, Esteban; Quintas-Valles, Antonio de Jesus; de la Fraga, Luis Gerardo; Rangel-Magdaleno, Jose de Jesus

2016-01-01

Nowadays, chaos generators are an attractive field for research and the challenge is their realization for the development of engineering applications. From more than three decades ago, chaotic oscillators have been designed using discrete electronic devices, very few with integrated circuit technology, and in this work we propose the use of field-programmable gate arrays (FPGAs) for fast prototyping. FPGA-based applications require that one be expert on programming with very-high-speed integrated circuits hardware description language (VHDL). In this manner, we detail the VHDL descriptions of chaos generators for fast prototyping from high-level programming using Python. The cases of study are three kinds of chaos generators based on piecewise-linear (PWL) functions that can be systematically augmented to generate even and odd number of scrolls. We introduce new algorithms for the VHDL description of PWL functions like saturated functions series, negative slopes and sawtooth. The generated VHDL-code is portable, reusable and open source to be synthesized in an FPGA. Finally, we show experimental results for observing 2, 10 and 30-scroll attractors.
Intelligent FPGA Data Acquisition Framework

NASA Astrophysics Data System (ADS)

Bai, Yunpeng; Gaisbauer, Dominic; Huber, Stefan; Konorov, Igor; Levit, Dmytro; Steffen, Dominik; Paul, Stephan

2017-06-01

In this paper, we present the field programmable gate arrays (FPGA)-based framework intelligent FPGA data acquisition (IFDAQ), which is used for the development of DAQ systems for detectors in high-energy physics. The framework supports Xilinx FPGA and provides a collection of IP cores written in very high speed integrated circuit hardware description language, which use the common interconnect interface. The IP core library offers functionality required for the development of the full DAQ chain. The library consists of Serializer/Deserializer (SERDES)-based time-to-digital conversion channels, an interface to a multichannel 80-MS/s 10-b analog-digital conversion, data transmission, and synchronization protocol between FPGAs, event builder, and slow control. The functionality is distributed among FPGA modules built in the AMC form factor: front end and data concentrator. This modular design also helps to scale and adapt the DAQ system to the needs of the particular experiment. The first application of the IFDAQ framework is the upgrade of the read-out electronics for the drift chambers and the electromagnetic calorimeters (ECALs) of the COMPASS experiment at CERN. The framework will be presented and discussed in the context of this paper.
Parallel point-multiplication architecture using combined group operations for high-speed cryptographic applications.

PubMed

Hossain, Md Selim; Saeedi, Ehsan; Kong, Yinan

2017-01-01

In this paper, we propose a novel parallel architecture for fast hardware implementation of elliptic curve point multiplication (ECPM), which is the key operation of an elliptic curve cryptography processor. The point multiplication over binary fields is synthesized on both FPGA and ASIC technology by designing fast elliptic curve group operations in Jacobian projective coordinates. A novel combined point doubling and point addition (PDPA) architecture is proposed for group operations to achieve high speed and low hardware requirements for ECPM. It has been implemented over the binary field which is recommended by the National Institute of Standards and Technology (NIST). The proposed ECPM supports two Koblitz and random curves for the key sizes 233 and 163 bits. For group operations, a finite-field arithmetic operation, e.g. multiplication, is designed on a polynomial basis. The delay of a 233-bit point multiplication is only 3.05 and 3.56 μs, in a Xilinx Virtex-7 FPGA, for Koblitz and random curves, respectively, and 0.81 μs in an ASIC 65-nm technology, which are the fastest hardware implementation results reported in the literature to date. In addition, a 163-bit point multiplication is also implemented in FPGA and ASIC for fair comparison which takes around 0.33 and 0.46 μs, respectively. The area-time product of the proposed point multiplication is very low compared to similar designs. The performance ([Formula: see text]) and Area × Time × Energy (ATE) product of the proposed design are far better than the most significant studies found in the literature.
A minimal SATA III Host Controller based on FPGA

NASA Astrophysics Data System (ADS)

Liu, Hailiang

2018-03-01

SATA (Serial Advanced Technology Attachment) is an advanced serial bus which has a outstanding performance in transmitting high speed real-time data applied in Personal Computers, Financial Industry, astronautics and aeronautics, etc. In this express, a minimal SATA III Host Controller based on Xilinx Kintex 7 serial FPGA is designed and implemented. Compared to the state-of-art, registers utilization are reduced 25.3% and LUTs utilization are reduced 65.9%. According to the experimental results, the controller works precisely and steady with the reading bandwidth of up to 536 MB per second and the writing bandwidth of up to 512 MB per second, both of which are close to the maximum bandwidth of the SSD(Solid State Disk) device. The host controller is very suitable for high speed data transmission and mass data storage.
Software interface for high-speed readout of particle detectors based on the CoaXPress communication standard

NASA Astrophysics Data System (ADS)

Hejtmánek, M.; Neue, G.; Voleš, P.

2015-06-01

This article is devoted to the software design and development of a high-speed readout application used for interfacing particle detectors via the CoaXPress communication standard. The CoaXPress provides an asymmetric high-speed serial connection over a single coaxial cable. It uses a widely available 75 Ω BNC standard and can operate in various modes with a data throughput ranging from 1.25 Gbps up to 25 Gbps. Moreover, it supports a low speed uplink with a fixed bit rate of 20.833 Mbps, which can be used to control and upload configuration data to the particle detector. The CoaXPress interface is an upcoming standard in medical imaging, therefore its usage promises long-term compatibility and versatility. This work presents an example of how to develop DAQ system for a pixel detector. For this purpose, a flexible DAQ card was developed using the XILINX Spartan 6 FPGA. The DAQ card is connected to the framegrabber FireBird CXP6 Quad, which is plugged in the PCI Express bus of the standard PC. The data transmission was performed between the FPGA and framegrabber card via the standard coaxial cable in communication mode with a bit rate of 3.125 Gbps. Using the Medipix2 Quad pixel detector, the framerate of 100 fps was achieved. The front-end application makes use of the FireBird framegrabber software development kit and is suitable for data acquisition as well as control of the detector through the registers implemented in the FPGA.
Research of x-ray nondestructive detector for high-speed running conveyor belt with steel wire ropes

NASA Astrophysics Data System (ADS)

Wang, Junfeng; Miao, Changyun; Wang, Wei; Lu, Xiaocui

2008-03-01

An X-ray nondestructive detector for high-speed running conveyor belt with steel wire ropes is researched in the paper. The principle of X-ray nondestructive testing (NDT) is analyzed, the general scheme of the X-ray nondestructive testing system is proposed, and the nondestructive detector for high-speed running conveyor belt with steel wire ropes is developed. The hardware of system is designed with Xilinx's VIRTEX-4 FPGA that embeds PowerPC and MAC IP core, and its network communication software based on TCP/IP protocol is programmed by loading LwIP to PowerPC. The nondestructive testing of high-speed conveyor belt with steel wire ropes and network transfer function are implemented. It is a strong real-time system with rapid scanning speed, high reliability and remotely nondestructive testing function. The nondestructive detector can be applied to the detection of product line in industry.
FPGA implementation of advanced FEC schemes for intelligent aggregation networks

NASA Astrophysics Data System (ADS)

Zou, Ding; Djordjevic, Ivan B.

2016-02-01

In state-of-the-art fiber-optics communication systems the fixed forward error correction (FEC) and constellation size are employed. While it is important to closely approach the Shannon limit by using turbo product codes (TPC) and low-density parity-check (LDPC) codes with soft-decision decoding (SDD) algorithm; rate-adaptive techniques, which enable increased information rates over short links and reliable transmission over long links, are likely to become more important with ever-increasing network traffic demands. In this invited paper, we describe a rate adaptive non-binary LDPC coding technique, and demonstrate its flexibility and good performance exhibiting no error floor at BER down to 10-15 in entire code rate range, by FPGA-based emulation, making it a viable solution in the next-generation high-speed intelligent aggregation networks.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)

PubMed Central

Li, Isaac TS; Shum, Warren; Truong, Kevin

2007-01-01

Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593

160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).

PubMed

Li, Isaac T S; Shum, Warren; Truong, Kevin

2007-06-07

To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.
Design and implementation of a programming circuit in radiation-hardened FPGA

NASA Astrophysics Data System (ADS)

Lihua, Wu; Xiaowei, Han; Yan, Zhao; Zhongli, Liu; Fang, Yu; Chen, Stanley L.

2011-08-01

We present a novel programming circuit used in our radiation-hardened field programmable gate array (FPGA) chip. This circuit provides the ability to write user-defined configuration data into an FPGA and then read it back. The proposed circuit adopts the direct-access programming point scheme instead of the typical long token shift register chain. It not only saves area but also provides more flexible configuration operations. By configuring the proposed partial configuration control register, our smallest configuration section can be conveniently configured as a single data and a flexible partial configuration can be easily implemented. The hierarchical simulation scheme, optimization of the critical path and the elaborate layout plan make this circuit work well. Also, the radiation hardened by design programming point is introduced. This circuit has been implemented in a static random access memory (SRAM)-based FPGA fabricated by a 0.5 μm partial-depletion silicon-on-insulator CMOS process. The function test results of the fabricated chip indicate that this programming circuit successfully realizes the desired functions in the configuration and read-back. Moreover, the radiation test results indicate that the programming circuit has total dose tolerance of 1 × 105 rad(Si), dose rate survivability of 1.5 × 1011 rad(Si)/s and neutron fluence immunity of 1 × 1014 n/cm2.
Design and reliability analysis of high-speed and continuous data recording system based on disk array

NASA Astrophysics Data System (ADS)

Jiang, Changlong; Ma, Cheng; He, Ning; Zhang, Xugang; Wang, Chongyang; Jia, Huibo

2002-12-01

In many real-time fields the sustained high-speed data recording system is required. This paper proposes a high-speed and sustained data recording system based on the complex-RAID 3+0. The system consists of Array Controller Module (ACM), String Controller Module (SCM) and Main Controller Module (MCM). ACM implemented by an FPGA chip is used to split the high-speed incoming data stream into several lower-speed streams and generate one parity code stream synchronously. It also can inversely recover the original data stream while reading. SCMs record lower-speed streams from the ACM into the SCSI disk drivers. In the SCM, the dual-page buffer technology is adopted to implement speed-matching function and satisfy the need of sustainable recording. MCM monitors the whole system, controls ACM and SCMs to realize the data stripping, reconstruction, and recovery functions. The method of how to determine the system scale is presented. At the end, two new ways Floating Parity Group (FPG) and full 2D-Parity Group (full 2D-PG) are proposed to improve the system reliability and compared with the Traditional Parity Group (TPG). This recording system can be used conveniently in many areas of data recording, storing, playback and remote backup with its high-reliability.
FPGA-accelerated adaptive optics wavefront control

NASA Astrophysics Data System (ADS)

Mauch, S.; Reger, J.; Reinlein, C.; Appelfelder, M.; Goy, M.; Beckert, E.; Tünnermann, A.

2014-03-01

The speed of real-time adaptive optical systems is primarily restricted by the data processing hardware and computational aspects. Furthermore, the application of mirror layouts with increasing numbers of actuators reduces the bandwidth (speed) of the system and, thus, the number of applicable control algorithms. This burden turns out a key-impediment for deformable mirrors with continuous mirror surface and highly coupled actuator influence functions. In this regard, specialized hardware is necessary for high performance real-time control applications. Our approach to overcome this challenge is an adaptive optics system based on a Shack-Hartmann wavefront sensor (SHWFS) with a CameraLink interface. The data processing is based on a high performance Intel Core i7 Quadcore hard real-time Linux system. Employing a Xilinx Kintex-7 FPGA, an own developed PCie card is outlined in order to accelerate the analysis of a Shack-Hartmann Wavefront Sensor. A recently developed real-time capable spot detection algorithm evaluates the wavefront. The main features of the presented system are the reduction of latency and the acceleration of computation For example, matrix multiplications which in general are of complexity O(n3 are accelerated by using the DSP48 slices of the field-programmable gate array (FPGA) as well as a novel hardware implementation of the SHWFS algorithm. Further benefits are the Streaming SIMD Extensions (SSE) which intensively use the parallelization capability of the processor for further reducing the latency and increasing the bandwidth of the closed-loop. Due to this approach, up to 64 actuators of a deformable mirror can be handled and controlled without noticeable restriction from computational burdens.
Parallel point-multiplication architecture using combined group operations for high-speed cryptographic applications

PubMed Central

Saeedi, Ehsan; Kong, Yinan

2017-01-01

In this paper, we propose a novel parallel architecture for fast hardware implementation of elliptic curve point multiplication (ECPM), which is the key operation of an elliptic curve cryptography processor. The point multiplication over binary fields is synthesized on both FPGA and ASIC technology by designing fast elliptic curve group operations in Jacobian projective coordinates. A novel combined point doubling and point addition (PDPA) architecture is proposed for group operations to achieve high speed and low hardware requirements for ECPM. It has been implemented over the binary field which is recommended by the National Institute of Standards and Technology (NIST). The proposed ECPM supports two Koblitz and random curves for the key sizes 233 and 163 bits. For group operations, a finite-field arithmetic operation, e.g. multiplication, is designed on a polynomial basis. The delay of a 233-bit point multiplication is only 3.05 and 3.56 μs, in a Xilinx Virtex-7 FPGA, for Koblitz and random curves, respectively, and 0.81 μs in an ASIC 65-nm technology, which are the fastest hardware implementation results reported in the literature to date. In addition, a 163-bit point multiplication is also implemented in FPGA and ASIC for fair comparison which takes around 0.33 and 0.46 μs, respectively. The area-time product of the proposed point multiplication is very low compared to similar designs. The performance (1Area×Time=1AT) and Area × Time × Energy (ATE) product of the proposed design are far better than the most significant studies found in the literature. PMID:28459831
Tethered Forth system for FPGA applications

NASA Astrophysics Data System (ADS)

Goździkowski, Paweł; Zabołotny, Wojciech M.

2013-10-01

This paper presents the tethered Forth system dedicated for testing and debugging of FPGA based electronic systems. Use of the Forth language allows to interactively develop and run complex testing or debugging routines. The solution is based on a small, 16-bit soft core CPU, used to implement the Forth Virtual Machine. Thanks to the use of the tethered Forth model it is possible to minimize usage of the internal RAM memory in the FPGA. The function of the intelligent terminal, which is an essential part of the tethered Forth system, may be fulfilled by the standard PC computer or by the smartphone. System is implemented in Python (the software for intelligent terminal), and in VHDL (the IP core for FPGA), so it can be easily ported to different hardware platforms. The connection between the terminal and FPGA may be established and disconnected many times without disturbing the state of the FPGA based system. The presented system has been verified in the hardware, and may be used as a tool for debugging, testing and even implementing of control algorithms for FPGA based systems.
A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations

PubMed Central

Mak, Terrence S. T.; Rachmuth, Guy; Lam, Kai-Pui; Poon, Chi-Sang

2008-01-01

Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. Field Programmable Gate Array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to conserve resource consumption. Performances of the various FPGA design approaches are compared theoretically and experimentally in corresponding implementations of the AMPA and NMDA synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory economic solution as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired. PMID:17190033
PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

NASA Astrophysics Data System (ADS)

Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

2004-11-01

Recently surveillance and Automatic Target Recognition (ATR) applications are increasing as the cost of computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-Art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/compute vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval, are often important components of intelligent systems. The aim of this work is using embedded FPGA as a fast, configurable and synthesizable search engine in fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.
Improved pulse laser ranging algorithm based on high speed sampling

NASA Astrophysics Data System (ADS)

Gao, Xuan-yi; Qian, Rui-hai; Zhang, Yan-mei; Li, Huan; Guo, Hai-chao; He, Shi-jie; Guo, Xiao-kang

2016-10-01

Narrow pulse laser ranging achieves long-range target detection using laser pulse with low divergent beams. Pulse laser ranging is widely used in military, industrial, civil, engineering and transportation field. In this paper, an improved narrow pulse laser ranging algorithm is studied based on the high speed sampling. Firstly, theoretical simulation models have been built and analyzed including the laser emission and pulse laser ranging algorithm. An improved pulse ranging algorithm is developed. This new algorithm combines the matched filter algorithm and the constant fraction discrimination (CFD) algorithm. After the algorithm simulation, a laser ranging hardware system is set up to implement the improved algorithm. The laser ranging hardware system includes a laser diode, a laser detector and a high sample rate data logging circuit. Subsequently, using Verilog HDL language, the improved algorithm is implemented in the FPGA chip based on fusion of the matched filter algorithm and the CFD algorithm. Finally, the laser ranging experiment is carried out to test the improved algorithm ranging performance comparing to the matched filter algorithm and the CFD algorithm using the laser ranging hardware system. The test analysis result demonstrates that the laser ranging hardware system realized the high speed processing and high speed sampling data transmission. The algorithm analysis result presents that the improved algorithm achieves 0.3m distance ranging precision. The improved algorithm analysis result meets the expected effect, which is consistent with the theoretical simulation.
Design of a system based on DSP and FPGA for video recording and replaying

NASA Astrophysics Data System (ADS)

Kang, Yan; Wang, Heng

2013-08-01

This paper brings forward a video recording and replaying system with the architecture of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA). The system achieved encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals which are displayed on a monitor during airplanes and ships' navigating. In the architecture, the DSP is a main processor which is used for a large amount of complicated calculation during digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is utilized to implement seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) in the system. This transfer mode can avoid the bottle-neck of the data transfer and simplify the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level matching chips are used to implement Advanced Technology Attachment (ATA) protocol on physical layer of the interface of an Integrated Drive Electronics (IDE) Hard Disk (HD), which has a high speed in data access and does not rely on a computer. Main functions of the logic on the FPGA are described and the screenshots of the behavioral simulation are provided in this paper. In the design of program on the DSP, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM to exert the CPU's high performance on computing without intervention by the CPU and save its time spending. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways and means of acquiring high performance for code are briefly present. The ability of data processing of the system is desirable. And smoothness of the replayed video is acceptable. By right of its design flexibility and reliable operation, the system based on DSP and FPGA
Families of FPGA-Based Accelerators for Approximate String Matching1

PubMed Central

Van Court, Tom; Herbordt, Martin C.

2011-01-01

Dynamic programming for approximate string matching is a large family of different algorithms, which vary significantly in purpose, complexity, and hardware utilization. Many implementations have reported impressive speed-ups, but have typically been point solutions – highly specialized and addressing only one or a few of the many possible options. The problem to be solved is creating a hardware description that implements a broad range of behavioral options without losing efficiency due to feature bloat. We report a set of three component types that address different parts of the approximate string matching problem. This allows each application to choose the feature set required, then make maximum use of the FPGA fabric according to that application’s specific resource requirements. Multiple, interchangeable implementations are available for each component type. We show that these methods allow the efficient generation of a large, if not complete, family of accelerators for this application. This flexibility was obtained while retaining high performance: We have evaluated a sample against serial reference codes and found speed-ups of from 150× to 400× over a high-end PC. PMID:21603598
High-speed sorting of grains by color and surface texture

USDA-ARS?s Scientific Manuscript database

A high-speed, low-cost, image-based sorting device was developed to detect and separate grains with different colors/textures. The device directly combines a complementary metal–oxide–semiconductor (CMOS) color image sensor with a field-programmable gate array (FPGA) that was programmed to execute ...
Low-cost, high-speed back-end processing system for high-frequency ultrasound B-mode imaging.

PubMed

Chang, Jin Ho; Sun, Lei; Yen, Jesse T; Shung, K Kirk

2009-07-01

For real-time visualization of the mouse heart (6 to 13 beats per second), a back-end processing system involving high-speed signal processing functions to form and display images has been developed. This back-end system was designed with new signal processing algorithms to achieve a frame rate of more than 400 images per second. These algorithms were implemented in a simple and cost-effective manner with a single field-programmable gate array (FPGA) and software programs written in C++. The operating speed of the back-end system was investigated by recording the time required for transferring an image to a personal computer. Experimental results showed that the back-end system is capable of producing 433 images per second. To evaluate the imaging performance of the back-end system, a complete imaging system was built. This imaging system, which consisted of a recently reported high-speed mechanical sector scanner assembled with the back-end system, was tested by imaging a wire phantom, a pig eye (in vitro), and a mouse heart (in vivo). It was shown that this system is capable of providing high spatial resolution images with fast temporal resolution.
Low-Cost, High-Speed Back-End Processing System for High-Frequency Ultrasound B-Mode Imaging

PubMed Central

Chang, Jin Ho; Sun, Lei; Yen, Jesse T.; Shung, K. Kirk

2009-01-01

For real-time visualization of the mouse heart (6 to 13 beats per second), a back-end processing system involving high-speed signal processing functions to form and display images has been developed. This back-end system was designed with new signal processing algorithms to achieve a frame rate of more than 400 images per second. These algorithms were implemented in a simple and cost-effective manner with a single field-programmable gate array (FPGA) and software programs written in C++. The operating speed of the back-end system was investigated by recording the time required for transferring an image to a personal computer. Experimental results showed that the back-end system is capable of producing 433 images per second. To evaluate the imaging performance of the back-end system, a complete imaging system was built. This imaging system, which consisted of a recently reported high-speed mechanical sector scanner assembled with the back-end system, was tested by imaging a wire phantom, a pig eye (in vitro), and a mouse heart (in vivo). It was shown that this system is capable of providing high spatial resolution images with fast temporal resolution. PMID:19574160
Gas Sensors Characterization and Multilayer Perceptron (MLP) Hardware Implementation for Gas Identification Using a Field Programmable Gate Array (FPGA)

PubMed Central

Benrekia, Fayçal; Attari, Mokhtar; Bouhedda, Mounir

2013-01-01

This paper develops a primitive gas recognition system for discriminating between industrial gas species. The system under investigation consists of an array of eight micro-hotplate-based SnO2 thin film gas sensors with different selectivity patterns. The output signals are processed through a signal conditioning and analyzing system. These signals feed a decision-making classifier, which is obtained via a Field Programmable Gate Array (FPGA) with Very High-Speed Integrated Circuit Hardware Description Language. The classifier relies on a multilayer neural network based on a back propagation algorithm with one hidden layer of four neurons and eight neurons at the input and five neurons at the output. The neural network designed after implementation consists of twenty thousand gates. The achieved experimental results seem to show the effectiveness of the proposed classifier, which can discriminate between five industrial gases. PMID:23529119
Acceleration of Cherenkov angle reconstruction with the new Intel Xeon/FPGA compute platform for the particle identification in the LHCb Upgrade

NASA Astrophysics Data System (ADS)

Faerber, Christian

2017-10-01

The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to readout the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collision for later storage. The architecture of such a computing farm, which can process this amount of data as efficiently as possible, is a challenging task and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). Also for the LHCb upgrade the usage of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link which is for this platform a QPI link. On the FPGA an accelerator is implemented. The used system is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Also another FPGA accelerator the Nallatech 385 PCIe accelerator with the same Stratix V FPGA were tested for performance. The results show that the Intel
Hardware-based image processing for high-speed inspection of grains

USDA-ARS?s Scientific Manuscript database

A high-speed, low-cost, image-based sorting device was developed to detect and separate grains with slight color differences and small defects on grains The device directly combines a complementary metal–oxide–semiconductor (CMOS) color image sensor with a field-programmable gate array (FPGA) which...
FPGA implementation of motifs-based neuronal network and synchronization analysis

NASA Astrophysics Data System (ADS)

Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao

2016-06-01

Motifs in complex networks play a crucial role in determining the brain functions. In this paper, 13 kinds of motifs are implemented with Field Programmable Gate Array (FPGA) to investigate the relationships between the networks properties and motifs properties. We use discretization method and pipelined architecture to construct various motifs with Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct the synchronization analysis of motifs as well as the constructed network. We find that the synchronization properties of motif determine that of motif-based small-world network, which demonstrates effectiveness of our proposed hardware simulation platform. By imitation of some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace the injured nuclei to complete the brain function in the treatment of Parkinson's disease and epilepsy.
FPGA cluster for high-performance AO real-time control system

NASA Astrophysics Data System (ADS)

Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.

2006-06-01

Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which lead to more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, need to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems, such as task distribution, node communication, system verification, are discussed.
FPGA Implementation of Optimal 3D-Integer DCT Structure for Video Compression

PubMed Central

2015-01-01

A novel optimal structure for implementing 3D-integer discrete cosine transform (DCT) is presented by analyzing various integer approximation methods. The integer set with reduced mean squared error (MSE) and high coding efficiency are considered for implementation in FPGA. The proposed method proves that the least resources are utilized for the integer set that has shorter bit values. Optimal 3D-integer DCT structure is determined by analyzing the MSE, power dissipation, coding efficiency, and hardware complexity of different integer sets. The experimental results reveal that direct method of computing the 3D-integer DCT using the integer set [10, 9, 6, 2, 3, 1, 1] performs better when compared to other integer sets in terms of resource utilization and power dissipation. PMID:26601120

FPGA implementation of sparse matrix algorithm for information retrieval

NASA Astrophysics Data System (ADS)

Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

2005-06-01

Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is showed to be more efficient in terms of storage space requirement and query-processing timing over the other sparse matrix algorithms for information retrieval application. Although inverted index algorithm is treated as the de facto standard for information retrieval for years, an alternative approach to store the index of text collection in a sparse matrix structure gains more attention. This approach performs query processing using sparse matrix-vector multiplication and due to parallelization achieves a substantial efficiency over the sequential inverted index. The parallel implementations of information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Arrays (FPGAs) board from Xilinx. A recent development in scientific applications is the use of FPGA to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within processing unit.
Particle Identification on an FPGA Accelerated Compute Platform for the LHCb Upgrade

NASA Astrophysics Data System (ADS)

Fäerber, Christian; Schwemmer, Rainer; Machen, Jonathan; Neufeld, Niko

2017-07-01

The current LHCb readout system will be upgraded in 2018 to a “triggerless” readout of the entire detector at the Large Hadron Collider collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100 from currently 500 Gb/s up to 40 Tb/s. The event filter farm will preanalyze the data and will select the events on an event by event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, and the reason why different new technologies are considered and have to be investigated for the different parts of the system. For the usage in the event building farm or in the event filter farm (trigger), an experimental field programmable gate array (FPGA) accelerated computing platform is considered and, therefore, tested. FPGA compute accelerators are used more and more in standard servers such as for Microsoft Bing search or Baidu search. The platform we use hosts a general Intel CPU and a high-performance FPGA linked via the high-speed Intel QuickPath Interconnect. An accelerator is implemented on the FPGA. It is very likely that these platforms, which are built, in general, for high-performance computing, are also very interesting for the high-energy physics community. First, the performance results of smaller test cases performed at the beginning are presented. Afterward, a part of the existing LHCb RICH particle identification is tested and is ported to the experimental FPGA accelerated platform. We have compared the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm, which is running on the Xeon-FPGA compute accelerator platform.
FPGA architecture and implementation of sparse matrix vector multiplication for the finite element method

NASA Astrophysics Data System (ADS)

Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.

2008-04-01

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside
A Real-Time Data Acquisition and Processing Framework Based on FlexRIO FPGA and ITER Fast Plant System Controller

NASA Astrophysics Data System (ADS)

Yang, C.; Zheng, W.; Zhang, M.; Yuan, T.; Zhuang, G.; Pan, Y.

2016-06-01

Measurement and control of the plasma in real-time are critical for advanced Tokamak operation. It requires high speed real-time data acquisition and processing. ITER has designed the Fast Plant System Controllers (FPSC) for these purposes. At J-TEXT Tokamak, a real-time data acquisition and processing framework has been designed and implemented using standard ITER FPSC technologies. The main hardware components of this framework are an Industrial Personal Computer (IPC) with a real-time system and FlexRIO devices based on FPGA. With FlexRIO devices, data can be processed by FPGA in real-time before they are passed to the CPU. The software elements are based on a real-time framework which runs under Red Hat Enterprise Linux MRG-R and uses Experimental Physics and Industrial Control System (EPICS) for monitoring and configuring. That makes the framework accord with ITER FPSC standard technology. With this framework, any kind of data acquisition and processing FlexRIO FPGA program can be configured with a FPSC. An application using the framework has been implemented for the polarimeter-interferometer diagnostic system on J-TEXT. The application is able to extract phase-shift information from the intermediate frequency signal produced by the polarimeter-interferometer diagnostic system and calculate plasma density profile in real-time. Different algorithms implementations on the FlexRIO FPGA are compared in the paper.
Spacewire Routers Implemented with FPGA Technology

NASA Astrophysics Data System (ADS)

Habinc, Sandi; Isomaki, Marko

2011-08-01

Routers are an integral part of SpaceWire networks. Aeroflex Gaisler has developed a highly configurable SpaceWire router VHDL IP core to meet the needs for technology independent router designs. The main design goals have been configurability, technology independence, support of the standard and expandability. The IP core being technologically independent allows it to be used in both ASIC and FPGA technology. The latter is now being used to produce versatile standard products that can reach the market faster than for example an ASIC based product.
FPGA implementation of adaptive beamforming in hearing aids.

PubMed

Samtani, Kartik; Thomas, Jobin; Varma, G Abhinav; Sumam, David S; Deepu, S P

2017-07-01

Beamforming is a spatial filtering technique used in hearing aids to improve target sound reception by reducing interference from other directions. In this paper we propose improvements in an existing architecture present for two omnidirectional microphone array based adaptive beamforming for hearing aid applications and implement the same on Xilinx Artix 7 FPGA using VHDL coding and Xilinx Vivado ® 2015.2. The nulls are introduced in particular directions by combination of two fixed polar patterns. This combination can be adaptively controlled to steer the null in the direction of noise. The beamform patterns and improvements in SNR values obtained from experiments in a conference room environment are analyzed.
Subnanosecond time-to-digital converter implemented in a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Sano, Y.; Horii, Y.; Ikeno, M.; Sasaki, O.; Tomoto, M.; Uchida, T.

2017-12-01

Time-to-digital converters (TDCs) are used in various fields, including high-energy physics. One advantage of implementing TDCs in field-programmable gate arrays (FPGAs) is the flexibility on the modification of the logics, which is useful to cope with the changes in the experimental conditions. Recent FPGAs make it possible to implement TDCs with a time resolution less than 10 ps. On the other hand, various drift chambers require a time resolution of O(0.1) ns, and a simple and easy-to-implement TDC is useful for a robust operation. Herein an eight-channel TDC with a variable bin size down to 0.28 ns is implemented in a Xilinx Kintex-7 FPGA and tested. The TDC is based on a multisampling scheme with quad phase clocks synchronised with an external reference clock. Calibration of the bin size is unnecessary if a stable reference clock is available, which is common in high-energy physics experiments. Depending on the channel, the standard deviation of the differential nonlinearity for a 0.28 ns bin size is 0.13-0.31. The performance has a negligible dependence on the temperature. The power consumption and the potential to extend the number of channels are also discussed.
FPGA based data processing in the ALICE High Level Trigger in LHC Run 2

NASA Astrophysics Data System (ADS)

Engel, Heiko; Alt, Torsten; Kebschull, Udo; ALICE Collaboration

2017-10-01

The ALICE High Level Trigger (HLT) is a computing cluster dedicated to the online compression, reconstruction and calibration of experimental data. The HLT receives detector data via serial optical links into FPGA based readout boards that process the data on a per-link level already inside the FPGA and provide it to the host machines connected with a data transport framework. FPGA based data pre-processing is enabled for the biggest detector of ALICE, the Time Projection Chamber (TPC), with a hardware cluster finding algorithm. This algorithm was ported to the Common Read-Out Receiver Card (C-RORC) as used in the HLT for RUN 2. It was improved to handle double the input bandwidth and adjusted to the upgraded TPC Readout Control Unit (RCU2). A flexible firmware implementation in the HLT handles both the old and the new TPC data format and link rates transparently. Extended protocol and data error detection, error handling and the enhanced RCU2 data ordering scheme provide an improved physics performance of the cluster finder. The performance of the cluster finder was verified against large sets of reference data both in terms of throughput and algorithmic correctness. Comparisons with a software reference implementation confirm significant savings on CPU processing power using the hardware implementation. The C-RORC hardware with the cluster finder for RCU1 data is in use in the HLT since the start of RUN 2. The extended hardware cluster finder implementation for the RCU2 with doubled throughput is active since the upgrade of the TPC readout electronics in early 2016.
A Synchronization Algorithm and Implementation for High-Speed Block Codes Applications. Part 4

NASA Technical Reports Server (NTRS)

Lin, Shu; Zhang, Yu; Nakamura, Eric B.; Uehara, Gregory T.

1998-01-01

Block codes have trellis structures and decoders amenable to high speed CMOS VLSI implementation. For a given CMOS technology, these structures enable operating speeds higher than those achievable using convolutional codes for only modest reductions in coding gain. As a result, block codes have tremendous potential for satellite trunk and other future high-speed communication applications. This paper describes a new approach for implementation of the synchronization function for block codes. The approach utilizes the output of the Viterbi decoder and therefore employs the strength of the decoder. Its operation requires no knowledge of the signal-to-noise ratio of the received signal, has a simple implementation, adds no overhead to the transmitted data, and has been shown to be effective in simulation for received SNR greater than 2 dB.
Investigation of FPGA-Based Real-Time Adaptive Digital Pulse Shaping for High-Count-Rate Applications

NASA Astrophysics Data System (ADS)

Saxena, Shefali; Hawari, Ayman I.

2017-07-01

Digital signal processing techniques have been widely used in radiation spectrometry to provide improved stability and performance with compact physical size over the traditional analog signal processing. In this paper, field-programmable gate array (FPGA)-based adaptive digital pulse shaping techniques are investigated for real-time signal processing. National Instruments (NI) NI 5761 14-bit, 250-MS/s adaptor module is used for digitizing high-purity germanium (HPGe) detector's preamplifier pulses. Digital pulse processing algorithms are implemented on the NI PXIe-7975R reconfigurable FPGA (Kintex-7) using the LabVIEW FPGA module. Based on the time separation between successive input pulses, the adaptive shaping algorithm selects the optimum shaping parameters (rise time and flattop time of trapezoid-shaping filter) for each incoming signal. A digital Sallen-Key low-pass filter is implemented to enhance signal-to-noise ratio and reduce baseline drifting in trapezoid shaping. A recursive trapezoid-shaping filter algorithm is employed for pole-zero compensation of exponentially decayed (with two-decay constants) preamplifier pulses of an HPGe detector. It allows extraction of pulse height information at the beginning of each pulse, thereby reducing the pulse pileup and increasing throughput. The algorithms for RC-CR2 timing filter, baseline restoration, pile-up rejection, and pulse height determination are digitally implemented for radiation spectroscopy. Traditionally, at high-count-rate conditions, a shorter shaping time is preferred to achieve high throughput, which deteriorates energy resolution. In this paper, experimental results are presented for varying count-rate and pulse shaping conditions. Using adaptive shaping, increased throughput is accepted while preserving the energy resolution observed using the longer shaping times.
FPGA-based coprocessor for matrix algorithms implementation

NASA Astrophysics Data System (ADS)

Amira, Abbes; Bensaali, Faycal

2003-03-01

Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication which is of O (N3) on a sequential computer and O (N3/p) on a parallel system with p processors complexity. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.
Time-delayed chameleon: Analysis, synchronization and FPGA implementation

NASA Astrophysics Data System (ADS)

Rajagopal, Karthikeyan; Jafari, Sajad; Laarem, Guessas

2017-12-01

In this paper we report a time-delayed chameleon-like chaotic system which can belong to different families of chaotic attractors depending on the choices of parameters. Such a characteristic of self-excited and hidden chaotic flows in a simple 3D system with time delay has not been reported earlier. Dynamic analysis of the proposed time-delayed systems are analysed in time-delay space and parameter space. A novel adaptive modified functional projective lag synchronization algorithm is derived for synchronizing identical time-delayed chameleon systems with uncertain parameters. The proposed time-delayed systems and the synchronization algorithm with controllers and parameter estimates are then implemented in FPGA using hardware-software co-simulation and the results are presented.
Method to implement the CCD timing generator based on FPGA

NASA Astrophysics Data System (ADS)

Li, Binhua; Song, Qian; He, Chun; Jin, Jianhui; He, Lin

2010-07-01

With the advance of the PFPA technology, the design methodology of digital systems is changing. In recent years we develop a method to implement the CCD timing generator based on FPGA and VHDL. This paper presents the principles and implementation skills of the method. Taking a developed camera as an example, we introduce the structure, input and output clocks/signals of a timing generator implemented in the camera. The generator is composed of a top module and a bottom module. The bottom one is made up of 4 sub-modules which correspond to 4 different operation modes. The modules are implemented by 5 VHDL programs. Frame charts of the architecture of these programs are shown in the paper. We also describe implementation steps of the timing generator in Quartus II, and the interconnections between the generator and a Nios soft core processor which is the controller of this generator. Some test results are presented in the end.
A potent approach for the development of FPGA based DAQ system for HEP experiments

NASA Astrophysics Data System (ADS)

Khan, Shuaib Ahmad; Mitra, Jubin; David, Erno; Kiss, Tivadar; Nayak, Tapan Kumar

2017-10-01

With ever increasing particle beam energies and interaction rates in modern High Energy Physics (HEP) experiments in the present and future accelerator facilities, there has always been the demand for robust Data Acquisition (DAQ) schemes which perform in the harsh radiation environment and handle high data volume. The scheme is required to be flexible enough to adapt to the demands of future detector and electronics upgrades, and at the same time keeping the cost factor in mind. To address these challenges, in the present work, we discuss an efficient DAQ scheme for error resilient, high speed data communication on commercially available state-of-the-art FPGA with optical links. The scheme utilises GigaBit Transceiver (GBT) protocol to establish radiation tolerant communication link between on-detector front-end electronics situated in harsh radiation environment to the back-end Data Processing Unit (DPU) placed in a low radiation zone. The acquired data are reconstructed in DPU which reduces the data volume significantly, and then transmitted to the computing farms through high speed optical links using 10 Gigabit Ethernet (10GbE). In this study, we focus on implementation and testing of GBT protocol and 10GbE links on an Intel FPGA. Results of the measurements of resource utilisation, critical path delays, signal integrity, eye diagram and Bit Error Rate (BER) are presented, which are the indicators for efficient system performance.
FPGA implementation of neuro-fuzzy system with improved PSO learning.

PubMed

Karakuzu, Cihan; Karakaya, Fuat; Çavuşlu, Mehmet Ali

2016-07-01

This paper presents the first hardware implementation of neuro-fuzzy system (NFS) with its metaheuristic learning ability on field programmable gate array (FPGA). Metaheuristic learning of NFS for all of its parameters is accomplished by using the improved particle swarm optimization (iPSO). As a second novelty, a new functional approach, which does not require any memory and multiplier usage, is proposed for the Gaussian membership functions of NFS. NFS and its learning using iPSO are implemented on Xilinx Virtex5 xc5vlx110-3ff1153 and efficiency of the proposed implementation tested on two dynamic system identification problems and licence plate detection problem as a practical application. Results indicate that proposed NFS implementation and membership function approximation is as effective as the other approaches available in the literature but requires less hardware resources. Copyright © 2016 Elsevier Ltd. All rights reserved.
An FPGA Implementation of a Polychronous Spiking Neural Network with Delay Adaptation.

PubMed

Wang, Runchun; Cohen, Gregory; Stiefel, Klaus M; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, André

2013-01-01

We present an FPGA implementation of a re-configurable, polychronous spiking neural network with a large capacity for spatial-temporal patterns. The proposed neural network generates delay paths de novo, so that only connections that actually appear in the training patterns will be created. This allows the proposed network to use all the axons (variables) to store information. Spike Timing Dependent Delay Plasticity is used to fine-tune and add dynamics to the network. We use a time multiplexing approach allowing us to achieve 4096 (4k) neurons and up to 1.15 million programmable delay axons on a Virtex 6 FPGA. Test results show that the proposed neural network is capable of successfully recalling more than 95% of all spikes for 96% of the stored patterns. The tests also show that the neural network is robust to noise from random input spikes.
High-Speed On-Board Data Processing for Science Instruments: HOPS

NASA Technical Reports Server (NTRS)

Beyon, Jeffrey

2015-01-01

The project called High-Speed On-Board Data Processing for Science Instruments (HOPS) has been funded by NASA Earth Science Technology Office (ESTO) Advanced Information Systems Technology (AIST) program during April, 2012 â€" April, 2015. HOPS is an enabler for science missions with extremely high data processing rates. In this three-year effort of HOPS, Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS) and 3-D Winds were of interest in particular. As for ASCENDS, HOPS replaces time domain data processing with frequency domain processing while making the real-time on-board data processing possible. As for 3-D Winds, HOPS offers real-time high-resolution wind profiling with 4,096-point fast Fourier transform (FFT). HOPS is adaptable with quick turn-around time. Since HOPS offers reusable user-friendly computational elements, its FPGA IP Core can be modified for a shorter development period if the algorithm changes. The FPGA and memory bandwidth of HOPS is 20 GB/sec while the typical maximum processor-to-SDRAM bandwidth of the commercial radiation tolerant high-end processors is about 130-150 MB/sec. The inter-board communication bandwidth of HOPS is 4 GB/sec while the effective processor-to-cPCI bandwidth of commercial radiation tolerant high-end boards is about 50-75 MB/sec. Also, HOPS offers VHDL cores for the easy and efficient implementation of ASCENDS and 3-D Winds, and other similar algorithms. A general overview of the 3-year development of HOPS is the goal of this presentation.
Real-time FPGA architectures for computer vision

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar

2000-03-01

This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
A shared synapse architecture for efficient FPGA implementation of autoencoders.

PubMed

Suzuki, Akihiro; Morie, Takashi; Tamukoh, Hakaru

2018-01-01

This paper proposes a shared synapse architecture for autoencoders (AEs), and implements an AE with the proposed architecture as a digital circuit on a field-programmable gate array (FPGA). In the proposed architecture, the values of the synapse weights are shared between the synapses of an input and a hidden layer, and between the synapses of a hidden and an output layer. This architecture utilizes less of the limited resources of an FPGA than an architecture which does not share the synapse weights, and reduces the amount of synapse modules used by half. For the proposed circuit to be implemented into various types of AEs, it utilizes three kinds of parameters; one to change the number of layers' units, one to change the bit width of an internal value, and a learning rate. By altering a network configuration using these parameters, the proposed architecture can be used to construct a stacked AE. The proposed circuits are logically synthesized, and the number of their resources is determined. Our experimental results show that single and stacked AE circuits utilizing the proposed shared synapse architecture operate as regular AEs and as regular stacked AEs. The scalability of the proposed circuit and the relationship between the bit widths and the learning results are also determined. The clock cycles of the proposed circuits are formulated, and this formula is used to estimate the theoretical performance of the circuit when the circuit is used to construct arbitrary networks.
A shared synapse architecture for efficient FPGA implementation of autoencoders

PubMed Central

Morie, Takashi; Tamukoh, Hakaru

2018-01-01

This paper proposes a shared synapse architecture for autoencoders (AEs), and implements an AE with the proposed architecture as a digital circuit on a field-programmable gate array (FPGA). In the proposed architecture, the values of the synapse weights are shared between the synapses of an input and a hidden layer, and between the synapses of a hidden and an output layer. This architecture utilizes less of the limited resources of an FPGA than an architecture which does not share the synapse weights, and reduces the amount of synapse modules used by half. For the proposed circuit to be implemented into various types of AEs, it utilizes three kinds of parameters; one to change the number of layers’ units, one to change the bit width of an internal value, and a learning rate. By altering a network configuration using these parameters, the proposed architecture can be used to construct a stacked AE. The proposed circuits are logically synthesized, and the number of their resources is determined. Our experimental results show that single and stacked AE circuits utilizing the proposed shared synapse architecture operate as regular AEs and as regular stacked AEs. The scalability of the proposed circuit and the relationship between the bit widths and the learning results are also determined. The clock cycles of the proposed circuits are formulated, and this formula is used to estimate the theoretical performance of the circuit when the circuit is used to construct arbitrary networks. PMID:29543909

Integrated High-Speed Torque Control System for a Robotic Joint

NASA Technical Reports Server (NTRS)

Davis, Donald R. (Inventor); Radford, Nicolaus A. (Inventor); Permenter, Frank Noble (Inventor); Valvo, Michael C. (Inventor); Askew, R. Scott (Inventor)

2013-01-01

A control system for achieving high-speed torque for a joint of a robot includes a printed circuit board assembly (PCBA) having a collocated joint processor and high-speed communication bus. The PCBA may also include a power inverter module (PIM) and local sensor conditioning electronics (SCE) for processing sensor data from one or more motor position sensors. Torque control of a motor of the joint is provided via the PCBA as a high-speed torque loop. Each joint processor may be embedded within or collocated with the robotic joint being controlled. Collocation of the joint processor, PIM, and high-speed bus may increase noise immunity of the control system, and the localized processing of sensor data from the joint motor at the joint level may minimize bus cabling to and from each control node. The joint processor may include a field programmable gate array (FPGA).
Optimized FPGA Implementation of Multi-Rate FIR Filters Through Thread Decomposition

NASA Technical Reports Server (NTRS)

Zheng, Jason Xin; Nguyen, Kayla; He, Yutao

2010-01-01

Multirate (decimation/interpolation) filters are among the essential signal processing components in spaceborne instruments where Finite Impulse Response (FIR) filters are often used to minimize nonlinear group delay and finite-precision effects. Cascaded (multi-stage) designs of Multi-Rate FIR (MRFIR) filters are further used for large rate change ratio, in order to lower the required throughput while simultaneously achieving comparable or better performance than single-stage designs. Traditional representation and implementation of MRFIR employ polyphase decomposition of the original filter structure, whose main purpose is to compute only the needed output at the lowest possible sampling rate. In this paper, an alternative representation and implementation technique, called TD-MRFIR (Thread Decomposition MRFIR), is presented. The basic idea is to decompose MRFIR into output computational threads, in contrast to a structural decomposition of the original filter as done in the polyphase decomposition. Each thread represents an instance of the finite convolution required to produce a single output of the MRFIR. The filter is thus viewed as a finite collection of concurrent threads. The technical details of TD-MRFIR will be explained, first showing its applicability to the implementation of downsampling, upsampling, and resampling FIR filters, and then describing a general strategy to optimally allocate the number of filter taps. A particular FPGA design of multi-stage TD-MRFIR for the L-band radar of NASA's SMAP (Soil Moisture Active Passive) instrument is demonstrated; and its implementation results in several targeted FPGA devices are summarized in terms of the functional (bit width, fixed-point error) and performance (time closure, resource usage, and power estimation) parameters.
FPGA design for constrained energy minimization

NASA Astrophysics Data System (ADS)

Wang, Jianwei; Chang, Chein-I.; Cao, Mang

2004-02-01

The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has been also demonstrated. The main challenge of realizing the CEM in hardware architecture in the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Arrays (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is to use the Coordinate Rotation DIgital Computer (CORDIC) algorithm that can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, therefore in FPGA. Since the computation of the inverse of the data correlction involves a series of Givens rotations, the utility of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM will be studied and its detailed architecture will be also described.
FPGA implementation of Santos-Victor optical flow algorithm for real-time image processing: an useful attempt

NASA Astrophysics Data System (ADS)

Cobos Arribas, Pedro; Monasterio Huelin Macia, Felix

2003-04-01

A FPGA based hardware implementation of the Santos-Victor optical flow algorithm, useful in robot guidance applications, is described in this paper. The system used to do contains an ALTERA FPGA (20K100), an interface with a digital camera, three VRAM memories to contain the data input and some output memories (a VRAM and a EDO) to contain the results. The system have been used previously to develop and test other vision algorithms, such as image compression, optical flow calculation with differential and correlation methods. The designed system let connect the digital camera, or the FPGA output (results of algorithms) to a PC, throw its Firewire or USB port. The problems take place in this occasion have motivated to adopt another hardware structure for certain vision algorithms with special requirements, that need a very hard code intensive processing.
Implementation of data acquisition interface using on-board field-programmable gate array (FPGA) universal serial bus (USB) link

NASA Astrophysics Data System (ADS)

Yussup, N.; Ibrahim, M. M.; Lombigit, L.; Rahman, N. A. A.; Zin, M. R. M.

2014-02-01

Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of data acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.
Implementation of data acquisition interface using on-board field-programmable gate array (FPGA) universal serial bus (USB) link

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yussup, N.; Ibrahim, M. M.; Lombigit, L.

Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of datamore » acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.« less
FPGA implementation of high-performance QC-LDPC decoder for optical communications

NASA Astrophysics Data System (ADS)

Zou, Ding; Djordjevic, Ivan B.

2015-01-01

Forward error correction is as one of the key technologies enabling the next-generation high-speed fiber optical communications. Quasi-cyclic (QC) low-density parity-check (LDPC) codes have been considered as one of the promising candidates due to their large coding gain performance and low implementation complexity. In this paper, we present our designed QC-LDPC code with girth 10 and 25% overhead based on pairwise balanced design. By FPGAbased emulation, we demonstrate that the 5-bit soft-decision LDPC decoder can achieve 11.8dB net coding gain with no error floor at BER of 10-15 avoiding using any outer code or post-processing method. We believe that the proposed single QC-LDPC code is a promising solution for 400Gb/s optical communication systems and beyond.
The Use of Field Programmable Gate Arrays (FPGA) in Small Satellite Communication Systems

NASA Technical Reports Server (NTRS)

Varnavas, Kosta; Sims, William Herbert; Casas, Joseph

2015-01-01

This paper will describe the use of digital Field Programmable Gate Arrays (FPGA) to contribute to advancing the state-of-the-art in software defined radio (SDR) transponder design for the emerging SmallSat and CubeSat industry and to provide advances for NASA as described in the TAO5 Communication and Navigation Roadmap (Ref 4). The use of software defined radios (SDR) has been around for a long time. A typical implementation of the SDR is to use a processor and write software to implement all the functions of filtering, carrier recovery, error correction, framing etc. Even with modern high speed and low power digital signal processors, high speed memories, and efficient coding, the compute intensive nature of digital filters, error correcting and other algorithms is too much for modern processors to get efficient use of the available bandwidth to the ground. By using FPGAs, these compute intensive tasks can be done in parallel, pipelined fashion and more efficiently use every clock cycle to significantly increase throughput while maintaining low power. These methods will implement digital radios with significant data rates in the X and Ka bands. Using these state-of-the-art technologies, unprecedented uplink and downlink capabilities can be achieved in a 1/2 U sized telemetry system. Additionally, modern FPGAs have embedded processing systems, such as ARM cores, integrated inside the FPGA allowing mundane tasks such as parameter commanding to occur easily and flexibly. Potential partners include other NASA centers, industry and the DOD. These assets are associated with small satellite demonstration flights, LEO and deep space applications. MSFC currently has an SDR transponder test-bed using Hardware-in-the-Loop techniques to evaluate and improve SDR technologies.
A high speed implementation of the random decrement algorithm

NASA Technical Reports Server (NTRS)

Kiraly, L. J.

1982-01-01

The algorithm is useful for measuring net system damping levels in stochastic processes and for the development of equivalent linearized system response models. The algorithm works by summing together all subrecords which occur after predefined threshold level is crossed. The random decrement signature is normally developed by scanning stored data and adding subrecords together. The high speed implementation of the random decrement algorithm exploits the digital character of sampled data and uses fixed record lengths of 2(n) samples to greatly speed up the process. The contributions to the random decrement signature of each data point was calculated only once and in the same sequence as the data were taken. A hardware implementation of the algorithm using random logic is diagrammed and the process is shown to be limited only by the record size and the threshold crossing frequency of the sampled data. With a hardware cycle time of 200 ns and 1024 point signature, a threshold crossing frequency of 5000 Hertz can be processed and a stably averaged signature presented in real time.
Packet based serial link realized in FPGA dedicated for high resolution infrared image transmission

NASA Astrophysics Data System (ADS)

Bieszczad, Grzegorz

2015-05-01

In article the external digital interface specially designed for thermographic camera built in Military University of Technology is described. The aim of article is to illustrate challenges encountered during design process of thermal vision camera especially related to infrared data processing and transmission. Article explains main requirements for interface to transfer Infra-Red or Video digital data and describes the solution which we elaborated based on Low Voltage Differential Signaling (LVDS) physical layer and signaling scheme. Elaborated link for image transmission is built using FPGA integrated circuit with built-in high speed serial transceivers achieving up to 2500Gbps throughput. Image transmission is realized using proprietary packet protocol. Transmission protocol engine was described in VHDL language and tested in FPGA hardware. The link is able to transmit 1280x1024@60Hz 24bit video data using one signal pair. Link was tested to transmit thermal-vision camera picture to remote monitor. Construction of dedicated video link allows to reduce power consumption compared to solutions with ASIC based encoders and decoders realizing video links like DVI or packed based Display Port, with simultaneous reduction of wires needed to establish link to one pair. Article describes functions of modules integrated in FPGA design realizing several functions like: synchronization to video source, video stream packeting, interfacing transceiver module and dynamic clock generation for video standard conversion.
FPGA in-the-loop simulations of cardiac excitation model under voltage clamp conditions

NASA Astrophysics Data System (ADS)

Othman, Norliza; Adon, Nur Atiqah; Mahmud, Farhanahani

2017-01-01

Voltage clamp technique allows the detection of single channel currents in biological membranes in identifying variety of electrophysiological problems in the cellular level. In this paper, a simulation study of the voltage clamp technique has been presented to analyse current-voltage (I-V) characteristics of ion currents based on Luo-Rudy Phase-I (LR-I) cardiac model by using a Field Programmable Gate Array (FPGA). Nowadays, cardiac models are becoming increasingly complex which can cause a vast amount of time to run the simulation. Thus, a real-time hardware implementation using FPGA could be one of the best solutions for high-performance real-time systems as it provides high configurability and performance, and able to executes in parallel mode operation. For shorter time development while retaining high confidence results, FPGA-based rapid prototyping through HDL Coder from MATLAB software has been used to construct the algorithm for the simulation system. Basically, the HDL Coder is capable to convert the designed MATLAB Simulink blocks into hardware description language (HDL) for the FPGA implementation. As a result, the voltage-clamp fixed-point design of LR-I model has been successfully conducted in MATLAB Simulink and the simulation of the I-V characteristics of the ionic currents has been verified on Xilinx FPGA Virtex-6 XC6VLX240T development board through an FPGA-in-the-loop (FIL) simulation.
An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm

NASA Technical Reports Server (NTRS)

Weir, John M.; Wells, B. Earl

2003-01-01

Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modem FPGA's. In this paper, a distributed genetic algorithm that can be applied to the agent based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.
A FPGA Implementation of the CAR-FAC Cochlear Model.

PubMed

Xu, Ying; Thakur, Chetan S; Singh, Ram K; Hamilton, Tara Julia; Wang, Runchun M; van Schaik, André

2018-01-01

This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on.
A FPGA Implementation of the CAR-FAC Cochlear Model

PubMed Central

Xu, Ying; Thakur, Chetan S.; Singh, Ram K.; Hamilton, Tara Julia; Wang, Runchun M.; van Schaik, André

2018-01-01

This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on. PMID:29692700
FPGA Coprocessor for Accelerated Classification of Images

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.

2008-01-01

An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of recently developed hybrid Virtex-4FX field-programmable gate array (FPGA) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for increased onboard processing capability than previously demonstrated in software. The original C-language program demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, or cloud or unclassified. Current onboard processors, such as on EO-1, have limited computing power, extremely limited active storage capability and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program, and two newly formulated programs for a more capable expanded-linear-kernel and a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.
Analysis of High Speed Jets Produced by a Servo Tube Driven Liquid Jet Injector

NASA Astrophysics Data System (ADS)

Portaro, Rocco; Ng, Hoi Dick

2017-11-01

In today's healthcare environment many types of medication must be administered through the use of hypodermic needles. Although this practice has been in use for many years, drawbacks such as accidental needle stick injuries, transmission of deadly viruses and bio-hazardous waste are still present. This study focuses on improving a needle free technology known as liquid jet injection, through the implementation of a linear servo tube actuator for the construction of a fully closed loop liquid jet injection system. This device has the ability to deliver both micro- and macro- molecules, high viscosity fluids whilst providing real time control of the jet pressure profile for accurate depth and dispersion control. The experiments are conducted using a prototype that consists of a 3 kW servo tube actuator, coupled to a specially designed injection head allowing nozzle size and injection volume to be varied. The device is controlled via a high speed servo amplifier and FPGA. The high speed jets emanating from the injector are assessed via high speed photography and through the use of a force transducer. Preliminary results indicate that the system allows for accurate shaping of the jet pressure profile, making it possible to target different tissue depths/types accurately.
FPGA implementation cost and performance evaluation of IEEE 802.11 protocol encryption security schemes

NASA Astrophysics Data System (ADS)

Sklavos, N.; Selimis, G.; Koufopavlou, O.

2005-01-01

The explosive growth of internet and consumer demand for mobility has fuelled the exponential growth of wireless communications and networks. Mobile users want access to services and information, from both internet and personal devices, from a range of locations without the use of a cable medium. IEEE 802.11 is one of the most widely used wireless standards of our days. The amount of access and mobility into wireless networks requires a security infrastructure that protects communication within that network. The security of this protocol is based on the wired equivalent privacy (WEP) scheme. Currently, all the IEEE 802.11 market products support WEP. But recently, the 802.11i working group introduced the advanced encryption standard (AES), as the security scheme for the future IEEE 802.11 applications. In this paper, the hardware integrations of WEP and AES are studied. A field programmable gate array (FPGA) device has been used as the hardware implementation platform, for a fair comparison between the two security schemes. Measurements for the FPGA implementation cost, operating frequency, power consumption and performance are given.
FPGA implementation of predictive degradation model for engine oil lifetime

NASA Astrophysics Data System (ADS)

Idros, M. F. M.; Razak, A. H. A.; Junid, S. A. M. Al; Suliman, S. I.; Halim, A. K.

2018-03-01

This paper presents the implementation of linear regression model for degradation prediction on Register Transfer Logic (RTL) using QuartusII. A stationary model had been identified in the degradation trend for the engine oil in a vehicle in time series method. As for RTL implementation, the degradation model is written in Verilog HDL and the data input are taken at a certain time. Clock divider had been designed to support the timing sequence of input data. At every five data, a regression analysis is adapted for slope variation determination and prediction calculation. Here, only the negative value are taken as the consideration for the prediction purposes for less number of logic gate. Least Square Method is adapted to get the best linear model based on the mean values of time series data. The coded algorithm has been implemented on FPGA for validation purposes. The result shows the prediction time to change the engine oil.
Achieving High Performance with FPGA-Based Computing

PubMed Central

Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug

2011-01-01

Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088
High precision, fast ultrasonic thermometer based on measurement of the speed of sound in air

NASA Astrophysics Data System (ADS)

Huang, K. N.; Huang, C. F.; Li, Y. C.; Young, M. S.

2002-11-01

This study presents a microcomputer-based ultrasonic system which measures air temperature by detecting variations in the speed of sound in the air. Changes in the speed of sound are detected by phase shift variations of a 40 kHz continuous ultrasonic wave. In a test embodiment, two 40 kHz ultrasonic transducers are set face to face at a constant distance. Phase angle differences between transmitted and received signals are determined by a FPGA digital phase detector and then analyzed in an 89C51 single-chip microcomputer. Temperature is calculated and then sent to a LCD display and, optionally, to a PC. Accuracy of measurement is within 0.05 degC at an inter-transducer distance of 10 cm. Temperature variations are displayed within 10 ms. The main advantages of the proposed system are high resolution, rapid temperature measurement, noncontact measurement and easy implementation.

High-definition video display based on the FPGA and THS8200

NASA Astrophysics Data System (ADS)

Qian, Jia; Sui, Xiubao

2014-11-01

This paper presents a high-definition video display solution based on the FPGA and THS8200. THS8200 is a video decoder chip launched by TI company, this chip has three 10-bit DAC channels which can capture video data in both 4:2:2 and 4:4:4 formats, and its data synchronization can be either through the dedicated synchronization signals HSYNC and VSYNC, or extracted from the embedded video stream synchronization information SAV / EAV code. In this paper, we will utilize the address and control signals generated by FPGA to access to the data-storage array, and then the FPGA generates the corresponding digital video signals YCbCr. These signals combined with the synchronization signals HSYNC and VSYNC that are also generated by the FPGA act as the input signals of THS8200. In order to meet the bandwidth requirements of the high-definition TV, we adopt video input in the 4:2:2 format over 2×10-bit interface. THS8200 is needed to be controlled by FPGA with I2C bus to set the internal registers, and as a result, it can generate the synchronous signal that is satisfied with the standard SMPTE and transfer the digital video signals YCbCr into analog video signals YPbPr. Hence, the composite analog output signals YPbPr are consist of image data signal and synchronous signal which are superimposed together inside the chip THS8200. The experimental research indicates that the method presented in this paper is a viable solution for high-definition video display, which conforms to the input requirements of the new high-definition display devices.
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor

NASA Astrophysics Data System (ADS)

Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram

2007-09-01

Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the the full capacity and performance of an FPGA-based coprocessor.
Fast semivariogram computation using FPGA architectures

NASA Astrophysics Data System (ADS)

Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang

2015-02-01

The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments
System, apparatus and methods to implement high-speed network analyzers

DOEpatents

Ezick, James; Lethin, Richard; Ros-Giralt, Jordi; Szilagyi, Peter; Wohlford, David E

2015-11-10

Systems, apparatus and methods for the implementation of high-speed network analyzers are provided. A set of high-level specifications is used to define the behavior of the network analyzer emitted by a compiler. An optimized inline workflow to process regular expressions is presented without sacrificing the semantic capabilities of the processing engine. An optimized packet dispatcher implements a subset of the functions implemented by the network analyzer, providing a fast and slow path workflow used to accelerate specific processing units. Such dispatcher facility can also be used as a cache of policies, wherein if a policy is found, then packet manipulations associated with the policy can be quickly performed. An optimized method of generating DFA specifications for network signatures is also presented. The method accepts several optimization criteria, such as min-max allocations or optimal allocations based on the probability of occurrence of each signature input bit.
Optimizing latency in Xilinx FPGA implementations of the GBT

NASA Astrophysics Data System (ADS)

Muschter, S.; Baron, S.; Bohm, C.; Cachemiche, J.-P.; Soos, C.

2010-12-01

The GigaBit Transceiver (GBT) [1] system has been developed to replace the Timing, Trigger and Control (TTC) system [2], currently used by LHC, as well as to provide data transmission between on-detector and off-detector components in future sLHC detectors. A VHDL version of the GBT-SERDES, designed for FPGAs, was released in March 2010 as a GBT-FPGA Starter Kit for future GBT users and for off-detector GBT implementation [3]. This code was optimized for resource utilization [4], as the GBT protocol is very demanding. It was not, however, optimized for latency — which will be a critical parameter when used in the trigger path. The GBT-FPGA Starter Kit firmware was first analyzed in terms of latency by looking at the separate components of the VHDL version. Once the parts which contribute most to the latency were identified and modified, two possible optimizations were chosen, resulting in a latency reduced by a factor of three. The modifications were also analyzed in terms of logic utilization. The latency optimization results were compared with measurement results from a Virtex 6 ML605 development board [5] equipped with a XC6VLX240T with speedgrade-1 and the package FF1156. Bit error rate tests were also performed to ensure an error free operation. The two final optimizations were analyzed for utilization and compared with the original code, distributed in the Starter Kit.
Software-based high-level synthesis design of FPGA beamformers for synthetic aperture imaging.

PubMed

Amaro, Joao; Yiu, Billy Y S; Falcao, Gabriel; Gomes, Marco A C; Yu, Alfred C H

2015-05-01

Field-programmable gate arrays (FPGAs) can potentially be configured as beamforming platforms for ultrasound imaging, but a long design time and skilled expertise in hardware programming are typically required. In this article, we present a novel approach to the efficient design of FPGA beamformers for synthetic aperture (SA) imaging via the use of software-based high-level synthesis techniques. Software kernels (coded in OpenCL) were first developed to stage-wise handle SA beamforming operations, and their corresponding FPGA logic circuitry was emulated through a high-level synthesis framework. After design space analysis, the fine-tuned OpenCL kernels were compiled into register transfer level descriptions to configure an FPGA as a beamformer module. The processing performance of this beamformer was assessed through a series of offline emulation experiments that sought to derive beamformed images from SA channel-domain raw data (40-MHz sampling rate, 12 bit resolution). With 128 channels, our FPGA-based SA beamformer can achieve 41 frames per second (fps) processing throughput (3.44 × 10(8) pixels per second for frame size of 256 × 256 pixels) at 31.5 W power consumption (1.30 fps/W power efficiency). It utilized 86.9% of the FPGA fabric and operated at a 196.5 MHz clock frequency (after optimization). Based on these findings, we anticipate that FPGA and high-level synthesis can together foster rapid prototyping of real-time ultrasound processor modules at low power consumption budgets.
FPGA/NIOS Implementation of an Adaptive FIR Filter Using Linear Prediction to Reduce Narrow-Band RFI for Radio Detection of Cosmic Rays

NASA Astrophysics Data System (ADS)

Szadkowski, Zbigniew; Fraenkel, E. D.; van den Berg, Ad M.

2013-10-01

We present the FPGA/NIOS implementation of an adaptive finite impulse response (FIR) filter based on linear prediction to suppress radio frequency interference (RFI). This technique will be used for experiments that observe coherent radio emission from extensive air showers induced by ultra-high-energy cosmic rays. These experiments are designed to make a detailed study of the development of the electromagnetic part of air showers. Therefore, these radio signals provide information that is complementary to that obtained by water-Cherenkov detectors which are predominantly sensitive to the particle content of an air shower at ground. The radio signals from air showers are caused by the coherent emission due to geomagnetic and charge-excess processes. These emissions can be observed in the frequency band between 10-100 MHz. However, this frequency range is significantly contaminated by narrow-band RFI and other human-made distortions. A FIR filter implemented in the FPGA logic segment of the front-end electronics of a radio sensor significantly improves the signal-to-noise ratio. In this paper we discuss an adaptive filter which is based on linear prediction. The coefficients for the linear predictor (LP) are dynamically refreshed and calculated in the embedded NIOS processor, which is implemented in the same FPGA chip. The Levinson recursion, used to obtain the filter coefficients, is also implemented in the NIOS and is partially supported by direct multiplication in the DSP blocks of the logic FPGA segment. Tests confirm that the LP can be an alternative to other methods involving multiple time-to-frequency domain conversions using an FFT procedure. These multiple conversions draw heavily on the power consumption of the FPGA and are avoided by the linear prediction approach. Minimization of the power consumption is an important issue because the final system will be powered by solar panels. The FIR filter has been successfully tested in the Altera development kits
FPGA Implementation of Reed-Solomon Decoder for IEEE 802.16 WiMAX Systems using Simulink-Sysgen Design Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobrek, Miljko; Albright, Austin P

This paper presents FPGA implementation of the Reed-Solomon decoder for use in IEEE 802.16 WiMAX systems. The decoder is based on RS(255,239) code, and is additionally shortened and punctured according to the WiMAX specifications. Simulink model based on Sysgen library of Xilinx blocks was used for simulation and hardware implementation. At the end, simulation results and hardware implementation performances are presented.
HDL Based FPGA Interface Library for Data Acquisition and Multipurpose Real Time Algorithms

NASA Astrophysics Data System (ADS)

Fernandes, Ana M.; Pereira, R. C.; Sousa, J.; Batista, A. J. N.; Combo, A.; Carvalho, B. B.; Correia, C. M. B. A.; Varandas, C. A. F.

2011-08-01

The inherent parallelism of the logic resources, the flexibility in its configuration and the performance at high processing frequencies makes the field programmable gate array (FPGA) the most suitable device to be used both for real time algorithm processing and data transfer in instrumentation modules. Moreover, the reconfigurability of these FPGA based modules enables exploiting different applications on the same module. When using a reconfigurable module for various applications, the availability of a common interface library for easier implementation of the algorithms on the FPGA leads to more efficient development. The FPGA configuration is usually specified in a hardware description language (HDL) or other higher level descriptive language. The critical paths, such as the management of internal hardware clocks that require deep knowledge of the module behavior shall be implemented in HDL to optimize the timing constraints. The common interface library should include these critical paths, freeing the application designer from hardware complexity and able to choose any of the available high-level abstraction languages for the algorithm implementation. With this purpose a modular Verilog code was developed for the Virtex 4 FPGA of the in-house Transient Recorder and Processor (TRP) hardware module, based on the Advanced Telecommunications Computing Architecture (ATCA), with eight channels sampling at up to 400 MSamples/s (MSPS). The TRP was designed to perform real time Pulse Height Analysis (PHA), Pulse Shape Discrimination (PSD) and Pile-Up Rejection (PUR) algorithms at a high count rate (few Mevent/s). A brief description of this modular code is presented and examples of its use as an interface with end user algorithms, including a PHA with PUR, are described.
Chicago-St. Louis high speed rail plan

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stead, M.E.

1994-12-31

The Illinois Department of Transportation (IDOT), in cooperation with Amtrak, undertook the Chicago-St. Louis High Speed Rail Financial and Implementation Plan study in order to develop a realistic and achievable blueprint for implementation of high speed rail in the Chicago-St. Louis corridor. This report presents a summary of the Price Waterhouse Project Team`s analysis and the Financial and Implementation Plan for implementing high speed rail service in the Chicago-St. Louis corridor.
Design and implementation of interface units for high speed fiber optics local area networks and broadband integrated services digital networks

NASA Technical Reports Server (NTRS)

Tobagi, Fouad A.; Dalgic, Ismail; Pang, Joseph

1990-01-01

The design and implementation of interface units for high speed Fiber Optic Local Area Networks and Broadband Integrated Services Digital Networks are discussed. During the last years, a number of network adapters that are designed to support high speed communications have emerged. This approach to the design of a high speed network interface unit was to implement package processing functions in hardware, using VLSI technology. The VLSI hardware implementation of a buffer management unit, which is required in such architectures, is described.
Implementation of a sliding-mode-based position sensorless drive for high-speed micro permanent-magnet synchronous motors.

PubMed

Chi, Wen-Chun; Cheng, Ming-Yang

2014-03-01

Due to issues such as limited space, it is difficult if it is not impossible to employ a position sensor in the drive control of high-speed micro PMSMs. In order to alleviate this problem, this paper analyzes and implements a simple and robust position sensorless field-oriented control method of high-speed micro PMSMs based on the sliding-mode observer. In particular, the angular position and velocity of the rotor of the high-speed micro PMSM are estimated using the sliding-mode observer. This observer is able to accurately estimate rotor position in the low speed region and guarantee fast convergence of the observer in the high speed region. The proposed position sensorless control method is suitable for electric dental handpiece motor drives where a wide speed range operation is essential. The proposed sensorless FOC method is implemented using a cost-effective 16-bit microcontroller and tested in a prototype electric dental handpiece motor. Several experiments are performed to verify the effectiveness of the proposed method. © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
Design of a real-time system of moving ship tracking on-board based on FPGA in remote sensing images

NASA Astrophysics Data System (ADS)

Yang, Tie-jun; Zhang, Shen; Zhou, Guo-qing; Jiang, Chuan-xian

2015-12-01

With the broad attention of countries in the areas of sea transportation and trade safety, the requirements of efficiency and accuracy of moving ship tracking are becoming higher. Therefore, a systematic design of moving ship tracking onboard based on FPGA is proposed, which uses the Adaptive Inter Frame Difference (AIFD) method to track a ship with different speed. For the Frame Difference method (FD) is simple but the amount of computation is very large, it is suitable for the use of FPGA to implement in parallel. But Frame Intervals (FIs) of the traditional FD method are fixed, and in remote sensing images, a ship looks very small (depicted by only dozens of pixels) and moves slowly. By applying invariant FIs, the accuracy of FD for moving ship tracking is not satisfactory and the calculation is highly redundant. So we use the adaptation of FD based on adaptive extraction of key frames for moving ship tracking. A FPGA development board of Xilinx Kintex-7 series is used for simulation. The experiments show that compared with the traditional FD method, the proposed one can achieve higher accuracy of moving ship tracking, and can meet the requirement of real-time tracking in high image resolution.
FPGA implementation of image dehazing algorithm for real time applications

NASA Astrophysics Data System (ADS)

Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.

2017-09-01

Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomena, which makes this problem non trivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation system, etc. However, these applications require low latency of the preprocessing block. In this work, single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although conventional single image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two stage image dehazing architecture is introduced, wherein, dark channel and airlight are estimated in the first stage. Whereas, transmission map and intensity restoration are computed in the next stages. The algorithm is implemented using Xilinx Vivado software and validated by using Xilinx zc702 development board, which contains an Artix7 equivalent Field Programmable Gate Array (FPGA) and ARM Cortex A9 dual core processor. Additionally, high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second for the image resolution of 1920x1080 which is suitable of real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
An implementation of the SNR high speed network communication protocol (Receiver part)

NASA Astrophysics Data System (ADS)

Wan, Wen-Jyh

1995-03-01

This thesis work is to implement the receiver pan of the SNR high speed network transport protocol. The approach was to use the Systems of Communicating Machines (SCM) as the formal definition of the protocol. Programs were developed on top of the Unix system using C programming language. The Unix system features that were adopted for this implementation were multitasking, signals, shared memory, semaphores, sockets, timers and process control. The problems encountered, and solved, were signal loss, shared memory conflicts, process synchronization, scheduling, data alignment and errors in the SCM specification itself. The result was a correctly functioning program which implemented the SNR protocol. The system was tested using different connection modes, lost packets, duplicate packets and large data transfers. The contributions of this thesis are: (1) implementation of the receiver part of the SNR high speed transport protocol; (2) testing and integration with the transmitter part of the SNR transport protocol on an FDDI data link layered network; (3) demonstration of the functions of the SNR transport protocol such as connection management, sequenced delivery, flow control and error recovery using selective repeat methods of retransmission; and (4) modifications to the SNR transport protocol specification such as corrections for incorrect predicate conditions, defining of additional packet types formats, solutions for signal lost and processes contention problems etc.
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
FPGA-based trigger system for the Fermilab SeaQuest experimentz

NASA Astrophysics Data System (ADS)

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; Chang, Ting-Hua; Chang, Wen-Chen; Chen, Yen-Chu; Gilman, Ron; Nakano, Kenichi; Peng, Jen-Chieh; Wang, Su-Yin

2015-12-01

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton-nucleon interactions in a high rate environment. The trigger system consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz

DOE PAGES

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; ...

2015-09-10

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.

PubMed

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2016-10-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems-like Apache Spark and Hadoop-to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
FPGA implementation of a ZigBee wireless network control interface to transmit biomedical signals

NASA Astrophysics Data System (ADS)

Gómez López, M. A.; Goy, C. B.; Bolognini, P. C.; Herrera, M. C.

2011-12-01

In recent years, cardiac hemodynamic monitors have incorporated new technologies based on wireless sensor networks which can implement different types of communication protocols. More precisely, a digital conductance catheter system recently developed adds a wireless ZigBee module (IEEE 802.15.4 standards) to transmit cardiac signals (ECG, intraventricular pressure and volume) which would allow the physicians to evaluate the patient's cardiac status in a noninvasively way. The aim of this paper is to describe a control interface, implemented in a FPGA device, to manage a ZigBee wireless network. ZigBee technology is used due to its excellent performance including simplicity, low-power consumption, short-range transmission and low cost. FPGA internal memory stores 8-bit signals with which the control interface prepares the information packets. These data were send to the ZigBee END DEVICE module that receives and transmits wirelessly to the external COORDINATOR module. Using an USB port, the COORDINATOR sends the signals to a personal computer for displaying. Each functional block of control interface was assessed by means of temporal diagrams. Three biological signals, organized in packets and converted to RS232 serial protocol, were sucessfully transmitted and displayed in a PC screen. For this purpose, a custom-made graphical software was designed using LabView.

Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

PubMed

Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

2013-08-01

Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable aate Array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Ripple FPN reduced algorithm based on temporal high-pass filter and hardware implementation

NASA Astrophysics Data System (ADS)

Li, Yiyang; Li, Shuo; Zhang, Zhipeng; Jin, Weiqi; Wu, Lei; Jin, Minglei

2016-11-01

Cooled infrared detector arrays always suffer from undesired Ripple Fixed-Pattern Noise (FPN) when observe the scene of sky. The Ripple Fixed-Pattern Noise seriously affect the imaging quality of thermal imager, especially for small target detection and tracking. It is hard to eliminate the FPN by the Calibration based techniques and the current scene-based nonuniformity algorithms. In this paper, we present a modified space low-pass and temporal high-pass nonuniformity correction algorithm using adaptive time domain threshold (THP&GM). The threshold is designed to significantly reduce ghosting artifacts. We test the algorithm on real infrared in comparison to several previously published methods. This algorithm not only can effectively correct common FPN such as Stripe, but also has obviously advantage compared with the current methods in terms of detail protection and convergence speed, especially for Ripple FPN correction. Furthermore, we display our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA). The hardware implementation of the algorithm based on FPGA has two advantages: (1) low resources consumption, and (2) small hardware delay (less than 20 lines). The hardware has been successfully applied in actual system.
High-speed low-complexity video coding with EDiCTius: a DCT coding proposal for JPEG XS

NASA Astrophysics Data System (ADS)

Richter, Thomas; Fößel, Siegfried; Keinert, Joachim; Scherl, Christian

2017-09-01

In its 71th meeting, the JPEG committee issued a call for low complexity, high speed image coding, designed to address the needs of low-cost video-over-ip applications. As an answer to this call, Fraunhofer IIS and the Computing Center of the University of Stuttgart jointly developed an embedded DCT image codec requiring only minimal resources while maximizing throughput on FPGA and GPU implementations. Objective and subjective tests performed for the 73rd meeting confirmed its excellent performance and suitability for its purpose, and it was selected as one of the two key contributions for the development of a joined test model. In this paper, its authors describe the design principles of the codec, provide a high-level overview of the encoder and decoder chain and provide evaluation results on the test corpus selected by the JPEG committee.
Design of optical axis jitter control system for multi beam lasers based on FPGA

NASA Astrophysics Data System (ADS)

Ou, Long; Li, Guohui; Xie, Chuanlin; Zhou, Zhiqiang

2018-02-01

A design of optical axis closed-loop control system for multi beam lasers coherent combining based on FPGA was introduced. The system uses piezoelectric ceramics Fast Steering Mirrors (FSM) as actuator, the Fairfield spot detection of multi beam lasers by the high speed CMOS camera for optical detecting, a control system based on FPGA for real-time optical axis jitter suppression. The algorithm for optical axis centroid detecting and PID of anti-Integral saturation were realized by FPGA. Optimize the structure of logic circuit by reuse resource and pipeline, as a result of reducing logic resource but reduced the delay time, and the closed-loop bandwidth increases to 100Hz. The jitter of laser less than 40Hz was reduced 40dB. The cost of the system is low but it works stably.
Integration of multi-interface conversion channel using FPGA for modular photonic network

NASA Astrophysics Data System (ADS)

Janicki, Tomasz; Pozniak, Krzysztof T.; Romaniuk, Ryszard S.

2010-09-01

The article discusses the integration of different types of interfaces with FPGA circuits using a reconfigurable communication platform. The solution has been implemented in practice in a single node of a distributed measurement system. Construction of communication platform has been presented with its selected hardware modules, described in VHDL and implemented in FPGA circuits. The graphical user interface (GUI) has been described that allows a user to control the operation of the system. In the final part of the article selected practical solutions have been introduced. The whole measurement system resides on multi-gigabit optical network. The optical network construction is highly modular, reconfigurable and scalable.
Design and FPGA implementation for MAC layer of Ethernet PON

NASA Astrophysics Data System (ADS)

Zhu, Zengxi; Lin, Rujian; Chen, Jian; Ye, Jiajun; Chen, Xinqiao

2004-04-01

Ethernet passive optical network (EPON), which represents the convergence of low-cost, high-bandwidth and supporting multiple services, appears to be one of the best candidates for the next-generation access network. The work of standardizing EPON as a solution for access network is still underway in the IEEE802.3ah Ethernet in the first mile (EFM) task force. The final release is expected in 2004. Up to now, there has been no standard application specific integrated circuit (ASIC) chip available which fulfills the functions of media access control (MAC) layer of EPON. The MAC layer in EPON system has many functions, such as point-to-point emulation (P2PE), Ethernet MAC functionality, multi-point control protocol (MPCP), network operation, administration and maintenance (OAM) and link security. To implement those functions mentioned above, an embedded real-time operating system (RTOS) and a flexible programmable logic device (PLD) with an embedded processor are used. The software and hardware functions in MAC layer are realized through programming embedded microprocessor and field programmable gate array(FPGA). Finally, some experimental results are given in this paper. The method stated here can provide a valuable reference for developing EPON MAC layer ASIC.
Generic FPGA-Based Platform for Distributed IO in Proton Therapy Patient Safety Interlock System

NASA Astrophysics Data System (ADS)

Eichin, Michael; Carmona, Pablo Fernandez; Johansen, Ernst; Grossmann, Martin; Mayor, Alexandre; Erhardt, Daniel; Gomperts, Alexander; Regele, Harald; Bula, Christian; Sidler, Christof

2017-06-01

At the Paul Scherrer Institute (PSI) in Switzerland, cancer patients are treated with protons. Proton therapy at PSI has a long history and started in the 1980s. More than 30 years later, a new gantry has recently been installed in the existing facility. This new machine has been delivered by an industry partner. A big challenge is the integration of the vendor's safety system into the existing PSI environment. Different interface standards and the complexity of the system made it necessary to find a technical solution connecting an industry system to the existing PSI infrastructure. A novel very flexible distributed IO system based on field-programmable gate array (FPGA) technology was developed, supporting many different IO interface standards and high-speed communication links connecting the device to a PSI standard versa module eurocard-bus input output controller. This paper summarizes the features of the hardware technology, the FPGA framework with its high-speed communication link protocol, and presents our first measurement results.
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale

PubMed Central

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2017-01-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft’s FPGA deployment in its Bing search engine and Intel’s 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems—like Apache Spark and Hadoop—to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
An improved non-uniformity correction algorithm and its hardware implementation on FPGA

NASA Astrophysics Data System (ADS)

Rong, Shenghui; Zhou, Huixin; Wen, Zhigang; Qin, Hanlin; Qian, Kun; Cheng, Kuanhong

2017-09-01

The Non-uniformity of Infrared Focal Plane Arrays (IRFPA) severely degrades the infrared image quality. An effective non-uniformity correction (NUC) algorithm is necessary for an IRFPA imaging and application system. However traditional scene-based NUC algorithm suffers the image blurring and artificial ghosting. In addition, few effective hardware platforms have been proposed to implement corresponding NUC algorithms. Thus, this paper proposed an improved neural-network based NUC algorithm by the guided image filter and the projection-based motion detection algorithm. First, the guided image filter is utilized to achieve the accurate desired image to decrease the artificial ghosting. Then a projection-based moving detection algorithm is utilized to determine whether the correction coefficients should be updated or not. In this way the problem of image blurring can be overcome. At last, an FPGA-based hardware design is introduced to realize the proposed NUC algorithm. A real and a simulated infrared image sequences are utilized to verify the performance of the proposed algorithm. Experimental results indicated that the proposed NUC algorithm can effectively eliminate the fix pattern noise with less image blurring and artificial ghosting. The proposed hardware design takes less logic elements in FPGA and spends less clock cycles to process one frame of image.
Imaging photomultiplier array with integrated amplifiers and high-speed USB interfacea)

NASA Astrophysics Data System (ADS)

Blacksell, M.; Wach, J.; Anderson, D.; Howard, J.; Collis, S. M.; Blackwell, B. D.; Andruczyk, D.; James, B. W.

2008-10-01

Multianode photomultiplier tube (PMT) arrays are finding application as convenient high-speed light sensitive devices for plasma imaging. This paper describes the development of a USB-based "plug-n-play" 16-channel PMT camera with 16bits simultaneous acquisition of 16 signal channels at rates up to 2MS/s per channel. The preamplifiers and digital hardware are packaged in a compact housing which incorporates magnetic shielding, on-board generation of the high-voltage PMT bias, an optical filter mount and slits, and F-mount lens adaptor. Triggering, timing, and acquisition are handled by four field-programmable gate arrays (FPGAs) under instruction from a master FPGA controlled by a computer with a LABVIEW interface. We present technical design details and specifications and illustrate performance with high-speed images obtained on the H-1 heliac at the ANU.
Imaging photomultiplier array with integrated amplifiers and high-speed USB interface.

PubMed

Blacksell, M; Wach, J; Anderson, D; Howard, J; Collis, S M; Blackwell, B D; Andruczyk, D; James, B W

2008-10-01

Multianode photomultiplier tube (PMT) arrays are finding application as convenient high-speed light sensitive devices for plasma imaging. This paper describes the development of a USB-based "plug-n-play" 16-channel PMT camera with 16 bits simultaneous acquisition of 16 signal channels at rates up to 2 MSs per channel. The preamplifiers and digital hardware are packaged in a compact housing which incorporates magnetic shielding, on-board generation of the high-voltage PMT bias, an optical filter mount and slits, and F-mount lens adaptor. Triggering, timing, and acquisition are handled by four field-programmable gate arrays (FPGAs) under instruction from a master FPGA controlled by a computer with a LABVIEW interface. We present technical design details and specifications and illustrate performance with high-speed images obtained on the H-1 heliac at the ANU.
High-Speed On-Board Data Processing Platform for LIDAR Projects at NASA Langley Research Center

NASA Astrophysics Data System (ADS)

Beyon, J.; Ng, T. K.; Davis, M. J.; Adams, J. K.; Lin, B.

2015-12-01

The project called High-Speed On-Board Data Processing for Science Instruments (HOPS) has been funded by NASA Earth Science Technology Office (ESTO) Advanced Information Systems Technology (AIST) program during April, 2012 - April, 2015. HOPS is an enabler for science missions with extremely high data processing rates. In this three-year effort of HOPS, Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS) and 3-D Winds were of interest in particular. As for ASCENDS, HOPS replaces time domain data processing with frequency domain processing while making the real-time on-board data processing possible. As for 3-D Winds, HOPS offers real-time high-resolution wind profiling with 4,096-point fast Fourier transform (FFT). HOPS is adaptable with quick turn-around time. Since HOPS offers reusable user-friendly computational elements, its FPGA IP Core can be modified for a shorter development period if the algorithm changes. The FPGA and memory bandwidth of HOPS is 20 GB/sec while the typical maximum processor-to-SDRAM bandwidth of the commercial radiation tolerant high-end processors is about 130-150 MB/sec. The inter-board communication bandwidth of HOPS is 4 GB/sec while the effective processor-to-cPCI bandwidth of commercial radiation tolerant high-end boards is about 50-75 MB/sec. Also, HOPS offers VHDL cores for the easy and efficient implementation of ASCENDS and 3-D Winds, and other similar algorithms. A general overview of the 3-year development of HOPS is the goal of this presentation.
Central FPGA-based destination and load control in the LHCb MHz event readout

NASA Astrophysics Data System (ADS)

Jacobsson, R.

2012-10-01

The readout strategy of the LHCb experiment is based on complete event readout at 1 MHz. A set of 320 sub-detector readout boards transmit event fragments at total rate of 24.6 MHz at a bandwidth usage of up to 70 GB/s over a commercial switching network based on Gigabit Ethernet to a distributed event building and high-level trigger processing farm with 1470 individual multi-core computer nodes. In the original specifications, the readout was based on a pure push protocol. This paper describes the proposal, implementation, and experience of a non-conventional mixture of a push and a pull protocol, akin to credit-based flow control. An FPGA-based central master module, partly operating at the LHC bunch clock frequency of 40.08 MHz and partly at a double clock speed, is in charge of the entire trigger and readout control from the front-end electronics up to the high-level trigger farm. One FPGA is dedicated to controlling the event fragment packing in the readout boards, the assignment of the farm node destination for each event, and controls the farm load based on an asynchronous pull mechanism from each farm node. This dynamic readout scheme relies on generic event requests and the concept of node credit allowing load control and trigger rate regulation as a function of the global farm load. It also allows the vital task of fast central monitoring and automatic recovery in-flight of failing nodes while maintaining dead-time and event loss at a minimum. This paper demonstrates the strength and suitability of implementing this real-time task for a very large distributed system in an FPGA where no random delays are introduced, and where extreme reliability and accurate event accounting are fundamental requirements. It was in use during the entire commissioning phase of LHCb and has been in faultless operation during the first two years of physics luminosity data taking.
Design and FPGA Implementation of a Universal Chaotic Signal Generator Based on the Verilog HDL Fixed-Point Algorithm and State Machine Control

NASA Astrophysics Data System (ADS)

Qiu, Mo; Yu, Simin; Wen, Yuqiong; Lü, Jinhu; He, Jianbin; Lin, Zhuosheng

In this paper, a novel design methodology and its FPGA hardware implementation for a universal chaotic signal generator is proposed via the Verilog HDL fixed-point algorithm and state machine control. According to continuous-time or discrete-time chaotic equations, a Verilog HDL fixed-point algorithm and its corresponding digital system are first designed. In the FPGA hardware platform, each operation step of Verilog HDL fixed-point algorithm is then controlled by a state machine. The generality of this method is that, for any given chaotic equation, it can be decomposed into four basic operation procedures, i.e. nonlinear function calculation, iterative sequence operation, iterative values right shifting and ceiling, and chaotic iterative sequences output, each of which corresponds to only a state via state machine control. Compared with the Verilog HDL floating-point algorithm, the Verilog HDL fixed-point algorithm can save the FPGA hardware resources and improve the operation efficiency. FPGA-based hardware experimental results validate the feasibility and reliability of the proposed approach.
Hardware Implementation of 32-Bit High-Speed Direct Digital Frequency Synthesizer

PubMed Central

Ibrahim, Salah Hasan; Ali, Sawal Hamid Md.; Islam, Md. Shabiul

2014-01-01

The design and implementation of a high-speed direct digital frequency synthesizer are presented. A modified Brent-Kung parallel adder is combined with pipelining technique to improve the speed of the system. A gated clock technique is proposed to reduce the number of registers in the phase accumulator design. The quarter wave symmetry technique is used to store only one quarter of the sine wave. The ROM lookup table (LUT) is partitioned into three 4-bit sub-ROMs based on angular decomposition technique and trigonometric identity. Exploiting the advantages of sine-cosine symmetrical attributes together with XOR logic gates, one sub-ROM block can be removed from the design. These techniques, compressed the ROM into 368 bits. The ROM compressed ratio is 534.2 : 1, with only two adders, two multipliers, and XOR-gates with high frequency resolution of 0.029 Hz. These techniques make the direct digital frequency synthesizer an attractive candidate for wireless communication applications. PMID:24991635
A New FPGA Architecture of FAST and BRIEF Algorithm for On-Board Corner Detection and Matching.

PubMed

Huang, Jingjin; Zhou, Guoqing; Zhou, Xiang; Zhang, Rongting

2018-03-28

Although some researchers have proposed the Field Programmable Gate Array (FPGA) architectures of Feature From Accelerated Segment Test (FAST) and Binary Robust Independent Elementary Features (BRIEF) algorithm, there is no consideration of image data storage in these traditional architectures that will result in no image data that can be reused by the follow-up algorithms. This paper proposes a new FPGA architecture that considers the reuse of sub-image data. In the proposed architecture, a remainder-based method is firstly designed for reading the sub-image, a FAST detector and a BRIEF descriptor are combined for corner detection and matching. Six pairs of satellite images with different textures, which are located in the Mentougou district, Beijing, China, are used to evaluate the performance of the proposed architecture. The Modelsim simulation results found that: (i) the proposed architecture is effective for sub-image reading from DDR3 at a minimum cost; (ii) the FPGA implementation is corrected and efficient for corner detection and matching, such as the average value of matching rate of natural areas and artificial areas are approximately 67% and 83%, respectively, which are close to PC's and the processing speed by FPGA is approximately 31 and 2.5 times faster than those by PC processing and by GPU processing, respectively.
FPGA Based Adaptive Rate and Manifold Pattern Projection for Structured Light 3D Camera System †

PubMed Central

Lee, Sukhan

2018-01-01

The quality of the captured point cloud and the scanning speed of a structured light 3D camera system depend upon their capability of handling the object surface of a large reflectance variation in the trade-off of the required number of patterns to be projected. In this paper, we propose and implement a flexible embedded framework that is capable of triggering the camera single or multiple times for capturing single or multiple projections within a single camera exposure setting. This allows the 3D camera system to synchronize the camera and projector even for miss-matched frame rates such that the system is capable of projecting different types of patterns for different scan speed applications. This makes the system capturing a high quality of 3D point cloud even for the surface of a large reflectance variation while achieving a high scan speed. The proposed framework is implemented on the Field Programmable Gate Array (FPGA), where the camera trigger is adaptively generated in such a way that the position and the number of triggers are automatically determined according to camera exposure settings. In other words, the projection frequency is adaptive to different scanning applications without altering the architecture. In addition, the proposed framework is unique as it does not require any external memory for storage because pattern pixels are generated in real-time, which minimizes the complexity and size of the application-specific integrated circuit (ASIC) design and implementation. PMID:29642506
FPGA-based voltage and current dual drive system for high frame rate electrical impedance tomography.

PubMed

Khan, Shadab; Manwaring, Preston; Borsic, Andrea; Halter, Ryan

2015-04-01

Electrical impedance tomography (EIT) is used to image the electrical property distribution of a tissue under test. An EIT system comprises complex hardware and software modules, which are typically designed for a specific application. Upgrading these modules is a time-consuming process, and requires rigorous testing to ensure proper functioning of new modules with the existing ones. To this end, we developed a modular and reconfigurable data acquisition (DAQ) system using National Instruments' (NI) hardware and software modules, which offer inherent compatibility over generations of hardware and software revisions. The system can be configured to use up to 32-channels. This EIT system can be used to interchangeably apply current or voltage signal, and measure the tissue response in a semi-parallel fashion. A novel signal averaging algorithm, and 512-point fast Fourier transform (FFT) computation block was implemented on the FPGA. FFT output bins were classified as signal or noise. Signal bins constitute a tissue's response to a pure or mixed tone signal. Signal bins' data can be used for traditional applications, as well as synchronous frequency-difference imaging. Noise bins were used to compute noise power on the FPGA. Noise power represents a metric of signal quality, and can be used to ensure proper tissue-electrode contact. Allocation of these computationally expensive tasks to the FPGA reduced the required bandwidth between PC, and the FPGA for high frame rate EIT. In 16-channel configuration, with a signal-averaging factor of 8, the DAQ frame rate at 100 kHz exceeded 110 frames s (-1), and signal-to-noise ratio exceeded 90 dB across the spectrum. Reciprocity error was found to be for frequencies up to 1 MHz. Static imaging experiments were performed on a high-conductivity inclusion placed in a saline filled tank; the inclusion was clearly localized in the reconstructions obtained for both absolute current and voltage mode data.
A FPGA-based Measurement System for Nonvolatile Semiconductor Memory Characterization

NASA Astrophysics Data System (ADS)

Bu, Jiankang; White, Marvin

2002-03-01

Low voltage, long retention, high density SONOS nonvolatile semiconductor memory (NVSM) devices are ideally suited for PCMCIA, FLASH and 'smart' cards. The SONOS memory transistor requires characterization with an accurate, rapid measurement system with minimum disturbance to the device. The FPGA-based measurement system includes three parts: 1) a pattern generator implemented with XILINX FPGAs and corresponding software, 2) a high-speed, constant-current, threshold voltage detection circuit, 3) and a data evaluation program, implemented with a LABVIEW program. Fig. 1 shows the general block diagram of the FPGA-based measurement system. The function generator is designed and simulated with XILINX Foundation Software. Under the control of the specific erase/write/read pulses, the analog detect circuit applies operational modes to the SONOS device under test (DUT) and determines the change of the memory-state of the SONOS nonvolatile memory transistor. The TEK460 digitizes the analog threshold voltage output and sends to the PC computer. The data is filtered and averaged with a LABVIEWTM program running on the PC computer and displayed on the monitor in real time. We have implemented the pattern generator with XILINX FPGAs. Fig. 2 shows the block diagram of the pattern generator. We realized the logic control by a method of state machine design. Fig. 3 shows a small part of the state machine. The flexibility of the FPGAs enhances the capabilities of this system and allows measurement variations without hardware changes. The characterization of the nonvolatile memory transistor device under test (DUT), as function of programming voltage and time, is achieved by a high-speed, constant-current threshold voltage detection circuit. The analog detection circuit incorporating fast analog switches controlled digitally with the FPGAs. The schematic circuit diagram is shown in Fig. 4. The various operational modes for the DUT are realized with control signals applied to the
Real-time machine vision system using FPGA and soft-core processor

NASA Astrophysics Data System (ADS)

Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad

2012-06-01

This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.

Distributed Continuous Event-Based Data Acquisition Using the IEEE 1588 Synchronization and FlexRIO FPGA

NASA Astrophysics Data System (ADS)

Taliercio, C.; Luchetta, A.; Manduchi, G.; Rigoni, A.

2017-07-01

High-speed event driven acquisition is normally performed by analog-to-digital converter (ADC) boards with a given number of pretrigger sample and posttrigger sample that are recorded upon the occurrence of a hardware trigger. A direct physical connection is, therefore, required between the source of event (trigger) and the ADC, because any other software-based communication method would introduce a delay in triggering that would turn out to be not acceptable in many cases. This paper proposes a solution for the relaxation of the event communication time that can be, in this case, carried out by software messaging (e.g., via an LAN), provided that the system components are synchronized in time using the IEEE 1588 synchronization mechanism. The information about the exact event occurrence time is contained in the software packet that is sent to communicate the event and is used by the ADC FPGA to identify the exact sample in the ADC sample queue. The length of the ADC sample queue will depend on the maximum delay in software event message communication time. A prototype implementation using a National FlexRIO FPGA board connected with an ADC device is presented as the proof of concept.
FPGA Based Reconfigurable ATM Switch Test Bed

NASA Technical Reports Server (NTRS)

Chu, Pong P.; Jones, Robert E.

1998-01-01

Various issues associated with "FPGA Based Reconfigurable ATM Switch Test Bed" are presented in viewgraph form. Specific topics include: 1) Network performance evaluation; 2) traditional approaches; 3) software simulation; 4) hardware emulation; 5) test bed highlights; 6) design environment; 7) test bed architecture; 8) abstract sheared-memory switch; 9) detailed switch diagram; 10) traffic generator; 11) data collection circuit and user interface; 12) initial results; and 13) the following conclusions: Advances in FPGA make hardware emulation feasible for performance evaluation, hardware emulation can provide several orders of magnitude speed-up over software simulation; due to the complexity of hardware synthesis process, development in emulation is much more difficult than simulation and requires knowledge in both networks and digital design.
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in details. The Appendix lists the kernel source code.« less
FPGA implementation of a configurable neuromorphic CPG-based locomotion controller.

PubMed

Barron-Zambrano, Jose Hugo; Torres-Huitzil, Cesar

2013-09-01

Neuromorphic engineering is a discipline devoted to the design and development of computational hardware that mimics the characteristics and capabilities of neuro-biological systems. In recent years, neuromorphic hardware systems have been implemented using a hybrid approach incorporating digital hardware so as to provide flexibility and scalability at the cost of power efficiency and some biological realism. This paper proposes an FPGA-based neuromorphic-like embedded system on a chip to generate locomotion patterns of periodic rhythmic movements inspired by Central Pattern Generators (CPGs). The proposed implementation follows a top-down approach where modularity and hierarchy are two desirable features. The locomotion controller is based on CPG models to produce rhythmic locomotion patterns or gaits for legged robots such as quadrupeds and hexapods. The architecture is configurable and scalable for robots with either different morphologies or different degrees of freedom (DOFs). Experiments performed on a real robot are presented and discussed. The obtained results demonstrate that the CPG-based controller provides the necessary flexibility to generate different rhythmic patterns at run-time suitable for adaptable locomotion. Copyright © 2013 Elsevier Ltd. All rights reserved.
High-Speed, High-Resolution Time-to-Digital Conversion

NASA Technical Reports Server (NTRS)

Katz, Richard; Kleyner, Igor; Garcia, Rafael

2013-01-01

This innovation is a series of time-tag pulses from a photomultiplier tube, featuring short time interval between pulses (e.g., 2.5 ns). Using the previous art, dead time between pulses is too long, or too much hardware is required, including a very-high-speed demultiplexer. A faster method is needed. The goal of this work is to provide circuits to time-tag pulses that arrive at a high rate using the hardwired logic in an FPGA - specifically the carry chain - to create what is (in effect) an analog delay line. High-speed pulses travel down the chain in a "wave." For instance, a pulse train has been demonstrated from a 1- GHz source reliably traveling down the carry chain. The size of the carry chain is over 10 ns in the time domain. Thus, multiple pulses will travel down the carry chain in a wave simultaneously. A register clocked by a low-skew clock takes a "snapshot" of the wave. Relatively simple logic can extract the pulses from the snapshot picture by detecting the transitions between logic states. The propagation delay of CMOS (complementary metal oxide semiconductor) logic circuits will differ and/or change as a result of temperature, voltage, age, radiation, and manufacturing variances. The time-to-digital conversion circuits can be calibrated with test signals, or the changes can be nulled by a separate on-die calibration channel, in a closed loop circuit.
Computing Models for FPGA-Based Accelerators

PubMed Central

Herbordt, Martin C.; Gu, Yongfeng; VanCourt, Tom; Model, Josh; Sukhwani, Bharat; Chiu, Matt

2011-01-01

Field-programmable gate arrays are widely considered as accelerators for compute-intensive applications. A critical phase of FPGA application development is finding and mapping to the appropriate computing model. FPGA computing enables models with highly flexible fine-grained parallelism and associative operations such as broadcast and collective response. Several case studies demonstrate the effectiveness of using these computing models in developing FPGA applications for molecular modeling. PMID:21603152
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System

NASA Astrophysics Data System (ADS)

Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.

2016-03-01

We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators — data rates exceeding a terabyte per second — are becoming common. The ICE system is built around a custom FPGA motherboard that makes use of an Xilinx Kintex-7 FPGA and ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array. Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio
A 7.4 ps FPGA-Based TDC with a 1024-Unit Measurement Matrix

PubMed Central

Zhang, Min; Wang, Hai; Liu, Yan

2017-01-01

In this paper, a high-resolution time-to-digital converter (TDC) based on a field programmable gate array (FPGA) device is proposed and tested. During the implementation, a new architecture of TDC is proposed which consists of a measurement matrix with 1024 units. The utilization of routing resources as the delay elements distinguishes the proposed design from other existing designs, which contributes most to the device insensitivity to variations of temperature and voltage. Experimental results suggest that the measurement resolution is 7.4 ps, and the INL (integral nonlinearity) and DNL (differential nonlinearity) are 11.6 ps and 5.5 ps, which indicates that the proposed TDC offers high performance among the available TDCs. Benefitting from the FPGA platform, the proposed TDC has superiorities in easy implementation, low cost, and short development time. PMID:28420121
A 7.4 ps FPGA-Based TDC with a 1024-Unit Measurement Matrix.

PubMed

Zhang, Min; Wang, Hai; Liu, Yan

2017-04-14

In this paper, a high-resolution time-to-digital converter (TDC) based on a field programmable gate array (FPGA) device is proposed and tested. During the implementation, a new architecture of TDC is proposed which consists of a measurement matrix with 1024 units. The utilization of routing resources as the delay elements distinguishes the proposed design from other existing designs, which contributes most to the device insensitivity to variations of temperature and voltage. Experimental results suggest that the measurement resolution is 7.4 ps, and the INL (integral nonlinearity) and DNL (differential nonlinearity) are 11.6 ps and 5.5 ps, which indicates that the proposed TDC offers high performance among the available TDCs. Benefitting from the FPGA platform, the proposed TDC has superiorities in easy implementation, low cost, and short development time.
Implementation in an FPGA circuit of Edge detection algorithm based on the Discrete Wavelet Transforms

NASA Astrophysics Data System (ADS)

Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia

2017-07-01

The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many imaging systems in real time. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters on the three directions (Horizontal, Vertical and Diagonal) of the image is used to present the maximum of the existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Sparten6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
Onboard FPGA-based SAR processing for future spaceborne systems

NASA Technical Reports Server (NTRS)

Le, Charles; Chan, Samuel; Cheng, Frank; Fang, Winston; Fischman, Mark; Hensley, Scott; Johnson, Robert; Jourdan, Michael; Marina, Miguel; Parham, Bruce;

2004-01-01

We present a real-time high-performance and fault-tolerant FPGA-based hardware architecture for the processing of synthetic aperture radar (SAR) images in future spaceborne system. In particular, we will discuss the integrated design approach, from top-level algorithm specifications and system requirements, design methodology, functional verification and performance validation, down to hardware design and implementation.

Hardware Implementation of a Bilateral Subtraction Filter

NASA Technical Reports Server (NTRS)

Huertas, Andres; Watson, Robert; Villalpando, Carlos; Goldberg, Steven

2009-01-01

A bilateral subtraction filter has been implemented as a hardware module in the form of a field-programmable gate array (FPGA). In general, a bilateral subtraction filter is a key subsystem of a high-quality stereoscopic machine vision system that utilizes images that are large and/or dense. Bilateral subtraction filters have been implemented in software on general-purpose computers, but the processing speeds attainable in this way even on computers containing the fastest processors are insufficient for real-time applications. The present FPGA bilateral subtraction filter is intended to accelerate processing to real-time speed and to be a prototype of a link in a stereoscopic-machine- vision processing chain, now under development, that would process large and/or dense images in real time and would be implemented in an FPGA. In terms that are necessarily oversimplified for the sake of brevity, a bilateral subtraction filter is a smoothing, edge-preserving filter for suppressing low-frequency noise. The filter operation amounts to replacing the value for each pixel with a weighted average of the values of that pixel and the neighboring pixels in a predefined neighborhood or window (e.g., a 9 9 window). The filter weights depend partly on pixel values and partly on the window size. The present FPGA implementation of a bilateral subtraction filter utilizes a 9 9 window. This implementation was designed to take advantage of the ability to do many of the component computations in parallel pipelines to enable processing of image data at the rate at which they are generated. The filter can be considered to be divided into the following parts (see figure): a) An image pixel pipeline with a 9 9- pixel window generator, b) An array of processing elements; c) An adder tree; d) A smoothing-and-delaying unit; and e) A subtraction unit. After each 9 9 window is created, the affected pixel data are fed to the processing elements. Each processing element is fed the pixel value for
An improved real time superresolution FPGA system

NASA Astrophysics Data System (ADS)

Lakshmi Narasimha, Pramod; Mudigoudar, Basavaraj; Yue, Zhanfeng; Topiwala, Pankaj

2009-05-01

In numerous computer vision applications, enhancing the quality and resolution of captured video can be critical. Acquired video is often grainy and low quality due to motion, transmission bottlenecks, etc. Postprocessing can enhance it. Superresolution greatly decreases camera jitter to deliver a smooth, stabilized, high quality video. In this paper, we extend previous work on a real-time superresolution application implemented in ASIC/FPGA hardware. A gradient based technique is used to register the frames at the sub-pixel level. Once we get the high resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. The algorithm was first tested in software and migrated to hardware, to achieve 320x240 -> 1280x960, about 30 fps, a stunning superresolution by 16X in total pixels. Various input parameters, such as size of input image, enlarging factor and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab Simulink as well as in FPGA hardware, which gives us a fine balance between the number of bits and performance. The proposed system is robust and highly efficient. We have shown the performance improvement of the hardware superresolution over the software version (C code).
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based onmore » a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, defined by where Vin is the primary-side DC bus voltage, Vout is the secondary-side DC bus voltage, and n is the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, and user-settable switching frequencies and synchronized control loop update rates of tens of kHz.« less
Multichannel FPGA based MVT system for high precision time (20 ps RMS) and charge measurement

NASA Astrophysics Data System (ADS)

Pałka, M.; Strzempek, P.; Korcyl, G.; Bednarski, T.; Niedźwiecki, Sz.; Białas, P.; Czerwiński, E.; Dulski, K.; Gajos, A.; Głowacz, B.; Gorgol, M.; Jasińska, B.; Kamińska, D.; Kajetanowicz, M.; Kowalski, P.; Kozik, T.; Krzemień, W.; Kubicz, E.; Mohhamed, M.; Raczyński, L.; Rudy, Z.; Rundel, O.; Salabura, P.; Sharma, N. G.; Silarski, M.; Smyrski, J.; Strzelecki, A.; Wieczorek, A.; Wiślicki, W.; Zieliński, M.; Zgardzińska, B.; Moskal, P.

2017-08-01

In this article it is presented an FPGA based Multi-Voltage Threshold (MVT) system which allows of sampling fast signals (1-2 ns rising and falling edge) in both voltage and time domain. It is possible to achieve a precision of time measurement of 20 ps RMS and reconstruct charge of signals, using a simple approach, with deviation from real value smaller than 10%. Utilization of the differential inputs of an FPGA chip as comparators together with an implementation of a TDC inside an FPGA allowed us to achieve a compact multi-channel system characterized by low power consumption and low production costs. This paper describes realization and functioning of the system comprising 192-channel TDC board and a four mezzanine cards which split incoming signals and discriminate them. The boards have been used to validate a newly developed Time-of-Flight Positron Emission Tomography system based on plastic scintillators. The achieved full system time resolution of σ(TOF) ≈ 68 ps is by factor of two better with respect to the current TOF-PET systems.
Dynamic high-speed acquisition system design of transmission error with USB based on LabVIEW and FPGA

NASA Astrophysics Data System (ADS)

Zheng, Yong; Chen, Yan

2013-10-01

To realize the design of dynamic acquisition system for real-time detection of transmission chain error is very important to improve the machining accuracy of machine tool. In this paper, the USB controller and FPGA is used for hardware platform design, combined with LabVIEW to design user applications, NI-VISA is taken for develop USB drivers, and ultimately achieve the dynamic acquisition system design of transmission error
FPGA implementation for real-time background subtraction based on Horprasert model.

PubMed

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W.
FPGA Implementation for Real-Time Background Subtraction Based on Horprasert Model

PubMed Central

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J.; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W. PMID:22368487
STRS Compliant FPGA Waveform Development

NASA Technical Reports Server (NTRS)

Nappier, Jennifer; Downey, Joseph; Mortensen, Dale

2008-01-01

The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. The extension of STRS to the SSP hardware will promote easier waveform reconfiguration and reuse. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. A FPGA-based transmit waveform implementation of the proposed standard interfaces on a laboratory breadboard SDR will be discussed.
Cleveland-Columbus-Cincinnati high-speed rail study

DOT National Transportation Integrated Search

2001-07-01

In the past five years, the evaluation of different high-speed rail (HSR) studies in the Midwest has resulted in a realization that high speed rail, with speeds greater than 110 miles per hour, is too expensive in the short term to be implemented in ...

High-Performance CCSDS AOS Protocol Implementation in FPGA

NASA Technical Reports Server (NTRS)

Clare, Loren P.; Torgerson, Jordan L.; Pang, Jackson

2010-01-01

The Consultative Committee for Space Data Systems (CCSDS) Advanced Orbiting Systems (AOS) space data link protocol provides a framing layer between channel coding such as LDPC (low-density parity-check) and higher-layer link multiplexing protocols such as CCSDS Encapsulation Service, which is described in the following article. Recent advancement in RF modem technology has allowed multi-megabit transmission over space links. With this increase in data rate, the CCSDS AOS protocol implementation needs to be optimized to both reduce energy consumption and operate at a high rate.
FPGA and USB based control board for quantum random number generator

NASA Astrophysics Data System (ADS)

Wang, Jian; Wan, Xu; Zhang, Hong-Fei; Gao, Yuan; Chen, Teng-Yun; Liang, Hao

2009-09-01

The design and implementation of FPGA-and-USB-based control board for quantum experiments are discussed. The usage of quantum true random number generator, control- logic in FPGA and communication with computer through USB protocol are proposed in this paper. Programmable controlled signal input and output ports are implemented. The error-detections of data frame header and frame length are designed. This board has been used in our decoy-state based quantum key distribution (QKD) system successfully.
Mitigated FPGA design of multi-gigabit transceivers for application in high radiation environments of High Energy Physics experiments

DOE PAGES

Brusati, M.; Camplani, A.; Cannon, M.; ...

2017-02-20

SRAM-ba8ed Field Programmable Gate Array (FPGA) logic devices arc very attractive in applications where high data throughput is needed, such as the latest generation of High Energy Physics (HEP) experiments. FPGAs have been rarely used in such experiments because of their sensitivity to radiation. The present paper proposes a mitigation approach applied to commercial FPGA devices to meet the reliability requirements for the front-end electronics of the Liquid Argon (LAr) electromagnetic calorimeter of the ATLAS experiment, located at CERN. Particular attention will be devoted to define a proper mitigation scheme of the multi-gigabit transceivers embedded in the FPGA, which ismore » a critical part of the LAr data acquisition chain. A demonstrator board is being developed to validate the proposed methodology. :!\\litigation techniques such as Triple Modular Redundancy (T:t\\IR) and scrubbing will be used to increase the robustness of the design and to maximize the fault tolerance from Single-Event Upsets (SEUs).« less
Implementing a GPU-based numerical algorithm for modelling dynamics of a high-speed train

NASA Astrophysics Data System (ADS)

Sytov, E. S.; Bratus, A. S.; Yurchenko, D.

2018-04-01

This paper discusses the initiative of implementing a GPU-based numerical algorithm for studying various phenomena associated with dynamics of a high-speed railway transport. The proposed numerical algorithm for calculating a critical speed of the bogie is based on the first Lyapunov number. Numerical algorithm is validated by analytical results, derived for a simple model. A dynamic model of a carriage connected to a new dual-wheelset flexible bogie is studied for linear and dry friction damping. Numerical results obtained by CPU, MPU and GPU approaches are compared and appropriateness of these methods is discussed.
Optimized FPGA Implementation of the Thyroid Hormone Secretion Mechanism Using CAD Tools.

PubMed

Alghazo, Jaafar M

2017-02-01

The goal of this paper is to implement the secretion mechanism of the Thyroid Hormone (TH) based on bio-mathematical differential eqs. (DE) on an FPGA chip. Hardware Descriptive Language (HDL) is used to develop a behavioral model of the mechanism derived from the DE. The Thyroid Hormone secretion mechanism is simulated with the interaction of the related stimulating and inhibiting hormones. Synthesis of the simulation is done with the aid of CAD tools and downloaded on a Field Programmable Gate Arrays (FPGAs) Chip. The chip output shows identical behavior to that of the designed algorithm through simulation. It is concluded that the chip mimics the Thyroid Hormone secretion mechanism. The chip, operating in real-time, is computer-independent stand-alone system.
Design of an FPGA-based electronic flow regulator (EFR) for spacecraft propulsion system

NASA Astrophysics Data System (ADS)

Manikandan, J.; Jayaraman, M.; Jayachandran, M.

2011-02-01

This paper describes a scheme for electronically regulating the flow of propellant to the thruster from a high-pressure storage tank used in spacecraft application. Precise flow delivery of propellant to thrusters ensures propulsion system operation at best efficiency by maximizing the propellant and power utilization for the mission. The proposed field programmable gate array (FPGA) based electronic flow regulator (EFR) is used to ensure precise flow of propellant to the thrusters from a high-pressure storage tank used in spacecraft application. This paper presents hardware and software design of electronic flow regulator and implementation of the regulation logic onto an FPGA.Motivation for proposed FPGA-based electronic flow regulation is on the disadvantages of conventional approach of using analog circuits. Digital flow regulation overcomes the analog equivalent as digital circuits are highly flexible, are not much affected due to noise, accurate performance is repeatable, interface is easier to computers, storing facilities are possible and finally failure rate of digital circuits is less. FPGA has certain advantages over ASIC and microprocessor/micro-controller that motivated us to opt for FPGA-based electronic flow regulator. Also the control algorithm being software, it is well modifiable without changing the hardware. This scheme is simple enough to adopt for a wide range of applications, where the flow is to be regulated for efficient operation.The proposed scheme is based on a space-qualified re-configurable field programmable gate arrays (FPGA) and hybrid micro circuit (HMC). A graphical user interface (GUI) based application software is also developed for debugging, monitoring and controlling the electronic flow regulator from PC COM port.
Motion camera based on a custom vision sensor and an FPGA architecture

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel

1998-09-01

A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Supercomputer implementation of finite element algorithms for high speed compressible flows

NASA Technical Reports Server (NTRS)

Thornton, E. A.; Ramakrishnan, R.

1986-01-01

Prediction of compressible flow phenomena using the finite element method is of recent origin and considerable interest. Two shock capturing finite element formulations for high speed compressible flows are described. A Taylor-Galerkin formulation uses a Taylor series expansion in time coupled with a Galerkin weighted residual statement. The Taylor-Galerkin algorithms use explicit artificial dissipation, and the performance of three dissipation models are compared. A Petrov-Galerkin algorithm has as its basis the concepts of streamline upwinding. Vectorization strategies are developed to implement the finite element formulations on the NASA Langley VPS-32. The vectorization scheme results in finite element programs that use vectors of length of the order of the number of nodes or elements. The use of the vectorization procedure speeds up processing rates by over two orders of magnitude. The Taylor-Galerkin and Petrov-Galerkin algorithms are evaluated for 2D inviscid flows on criteria such as solution accuracy, shock resolution, computational speed and storage requirements. The convergence rates for both algorithms are enhanced by local time-stepping schemes. Extension of the vectorization procedure for predicting 2D viscous and 3D inviscid flows are demonstrated. Conclusions are drawn regarding the applicability of the finite element procedures for realistic problems that require hundreds of thousands of nodes.
FPGA Sequencer for Radar Altimeter Applications

NASA Technical Reports Server (NTRS)

Berkun, Andrew C.; Pollard, Brian D.; Chen, Curtis W.

2011-01-01

A sequencer for a radar altimeter provides accurate attitude information for a reliable soft landing of the Mars Science Laboratory (MSL). This is a field-programmable- gate-array (FPGA)-only implementation. A table loaded externally into the FPGA controls timing, processing, and decision structures. Radar is memory-less and does not use previous acquisitions to assist in the current acquisition. All cycles complete in exactly 50 milliseconds, regardless of range or whether a target was found. A RAM (random access memory) within the FPGA holds instructions for up to 15 sets. For each set, timing is run, echoes are processed, and a comparison is made. If a target is seen, more detailed processing is run on that set. If no target is seen, the next set is tried. When all sets have been run, the FPGA terminates and waits for the next 50-millisecond event. This setup simplifies testing and improves reliability. A single vertex chip does the work of an entire assembly. Output products require minor processing to become range and velocity. This technology is the heart of the Terminal Descent Sensor, which is an integral part of the Entry Decent and Landing system for MSL. In addition, it is a strong candidate for manned landings on Mars or the Moon.
Embedded System Implementation on FPGA System With μCLinux OS

NASA Astrophysics Data System (ADS)

Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna

2011-02-01

Embedded systems are taking on more complicated tasks as the processors involved become more powerful. The embedded systems have been widely used in many areas such as in industries, automotives, medical imaging, communications, speech recognition and computer vision. The complexity requirements in hardware and software nowadays need a flexibility system for further enhancement in any design without adding new hardware. Therefore, any changes in the design system will affect the processor that need to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using Field Programmable Gate Array (FPGA). A softcore processor, NIOS II 32-bit RISC, which is the microprocessor core was utilized in FPGA system together with the embedded operating system(OS), μClinux. In this paper, an example of web server is explained and demonstrated
Design and implementation of projects with Xilinx Zynq FPGA: a practical case

NASA Astrophysics Data System (ADS)

Travaglini, R.; D'Antone, I.; Meneghini, S.; Rignanese, L.; Zuffa, M.

The main advantage when using FPGAs with embedded processors is the availability of additional several high-performance resources in the same physical device. Moreover, the FPGA programmability allows for connect custom peripherals. Xilinx have designed a programmable device named Zynq-7000 (simply called Zynq in the following), which integrates programmable logic (identical to the other Xilinx "serie 7" devices) with a System on Chip (SOC) based on two embedded ARM processors. Since both parts are deeply connected, the designers benefit from performance of hardware SOC and flexibility of programmability as well. In this paper a design developed by the Electronic Design Department at the Bologna Division of INFN will be presented as a practical case of project based on Zynq device. It is developed by using a commercial board called ZedBoard hosting a FMC mezzanine with a 12-bit 500 MS/s ADC. The Zynq FPGA on the ZedBoard receives digital outputs from the ADC and send them to the acquisition PC, after proper formatting, through a Gigabit Ethernet link. The major focus of the paper will be about the methodology to develop a Zynq-based design with the Xilinx Vivado software, enlightening how to configure the SOC and connect it with the programmable logic. Firmware design techniques will be presented: in particular both VHDL and IP core based strategies will be discussed. Further, the procedure to develop software for the embedded processor will be presented. Finally, some debugging tools, like the embedded Logic Analyzer, will be shown. Advantages and disadvantages with respect to adopting FPGA without embedded processors will be discussed.
Extending the BEAGLE library to a multi-FPGA platform.

PubMed

Jin, Zheming; Bakos, Jason D

2013-01-19

Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory
Facial emotion recognition system for autistic children: a feasible study based on FPGA implementation.

PubMed

Smitha, K G; Vinod, A P

2015-11-01

Children with autism spectrum disorder have difficulty in understanding the emotional and mental states from the facial expressions of the people they interact. The inability to understand other people's emotions will hinder their interpersonal communication. Though many facial emotion recognition algorithms have been proposed in the literature, they are mainly intended for processing by a personal computer, which limits their usability in on-the-move applications where portability is desired. The portability of the system will ensure ease of use and real-time emotion recognition and that will aid for immediate feedback while communicating with caretakers. Principal component analysis (PCA) has been identified as the least complex feature extraction algorithm to be implemented in hardware. In this paper, we present a detailed study of the implementation of serial and parallel implementation of PCA in order to identify the most feasible method for realization of a portable emotion detector for autistic children. The proposed emotion recognizer architectures are implemented on Virtex 7 XC7VX330T FFG1761-3 FPGA. We achieved 82.3% detection accuracy for a word length of 8 bits.
High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting

PubMed Central

Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel

2008-01-01

Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
The implementation of contour-based object orientation estimation algorithm in FPGA-based on-board vision system

NASA Astrophysics Data System (ADS)

Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery

2016-10-01

This paper describes the implementation of the orientation estimation algorithm in FPGA-based vision system. An approach to estimate an orientation of objects lacking axial symmetry is proposed. Suggested algorithm is intended to estimate orientation of a specific known 3D object based on object 3D model. The proposed orientation estimation algorithm consists of two stages: learning and estimation. Learning stage is devoted to the exploring of studied object. Using 3D model we can gather set of training images by capturing 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage is focusing on matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
Rad-Hard/HI-REL FPGA

NASA Technical Reports Server (NTRS)

Wang, Jih-Jong; Cronquist, Brian E.; McGowan, John E.; Katz, Richard B.

1997-01-01

The goals for a radiation hardened (RAD-HARD) and high reliability (HI-REL) field programmable gate array (FPGA) are described. The first qualified manufacturer list (QML) radiation hardened RH1280 and RH1020 were developed. The total radiation dose and single event effects observed on the antifuse FPGA RH1280 are reported on. Tradeoffs and the limitations in the single event upset hardening are discussed.
Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

PubMed Central

Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M

2008-01-01

Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation. PMID:18412963
Multi-DSP and FPGA based Multi-channel Direct IF/RF Digital receiver for atmospheric radar

NASA Astrophysics Data System (ADS)

Yasodha, Polisetti; Jayaraman, Achuthan; Kamaraj, Pandian; Durga rao, Meka; Thriveni, A.

2016-07-01

Modern phased array radars depend highly on digital signal processing (DSP) to extract the echo signal information and to accomplish reliability along with programmability and flexibility. The advent of ASIC technology has made various digital signal processing steps to be realized in one DSP chip, which can be programmed as per the application and can handle high data rates, to be used in the radar receiver to process the received signal. Further, recent days field programmable gate array (FPGA) chips, which can be re-programmed, also present an opportunity to utilize them to process the radar signal. A multi-channel direct IF/RF digital receiver (MCDRx) is developed at NARL, taking the advantage of high speed ADCs and high performance DSP chips/FPGAs, to be used for atmospheric radars working in HF/VHF bands. Multiple channels facilitate the radar t be operated in multi-receiver modes and also to obtain the wind vector with improved time resolution, without switching the antenna beam. MCDRx has six channels, implemented on a custom built digital board, which is realized using six numbers of ADCs for simultaneous processing of the six input signals, Xilinx vertex5 FPGA and Spartan6 FPGA, and two ADSPTS201 DSP chips, each of which performs one phase of processing. MCDRx unit interfaces with the data storage/display computer via two gigabit ethernet (GbE) links. One of the six channels is used for Doppler beam swinging (DBS) mode and the other five channels are used for multi-receiver mode operations, dedicatedly. Each channel has (i) ADC block, to digitize RF/IF signal, (ii) DDC block for digital down conversion of the digitized signal, (iii) decoding block to decode the phase coded signal, and (iv) coherent integration block for integrating the data preserving phase intact. ADC block consists of Analog devices make AD9467 16-bit ADCs, to digitize the input signal at 80 MSPS. The output of ADC is centered around (80 MHz - input frequency). The digitized data is fed
LinoSPAD: a time-resolved 256×1 CMOS SPAD line sensor system featuring 64 FPGA-based TDC channels running at up to 8.5 giga-events per second

NASA Astrophysics Data System (ADS)

Burri, Samuel; Homulle, Harald; Bruschini, Claudio; Charbon, Edoardo

2016-04-01

LinoSPAD is a reconfigurable camera sensor with a 256×1 CMOS SPAD (single-photon avalanche diode) pixel array connected to a low cost Xilinx Spartan 6 FPGA. The LinoSPAD sensor's line of pixels has a pitch of 24 μm and 40% fill factor. The FPGA implements an array of 64 TDCs and histogram engines capable of processing up to 8.5 giga-photons per second. The LinoSPAD sensor measures 1.68 mm×6.8 mm and each pixel has a direct digital output to connect to the FPGA. The chip is bonded on a carrier PCB to connect to the FPGA motherboard. 64 carry chain based TDCs sampled at 400 MHz can generate a timestamp every 7.5 ns with a mean time resolution below 25 ps per code. The 64 histogram engines provide time-of-arrival histograms covering up to 50 ns. An alternative mode allows the readout of 28 bit timestamps which have a range of up to 4.5 ms. Since the FPGA TDCs have considerable non-linearity we implemented a correction module capable of increasing histogram linearity at real-time. The TDC array is interfaced to a computer using a super-speed USB3 link to transfer over 150k histograms per second for the 12.5 ns reference period used in our characterization. After characterization and subsequent programming of the post-processing we measure an instrument response histogram shorter than 100 ps FWHM using a strong laser pulse with 50 ps FWHM. A timing resolution that when combined with the high fill factor makes the sensor well suited for a wide variety of applications from fluorescence lifetime microscopy over Raman spectroscopy to 3D time-of-flight.
Design of CMOS imaging system based on FPGA

NASA Astrophysics Data System (ADS)

Hu, Bo; Chen, Xiaolai

2017-10-01

In order to meet the needs of engineering applications for high dynamic range CMOS camera under the rolling shutter mode, a complete imaging system is designed based on the CMOS imaging sensor NSC1105. The paper decides CMOS+ADC+FPGA+Camera Link as processing architecture and introduces the design and implementation of the hardware system. As for camera software system, which consists of CMOS timing drive module, image acquisition module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The ISE 14.6 emulator ISim is used in the simulation of signals. The imaging experimental results show that the system exhibits a 1280*1024 pixel resolution, has a frame frequency of 25 fps and a dynamic range more than 120dB. The imaging quality of the system satisfies the requirement of the index.

A FPGA embedded web server for remote monitoring and control of smart sensors networks.

PubMed

Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique

2013-12-27

This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces, the webserver that acts as an Internet interface and the other to control the network. This protocol is widely used to connecting smart sensors and actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although this protocol can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology.
A FPGA Embedded Web Server for Remote Monitoring and Control of Smart Sensors Networks

PubMed Central

Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique

2014-01-01

This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces, the webserver that acts as an Internet interface and the other to control the network. This protocol is widely used to connecting smart sensors and actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although this protocol can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology. PMID:24379047
A cellular automata based FPGA realization of a new metaheuristic bat-inspired algorithm

NASA Astrophysics Data System (ADS)

Progias, Pavlos; Amanatiadis, Angelos A.; Spataro, William; Trunfio, Giuseppe A.; Sirakoulis, Georgios Ch.

2016-10-01

Optimization algorithms are often inspired by processes occuring in nature, such as animal behavioral patterns. The main concern with implementing such algorithms in software is the large amounts of processing power they require. In contrast to software code, that can only perform calculations in a serial manner, an implementation in hardware, exploiting the inherent parallelism of single-purpose processors, can prove to be much more efficient both in speed and energy consumption. Furthermore, the use of Cellular Automata (CA) in such an implementation would be efficient both as a model for natural processes, as well as a computational paradigm implemented well on hardware. In this paper, we propose a VHDL implementation of a metaheuristic algorithm inspired by the echolocation behavior of bats. More specifically, the CA model is inspired by the metaheuristic algorithm proposed earlier in the literature, which could be considered at least as efficient than other existing optimization algorithms. The function of the FPGA implementation of our algorithm is explained in full detail and results of our simulations are also demonstrated.
An FPGA-Based People Detection System

NASA Astrophysics Data System (ADS)

Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.

2005-12-01

This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about[InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at[InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.
High-speed multiple sequence alignment on a reconfigurable platform.

PubMed

Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

2006-01-01

Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
An FPGA-based reconfigurable DDC algorithm

NASA Astrophysics Data System (ADS)

Juszczyk, B.; Kasprowicz, G.

2016-09-01

This paper describes implementation of reconfigurable digital down converter in an FPGA structure. System is designed to work with quadrature signals. One of the main criteria of the project was to provied wide range of reconfiguration in order to fulfill various application rage. Potential applications include: software defined radio receiver, passive noise radars and measurement data compression. This document contains general system overview, short description of hardware used in the project and gateware implementation.
FPGA based control system for space instrumentation

NASA Astrophysics Data System (ADS)

Di Giorgio, Anna M.; Cerulli Irelli, Pasquale; Nuzzolo, Francesco; Orfei, Renato; Spinoglio, Luigi; Liu, Giovanni S.; Saraceno, Paolo

2008-07-01

The prototype for a general purpose FPGA based control system for space instrumentation is presented, with particular attention to the instrument control application software. The system HW is based on the LEON3FT processor, which gives the flexibility to configure the chip with only the necessary HW functionalities, from simple logic up to small dedicated processors. The instrument control SW is developed in ANSI C and for time critical (<10μs) commanding sequences implements an internal instructions sequencer, triggered via an interrupt service routine based on a HW high priority interrupt.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.

PubMed

Zierke, Stephanie; Bakos, Jason D

2010-04-12

Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
Performance evaluation of multiple (32 channels) sub-nanosecond TDC implemented in low-cost FPGA

NASA Astrophysics Data System (ADS)

Lichard, P.; Konstantinou, G.; Villar Vilanueva, A.; Palladino, V.

2014-03-01

NA62 experiment Straw tracker frontend board serves as a gas-tight detector cover and integrates two CARIOCA chips, a low cost FPGA (Cyclon III, Altera) and a set of 400Mbit/s links to the backend. The FPGA houses 16 pairs of sub-nanosecond resolution TDCs with derandomizers and an output link serializer. Evaluation methods, including simulations, and performance results of the system in the lab and on a detector prototype are presented.
Embedded algorithms within an FPGA-based system to process nonlinear time series data

NASA Astrophysics Data System (ADS)

Jones, Jonathan D.; Pei, Jin-Song; Tull, Monte P.

2008-03-01

This paper presents some preliminary results of an ongoing project. A pattern classification algorithm is being developed and embedded into a Field-Programmable Gate Array (FPGA) and microprocessor-based data processing core in this project. The goal is to enable and optimize the functionality of onboard data processing of nonlinear, nonstationary data for smart wireless sensing in structural health monitoring. Compared with traditional microprocessor-based systems, fast growing FPGA technology offers a more powerful, efficient, and flexible hardware platform including on-site (field-programmable) reconfiguration capability of hardware. An existing nonlinear identification algorithm is used as the baseline in this study. The implementation within a hardware-based system is presented in this paper, detailing the design requirements, validation, tradeoffs, optimization, and challenges in embedding this algorithm. An off-the-shelf high-level abstraction tool along with the Matlab/Simulink environment is utilized to program the FPGA, rather than coding the hardware description language (HDL) manually. The implementation is validated by comparing the simulation results with those from Matlab. In particular, the Hilbert Transform is embedded into the FPGA hardware and applied to the baseline algorithm as the centerpiece in processing nonlinear time histories and extracting instantaneous features of nonstationary dynamic data. The selection of proper numerical methods for the hardware execution of the selected identification algorithm and consideration of the fixed-point representation are elaborated. Other challenges include the issues of the timing in the hardware execution cycle of the design, resource consumption, approximation accuracy, and user flexibility of input data types limited by the simplicity of this preliminary design. Future work includes making an FPGA and microprocessor operate together to embed a further developed algorithm that yields better
A Real-Time System for Lane Detection Based on FPGA and DSP

NASA Astrophysics Data System (ADS)

Xiao, Jing; Li, Shutao; Sun, Bin

2016-12-01

This paper presents a real-time lane detection system including edge detection and improved Hough Transform based lane detection algorithm and its hardware implementation with field programmable gate array (FPGA) and digital signal processor (DSP). Firstly, gradient amplitude and direction information are combined to extract lane edge information. Then, the information is used to determine the region of interest. Finally, the lanes are extracted by using improved Hough Transform. The image processing module of the system consists of FPGA and DSP. Particularly, the algorithms implemented in FPGA are working in pipeline and processing in parallel so that the system can run in real-time. In addition, DSP realizes lane line extraction and display function with an improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
Real-time co-registered ultrasound and photoacoustic imaging system based on FPGA and DSP architecture

NASA Astrophysics Data System (ADS)

Alqasemi, Umar; Li, Hai; Aguirre, Andres; Zhu, Quing

2011-03-01

Co-registering ultrasound (US) and photoacoustic (PA) imaging is a logical extension to conventional ultrasound because both modalities provide complementary information of tumor morphology, tumor vasculature and hypoxia for cancer detection and characterization. In addition, both modalities are capable of providing real-time images for clinical applications. In this paper, a Field Programmable Gate Array (FPGA) and Digital Signal Processor (DSP) module-based real-time US/PA imaging system is presented. The system provides real-time US/PA data acquisition and image display for up to 5 fps* using the currently implemented DSP board. It can be upgraded to 15 fps, which is the maximum pulse repetition rate of the used laser, by implementing an advanced DSP module. Additionally, the photoacoustic RF data for each frame is saved for further off-line processing. The system frontend consists of eight 16-channel modules made of commercial and customized circuits. Each 16-channel module consists of two commercial 8-channel receiving circuitry boards and one FPGA board from Analog Devices. Each receiving board contains an IC† that combines. 8-channel low-noise amplifiers, variable-gain amplifiers, anti-aliasing filters, and ADC's‡ in a single chip with sampling frequency of 40MHz. The FPGA board captures the LVDSξ Double Data Rate (DDR) digital output of the receiving board and performs data conditioning and subbeamforming. A customized 16-channel transmission circuitry is connected to the two receiving boards for US pulseecho (PE) mode data acquisition. A DSP module uses External Memory Interface (EMIF) to interface with the eight 16-channel modules through a customized adaptor board. The DSP transfers either sub-beamformed data (US pulse-echo mode or PAI imaging mode) or raw data from FPGA boards to its DDR-2 memory through the EMIF link, then it performs additional processing, after that, it transfer the data to the PC** for further image processing. The PC code
An Efficient, FPGA-Based, Cluster Detection Algorithm Implementation for a Strip Detector Readout System in a Time Projection Chamber Polarimeter

NASA Technical Reports Server (NTRS)

Gregory, Kyle J.; Hill, Joanne E. (Editor); Black, J. Kevin; Baumgartner, Wayne H.; Jahoda, Keith

2016-01-01

A fundamental challenge in a spaceborne application of a gas-based Time Projection Chamber (TPC) for observation of X-ray polarization is handling the large amount of data collected. The TPC polarimeter described uses the APV-25 Application Specific Integrated Circuit (ASIC) to readout a strip detector. Two dimensional photoelectron track images are created with a time projection technique and used to determine the polarization of the incident X-rays. The detector produces a 128x30 pixel image per photon interaction with each pixel registering 12 bits of collected charge. This creates challenging requirements for data storage and downlink bandwidth with only a modest incidence of photons and can have a significant impact on the overall mission cost. An approach is described for locating and isolating the photoelectron track within the detector image, yielding a much smaller data product, typically between 8x8 pixels and 20x20 pixels. This approach is implemented using a Microsemi RT-ProASIC3-3000 Field-Programmable Gate Array (FPGA), clocked at 20 MHz and utilizing 10.7k logic gates (14% of FPGA), 20 Block RAMs (17% of FPGA), and no external RAM. Results will be presented, demonstrating successful photoelectron track cluster detection with minimal impact to detector dead-time.
The integration of FPGA TDC inside White Rabbit node

NASA Astrophysics Data System (ADS)

Li, H.; Xue, T.; Gong, G.; Li, J.

2017-04-01

White Rabbit technology is capable of delivering sub-nanosecond accuracy and picosecond precision of synchronization and normal data packets over the fiber network. Carry chain structure in FPGA is a popular way to build TDC and tens of picosecond RMS resolution has been achieved. The integration of WR technology with FPGA TDC can enhance and simplify the TDC in many aspects that includes providing a low jitter clock for TDC, a synchronized absolute UTC/TAI timestamp for coarse counter, a fancy way to calibrate the carry chain DNL and an easy to use Ethernet link for data and control information transmit. This paper presents a FPGA TDC implemented inside a normal White Rabbit node with sub-nanosecond measurement precision. The measured standard deviation reaches 50ps between two distributed TDCs. Possible applications of this distributed TDC are also discussed.
SAD5 Stereo Correlation Line-Striping in an FPGA

NASA Technical Reports Server (NTRS)

Villalpando, Carlos Y.; Morfopoulos, Arin C.

2011-01-01

High precision SAD5 stereo computations can be performed in an FPGA (field-programmable gate array) at much higher speeds than possible in a conventional CPU (central processing unit), but this uses large amounts of FPGA resources that scale with image size. Of the two key resources in an FPGA, Slices and BRAM (block RAM), Slices scale linearly in the new algorithm with image size, and BRAM scales quadratically with image size. An approach was developed to trade latency for BRAM by sub-windowing the image vertically into overlapping strips and stitching the outputs together to create a single continuous disparity output. In stereo, the general rule of thumb is that the disparity search range must be 1/10 the image size. In the new algorithm, BRAM usage scales linearly with disparity search range and scales again linearly with line width. So a doubling of image size, say from 640 to 1,280, would in the previous design be an effective 4 of BRAM usage: 2 for line width, 2 again for disparity search range. The minimum strip size is twice the search range, and will produce an output strip width equal to the disparity search range. So assuming a disparity search range of 1/10 image width, 10 sequential runs of the minimum strip size would produce a full output image. This approach allowed the innovators to fit 1280 960 wide SAD5 stereo disparity in less than 80 BRAM, 52k Slices on a Virtex 5LX330T, 25% and 24% of resources, respectively. Using a 100-MHz clock, this build would perform stereo at 39 Hz. Of particular interest to JPL is that there is a flight qualified version of the Virtex 5: this could produce stereo results even for very large image sizes at 3 orders of magnitude faster than could be computed on the PowerPC 750 flight computer. The work covered in the report allows the stereo algorithm to run on much larger images than before, and using much less BRAM. This opens up choices for a smaller flight FPGA (which saves power and space), or for other algorithms
A Re-programmable Platform for Dynamic Burn-in Test of Xilinx Virtexll 3000 FPGA for Military and Aerospace Applications

NASA Technical Reports Server (NTRS)

Roosta, Ramin; Wang, Xinchen; Sadigursky, Michael; Tracton, Phil

2004-01-01

Field Programmable Gate Arrays (FPGA) have played increasingly important roles in military and aerospace applications. Xilinx SRAM-based FPGAs have been extensively used in commercial applications. They have been used less frequently in space flight applications due to their susceptibility to single-event upsets. Reliability of these devices in space applications is a concern that has not been addressed. The objective of this project is to design a fully programmable hardware/software platform that allows (but is not limited to) comprehensive static/dynamic burn-in test of Virtex-II 3000 FPGAs, at speed test and SEU test. Conventional methods test very few discrete AC parameters (primarily switching) of a given integrated circuit. This approach will test any possible configuration of the FPGA and any associated performance parameters. It allows complete or partial re-programming of the FPGA and verification of the program by using read back followed by dynamic test. Designers have full control over which functional elements of the FPGA to stress. They can completely simulate all possible types of configurations/functions. Another benefit of this platform is that it allows collecting information on elevation of the junction temperature as a function of gate utilization, operating frequency and functionality. A software tool has been implemented to demonstrate the various features of the system. The software consists of three major parts: the parallel interface driver, main system procedure and a graphical user interface (GUI).
High-Speed TCP Testing

NASA Technical Reports Server (NTRS)

Brooks, David E.; Gassman, Holly; Beering, Dave R.; Welch, Arun; Hoder, Douglas J.; Ivancic, William D.

1999-01-01

Transmission Control Protocol (TCP) is the underlying protocol used within the Internet for reliable information transfer. As such, there is great interest to have all implementations of TCP efficiently interoperate. This is particularly important for links exhibiting long bandwidth-delay products. The tools exist to perform TCP analysis at low rates and low delays. However, for extremely high-rate and lone-delay links such as 622 Mbps over geosynchronous satellites, new tools and testing techniques are required. This paper describes the tools and techniques used to analyze and debug various TCP implementations over high-speed, long-delay links.
Diagnostic layer integration in FPGA-based pipeline measurement systems for HEP experiments

NASA Astrophysics Data System (ADS)

Pozniak, Krzysztof T.

2007-08-01

Integrated triggering and data acquisition systems for high energy physics experiments may be considered as fast, multichannel, synchronous, distributed, pipeline measurement systems. A considerable extension of functional, technological and monitoring demands, which has recently been imposed on them, forced a common usage of large field-programmable gate array (FPGA), digital signal processing-enhanced matrices and fast optical transmission for their realization. This paper discusses modelling, design, realization and testing of pipeline measurement systems. A distribution of synchronous data stream flows is considered in the network. A general functional structure of a single network node is presented. A suggested, novel block structure of the node model facilitates full implementation in the FPGA chip, circuit standardization and parametrization, as well as integration of functional and diagnostic layers. A general method for pipeline system design was derived. This method is based on a unified model of the synchronous data network node. A few examples of practically realized, FPGA-based, pipeline measurement systems were presented. The described systems were applied in ZEUS and CMS.
Extending the BEAGLE library to a multi-FPGA platform

PubMed Central

2013-01-01

Background Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein’s pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein’s pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. Results The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform’s peak memory bandwidth and the implementation’s memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE’s CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE’s GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. Conclusions The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design
STAR: FPGA-based software defined satellite transponder

NASA Astrophysics Data System (ADS)

Davalle, Daniele; Cassettari, Riccardo; Saponara, Sergio; Fanucci, Luca; Cucchi, Luca; Bigongiari, Franco; Errico, Walter

2013-05-01

This paper presents STAR, a flexible Telemetry, Tracking & Command (TT&C) transponder for Earth Observation (EO) small satellites, developed in collaboration with INTECS and SITAEL companies. With respect to state-of-the-art EO transponders, STAR includes the possibility of scientific data transfer thanks to the 40 Mbps downlink data-rate. This feature represents an important optimization in terms of hardware mass, which is important for EO small satellites. Furthermore, in-flight re-configurability of communication parameters via telecommand is important for in-orbit link optimization, which is especially useful for low orbit satellites where visibility can be as short as few hundreds of seconds. STAR exploits the principles of digital radio to minimize the analog section of the transceiver. 70MHz intermediate frequency (IF) is the interface with an external S/X band radio-frequency front-end. The system is composed of a dedicated configurable high-speed digital signal processing part, the Signal Processor (SP), described in technology-independent VHDL working with a clock frequency of 184.32MHz and a low speed control part, the Control Processor (CP), based on the 32-bit Gaisler LEON3 processor clocked at 32 MHz, with SpaceWire and CAN interfaces. The quantization parameters were fine-tailored to reach a trade-off between hardware complexity and implementation loss which is less than 0.5 dB at BER = 10-5 for the RX chain. The IF ports require 8-bit precision. The system prototype is fitted on the Xilinx Virtex 6 VLX75T-FF484 FPGA of which a space-qualified version has been announced. The total device occupation is 82 %.

Timing generator of scientific grade CCD camera and its implementation based on FPGA technology

NASA Astrophysics Data System (ADS)

Si, Guoliang; Li, Yunfei; Guo, Yongfei

2010-10-01

The Timing Generator's functions of Scientific Grade CCD Camera is briefly presented: it generates various kinds of impulse sequence for the TDI-CCD, video processor and imaging data output, acting as the synchronous coordinator for time in the CCD imaging unit. The IL-E2TDI-CCD sensor produced by DALSA Co.Ltd. use in the Scientific Grade CCD Camera. Driving schedules of IL-E2 TDI-CCD sensor has been examined in detail, the timing generator has been designed for Scientific Grade CCD Camera. FPGA is chosen as the hardware design platform, schedule generator is described with VHDL. The designed generator has been successfully fulfilled function simulation with EDA software and fitted into XC2VP20-FF1152 (a kind of FPGA products made by XILINX). The experiments indicate that the new method improves the integrated level of the system. The Scientific Grade CCD camera system's high reliability, stability and low power supply are achieved. At the same time, the period of design and experiment is sharply shorted.
A FPGA implementation for linearly unmixing a hyperspectral image using OpenCL

NASA Astrophysics Data System (ADS)

Guerra, Raúl; López, Sebastián.; Sarmiento, Roberto

2017-10-01

Hyperspectral imaging systems provide images in which single pixels have information from across the electromagnetic spectrum of the scene under analysis. These systems divide the spectrum into many contiguos channels, which may be even out of the visible part of the spectra. The main advantage of the hyperspectral imaging technology is that certain objects leave unique fingerprints in the electromagnetic spectrum, known as spectral signatures, which allow to distinguish between different materials that may look like the same in a traditional RGB image. Accordingly, the most important hyperspectral imaging applications are related with distinguishing or identifying materials in a particular scene. In hyperspectral imaging applications under real-time constraints, the huge amount of information provided by the hyperspectral sensors has to be rapidly processed and analysed. For such purpose, parallel hardware devices, such as Field Programmable Gate Arrays (FPGAs) are typically used. However, developing hardware applications typically requires expertise in the specific targeted device, as well as in the tools and methodologies which can be used to perform the implementation of the desired algorithms in the specific device. In this scenario, the Open Computing Language (OpenCL) emerges as a very interesting solution in which a single high-level synthesis design language can be used to efficiently develop applications in multiple and different hardware devices. In this work, the Fast Algorithm for Linearly Unmixing Hyperspectral Images (FUN) has been implemented into a Bitware Stratix V Altera FPGA using OpenCL. The obtained results demonstrate the suitability of OpenCL as a viable design methodology for quickly creating efficient FPGAs designs for real-time hyperspectral imaging applications.
High-speed, multi-channel detector readout electronics for fast radiation detectors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hennig, Wolfgang

2012-06-22

In this project, we are developing a high speed digital spectrometer that a) captures detector waveforms at rates up to 500 MSPS b) has upgraded event data acquisition with additional data buffers for zero dead time operation c) moves energy calculations to the FPGA to increase spectrometer throughput in fast scintillator applications d) uses a streamlined architecture and high speed data interface for even faster readout to the host PC These features are in addition to the standard functions in our existing spectrometers such as digitization, programmable trigger and energy filters, pileup inspection, data acquisition with energy and time stamps,more » MCA histograms, and run statistics. In Phase I, we upgraded one of our existing spectrometer designs to demonstrate the key principle of fast waveform capture using a 500 MSPS, 12 bit ADC and a Xilinx Virtex-4 FPGA. This upgraded spectrometer, named P500, performed well in initial tests of energy resolution, pulse shape analysis, and timing measurements, thus achieving item (a) above. In Phase II, we are revising the P500 to build a commercial prototype with the improvements listed in items (b)-(d). As described in the previous report, two devices were built to pursue this goal, named the Pixie-500 and the Pixie-500 Express. The Pixie-500 has only minor improvements from the Phase I prototype and is intended as an early commercial product (its production and part of its development were funded outside the SBIR). It also allows testing of the ADC performance in real applications.The Pixie-500 Express (or Pixie-500e) includes all of the improvements (b)-(d). At the end of Phase II of the project, we have tested and debugged the hardware, firmware and software of the Pixie-500 Express prototype boards delivered 12/3/2010. This proved substantially more complex than anticipated. At the time of writing, all hardware bugs have been fixed, the PCI Express interface is working, the SDRAM has been successfully tested and
An FPGA-based heterogeneous image fusion system design method

NASA Astrophysics Data System (ADS)

Song, Le; Lin, Yu-chi; Chen, Yan-hua; Zhao, Mei-rong

2011-08-01

Taking the advantages of FPGA's low cost and compact structure, an FPGA-based heterogeneous image fusion platform is established in this study. Altera's Cyclone IV series FPGA is adopted as the core processor of the platform, and the visible light CCD camera and infrared thermal imager are used as the image-capturing device in order to obtain dualchannel heterogeneous video images. Tailor-made image fusion algorithms such as gray-scale weighted averaging, maximum selection and minimum selection methods are analyzed and compared. VHDL language and the synchronous design method are utilized to perform a reliable RTL-level description. Altera's Quartus II 9.0 software is applied to simulate and implement the algorithm modules. The contrast experiments of various fusion algorithms show that, preferably image quality of the heterogeneous image fusion can be obtained on top of the proposed system. The applied range of the different fusion algorithms is also discussed.
Note: Design of FPGA based system identification module with application to atomic force microscopy

NASA Astrophysics Data System (ADS)

Ghosal, Sayan; Pradhan, Sourav; Salapaka, Murti

2018-05-01

The science of system identification is widely utilized in modeling input-output relationships of diverse systems. In this article, we report field programmable gate array (FPGA) based implementation of a real-time system identification algorithm which employs forgetting factors and bias compensation techniques. The FPGA module is employed to estimate the mechanical properties of surfaces of materials at the nano-scale with an atomic force microscope (AFM). The FPGA module is user friendly which can be interfaced with commercially available AFMs. Extensive simulation and experimental results validate the design.
FPGA implementation of low complexity LDPC iterative decoder

NASA Astrophysics Data System (ADS)

Verma, Shivani; Sharma, Sanjay

2016-07-01

Low-density parity-check (LDPC) codes, proposed by Gallager, emerged as a class of codes which can yield very good performance on the additive white Gaussian noise channel as well as on the binary symmetric channel. LDPC codes have gained lots of importance due to their capacity achieving property and excellent performance in the noisy channel. Belief propagation (BP) algorithm and its approximations, most notably min-sum, are popular iterative decoding algorithms used for LDPC and turbo codes. The trade-off between the hardware complexity and the decoding throughput is a critical factor in the implementation of the practical decoder. This article presents introduction to LDPC codes and its various decoding algorithms followed by realisation of LDPC decoder by using simplified message passing algorithm and partially parallel decoder architecture. Simplified message passing algorithm has been proposed for trade-off between low decoding complexity and decoder performance. It greatly reduces the routing and check node complexity of the decoder. Partially parallel decoder architecture possesses high speed and reduced complexity. The improved design of the decoder possesses a maximum symbol throughput of 92.95 Mbps and a maximum of 18 decoding iterations. The article presents implementation of 9216 bits, rate-1/2, (3, 6) LDPC decoder on Xilinx XC3D3400A device from Spartan-3A DSP family.
FPGA-Based Smart Sensor for Online Displacement Measurements Using a Heterodyne Interferometer

PubMed Central

Vera-Salas, Luis Alberto; Moreno-Tapia, Sandra Veronica; Garcia-Perez, Arturo; de Jesus Romero-Troncoso, Rene; Osornio-Rios, Roque Alfredo; Serroukh, Ibrahim; Cabal-Yepez, Eduardo

2011-01-01

The measurement of small displacements on the nanometric scale demands metrological systems of high accuracy and precision. In this context, interferometer-based displacement measurements have become the main tools used for traceable dimensional metrology. The different industrial applications in which small displacement measurements are employed requires the use of online measurements, high speed processes, open architecture control systems, as well as good adaptability to specific process conditions. The main contribution of this work is the development of a smart sensor for large displacement measurement based on phase measurement which achieves high accuracy and resolution, designed to be used with a commercial heterodyne interferometer. The system is based on a low-cost Field Programmable Gate Array (FPGA) allowing the integration of several functions in a single portable device. This system is optimal for high speed applications where online measurement is needed and the reconfigurability feature allows the addition of different modules for error compensation, as might be required by a specific application. PMID:22164040
A 4.2 ps Time-Interval RMS Resolution Time-to-Digital Converter Using a Bin Decimation Method in an UltraScale FPGA

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Liu, Chong

2016-10-01

The common solution for a field programmable gate array (FPGA)-based time-to-digital converter (TDC) is constructing a tapped delay line (TDL) for time interpolation to yield a sub-clock time resolution. The granularity and uniformity of the delay elements of TDL determine the TDC time resolution. In this paper, we propose a dual-sampling TDL architecture and a bin decimation method that could make the delay elements as small and uniform as possible, so that the implemented TDCs can achieve a high time resolution beyond the intrinsic cell delay. Two identical full hardware-based TDCs were implemented in a Xilinx UltraScale FPGA for performance evaluation. For fixed time intervals in the range from 0 to 440 ns, the average time-interval RMS resolution is measured by the two TDCs with 4.2 ps, thus the timestamp resolution of single TDC is derived as 2.97 ps. The maximum hit rate of the TDC is as high as half the system clock rate of FPGA, namely 250 MHz in our demo prototype. Because the conventional online bin-by-bin calibration is not needed, the implementation of the proposed TDC is straightforward and relatively resource-saving.
FPGA-based digital signal processing for the next generation radio astronomy instruments: ultra-pure sideband separation and polarization detection

NASA Astrophysics Data System (ADS)

Alvear, Andrés.; Finger, Ricardo; Fuentes, Roberto; Sapunar, Raúl; Geelen, Tom; Curotto, Franco; Rodríguez, Rafael; Monasterio, David; Reyes, Nicolás.; Mena, Patricio; Bronfman, Leonardo

2016-07-01

Field Programmable Gate Arrays (FPGAs) capacity and Analog to Digital Converters (ADCs) speed have largely increased in the last decade. Nowadays we can find one million or more logic blocks (slices) as well as several thousand arithmetic units (ALUs/DSP) available on a single FPGA chip. We can also commercially procure ADC chips reaching 10 GSPS, with 8 bits resolution or more. This unprecedented power of computing hardware has allowed the digitalization of signal processes traditionally performed by analog components. In radio astronomy, the clearest example has been the development of digital sideband separating receivers which, by replacing the IF hybrid and calibrating the system imbalances, have exhibited a sideband rejection above 40dB; this is 20 to 30dB higher than traditional analog sideband separating (2SB) receivers. In Rodriguez et al.,1 and Finger et al.,2 we have demonstrated very high digital sideband separation at 3mm and 1mm wavelengths, using laboratory setups. We here show the first implementation of such technique with a 3mm receiver integrated into a telescope, where the calibration was performed by quasi-optical injection of the test tone in front of the Cassegrain antenna. We also reported progress in digital polarization synthesis, particularly in the implementation of a calibrated Digital Ortho-Mode Transducer (DOMT) based on the Morgan et al. proof of concept.3 They showed off- line synthesis of polarization with isolation higher than 40dB. We plan to implement a digital polarimeter in a real-time FPGA-based (ROACH-2) platform, to show ultra-pure polarization isolation in a non-stop integrating spectrometer.
40-Gbps optical backbone network deep packet inspection based on FPGA

NASA Astrophysics Data System (ADS)

Zuo, Yuan; Huang, Zhiping; Su, Shaojing

2014-11-01

In the era of information, the big data, which contains huge information, brings about some problems, such as high speed transmission, storage and real-time analysis and process. As the important media for data transmission, the Internet is the significant part for big data processing research. With the large-scale usage of the Internet, the data streaming of network is increasing rapidly. The speed level in the main fiber optic communication of the present has reached 40Gbps, even 100Gbps, therefore data on the optical backbone network shows some features of massive data. Generally, data services are provided via IP packets on the optical backbone network, which is constituted with SDH (Synchronous Digital Hierarchy). Hence this method that IP packets are directly mapped into SDH payload is named POS (Packet over SDH) technology. Aiming at the problems of real time process of high speed massive data, this paper designs a process system platform based on ATCA for 40Gbps POS signal data stream recognition and packet content capture, which employs the FPGA as the CPU. This platform offers pre-processing of clustering algorithms, service traffic identification and data mining for the following big data storage and analysis with high efficiency. Also, the operational procedure is proposed in this paper. Four channels of 10Gbps POS signal decomposed by the analysis module, which chooses FPGA as the kernel, are inputted to the flow classification module and the pattern matching component based on TCAM. Based on the properties of the length of payload and net flows, buffer management is added to the platform to keep the key flow information. According to data stream analysis, DPI (deep packet inspection) and flow balance distribute, the signal is transmitted to the backend machine through the giga Ethernet ports on back board. Practice shows that the proposed platform is superior to the traditional applications based on ASIC and NP.
Bio-Inspired Controller on an FPGA Applied to Closed-Loop Diaphragmatic Stimulation

PubMed Central

Zbrzeski, Adeline; Bornat, Yannick; Hillen, Brian; Siu, Ricardo; Abbas, James; Jung, Ranu; Renaud, Sylvie

2016-01-01

Cervical spinal cord injury can disrupt connections between the brain respiratory network and the respiratory muscles which can lead to partial or complete loss of ventilatory control and require ventilatory assistance. Unlike current open-loop technology, a closed-loop diaphragmatic pacing system could overcome the drawbacks of manual titration as well as respond to changing ventilation requirements. We present an original bio-inspired assistive technology for real-time ventilation assistance, implemented in a digital configurable Field Programmable Gate Array (FPGA). The bio-inspired controller, which is a spiking neural network (SNN) inspired by the medullary respiratory network, is as robust as a classic controller while having a flexible, low-power and low-cost hardware design. The system was simulated in MATLAB with FPGA-specific constraints and tested with a computational model of rat breathing; the model reproduced experimentally collected respiratory data in eupneic animals. The open-loop version of the bio-inspired controller was implemented on the FPGA. Electrical test bench characterizations confirmed the system functionality. Open and closed-loop paradigm simulations were simulated to test the FPGA system real-time behavior using the rat computational model. The closed-loop system monitors breathing and changes in respiratory demands to drive diaphragmatic stimulation. The simulated results inform future acute animal experiments and constitute the first step toward the development of a neuromorphic, adaptive, compact, low-power, implantable device. The bio-inspired hardware design optimizes the FPGA resource and time costs while harnessing the computational power of spike-based neuromorphic hardware. Its real-time feature makes it suitable for in vivo applications. PMID:27378844
Real-time windowing in imaging radar using FPGA technique

NASA Astrophysics Data System (ADS)

Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique

2005-02-01

The imaging radar uses the high frequency electromagnetic waves reflected from different objects for estimating of its parameters. Pulse compression is a standard signal processing technique used to minimize the peak transmission power and to maximize SNR, and to get a better resolution. Usually the pulse compression can be achieved using a matched filter. The level of the side-lobes in the imaging radar can be reduced using the special weighting function processing. There are very known different weighting functions: Hamming, Hanning, Blackman, Chebyshev, Blackman-Harris, Kaiser-Bessel, etc., widely used in the signal processing applications. Field Programmable Gate Arrays (FPGAs) offers great benefits like instantaneous implementation, dynamic reconfiguration, design, and field programmability. This reconfiguration makes FPGAs a better solution over custom-made integrated circuits. This work aims at demonstrating a reasonably flexible implementation of FM-linear signal and pulse compression using Matlab, Simulink, and System Generator. Employing FPGA and mentioned software we have proposed the pulse compression design on FPGA using classical and novel windows technique to reduce the side-lobes level. This permits increasing the detection ability of the small or nearly placed targets in imaging radar. The advantage of FPGA that can do parallelism in real time processing permits to realize the proposed algorithms. The paper also presents the experimental results of proposed windowing procedure in the marine radar with such the parameters: signal is linear FM (Chirp); frequency deviation DF is 9.375MHz; the pulse width T is 3.2μs taps number in the matched filter is 800 taps; sampling frequency 253.125*106 MHz. It has been realized the reducing of side-lobes levels in real time permitting better resolution of the small targets.
FPGA-based real-time phase measuring profilometry algorithm design and implementation

NASA Astrophysics Data System (ADS)

Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng

2016-11-01

Phase measuring profilometry (PMP) has been widely used in many fields, like Computer Aided Verification (CAV), Flexible Manufacturing System (FMS) et al. High frame-rate (HFR) real-time vision-based feedback control will be a common demands in near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limit the efficiency of data processing. FPGA has the advantages of pipeline architecture and parallel execution, and it fit for handling PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of hardware architecture includes rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy was analyzed.
Clock and carrier recovery in high-speed coherent optical communication systems

NASA Astrophysics Data System (ADS)

Amado, Sofia B.; Ferreira, Ricardo; Costa, Pedro S.; Guiomar, Fernando P.; Ziaie, Somayeh; Teixeira, António L.; Muga, Nelson J.; Pinto, Armando N.

2014-08-01

In this paper, the implementations of clock and carrier recovery in digital domain are analyzed. Hardware implementation details, resources estimation and real-time results are presented. Analog-to-Digital Converters (ADC), operating at 1.25Gsa/s, and a Virtex-6 Field-Programmable Gate Array (FPGA), have been used, allowing the implementation of a real-time Quadrature Phase Shift Keying (QPSK) system operating at 1.25Gb/s. The real-time mode operation is successfully demonstrated over 80 km of Standard Single Mode Fiber (SSMF).
Real-time orthorectification by FPGA-based hardware acceleration

NASA Astrophysics Data System (ADS)

Kuo, David; Gordon, Don

2010-10-01

Orthorectification that corrects the perspective distortion of remote sensing imagery, providing accurate geolocation and ease of correlation to other images is a valuable first-step in image processing for information extraction. However, the large amount of metadata and the floating-point matrix transformations required to operate on each pixel make this a computation and I/O (Input/Output) intensive process. As result much imagery is either left unprocessed or loses timesensitive value in the long processing cycle. However, the computation on each pixel can be reduced substantially by using computational results of the neighboring pixels and accelerated by special pipelined hardware architecture in one to two orders of magnitude. A specialized coprocessor that is implemented inside an FPGA (Field Programmable Gate Array) chip and surrounded by vendorsupported hardware IP (Intellectual Property) shares the computation workload with CPU through PCI-Express interface. The ultimate speed of one pixel per clock (125 MHz) is achieved by the pipelined systolic array architecture. The optimal partition between software and hardware, the timing profile among image I/O and computation, and the highly automated GUI (Graphical User Interface) that fully exploits this speed increase to maximize overall image production throughput will also be discussed. The software that runs on a workstation with the acceleration hardware orthorectifies 16 Megapixels per second, which is 16 times faster than without the hardware. It turns the production time from months to days. A real-life successful story of an imaging satellite company that adopted such workstations for their orthorectified imagery production will be presented. The potential candidacy of the image processing computation that can be accelerated more efficiently by the same approach will also be analyzed.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms

NASA Astrophysics Data System (ADS)

Du, Hongtao; Qi, Hairong; Szu, Harold H.

2005-03-01

Recent advances in intelligence technology have boosted the development of micro- Unmanned Air Vehicles (UAVs) including Sliver Fox, Shadow, and Scan Eagle for various surveillance and reconnaissance applications. These affordable and reusable devices have to fit a series of size, weight, and power constraints. Cameras used on such micro-UAVs are therefore mounted directly at a fixed angle without any motion-compensated gimbals. This mounting scheme has resulted in the so-called jitter effect in which jitter is defined as sub-pixel or small amplitude vibrations. The jitter blur caused by the jitter effect needs to be corrected before any other processing algorithms can be practically applied. Jitter restoration has been solved by various optimization techniques, including Wiener approximation, maximum a-posteriori probability (MAP), etc. However, these algorithms normally assume a spatial-invariant blur model that is not the case with jitter blur. Szu et al. developed a smart real-time algorithm based on auto-regression (AR) with its natural generalization of unsupervised artificial neural network (ANN) learning to achieve restoration accuracy at the sub-pixel level. This algorithm resembles the capability of the human visual system, in which an agreement between the pair of eyes indicates "signal", otherwise, the jitter noise. Using this non-statistical method, for each single pixel, a deterministic blind sources separation (BSS) process can then be carried out independently based on a deterministic minimum of the Helmholtz free energy with a generalization of Shannon's information theory applied to open dynamic systems. From a hardware implementation point of view, the process of jitter restoration of an image using Szu's algorithm can be optimized by pixel-based parallelization. In our previous work, a parallelly structured independent component analysis (ICA) algorithm has been implemented on both Field Programmable Gate Array (FPGA) and Application
High-Speed Interrogation for Large-Scale Fiber Bragg Grating Sensing

PubMed Central

Hu, Chenyuan; Bai, Wei

2018-01-01

A high-speed interrogation scheme for large-scale fiber Bragg grating (FBG) sensing arrays is presented. This technique employs parallel computing and pipeline control to modulate incident light and demodulate the reflected sensing signal. One Electro-optic modulator (EOM) and one semiconductor optical amplifier (SOA) were used to generate a phase delay to filter reflected spectrum form multiple candidate FBGs with the same optical path difference (OPD). Experimental results showed that the fastest interrogation delay time for the proposed method was only about 27.2 us for a single FBG interrogation, and the system scanning period was only limited by the optical transmission delay in the sensing fiber owing to the multiple simultaneous central wavelength calculations. Furthermore, the proposed FPGA-based technique had a verified FBG wavelength demodulation stability of ±1 pm without average processing. PMID:29495263
High-Speed Interrogation for Large-Scale Fiber Bragg Grating Sensing.

PubMed

Hu, Chenyuan; Bai, Wei

2018-02-24

A high-speed interrogation scheme for large-scale fiber Bragg grating (FBG) sensing arrays is presented. This technique employs parallel computing and pipeline control to modulate incident light and demodulate the reflected sensing signal. One Electro-optic modulator (EOM) and one semiconductor optical amplifier (SOA) were used to generate a phase delay to filter reflected spectrum form multiple candidate FBGs with the same optical path difference (OPD). Experimental results showed that the fastest interrogation delay time for the proposed method was only about 27.2 us for a single FBG interrogation, and the system scanning period was only limited by the optical transmission delay in the sensing fiber owing to the multiple simultaneous central wavelength calculations. Furthermore, the proposed FPGA-based technique had a verified FBG wavelength demodulation stability of ±1 pm without average processing.
Hardware Implementation of Lossless Adaptive Compression of Data From a Hyperspectral Imager

NASA Technical Reports Server (NTRS)

Keymeulen, Didlier; Aranki, Nazeeh I.; Klimesh, Matthew A.; Bakhshi, Alireza

2012-01-01

Efficient onboard data compression can reduce the data volume from hyperspectral imagers on NASA and DoD spacecraft in order to return as much imagery as possible through constrained downlink channels. Lossless compression is important for signature extraction, object recognition, and feature classification capabilities. To provide onboard data compression, a hardware implementation of a lossless hyperspectral compression algorithm was developed using a field programmable gate array (FPGA). The underlying algorithm is the Fast Lossless (FL) compression algorithm reported in Fast Lossless Compression of Multispectral- Image Data (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), p. 26 with the modification reported in Lossless, Multi-Spectral Data Comressor for Improved Compression for Pushbroom-Type Instruments (NPO-45473), NASA Tech Briefs, Vol. 32, No. 7 (July 2008) p. 63, which provides improved compression performance for data from pushbroom-type imagers. An FPGA implementation of the unmodified FL algorithm was previously developed and reported in Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System (NPO-46867), NASA Tech Briefs, Vol. 36, No. 5 (May 2012) p. 42. The essence of the FL algorithm is adaptive linear predictive compression using the sign algorithm for filter adaption. The FL compressor achieves a combination of low complexity and compression effectiveness that exceeds that of stateof- the-art techniques currently in use. The modification changes the predictor structure to tolerate differences in sensitivity of different detector elements, as occurs in pushbroom-type imagers, which are suitable for spacecraft use. The FPGA implementation offers a low-cost, flexible solution compared to traditional ASIC (application specific integrated circuit) and can be integrated as an intellectual property (IP) for part of, e.g., a design that manages the instrument interface. The FPGA implementation was benchmarked on the Xilinx
Frontend electronics for high-precision single photo-electron timing using FPGA-TDCs

NASA Astrophysics Data System (ADS)

Cardinali, M.; Dzyhgadlo, R.; Gerhardt, A.; Götzen, K.; Hohler, R.; Kalicy, G.; Kumawat, H.; Lehmann, D.; Lewandowski, B.; Patsyuk, M.; Peters, K.; Schepers, G.; Schmitt, L.; Schwarz, C.; Schwiening, J.; Traxler, M.; Ugur, C.; Zühlsdorf, M.; Dodokhov, V. Kh.; Britting, A.; Eyrich, W.; Lehmann, A.; Uhlig, F.; Düren, M.; Föhl, K.; Hayrapetyan, A.; Kröck, B.; Merle, O.; Rieke, J.; Cowie, E.; Keri, T.; Montgomery, R.; Rosner, G.; Achenbach, P.; Corell, O.; Ferretti Bondy, M. I.; Hoek, M.; Lauth, W.; Rosner, C.; Sfienti, C.; Thiel, M.; Bühler, P.; Gruber, L.; Marton, J.; Suzuki, K.

2014-12-01

The next generation of high-luminosity experiments requires excellent particle identification detectors which calls for Imaging Cherenkov counters with fast electronics to cope with the expected hit rates. A Barrel DIRC will be used in the central region of the Target Spectrometer of the planned PANDA experiment at FAIR. A single photo-electron timing resolution of better than 100 ps is required by the Barrel DIRC to disentangle the complicated patterns created on the image plane. R&D studies have been performed to provide a design based on the TRB3 readout using FPGA-TDCs with a precision better than 20 ps RMS and custom frontend electronics with high-bandwidth pre-amplifiers and fast discriminators. The discriminators also provide time-over-threshold information thus enabling walk corrections to improve the timing resolution. Two types of frontend electronics cards optimised for reading out 64-channel PHOTONIS Planacon MCP-PMTs were tested: one based on the NINO ASIC and the other, called PADIWA, on FPGA discriminators. Promising results were obtained in a full characterisation using a fast laser setup and in a test experiment at MAMI, Mainz, with a small scale DIRC prototype.

Novel intelligent real-time position tracking system using FPGA and fuzzy logic.

PubMed

Soares dos Santos, Marco P; Ferreira, J A F

2014-03-01

The main aim of this paper is to test if FPGAs are able to achieve better position tracking performance than software-based soft real-time platforms. For comparison purposes, the same controller design was implemented in these architectures. A Multi-state Fuzzy Logic controller (FLC) was implemented both in a Xilinx(®) Virtex-II FPGA (XC2v1000) and in a soft real-time platform NI CompactRIO(®)-9002. The same sampling time was used. The comparative tests were conducted using a servo-pneumatic actuation system. Steady-state errors lower than 4 μm were reached for an arbitrary vertical positioning of a 6.2 kg mass when the controller was embedded into the FPGA platform. Performance gains up to 16 times in the steady-state error, up to 27 times in the overshoot and up to 19.5 times in the settling time were achieved by using the FPGA-based controller over the software-based FLC controller. © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
STRS Compliant FPGA Waveform Development

NASA Technical Reports Server (NTRS)

Nappier, Jennifer; Downey, Joseph

2008-01-01

The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. Current standards were researched and new standard interfaces were proposed. The implementation of the proposed standard interfaces on a laboratory breadboard SDR will be presented.
Design of area array CCD image acquisition and display system based on FPGA

NASA Astrophysics Data System (ADS)

Li, Lei; Zhang, Ning; Li, Tianting; Pan, Yue; Dai, Yuming

2014-09-01

With the development of science and technology, CCD(Charge-coupled Device) has been widely applied in various fields and plays an important role in the modern sensing system, therefore researching a real-time image acquisition and display plan based on CCD device has great significance. This paper introduces an image data acquisition and display system of area array CCD based on FPGA. Several key technical challenges and problems of the system have also been analyzed and followed solutions put forward .The FPGA works as the core processing unit in the system that controls the integral time sequence .The ICX285AL area array CCD image sensor produced by SONY Corporation has been used in the system. The FPGA works to complete the driver of the area array CCD, then analog front end (AFE) processes the signal of the CCD image, including amplification, filtering, noise elimination, CDS correlation double sampling, etc. AD9945 produced by ADI Corporation to convert analog signal to digital signal. Developed Camera Link high-speed data transmission circuit, and completed the PC-end software design of the image acquisition, and realized the real-time display of images. The result through practical testing indicates that the system in the image acquisition and control is stable and reliable, and the indicators meet the actual project requirements.
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA

PubMed Central

Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang

2009-01-01

Background In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers including parallel computers or multi-core computers exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results RNAalifold shows complicated data dependences, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine grain hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981-residue running on a Personal Computer (PC) platform with Pentium 4 2.6 GHz CPU. PMID:19208138
The performance and limitations of FPGA-based digital servos for atomic, molecular, and optical physics experiments

NASA Astrophysics Data System (ADS)

Yu, Shi Jing; Fajeau, Emma; Liu, Lin Qiao; Jones, David J.; Madison, Kirk W.

2018-02-01

In this work, we address the advantages, limitations, and technical subtleties of employing field programmable gate array (FPGA)-based digital servos for high-bandwidth feedback control of lasers in atomic, molecular, and optical physics experiments. Specifically, we provide the results of benchmark performance tests in experimental setups including noise, bandwidth, and dynamic range for two digital servos built with low and mid-range priced FPGA development platforms. The digital servo results are compared to results obtained from a commercially available state-of-the-art analog servo using the same plant for control (intensity stabilization). The digital servos have feedback bandwidths of 2.5 MHz, limited by the total signal latency, and we demonstrate improvements beyond the transfer function offered by the analog servo including a three-pole filter and a two-pole filter with phase compensation to suppress resonances. We also discuss limitations of our FPGA-servo implementation and general considerations when designing and using digital servos.
The performance and limitations of FPGA-based digital servos for atomic, molecular, and optical physics experiments.

PubMed

Yu, Shi Jing; Fajeau, Emma; Liu, Lin Qiao; Jones, David J; Madison, Kirk W

2018-02-01

In this work, we address the advantages, limitations, and technical subtleties of employing field programmable gate array (FPGA)-based digital servos for high-bandwidth feedback control of lasers in atomic, molecular, and optical physics experiments. Specifically, we provide the results of benchmark performance tests in experimental setups including noise, bandwidth, and dynamic range for two digital servos built with low and mid-range priced FPGA development platforms. The digital servo results are compared to results obtained from a commercially available state-of-the-art analog servo using the same plant for control (intensity stabilization). The digital servos have feedback bandwidths of 2.5 MHz, limited by the total signal latency, and we demonstrate improvements beyond the transfer function offered by the analog servo including a three-pole filter and a two-pole filter with phase compensation to suppress resonances. We also discuss limitations of our FPGA-servo implementation and general considerations when designing and using digital servos.
Design exploration and verification platform, based on high-level modeling and FPGA prototyping, for fast and flexible digital communication in physics experiments

NASA Astrophysics Data System (ADS)

Magazzù, G.; Borgese, G.; Costantino, N.; Fanucci, L.; Incandela, J.; Saponara, S.

2013-02-01

In many research fields as high energy physics (HEP), astrophysics, nuclear medicine or space engineering with harsh operating conditions, the use of fast and flexible digital communication protocols is becoming more and more important. The possibility to have a smart and tested top-down design flow for the design of a new protocol for control/readout of front-end electronics is very useful. To this aim, and to reduce development time, costs and risks, this paper describes an innovative design/verification flow applied as example case study to a new communication protocol called FF-LYNX. After the description of the main FF-LYNX features, the paper presents: the definition of a parametric SystemC-based Integrated Simulation Environment (ISE) for high-level protocol definition and validation; the set up of figure of merits to drive the design space exploration; the use of ISE for early analysis of the achievable performances when adopting the new communication protocol and its interfaces for a new (or upgraded) physics experiment; the design of VHDL IP cores for the TX and RX protocol interfaces; their implementation on a FPGA-based emulator for functional verification and finally the modification of the FPGA-based emulator for testing the ASIC chipset which implements the rad-tolerant protocol interfaces. For every step, significant results will be shown to underline the usefulness of this design and verification approach that can be applied to any new digital protocol development for smart detectors in physics experiments.
Design and implementation of an optical Gaussian noise generator

NASA Astrophysics Data System (ADS)

Za~O, Leonardo; Loss, Gustavo; Coelho, Rosângela

2009-08-01

A design of a fast and accurate optical Gaussian noise generator is proposed and demonstrated. The noise sample generation is based on the Box-Muller algorithm. The functions implementation was performed on a high-speed Altera Stratix EP1S25 field-programmable gate array (FPGA) development kit. It enabled the generation of 150 million 16-bit noise samples per second. The Gaussian noise generator required only 7.4% of the FPGA logic elements, 1.2% of the RAM memory, 0.04% of the ROM memory, and a laser source. The optical pulses were generated by a laser source externally modulated by the data bit samples using the frequency-shift keying technique. The accuracy of the noise samples was evaluated for different sequences size and confidence intervals. The noise sample pattern was validated by the Bhattacharyya distance (Bd) and the autocorrelation function. The results showed that the proposed design of the optical Gaussian noise generator is very promising to evaluate the performance of optical communications channels with very low bit-error-rate values.
ARINC 818 adds capabilities for high-speed sensors and systems

NASA Astrophysics Data System (ADS)

Keller, Tim; Grunwald, Paul

2014-06-01

ARINC 818, titled Avionics Digital Video Bus (ADVB), is the standard for cockpit video that has gained wide acceptance in both the commercial and military cockpits including the Boeing 787, the A350XWB, the A400M, the KC- 46A and many others. Initially conceived of for cockpit displays, ARINC 818 is now propagating into high-speed sensors, such as infrared and optical cameras due to its high-bandwidth and high reliability. The ARINC 818 specification that was initially release in the 2006 and has recently undergone a major update that will enhance its applicability as a high speed sensor interface. The ARINC 818-2 specification was published in December 2013. The revisions to the specification include: video switching, stereo and 3-D provisions, color sequential implementations, regions of interest, data-only transmissions, multi-channel implementations, bi-directional communication, higher link rates to 32Gbps, synchronization signals, options for high-speed coax interfaces and optical interface details. The additions to the specification are especially appealing for high-bandwidth, multi sensor systems that have issues with throughput bottlenecks and SWaP concerns. ARINC 818 is implemented on either copper or fiber optic high speed physical layers, and allows for time multiplexing multiple sensors onto a single link. This paper discusses each of the new capabilities in the ARINC 818-2 specification and the benefits for ISR and countermeasures implementations, several examples are provided.
Remote monitoring and fault recovery for FPGA-based field controllers of telescope and instruments

NASA Astrophysics Data System (ADS)

Zhu, Yuhua; Zhu, Dan; Wang, Jianing

2012-09-01

As the increasing size and more and more functions, modern telescopes have widely used the control architecture, i.e. central control unit plus field controller. FPGA-based field controller has the advantages of field programmable, which provide a great convenience for modifying software and hardware of control system. It also gives a good platform for implementation of the new control scheme. Because of multi-controlled nodes and poor working environment in scattered locations, reliability and stability of the field controller should be fully concerned. This paper mainly describes how we use the FPGA-based field controller and Ethernet remote to construct monitoring system with multi-nodes. When failure appearing, the new FPGA chip does self-recovery first in accordance with prerecovery strategies. In case of accident, remote reconstruction for the field controller can be done through network intervention if the chip is not being restored. This paper also introduces the network remote reconstruction solutions of controller, the system structure and transport protocol as well as the implementation methods. The idea of hardware and software design is given based on the FPGA. After actual operation on the large telescopes, desired results have been achieved. The improvement increases system reliability and reduces workload of maintenance, showing good application and popularization.
An FPGA Architecture for Extracting Real-Time Zernike Coefficients from Measured Phase Gradients

NASA Astrophysics Data System (ADS)

Moser, Steven; Lee, Peter; Podoleanu, Adrian

2015-04-01

Zernike modes are commonly used in adaptive optics systems to represent optical wavefronts. However, real-time calculation of Zernike modes is time consuming due to two factors: the large factorial components in the radial polynomials used to define them and the large inverse matrix calculation needed for the linear fit. This paper presents an efficient parallel method for calculating Zernike coefficients from phase gradients produced by a Shack-Hartman sensor and its real-time implementation using an FPGA by pre-calculation and storage of subsections of the large inverse matrix. The architecture exploits symmetries within the Zernike modes to achieve a significant reduction in memory requirements and a speed-up of 2.9 when compared to published results utilising a 2D-FFT method for a grid size of 8×8. Analysis of processor element internal word length requirements show that 24-bit precision in precalculated values of the Zernike mode partial derivatives ensures less than 0.5% error per Zernike coefficient and an overall error of <1%. The design has been synthesized on a Xilinx Spartan-6 XC6SLX45 FPGA. The resource utilisation on this device is <3% of slice registers, <15% of slice LUTs, and approximately 48% of available DSP blocks independent of the Shack-Hartmann grid size. Block RAM usage is <16% for Shack-Hartmann grid sizes up to 32×32.
V&V Plan for FPGA-based ESF-CCS Using System Engineering Approach.

NASA Astrophysics Data System (ADS)

Maerani, Restu; Mayaka, Joyce; El Akrat, Mohamed; Cheon, Jung Jae

2018-02-01

Instrumentation and Control (I&C) systems play an important role in maintaining the safety of Nuclear Power Plant (NPP) operation. However, most current I&C safety systems are based on Programmable Logic Controller (PLC) hardware, which is difficult to verify and validate, and is susceptible to software common cause failure. Therefore, a plan for the replacement of the PLC-based safety systems, such as the Engineered Safety Feature - Component Control System (ESF-CCS), with Field Programmable Gate Arrays (FPGA) is needed. By using a systems engineering approach, which ensures traceability in every phase of the life cycle, from system requirements, design implementation to verification and validation, the system development is guaranteed to be in line with the regulatory requirements. The Verification process will ensure that the customer and stakeholder’s needs are satisfied in a high quality, trustworthy, cost efficient and schedule compliant manner throughout a system’s entire life cycle. The benefit of the V&V plan is to ensure that the FPGA based ESF-CCS is correctly built, and to ensure that the measurement of performance indicators has positive feedback that “do we do the right thing” during the re-engineering process of the FPGA based ESF-CCS.
Implementation of an RBF neural network on embedded systems: real-time face tracking and identity verification.

PubMed

Yang, Fan; Paindavoine, M

2003-01-01

This paper describes a real time vision system that allows us to localize faces in video sequences and verify their identity. These processes are image processing techniques based on the radial basis function (RBF) neural network approach. The robustness of this system has been evaluated quantitatively on eight video sequences. We have adapted our model for an application of face recognition using the Olivetti Research Laboratory (ORL), Cambridge, UK, database so as to compare the performance against other systems. We also describe three hardware implementations of our model on embedded systems based on the field programmable gate array (FPGA), zero instruction set computer (ZISC) chips, and digital signal processor (DSP) TMS320C62, respectively. We analyze the algorithm complexity and present results of hardware implementations in terms of the resources used and processing speed. The success rates of face tracking and identity verification are 92% (FPGA), 85% (ZISC), and 98.2% (DSP), respectively. For the three embedded systems, the processing speeds for images size of 288 /spl times/ 352 are 14 images/s, 25 images/s, and 4.8 images/s, respectively.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Ormsby, John (Technical Monitor)

2002-01-01

Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing (DSP) functions. Such capability also makes and FPGA a suitable platform for the digital implementation of closed loop controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance in a compact form-factor. Other researchers have presented the notion that a second order digital filter with proportional-integral-derivative (PID) control functionality can be implemented in an FPGA. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSF) devices. Our goal is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. Meeting our goals requires alternative compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching these goals.
FPGA-Based Optical Cavity Phase Stabilization for Coherent Pulse Stacking

DOE PAGES

Xu, Yilun; Wilcox, Russell; Byrd, John; ...

2017-11-20

Coherent pulse stacking (CPS) is a new time-domain coherent addition technique that stacks several optical pulses into a single output pulse, enabling high pulse energy from fiber lasers. We develop a robust, scalable, and distributed digital control system with firmware and software integration for algorithms, to support the CPS application. We model CPS as a digital filter in the Z domain and implement a pulse-pattern-based cavity phase detection algorithm on an field-programmable gate array (FPGA). A two-stage (2+1 cavities) 15-pulse stacking system achieves an 11.0 peak-power enhancement factor. Each optical cavity is fed back at 1.5kHz, and stabilized at anmore » individually-prescribed round-trip phase with 0.7deg and 2.1deg rms phase errors for Stages 1 and 2, respectively. Optical cavity phase control with nanometer accuracy ensures 1.2% intensity stability of the stacked pulse over 12 h. The FPGA-based feedback control system can be scaled to large numbers of optical cavities.« less
FPGA-Based Optical Cavity Phase Stabilization for Coherent Pulse Stacking

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, Yilun; Wilcox, Russell; Byrd, John

Coherent pulse stacking (CPS) is a new time-domain coherent addition technique that stacks several optical pulses into a single output pulse, enabling high pulse energy from fiber lasers. We develop a robust, scalable, and distributed digital control system with firmware and software integration for algorithms, to support the CPS application. We model CPS as a digital filter in the Z domain and implement a pulse-pattern-based cavity phase detection algorithm on an field-programmable gate array (FPGA). A two-stage (2+1 cavities) 15-pulse stacking system achieves an 11.0 peak-power enhancement factor. Each optical cavity is fed back at 1.5kHz, and stabilized at anmore » individually-prescribed round-trip phase with 0.7deg and 2.1deg rms phase errors for Stages 1 and 2, respectively. Optical cavity phase control with nanometer accuracy ensures 1.2% intensity stability of the stacked pulse over 12 h. The FPGA-based feedback control system can be scaled to large numbers of optical cavities.« less
Heavy-Ion Microbeam Fault Injection into SRAM-Based FPGA Implementations of Cryptographic Circuits

NASA Astrophysics Data System (ADS)

Li, Huiyun; Du, Guanghua; Shao, Cuiping; Dai, Liang; Xu, Guoqing; Guo, Jinlong

2015-06-01

Transistors hit by heavy ions may conduct transiently, thereby introducing transient logic errors. Attackers can exploit these abnormal behaviors and extract sensitive information from the electronic devices. This paper demonstrates an ion irradiation fault injection attack experiment into a cryptographic field-programmable gate-array (FPGA) circuit. The experiment proved that the commercial FPGA chip is vulnerable to low-linear energy transfer carbon irradiation, and the attack can cause the leakage of secret key bits. A statistical model is established to estimate the possibility of an effective fault injection attack on cryptographic integrated circuits. The model incorporates the effects from temporal, spatial, and logical probability of an effective attack on the cryptographic circuits. The rate of successful attack calculated from the model conforms well to the experimental results. This quantitative success rate model can help evaluate security risk for designers as well as for the third-party assessment organizations.
A digital frequency stabilization system of external cavity diode laser based on LabVIEW FPGA

NASA Astrophysics Data System (ADS)

Liu, Zhuohuan; Hu, Zhaohui; Qi, Lu; Wang, Tao

2015-10-01

Frequency stabilization for external cavity diode laser has played an important role in physics research. Many laser frequency locking solutions have been proposed by researchers. Traditionally, the locking process was accomplished by analog system, which has fast feedback control response speed. However, analog system is susceptible to the effects of environment. In order to improve the automation level and reliability of the frequency stabilization system, we take a grating-feedback external cavity diode laser as the laser source and set up a digital frequency stabilization system based on National Instrument's FPGA (NI FPGA). The system consists of a saturated absorption frequency stabilization of beam path, a differential photoelectric detector, a NI FPGA board and a host computer. Many functions, such as piezoelectric transducer (PZT) sweeping, atomic saturation absorption signal acquisition, signal peak identification, error signal obtaining and laser PZT voltage feedback controlling, are totally completed by LabVIEW FPGA program. Compared with the analog system, the system built by the logic gate circuits, performs stable and reliable. User interface programmed by LabVIEW is friendly. Besides, benefited from the characteristics of reconfiguration, the LabVIEW program is good at transplanting in other NI FPGA boards. Most of all, the system periodically checks the error signal. Once the abnormal error signal is detected, FPGA will restart frequency stabilization process without manual control. Through detecting the fluctuation of error signal of the atomic saturation absorption spectrum line in the frequency locking state, we can infer that the laser frequency stability can reach 1MHz.
Low-cost and high-speed optical mark reader based on an intelligent line camera

NASA Astrophysics Data System (ADS)

Hussmann, Stephan; Chan, Leona; Fung, Celine; Albrecht, Martin

2003-08-01

Optical Mark Recognition (OMR) is thoroughly reliable and highly efficient provided that high standards are maintained at both the planning and implementation stages. It is necessary to ensure that OMR forms are designed with due attention to data integrity checks, the best use is made of features built into the OMR, used data integrity is checked before the data is processed and data is validated before it is processed. This paper describes the design and implementation of an OMR prototype system for marking multiple-choice tests automatically. Parameter testing is carried out before the platform and the multiple-choice answer sheet has been designed. Position recognition and position verification methods have been developed and implemented in an intelligent line scan camera. The position recognition process is implemented into a Field Programmable Gate Array (FPGA), whereas the verification process is implemented into a micro-controller. The verified results are then sent to the Graphical User Interface (GUI) for answers checking and statistical analysis. At the end of the paper the proposed OMR system will be compared with commercially available system on the market.
FPGA platform for MEMS Disc Resonance Gyroscope (DRG) control

NASA Astrophysics Data System (ADS)

Keymeulen, Didier; Peay, Chris; Foor, David; Trung, Tran; Bakhshi, Alireza; Withington, Phil; Yee, Karl; Terrile, Rich

2008-04-01

Inertial navigation systems based upon optical gyroscopes tend to be expensive, large, power consumptive, and are not long lived. Micro-Electromechanical Systems (MEMS) based gyros do not have these shortcomings; however, until recently, the performance of MEMS based gyros had been below navigation grade. Boeing and JPL have been cooperating since 1997 to develop high performance MEMS gyroscopes for miniature, low power space Inertial Reference Unit applications. The efforts resulted in demonstration of a Post Resonator Gyroscope (PRG). This experience led to the more compact Disc Resonator Gyroscope (DRG) for further reduced size and power with potentially increased performance. Currently, the mass, volume and power of the DRG are dominated by the size of the electronics. This paper will detail the FPGA based digital electronics architecture and its implementation for the DRG which will allow reduction of size and power and will increase performance through a reduction in electronics noise. Using the digital control based on FPGA, we can program and modify in real-time the control loop to adapt to the specificity of each particular gyro and the change of the mechanical characteristic of the gyro during its life time.

A High-Speed Design of Montgomery Multiplier

NASA Astrophysics Data System (ADS)

Fan, Yibo; Ikenaga, Takeshi; Goto, Satoshi

With the increase of key length used in public cryptographic algorithms such as RSA and ECC, the speed of Montgomery multiplication becomes a bottleneck. This paper proposes a high speed design of Montgomery multiplier. Firstly, a modified scalable high-radix Montgomery algorithm is proposed to reduce critical path. Secondly, a high-radix clock-saving dataflow is proposed to support high-radix operation and one clock cycle delay in dataflow. Finally, a hardware-reused architecture is proposed to reduce the hardware cost and a parallel radix-16 design of data path is proposed to accelerate the speed. By using HHNEC 0.25μm standard cell library, the implementation results show that the total cost of Montgomery multiplier is 130 KGates, the clock frequency is 180MHz and the throughput of 1024-bit RSA encryption is 352kbps. This design is suitable to be used in high speed RSA or ECC encryption/decryption. As a scalable design, it supports any key-length encryption/decryption up to the size of on-chip memory.
Temporal high-pass non-uniformity correction algorithm based on grayscale mapping and hardware implementation

NASA Astrophysics Data System (ADS)

Jin, Minglei; Jin, Weiqi; Li, Yiyang; Li, Shuo

2015-08-01

In this paper, we propose a novel scene-based non-uniformity correction algorithm for infrared image processing-temporal high-pass non-uniformity correction algorithm based on grayscale mapping (THP and GM). The main sources of non-uniformity are: (1) detector fabrication inaccuracies; (2) non-linearity and variations in the read-out electronics and (3) optical path effects. The non-uniformity will be reduced by non-uniformity correction (NUC) algorithms. The NUC algorithms are often divided into calibration-based non-uniformity correction (CBNUC) algorithms and scene-based non-uniformity correction (SBNUC) algorithms. As non-uniformity drifts temporally, CBNUC algorithms must be repeated by inserting a uniform radiation source which SBNUC algorithms do not need into the view, so the SBNUC algorithm becomes an essential part of infrared imaging system. The SBNUC algorithms' poor robustness often leads two defects: artifacts and over-correction, meanwhile due to complicated calculation process and large storage consumption, hardware implementation of the SBNUC algorithms is difficult, especially in Field Programmable Gate Array (FPGA) platform. The THP and GM algorithm proposed in this paper can eliminate the non-uniformity without causing defects. The hardware implementation of the algorithm only based on FPGA has two advantages: (1) low resources consumption, and (2) small hardware delay: less than 20 lines, it can be transplanted to a variety of infrared detectors equipped with FPGA image processing module, it can reduce the stripe non-uniformity and the ripple non-uniformity.
FPGA-based protein sequence alignment : A review

NASA Astrophysics Data System (ADS)

Isa, Mohd. Nazrin Md.; Muhsen, Ku Noor Dhaniah Ku; Saiful Nurdin, Dayana; Ahmad, Muhammad Imran; Anuar Zainol Murad, Sohiful; Nizam Mohyar, Shaiful; Harun, Azizi; Hussin, Razaidi

2017-11-01

Sequence alignment have been optimized using several techniques in order to accelerate the computation time to obtain the optimal score by implementing DP-based algorithm into hardware such as FPGA-based platform. During hardware implementation, there will be performance challenges such as the frequent memory access and highly data dependent in computation process. Therefore, investigation in processing element (PE) configuration where involves more on memory access in load or access the data (substitution matrix, query sequence character) and the PE configuration time will be the main focus in this paper. There are various approaches to enhance the PE configuration performance that have been done in previous works such as by using serial configuration chain and parallel configuration chain i.e. the configuration data will be loaded into each PEs sequentially and simultaneously respectively. Some researchers have proven that the performance using parallel configuration chain has optimized both the configuration time and area.
Dilemma zone protection on high-speed arterials.

DOT National Transportation Integrated Search

2014-12-01

Driver behavior within the dilemma zone can be a major safety concern at high-speed signalized intersections, especially : for heavy trucks. The Nebraska Department of Roads (NDOR) has developed and implemented an Actuated Advance : Warning (AAW) dil...
Field testing and implementation of dilemma zone protection and signal coordination at closely-spaced high-speed intersections : final report, May 19, 2005.

DOT National Transportation Integrated Search

2005-05-01

The report presents the details of a study carried out to test and implement a dilemma zone protection technique at three high-speed closely-spaced intersections on Roosevelt Blvd in Middletown, Ohio.
Implementation en VHDl/FPGA d'afficheur video numerique (AVN) pour des applications aerospatiales

NASA Astrophysics Data System (ADS)

Pelletier, Sebastien

L'objectif de ce projet est de developper un controleur video en langage VHDL afin de remplacer la composante specialisee presentement utilisee chez CMC Electronique. Une recherche approfondie des tendances et de ce qui se fait actuellement dans le domaine des controleurs video est effectuee afin de definir les specifications du systeme. Les techniques d'entreposage et d'affichage des images sont expliquees afin de mener ce projet a terme. Le nouveau controleur est developpe sur une plateforme electronique possedant un FPGA, un port VGA et de la memoire pour emmagasiner les donnees. Il est programmable et prend peu d'espace dans un FPGA, ce qui lui permet de s'inserer dans n'importe quelle nouvelle technologie de masse a faible cout. Il s'adapte rapidement a toutes les resolutions d'affichage puisqu'il est modulaire et configurable. A court terme, ce projet permettra un controle ameliore des specifications et des normes de qualite liees aux contraintes de l'avionique.
FPGA based charge acquisition algorithm for soft x-ray diagnostics system

NASA Astrophysics Data System (ADS)

Wojenski, A.; Kasprowicz, G.; Pozniak, K. T.; Zabolotny, W.; Byszuk, A.; Juszczyk, B.; Kolasinski, P.; Krawczyk, R. D.; Zienkiewicz, P.; Chernyshova, M.; Czarski, T.

2015-09-01

Soft X-ray (SXR) measurement systems working in tokamaks or with laser generated plasma can expect high photon fluxes. Therefore it is necessary to focus on data processing algorithms to have the best possible efficiency in term of processed photon events per second. This paper refers to recently designed algorithm and data-flow for implementation of charge data acquisition in FPGA. The algorithms are currently on implementation stage for the soft X-ray diagnostics system. In this paper despite of the charge processing algorithm is also described general firmware overview, data storage methods and other key components of the measurement system. The simulation section presents algorithm performance and expected maximum photon rate.
Uranus: a rapid prototyping tool for FPGA embedded computer vision

NASA Astrophysics Data System (ADS)

Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.

2007-01-01

The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Architectural design for a low cost FPGA-based traffic signal detection system in vehicles

NASA Astrophysics Data System (ADS)

López, Ignacio; Salvador, Rubén; Alarcón, Jaime; Moreno, Félix

2007-05-01

In this paper we propose an architecture for an embedded traffic signal detection system. Development of Advanced Driver Assistance Systems (ADAS) is one of the major trends of research in automotion nowadays. Examples of past and ongoing projects in the field are CHAMELEON ("Pre-Crash Application all around the vehicle" IST 1999-10108), PREVENT (Preventive and Active Safety Applications, FP6-507075, http://www.prevent-ip.org/) and AVRT in the US (Advanced Vision-Radar Threat Detection (AVRT): A Pre-Crash Detection and Active Safety System). It can be observed a major interest in systems for real-time analysis of complex driving scenarios, evaluating risk and anticipating collisions. The system will use a low cost CCD camera on the dashboard facing the road. The images will be processed by an Altera Cyclone family FPGA. The board does median and Sobel filtering of the incoming frames at PAL rate, and analyzes them for several categories of signals. The result is conveyed to the driver. The scarce resources provided by the hardware require an architecture developed for optimal use. The system will use a combination of neural networks and an adapted blackboard architecture. Several neural networks will be used in sequence for image analysis, by reconfiguring a single, generic hardware neural network in the FPGA. This generic network is optimized for speed, in order to admit several executions within the frame rate. The sequence will follow the execution cycle of the blackboard architecture. The global, blackboard architecture being developed and the hardware architecture for the generic, reconfigurable FPGA perceptron will be explained in this paper. The project is still at an early stage. However, some hardware implementation results are already available and will be offered in the paper.
Implementation of Temperature Sequential Controller on Variable Speed Drive

NASA Astrophysics Data System (ADS)

Cheong, Z. X.; Barsoum, N. N.

2008-10-01

There are many pump and motor installations with quite extensive speed variation, such as Sago conveyor, heating, ventilation and air conditioning (HVAC) and water pumping system. A common solution for these applications is to run several fixed speed motors in parallel, with flow control accomplish by turning the motors on and off. This type of control method causes high in-rush current, and adds a risk of damage caused by pressure transients. This paper explains the design and implementation of a temperature speed control system for use in industrial and commercial sectors. Advanced temperature speed control can be achieved by using ABB ACS800 variable speed drive-direct torque sequential control macro, programmable logic controller and temperature transmitter. The principle of direct torque sequential control macro (DTC-SC) is based on the control of torque and flux utilizing the stator flux field orientation over seven preset constant speed. As a result of continuous comparison of ambient temperature to the references temperatures; electromagnetic torque response is particularly fast to the motor state and it is able maintain constant speeds. Experimental tests have been carried out by using ABB ACS800-U1-0003-2, to validate the effectiveness and dynamic respond of ABB ACS800 against temperature variation, loads, and mechanical shocks.
High-speed image processing system and its micro-optics application

NASA Astrophysics Data System (ADS)

Ohba, Kohtaro; Ortega, Jesus C. P.; Tanikawa, Tamio; Tanie, Kazuo; Tajima, Kenji; Nagai, Hiroshi; Tsuji, Masataka; Yamada, Shigeru

2003-07-01

In this paper, a new application system with high speed photography, i.e. an observational system for the tele-micro-operation, has been proposed with a dynamic focusing system and a high-speed image processing system using the "Depth From Focus (DFF)" criteria. In micro operation, such as for the microsurgery, DNA operation and etc., the small depth of a focus on the microscope makes bad observation. For example, if the focus is on the object, the actuator cannot be seen with the microscope. On the other hand, if the focus is on the actuator, the object cannot be observed. In this sense, the "all-in-focus image," which holds the in-focused texture all over the image, is useful to observe the microenvironments on the microscope. It is also important to obtain the "depth map" which could show the 3D micro virtual environments in real-time to actuate the micro objects, intuitively. To realize the real-time micro operation with DFF criteria, which has to integrate several images to obtain "all-in-focus image" and "depth map," at least, the 240 frames par second based image capture and processing system should be required. At first, this paper briefly reviews the criteria of "depth from focus" to achieve the all-in-focus image and the 3D microenvironments' reconstruction, simultaneously. After discussing the problem in our past system, a new frame-rate system is constructed with the high-speed video camera and FPGA hardware with 240 frames par second. To apply this system in the real microscope, a new criterion "ghost filtering" technique to reconstruct the all-in-focus image is proposed. Finally, the micro observation shows the validity of this system.
Application of speed-enhanced spatial domain correlation filters for real-time security monitoring

NASA Astrophysics Data System (ADS)

Gardezi, Akber; Bangalore, Nagachetan; Al-Kandri, Ahmed; Birch, Philip; Young, Rupert; Chatwin, Chris

2011-11-01

A speed enhanced space variant correlation filer which has been designed to be invariant to change in orientation and scale of the target object but also to be spatially variant, i.e. the filter function becoming dependant on local clutter conditions within the image. The speed enhancement of the filter is due to the use of optimization techniques employing low-pass filtering to restrict kernel movement to be within regions of interest. The detection and subsequent identification capability of the two-stage process has been evaluated in highly cluttered backgrounds using both visible and thermal imagery acquired from civil and defense domains along with associated training data sets for target detection and classification. In this paper a series of tests have been conducted in multiple scenarios relating to situations that pose a security threat. Performance matrices comprised of peak-to-correlation energy (PCE) and peak-to-side lobe ratio (PSR) measurements of the correlation output have been calculated to allow the definition of a recognition criterion. The hardware implementation of the system has been discussed in terms of Field Programmable Gate Array (FPGA) chipsets with implementation bottle necks and their solution being considered.
Architecture and implementation considerations of a high-speed Viterbi decoder for a Reed-Muller subcode

NASA Technical Reports Server (NTRS)

Lin, Shu (Principal Investigator); Uehara, Gregory T.; Nakamura, Eric; Chu, Cecilia W. P.

1996-01-01

The (64, 40, 8) subcode of the third-order Reed-Muller (RM) code for high-speed satellite communications is proposed. The RM subcode can be used either alone or as an inner code of a concatenated coding system with the NASA standard (255, 233, 33) Reed-Solomon (RS) code as the outer code to achieve high performance (or low bit-error rate) with reduced decoding complexity. It can also be used as a component code in a multilevel bandwidth efficient coded modulation system to achieve reliable bandwidth efficient data transmission. The progress made toward achieving the goal of implementing a decoder system based upon this code is summarized. The development of the integrated circuit prototype sub-trellis IC, particularly focusing on the design methodology, is addressed.
Real-time distortion correction for visual inspection systems based on FPGA

NASA Astrophysics Data System (ADS)

Liang, Danhua; Zhang, Zhaoxia; Chen, Xiaodong; Yu, Daoyin

2008-03-01

Visual inspection is a kind of new technology based on the research of computer vision, which focuses on the measurement of the object's geometry and location. It can be widely used in online measurement, and other real-time measurement process. Because of the defects of the traditional visual inspection, a new visual detection mode -all-digital intelligent acquisition and transmission is presented. The image processing, including filtering, image compression, binarization, edge detection and distortion correction, can be completed in the programmable devices -FPGA. As the wide-field angle lens is adopted in the system, the output images have serious distortion. Limited by the calculating speed of computer, software can only correct the distortion of static images but not the distortion of dynamic images. To reach the real-time need, we design a distortion correction system based on FPGA. The method of hardware distortion correction is that the spatial correction data are calculated first under software circumstance, then converted into the address of hardware storage and stored in the hardware look-up table, through which data can be read out to correct gray level. The major benefit using FPGA is that the same circuit can be used for other circularly symmetric wide-angle lenses without being modified.
Low-Cost High-Speed Techniques for Real-Time Simulation of Power Electronic Systems

DTIC Science & Technology

2007-06-01

first implemented on the RT-Lab using Simulink S- fuctions . An effort was then initiated to code at least part of the simulation on the available FPGA. It...time simulation, and the use of simulation packages such as Matlab and Spice. The primary purpose of these calculations was to confirm that the
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Monenegro, Justino (Technical Monitor)

2002-01-01

Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used proportional-integral-derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a DSP (Digital Signal Processor) or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. An alternative is required for compact implementation of such functionality to withstand the harsh environment
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)

2002-01-01

Much has been made of the capabilities of Field Programmable Gate Arrays (FPGA's) in the hardware implementation of fast digital signal processing functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used Proportional-Integral-Derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a Digital Signal Processor (DSP) device or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using DSP devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, Pulse Width Modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacemap. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive-control algorithm
Multi-camera synchronization core implemented on USB3 based FPGA platform

NASA Astrophysics Data System (ADS)

Sousa, Ricardo M.; Wäny, Martin; Santos, Pedro; Dias, Morgado

2015-03-01

Centered on Awaiba's NanEye CMOS image sensor family and a FPGA platform with USB3 interface, the aim of this paper is to demonstrate a new technique to synchronize up to 8 individual self-timed cameras with minimal error. Small form factor self-timed camera modules of 1 mm x 1 mm or smaller do not normally allow external synchronization. However, for stereo vision or 3D reconstruction with multiple cameras as well as for applications requiring pulsed illumination it is required to synchronize multiple cameras. In this work, the challenge of synchronizing multiple selftimed cameras with only 4 wire interface has been solved by adaptively regulating the power supply for each of the cameras. To that effect, a control core was created to constantly monitor the operating frequency of each camera by measuring the line period in each frame based on a well-defined sampling signal. The frequency is adjusted by varying the voltage level applied to the sensor based on the error between the measured line period and the desired line period. To ensure phase synchronization between frames, a Master-Slave interface was implemented. A single camera is defined as the Master, with its operating frequency being controlled directly through a PC based interface. The remaining cameras are setup in Slave mode and are interfaced directly with the Master camera control module. This enables the remaining cameras to monitor its line and frame period and adjust their own to achieve phase and frequency synchronization. The result of this work will allow the implementation of smaller than 3mm diameter 3D stereo vision equipment in medical endoscopic context, such as endoscopic surgical robotic or micro invasive surgery.
FPGA-based GEM detector signal acquisition for SXR spectroscopy system

NASA Astrophysics Data System (ADS)

Wojenski, A.; Pozniak, K. T.; Kasprowicz, G.; Kolasinski, P.; Krawczyk, R.; Zabolotny, W.; Chernyshova, M.; Czarski, T.; Malinowski, K.

2016-11-01

The presented work is related to the Gas Electron Multiplier (GEM) detector soft X-ray spectroscopy system for tokamak applications. The used GEM detector has one-dimensional, 128 channel readout structure. The channels are connected to the radiation-hard electronics with configurable analog stage and fast ADCs, supporting speeds of 125 MSPS for each channel. The digitalized data is sent directly to the FPGAs using fast serial links. The preprocessing algorithms are implemented in the FPGAs, with the data buffering made in the on-board 2Gb DDR3 memory chips. After the algorithmic stage, the data is sent to the Intel Xeon-based PC for further postprocessing using PCI-Express link Gen 2. For connection of multiple FPGAs, PCI-Express switch 8-to-1 was designed. The whole system can support up to 2048 analog channels. The scope of the work is an FPGA-based implementation of the recorder of the raw signal from GEM detector. Since the system will work in a very challenging environment (neutron radiation, intense electro-magnetic fields), the registered signals from the GEM detector can be corrupted. In the case of the very intense hot plasma radiation (e.g. laser generated plasma), the registered signals can overlap. Therefore, it is valuable to register the raw signals from the GEM detector with high number of events during soft X-ray radiation. The signal analysis will have the direct impact on the implementation of photon energy computation algorithms. As the result, the system will produce energy spectra and topological distribution of soft X-ray radiation. The advanced software was developed in order to perform complex system startup and monitoring of hardware units. Using the array of two one-dimensional GEM detectors it will be possible to perform tomographic reconstruction of plasma impurities radiation in the SXR region.
A low power flash-FPGA based brain implant micro-system of PID control.

PubMed

Lijuan Xia; Fattah, Nabeel; Soltan, Ahmed; Jackson, Andrew; Chester, Graeme; Degenaar, Patrick

2017-07-01

In this paper, we demonstrate that a low power flash FPGA based micro-system can provide a low power programmable interface for closed-loop brain implant inter- faces. The proposed micro-system receives recording local field potential (LFP) signals from an implanted probe, performs closed-loop control using a first order control system, then converts the signal into an optogenetic control stimulus pattern. Stimulus can be implemented through optoelectronic probes. The long term target is for both fundamental neuroscience applications and for clinical use in treating epilepsy. Utilizing our device, closed-loop processing consumes only 14nJ of power per PID cycle compared to 1.52μJ per cycle for a micro-controller implementation. Compared to an application specific digital integrated circuit, flash FPGA's are inherently programmable.

High-Speed Soft-Decision Decoding of Two Reed-Muller Codes

NASA Technical Reports Server (NTRS)

Lin, Shu; Uehara, Gregory T.

1996-01-01

In this research, we have proposed the (64, 40, 8) subcode of the third-order Reed-Muller (RM) code to NASA for high-speed satellite communications. This RM subcode can be used either alone or as an inner code of a concatenated coding system with the NASA standard (255, 233, 33) Reed-Solomon (RS) code as the outer code to achieve high performance (or low bit-error rate) with reduced decoding complexity. It can also be used as a component code in a multilevel bandwidth efficient coded modulation system to achieve reliable bandwidth efficient data transmission. This report will summarize the key progress we have made toward achieving our eventual goal of implementing, a decoder system based upon this code. In the first phase of study, we investigated the complexities of various sectionalized trellis diagrams for the proposed (64, 40, 8) RM subcode. We found a specific 8-trellis diagram for this code which requires the least decoding complexity with a high possibility of achieving a decoding speed of 600 M bits per second (Mbps). The combination of a large number of states and a high data rate will be made possible due to the utilization of a high degree of parallelism throughout the architecture. This trellis diagram will be presented and briefly described. In the second phase of study, which was carried out through the past year, we investigated circuit architectures to determine the feasibility of VLSI implementation of a high-speed Viterbi decoder based on this 8-section trellis diagram. We began to examine specific design and implementation approaches to implement a fully custom integrated circuit (IC) which will be a key building block for a decoder system implementation. The key results will be presented in this report. This report will be divided into three primary sections. First, we will briefly describe the system block diagram in which the proposed decoder is assumed to be operating, and present some of the key architectural approaches being used to
An 18-ps TDC using timing adjustment and bin realignment methods in a Cyclone-IV FPGA

NASA Astrophysics Data System (ADS)

Cao, Guiping; Xia, Haojie; Dong, Ning

2018-05-01

The method commonly used to produce a field-programmable gate array (FPGA)-based time-to-digital converter (TDC) creates a tapped delay line (TDL) for time interpolation to yield high time precision. We conduct timing adjustment and bin realignment to implement a TDC in the Altera Cyclone-IV FPGA. The former tunes the carry look-up table (LUT) cell delay by changing the LUT's function through low-level primitives according to timing analysis results, while the latter realigns bins according to the timing result obtained by timing adjustment so as to create a uniform TDL with bins of equivalent width. The differential nonlinearity and time resolution can be improved by realigning the bins. After calibration, the TDC has a 18 ps root-mean-square timing resolution and a 45 ps least-significant bit resolution.
A high-speed DAQ framework for future high-level trigger and event building clusters

NASA Astrophysics Data System (ADS)

Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.

2017-03-01

Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger system. In this paper the heterogeneous FPGA-GPU architecture will be presented and its performance be discussed.
Investigation of High-Level Synthesis tools’ applicability to data acquisition systems design based on the CMS ECAL Data Concentrator Card example

NASA Astrophysics Data System (ADS)

HUSEJKO, Michal; EVANS, John; RASTEIRO DA SILVA, Jose Carlos

2015-12-01

High-Level Synthesis (HLS) for Field-Programmable Logic Array (FPGA) programming is becoming a practical alternative to well-established VHDL and Verilog languages. This paper describes a case study in the use of HLS tools to design FPGA-based data acquisition systems (DAQ). We will present the implementation of the CERN CMS detector ECAL Data Concentrator Card (DCC) functionality in HLS and lessons learned from using HLS design flow. The DCC functionality and a definition of the initial system-level performance requirements (latency, bandwidth, and throughput) will be presented. We will describe how its packet processing control centric algorithm was implemented with VHDL and Verilog languages. We will then show how the HLS flow could speed up design-space exploration by providing loose coupling between functions interface design and functions algorithm implementation. We conclude with results of real-life hardware tests performed with the HLS flow-generated design with a DCC Tester system.
High-Speed Soft-Decision Decoding of Two Reed-Muller Codes

NASA Technical Reports Server (NTRS)

Lin, Shu; Uehara, Gregory T.

1996-01-01

In his research, we have proposed the (64, 40, 8) subcode of the third-order Reed-Muller (RM) code to NASA for high-speed satellite communications. This RM subcode can be used either alone or as an inner code of a concatenated coding system with the NASA standard (255, 233, 33) Reed-Solomon (RS) code as the outer code to achieve high performance (or low bit-error rate) with reduced decoding complexity. It can also be used as a component code in a multilevel bandwidth efficient coded modulation system to achieve reliable bandwidth efficient data transmission. This report will summarize the key progress we have made toward achieving our eventual goal of implementing a decoder system based upon this code. In the first phase of study, we investigated the complexities of various sectionalized trellis diagrams for the proposed (64, 40, 8) RNI subcode. We found a specific 8-trellis diagram for this code which requires the least decoding complexity with a high possibility of achieving a decoding speed of 600 M bits per second (Mbps). The combination of a large number of states and a hi ch data rate will be made possible due to the utilization of a high degree of parallelism throughout the architecture. This trellis diagram will be presented and briefly described. In the second phase of study which was carried out through the past year, we investigated circuit architectures to determine the feasibility of VLSI implementation of a high-speed Viterbi decoder based on this 8-section trellis diagram. We began to examine specific design and implementation approaches to implement a fully custom integrated circuit (IC) which will be a key building block for a decoder system implementation. The key results will be presented in this report. This report will be divided into three primary sections. First, we will briefly describe the system block diagram in which the proposed decoder is assumed to be operating and present some of the key architectural approaches being used to
Efficient lossy compression implementations of hyperspectral images: tools, hardware platforms, and comparisons

NASA Astrophysics Data System (ADS)

García, Aday; Santos, Lucana; López, Sebastián.; Callicó, Gustavo M.; Lopez, Jose F.; Sarmiento, Roberto

2014-05-01

Efficient onboard satellite hyperspectral image compression represents a necessity and a challenge for current and future space missions. Therefore, it is mandatory to provide hardware implementations for this type of algorithms in order to achieve the constraints required for onboard compression. In this work, we implement the Lossy Compression for Exomars (LCE) algorithm on an FPGA by means of high-level synthesis (HSL) in order to shorten the design cycle. Specifically, we use CatapultC HLS tool to obtain a VHDL description of the LCE algorithm from C-language specifications. Two different approaches are followed for HLS: on one hand, introducing the whole C-language description in CatapultC and on the other hand, splitting the C-language description in functional modules to be implemented independently with CatapultC, connecting and controlling them by an RTL description code without HLS. In both cases the goal is to obtain an FPGA implementation. We explain the several changes applied to the original Clanguage source code in order to optimize the results obtained by CatapultC for both approaches. Experimental results show low area occupancy of less than 15% for a SRAM-based Virtex-5 FPGA and a maximum frequency above 80 MHz. Additionally, the LCE compressor was implemented into an RTAX2000S antifuse-based FPGA, showing an area occupancy of 75% and a frequency around 53 MHz. All these serve to demonstrate that the LCE algorithm can be efficiently executed on an FPGA onboard a satellite. A comparison between both implementation approaches is also provided. The performance of the algorithm is finally compared with implementations on other technologies, specifically a graphics processing unit (GPU) and a single-threaded CPU.
FPGA-based firmware model for extended measurement systems with data quality monitoring

NASA Astrophysics Data System (ADS)

Wojenski, A.; Pozniak, K. T.; Mazon, D.; Chernyshova, M.

2017-08-01

Modern physics experiments requires construction of advanced, modular measurement systems for data processing and registration purposes. Components are often designed in one of the common mechanical and electrical standards, e.g. VME or uTCA. The paper is focused on measurement systems using FPGAs as data processing blocks, especially for plasma diagnostics using GEM detectors with data quality monitoring aspects. In the article is proposed standardized model of HDL FPGA firmware implementation, for use in a wide range of different measurement system. The effort was made in term of flexible implementation of data quality monitoring along with source data dynamic selection. In the paper is discussed standard measurement system model followed by detailed model of FPGA firmware for modular measurement systems. Considered are both: functional blocks and data buses. In the summary, necessary blocks and signal lines are described. Implementation of firmware following the presented rules should provide modular design, with ease of change different parts of it. The key benefit is construction of universal, modular HDL design, that can be applied in different measurement system with simple adjustments.
Design of the ANTARES LCM-DAQ board test bench using a FPGA-based system-on-chip approach

NASA Astrophysics Data System (ADS)

Anvar, S.; Kestener, P.; Le Provost, H.

2006-11-01

The System-on-Chip (SoC) approach consists in using state-of-the-art FPGA devices with embedded RISC processor cores, high-speed differential LVDS links and ready-to-use multi-gigabit transceivers allowing development of compact systems with substantial number of IO channels. Required performances are obtained through a subtle separation of tasks between closely cooperating programmable hardware logic and user-friendly software environment. We report about our experience in using the SoC approach for designing the production test bench of the off-shore readout system for the ANTARES neutrino experiment.
FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

PubMed Central

2011-01-01

Background Protein is an important molecule that performs a wide range of functions in biological systems. Recently, the protein folding attracts much more attention since the function of protein can be generally derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, the execution time is still intolerable with the steep growth in protein database. Recently, FPGA chips have emerged as one promising application accelerator to accelerate bioinformatics algorithms by exploiting fine-grained custom design. Results In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. However, the power consumption is only about 30% of that of current general-propose CPUs. PMID:21342582
Low Speed and High Speed Correlation of SMART Active Flap Rotor Loads

NASA Technical Reports Server (NTRS)

Kottapalli, Sesi B. R.

2010-01-01

Measured, open loop and closed loop data from the SMART rotor test in the NASA Ames 40- by 80- Foot Wind Tunnel are compared with CAMRAD II calculations. One open loop high-speed case and four closed loop cases are considered. The closed loop cases include three high-speed cases and one low-speed case. Two of these high-speed cases include a 2 deg flap deflection at 5P case and a test maximum-airspeed case. This study follows a recent, open loop correlation effort that used a simple correction factor for the airfoil pitching moment Mach number. Compared to the earlier effort, the current open loop study considers more fundamental corrections based on advancing blade aerodynamic conditions. The airfoil tables themselves have been studied. Selected modifications to the HH-06 section flap airfoil pitching moment table are implemented. For the closed loop condition, the effect of the flap actuator is modeled by increased flap hinge stiffness. Overall, the open loop correlation is reasonable, thus confirming the basic correctness of the current semi-empirical modifications; the closed loop correlation is also reasonable considering that the current flap model is a first generation model. Detailed correlation results are given in the paper.
Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems

NASA Technical Reports Server (NTRS)

Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.

2012-01-01

Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, a FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGAs used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor

PubMed Central

Tayara, Hilal; Ham, Woonchul; Chong, Kil To

2016-01-01

This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.

PubMed

Tayara, Hilal; Ham, Woonchul; Chong, Kil To

2016-12-15

This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
Frequency domain near-infrared multiwavelength imager design using high-speed, direct analog-to-digital conversion

NASA Astrophysics Data System (ADS)

Zimmermann, Bernhard B.; Fang, Qianqian; Boas, David A.; Carp, Stefan A.

2016-01-01

Frequency domain near-infrared spectroscopy (FD-NIRS) has proven to be a reliable method for quantification of tissue absolute optical properties. We present a full-sampling direct analog-to-digital conversion FD-NIR imager. While we developed this instrument with a focus on high-speed optical breast tomographic imaging, the proposed design is suitable for a wide-range of biophotonic applications where fast, accurate quantification of absolute optical properties is needed. Simultaneous dual wavelength operation at 685 and 830 nm is achieved by concurrent 67.5 and 75 MHz frequency modulation of each laser source, respectively, followed by digitization using a high-speed (180 MS/s) 16-bit A/D converter and hybrid FPGA-assisted demodulation. The instrument supports 25 source locations and features 20 concurrently operating detectors. The noise floor of the instrument was measured at <1.4 pW/√Hz, and a dynamic range of 115+ dB, corresponding to nearly six orders of magnitude, has been demonstrated. Titration experiments consisting of 200 different absorption and scattering values were conducted to demonstrate accurate optical property quantification over the entire range of physiologically expected values.
Frequency domain near-infrared multiwavelength imager design using high-speed, direct analog-to-digital conversion

PubMed Central

Zimmermann, Bernhard B.; Fang, Qianqian; Boas, David A.; Carp, Stefan A.

2016-01-01

Abstract. Frequency domain near-infrared spectroscopy (FD-NIRS) has proven to be a reliable method for quantification of tissue absolute optical properties. We present a full-sampling direct analog-to-digital conversion FD-NIR imager. While we developed this instrument with a focus on high-speed optical breast tomographic imaging, the proposed design is suitable for a wide-range of biophotonic applications where fast, accurate quantification of absolute optical properties is needed. Simultaneous dual wavelength operation at 685 and 830 nm is achieved by concurrent 67.5 and 75 MHz frequency modulation of each laser source, respectively, followed by digitization using a high-speed (180 MS/s) 16-bit A/D converter and hybrid FPGA-assisted demodulation. The instrument supports 25 source locations and features 20 concurrently operating detectors. The noise floor of the instrument was measured at <1.4 pW/√Hz, and a dynamic range of 115+ dB, corresponding to nearly six orders of magnitude, has been demonstrated. Titration experiments consisting of 200 different absorption and scattering values were conducted to demonstrate accurate optical property quantification over the entire range of physiologically expected values. PMID:26813081
Design of an MR image processing module on an FPGA chip.

PubMed

Li, Limin; Wyrwicz, Alice M

2015-06-01

We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128×128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. Copyright © 2015 Elsevier Inc. All rights reserved.
Design of an MR image processing module on an FPGA chip

NASA Astrophysics Data System (ADS)

Li, Limin; Wyrwicz, Alice M.

2015-06-01

We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments.
Design of an MR image processing module on an FPGA chip

PubMed Central

Li, Limin; Wyrwicz, Alice M.

2015-01-01

We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. PMID:25909646
A Pipelined Non-Deterministic Finite Automaton-Based String Matching Scheme Using Merged State Transitions in an FPGA

PubMed Central

Choi, Kang-Il

2016-01-01

This paper proposes a pipelined non-deterministic finite automaton (NFA)-based string matching scheme using field programmable gate array (FPGA) implementation. The characteristics of the NFA such as shared common prefixes and no failure transitions are considered in the proposed scheme. In the implementation of the automaton-based string matching using an FPGA, each state transition is implemented with a look-up table (LUT) for the combinational logic circuit between registers. In addition, multiple state transitions between stages can be performed in a pipelined fashion. In this paper, it is proposed that multiple one-to-one state transitions, called merged state transitions, can be performed with an LUT. By cutting down the number of used LUTs for implementing state transitions, the hardware overhead of combinational logic circuits is greatly reduced in the proposed pipelined NFA-based string matching scheme. PMID:27695114
A Pipelined Non-Deterministic Finite Automaton-Based String Matching Scheme Using Merged State Transitions in an FPGA.

PubMed

Kim, HyunJin; Choi, Kang-Il

2016-01-01

This paper proposes a pipelined non-deterministic finite automaton (NFA)-based string matching scheme using field programmable gate array (FPGA) implementation. The characteristics of the NFA such as shared common prefixes and no failure transitions are considered in the proposed scheme. In the implementation of the automaton-based string matching using an FPGA, each state transition is implemented with a look-up table (LUT) for the combinational logic circuit between registers. In addition, multiple state transitions between stages can be performed in a pipelined fashion. In this paper, it is proposed that multiple one-to-one state transitions, called merged state transitions, can be performed with an LUT. By cutting down the number of used LUTs for implementing state transitions, the hardware overhead of combinational logic circuits is greatly reduced in the proposed pipelined NFA-based string matching scheme.

Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions

Treesearch

Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners

1995-01-01

In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...
High-speed wavefront modulation in complex media (Conference Presentation)

NASA Astrophysics Data System (ADS)

Turtaev, Sergey; Leite, Ivo T.; Cizmár, TomáÅ.¡

2017-02-01

Using spatial light modulators(SLM) to control light propagation through scattering media is a critical topic for various applications in biomedical imaging, optical micromanipulation, and fibre endoscopy. Having limited switching rate, typically 10-100Hz, current liquid-crystal SLM can no longer meet the growing demands of high-speed imaging. A new way based on binary-amplitude holography implemented on digital micromirror devices(DMD) has been introduced recently, allowing to reach refreshing rates of 30kHz. Here, we summarise the advantages and limitations in speed, efficiency, scattering noise, and pixel cross-talk for each device in ballistic and diffusive regimes, paving the way for high-speed imaging through multimode fibres.
Numerical Simulation of High-Speed Turbulent Reacting Flows

NASA Technical Reports Server (NTRS)

Givi, P.; Taulbee, D. B.; Madnia, C. K.; Jaberi, F. A.; Colucci, P. J.; Gicquel, L. Y. M.; Adumitroaie, V.; James, S.

1999-01-01

The objectives of this research are: (1) to develop and implement a new methodology for large eddy simulation of (LES) of high-speed reacting turbulent flows. (2) To develop algebraic turbulence closures for statistical description of chemically reacting turbulent flows.
Evaluating outcomes of raising speed limits on high speed non-freeways.

DOT National Transportation Integrated Search

2015-04-01

The purpose of this research was to assist in determining the potential impacts of implementing a : proposed 65 mph speed limit on non-freeways in Michigan. Consideration was given to a broad range of : performance measures, including operating speed...
The characterization and application of a low resource FPGA-based time to digital converter

NASA Astrophysics Data System (ADS)

Balla, Alessandro; Mario Beretta, Matteo; Ciambrone, Paolo; Gatta, Maurizio; Gonnella, Francesco; Iafolla, Lorenzo; Mascolo, Matteo; Messi, Roberto; Moricciani, Dario; Riondino, Domenico

2014-03-01

Time to Digital Converters (TDCs) are very common devices in particles physics experiments. A lot of "off-the-shelf" TDCs can be employed but the necessity of a custom DAta acQuisition (DAQ) system makes the TDCs implemented on the Field-Programmable Gate Arrays (FPGAs) desirable. Most of the architectures developed so far are based on the tapped delay lines with precision down to 10 ps, obtained with high FPGA resources usage and non-linearity issues to be managed. Often such precision is not necessary; in this case TDC architectures with low resources occupancy are preferable allowing the implementation of data processing systems and of other utilities on the same device. In order to reconstruct γγ physics events tagged with High Energy Tagger (HET) in the KLOE-2 (K LOng Experiment 2), we need to measure the Time Of Flight (TOF) of the electrons and positrons from the KLOE-2 Interaction Point (IP) to our tagging stations (11 m apart). The required resolution must be better than the bunch spacing (2.7 ns). We have developed and implemented on a Xilinx Virtex-5 FPGA a 32 channel TDC with a precision of 255 ps and low non-linearity effects along with an embedded data acquisition system and the interface to the online FARM of KLOE-2. The TDC is based on a low resources occupancy technique: the 4×Oversampling technique which, in this work, is pushed to its best resolution and its performances were exhaustively measured.
Real time mitigation of atmospheric turbulence in long distance imaging using the lucky region fusion algorithm with FPGA and GPU hardware acceleration

NASA Astrophysics Data System (ADS)

Jackson, Christopher Robert

"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
A 3.9 ps Time-Interval RMS Precision Time-to-Digital Converter Using a Dual-Sampling Method in an UltraScale FPGA

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Liu, Chong

2016-10-01

Field programmable gate arrays (FPGAs) manufactured with more advanced processing technology have faster carry chains and smaller delay elements, which are favorable for the design of tapped delay line (TDL)-style time-to-digital converters (TDCs) in FPGA. However, new challenges are posed in using them to implement TDCs with a high time precision. In this paper, we propose a bin realignment method and a dual-sampling method for TDC implementation in a Xilinx UltraScale FPGA. The former realigns the disordered time delay taps so that the TDC precision can approach the limit of its delay granularity, while the latter doubles the number of taps in the delay line so that the TDC precision beyond the cell delay limitation can be expected. Two TDC channels were implemented in a Kintex UltraScale FPGA, and the effectiveness of the new methods was evaluated. For fixed time intervals in the range from 0 to 440 ns, the average RMS precision measured by the two TDC channels reaches 5.8 ps using the bin realignment, and it further improves to 3.9 ps by using the dual-sampling method. The time precision has a 5.6% variation in the measured temperature range. Every part of the TDC, including dual-sampling, encoding, and on-line calibration, could run at a 500 MHz clock frequency. The system measurement dead time is only 4 ns.
VLSI Implementation of a 2.8 Gevent/s Packet-Based AER Interface with Routing and Event Sorting Functionality

PubMed Central

Scholze, Stefan; Schiefer, Stefan; Partzsch, Johannes; Hartmann, Stephan; Mayr, Christian Georg; Höppner, Sebastian; Eisenreich, Holger; Henker, Stephan; Vogginger, Bernhard; Schüffny, Rene

2011-01-01

State-of-the-art large-scale neuromorphic systems require sophisticated spike event communication between units of the neural network. We present a high-speed communication infrastructure for a waferscale neuromorphic system, based on application-specific neuromorphic communication ICs in an field programmable gate arrays (FPGA)-maintained environment. The ICs implement configurable axonal delays, as required for certain types of dynamic processing or for emulating spike-based learning among distant cortical areas. Measurements are presented which show the efficacy of these delays in influencing behavior of neuromorphic benchmarks. The specialized, dedicated address-event-representation communication in most current systems requires separate, low-bandwidth configuration channels. In contrast, the configuration of the waferscale neuromorphic system is also handled by the digital packet-based pulse channel, which transmits configuration data at the full bandwidth otherwise used for pulse transmission. The overall so-called pulse communication subgroup (ICs and FPGA) delivers a factor 25–50 more event transmission rate than other current neuromorphic communication infrastructures. PMID:22016720
Development and characterisation of FPGA modems using forward error correction for FSOC

NASA Astrophysics Data System (ADS)

Mudge, Kerry A.; Grant, Kenneth J.; Clare, Bradley A.; Biggs, Colin L.; Cowley, William G.; Manning, Sean; Lechner, Gottfried

2016-05-01

In this paper we report on the performance of a free-space optical communications (FSOC) modem implemented in FPGA, with data rate variable up to 60 Mbps. To combat the effects of atmospheric scintillation, a 7/8 rate low density parity check (LDPC) forward error correction is implemented along with custom bit and frame synchronisation and a variable length interleaver. We report on the systematic performance evaluation of an optical communications link employing the FPGA modems using a laboratory test-bed to simulate the effects of atmospheric turbulence. Log-normal fading is imposed onto the transmitted free-space beam using a custom LabVIEW program and an acoustic-optic modulator. The scintillation index, transmitted optical power and the scintillation bandwidth can all be independently varied allowing testing over a wide range of optical channel conditions. In particular, bit-error-ratio (BER) performance for different interleaver lengths is investigated as a function of the scintillation bandwidth. The laboratory results are compared to field measurements over 1.5km.
FPGA-based prototype storage system with phase change memory

NASA Astrophysics Data System (ADS)

Li, Gezi; Chen, Xiaogang; Chen, Bomy; Li, Shunfen; Zhou, Mi; Han, Wenbing; Song, Zhitang

2016-10-01

With the ever-increasing amount of data being stored via social media, mobile telephony base stations, and network devices etc. the database systems face severe bandwidth bottlenecks when moving vast amounts of data from storage to the processing nodes. At the same time, Storage Class Memory (SCM) technologies such as Phase Change Memory (PCM) with unique features like fast read access, high density, non-volatility, byte-addressability, positive response to increasing temperature, superior scalability, and zero standby leakage have changed the landscape of modern computing and storage systems. In such a scenario, we present a storage system called FLEET which can off-load partial or whole SQL queries to the storage engine from CPU. FLEET uses an FPGA rather than conventional CPUs to implement the off-load engine due to its highly parallel nature. We have implemented an initial prototype of FLEET with PCM-based storage. The results demonstrate that significant performance and CPU utilization gains can be achieved by pushing selected query processing components inside in PCM-based storage.
Montaje Experimental de Optica Adaptiva con Tecnología FPGA

NASA Astrophysics Data System (ADS)

Rodriguez Brizuela, F.; Verasay, J. P.; Recabarren, P.

An experimental platform based on FPGA devices, dedicated to implement active and adaptive optic software in HDL has been developed. The devel- oped assembly is the first of a series of works focused on this important area of instrumental astronomy. The exposed development is part of a Final Project of Electronic Engineering of the National University of Cordoba. FULL TEXT IN SPANISH
System-on-chip architecture and validation for real-time transceiver optimization: APC implementation on FPGA

NASA Astrophysics Data System (ADS)

Suarez, Hernan; Zhang, Yan R.

2015-05-01

New radar applications need to perform complex algorithms and process large quantity of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementation of adaptive pulse compression for real-time transceiver optimization are presented, they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessor as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such matrix multiplication and matrix inversion which are essential units to solve the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through the high performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band tested together with a low-cost channel emulator for different types of waveforms.
High-speed and ultrahigh-speed cinematographic recording techniques

NASA Astrophysics Data System (ADS)

Miquel, J. C.

1980-12-01

A survey is presented of various high-speed and ultrahigh-speed cinematographic recording systems (covering a range of speeds from 100 to 14-million pps). Attention is given to the functional and operational characteristics of cameras and to details of high-speed cinematography techniques (including image processing, and illumination). A list of cameras (many of them French) available in 1980 is presented
From OO to FPGA :

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kou, Stephen; Palsberg, Jens; Brooks, Jeffrey

Consumer electronics today such as cell phones often have one or more low-power FPGAs to assist with energy-intensive operations in order to reduce overall energy consumption and increase battery life. However, current techniques for programming FPGAs require people to be specially trained to do so. Ideally, software engineers can more readily take advantage of the benefits FPGAs offer by being able to program them using their existing skills, a common one being object-oriented programming. However, traditional techniques for compiling object-oriented languages are at odds with todays FPGA tools, which support neither pointers nor complex data structures. Open until now ismore » the problem of compiling an object-oriented language to an FPGA in a way that harnesses this potential for huge energy savings. In this paper, we present a new compilation technique that feeds into an existing FPGA tool chain and produces FPGAs with up to almost an order of magnitude in energy savings compared to a low-power microprocessor while still retaining comparable performance and area usage.« less
FPGA-based real time controller for high order correction in EDIFISE

NASA Astrophysics Data System (ADS)

Rodríguez-Ramos, L. F.; Chulani, H.; Martín, Y.; Dorta, T.; Alonso, A.; Fuensalida, J. J.

2012-07-01

EDIFISE is a technology demonstrator instrument developed at the Institute of Astrophysics of the Canary Islands (IAC), intended to explore the feasibility of combining Adaptive Optics with attenuated optical fibers in order to obtain high spatial resolution spectra at the surroundings of a star, as an alternative to coronagraphy. A simplified version with only tip tilt correction has been tested at the OGS telescope in Observatorio del Teide (Canary islands, Spain) and a complete version is intended to be tested at the OGS and at the WHT telescope in Observatorio del Roque de los Muchachos, (Canary Islands, Spain). This paper describes the FPGA-based real time control of the High Order unit, responsible of the computation of the actuation values of a 97-actuactor deformable mirror (11x11) with the information provided by a configurable wavefront sensor of up to 16x16 subpupils at 500 Hz (128x128 pixels). The reconfigurable logic hardware will allow both zonal and modal control approaches, will full access to select which mode loops should be closed and with a number of utilities for influence matrix and open loop response measurements. The system has been designed in a modular way to allow for easy upgrade to faster frame rates (1500 Hz) and bigger wavefront sensors (240x240 pixels), accepting also several interfaces from the WFS and towards the mirror driver. The FPGA-based (Field Programmable Gate Array) real time controller provides bias and flat-fielding corrections, subpupil slopes to modal matrix computation for up to 97 modes, independent servo loop controllers for each mode with user control for independent loop opening or closing, mode to actuator matrix computation and non-common path aberration correction capability. It also provides full housekeeping control via UPD/IP for matrix reloading and full system data logging.
A 256-channel, high throughput and precision time-to-digital converter with a decomposition encoding scheme in a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Song, Z.; Wang, Y.; Kuang, J.

2018-05-01

Field Programmable Gate Arrays (FPGAs) made with 28 nm and more advanced process technology have great potentials for implementation of high precision time-to-digital convertors (TDC), because the delay cells in the tapped delay line (TDL) used for time interpolation are getting smaller and smaller. However, the bubble problems in the TDL status are becoming more complicated, which make it difficult to achieve TDCs on these chips with a high time precision. In this paper, we are proposing a novel decomposition encoding scheme, which not only can solve the bubble problem easily, but also has a high encoding efficiency. The potential of these chips to realize TDC can be fully released with the scheme. In a Xilinx Kintex-7 FPGA chip, we implemented a TDC system with 256 TDC channels, which doubles the number of TDC channels that our previous technique could achieve. Performances of all these TDC channels are evaluated. The average RMS time precision among them is 10.23 ps in the time-interval measurement range of (0–10 ns), and their measurement throughput reaches 277 M measures per second.
High-speed and low-power repeater for VLSI interconnects

NASA Astrophysics Data System (ADS)

Karthikeyan, A.; Mallick, P. S.

2017-10-01

This paper proposes a repeater for boosting the speed of interconnects with low power dissipation. We have designed and implemented at 45 and 32 nm technology nodes. Delay and power dissipation performances are analyzed for various voltage levels at these technology nodes using Spice simulations. A significant reduction in delay and power dissipation are observed compared to a conventional repeater. The results show that the proposed high-speed low-power repeater has a reduced delay for higher load capacitance. The proposed repeater is also compared with LPTG CMOS repeater, and the results shows that the proposed repeater has reduced delay. The proposed repeater can be suitable for high-speed global interconnects and has the capacity to drive large loads.
Spindle speed variation technique in turning operations: Modeling and real implementation

NASA Astrophysics Data System (ADS)

Urbikain, G.; Olvera, D.; de Lacalle, L. N. López; Elías-Zúñiga, A.

2016-11-01

Chatter is still one of the most challenging problems in machining vibrations. Researchers have focused their efforts to prevent, avoid or reduce chatter vibrations by introducing more accurate predictive physical methods. Among them, the techniques based on varying the rotational speed of the spindle (or SSV, Spindle Speed Variation) have gained great relevance. However, several problems need to be addressed due to technical and practical reasons. On one hand, they can generate harmful overheating of the spindle especially at high speeds. On the other hand, the machine may be unable to perform the interpolation properly. Moreover, it is not trivial to select the most appropriate tuning parameters. This paper conducts a study of the real implementation of the SSV technique in turning systems. First, a stability model based on perturbation theory was developed for simulation purposes. Secondly, the procedure to realistically implement the technique in a conventional turning center was tested and developed. The balance between the improved stability margins and acceptable behavior of the spindle is ensured by energy consumption measurements. Mathematical model shows good agreement with experimental cutting tests.
[Hardware Implementation of Numerical Simulation Function of Hodgkin-Huxley Model Neurons Action Potential Based on Field Programmable Gate Array].

PubMed

Wang, Jinlong; Lu, Mai; Hu, Yanwen; Chen, Xiaoqiang; Pan, Qiangqiang

2015-12-01

Neuron is the basic unit of the biological neural system. The Hodgkin-Huxley (HH) model is one of the most realistic neuron models on the electrophysiological characteristic description of neuron. Hardware implementation of neuron could provide new research ideas to clinical treatment of spinal cord injury, bionics and artificial intelligence. Based on the HH model neuron and the DSP Builder technology, in the present study, a single HH model neuron hardware implementation was completed in Field Programmable Gate Array (FPGA). The neuron implemented in FPGA was stimulated by different types of current, the action potential response characteristics were analyzed, and the correlation coefficient between numerical simulation result and hardware implementation result were calculated. The results showed that neuronal action potential response of FPGA was highly consistent with numerical simulation result. This work lays the foundation for hardware implementation of neural network.
Region-Oriented Placement Algorithm for Coarse-Grained Power-Gating FPGA Architecture

NASA Astrophysics Data System (ADS)

Li, Ce; Dong, Yiping; Watanabe, Takahiro

An FPGA plays an essential role in industrial products due to its fast, stable and flexible features. But the power consumption of FPGAs used in portable devices is one of critical issues. Top-down hierarchical design method is commonly used in both ASIC and FPGA design. But, in the case where plural modules are integrated in an FPGA and some of them might be in sleep-mode, current FPGA architecture cannot be fully effective. In this paper, coarse-grained power gating FPGA architecture is proposed where a whole area of an FPGA is partitioned into several regions and power supply is controlled for each region, so that modules in sleep mode can be effectively power-off. We also propose a region oriented FPGA placement algorithm fitted to this user's hierarchical design based on VPR[1]. Simulation results show that this proposed method could reduce power consumption of FPGA by 38% on average by setting unused modules or regions in sleep mode.

Bridging FPGA and GPU technologies for AO real-time control

NASA Astrophysics Data System (ADS)

Perret, Denis; Lainé, Maxime; Bernard, Julien; Gratadour, Damien; Sevin, Arnaud

2016-07-01

Our team has developed a common environment for high performance simulations and real-time control of AO systems based on the use of Graphics Processors Units in the context of the COMPASS project. Such a solution, based on the ability of the real time core in the simulation to provide adequate computing performance, limits the cost of developing AO RTC systems and makes them more scalable. A code developed and validated in the context of the simulation may be injected directly into the system and tested on sky. Furthermore, the use of relatively low cost components also offers significant advantages for the system hardware platform. However, the use of GPUs in an AO loop comes with drawbacks: the traditional way of offloading computation from CPU to GPUs - involving multiple copies and unacceptable overhead in kernel launching - is not well suited in a real time context. This last application requires the implementation of a solution enabling direct memory access (DMA) to the GPU memory from a third party device, bypassing the operating system. This allows this device to communicate directly with the real-time core of the simulation feeding it with the WFS camera pixel stream. We show that DMA between a custom FPGA-based frame-grabber and a computation unit (GPU, FPGA, or Coprocessor such as Xeon-phi) across PCIe allows us to get latencies compatible with what will be needed on ELTs. As a fine-grained synchronization mechanism is not yet made available by GPU vendors, we propose the use of memory polling to avoid interrupts handling and involvement of a CPU. Network and Vision protocols are handled by the FPGA-based Network Interface Card (NIC). We present the results we obtained on a complete AO loop using camera and deformable mirror simulators.
Implementing Legacy-C Algorithms in FPGA Co-Processors for Performance Accelerated Smart Payloads

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.; Hartzell, Christine

2008-01-01

Accurate, on-board classification of instrument data is used to increase science return by autonomously identifying regions of interest for priority transmission or generating summary products to conserve transmission bandwidth. Due to on-board processing constraints, such classification has been limited to using the simplest functions on a small subset of the full instrument data. FPGA co-processor designs for SVM1 classifiers will lead to significant improvement in on-board classification capability and accuracy.
Compact opto-electronic engine for high-speed compressive sensing

NASA Astrophysics Data System (ADS)

Tidman, James; Weston, Tyler; Hewitt, Donna; Herman, Matthew A.; McMackin, Lenore

2013-09-01

The measurement efficiency of Compressive Sensing (CS) enables the computational construction of images from far fewer measurements than what is usually considered necessary by the Nyquist- Shannon sampling theorem. There is now a vast literature around CS mathematics and applications since the development of its theoretical principles about a decade ago. Applications include quantum information to optical microscopy to seismic and hyper-spectral imaging. In the application of shortwave infrared imaging, InView has developed cameras based on the CS single-pixel camera architecture. This architecture is comprised of an objective lens to image the scene onto a Texas Instruments DLP® Micromirror Device (DMD), which by using its individually controllable mirrors, modulates the image with a selected basis set. The intensity of the modulated image is then recorded by a single detector. While the design of a CS camera is straightforward conceptually, its commercial implementation requires significant development effort in optics, electronics, hardware and software, particularly if high efficiency and high-speed operation are required. In this paper, we describe the development of a high-speed CS engine as implemented in a lab-ready workstation. In this engine, configurable measurement patterns are loaded into the DMD at speeds up to 31.5 kHz. The engine supports custom reconstruction algorithms that can be quickly implemented. Our work includes optical path design, Field programmable Gate Arrays for DMD pattern generation, and circuit boards for front end data acquisition, ADC and system control, all packaged in a compact workstation.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-01-01

This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{sup TM} in conjunction with x86 Opteron{sup TM} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Design of video interface conversion system based on FPGA

NASA Astrophysics Data System (ADS)

Zhao, Heng; Wang, Xiang-jun

2014-11-01

This paper presents a FPGA based video interface conversion system that enables the inter-conversion between digital and analog video. Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and single-chip is used as the information interaction control unit between FPGA and PC. The system is able to encode/decode messages from the PC. Technologies including video decoding/encoding circuits, bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion and the Camera Link timing generator module of FPGA are introduced. The system converts Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which will be collected by the video processing unit with Camera Link interface. The processed video signals will then be inputted to system output board and displayed on the monitor.The current experiment shows that it can achieve high-quality video conversion with minimum board size.
ACTS High-Speed VSAT Demonstrated

NASA Technical Reports Server (NTRS)

Tran, Quang K.

1999-01-01

The Advanced Communication Technology Satellite (ACTS) developed by NASA has demonstrated the breakthrough technologies of Ka-band transmission, spot-beam antennas, and onboard processing. These technologies have enabled the development of very small and ultrasmall aperture terminals (VSAT s and USAT's), which have capabilities greater than have been possible with conventional satellite technologies. The ACTS High Speed VSAT (HS VSAT) is an effort at the NASA Glenn Research Center at Lewis Field to experimentally demonstrate the maximum user throughput data rate that can be achieved using the technologies developed and implemented on ACTS. This was done by operating the system uplinks as frequency division multiple access (FDMA), essentially assigning all available time division multiple access (TDMA) time slots to a single user on each of two uplink frequencies. Preliminary results show that, using a 1.2-m antenna in this mode, the High Speed VSAT can achieve between 22 and 24 Mbps of the 27.5 Mbps burst rate, for a throughput efficiency of 80 to 88 percent.
Design of Belief Propagation Based on FPGA for the Multistereo CAFADIS Camera

PubMed Central

Magdaleno, Eduardo; Lüke, Jonás Philipp; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel

2010-01-01

In this paper we describe a fast, specialized hardware implementation of the belief propagation algorithm for the CAFADIS camera, a new plenoptic sensor patented by the University of La Laguna. This camera captures the lightfield of the scene and can be used to find out at which depth each pixel is in focus. The algorithm has been designed for FPGA devices using VHDL. We propose a parallel and pipeline architecture to implement the algorithm without external memory. Although the BRAM resources of the device increase considerably, we can maintain real-time restrictions by using extremely high-performance signal processing capability through parallelism and by accessing several memories simultaneously. The quantifying results with 16 bit precision have shown that performances are really close to the original Matlab programmed algorithm. PMID:22163404
Design of belief propagation based on FPGA for the multistereo CAFADIS camera.

PubMed

Magdaleno, Eduardo; Lüke, Jonás Philipp; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel

2010-01-01

In this paper we describe a fast, specialized hardware implementation of the belief propagation algorithm for the CAFADIS camera, a new plenoptic sensor patented by the University of La Laguna. This camera captures the lightfield of the scene and can be used to find out at which depth each pixel is in focus. The algorithm has been designed for FPGA devices using VHDL. We propose a parallel and pipeline architecture to implement the algorithm without external memory. Although the BRAM resources of the device increase considerably, we can maintain real-time restrictions by using extremely high-performance signal processing capability through parallelism and by accessing several memories simultaneously. The quantifying results with 16 bit precision have shown that performances are really close to the original Matlab programmed algorithm.
FPGA Online Tracking Algorithm for the PANDA Straw Tube Tracker

NASA Astrophysics Data System (ADS)

Liang, Yutie; Ye, Hua; Galuska, Martin J.; Gessler, Thomas; Kuhn, Wolfgang; Lange, Jens Soren; Wagner, Milan N.; Liu, Zhen'an; Zhao, Jingzhou

2017-06-01

A novel FPGA based online tracking algorithm for helix track reconstruction in a solenoidal field, developed for the PANDA spectrometer, is described. Employing the Straw Tube Tracker detector with 4636 straw tubes, the algorithm includes a complex track finder, and a track fitter. Implemented in VHDL, the algorithm is tested on a Xilinx Virtex-4 FX60 FPGA chip with different types of events, at different event rates. A processing time of 7 $\\mu$s per event for an average of 6 charged tracks is obtained. The momentum resolution is about 3\\% (4\\%) for $p_t$ ($p_z$) at 1 GeV/c. Comparing to the algorithm running on a CPU chip (single core Intel Xeon E5520 at 2.26 GHz), an improvement of 3 orders of magnitude in processing time is obtained. The algorithm can handle severe overlapping of events which are typical for interaction rates above 10 MHz.
High-speed bipolar phototransistors in a 180 nm CMOS process.

PubMed

Kostov, P; Gaberl, W; Zimmermann, H

2013-03-01

Several high-speed pnp phototransistors built in a standard 180 nm CMOS process are presented. The phototransistors were implemented in sizes of 40×40 μm 2 and 100×100 μm 2 . Different base and emitter areas lead to different characteristics of the phototransistors. As starting material a p + wafer with a p - epitaxial layer on top was used. The phototransistors were optically characterized at wavelengths of 410, 675 and 850 nm. Bandwidths up to 92 MHz and dynamic responsivities up to 2.95 A/W were achieved. Evaluating the results, we can say that the presented phototransistors are well suited for high speed photosensitive optical applications where inherent amplification is needed. Further on, the standard silicon CMOS implementation opens the possibility for cheap integration of integrated optoelectronic circuits. Possible applications for the presented phototransistors are low cost high speed image sensors, opto-couplers, etc.
Fine-grained parallelism accelerating for RNA secondary structure prediction with pseudoknots based on FPGA.

PubMed

Xia, Fei; Jin, Guoqing

2014-06-01

PKNOTS is a most famous benchmark program and has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard four-dimensional (4D) dynamic programming (DP) method and is the basis of many variants and improved algorithms. Unfortunately, the O(N(6)) computing requirements and complicated data dependency greatly limits the usefulness of PKNOTS package with the explosion in gene database size. In this paper, we present a fine-grained parallel PKNOTS package and prototype system for accelerating RNA folding application based on FPGA chip. We adopted a series of storage optimization strategies to resolve the "Memory Wall" problem. We aggressively exploit parallel computing strategies to improve computational efficiency. We also propose several methods that collectively reduce the storage requirements for FPGA on-chip memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating 4D DP problem for RNA folding application including pseudoknots. The experimental results show a factor of more than 50x average speedup over the PKNOTS-1.08 software running on a PC platform with Intel Core2 Q9400 Quad CPU for input RNA sequences. However, the power consumption of our FPGA accelerator is only about 50% of the general-purpose micro-processors.
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform

NASA Technical Reports Server (NTRS)

White, Michael J.

2004-01-01

This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by off loading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an may of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification, then download to the FPGA, which is then itself modified. This paper will discuss methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
Development of a prototype sensor system for ultra-high-speed LDA-PIV

NASA Astrophysics Data System (ADS)

Griffiths, Jennifer A.; Royle, Gary J.; Bohndiek, Sarah E.; Turchetta, Renato; Chen, Daoyi

2008-04-01

Laser Doppler Anemometry (LDA) and Particle Image Velocimetry (PIV) are commonly used in the analysis of particulates in fluid flows. Despite the successes of these techniques, current instrumentation has placed limitations on the size and shape of the particles undergoing measurement, thus restricting the available data for the many industrial processes now utilising nano/micro particles. Data for spherical and irregularly shaped particles down to the order of 0.1 Âµm is now urgently required. Therefore, an ultra-fast LDA-PIV system is being constructed for the acquisition of this data. A key component of this instrument is the PIV optical detection system. Both the size and speed of the particles under investigation place challenging constraints on the system specifications: magnification is required within the system in order to visualise particles of the size of interest, but this restricts the corresponding field of view in a linearly inverse manner. Thus, for several images of a single particle in a fast fluid flow to be obtained, the image capture rate and sensitivity of the system must be sufficiently high. In order to fulfil the instrumentation criteria, the optical detection system chosen is a high-speed, lensed, digital imaging system based on state-of-the-art CMOS technology - the 'Vanilla' sensor developed by the UK based MI3 consortium. This novel Active Pixel Sensor is capable of high frame rates and sparse readout. When coupled with an image intensifier, it will have single photon detection capabilities. An FPGA based DAQ will allow real-time operation with minimal data transfer.
The RTE inversion on FPGA aboard the solar orbiter PHI instrument

NASA Astrophysics Data System (ADS)

Cobos Carrascosa, J. P.; Aparicio del Moral, B.; Ramos Mas, J. L.; Balaguer, M.; López Jiménez, A. C.; del Toro Iniesta, J. C.

2016-07-01

In this work we propose a multiprocessor architecture to reach high performance in floating point operations by using radiation tolerant FPGA devices, and under narrow time and power constraints. This architecture is used in the PHI instrument that carries out the scientific analysis aboard the ESA's Solar Orbiter mission. The proposed architecture, in a SIMD flavor, is aimed to be an accelerator within the Data Processing Unit (it is composed by a main Leon processor and two FPGAs) for carrying out the RTE inversion on board the spacecraft using a relatively slow FPGA device - Xilinx XQR4VSX55-. The proposed architecture squeezes the FPGA resources in order to reach the computational requirements and improves the ground-based system performance based on commercial CPUs regarding time and power consumption. In this work we demonstrate the feasibility of using this FPGA devices embedded in the SO/PHI instrument. With that goal in mind, we perform tests to evaluate the scientific results and to measure the processing time and power consumption for carrying out the RTE inversion.
A high performance DAC /DDS daughter module for the RHIC LLRF platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hayes, T.; Harvey, M.; Narayan, G.

The RHIC LLRF upgrade is a flexible, modular system. Output signals are generated by a custom designed XMC card with 4 high speed digital to analog (DAC) converters interfaced to a high performance field programmable gate array (FPGA). This paper discusses the hardware details of the XMC DAC board as well as the implementation of a low noise rf synthesizer with digital IQ modulation. This synthesizer also provides injection phase cogging and frequency hop rebucketing capabilities. A new modular RHIC LLRF system was recently designed and commissioned based on custom designed XMC cards. As part of that effort a highmore » speed, four channel DAC board was designed. The board uses Maxim MAX5891 16 bit DACs with a maximum update rate of 600 Msps. Since this module is intended to be used for many different systems throughout the Collider Accelerator complex, it was designed to be as generic as possible. One major application of this DAC card is to implement digital synthesizers to provide drive signals to the various cavities at RHIC. Since RHIC is a storage ring with stores that typically last many hours, extremely low RF noise is a critical requirement. Synchrotron frequencies at RHIC range from a few hertz to several hundred hertz depending on the species and point in the acceleration cycle so close in phase noise is a major concern. The RHIC LLRF system uses the Update Link, a deterministic, high speed data link that broadcasts the revolution frequency and the synchronous phase angle. The digital synthesizers use this data to generate a properly phased analog drive signal. The synthesizers must also provide smooth phase shifts for cogging and support frequency shift rebucketing. One additional feature implemented in the FPGA is a digital waveform generator (WFG) that generates I and Q data pairs based on a user selected amplitude and phase profile as a function of time.« less
FPGA based charge fast histogramming for GEM detector

NASA Astrophysics Data System (ADS)

Poźniak, Krzysztof T.; Byszuk, A.; Chernyshova, M.; Cieszewski, R.; Czarski, T.; Dominik, W.; Jakubowska, K.; Kasprowicz, G.; Rzadkiewicz, J.; Scholz, M.; Zabolotny, W.

2013-10-01

This article presents a fast charge histogramming method for the position sensitive X-ray GEM detector. The energy resolved measurements are carried out simultaneously for 256 channels of the GEM detector. The whole process of histogramming is performed in 21 FPGA chips (Spartan-6 series from Xilinx) . The results of the histogramming process are stored in an external DDR3 memory. The structure of an electronic measuring equipment and a firmware functionality implemented in the FPGAs is described. Examples of test measurements are presented.
A simulation-based study of HighSpeed TCP and its deployment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Souza, Evandro de

2003-05-01

The current congestion control mechanism used in TCP has difficulty reaching full utilization on high speed links, particularly on wide-area connections. For example, the packet drop rate needed to fill a Gigabit pipe using the present TCP protocol is below the currently achievable fiber optic error rates. HighSpeed TCP was recently proposed as a modification of TCP's congestion control mechanism to allow it to achieve reasonable performance in high speed wide-area links. In this research, simulation results showing the performance of HighSpeed TCP and the impact of its use on the present implementation of TCP are presented. Network conditions includingmore » different degrees of congestion, different levels of loss rate, different degrees of bursty traffic and two distinct router queue management policies were simulated. The performance and fairness of HighSpeed TCP were compared to the existing TCP and solutions for bulk-data transfer using parallel streams.« less
Generate stepper motor linear speed profile in real time

NASA Astrophysics Data System (ADS)

Stoychitch, M. Y.

2018-01-01

In this paper we consider the problem of realization of linear speed profile of stepper motors in real time. We considered the general case when changes of speed in the phases of acceleration and deceleration are different. The new and practical algorithm of the trajectory planning is given. The algorithms of the real time speed control which are suitable for realization to the microcontroller and FPGA circuits are proposed. The practical realization one of these algorithms, using Arduino platform, is given also.
Design of FPGA-based radiation tolerant quench detectors for LHC

NASA Astrophysics Data System (ADS)

Steckert, J.; Skoczen, A.

2017-04-01

The Large Hadron Collider (LHC) comprises many superconducting circuits. Most elements of these circuits require active protection. The functionality of the quench detectors was initially implemented as microcontroller based equipment. After the initial stage of the LHC operation with beams the introduction of a new type of quench detector began. This article presents briefly the main ideas and architectures applied to the design and the validation of FPGA-based quench detectors.
VLSI implementation of RSA encryption system using ancient Indian Vedic mathematics

NASA Astrophysics Data System (ADS)

Thapliyal, Himanshu; Srinivas, M. B.

2005-06-01

This paper proposes the hardware implementation of RSA encryption/decryption algorithm using the algorithms of Ancient Indian Vedic Mathematics that have been modified to improve performance. The recently proposed hierarchical overlay multiplier architecture is used in the RSA circuitry for multiplication operation. The most significant aspect of the paper is the development of a division architecture based on Straight Division algorithm of Ancient Indian Vedic Mathematics and embedding it in RSA encryption/decryption circuitry for improved efficiency. The coding is done in Verilog HDL and the FPGA synthesis is done using Xilinx Spartan library. The results show that RSA circuitry implemented using Vedic division and multiplication is efficient in terms of area/speed compared to its implementation using conventional multiplication and division architectures.

High-Modulation-Speed LEDs Based on III-Nitride

NASA Astrophysics Data System (ADS)

Chen, Hong

III-nitride InGaN light-emitting diodes (LEDs) enable wide range of applications in solid-state lighting, full-color displays, and high-speed visible-light communication. Conventional InGaN quantum well LEDs grown on polar c-plane substrate suffer from quantum confined Stark effect due to the large internal polarization-related fields, leading to a reduced radiative recombination rate and device efficiency, which limits the performance of InGaN LEDs in high-speed communication applications. To circumvent these negative effects, non-trivial-cavity designs such as flip-chip LEDs, metallic grating coated LEDs are proposed. This oral defense will show the works on the high-modulation-speed LEDs from basic ideas to applications. Fundamental principles such as rate equations for LEDs/laser diodes (LDs), plasmonic effects, Purcell effects will be briefly introduced. For applications, the modal properties of flip-chip LEDs are solved by implementing finite difference method in order to study the modulation response. The emission properties of highly polarized InGaN LEDs coated by metallic gratings are also investigated by finite difference time domain method.
Miniature high speed compressor having embedded permanent magnet motor

NASA Technical Reports Server (NTRS)

Zhou, Lei (Inventor); Zheng, Liping (Inventor); Chow, Louis (Inventor); Kapat, Jayanta S. (Inventor); Wu, Thomas X. (Inventor); Kota, Krishna M. (Inventor); Li, Xiaoyi (Inventor); Acharya, Dipjyoti (Inventor)

2011-01-01

A high speed centrifugal compressor for compressing fluids includes a permanent magnet synchronous motor (PMSM) having a hollow shaft, the being supported on its ends by ball bearing supports. A permanent magnet core is embedded inside the shaft. A stator with a winding is located radially outward of the shaft. The PMSM includes a rotor including at least one impeller secured to the shaft or integrated with the shaft as a single piece. The rotor is a high rigidity rotor providing a bending mode speed of at least 100,000 RPM which advantageously permits implementation of relatively low-cost ball bearing supports.
Tuple spaces in hardware for accelerated implicit routing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Zachary Kent; Tripp, Justin

2010-12-01

Organizing and optimizing data objects on networks with support for data migration and failing nodes is a complicated problem to handle as systems grow. The goal of this work is to demonstrate that high levels of speedup can be achieved by moving responsibility for finding, fetching, and staging data into an FPGA-based network card. We present a system for implicit routing of data via FPGA-based network cards. In this system, data structures are requested by name, and the network of FPGAs finds the data within the network and relays the structure to the requester. This is acheived through successive examinationmore » of hardware hash tables implemented in the FPGA. By avoiding software stacks between nodes, the data is quickly fetched entirely through FPGA-FPGA interaction. The performance of this system is orders of magnitude faster than software implementations due to the improved speed of the hash tables and lowered latency between the network nodes.« less
77 FR 14346 - Proposed Information Collection; Comment Request; Implementation of Vessel Speed Restrictions To...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-03-09

... Collection; Comment Request; Implementation of Vessel Speed Restrictions To Reduce the Threat of Ship... implementing speed restrictions to reduce the incidence and severity of ship collisions with North Atlantic... document that a deviation from the 10-knot speed limit was necessary for safe maneuverability under certain...
An FPGA-Based Real-Time Maximum Likelihood 3D Position Estimation for a Continuous Crystal PET Detector

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Xiao, Yong; Cheng, Xinyi; Li, Deng; Wang, Liwei

2016-02-01

For the continuous crystal-based positron emission tomography (PET) detector built in our lab, a maximum likelihood algorithm adapted for implementation on a field programmable gate array (FPGA) is proposed to estimate the three-dimensional (3D) coordinate of interaction position with the single-end detected scintillation light response. The row-sum and column-sum readout scheme organizes the 64 channels of photomultiplier (PMT) into eight row signals and eight column signals to be readout for X- and Y-coordinates estimation independently. By the reference events irradiated in a known oblique angle, the probability density function (PDF) for each depth-of-interaction (DOI) segment is generated, by which the reference events in perpendicular irradiation are assigned to DOI segments for generating the PDFs for X and Y estimation in each DOI layer. Evaluated by the experimental data, the algorithm achieves an average X resolution of 1.69 mm along the central X-axis, and DOI resolution of 3.70 mm over the whole thickness (0-10 mm) of crystal. The performance improvements from 2D estimation to the 3D algorithm are also presented. Benefiting from abundant resources of FPGA and a hierarchical storage arrangement, the whole algorithm can be implemented into a middle-scale FPGA. By a parallel structure in pipelines, the 3D position estimator on the FPGA can achieve a processing throughput of 15 M events/s, which is sufficient for the requirement of real-time PET imaging.
Re-Form: FPGA-Powered True Codesign Flow for High-Performance Computing In The Post-Moore Era

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cappello, Franck; Yoshii, Kazutomo; Finkel, Hal

Multicore scaling will end soon because of practical power limits. Dark silicon is becoming a major issue even more than the end of Moore’s law. In the post-Moore era, the energy efficiency of computing will be a major concern. FPGAs could be a key to maximizing the energy efficiency. In this paper we address severe challenges in the adoption of FPGA in HPC and describe “Re-form,” an FPGA-powered codesign flow.
Volumetric visualization algorithm development for an FPGA-based custom computing machine

NASA Astrophysics Data System (ADS)

Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

1998-05-01

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
The analysis and compensation of errors of precise simple harmonic motion control under high speed and large load conditions based on servo electric cylinder

NASA Astrophysics Data System (ADS)

Ma, Chen-xi; Ding, Guo-qing

2017-10-01

Simple harmonic waves and synthesized simple harmonic waves are widely used in the test of instruments. However, because of the errors caused by clearance of gear and time-delay error of FPGA, it is difficult to control servo electric cylinder in precise simple harmonic motion under high speed, high frequency and large load conditions. To solve the problem, a method of error compensation is proposed in this paper. In the method, a displacement sensor is fitted on the piston rod of the electric cylinder. By using the displacement sensor, the real-time displacement of the piston rod is obtained and fed back to the input of servo motor, then a closed loop control is realized. There is compensation of pulses in the next period of the synthetic waves. This paper uses FPGA as the processing core. The software mainly comprises a waveform generator, an Ethernet module, a memory module, a pulse generator, a pulse selector, a protection module, an error compensation module. A durability of shock absorbers is used as the testing platform. The durability mainly comprises a single electric cylinder, a servo motor for driving the electric cylinder, and the servo motor driver.
DNA Assembly with De Bruijn Graphs Using an FPGA Platform.

PubMed

Poirier, Carl; Gosselin, Benoit; Fortier, Paul

2018-01-01

This paper presents an FPGA implementation of a DNA assembly algorithm, called Ray, initially developed to run on parallel CPUs. The OpenCL language is used and the focus is placed on modifying and optimizing the original algorithm to better suit the new parallelization tool and the radically different hardware architecture. The results show that the execution time is roughly one fourth that of the CPU and factoring energy consumption yields a tenfold savings.
In-situ FPGA debug driven by on-board microcontroller

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Zachary Kent

2009-01-01

Often we are faced with the situation that the behavior of a circuit changes in an unpredictable way when chassis cover is attached or the system is not easily accessible. For instance, in a deployed environment, such as space, hardware can malfunction in unpredictable ways. What can a designer do to ascertain the cause of the problem? Register interrogations only go so far, and sometimes the problem being debugged is register transactions themselves, or the problem lies in FPGA programming. This work provides a solution to this; namely, the ability to drive a JTAG chain via an on-board microcontroller andmore » use a simple clone of the Xilinx Chipscope core without a Xilinx JTAG cable or any external interfaces required. We have demonstrated the functionality of the prototype system using a Xilinx Spartan 3E FPGA and a Microchip PIC18j2550 microcontroller. This paper will discuss the implementation details as well as present case studies describing how the tools have aided satellite hardware development.« less
Anti Theft Mechanism Through Face recognition Using FPGA

NASA Astrophysics Data System (ADS)

Sundari, Y. B. T.; Laxminarayana, G.; Laxmi, G. Vijaya

2012-11-01

The use of vehicle is must for everyone. At the same time, protection from theft is also very important. Prevention of vehicle theft can be done remotely by an authorized person. The location of the car can be found by using GPS and GSM controlled by FPGA. In this paper, face recognition is used to identify the persons and comparison is done with the preloaded faces for authorization. The vehicle will start only when the authorized personís face is identified. In the event of theft attempt or unauthorized personís trial to drive the vehicle, an MMS/SMS will be sent to the owner along with the location. Then the authorized person can alert the security personnel for tracking and catching the vehicle. For face recognition, a Principal Component Analysis (PCA) algorithm is developed using MATLAB. The control technique for GPS and GSM is developed using VHDL over SPTRAN 3E FPGA. The MMS sending method is written in VB6.0. The proposed application can be implemented with some modifications in the systems wherever the face recognition or detection is needed like, airports, international borders, banking applications etc.
High-speed spectral calibration by complex FIR filter in phase-sensitive optical coherence tomography.

PubMed

Kim, Sangmin; Raphael, Patrick D; Oghalai, John S; Applegate, Brian E

2016-04-01

Swept-laser sources offer a number of advantages for Phase-sensitive Optical Coherence Tomography (PhOCT). However, inter- and intra-sweep variability leads to calibration errors that adversely affect phase sensitivity. While there are several approaches to overcoming this problem, our preferred method is to simply calibrate every sweep of the laser. This approach offers high accuracy and phase stability at the expense of a substantial processing burden. In this approach, the Hilbert phase of the interferogram from a reference interferometer provides the instantaneous wavenumber of the laser, but is computationally expensive. Fortunately, the Hilbert transform may be approximated by a Finite Impulse-Response (FIR) filter. Here we explore the use of several FIR filter based Hilbert transforms for calibration, explicitly considering the impact of filter choice on phase sensitivity and OCT image quality. Our results indicate that the complex FIR filter approach is the most robust and accurate among those considered. It provides similar image quality and slightly better phase sensitivity than the traditional FFT-IFFT based Hilbert transform while consuming fewer resources in an FPGA implementation. We also explored utilizing the Hilbert magnitude of the reference interferogram to calculate an ideal window function for spectral amplitude calibration. The ideal window function is designed to carefully control sidelobes on the axial point spread function. We found that after a simple chromatic correction, calculating the window function using the complex FIR filter and the reference interferometer gave similar results to window functions calculated using a mirror sample and the FFT-IFFT Hilbert transform. Hence, the complex FIR filter can enable accurate and high-speed calibration of the magnitude and phase of spectral interferograms.
High-speed spectral calibration by complex FIR filter in phase-sensitive optical coherence tomography

PubMed Central

Kim, Sangmin; Raphael, Patrick D.; Oghalai, John S.; Applegate, Brian E.

2016-01-01

Swept-laser sources offer a number of advantages for Phase-sensitive Optical Coherence Tomography (PhOCT). However, inter- and intra-sweep variability leads to calibration errors that adversely affect phase sensitivity. While there are several approaches to overcoming this problem, our preferred method is to simply calibrate every sweep of the laser. This approach offers high accuracy and phase stability at the expense of a substantial processing burden. In this approach, the Hilbert phase of the interferogram from a reference interferometer provides the instantaneous wavenumber of the laser, but is computationally expensive. Fortunately, the Hilbert transform may be approximated by a Finite Impulse-Response (FIR) filter. Here we explore the use of several FIR filter based Hilbert transforms for calibration, explicitly considering the impact of filter choice on phase sensitivity and OCT image quality. Our results indicate that the complex FIR filter approach is the most robust and accurate among those considered. It provides similar image quality and slightly better phase sensitivity than the traditional FFT-IFFT based Hilbert transform while consuming fewer resources in an FPGA implementation. We also explored utilizing the Hilbert magnitude of the reference interferogram to calculate an ideal window function for spectral amplitude calibration. The ideal window function is designed to carefully control sidelobes on the axial point spread function. We found that after a simple chromatic correction, calculating the window function using the complex FIR filter and the reference interferometer gave similar results to window functions calculated using a mirror sample and the FFT-IFFT Hilbert transform. Hence, the complex FIR filter can enable accurate and high-speed calibration of the magnitude and phase of spectral interferograms. PMID:27446666
Optoelectronic date acquisition system based on FPGA

NASA Astrophysics Data System (ADS)

Li, Xin; Liu, Chunyang; Song, De; Tong, Zhiguo; Liu, Xiangqing

2015-11-01

An optoelectronic date acquisition system is designed based on FPGA. FPGA chip that is EP1C3T144C8 of Cyclone devices from Altera corporation is used as the centre of logic control, XTP2046 chip is used as A/D converter, host computer that communicates with the date acquisition system through RS-232 serial communication interface are used as display device and photo resistance is used as photo sensor. We use Verilog HDL to write logic control code about FPGA. It is proved that timing sequence is correct through the simulation of ModelSim. Test results indicate that this system meets the design requirement, has fast response and stable operation by actual hardware circuit test.
Design of video processing and testing system based on DSP and FPGA

NASA Astrophysics Data System (ADS)

Xu, Hong; Lv, Jun; Chen, Xi'ai; Gong, Xuexia; Yang, Chen'na

2007-12-01

Based on high speed Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA), a video capture, processing and display system is presented, which is of miniaturization and low power. In this system, a triple buffering scheme was used for the capture and display, so that the application can always get a new buffer without waiting; The Digital Signal Processor has an image process ability and it can be used to test the boundary of workpiece's image. A video graduation technology is used to aim at the position which is about to be tested, also, it can enhance the system's flexibility. The character superposition technology realized by DSP is used to display the test result on the screen in character format. This system can process image information in real time, ensure test precision, and help to enhance product quality and quality management.
High speed fiber optics local area networks: Design and implementation

NASA Technical Reports Server (NTRS)

Tobagi, Fouad A.

1988-01-01

The design of high speed local area networks (HSLAN) for communication among distributed devices requires solving problems in three areas: (1) the network medium and its topology; (2) the medium access control; and (3) the network interface. Considerable progress has been made in all areas. Accomplishments are divided into two groups according to their theoretical or experimental nature. A brief summary is given in Section 2, including references to papers which appeared in the literature, as well as to Ph.D. dissertations and technical reports published at Stanford University.
FPGA-Based, Self-Checking, Fault-Tolerant Computers

NASA Technical Reports Server (NTRS)

Some, Raphael; Rennels, David

2004-01-01

A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and highspeed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant- computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing
High speed transient sampler

DOEpatents

McEwan, Thomas E.

1995-01-01

A high speed sampler comprises a meandered sample transmission line for transmitting an input signal, a straight strobe transmission line for transmitting a strobe signal, and a plurality of sampling gates along the transmission lines. The sampling gates comprise a four terminal diode bridge having a first strobe resistor connected from a first terminal of the bridge to the positive strobe line, a second strobe resistor coupled from the third terminal of the bridge to the negative strobe line, a tap connected to the second terminal of the bridge and to the sample transmission line, and a sample holding capacitor connected to the fourth terminal of the bridge. The resistance of the first and second strobe resistors is much higher than the signal transmission line impedance in the preferred system. This results in a sampling gate which applies a very small load on the sample transmission line and on the strobe generator. The sample holding capacitor is implemented using a smaller capacitor and a larger capacitor isolated from the smaller capacitor by resistance. The high speed sampler of the present invention is also characterized by other optimizations, including transmission line tap compensation, stepped impedance strobe line, a multi-layer physical layout, and unique strobe generator design. A plurality of banks of such samplers are controlled for concatenated or interleaved sample intervals to achieve long sample lengths or short sample spacing.
Digital hardware implementation of a stochastic two-dimensional neuron model.

PubMed

Grassia, F; Kohno, T; Levi, T

2016-11-01

This study explores the feasibility of stochastic neuron simulation in digital systems (FPGA), which realizes an implementation of a two-dimensional neuron model. The stochasticity is added by a source of current noise in the silicon neuron using an Ornstein-Uhlenbeck process. This approach uses digital computation to emulate individual neuron behavior using fixed point arithmetic operation. The neuron model's computations are performed in arithmetic pipelines. It was designed in VHDL language and simulated prior to mapping in the FPGA. The experimental results confirmed the validity of the developed stochastic FPGA implementation, which makes the implementation of the silicon neuron more biologically plausible for future hybrid experiments. Copyright © 2017 Elsevier Ltd. All rights reserved.
DSP+FPGA-based real-time histogram equalization system of infrared image

NASA Astrophysics Data System (ADS)

Gu, Dongsheng; Yang, Nansheng; Pi, Defu; Hua, Min; Shen, Xiaoyan; Zhang, Ruolan

2001-10-01

Histogram Modification is a simple but effective method to enhance an infrared image. There are several methods to equalize an infrared image's histogram due to the different characteristics of the different infrared images, such as the traditional HE (Histogram Equalization) method, and the improved HP (Histogram Projection) and PE (Plateau Equalization) method and so on. If to realize these methods in a single system, the system must have a mass of memory and extremely fast speed. In our system, we introduce a DSP + FPGA based real-time procession technology to do these things together. FPGA is used to realize the common part of these methods while DSP is to do the different part. The choice of methods and the parameter can be input by a keyboard or a computer. By this means, the function of the system is powerful while it is easy to operate and maintain. In this article, we give out the diagram of the system and the soft flow chart of the methods. And at the end of it, we give out the infrared image and its histogram before and after the process of HE method.

An efficient HW and SW design of H.264 video compression, storage and playback on FPGA devices for handheld thermal imaging systems

NASA Astrophysics Data System (ADS)

Gunay, Omer; Ozsarac, Ismail; Kamisli, Fatih

2017-05-01

Video recording is an essential property of new generation military imaging systems. Playback of the stored video on the same device is also desirable as it provides several operational benefits to end users. Two very important constraints for many military imaging systems, especially for hand-held devices and thermal weapon sights, are power consumption and size. To meet these constraints, it is essential to perform most of the processing applied to the video signal, such as preprocessing, compression, storing, decoding, playback and other system functions on a single programmable chip, such as FPGA, DSP, GPU or ASIC. In this work, H.264/AVC (Advanced Video Coding) compatible video compression, storage, decoding and playback blocks are efficiently designed and implemented on FPGA platforms using FPGA fabric and Altera NIOS II soft processor. Many subblocks that are used in video encoding are also used during video decoding in order to save FPGA resources and power. Computationally complex blocks are designed using FPGA fabric, while blocks such as SD card write/read, H.264 syntax decoding and CAVLC decoding are done using NIOS processor to benefit from software flexibility. In addition, to keep power consumption low, the system was designed to require limited external memory access. The design was tested using 640x480 25 fps thermal camera on CYCLONE V FPGA, which is the ALTERA's lowest power FPGA family, and consumes lower than 40% of CYCLONE V 5CEFA7 FPGA resources on average.
FPGA wavelet processor design using language for instruction-set architectures (LISA)

NASA Astrophysics Data System (ADS)

Meyer-Bäse, Uwe; Vera, Alonzo; Rao, Suhasini; Lenk, Karl; Pattichis, Marios

2007-04-01

The design of an microprocessor is a long, tedious, and error-prone task consisting of typically three design phases: architecture exploration, software design (assembler, linker, loader, profiler), architecture implementation (RTL generation for FPGA or cell-based ASIC) and verification. The Language for instruction-set architectures (LISA) allows to model a microprocessor not only from instruction-set but also from architecture description including pipelining behavior that allows a design and development tool consistency over all levels of the design. To explore the capability of the LISA processor design platform a.k.a. CoWare Processor Designer we present in this paper three microprocessor designs that implement a 8/8 wavelet transform processor that is typically used in today's FBI fingerprint compression scheme. We have designed a 3 stage pipelined 16 bit RISC processor (NanoBlaze). Although RISC μPs are usually considered "fast" processors due to design concept like constant instruction word size, deep pipelines and many general purpose registers, it turns out that DSP operations consume essential processing time in a RISC processor. In a second step we have used design principles from programmable digital signal processor (PDSP) to improve the throughput of the DWT processor. A multiply-accumulate operation along with indirect addressing operation were the key to achieve higher throughput. A further improvement is possible with today's FPGA technology. Today's FPGAs offer a large number of embedded array multipliers and it is now feasible to design a "true" vector processor (TVP). A multiplication of two vectors can be done in just one clock cycle with our TVP, a complete scalar product in two clock cycles. Code profiling and Xilinx FPGA ISE synthesis results are provided that demonstrate the essential improvement that a TVP has compared with traditional RISC or PDSP designs.
High speed, high performance, portable, dual-channel, optical fiber Bragg grating (FBG) demodulator

NASA Astrophysics Data System (ADS)

Zhang, Hongtao; Wei, Zhanxiong; Fan, Lingling; Wang, Pengfei; Zhao, Xilin; Wang, Zhenhua; Yang, Shangming; Cui, Hong-Liang

2009-10-01

A high speed, high performance, portable, dual-channel, optical Fiber Bragg Grating demodulator based on fiber Fabry- Pérot tunable filter (FFP-FT) is reported in this paper. The high speed demodulation can be achieved to detect the dynamical loads of vehicles with speed of 15 mph. However, the drifts of piezoelectric transducer (PZT) in the cavity of FFP-FT dramatically degrade the stability of system. Two schemes are implemented to improve the stability of system. Firstly, a temperature control system is installed to effectively remove the thermal drifts of PZT. Secondly, a scheme of changing the bias voltage of FFP-FT to restrain non-thermal drifts has been realized at lab and will be further developed to an automatic control system based on microcontroller. Although this demodulator is originally used in Weight-In- Motion (WIM) sensing system, it can be extended into other aspects and the schemes presented in this paper will be useful in many applications.
Multichannel FPGA-Based Data-Acquisition-System for Time-Resolved Synchrotron Radiation Experiments

NASA Astrophysics Data System (ADS)

Choe, Hyeokmin; Gorfman, Semen; Heidbrink, Stefan; Pietsch, Ullrich; Vogt, Marco; Winter, Jens; Ziolkowski, Michael

2017-06-01

The aim of this contribution is to describe our recent development of a novel compact field-programmable gatearray (FPGA)-based data acquisition (DAQ) system for use with multichannel X-ray detectors at synchrotron radiation facilities. The system is designed for time resolved counting of single photons arriving from several-currently 12-independent detector channels simultaneously. Detector signals of at least 2.8 ns duration are latched by asynchronous logic and then synchronized with the system clock of 100 MHz. The incoming signals are subsequently sorted out into 10 000 time-bins where they are counted. This occurs according to the arrival time of photons with respect to the trigger signal. Repeatable mode of triggered operation is used to achieve high statistic of accumulated counts. The time-bin width is adjustable from 10 ns to 1 ms. In addition, a special mode of operation with 2 ns time resolution is provided for two detector channels. The system is implemented in a pocketsize FPGA-based hardware of 10 cm × 10 cm × 3 cm and thus can easily be transported between synchrotron radiation facilities. For setup of operation and data read-out, the hardware is connected via USB interface to a portable control computer. DAQ applications are provided in both LabVIEW and MATLAB environments.
FPGA based digital phase-coding quantum key distribution system

NASA Astrophysics Data System (ADS)

Lu, XiaoMing; Zhang, LiJun; Wang, YongGang; Chen, Wei; Huang, DaJun; Li, Deng; Wang, Shuang; He, DeYong; Yin, ZhenQiang; Zhou, Yu; Hui, Cong; Han, ZhengFu

2015-12-01

Quantum key distribution (QKD) is a technology with the potential capability to achieve information-theoretic security. Phasecoding is an important approach to develop practical QKD systems in fiber channel. In order to improve the phase-coding modulation rate, we proposed a new digital-modulation method in this paper and constructed a compact and robust prototype of QKD system using currently available components in our lab to demonstrate the effectiveness of the method. The system was deployed in laboratory environment over a 50 km fiber and continuously operated during 87 h without manual interaction. The quantum bit error rate (QBER) of the system was stable with an average value of 3.22% and the secure key generation rate is 8.91 kbps. Although the modulation rate of the photon in the demo system was only 200 MHz, which was limited by the Faraday-Michelson interferometer (FMI) structure, the proposed method and the field programmable gate array (FPGA) based electronics scheme have a great potential for high speed QKD systems with Giga-bits/second modulation rate.
High-speed reconstruction of compressed images

NASA Astrophysics Data System (ADS)

Cox, Jerome R., Jr.; Moore, Stephen M.

1990-07-01

A compression scheme is described that allows high-definition radiological images with greater than 8-bit intensity resolution to be represented by 8-bit pixels. Reconstruction of the images with their original intensity resolution can be carried out by means of a pipeline architecture suitable for compact, high-speed implementation. A reconstruction system is described that can be fabricated according to this approach and placed between an 8-bit display buffer and the display's video system thereby allowing contrast control of images at video rates. Results for 50 CR chest images are described showing that error-free reconstruction of the original 10-bit CR images can be achieved.
On the implementation of an auxiliary pantograph for speed increase on existing lines

NASA Astrophysics Data System (ADS)

Liu, Zhendong; Jönsson, Per-Anders; Stichel, Sebastian; Rønnquist, Anders

2016-08-01

The contact between pantograph and catenary at high speeds suffers from high dynamic contact force variation due to stiffness variations and wave propagation. To increase operational speed on an existing catenary system, especially for soft catenary systems, technical upgrading is usually necessary. Therefore, it is desirable to explore a more practical and cost-saving method to increase the operational speed. Based on a 3D pantograph-catenary finite element model, a parametric study on two-pantograph operation with short spacing distances at high speeds shows that, although the performance of the leading pantograph gets deteriorated, the trailing pantograph feels an improvement if pantographs are spaced at a proper distance. Then, two main positive effects, which can cause the improvement, are addressed. Based on a discussion on wear mechanisms, this paper suggests to use the leading pantograph as an auxiliary pantograph, which does not conduct any electric current, to minimise additional wear caused by the leading pantograph. To help implementation and achieve further improvement under this working condition, this paper investigates cases with optimised uplift force on the leading pantograph and with system parameter deviations. The results show that the two positive effects still remain even with some system parameter deviations. About 30% of speed increase should be possibly achieved still sustaining a good dynamic performance with help of the optimised uplift force.
Real-time implementing wavefront reconstruction for adaptive optics

NASA Astrophysics Data System (ADS)

Wang, Caixia; Li, Mei; Wang, Chunhong; Zhou, Luchun; Jiang, Wenhan

2004-12-01

The capability of real time wave-front reconstruction is important for an adaptive optics (AO) system. The bandwidth of system and the real-time processing ability of the wave-front processor is mainly affected by the speed of calculation. The system requires enough number of subapertures and high sampling frequency to compensate atmospheric turbulence. The number of reconstruction operation is increased accordingly. Since the performance of AO system improves with the decrease of calculation latency, it is necessary to study how to increase the speed of wavefront reconstruction. There are two methods to improve the real time of the reconstruction. One is to convert the wavefront reconstruction matrix, such as by wavelet or FFT. The other is enhancing the performance of the processing element. Analysis shows that the latency cutting is performed with the cost of reconstruction precision by the former method. In this article, the latter method is adopted. From the characteristic of the wavefront reconstruction algorithm, a systolic array by FPGA is properly designed to implement real-time wavefront reconstruction. The system delay is reduced greatly by the utilization of pipeline and parallel processing. The minimum latency of reconstruction is the reconstruction calculation of one subaperture.
A Gigabit-per-Second Ka-Band Demonstration Using a Reconfigurable FPGA Modulator

NASA Technical Reports Server (NTRS)

Lee, Dennis; Gray, Andrew A.; Kang, Edward C.; Tsou, Haiping; Lay, Norman E.; Fong, Wai; Fisher, Dave; Hoy, Scott

2005-01-01

Gigabit-per-second communications have been a desired target for future NASA Earth science missions, and for potential manned lunar missions. Frequency bandwidth at S-band and X-band is typically insufficient to support missions at these high data rates. In this paper, we present the results of a 1 Gbps 32-QAM end-to-end experiment at Ka-band using a reconfigurable Field Programmable Gate Array (FPGA) baseband modulator board. Bit error rate measurements of the received signal using a software receiver demonstrate the feasibility of using ultra-high data rates at Ka-band, although results indicate that error correcting coding and/or modulator predistortion must be implemented in addition. Also, results of the demonstration validate the low-cost, MOS-based reconfigurable modulator approach taken to development of a high rate modulator, as opposed to more expensive ASIC or pure analog approaches.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters.

PubMed

Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Serrano, Alejandro; Godoy, Jorge; Martínez-Álvarez, Antonio; Villagra, Jorge

2017-11-11

Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters

PubMed Central

Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Godoy, Jorge; Martínez-Álvarez, Antonio

2017-01-01

Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle. PMID:29137137
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.

PubMed

Ly, Daniel Le; Chow, Paul

2010-11-01

Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
Development of an FPGA-based multipoint laser pyroshock measurement system for explosive bolts

NASA Astrophysics Data System (ADS)

Abbas, Syed Haider; Jang, Jae-Kyeong; Lee, Jung-Ryul; Kim, Zaeill

2016-07-01

Pyroshock can cause failure to the objective of an aerospace structure by damaging its sensitive electronic equipment, which is responsible for performing decisive operations. A pyroshock is the high intensity shock wave that is generated when a pyrotechnic device is explosively triggered to separate, release, or activate structural subsystems of an aerospace architecture. Pyroshock measurement plays an important role in experimental simulations to understand the characteristics of pyroshock on the host structure. This paper presents a technology to measure a pyroshock wave at multiple points using laser Doppler vibrometers (LDVs). These LDVs detect the pyroshock wave generated due to an explosive-based pyrotechnical event. Field programmable gate array (FPGA) based data acquisition is used in the study to acquire pyroshock signals simultaneously from multiple channels. This paper describes the complete system design for multipoint pyroshock measurement. The firmware architecture for the implementation of multichannel data acquisition on an FPGA-based development board is also discussed. An experiment using explosive bolts was configured to test the reliability of the system. Pyroshock was generated using explosive excitation on a 22-mm-thick steel plate. Three LDVs were deployed to capture the pyroshock wave at different points. The pyroshocks captured were displayed as acceleration plots. The results showed that our system effectively captured the pyroshock wave with a peak-to-peak magnitude of 303 741 g. The contribution of this paper is a specialized architecture of firmware design programmed in FPGA for data acquisition of large amount of multichannel pyroshock data. The advantages of the developed system are the near-field, multipoint, non-contact, and remote measurement of a pyroshock wave, which is dangerous and expensive to produce in aerospace pyrotechnic tests.
Design and implementation of digital controllers for smart structures using field-programmable gate arrays

NASA Astrophysics Data System (ADS)

Kelly, Jamie S.; Bowman, Hiroshi C.; Rao, Vittal S.; Pottinger, Hardy J.

1997-06-01

Implementation issues represent an unfamiliar challenge to most control engineers, and many techniques for controller design ignore these issues outright. Consequently, the design of controllers for smart structural systems usually proceeds without regard for their eventual implementation, thus resulting either in serious performance degradation or in hardware requirements that squander power, complicate integration, and drive up cost. The level of integration assumed by the Smart Patch further exacerbates these difficulties, and any design inefficiency may render the realization of a single-package sensor-controller-actuator system infeasible. The goal of this research is to automate the controller implementation process and to relieve the design engineer of implementation concerns like quantization, computational efficiency, and device selection. We specifically target Field Programmable Gate Arrays (FPGAs) as our hardware platform because these devices are highly flexible, power efficient, and reprogrammable. The current study develops an automated implementation sequence that minimizes hardware requirements while maintaining controller performance. Beginning with a state space representation of the controller, the sequence automatically generates a configuration bitstream for a suitable FPGA implementation. MATLAB functions optimize and simulate the control algorithm before translating it into the VHSIC hardware description language. These functions improve power efficiency and simplify integration in the final implementation by performing a linear transformation that renders the controller computationally friendly. The transformation favors sparse matrices in order to reduce multiply operations and the hardware necessary to support them; simultaneously, the remaining matrix elements take on values that minimize limit cycles and parameter sensitivity. The proposed controller design methodology is implemented on a simple cantilever beam test structure using FPGA
Design of transient light signal simulator based on FPGA

NASA Astrophysics Data System (ADS)

Kang, Jing; Chen, Rong-li; Wang, Hong

2014-11-01

A design scheme of transient light signal simulator based on Field Programmable gate Array (FPGA) was proposed in this paper. Based on the characteristics of transient light signals and measured feature points of optical intensity signals, a fitted curve was created in MATLAB. And then the wave data was stored in a programmed memory chip AT29C1024 by using SUPERPRO programmer. The control logic was realized inside one EP3C16 FPGA chip. Data readout, data stream cache and a constant current buck regulator for powering high-brightness LEDs were all controlled by FPGA. A 12-Bit multiplying CMOS digital-to-analog converter (DAC) DAC7545 and an amplifier OPA277 were used to convert digital signals to voltage signals. A voltage-controlled current source constituted by a NPN transistor and an operational amplifier controlled LED array diming to achieve simulation of transient light signal. LM3405A, 1A Constant Current Buck Regulator for Powering LEDs, was used to simulate strong background signal in space. Experimental results showed that the scheme as a transient light signal simulator can satisfy the requests of the design stably.
High speed transient sampler

DOEpatents

McEwan, T.E.

1995-11-28

A high speed sampler comprises a meandered sample transmission line for transmitting an input signal, a straight strobe transmission line for transmitting a strobe signal, and a plurality of sampling gates along the transmission lines. The sampling gates comprise a four terminal diode bridge having a first strobe resistor connected from a first terminal of the bridge to the positive strobe line, a second strobe resistor coupled from the third terminal of the bridge to the negative strobe line, a tap connected to the second terminal of the bridge and to the sample transmission line, and a sample holding capacitor connected to the fourth terminal of the bridge. The resistance of the first and second strobe resistors is much higher than the signal transmission line impedance in the preferred system. This results in a sampling gate which applies a very small load on the sample transmission line and on the strobe generator. The sample holding capacitor is implemented using a smaller capacitor and a larger capacitor isolated from the smaller capacitor by resistance. The high speed sampler of the present invention is also characterized by other optimizations, including transmission line tap compensation, stepped impedance strobe line, a multi-layer physical layout, and unique strobe generator design. A plurality of banks of such samplers are controlled for concatenated or interleaved sample intervals to achieve long sample lengths or short sample spacing. 17 figs.
Enhancement of DRPE performance with a novel scheme based on new RAC: Principle, security analysis and FPGA implementation

NASA Astrophysics Data System (ADS)

Neji, N.; Jridi, M.; Alfalou, A.; Masmoudi, N.

2016-02-01

The double random phase encryption (DRPE) method is a well-known all-optical architecture which has many advantages especially in terms of encryption efficiency. However, the method presents some vulnerabilities against attacks and requires a large quantity of information to encode the complex output plane. In this paper, we present an innovative hybrid technique to enhance the performance of DRPE method in terms of compression and encryption. An optimized simultaneous compression and encryption method is applied simultaneously on the real and imaginary components of the DRPE output plane. The compression and encryption technique consists in using an innovative randomized arithmetic coder (RAC) that can well compress the DRPE output planes and at the same time enhance the encryption. The RAC is obtained by an appropriate selection of some conditions in the binary arithmetic coding (BAC) process and by using a pseudo-random number to encrypt the corresponding outputs. The proposed technique has the capabilities to process video content and to be standard compliant with modern video coding standards such as H264 and HEVC. Simulations demonstrate that the proposed crypto-compression system has presented the drawbacks of the DRPE method. The cryptographic properties of DRPE have been enhanced while a compression rate of one-sixth can be achieved. FPGA implementation results show the high performance of the proposed method in terms of maximum operating frequency, hardware occupation, and dynamic power consumption.
FPGA-based RF spectrum merging and adaptive hopset selection

NASA Astrophysics Data System (ADS)

McLean, R. K.; Flatley, B. N.; Silvius, M. D.; Hopkinson, K. M.

The radio frequency (RF) spectrum is a limited resource. Spectrum allotment disputes stem from this scarcity as many radio devices are confined to a fixed frequency or frequency sequence. One alternative is to incorporate cognition within a reconfigurable radio platform, therefore enabling the radio to adapt to dynamic RF spectrum environments. In this way, the radio is able to actively sense the RF spectrum, decide, and act accordingly, thereby sharing the spectrum and operating in more flexible manner. In this paper, we present a novel solution for merging many distributed RF spectrum maps into one map and for subsequently creating an adaptive hopset. We also provide an example of our system in operation, the result of which is a pseudorandom adaptive hopset. The paper then presents a novel hardware design for the frequency merger and adaptive hopset selector, both of which are written in VHDL and implemented as a custom IP core on an FPGA-based embedded system using the Xilinx Embedded Development Kit (EDK) software tool. The design of the custom IP core is optimized for area, and it can process a high-volume digital input via a low-latency circuit architecture. The complete embedded system includes the Xilinx PowerPC microprocessor, UART serial connection, and compact flash memory card IP cores, and our custom map merging/hopset selection IP core, all of which are targeted to the Virtex IV FPGA. This system is then incorporated into a cognitive radio prototype on a Rice University Wireless Open Access Research Platform (WARP) reconfigurable radio.
Three-phase Four-leg Inverter LabVIEW FPGA Control Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The use of cRIO and sbRIO for power electronics control has developed over the last few yearsmore » to include control of three-phase inverters. Most three-phase inverter topologies include three switching legs. The addition of a fourth-leg to natively generate the neutral connection allows the inverter to serve single-phase loads in a microgrid or stand-alone power system and to balance the three-phase voltages in the presence of significant load imbalance. However, the control of a four-leg inverter is much more complex. In particular, instead of standard two-dimensional space vector modulation (SVM), the inverter requires three-dimensional space vector modulation (3D-SVM). The candidate software implements complete control algorithms in LabVIEW FPGA for a three-phase four-leg inverter. The software includes feedback control loops, three-dimensional space vector modulation gate-drive algorithms, advanced alarm handling capabilities, contactor control, power measurements, and debugging and tuning tools. The feedback control loops allow inverter operation in AC voltage control, AC current control, or DC bus voltage control modes based on external mode selection by a user or supervisory controller. The software includes the ability to synchronize its AC output to the grid or other voltage-source before connection. The software also includes provisions to allow inverter operation
FPGA implementation of concatenated non-binary QC-LDPC codes for high-speed optical transport.

PubMed

Zou, Ding; Djordjevic, Ivan B

2015-06-01

In this paper, we propose a soft-decision-based FEC scheme that is the concatenation of a non-binary LDPC code and hard-decision FEC code. The proposed NB-LDPC + RS with overhead of 27.06% provides a superior NCG of 11.9dB at a post-FEC BER of 10^-15. As a result, the proposed NB-LDPC codes represent the strong FEC candidate of soft-decision FEC for beyond 100Gb/s optical transmission systems.

Real-time FPGA-based radar imaging for smart mobility systems

NASA Astrophysics Data System (ADS)

Saponara, Sergio; Neri, Bruno

2016-04-01

The paper presents an X-band FMCW (Frequency Modulated Continuous Wave) Radar Imaging system, called X-FRI, for surveillance in smart mobility applications. X-FRI allows for detecting the presence of targets (e.g. obstacles in a railway crossing or urban road crossing, or ships in a small harbor), as well as their speed and their position. With respect to alternative solutions based on LIDAR or camera systems, X-FRI operates in real-time also in bad lighting and weather conditions, night and day. The radio-frequency transceiver is realized through COTS (Commercial Off The Shelf) components on a single-board. An FPGA-based baseband platform allows for real-time Radar image processing.
FPGA acceleration of rigid-molecule docking codes

PubMed Central

Sukhwani, B.; Herbordt, M.C.

2011-01-01

Modelling the interactions of biological molecules, or docking, is critical both to understanding basic life processes and to designing new drugs. The field programmable gate array (FPGA) based acceleration of a recently developed, complex, production docking code is described. The authors found that it is necessary to extend their previous three-dimensional (3D) correlation structure in several ways, most significantly to support simultaneous computation of several correlation functions. The result for small-molecule docking is a 100-fold speed-up of a section of the code that represents over 95% of the original run-time. An additional 2% is accelerated through a previously described method, yielding a total acceleration of 36× over a single core and 10× over a quad-core. This approach is found to be an ideal complement to graphics processing unit (GPU) based docking, which excels in the protein–protein domain. PMID:21857870
Large area high-speed metrology SPM system.

PubMed

Klapetek, P; Valtr, M; Picco, L; Payton, O D; Martinek, J; Yacoot, A; Miles, M

2015-02-13

We present a large area high-speed measuring system capable of rapidly generating nanometre resolution scanning probe microscopy data over mm(2) regions. The system combines a slow moving but accurate large area XYZ scanner with a very fast but less accurate small area XY scanner. This arrangement enables very large areas to be scanned by stitching together the small, rapidly acquired, images from the fast XY scanner while simultaneously moving the slow XYZ scanner across the region of interest. In order to successfully merge the image sequences together two software approaches for calibrating the data from the fast scanner are described. The first utilizes the low uncertainty interferometric sensors of the XYZ scanner while the second implements a genetic algorithm with multiple parameter fitting during the data merging step of the image stitching process. The basic uncertainty components related to these high-speed measurements are also discussed. Both techniques are shown to successfully enable high-resolution, large area images to be generated at least an order of magnitude faster than with a conventional atomic force microscope.
Large area high-speed metrology SPM system

NASA Astrophysics Data System (ADS)

Klapetek, P.; Valtr, M.; Picco, L.; Payton, O. D.; Martinek, J.; Yacoot, A.; Miles, M.

2015-02-01

We present a large area high-speed measuring system capable of rapidly generating nanometre resolution scanning probe microscopy data over mm2 regions. The system combines a slow moving but accurate large area XYZ scanner with a very fast but less accurate small area XY scanner. This arrangement enables very large areas to be scanned by stitching together the small, rapidly acquired, images from the fast XY scanner while simultaneously moving the slow XYZ scanner across the region of interest. In order to successfully merge the image sequences together two software approaches for calibrating the data from the fast scanner are described. The first utilizes the low uncertainty interferometric sensors of the XYZ scanner while the second implements a genetic algorithm with multiple parameter fitting during the data merging step of the image stitching process. The basic uncertainty components related to these high-speed measurements are also discussed. Both techniques are shown to successfully enable high-resolution, large area images to be generated at least an order of magnitude faster than with a conventional atomic force microscope.
Experimental high-speed network

NASA Astrophysics Data System (ADS)

McNeill, Kevin M.; Klein, William P.; Vercillo, Richard; Alsafadi, Yasser H.; Parra, Miguel V.; Dallas, William J.

1993-09-01

Many existing local area networking protocols currently applied in medical imaging were originally designed for relatively low-speed, low-volume networking. These protocols utilize small packet sizes appropriate for text based communication. Local area networks of this type typically provide raw bandwidth under 125 MHz. These older network technologies are not optimized for the low delay, high data traffic environment of a totally digital radiology department. Some current implementations use point-to-point links when greater bandwidth is required. However, the use of point-to-point communications for a total digital radiology department network presents many disadvantages. This paper describes work on an experimental multi-access local area network called XFT. The work includes the protocol specification, and the design and implementation of network interface hardware and software. The protocol specifies the Physical and Data Link layers (OSI layers 1 & 2) for a fiber-optic based token ring providing a raw bandwidth of 500 MHz. The protocol design and implementation of the XFT interface hardware includes many features to optimize image transfer and provide flexibility for additional future enhancements which include: a modular hardware design supporting easy portability to a variety of host system buses, a versatile message buffer design providing 16 MB of memory, and the capability to extend the raw bandwidth of the network to 3.0 GHz.
Testing Microshutter Arrays Using Commercial FPGA Hardware

NASA Technical Reports Server (NTRS)

Rapchun, David

2008-01-01

NASA is developing micro-shutter arrays for the Near Infrared Spectrometer (NIRSpec) instrument on the James Webb Space Telescope (JWST). These micro-shutter arrays allow NIRspec to do Multi Object Spectroscopy, a key part of the mission. Each array consists of 62414 individual 100 x 200 micron shutters. These shutters are magnetically opened and held electrostatically. Individual shutters are then programmatically closed using a simple row/column addressing technique. A common approach to provide these data/clock patterns is to use a Field Programmable Gate Array (FPGA). Such devices require complex VHSIC Hardware Description Language (VHDL) programming and custom electronic hardware. Due to JWST's rapid schedule on the development of the micro-shutters, rapid changes were required to the FPGA code to facilitate new approaches being discovered to optimize the array performance. Such rapid changes simply could not be made using conventional VHDL programming. Subsequently, National Instruments introduced an FPGA product that could be programmed through a Labview interface. Because Labview programming is considerably easier than VHDL programming, this method was adopted and brought success. The software/hardware allowed the rapid change the FPGA code and timely results of new micro-shutter array performance data. As a result, numerous labor hours and money to the project were conserved.
Hardware and Software Design of FPGA-based PCIe Gen3 interface for APEnet+ network interconnect system

NASA Astrophysics Data System (ADS)

Ammendola, R.; Biagioni, A.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Rossetti, D.; Simula, F.; Tosoratto, L.; Vicini, P.

2015-12-01

In the attempt to develop an interconnection architecture optimized for hybrid HPC systems dedicated to scientific computing, we designed APEnet+, a point-to-point, low-latency and high-performance network controller supporting 6 fully bidirectional off-board links over a 3D torus topology. The first release of APEnet+ (named V4) was a board based on a 40 nm Altera FPGA, integrating 6 channels at 34 Gbps of raw bandwidth per direction and a PCIe Gen2 x8 host interface. It has been the first-of-its-kind device to implement an RDMA protocol to directly read/write data from/to Fermi and Kepler NVIDIA GPUs using NVIDIA peer-to-peer and GPUDirect RDMA protocols, obtaining real zero-copy GPU-to-GPU transfers over the network. The latest generation of APEnet+ systems (now named V5) implements a PCIe Gen3 x8 host interface on a 28 nm Altera Stratix V FPGA, with multi-standard fast transceivers (up to 14.4 Gbps) and an increased amount of configurable internal resources and hardware IP cores to support main interconnection standard protocols. Herein we present the APEnet+ V5 architecture, the status of its hardware and its system software design. Both its Linux Device Driver and the low-level libraries have been redeveloped to support the PCIe Gen3 protocol, introducing optimizations and solutions based on hardware/software co-design.
FPGA design of correlation-based pattern recognition

NASA Astrophysics Data System (ADS)

Jridi, Maher; Alfalou, Ayman

2017-05-01

Optical/Digital pattern recognition and tracking based on optical/digital correlation are a well-known techniques to detect, identify and localize a target object in a scene. Despite the limited number of treatments required by the correlation scheme, computational time and resources are relatively high. The most computational intensive treatment required by the correlation is the transformation from spatial to spectral domain and then from spectral to spatial domain. Furthermore, these transformations are used on optical/digital encryption schemes like the double random phase encryption (DRPE). In this paper, we present a VLSI architecture for the correlation scheme based on the fast Fourier transform (FFT). One interesting feature of the proposed scheme is its ability to stream image processing in order to perform correlation for video sequences. A trade-off between the hardware consumption and the robustness of the correlation can be made in order to understand the limitations of the correlation implementation in reconfigurable and portable platforms. Experimental results obtained from HDL simulations and FPGA prototype have demonstrated the advantages of the proposed scheme.
Automatic HDL firmware generation for FPGA-based reconfigurable measurement and control systems with mezzanines in FMC standard

NASA Astrophysics Data System (ADS)

Wojenski, Andrzej; Kasprowicz, Grzegorz; Pozniak, Krzysztof T.; Romaniuk, Ryszard

2013-10-01

The paper describes a concept of automatic firmware generation for reconfigurable measurement systems, which uses FPGA devices and measurement cards in FMC standard. Following sections are described in details: automatic HDL code generation for FPGA devices, automatic communication interfaces implementation, HDL drivers for measurement cards, automatic serial connection between multiple measurement backplane boards, automatic build of memory map (address space), automatic generated firmware management. Presented solutions are required in many advanced measurement systems, like Beam Position Monitors or GEM detectors. This work is a part of a wider project for automatic firmware generation and management of reconfigurable systems. Solutions presented in this paper are based on previous publication in SPIE.
Machining Chatter Analysis for High Speed Milling Operations

NASA Astrophysics Data System (ADS)

Sekar, M.; Kantharaj, I.; Amit Siddhappa, Savale

2017-10-01

Chatter in high speed milling is characterized by time delay differential equations (DDE). Since closed form solution exists only for simple cases, the governing non-linear DDEs of chatter problems are solved by various numerical methods. Custom codes to solve DDEs are tedious to build, implement and not error free and robust. On the other hand, software packages provide solution to DDEs, however they are not straight forward to implement. In this paper an easy way to solve DDE of chatter in milling is proposed and implemented with MATLAB. Time domain solution permits the study and model of non-linear effects of chatter vibration with ease. Time domain results are presented for various stable and unstable conditions of cut and compared with stability lobe diagrams.
Independent component analysis algorithm FPGA design to perform real-time blind source separation

NASA Astrophysics Data System (ADS)

Meyer-Baese, Uwe; Odom, Crispin; Botella, Guillermo; Meyer-Baese, Anke

2015-05-01

The conditions that arise in the Cocktail Party Problem prevail across many fields creating a need for of Blind Source Separation. The need for BSS has become prevalent in several fields of work. These fields include array processing, communications, medical signal processing, and speech processing, wireless communication, audio, acoustics and biomedical engineering. The concept of the cocktail party problem and BSS led to the development of Independent Component Analysis (ICA) algorithms. ICA proves useful for applications needing real time signal processing. The goal of this research was to perform an extensive study on ability and efficiency of Independent Component Analysis algorithms to perform blind source separation on mixed signals in software and implementation in hardware with a Field Programmable Gate Array (FPGA). The Algebraic ICA (A-ICA), Fast ICA, and Equivariant Adaptive Separation via Independence (EASI) ICA were examined and compared. The best algorithm required the least complexity and fewest resources while effectively separating mixed sources. The best algorithm was the EASI algorithm. The EASI ICA was implemented on hardware with Field Programmable Gate Arrays (FPGA) to perform and analyze its performance in real time.
A Configurable Event-Driven Convolutional Node with Rate Saturation Mechanism for Modular ConvNet Systems Implementation.

PubMed

Camuñas-Mesa, Luis A; Domínguez-Cordero, Yaisel L; Linares-Barranco, Alejandro; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé

2018-01-01

Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for many applications like image recognition, video analysis or natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as receptive field. These networks have been traditionally implemented in software, but they are becoming more computationally expensive as they scale up, having limitations for real-time processing of high-speed stimuli. On the other hand, hardware implementations show difficulties to be used for different applications, due to their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element which allows to build large 2D arrays where any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior in biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented in a Spartan6 FPGA. This network has been tested with a stimulus where 40 poker cards were observed by a Dynamic Vision Sensor (DVS) in 1 s time. Different slow-down factors were applied to characterize the behavior of the system for high speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, obtaining a recognition rate above 63% when less than 20% of the input events are processed, demonstrating the robustness of the network.
A Configurable Event-Driven Convolutional Node with Rate Saturation Mechanism for Modular ConvNet Systems Implementation

PubMed Central

Camuñas-Mesa, Luis A.; Domínguez-Cordero, Yaisel L.; Linares-Barranco, Alejandro; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé

2018-01-01

Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for many applications like image recognition, video analysis or natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as receptive field. These networks have been traditionally implemented in software, but they are becoming more computationally expensive as they scale up, having limitations for real-time processing of high-speed stimuli. On the other hand, hardware implementations show difficulties to be used for different applications, due to their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element which allows to build large 2D arrays where any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior in biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented in a Spartan6 FPGA. This network has been tested with a stimulus where 40 poker cards were observed by a Dynamic Vision Sensor (DVS) in 1 s time. Different slow-down factors were applied to characterize the behavior of the system for high speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, obtaining a recognition rate above 63% when less than 20% of the input events are processed, demonstrating the robustness of the network. PMID:29515349
Flexible High Speed Codec (FHSC)

NASA Technical Reports Server (NTRS)

Segallis, G. P.; Wernlund, J. V.

1991-01-01

The ongoing NASA/Harris Flexible High Speed Codec (FHSC) program is described. The program objectives are to design and build an encoder decoder that allows operation in either burst or continuous modes at data rates of up to 300 megabits per second. The decoder handles both hard and soft decision decoding and can switch between modes on a burst by burst basis. Bandspreading is low since the code rate is greater than or equal to 7/8. The encoder and a hard decision decoder fit on a single application specific integrated circuit (ASIC) chip. A soft decision applique is implemented using 300 K emitter coupled logic (ECL) which can be easily translated to an ECL gate array.
Numerical Simulation of High-Speed Turbulent Reacting Flows

NASA Technical Reports Server (NTRS)

Givi, P.; Taulbee, D. B.; Madnia, C. K.; Jaberi, F. A.; Colucci, P. J.; Gicquel, L. Y. M.; Adumitroaie, V.; James, S.

1999-01-01

The objectives of this research are: (1) to develop and implement a new methodology for large eddy simulation of (LES) of high-speed reacting turbulent flows. (2) To develop algebraic turbulence closures for statistical description of chemically reacting turbulent flows. We have just completed the third year of Phase III of this research. This is the Final Report of our activities on this research sponsored by the NASA LaRC.
Maximum speed limits. Volume 3, A programmed implementation manual for setting a speed limit based on the 85th percentile

DOT National Transportation Integrated Search

1970-10-01

This report contains the implementation manual developed as a part of the project "Maximum Speed Limits." The manual consists of a programed educational unit and a field workguide concerning the setting of speed limits based on the 85th percentile sp...
FPGA based demodulation of laser induced fluorescence in plasmas

NASA Astrophysics Data System (ADS)

Mattingly, Sean W.; Skiff, Fred

2018-04-01

We present a field programmable gate array (FPGA)-based system that counts photons from laser-induced fluorescence (LIF) on a laboratory plasma. This is accomplished with FPGA-based up/down counters that demodulate the data, giving a background-subtracted LIF signal stream that is updated with a new point as each laser amplitude modulation cycle completes. We demonstrate using the FPGA to modulate a laser at 1 MHz and demodulate the resulting LIF data stream. This data stream is used to calculate an LIF-based measurement sampled at 1 MHz of a plasma ion fluctuation spectrum.
Operateurs et engins de calcul en virgule flottante et leur application a la simulation en temps reel sur FPGA

NASA Astrophysics Data System (ADS)

Ould Bachir, Tarek

thesis contributes to the realm of FPGA-based real-time simulation in many ways. The research work proposes a new summation algorithm, which is a generalization of the so-called self-alignment technique. The new formulation is broader, simpler in its expression and hardware implementation. Our research helps formulating criteria to guarantee good accuracy, the criteria being established on a theoretical, as well as empirical basis. Moreover, the thesis offers a comprehensive analysis on the use of the redundant high radix carry-save (HRCS) format. The HRCS format is used to perform rapid additions of large mantissas. Two new HRCS operators are also proposed, namely an endomorphic adder and a HRCS to conventional converter. Once the mean to single-cycle accumulation is defined as a combination of the self-alignment technique and the HRCS format, the research focuses on the FPGA implementation of SIMD calculation engines using parallel floating-point MACs or DPs. The proposed operators are characterized by low latencies, allowing the engines to reach very low time-steps. The document finally discusses power electronic circuits modelling, and concludes with the presentation of a versatile calculation engine capable of simulating power converter with arbitrary topologies and up to 24 switches, while achieving time steps below 1 mus and allowing switching frequencies in the range of tens kilohertz. The latter realization has led to commercialization of a product by our industrial partner.
Speed versus accuracy in decision-making ants: expediting politics and policy implementation.

PubMed

Franks, Nigel R; Dechaume-Moncharmont, François-Xavier; Hanmore, Emma; Reynolds, Jocelyn K

2009-03-27

Compromises between speed and accuracy are seemingly inevitable in decision-making when accuracy depends on time-consuming information gathering. In collective decision-making, such compromises are especially likely because information is shared to determine corporate policy. This political process will also take time. Speed-accuracy trade-offs occur among house-hunting rock ants, Temnothorax albipennis. A key aspect of their decision-making is quorum sensing in a potential new nest. Finding a sufficient number of nest-mates, i.e. a quorum threshold (QT), in a potential nest site indicates that many ants find it suitable. Quorum sensing collates information. However, the QT is also used as a switch, from recruitment of nest-mates to their new home by slow tandem running, to recruitment by carrying, which is three times faster. Although tandem running is slow, it effectively enables one successful ant to lead and teach another the route between the nests. Tandem running creates positive feedback; more and more ants are shown the way, as tandem followers become, in turn, tandem leaders. The resulting corps of trained ants can then quickly carry their nest-mates; but carried ants do not learn the route. Therefore, the QT seems to set both the amount of information gathered and the speed of the emigration. Low QTs might cause more errors and a slower emigration--the worst possible outcome. This possible paradox of quick decisions leading to slow implementation might be resolved if the ants could deploy another positive-feedback recruitment process when they have used a low QT. Reverse tandem runs occur after carrying has begun and lead ants back from the new nest to the old one. Here we show experimentally that reverse tandem runs can bring lost scouts into an active role in emigrations and can help to maintain high-speed emigrations. Thus, in rock ants, although quick decision-making and rapid implementation of choices are initially in opposition, a third recruitment
FPGA-Based Pulse Pile-Up Correction With Energy and Timing Recovery.

PubMed

Haselman, M D; Pasko, J; Hauck, S; Lewellen, T K; Miyaoka, R S

2012-10-01

Modern field programmable gate arrays (FPGAs) are capable of performing complex discrete signal processing algorithms with clock rates well above 100 MHz. This, combined with FPGA's low expense, ease of use, and selected dedicated hardware make them an ideal technology for a data acquisition system for a positron emission tomography (PET) scanner. The University of Washington is producing a high-resolution, small-animal PET scanner that utilizes FPGAs as the core of the front-end electronics. For this scanner, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilized to add significant signal processing power to produce higher quality images. In this paper we report on an all-digital pulse pile-up correction algorithm that has been developed for the FPGA. The pile-up mitigation algorithm will allow the scanner to run at higher count rates without incurring large data losses due to the overlapping of scintillation signals. This correction technique utilizes a reference pulse to extract timing and energy information for most pile-up events. Using pulses acquired from a Zecotech Photonics MAPD-N with an LFS-3 scintillator, we show that good timing and energy information can be achieved in the presence of pile-up utilizing a moderate amount of FPGA resources.

High-speed GPU-based finite element simulations for NDT

NASA Astrophysics Data System (ADS)

Huthwaite, P.; Shi, F.; Van Pamel, A.; Lowe, M. J. S.

2015-03-01

The finite element method solved with explicit time increments is a general approach which can be applied to many ultrasound problems. It is widely used as a powerful tool within NDE for developing and testing inspection techniques, and can also be used in inversion processes. However, the solution technique is computationally intensive, requiring many calculations to be performed for each simulation, so traditionally speed has been an issue. For maximum speed, an implementation of the method, called Pogo [Huthwaite, J. Comp. Phys. 2014, doi: 10.1016/j.jcp.2013.10.017], has been developed to run on graphics cards, exploiting the highly parallelisable nature of the algorithm. Pogo typically demonstrates speed improvements of 60-90x over commercial CPU alternatives. Pogo is applied to three NDE examples, where the speed improvements are important: guided wave tomography, where a full 3D simulation must be run for each source transducer and every different defect size; scattering from rough cracks, where many simulations need to be run to build up a statistical model of the behaviour; and ultrasound propagation within coarse-grained materials where the mesh must be highly refined and many different cases run.
Novel windowing technique realized in FPGA for radar system

NASA Astrophysics Data System (ADS)

Escamilla-Hernandez, E.; Kravchenko, V. F.; Ponomaryov, V. I.; Ikuo, Arai

2006-02-01

To improve the weak target detection ability in radar applications a pulse compression is usually used that in the case linear FM modulation can improve the SNR. One drawback in here is that it can add the range side-lobes in reflectivity measurements. Using weighting window processing in time domain it is possible to decrease significantly the side-lobe level (SLL) and resolve small or low power targets those are masked by powerful ones. There are usually used classical windows such as Hamming, Hanning, etc. in window processing. Additionally to classical ones in this paper we also use a novel class of windows based on atomic functions (AF) theory. For comparison of simulation and experimental results we applied the standard parameters, such as coefficient of amplification, maximum level of side-lobe, width of main lobe, etc. To implement the compression-windowing model on hardware level it has been employed FPGA. This work aims at demonstrating a reasonably flexible implementation of FM-linear signal, pulse compression and windowing employing FPGA's. Classical and novel AF window technique has been investigated to reduce the SLL taking into account the noise influence and increasing the detection ability of the small or weak targets in the imaging radar. Paper presents the experimental hardware results of windowing in pulse compression radar resolving several targets for rectangular, Hamming, Kaiser-Bessel, (see manuscript for formula) functions windows. The windows created by use the atomic functions offer sufficiently better decreasing of the SLL in case of noise presence and when we move away of the main lobe in comparison with classical windows.
CMOS Image Sensors for High Speed Applications.

PubMed

El-Desouki, Munir; Deen, M Jamal; Fang, Qiyin; Liu, Louis; Tse, Frances; Armstrong, David

2009-01-01

Recent advances in deep submicron CMOS technologies and improved pixel designs have enabled CMOS-based imagers to surpass charge-coupled devices (CCD) imaging technology for mainstream applications. The parallel outputs that CMOS imagers can offer, in addition to complete camera-on-a-chip solutions due to being fabricated in standard CMOS technologies, result in compelling advantages in speed and system throughput. Since there is a practical limit on the minimum pixel size (4∼5 μm) due to limitations in the optics, CMOS technology scaling can allow for an increased number of transistors to be integrated into the pixel to improve both detection and signal processing. Such smart pixels truly show the potential of CMOS technology for imaging applications allowing CMOS imagers to achieve the image quality and global shuttering performance necessary to meet the demands of ultrahigh-speed applications. In this paper, a review of CMOS-based high-speed imager design is presented and the various implementations that target ultrahigh-speed imaging are described. This work also discusses the design, layout and simulation results of an ultrahigh acquisition rate CMOS active-pixel sensor imager that can take 8 frames at a rate of more than a billion frames per second (fps).
Real-time digital signal processing in multiphoton and time-resolved microscopy

NASA Astrophysics Data System (ADS)

Wilson, Jesse W.; Warren, Warren S.; Fischer, Martin C.

2016-03-01

The use of multiphoton interactions in biological tissue for imaging contrast requires highly sensitive optical measurements. These often involve signal processing and filtering steps between the photodetector and the data acquisition device, such as photon counting and lock-in amplification. These steps can be implemented as real-time digital signal processing (DSP) elements on field-programmable gate array (FPGA) devices, an approach that affords much greater flexibility than commercial photon counting or lock-in devices. We will present progress toward developing two new FPGA-based DSP devices for multiphoton and time-resolved microscopy applications. The first is a high-speed multiharmonic lock-in amplifier for transient absorption microscopy, which is being developed for real-time analysis of the intensity-dependence of melanin, with applications in vivo and ex vivo (noninvasive histopathology of melanoma and pigmented lesions). The second device is a kHz lock-in amplifier running on a low cost (50-200) development platform. It is our hope that these FPGA-based DSP devices will enable new, high-speed, low-cost applications in multiphoton and time-resolved microscopy.
Hardware Prototyping of Neural Network based Fetal Electrocardiogram Extraction

NASA Astrophysics Data System (ADS)

Hasan, M. A.; Reaz, M. B. I.

2012-01-01

The aim of this paper is to model the algorithm for Fetal ECG (FECG) extraction from composite abdominal ECG (AECG) using VHDL (Very High Speed Integrated Circuit Hardware Description Language) for FPGA (Field Programmable Gate Array) implementation. Artificial Neural Network that provides efficient and effective ways of separating FECG signal from composite AECG signal has been designed. The proposed method gives an accuracy of 93.7% for R-peak detection in FHR monitoring. The designed VHDL model is synthesized and fitted into Altera's Stratix II EP2S15F484C3 using the Quartus II version 8.0 Web Edition for FPGA implementation.
An Ultra-High Speed Whole Slide Image Viewing System

PubMed Central

Yagi, Yukako; Yoshioka, Shigeatsu; Kyusojin, Hiroshi; Onozato, Maristela; Mizutani, Yoichi; Osato, Kiyoshi; Yada, Hiroaki; Mark, Eugene J.; Frosch, Matthew P.; Louis, David N.

2012-01-01

Background: One of the goals for a Whole Slide Imaging (WSI) system is implementation in the clinical practice of pathology. One of the unresolved problems in accomplishing this goal is the speed of the entire process, i.e., from viewing the slides through making the final diagnosis. Most users are not satisfied with the correct viewing speeds of available systems. We have evaluated a new WSI viewing station and tool that focuses on speed. Method: A prototype WSI viewer based on PlayStation®3 with wireless controllers was evaluated at the Department of Pathology at MGH for the following reasons: 1. For the simulation of signing-out cases; 2. Enabling discussion at a consensus conference; and 3. Use at slide seminars during a Continuing Medical Education course. Results: Pathologists were being able to use the system comfortably after 0–15 min training. There were no complaints regarding speed. Most pathologists were satisfied with the functionality, usability and speed of the system. The most difficult situation was simulating diagnostic sign-out. Conclusion: The preliminary results of adapting the Sony PlayStation®3 (PS3®) as an ultra-high speed WSI viewing system were promising. The achieved speed is consistent with what would be needed to use WSI in daily practice. PMID:22063731
An ultra-high speed Whole Slide Image viewing system.

PubMed

Yagi, Yukako; Yoshioka, Shigeatsu; Kyusojin, Hiroshi; Onozato, Maristela; Mizutani, Yoichi; Osato, Kiyoshi; Yada, Hiroaki; Mark, Eugene J; Frosch, Matthew P; Louis, David N

2012-01-01

One of the goals for a Whole Slide Imaging (WSI) system is implementation in the clinical practice of pathology. One of the unresolved problems in accomplishing this goal is the speed of the entire process, i.e., from viewing the slides through making the final diagnosis. Most users are not satisfied with the correct viewing speeds of available systems. We have evaluated a new WSI viewing station and tool that focuses on speed. A prototype WSI viewer based on PlayStation®3 with wireless controllers was evaluated at the Department of Pathology at MGH for the following reasons: 1. For the simulation of signing-out cases; 2. Enabling discussion at a consensus conference; and 3. Use at slide seminars during a Continuing Medical Education course. Pathologists were being able to use the system comfortably after 0-15 min training. There were no complaints regarding speed. Most pathologists were satisfied with the functionality, usability and speed of the system. The most difficult situation was simulating diagnostic sign-out. The preliminary results of adapting the Sony PlayStation®3 (PS3®) as an ultra-high speed WSI viewing system were promising. The achieved speed is consistent with what would be needed to use WSI in daily practice.
An ultra-high speed whole slide image viewing system.

PubMed

Yagi, Yukako; Yoshioka, Shigeatsu; Kyusojin, Hiroshi; Onozato, Maristela; Mizutani, Yoichi; Osato, Kiyoshi; Yada, Hiroaki; Mark, Eugene J; Frosch, Matthew P; Louis, David N

2012-01-01

One of the goals for a Whole Slide Imaging (WSI) system is implementation in the clinical practice of pathology. One of the unresolved problems in accomplishing this goal is the speed of the entire process, i.e., from viewing the slides through making the final diagnosis. Most users are not satisfied with the correct viewing speeds of available systems. We have evaluated a new WSI viewing station and tool that focuses on speed. A prototype WSI viewer based on PlayStation®3 with wireless controllers was evaluated at the Department of Pathology at MGH for the following reasons: 1. For the simulation of signing-out cases; 2. Enabling discussion at a consensus conference; and 3. Use at slide seminars during a Continuing Medical Education course. Pathologists were being able to use the system comfortably after 0-15 min training. There were no complaints regarding speed. Most pathologists were satisfied with the functionality, usability and speed of the system. The most difficult situation was simulating diagnostic sign-out. The preliminary results of adapting the Sony PlayStation®3 (PS3®) as an ultra-high speed WSI viewing system were promising. The achieved speed is consistent with what would be needed to use WSI in daily practice.
A high-speed network for cardiac image review.

PubMed

Elion, J L; Petrocelli, R R

1994-01-01

A high-speed fiber-based network for the transmission and display of digitized full-motion cardiac images has been developed. Based on Asynchronous Transfer Mode (ATM), the network is scaleable, meaning that the same software and hardware is used for a small local area network or for a large multi-institutional network. The system can handle uncompressed digital angiographic images, considered to be at the "high-end" of the bandwidth requirements. Along with the networking, a general-purpose multi-modality review station has been implemented without specialized hardware. This station can store a full injection sequence in "loop RAM" in a 512 x 512 format, then interpolate to 1024 x 1024 while displaying at 30 frames per second. The network and review stations connect to a central file server that uses a virtual file system to make a large high-speed RAID storage disk and associated off-line storage tapes and cartridges all appear as a single large file system to the software. In addition to supporting archival storage and review, the system can also digitize live video using high-speed Direct Memory Access (DMA) from the frame grabber to present uncompressed data to the network. Fully functional prototypes have provided the proof of concept, with full deployment in the institution planned as the next stage.
A high-speed network for cardiac image review.

PubMed Central

Elion, J. L.; Petrocelli, R. R.

1994-01-01

A high-speed fiber-based network for the transmission and display of digitized full-motion cardiac images has been developed. Based on Asynchronous Transfer Mode (ATM), the network is scaleable, meaning that the same software and hardware is used for a small local area network or for a large multi-institutional network. The system can handle uncompressed digital angiographic images, considered to be at the "high-end" of the bandwidth requirements. Along with the networking, a general-purpose multi-modality review station has been implemented without specialized hardware. This station can store a full injection sequence in "loop RAM" in a 512 x 512 format, then interpolate to 1024 x 1024 while displaying at 30 frames per second. The network and review stations connect to a central file server that uses a virtual file system to make a large high-speed RAID storage disk and associated off-line storage tapes and cartridges all appear as a single large file system to the software. In addition to supporting archival storage and review, the system can also digitize live video using high-speed Direct Memory Access (DMA) from the frame grabber to present uncompressed data to the network. Fully functional prototypes have provided the proof of concept, with full deployment in the institution planned as the next stage. PMID:7949964
Neuromorphic Hardware Architecture Using the Neural Engineering Framework for Pattern Recognition.

PubMed

Wang, Runchun; Thakur, Chetan Singh; Cohen, Gregory; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, Andre

2017-06-01

We present a hardware architecture that uses the neural engineering framework (NEF) to implement large-scale neural networks on field programmable gate arrays (FPGAs) for performing massively parallel real-time pattern recognition. NEF is a framework that is capable of synthesising large-scale cognitive systems from subnetworks and we have previously presented an FPGA implementation of the NEF that successfully performs nonlinear mathematical computations. That work was developed based on a compact digital neural core, which consists of 64 neurons that are instantiated by a single physical neuron using a time-multiplexing approach. We have now scaled this approach up to build a pattern recognition system by combining identical neural cores together. As a proof of concept, we have developed a handwritten digit recognition system using the MNIST database and achieved a recognition rate of 96.55%. The system is implemented on a state-of-the-art FPGA and can process 5.12 million digits per second. The architecture and hardware optimisations presented offer high-speed and resource-efficient means for performing high-speed, neuromorphic, and massively parallel pattern recognition and classification tasks.
Application of high speed machining technology in aviation

NASA Astrophysics Data System (ADS)

Bałon, Paweł; Szostak, Janusz; Kiełbasa, Bartłomiej; Rejman, Edward; Smusz, Robert

2018-05-01

Aircraft structures are exposed to many loads during their working lifespan. Every particular action made during a flight is composed of a series of air movements which generate various aircraft loads. The most rigorous requirement which modern aircraft structures must fulfill is to maintain their high durability and reliability. This requirement involves taking many restrictions into account during the aircraft design process. The most important factor is the structure's overall mass, which has a crucial impact on both utility properties and cost-effectiveness. This makes aircraft one of the most complex results of modern technology. Additionally, there is currently an increasing utilization of high strength aluminum alloys, which requires the implementation of new manufacturing processes. High Speed Machining technology (HSM) is currently one of the most important machining technologies used in the aviation industry, especially in the machining of aluminium alloys. The primary difference between HSM and other milling techniques is the ability to select cutting parameters - depth of the cut layer, feed rate, and cutting speed in order to simultaneously ensure high quality, precision of the machined surface, and high machining efficiency, all of which shorten the manufacturing process of the integral components. In this paper, the authors explain the implementation of the HSM method in integral aircraft constructions. It presents the method of the airframe manufacturing method, and the final results. The HSM method is compared to the previous method where all subcomponents were manufactured by bending and forming processes, and then, they were joined by riveting.
An FPGA-based bolometer for the MAST-U Super-X divertor.

PubMed

Lovell, Jack; Naylor, Graham; Field, Anthony; Drewelow, Peter; Sharples, Ray

2016-11-01

A new resistive bolometer system has been developed for MAST-Upgrade. It will measure radiated power in the new Super-X divertor, with millisecond time resolution, along 16 vertical and 16 horizontal lines of sight. The system uses a Xilinx Zynq-7000 series Field-Programmable Gate Array (FPGA) in the D-TACQ ACQ2106 carrier to perform real time data acquisition and signal processing. The FPGA enables AC-synchronous detection using high performance digital filtering to achieve a high signal-to-noise ratio and will be able to output processed data in real time with millisecond latency. The system has been installed on 8 previously unused channels of the JET vertical bolometer system. Initial results suggest good agreement with data from existing vertical channels but with higher bandwidth and signal-to-noise ratio.
Integration and test of high-speed transmitter electronics for free-space laser communications

NASA Technical Reports Server (NTRS)

Soni, Nitin J.; Lizanich, Paul J.

1994-01-01

The NASA Lewis Research Center in Cleveland, Ohio, has developed the electronics for a free-space, direct-detection laser communications system demonstration. Under the High-Speed Laser Integrated Terminal Electronics (Hi-LITE) Project, NASA Lewis has built a prototype full-duplex, dual-channel electronics transmitter and receiver operating at 325 megabit S per second (Mbps) per channel and using quaternary pulse-position modulation (QPPM). This paper describes the integration and testing of the transmitter portion for future application in free-space, direct-detection laser communications. A companion paper reviews the receiver portion of the prototype electronics. Minor modifications to the transmitter were made since the initial report on the entire system, and this paper addresses them. The digital electronics are implemented in gallium arsenide integrated circuits mounted on prototype boards. The fabrication and implementation issues related to these high-speed devices are discussed. The transmitter's test results are documented, and its functionality is verified by exercising all modes of operation. Various testing issues pertaining to high-speed circuits are addressed. A description of the transmitter electronics packaging concludes the paper.
FPGA-based multi-channel fluorescence lifetime analysis of Fourier multiplexed frequency-sweeping lifetime imaging

PubMed Central

Zhao, Ming; Li, Yu; Peng, Leilei

2014-01-01

We report a fast non-iterative lifetime data analysis method for the Fourier multiplexed frequency-sweeping confocal FLIM (Fm-FLIM) system [ Opt. Express22, 10221 ( 2014)24921725]. The new method, named R-method, allows fast multi-channel lifetime image analysis in the system’s FPGA data processing board. Experimental tests proved that the performance of the R-method is equivalent to that of single-exponential iterative fitting, and its sensitivity is well suited for time-lapse FLIM-FRET imaging of live cells, for example cyclic adenosine monophosphate (cAMP) level imaging with GFP-Epac-mCherry sensors. With the R-method and its FPGA implementation, multi-channel lifetime images can now be generated in real time on the multi-channel frequency-sweeping FLIM system, and live readout of FRET sensors can be performed during time-lapse imaging. PMID:25321778
Developments of FPGA-based digital back-ends for low frequency antenna arrays at Medicina radio telescopes

NASA Astrophysics Data System (ADS)

Naldi, G.; Bartolini, M.; Mattana, A.; Pupillo, G.; Hickish, J.; Foster, G.; Bianchi, G.; Lingua, A.; Monari, J.; Montebugnoli, S.; Perini, F.; Rusticelli, S.; Schiaffino, M.; Virone, G.; Zarb Adami, K.

In radio astronomy Field Programmable Gate Array (FPGA) technology is largely used for the implementation of digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption and cost offered by FPGA chip compared to other technologies like ASIC, GPU and CPU. In the last years several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments like FX correlator, direct imager, beamformer, multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper we present the gained experience in this kind of applications.
High-Speed, high-power, switching transistor

NASA Technical Reports Server (NTRS)

Carnahan, D.; Ohu, C. K.; Hower, P. L.

1979-01-01

Silicon transistor rate for 200 angstroms at 400 to 600 volts combines switching speed of transistors with ruggedness, power capacity of thyristor. Transistor introduces unique combination of increased power-handling capability, unusally low saturation and switching losses, and submicrosecond switching speeds. Potential applications include high power switching regulators, linear amplifiers, chopper controls for high frequency electrical vehicle drives, VLF transmitters, RF induction heaters, kitchen cooking ranges, and electronic scalpels for medical surgery.
Design and Implementation of a Mechanical Control System for the Scanning Microwave Limb Sounder

NASA Technical Reports Server (NTRS)

Bowden, William

2011-01-01

The Scanning Microwave Limb Sounder (SMLS) will use technological improvements in low noise mixers to provide precise data on the Earth's atmospheric composition with high spatial resolution. This project focuses on the design and implementation of a real time control system needed for airborne engineering tests of the SMLS. The system must coordinate the actuation of optical components using four motors with encoder readback, while collecting synchronized telemetric data from a GPS receiver and 3-axis gyrometric system. A graphical user interface for testing the control system was also designed using Python. Although the system could have been implemented with a FPGA-based setup, we chose to use a low cost processor development kit manufactured by XMOS. The XMOS architecture allows parallel execution of multiple tasks on separate threads-making it ideal for this application and is easily programmed using XC (a subset of C). The necessary communication interfaces were implemented in software, including Ethernet, with significant cost and time reduction compared to an FPGA-based approach. For these reasons, the XMOS technology is an attractive, cost effective, alternative to FPGA-based technologies for this design and similar rapid prototyping projects.
Digitization of Analog Signals using a Field Programmable Gate Array (FPGA)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aguilera, Daniel; Rusu, Vadim

The idea of this research is consolidating the electrical components used for capturing data in the Mu2e Tracker. Ideally, an FPGA will serve as the Time-Division Converters (TDC) and Analog-to-Digital Converters (ADC). The TDC is already being carried out by the FPGA, but we are still using off the shelf ADCs. This poster proposes using Low Voltage Differential Signaling as the basis for analog-to-digital conversion using and FPGA.
A Compact Synchronous Cellular Model of Nonlinear Calcium Dynamics: Simulation and FPGA Synthesis Results.

PubMed

Soleimani, Hamid; Drakakis, Emmanuel M

2017-06-01

Recent studies have demonstrated that calcium is a widespread intracellular ion that controls a wide range of temporal dynamics in the mammalian body. The simulation and validation of such studies using experimental data would benefit from a fast large scale simulation and modelling tool. This paper presents a compact and fully reconfigurable cellular calcium model capable of mimicking Hopf bifurcation phenomenon and various nonlinear responses of the biological calcium dynamics. The proposed cellular model is synthesized on a digital platform for a single unit and a network model. Hardware synthesis, physical implementation on FPGA, and theoretical analysis confirm that the proposed cellular model can mimic the biological calcium behaviors with considerably low hardware overhead. The approach has the potential to speed up large-scale simulations of slow intracellular dynamics by sharing more cellular units in real-time. To this end, various networks constructed by pipelining 10 k to 40 k cellular calcium units are compared with an equivalent simulation run on a standard PC workstation. Results show that the cellular hardware model is, on average, 83 times faster than the CPU version.

An integrated framework for high level design of high performance signal processing circuits on FPGAs

NASA Astrophysics Data System (ADS)

Benkrid, K.; Belkacemi, S.; Sukhsawas, S.

2005-06-01

This paper proposes an integrated framework for the high level design of high performance signal processing algorithms' implementations on FPGAs. The framework emerged from a constant need to rapidly implement increasingly complicated algorithms on FPGAs while maintaining the high performance needed in many real time digital signal processing applications. This is particularly important for application developers who often rely on iterative and interactive development methodologies. The central idea behind the proposed framework is to dynamically integrate high performance structural hardware description languages with higher level hardware languages in other to help satisfy the dual requirement of high level design and high performance implementation. The paper illustrates this by integrating two environments: Celoxica's Handel-C language, and HIDE, a structural hardware environment developed at the Queen's University of Belfast. On the one hand, Handel-C has been proven to be very useful in the rapid design and prototyping of FPGA circuits, especially control intensive ones. On the other hand, HIDE, has been used extensively, and successfully, in the generation of highly optimised parameterisable FPGA cores. In this paper, this is illustrated in the construction of a scalable and fully parameterisable core for image algebra's five core neighbourhood operations, where fully floorplanned efficient FPGA configurations, in the form of EDIF netlists, are generated automatically for instances of the core. In the proposed combined framework, highly optimised data paths are invoked dynamically from within Handel-C, and are synthesized using HIDE. Although the idea might seem simple prima facie, it could have serious implications on the design of future generations of hardware description languages.
High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology

NASA Astrophysics Data System (ADS)

Rajan, K.; Patnaik, L. M.; Ramakrishna, J.

1997-08-01

Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive, Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs. We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon
Visualization of hump formation in high-speed gas metal arc welding

NASA Astrophysics Data System (ADS)

Wu, C. S.; Zhong, L. M.; Gao, J. Q.

2009-11-01

The hump bead is a typical weld defect observed in high-speed welding. Its occurrence limits the improvement of welding productivity. Visualization of hump formation during high-speed gas metal arc welding (GMAW) is helpful in the better understanding of the humping phenomena so that effective measures can be taken to suppress or decrease the tendency of hump formation and achieve higher productivity welding. In this study, an experimental system was developed to implement vision-based observation of the weld pool behavior during high-speed GMAW. Considering the weld pool characteristics in high-speed welding, a narrow band-pass and neutral density filter was equipped for the CCD camera, the suitable exposure time was selected and side view orientation of the CCD camera was employed. The events that took place at the rear portion of the weld pools were imaged during the welding processes with and without hump bead formation, respectively. It was found that the variation of the weld pool surface height and the solid-liquid interface at the pool trailing with time shows some useful information to judge whether the humping phenomenon occurs or not.
Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ceriani, Marco; Palermo, Gianluca; Secchi, Simone

We present a prototype of a multi-core architecture implemented on FPGA, designed to enable efficient execution of irregular applications on distributed shared memory machines, while maintaining high performance on regular workloads. The architecture is composed of off-the-shelf soft-core cores, local interconnection and memory interface, integrated with custom components that optimize it for irregular applications. It relies on three key elements: a global address space, multithreading, and fine-grained synchronization. Global addresses are scrambled to reduce the formation of network hot-spots, while the latency of the transactions is covered by integrating an hardware scheduler within the custom load/store buffers to take advantagemore » from the availability of multiple executions threads, increasing the efficiency in a transparent way to the application. We evaluated a dual node system irregular kernels showing scalability in the number of cores and threads.« less
FPGA-Based X-Ray Detection and Measurement for an X-Ray Polarimeter

NASA Technical Reports Server (NTRS)

Gregory, Kyle; Hill, Joanne; Black, Kevin; Baumgartner, Wayne

2013-01-01

This technology enables detection and measurement of x-rays in an x-ray polarimeter using a field-programmable gate array (FPGA). The technology was developed for the Gravitational and Extreme Magnetism Small Explorer (GEMS) mission. It performs precision energy and timing measurements, as well as rejection of non-x-ray events. It enables the GEMS polarimeter to detect precisely when an event has taken place so that additional measurements can be made. The technology also enables this function to be performed in an FPGA using limited resources so that mass and power can be minimized while reliability for a space application is maximized and precise real-time operation is achieved. This design requires a low-noise, charge-sensitive preamplifier; a highspeed analog to digital converter (ADC); and an x-ray detector with a cathode terminal. It functions by computing a sum of differences for time-samples whose difference exceeds a programmable threshold. A state machine advances through states as a programmable number of consecutive samples exceeds or fails to exceed this threshold. The pulse height is recorded as the accumulated sum. The track length is also measured based on the time from the start to the end of accumulation. For track lengths longer than a certain length, the algorithm estimates the barycenter of charge deposit by comparing the accumulator value at the midpoint to the final accumulator value. The design also employs a number of techniques for rejecting background events. This innovation enables the function to be performed in space where it can operate autonomously with a rapid response time. This implementation combines advantages of computing system-based approaches with those of pure analog approaches. The result is an implementation that is highly reliable, performs in real-time, rejects background events, and consumes minimal power.
Synchronous high speed multi-point velocity profile measurement by heterodyne interferometry

NASA Astrophysics Data System (ADS)

Hou, Xueqin; Xiao, Wen; Chen, Zonghui; Qin, Xiaodong; Pan, Feng

2017-02-01

This paper presents a synchronous multipoint velocity profile measurement system, which acquires the vibration velocities as well as images of vibrating objects by combining optical heterodyne interferometry and a high-speed CMOS-DVR camera. The high-speed CMOS-DVR camera records a sequence of images of the vibrating object. Then, by extracting and processing multiple pixels at the same time, a digital demodulation technique is implemented to simultaneously acquire the vibrating velocity of the target from the recorded sequences of images. This method is validated with an experiment. A piezoelectric ceramic plate with standard vibration characteristics is used as the vibrating target, which is driven by a standard sinusoidal signal.
Implementing Speed and Separation Monitoring in Collaborative Robot Workcells.

PubMed

Marvel, Jeremy A; Norcross, Rick

2017-04-01

We provide an overview and guidance for the Speed and Separation Monitoring methodology as presented in the International Organization of Standardization's technical specification 15066 on collaborative robot safety. Such functionality is provided by external, intelligent observer systems integrated into a robotic workcell. The SSM minimum protective distance function equation is discussed in detail, with consideration for the input values, implementation specifications, and performance expectations. We provide analytical analyses and test results of the current equation, discuss considerations for implementing SSM in human-occupied environments, and provide directions for technological advancements toward standardization.
Implementing Speed and Separation Monitoring in Collaborative Robot Workcells

PubMed Central

Marvel, Jeremy A.; Norcross, Rick

2016-01-01

We provide an overview and guidance for the Speed and Separation Monitoring methodology as presented in the International Organization of Standardization's technical specification 15066 on collaborative robot safety. Such functionality is provided by external, intelligent observer systems integrated into a robotic workcell. The SSM minimum protective distance function equation is discussed in detail, with consideration for the input values, implementation specifications, and performance expectations. We provide analytical analyses and test results of the current equation, discuss considerations for implementing SSM in human-occupied environments, and provide directions for technological advancements toward standardization. PMID:27885312
A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System.

PubMed

Zhang, Zhen; Ma, Cheng; Zhu, Rong

2017-08-23

Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.
A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System

PubMed Central

Zhang, Zhen; Zhu, Rong

2017-01-01

Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas. PMID:28832522
Implementation of density-based solver for all speeds in the framework of OpenFOAM

NASA Astrophysics Data System (ADS)

Shen, Chun; Sun, Fengxian; Xia, Xinlin

2014-10-01

In the framework of open source CFD code OpenFOAM, a density-based solver for all speeds flow field is developed. In this solver the preconditioned all speeds AUSM+(P) scheme is adopted and the dual time scheme is implemented to complete the unsteady process. Parallel computation could be implemented to accelerate the solving process. Different interface reconstruction algorithms are implemented, and their accuracy with respect to convection is compared. Three benchmark tests of lid-driven cavity flow, flow crossing over a bump, and flow over a forward-facing step are presented to show the accuracy of the AUSM+(P) solver for low-speed incompressible flow, transonic flow, and supersonic/hypersonic flow. Firstly, for the lid driven cavity flow, the computational results obtained by different interface reconstruction algorithms are compared. It is indicated that the one dimensional reconstruction scheme adopted in this solver possesses high accuracy and the solver developed in this paper can effectively catch the features of low incompressible flow. Then via the test cases regarding the flow crossing over bump and over forward step, the ability to capture characteristics of the transonic and supersonic/hypersonic flows are confirmed. The forward-facing step proves to be the most challenging for the preconditioned solvers with and without the dual time scheme. Nonetheless, the solvers described in this paper reproduce the main features of this flow, including the evolution of the initial transient.
An Evolutionary Method for Financial Forecasting in Microscopic High-Speed Trading Environment.

PubMed

Huang, Chien-Feng; Li, Hsu-Chih

2017-01-01

The advancement of information technology in financial applications nowadays have led to fast market-driven events that prompt flash decision-making and actions issued by computer algorithms. As a result, today's markets experience intense activity in the highly dynamic environment where trading systems respond to others at a much faster pace than before. This new breed of technology involves the implementation of high-speed trading strategies which generate significant portion of activity in the financial markets and present researchers with a wealth of information not available in traditional low-speed trading environments. In this study, we aim at developing feasible computational intelligence methodologies, particularly genetic algorithms (GA), to shed light on high-speed trading research using price data of stocks on the microscopic level. Our empirical results show that the proposed GA-based system is able to improve the accuracy of the prediction significantly for price movement, and we expect this GA-based methodology to advance the current state of research for high-speed trading and other relevant financial applications.
An Evolutionary Method for Financial Forecasting in Microscopic High-Speed Trading Environment

PubMed Central

Li, Hsu-Chih

2017-01-01

The advancement of information technology in financial applications nowadays have led to fast market-driven events that prompt flash decision-making and actions issued by computer algorithms. As a result, today's markets experience intense activity in the highly dynamic environment where trading systems respond to others at a much faster pace than before. This new breed of technology involves the implementation of high-speed trading strategies which generate significant portion of activity in the financial markets and present researchers with a wealth of information not available in traditional low-speed trading environments. In this study, we aim at developing feasible computational intelligence methodologies, particularly genetic algorithms (GA), to shed light on high-speed trading research using price data of stocks on the microscopic level. Our empirical results show that the proposed GA-based system is able to improve the accuracy of the prediction significantly for price movement, and we expect this GA-based methodology to advance the current state of research for high-speed trading and other relevant financial applications. PMID:28316618
Evaluation of FPGA to PC feedback loop

NASA Astrophysics Data System (ADS)

Linczuk, Pawel; Zabolotny, Wojciech M.; Wojenski, Andrzej; Krawczyk, Rafal D.; Pozniak, Krzysztof T.; Chernyshova, Maryna; Czarski, Tomasz; Gaska, Michal; Kasprowicz, Grzegorz; Kowalska-Strzeciwilk, Ewa; Malinowski, Karol

2017-08-01

The paper presents the evaluation study of the performance of the data transmission subsystem which can be used in High Energy Physics (HEP) and other High-Performance Computing (HPC) systems. The test environment consisted of Xilinx Artix-7 FPGA and server-grade PC connected via the PCIe 4xGen2 bus. The DMA engine was based on the Xilinx DMA for PCI Express Subsystem1 controlled by the modified Xilinx XDMA kernel driver.2 The research is focused on the influence of the system configuration on achievable throughput and latency of data transfer.
Hardware Implementation of Lossless Adaptive and Scalable Hyperspectral Data Compression for Space

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh; Keymeulen, Didier; Bakhshi, Alireza; Klimesh, Matthew

2009-01-01

On-board lossless hyperspectral data compression reduces data volume in order to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware. A modified form of the algorithm that is better suited for data from pushbroom instruments is generally appropriate for flight implementation. A scalable field programmable gate array (FPGA) hardware implementation was developed. The FPGA implementation achieves a throughput performance of 58 Msamples/sec, which can be increased to over 100 Msamples/sec in a parallel implementation that uses twice the hardware resources This paper describes the hardware implementation of the 'Modified Fast Lossless' compression algorithm on an FPGA. The FPGA implementation targets the current state-of-the-art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.
FPGA-based architecture for real-time data reduction of ultrasound signals.

PubMed

Soto-Cajiga, J A; Pedraza-Ortega, J C; Rubio-Gonzalez, C; Bandala-Sanchez, M; Romero-Troncoso, R de J

2012-02-01

This paper describes a novel method for on-line real-time data reduction of radiofrequency (RF) ultrasound signals. The approach is based on a field programmable gate array (FPGA) system intended mainly for steel thickness measurements. Ultrasound data reduction is desirable when: (1) direct measurements performed by an operator are not accessible; (2) it is required to store a considerable amount of data; (3) the application requires measuring at very high speeds; and (4) the physical space for the embedded hardware is limited. All the aforementioned scenarios can be present in applications such as pipeline inspection where data reduction is traditionally performed on-line using pipeline inspection gauges (PIG). The method proposed in this work consists of identifying and storing in real-time only the time of occurrence (TOO) and the maximum amplitude of each echo present in a given RF ultrasound signal. The method is tested with a dedicated immersion system where a significant data reduction with an average of 96.5% is achieved. Copyright © 2011 Elsevier B.V. All rights reserved.
A programmable controller based on CAN field bus embedded microprocessor and FPGA

NASA Astrophysics Data System (ADS)

Cai, Qizhong; Guo, Yifeng; Chen, Wenhei; Wang, Mingtao

2008-10-01

One kind of new programmable controller(PLC) is introduced in this paper. The advanced embedded microprocessor and Field-Programmable Gate Array (FPGA) device are applied in the PLC system. The PLC system structure was presented in this paper. It includes 32 bits Advanced RISC Machines (ARM) embedded microprocessor as control core, FPGA as control arithmetic coprocessor and CAN bus as data communication criteria protocol connected the host controller and its various extension modules. It is detailed given that the circuits and working principle, IiO interface circuit between ARM and FPGA and interface circuit between ARM and FPGA coprocessor. Furthermore the interface circuit diagrams between various modules are written. In addition, it is introduced that ladder chart program how to control the transfer info of control arithmetic part in FPGA coprocessor. The PLC, through nearly two months of operation to meet the design of the basic requirements.
A new FPGA-driven P-HIFU system with harmonic cancellation technique

NASA Astrophysics Data System (ADS)

Wu, Hao; Shen, Guofeng; Su, Zhiqiang; Chen, Yazhu

2017-03-01

This paper introduces a high intensity focused ultrasound system for ablation using switch-mode power amplifiers with harmonic cancellation technique eliminating the 3rdharmonic and all even harmonics. The efficiency of the amplifier is optimized by choosing different parameters of the harmonic cancellation technique. This technique requires double driving signals, and specific signal waveform because of the full-bridge topology. The new FPGA-driven P-HIFU system has 200 channels of phase signals that can form 100 output channels. An FPGA chip is used to generate these signals, and each channel has a phase resolution of 2 ns, less than one degree. The output waveform of the amplifier, voltage waveform across the transducer, shows fewer harmonic components.
Field Programmable Gate Array for Implementation of Redundant Advanced Digital Feedback Control

NASA Technical Reports Server (NTRS)

King, K. D.

2003-01-01

The goal of this effort was to develop a digital motor controller using field programmable gate arrays (FPGAs). This is a more rugged approach than a conventional microprocessor digital controller. FPGAs typically have higher radiation (rad) tolerance than both the microprocessor and memory required for a conventional digital controller. Furthermore, FPGAs can typically operate at higher speeds. (While speed is usually not an issue for motor controllers, it can be for other system controllers.) Other than motor power, only a 3.3-V digital power supply was used in the controller; no analog bias supplies were used. Since most of the circuit was implemented in the FPGA, no additional parts were needed other than the power transistors to drive the motor. The benefits that FPGAs provide over conventional designs-lower power and fewer parts-allow for smaller packaging and reduced weight and cost.
Evaluation Of Traffic Control Devices For Rural High-Speed Maintenance Work Zones

DOT National Transportation Integrated Search

2000-10-01

This report documents the first year activities of a two-year project in which various work zone traffic control devices, treatments, and practices were implemented and evaluated. The focus was on rural high-speed work zones. Nine work zones were stu...

Beam Instrument Development System

DOE Office of Scientific and Technical Information (OSTI.GOV)

DOOLITTLE, LAWRENCE; HUANG, GANG; DU, QIANG

Beam Instrumentation Development System (BIDS) is a collection of common support libraries and modules developed during a series of Low-Level Radio Frequency (LLRF) control and timing/synchronization projects. BIDS includes a collection of Hardware Description Language (HDL) libraries and software libraries. The BIDS can be used for the development of any FPGA-based system, such as LLRF controllers. HDL code in this library is generic and supports common Digital Signal Processing (DSP) functions, FPGA-specific drivers (high-speed serial link wrappers, clock generation, etc.), ADC/DAC drivers, Ethernet MAC implementation, etc.
Study on a new chaotic bitwise dynamical system and its FPGA implementation

NASA Astrophysics Data System (ADS)

Wang, Qian-Xue; Yu, Si-Min; Guyeux, C.; Bahi, J.; Fang, Xiao-Le

2015-06-01

In this paper, the structure of a new chaotic bitwise dynamical system (CBDS) is described. Compared to our previous research work, it uses various random bitwise operations instead of only one. The chaotic behavior of CBDS is mathematically proven according to the Devaney's definition, and its statistical properties are verified both for uniformity and by a comprehensive, reputed and stringent battery of tests called TestU01. Furthermore, a systematic methodology developing the parallel computations is proposed for FPGA platform-based realization of this CBDS. Experiments finally validate the proposed systematic methodology. Project supported by China Postdoctoral Science Foundation (Grant No. 2014M552175), the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Chinese Education Ministry, the National Natural Science Foundation of China (Grant No. 61172023), and the Specialized Research Foundation of Doctoral Subjects of Chinese Education Ministry (Grant No. 20114420110003).
Real-time implementation of a multispectral mine target detection algorithm

NASA Astrophysics Data System (ADS)

Samson, Joseph W.; Witter, Lester J.; Kenton, Arthur C.; Holloway, John H., Jr.

2003-09-01

Spatial-spectral anomaly detection (the "RX Algorithm") has been exploited on the USMC's Coastal Battlefield Reconnaissance and Analysis (COBRA) Advanced Technology Demonstration (ATD) and several associated technology base studies, and has been found to be a useful method for the automated detection of surface-emplaced antitank land mines in airborne multispectral imagery. RX is a complex image processing algorithm that involves the direct spatial convolution of a target/background mask template over each multispectral image, coupled with a spatially variant background spectral covariance matrix estimation and inversion. The RX throughput on the ATD was about 38X real time using a single Sun UltraSparc system. A goal to demonstrate RX in real-time was begun in FY01. We now report the development and demonstration of a Field Programmable Gate Array (FPGA) solution that achieves a real-time implementation of the RX algorithm at video rates using COBRA ATD data. The approach uses an Annapolis Microsystems Firebird PMC card containing a Xilinx XCV2000E FPGA with over 2,500,000 logic gates and 18MBytes of memory. A prototype system was configured using a Tek Microsystems VME board with dual-PowerPC G4 processors and two PMC slots. The RX algorithm was translated from its C programming implementation into the VHDL language and synthesized into gates that were loaded into the FPGA. The VHDL/synthesizer approach allows key RX parameters to be quickly changed and a new implementation automatically generated. Reprogramming the FPGA is done rapidly and in-circuit. Implementation of the RX algorithm in a single FPGA is a major first step toward achieving real-time land mine detection.
FPGA-based gating and logic for multichannel single photon counting

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pooser, Raphael C; Earl, Dennis Duncan; Evans, Philip G

2012-01-01

We present results characterizing multichannel InGaAs single photon detectors utilizing gated passive quenching circuits (GPQC), self-differencing techniques, and field programmable gate array (FPGA)-based logic for both diode gating and coincidence counting. Utilizing FPGAs for the diode gating frontend and the logic counting backend has the advantage of low cost compared to custom built logic circuits and current off-the-shelf detector technology. Further, FPGA logic counters have been shown to work well in quantum key distribution (QKD) test beds. Our setup combines multiple independent detector channels in a reconfigurable manner via an FPGA backend and post processing in order to perform coincidencemore » measurements between any two or more detector channels simultaneously. Using this method, states from a multi-photon polarization entangled source are detected and characterized via coincidence counting on the FPGA. Photons detection events are also processed by the quantum information toolkit for application testing (QITKAT)« less
High-Speed Sealift Technology. Volume 1

DTIC Science & Technology

1998-09-01

performance of high - speed commercial and military sealift ships , in advance of detailed design studies, in order to help define realistic future mission...Therefore, the viability of new High - Speed Sealift (HSS) ships (oceangoing cargo vessels capable of at least 40 kt that are able to onload and offload... propulsion power for dynamically supported concepts) VK = average ship speed for a voyage (i.e., sustained or service speed )
STRS SpaceWire FPGA Module

NASA Technical Reports Server (NTRS)

Lux, James P.; Taylor, Gregory H.; Lang, Minh; Stern, Ryan A.

2011-01-01

An FPGA module leverages the previous work from Goddard Space Flight Center (GSFC) relating to NASA s Space Telecommunications Radio System (STRS) project. The STRS SpaceWire FPGA Module is written in the Verilog Register Transfer Level (RTL) language, and it encapsulates an unmodified GSFC core (which is written in VHDL). The module has the necessary inputs/outputs (I/Os) and parameters to integrate seamlessly with the SPARC I/O FPGA Interface module (also developed for the STRS operating environment, OE). Software running on the SPARC processor can access the configuration and status registers within the SpaceWire module. This allows software to control and monitor the SpaceWire functions, but it is also used to give software direct access to what is transmitted and received through the link. SpaceWire data characters can be sent/received through the software interface, as well as through the dedicated interface on the GSFC core. Similarly, SpaceWire time codes can be sent/received through the software interface or through a dedicated interface on the core. This innovation is designed for plug-and-play integration in the STRS OE. The SpaceWire module simplifies the interfaces to the GSFC core, and synchronizes all I/O to a single clock. An interrupt output (with optional masking) identifies time-sensitive events within the module. Test modes were added to allow internal loopback of the SpaceWire link and internal loopback of the client-side data interface.
High-speed sailing

NASA Astrophysics Data System (ADS)

Püschl, Wolfgang

2018-07-01

This article is to review, for the benefit of university teachers, the most important arguments concerning the theory of sailing, especially regarding its high-speed aspect. The matter presented should be appropriate for students with basic knowledge of physics, such as advanced undergraduate or graduate. It is intended, furthermore, to put recent developments in the art of sailing in the proper historic perspective. We first regard the general geometric and dynamic conditions for steady sailing on a given course and then take a closer look at the high-speed case and its counter-intuitive aspects. A short overview is given on how the aero-hydrodynamic lift force arises, disposing of some wrong but entrenched ideas. The multi-faceted, composite nature of the drag force is expounded, with the special case of wave drag as a phenomenon at the boundary between different media. It is discussed how these various factors have to contribute in order to attain maximum speed. Modern solutions to this optimisation problem are considered, as well as their repercussions on the sport of sailing now and in the future.
A parallel architecture of interpolated timing recovery for high- speed data transfer rate and wide capture-range

NASA Astrophysics Data System (ADS)

Higashino, Satoru; Kobayashi, Shoei; Yamagami, Tamotsu

2007-06-01

High data transfer rate has been demanded for data storage devices along increasing the storage capacity. In order to increase the transfer rate, high-speed data processing techniques in read-channel devices are required. Generally, parallel architecture is utilized for the high-speed digital processing. We have developed a new architecture of Interpolated Timing Recovery (ITR) to achieve high-speed data transfer rate and wide capture-range in read-channel devices for the information storage channels. It facilitates the parallel implementation on large-scale-integration (LSI) devices.
FPGA for Power Control of MSL Avionics

NASA Technical Reports Server (NTRS)

Wang, Duo; Burke, Gary R.

2011-01-01

A PLGT FPGA (Field Programmable Gate Array) is included in the LCC (Load Control Card), GID (Guidance Interface & Drivers), TMC (Telemetry Multiplexer Card), and PFC (Pyro Firing Card) boards of the Mars Science Laboratory (MSL) spacecraft. (PLGT stands for PFC, LCC, GID, and TMC.) It provides the interface between the backside bus and the power drivers on these boards. The LCC drives power switches to switch power loads, and also relays. The GID drives the thrusters and latch valves, as well as having the star-tracker and Sun-sensor interface. The PFC drives pyros, and the TMC receives digital and analog telemetry. The FPGA is implemented both in Xilinx (Spartan 3- 400) and in Actel (RTSX72SU, ASX72S). The Xilinx Spartan 3 part is used for the breadboard, the Actel ASX part is used for the EM (Engineer Module), and the pin-compatible, radiation-hardened RTSX part is used for final EM and flight. The MSL spacecraft uses a FC (Flight Computer) to control power loads, relays, thrusters, latch valves, Sun-sensor, and star-tracker, and to read telemetry such as temperature. Commands are sent over a 1553 bus to the MREU (Multi-Mission System Architecture Platform Remote Engineering Unit). The MREU resends over a remote serial command bus c-bus to the LCC, GID TMC, and PFC. The MREU also sends out telemetry addresses via a remote serial telemetry address bus to the LCC, GID, TMC, and PFC, and the status is returned over the remote serial telemetry data bus.
Design of an FPGA-Based Algorithm for Real-Time Solutions of Statistics-Based Positioning

PubMed Central

DeWitt, Don; Johnson-Williams, Nathan G.; Miyaoka, Robert S.; Li, Xiaoli; Lockhart, Cate; Lewellen, Tom K.; Hauck, Scott

2010-01-01

We report on the implementation of an algorithm and hardware platform to allow real-time processing of the statistics-based positioning (SBP) method for continuous miniature crystal element (cMiCE) detectors. The SBP method allows an intrinsic spatial resolution of ~1.6 mm FWHM to be achieved using our cMiCE design. Previous SBP solutions have required a postprocessing procedure due to the computation and memory intensive nature of SBP. This new implementation takes advantage of a combination of algebraic simplifications, conversion to fixed-point math, and a hierarchal search technique to greatly accelerate the algorithm. For the presented seven stage, 127 × 127 bin LUT implementation, these algorithm improvements result in a reduction from >7 × 106 floating-point operations per event for an exhaustive search to < 5 × 103 integer operations per event. Simulations show nearly identical FWHM positioning resolution for this accelerated SBP solution, and positioning differences of <0.1 mm from the exhaustive search solution. A pipelined field programmable gate array (FPGA) implementation of this optimized algorithm is able to process events in excess of 250 K events per second, which is greater than the maximum expected coincidence rate for an individual detector. In contrast with all detectors being processed at a centralized host, as in the current system, a separate FPGA is available at each detector, thus dividing the computational load. These methods allow SBP results to be calculated in real-time and to be presented to the image generation components in real-time. A hardware implementation has been developed using a commercially available prototype board. PMID:21197135
Micromirror structured illumination microscope for high-speed in vivo drosophila brain imaging.

PubMed

Masson, A; Pedrazzani, M; Benrezzak, S; Tchenio, P; Preat, T; Nutarelli, D

2014-01-27

Genetic tools and especially genetically encoded fluorescent reporters have given a special place to optical microscopy in drosophila neurobiology research. In order to monitor neural networks activity, high speed and sensitive techniques, with high spatial resolution are required. Structured illumination microscopies are wide-field approaches with optical sectioning ability. Despite the large progress made with the introduction of the HiLo principle, they did not meet the criteria of speed and/or spatial resolution for drosophila brain imaging. We report on a new implementation that took advantage of micromirror matrix technology to structure the illumination. Thus, we showed that the developed instrument exhibits a spatial resolution close to that of confocal microscopy but it can record physiological responses with a speed improved by more than an order a magnitude.
High-speed polarized light microscopy for in situ, dynamic measurement of birefringence properties

NASA Astrophysics Data System (ADS)

Wu, Xianyu; Pankow, Mark; Shadow Huang, Hsiao-Ying; Peters, Kara

2018-01-01

A high-speed, quantitative polarized light microscopy (QPLM) instrument has been developed to monitor the optical slow axis spatial realignment during controlled medium to high strain rate experiments at acquisition rates up to 10 kHz. This high-speed QPLM instrument is implemented within a modified drop tower and demonstrated using polycarbonate specimens. By utilizing a rotating quarter wave plate and a high-speed camera, the minimum acquisition time to generate an alignment map of a birefringent specimen is 6.1 ms. A sequential analysis method allows the QPLM instrument to generate QPLM data at the high-speed camera imaging frequency 10 kHz. The obtained QPLM data is processed using a vector correlation technique to detect anomalous optical axis realignment and retardation changes throughout the loading event. The detected anomalous optical axis realignment is shown to be associated with crack initiation, propagation, and specimen failure in a dynamically loaded polycarbonate specimen. The work provides a foundation for detecting damage in biological tissues through local collagen fiber realignment and fracture during dynamic loading.
Filtered Mass Density Function for Design Simulation of High Speed Airbreathing Propulsion Systems

NASA Technical Reports Server (NTRS)

Givi, P.; Madnia, C. K.; Gicquel, L. Y. M.; Sheikhi, M. R. H.; Drozda, T. G.

2002-01-01

The objective of this research is to improve and implement the filtered mass density function (FDF) methodology for large eddy simulation (LES) of high speed reacting turbulent flows. NASA is interested in the design of various components involved in air breathing propulsion systems such as the scramjet. There is a demand for development of robust tools that can aid in the design procedure. The physics of high speed reactive flows is rich with many complexities. LES is regarded as one of the most promising means of simulating turbulent reacting flows.
Radiation Tolerant, FPGA-Based SmallSat Computer System

NASA Technical Reports Server (NTRS)

LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew

2015-01-01

The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
Comparing an FPGA to a Cell for an Image Processing Application

NASA Astrophysics Data System (ADS)

Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.

2010-12-01

Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to discover the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, Iris Recognition Systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator

PubMed Central

Wang, Runchun M.; Thakur, Chetan S.; van Schaik, André

2018-01-01

This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks. PMID:29692702
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator.

PubMed

Wang, Runchun M; Thakur, Chetan S; van Schaik, André

2018-01-01

This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.
High speed micromachining with high power UV laser

NASA Astrophysics Data System (ADS)

Patel, Rajesh S.; Bovatsek, James M.

2013-03-01

Increasing demand for creating fine features with high accuracy in manufacturing of electronic mobile devices has fueled growth for lasers in manufacturing. High power, high repetition rate ultraviolet (UV) lasers provide an opportunity to implement a cost effective high quality, high throughput micromachining process in a 24/7 manufacturing environment. The energy available per pulse and the pulse repetition frequency (PRF) of diode pumped solid state (DPSS) nanosecond UV lasers have increased steadily over the years. Efficient use of the available energy from a laser is important to generate accurate fine features at a high speed with high quality. To achieve maximum material removal and minimal thermal damage for any laser micromachining application, use of the optimal process parameters including energy density or fluence (J/cm2), pulse width, and repetition rate is important. In this study we present a new high power, high PRF QuasarR 355-40 laser from Spectra-Physics with TimeShiftTM technology for unique software adjustable pulse width, pulse splitting, and pulse shaping capabilities. The benefits of these features for micromachining include improved throughput and quality. Specific example and results of silicon scribing are described to demonstrate the processing benefits of the Quasar's available power, PRF, and TimeShift technology.
High-speed technique based on a parallel projection correlation procedure for digital image correlation

NASA Astrophysics Data System (ADS)

Zaripov, D. I.; Renfu, Li

2018-05-01

The implementation of high-efficiency digital image correlation methods based on a zero-normalized cross-correlation (ZNCC) procedure for high-speed, time-resolved measurements using a high-resolution digital camera is associated with big data processing and is often time consuming. In order to speed-up ZNCC computation, a high-speed technique based on a parallel projection correlation procedure is proposed. The proposed technique involves the use of interrogation window projections instead of its two-dimensional field of luminous intensity. This simplification allows acceleration of ZNCC computation up to 28.8 times compared to ZNCC calculated directly, depending on the size of interrogation window and region of interest. The results of three synthetic test cases, such as a one-dimensional uniform flow, a linear shear flow and a turbulent boundary-layer flow, are discussed in terms of accuracy. In the latter case, the proposed technique is implemented together with an iterative window-deformation technique. On the basis of the results of the present work, the proposed technique is recommended to be used for initial velocity field calculation, with further correction using more accurate techniques.
FPGA implementation of digital down converter using CORDIC algorithm

NASA Astrophysics Data System (ADS)

Agarwal, Ashok; Lakshmi, Boppana

2013-01-01

In radio receivers, Digital Down Converters (DDC) are used to translate the signal from Intermediate Frequency level to baseband. It also decimates the oversampled signal to a lower sample rate, eliminating the need of a high end digital signal processors. In this paper we have implemented architecture for DDC employing CORDIC algorithm, which down converts an IF signal of 70MHz (3G) to 200 KHz baseband GSM signal, with an SFDR greater than 100dB. The implemented architecture reduces the hardware resource requirements by 15 percent when compared with other architecture available in the literature due to elimination of explicit multipliers and a quadrature phase shifter for mixing.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.