Remote monitoring and fault recovery for FPGA-based field controllers of telescope and instruments
NASA Astrophysics Data System (ADS)
Zhu, Yuhua; Zhu, Dan; Wang, Jianing
2012-09-01
As the increasing size and more and more functions, modern telescopes have widely used the control architecture, i.e. central control unit plus field controller. FPGA-based field controller has the advantages of field programmable, which provide a great convenience for modifying software and hardware of control system. It also gives a good platform for implementation of the new control scheme. Because of multi-controlled nodes and poor working environment in scattered locations, reliability and stability of the field controller should be fully concerned. This paper mainly describes how we use the FPGA-based field controller and Ethernet remote to construct monitoring system with multi-nodes. When failure appearing, the new FPGA chip does self-recovery first in accordance with prerecovery strategies. In case of accident, remote reconstruction for the field controller can be done through network intervention if the chip is not being restored. This paper also introduces the network remote reconstruction solutions of controller, the system structure and transport protocol as well as the implementation methods. The idea of hardware and software design is given based on the FPGA. After actual operation on the large telescopes, desired results have been achieved. The improvement increases system reliability and reduces workload of maintenance, showing good application and popularization.
Diagnostic layer integration in FPGA-based pipeline measurement systems for HEP experiments
NASA Astrophysics Data System (ADS)
Pozniak, Krzysztof T.
2007-08-01
Integrated triggering and data acquisition systems for high energy physics experiments may be considered as fast, multichannel, synchronous, distributed, pipeline measurement systems. A considerable extension of functional, technological and monitoring demands, which has recently been imposed on them, forced a common usage of large field-programmable gate array (FPGA), digital signal processing-enhanced matrices and fast optical transmission for their realization. This paper discusses modelling, design, realization and testing of pipeline measurement systems. A distribution of synchronous data stream flows is considered in the network. A general functional structure of a single network node is presented. A suggested, novel block structure of the node model facilitates full implementation in the FPGA chip, circuit standardization and parametrization, as well as integration of functional and diagnostic layers. A general method for pipeline system design was derived. This method is based on a unified model of the synchronous data network node. A few examples of practically realized, FPGA-based, pipeline measurement systems were presented. The described systems were applied in ZEUS and CMS.
Rofoee, Bijan Rahimzadeh; Zervas, Georgios; Yan, Yan; Amaya, Norberto; Qin, Yixuan; Simeonidou, Dimitra
2013-03-11
The paper presents a novel network architecture on demand approach using on-chip and-off chip implementations, enabling programmable, highly efficient and transparent networking, well suited for intra-datacenter communications. The implemented FPGA-based adaptable line-card with on-chip design along with an architecture on demand (AoD) based off-chip flexible switching node, deliver single chip dual L2-Packet/L1-time shared optical network (TSON) server Network Interface Cards (NIC) interconnected through transparent AoD based switch. It enables hitless adaptation between Ethernet over wavelength switched network (EoWSON), and TSON based sub-wavelength switching, providing flexible bitrates, while meeting strict bandwidth, QoS requirements. The on and off-chip performance results show high throughput (9.86Ethernet, 8.68Gbps TSON), high QoS, as well as hitless switch-over.
Lifting Scheme DWT Implementation in a Wireless Vision Sensor Network
NASA Astrophysics Data System (ADS)
Ong, Jia Jan; Ang, L.-M.; Seng, K. P.
This paper presents the practical implementation of a Wireless Visual Sensor Network (WVSN) with DWT processing on the visual nodes. WVSN consists of visual nodes that capture video and transmit to the base-station without processing. Limitation of network bandwidth restrains the implementation of real time video streaming from remote visual nodes through wireless communication. Three layers of DWT filters are implemented to process the captured image from the camera. With having all the wavelet coefficients produced, it is possible just to transmit the low frequency band coefficients and obtain an approximate image at the base-station. This will reduce the amount of power required in transmission. When necessary, transmitting all the wavelet coefficients will produce the full detail of image, which is similar to the image captured at the visual nodes. The visual node combines the CMOS camera, Xilinx Spartan-3L FPGA and wireless ZigBee® network that uses the Ember EM250 chip.
Soft-core processor study for node-based architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Houten, Jonathan Roger; Jarosz, Jason P.; Welch, Benjamin James
2008-09-01
Node-based architecture (NBA) designs for future satellite projects hold the promise of decreasing system development time and costs, size, weight, and power and positioning the laboratory to address other emerging mission opportunities quickly. Reconfigurable Field Programmable Gate Array (FPGA) based modules will comprise the core of several of the NBA nodes. Microprocessing capabilities will be necessary with varying degrees of mission-specific performance requirements on these nodes. To enable the flexibility of these reconfigurable nodes, it is advantageous to incorporate the microprocessor into the FPGA itself, either as a hardcore processor built into the FPGA or as a soft-core processor builtmore » out of FPGA elements. This document describes the evaluation of three reconfigurable FPGA based processors for use in future NBA systems--two soft cores (MicroBlaze and non-fault-tolerant LEON) and one hard core (PowerPC 405). Two standard performance benchmark applications were developed for each processor. The first, Dhrystone, is a fixed-point operation metric. The second, Whetstone, is a floating-point operation metric. Several trials were run at varying code locations, loop counts, processor speeds, and cache configurations. FPGA resource utilization was recorded for each configuration. Cache configurations impacted the results greatly; for optimal processor efficiency it is necessary to enable caches on the processors. Processor caches carry a penalty; cache error mitigation is necessary when operating in a radiation environment.« less
Optoelectronic date acquisition system based on FPGA
NASA Astrophysics Data System (ADS)
Li, Xin; Liu, Chunyang; Song, De; Tong, Zhiguo; Liu, Xiangqing
2015-11-01
An optoelectronic date acquisition system is designed based on FPGA. FPGA chip that is EP1C3T144C8 of Cyclone devices from Altera corporation is used as the centre of logic control, XTP2046 chip is used as A/D converter, host computer that communicates with the date acquisition system through RS-232 serial communication interface are used as display device and photo resistance is used as photo sensor. We use Verilog HDL to write logic control code about FPGA. It is proved that timing sequence is correct through the simulation of ModelSim. Test results indicate that this system meets the design requirement, has fast response and stable operation by actual hardware circuit test.
Design of video interface conversion system based on FPGA
NASA Astrophysics Data System (ADS)
Zhao, Heng; Wang, Xiang-jun
2014-11-01
This paper presents a FPGA based video interface conversion system that enables the inter-conversion between digital and analog video. Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and single-chip is used as the information interaction control unit between FPGA and PC. The system is able to encode/decode messages from the PC. Technologies including video decoding/encoding circuits, bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion and the Camera Link timing generator module of FPGA are introduced. The system converts Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which will be collected by the video processing unit with Camera Link interface. The processed video signals will then be inputted to system output board and displayed on the monitor.The current experiment shows that it can achieve high-quality video conversion with minimum board size.
Li, Bingyi; Chen, Liang; Yu, Wenyue; Xie, Yizhuang; Bian, Mingming; Zhang, Qingjun; Pang, Long
2018-01-01
With the development of satellite load technology and very large-scale integrated (VLSI) circuit technology, on-board real-time synthetic aperture radar (SAR) imaging systems have facilitated rapid response to disasters. A key goal of the on-board SAR imaging system design is to achieve high real-time processing performance under severe size, weight, and power consumption constraints. This paper presents a multi-node prototype system for real-time SAR imaging processing. We decompose the commonly used chirp scaling (CS) SAR imaging algorithm into two parts according to the computing features. The linearization and logic-memory optimum allocation methods are adopted to realize the nonlinear part in a reconfigurable structure, and the two-part bandwidth balance method is used to realize the linear part. Thus, float-point SAR imaging processing can be integrated into a single Field Programmable Gate Array (FPGA) chip instead of relying on distributed technologies. A single-processing node requires 10.6 s and consumes 17 W to focus on 25-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384. The design methodology of the multi-FPGA parallel accelerating system under the real-time principle is introduced. As a proof of concept, a prototype with four processing nodes and one master node is implemented using a Xilinx xc6vlx315t FPGA. The weight and volume of one single machine are 10 kg and 32 cm × 24 cm × 20 cm, respectively, and the power consumption is under 100 W. The real-time performance of the proposed design is demonstrated on Chinese Gaofen-3 stripmap continuous imaging. PMID:29495637
Photoelectric radar servo control system based on ARM+FPGA
NASA Astrophysics Data System (ADS)
Wu, Kaixuan; Zhang, Yue; Li, Yeqiu; Dai, Qin; Yao, Jun
2016-01-01
In order to get smaller, faster, and more responsive requirements of the photoelectric radar servo control system. We propose a set of core ARM + FPGA architecture servo controller. Parallel processing capability of FPGA to be used for the encoder feedback data, PWM carrier modulation, A, B code decoding processing and so on; Utilizing the advantage of imaging design in ARM Embedded systems achieves high-speed implementation of the PID algorithm. After the actual experiment, the closed-loop speed of response of the system cycles up to 2000 times/s, in the case of excellent precision turntable shaft, using a PID algorithm to achieve the servo position control with the accuracy of + -1 encoder input code. Firstly, This article carry on in-depth study of the embedded servo control system hardware to determine the ARM and FPGA chip as the main chip with systems based on a pre-measured target required to achieve performance requirements, this article based on ARM chip used Samsung S3C2440 chip of ARM7 architecture , the FPGA chip is chosen xilinx's XC3S400 . ARM and FPGA communicate by using SPI bus, the advantage of using SPI bus is saving a lot of pins for easy system upgrades required thereafter. The system gets the speed datas through the photoelectric-encoder that transports the datas to the FPGA, Then the system transmits the datas through the FPGA to ARM, transforms speed datas into the corresponding position and velocity data in a timely manner, prepares the corresponding PWM wave to control motor rotation by making comparison between the position data and the velocity data setted in advance . According to the system requirements to draw the schematics of the photoelectric radar servo control system and PCB board to produce specially. Secondly, using PID algorithm to control the servo system, the datas of speed obtained from photoelectric-encoder is calculated position data and speed data via high-speed digital PID algorithm and coordinate models. Finally, a large number of experiments verify the reliability of embedded servo control system's functions, the stability of the program and the stability of the hardware circuit. Meanwhile, the system can also achieve the satisfactory of user experience, to achieve a multi-mode motion, real-time motion status monitoring, online system parameter changes and other convenient features.
Tuple spaces in hardware for accelerated implicit routing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Zachary Kent; Tripp, Justin
2010-12-01
Organizing and optimizing data objects on networks with support for data migration and failing nodes is a complicated problem to handle as systems grow. The goal of this work is to demonstrate that high levels of speedup can be achieved by moving responsibility for finding, fetching, and staging data into an FPGA-based network card. We present a system for implicit routing of data via FPGA-based network cards. In this system, data structures are requested by name, and the network of FPGAs finds the data within the network and relays the structure to the requester. This is acheived through successive examinationmore » of hardware hash tables implemented in the FPGA. By avoiding software stacks between nodes, the data is quickly fetched entirely through FPGA-FPGA interaction. The performance of this system is orders of magnitude faster than software implementations due to the improved speed of the hash tables and lowered latency between the network nodes.« less
Optimized FPGA Implementation of the Thyroid Hormone Secretion Mechanism Using CAD Tools.
Alghazo, Jaafar M
2017-02-01
The goal of this paper is to implement the secretion mechanism of the Thyroid Hormone (TH) based on bio-mathematical differential eqs. (DE) on an FPGA chip. Hardware Descriptive Language (HDL) is used to develop a behavioral model of the mechanism derived from the DE. The Thyroid Hormone secretion mechanism is simulated with the interaction of the related stimulating and inhibiting hormones. Synthesis of the simulation is done with the aid of CAD tools and downloaded on a Field Programmable Gate Arrays (FPGAs) Chip. The chip output shows identical behavior to that of the designed algorithm through simulation. It is concluded that the chip mimics the Thyroid Hormone secretion mechanism. The chip, operating in real-time, is computer-independent stand-alone system.
Design of the SLAC RCE Platform: A General Purpose ATCA Based Data Acquisition System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herbst, R.; Claus, R.; Freytag, M.
2015-01-23
The SLAC RCE platform is a general purpose clustered data acquisition system implemented on a custom ATCA compliant blade, called the Cluster On Board (COB). The core of the system is the Reconfigurable Cluster Element (RCE), which is a system-on-chip design based upon the Xilinx Zynq family of FPGAs, mounted on custom COB daughter-boards. The Zynq architecture couples a dual core ARM Cortex A9 based processor with a high performance 28nm FPGA. The RCE has 12 external general purpose bi-directional high speed links, each supporting serial rates of up to 12Gbps. 8 RCE nodes are included on a COB, eachmore » with a 10Gbps connection to an on-board 24-port Ethernet switch integrated circuit. The COB is designed to be used with a standard full-mesh ATCA backplane allowing multiple RCE nodes to be tightly interconnected with minimal interconnect latency. Multiple shelves can be clustered using the front panel 10-gbps connections. The COB also supports local and inter-blade timing and trigger distribution. An experiment specific Rear Transition Module adapts the 96 high speed serial links to specific experiments and allows an experiment-specific timing and busy feedback connection. This coupling of processors with a high performance FPGA fabric in a low latency, multiple node cluster allows high speed data processing that can be easily adapted to any physics experiment. RTEMS and Linux are both ported to the module. The RCE has been used or is the baseline for several current and proposed experiments (LCLS, HPS, LSST, ATLAS-CSC, LBNE, DarkSide, ILC-SiD, etc).« less
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations
NASA Astrophysics Data System (ADS)
Hamada, Tsuyoshi; Fukushige, Toshiyuki; Kawai, Atsushi; Makino, Junichiro
2000-10-01
We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and ``traditional'' GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter relies on a hardwired pipeline processor specialized to gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions, but also other forms of interactions, such as the van der Waals force, hydro\\-dynamical interactions in the SPHr calculation, and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interactions similar to that of GRAPE-3. One pipeline is fitted into a single FPGA chip, operated at 16 MHz clock. Thus, for gravitational interactions, PROGRAPE-1 provided a speed of 0.96 Gflops-equivalent. PROGRAPE will prove to be useful for a wide-range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.
FPGA chip performance improvement with gate shrink through alternating PSM 90nm process
NASA Astrophysics Data System (ADS)
Yu, Chun-Chi; Shieh, Ming-Feng; Liu, Erick; Lin, Benjamin; Ho, Jonathan; Wu, Xin; Panaite, Petrisor; Chacko, Manoj; Zhang, Yunqiang; Lei, Wen-Kang
2005-11-01
In the post-physical verification space called 'Mask Synthesis' a key component of design-for-manufacturing (DFM), double-exposure based, dark-field, alternating PSM (Alt-PSM) is being increasingly applied at the 90nm node in addition with other mature resolution enhancement techniques (RETs) such as optical proximity correction (OPC) and sub-resolution assist features (SRAF). Several high-performance IC manufacturers already use alt-PSM technology in 65nm production. At 90nm having strong control over the lithography process is a critical component in meeting targeted yield goals. However, implementing alt-PSM in production has been challenging due to several factors such as phase conflict errors, mask manufacturing, and the increased production cost due to the need for two masks in the process. Implementation of Alt-PSM generally requires phase compliance rules and proper phase topology in the layout and this has been successful for the technology node with these rules implemented. However, this may not be true for a mature, production process technology, in this case 90 nm. Especially, in the foundry-fabless business model where the foundry provides a standard set of design rules to its customers for a given process technology, and where not all the foundry customers require Alt-PSM in their tapeout flow. With minimum design changes, design houses usually are motivated by higher product performance for the existing designs. What follows is an in-depth review of the motivation to apply alt-PSM on a production FPGA, the DFM challenges to each partner faced, its effect on the tapeout flow, and how design, manufacturing, and EDA teams worked together to resolve phase conflicts, tapeout the chip, and finally verify the silicon results in production.
Design of transient light signal simulator based on FPGA
NASA Astrophysics Data System (ADS)
Kang, Jing; Chen, Rong-li; Wang, Hong
2014-11-01
A design scheme of transient light signal simulator based on Field Programmable gate Array (FPGA) was proposed in this paper. Based on the characteristics of transient light signals and measured feature points of optical intensity signals, a fitted curve was created in MATLAB. And then the wave data was stored in a programmed memory chip AT29C1024 by using SUPERPRO programmer. The control logic was realized inside one EP3C16 FPGA chip. Data readout, data stream cache and a constant current buck regulator for powering high-brightness LEDs were all controlled by FPGA. A 12-Bit multiplying CMOS digital-to-analog converter (DAC) DAC7545 and an amplifier OPA277 were used to convert digital signals to voltage signals. A voltage-controlled current source constituted by a NPN transistor and an operational amplifier controlled LED array diming to achieve simulation of transient light signal. LM3405A, 1A Constant Current Buck Regulator for Powering LEDs, was used to simulate strong background signal in space. Experimental results showed that the scheme as a transient light signal simulator can satisfy the requests of the design stably.
Implementation of the Timepix ASIC in the Scalable Readout System
NASA Astrophysics Data System (ADS)
Lupberger, M.; Desch, K.; Kaminski, J.
2016-09-01
We report on the development of electronics hardware, FPGA firmware and software to provide a flexible multi-chip readout of the Timepix ASIC within the framework of the Scalable Readout System (SRS). The system features FPGA-based zero-suppression and the possibility to read out up to 4×8 chips with a single Front End Concentrator (FEC). By operating several FECs in parallel, in principle an arbitrary number of chips can be read out, exploiting the scaling features of SRS. Specifically, we tested the system with a setup consisting of 160 Timepix ASICs, operated as GridPix devices in a large TPC field cage in a 1 T magnetic field at a DESY test beam facility providing an electron beam of up to 6 GeV. We discuss the design choices, the dedicated hardware components, the FPGA firmware as well as the performance of the system in the test beam.
Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming
2011-11-01
Laser speckle imaging (LSI) is a noninvasive and full-field optical imaging technique which produces two-dimensional blood flow maps of tissues from the raw laser speckle images captured by a CCD camera without scanning. We present a hardware-friendly algorithm for the real-time processing of laser speckle imaging. The algorithm is developed and optimized specifically for LSI processing in the field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in FPGA. The pipeline processing scheme and parallel computing architecture are introduced into the design of this LSI hardware processor. When the LSI hardware processor is implemented in the FPGA running at the maximum frequency of 130 MHz, up to 85 raw images with the resolution of 640×480 pixels can be processed per second. Meanwhile, we also present a system on chip (SOC) solution for LSI processing by integrating the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution also can be used to produce an application specific integrated circuit for LSI processing.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubois, David H; Dubois, Andrew J; Boorman, Thomas M
2009-01-01
This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{sup TM} in conjunction with x86 Opteron{sup TM} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Xia, Fei; Jin, Guoqing
2014-06-01
PKNOTS is a most famous benchmark program and has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard four-dimensional (4D) dynamic programming (DP) method and is the basis of many variants and improved algorithms. Unfortunately, the O(N(6)) computing requirements and complicated data dependency greatly limits the usefulness of PKNOTS package with the explosion in gene database size. In this paper, we present a fine-grained parallel PKNOTS package and prototype system for accelerating RNA folding application based on FPGA chip. We adopted a series of storage optimization strategies to resolve the "Memory Wall" problem. We aggressively exploit parallel computing strategies to improve computational efficiency. We also propose several methods that collectively reduce the storage requirements for FPGA on-chip memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating 4D DP problem for RNA folding application including pseudoknots. The experimental results show a factor of more than 50x average speedup over the PKNOTS-1.08 software running on a PC platform with Intel Core2 Q9400 Quad CPU for input RNA sequences. However, the power consumption of our FPGA accelerator is only about 50% of the general-purpose micro-processors.
High-definition video display based on the FPGA and THS8200
NASA Astrophysics Data System (ADS)
Qian, Jia; Sui, Xiubao
2014-11-01
This paper presents a high-definition video display solution based on the FPGA and THS8200. THS8200 is a video decoder chip launched by TI company, this chip has three 10-bit DAC channels which can capture video data in both 4:2:2 and 4:4:4 formats, and its data synchronization can be either through the dedicated synchronization signals HSYNC and VSYNC, or extracted from the embedded video stream synchronization information SAV / EAV code. In this paper, we will utilize the address and control signals generated by FPGA to access to the data-storage array, and then the FPGA generates the corresponding digital video signals YCbCr. These signals combined with the synchronization signals HSYNC and VSYNC that are also generated by the FPGA act as the input signals of THS8200. In order to meet the bandwidth requirements of the high-definition TV, we adopt video input in the 4:2:2 format over 2×10-bit interface. THS8200 is needed to be controlled by FPGA with I2C bus to set the internal registers, and as a result, it can generate the synchronous signal that is satisfied with the standard SMPTE and transfer the digital video signals YCbCr into analog video signals YPbPr. Hence, the composite analog output signals YPbPr are consist of image data signal and synchronous signal which are superimposed together inside the chip THS8200. The experimental research indicates that the method presented in this paper is a viable solution for high-definition video display, which conforms to the input requirements of the new high-definition display devices.
Design of the ANTARES LCM-DAQ board test bench using a FPGA-based system-on-chip approach
NASA Astrophysics Data System (ADS)
Anvar, S.; Kestener, P.; Le Provost, H.
2006-11-01
The System-on-Chip (SoC) approach consists in using state-of-the-art FPGA devices with embedded RISC processor cores, high-speed differential LVDS links and ready-to-use multi-gigabit transceivers allowing development of compact systems with substantial number of IO channels. Required performances are obtained through a subtle separation of tasks between closely cooperating programmable hardware logic and user-friendly software environment. We report about our experience in using the SoC approach for designing the production test bench of the off-shore readout system for the ANTARES neutrino experiment.
Central FPGA-based destination and load control in the LHCb MHz event readout
NASA Astrophysics Data System (ADS)
Jacobsson, R.
2012-10-01
The readout strategy of the LHCb experiment is based on complete event readout at 1 MHz. A set of 320 sub-detector readout boards transmit event fragments at total rate of 24.6 MHz at a bandwidth usage of up to 70 GB/s over a commercial switching network based on Gigabit Ethernet to a distributed event building and high-level trigger processing farm with 1470 individual multi-core computer nodes. In the original specifications, the readout was based on a pure push protocol. This paper describes the proposal, implementation, and experience of a non-conventional mixture of a push and a pull protocol, akin to credit-based flow control. An FPGA-based central master module, partly operating at the LHC bunch clock frequency of 40.08 MHz and partly at a double clock speed, is in charge of the entire trigger and readout control from the front-end electronics up to the high-level trigger farm. One FPGA is dedicated to controlling the event fragment packing in the readout boards, the assignment of the farm node destination for each event, and controls the farm load based on an asynchronous pull mechanism from each farm node. This dynamic readout scheme relies on generic event requests and the concept of node credit allowing load control and trigger rate regulation as a function of the global farm load. It also allows the vital task of fast central monitoring and automatic recovery in-flight of failing nodes while maintaining dead-time and event loss at a minimum. This paper demonstrates the strength and suitability of implementing this real-time task for a very large distributed system in an FPGA where no random delays are introduced, and where extreme reliability and accurate event accounting are fundamental requirements. It was in use during the entire commissioning phase of LHCb and has been in faultless operation during the first two years of physics luminosity data taking.
FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling.
Kim, Chang-Min; Park, Hyung-Min; Kim, Taesu; Choi, Yoon-Kyung; Lee, Soo-Young
2003-01-01
An field programmable gate array (FPGA) implementation of independent component analysis (ICA) algorithm is reported for blind signal separation (BSS) and adaptive noise canceling (ANC) in real time. In order to provide enormous computing power for ICA-based algorithms with multipath reverberation, a special digital processor is designed and implemented in FPGA. The chip design fully utilizes modular concept and several chips may be put together for complex applications with a large number of noise sources. Experimental results with a fabricated test board are reported for ANC only, BSS only, and simultaneous ANC/BSS, which demonstrates successful speech enhancement in real environments in real time.
Improved On-Chip Measurement of Delay in an FPGA or ASIC
NASA Technical Reports Server (NTRS)
Chen, Yuan; Burke, Gary; Sheldon, Douglas
2007-01-01
An improved design has been devised for on-chip-circuitry for measuring the delay through a chain of combinational logic elements in a field-programmable gate array (FPGA) or application-specific integrated circuit (ASIC). In the improved design, the delay chain does not include input and output buffers and is not configured as an oscillator. Instead, the delay chain is made part of the signal chain of an on-chip pulse generator. The duration of the pulse is measured on-chip and taken to equal the delay.
Research on NC motion controller based on SOPC technology
NASA Astrophysics Data System (ADS)
Jiang, Tingbiao; Meng, Biao
2006-11-01
With the rapid development of the digitization and informationization, the application of numerical control technology in the manufacturing industry becomes more and more important. However, the conventional numerical control system usually has some shortcomings such as the poor in system openness, character of real-time, cutability and reconfiguration. In order to solve these problems, this paper investigates the development prospect and advantage of the application in numerical control area with system-on-a-Programmable-Chip (SOPC) technology, and puts forward to a research program approach to the NC controller based on SOPC technology. Utilizing the characteristic of SOPC technology, we integrate high density logic device FPGA, memory SRAM, and embedded processor ARM into a single programmable logic device. We also combine the 32-bit RISC processor with high computing capability of the complicated algorithm with the FPGA device with strong motivable reconfiguration logic control ability. With these steps, we can greatly resolve the defect described in above existing numerical control systems. For the concrete implementation method, we use FPGA chip embedded with ARM hard nuclear processor to construct the control core of the motion controller. We also design the peripheral circuit of the controller according to the requirements of actual control functions, transplant real-time operating system into ARM, design the driver of the peripheral assisted chip, develop the application program to control and configuration of FPGA, design IP core of logic algorithm for various NC motion control to configured it into FPGA. The whole control system uses the concept of modular and structured design to develop hardware and software system. Thus the NC motion controller with the advantage of easily tailoring, highly opening, reconfigurable, and expandable can be implemented.
Design of an MR image processing module on an FPGA chip
NASA Astrophysics Data System (ADS)
Li, Limin; Wyrwicz, Alice M.
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments.
Design of an MR image processing module on an FPGA chip
Li, Limin; Wyrwicz, Alice M.
2015-01-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128 × 128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. PMID:25909646
FPGA Online Tracking Algorithm for the PANDA Straw Tube Tracker
NASA Astrophysics Data System (ADS)
Liang, Yutie; Ye, Hua; Galuska, Martin J.; Gessler, Thomas; Kuhn, Wolfgang; Lange, Jens Soren; Wagner, Milan N.; Liu, Zhen'an; Zhao, Jingzhou
2017-06-01
A novel FPGA based online tracking algorithm for helix track reconstruction in a solenoidal field, developed for the PANDA spectrometer, is described. Employing the Straw Tube Tracker detector with 4636 straw tubes, the algorithm includes a complex track finder, and a track fitter. Implemented in VHDL, the algorithm is tested on a Xilinx Virtex-4 FX60 FPGA chip with different types of events, at different event rates. A processing time of 7 $\\mu$s per event for an average of 6 charged tracks is obtained. The momentum resolution is about 3\\% (4\\%) for $p_t$ ($p_z$) at 1 GeV/c. Comparing to the algorithm running on a CPU chip (single core Intel Xeon E5520 at 2.26 GHz), an improvement of 3 orders of magnitude in processing time is obtained. The algorithm can handle severe overlapping of events which are typical for interaction rates above 10 MHz.
Address-event-based platform for bioinspired spiking systems
NASA Astrophysics Data System (ADS)
Jiménez-Fernández, A.; Luján, C. D.; Linares-Barranco, A.; Gómez-Rodríguez, F.; Rivas, M.; Jiménez, G.; Civit, A.
2007-05-01
Address Event Representation (AER) is an emergent neuromorphic interchip communication protocol that allows a real-time virtual massive connectivity between huge number neurons, located on different chips. By exploiting high speed digital communication circuits (with nano-seconds timings), synaptic neural connections can be time multiplexed, while neural activity signals (with mili-seconds timings) are sampled at low frequencies. Also, neurons generate "events" according to their activity levels. More active neurons generate more events per unit time, and access the interchip communication channel more frequently, while neurons with low activity consume less communication bandwidth. When building multi-chip muti-layered AER systems, it is absolutely necessary to have a computer interface that allows (a) reading AER interchip traffic into the computer and visualizing it on the screen, and (b) converting conventional frame-based video stream in the computer into AER and injecting it at some point of the AER structure. This is necessary for test and debugging of complex AER systems. In the other hand, the use of a commercial personal computer implies to depend on software tools and operating systems that can make the system slower and un-robust. This paper addresses the problem of communicating several AER based chips to compose a powerful processing system. The problem was discussed in the Neuromorphic Engineering Workshop of 2006. The platform is based basically on an embedded computer, a powerful FPGA and serial links, to make the system faster and be stand alone (independent from a PC). A new platform is presented that allow to connect up to eight AER based chips to a Spartan 3 4000 FPGA. The FPGA is responsible of the network communication based in Address-Event and, at the same time, to map and transform the address space of the traffic to implement a pre-processing. A MMU microprocessor (Intel XScale 400MHz Gumstix Connex computer) is also connected to the FPGA to allow the platform to implement eventbased algorithms to interact to the AER system, like control algorithms, network connectivity, USB support, etc. The LVDS transceiver allows a bandwidth of up to 1.32 Gbps, around ~66 Mega events per second (Mevps).
FPGA based control system for space instrumentation
NASA Astrophysics Data System (ADS)
Di Giorgio, Anna M.; Cerulli Irelli, Pasquale; Nuzzolo, Francesco; Orfei, Renato; Spinoglio, Luigi; Liu, Giovanni S.; Saraceno, Paolo
2008-07-01
The prototype for a general purpose FPGA based control system for space instrumentation is presented, with particular attention to the instrument control application software. The system HW is based on the LEON3FT processor, which gives the flexibility to configure the chip with only the necessary HW functionalities, from simple logic up to small dedicated processors. The instrument control SW is developed in ANSI C and for time critical (<10μs) commanding sequences implements an internal instructions sequencer, triggered via an interrupt service routine based on a HW high priority interrupt.
Locomotive track detection for underground
NASA Astrophysics Data System (ADS)
Ma, Zhonglei; Lang, Wenhui; Li, Xiaoming; Wei, Xing
2017-08-01
In order to improve the PC-based track detection system, this paper proposes a method to detect linear track for underground locomotive based on DSP + FPGA. Firstly, the analog signal outputted from the camera is sampled by A / D chip. Then the collected digital signal is preprocessed by FPGA. Secondly, the output signal of FPGA is transmitted to DSP via EMIF port. Subsequently, the adaptive threshold edge detection, polar angle and radius constrain based Hough transform are implemented by DSP. Lastly, the detected track information is transmitted to host computer through Ethernet interface. The experimental results show that the system can not only meet the requirements of real-time detection, but also has good robustness.
Sensor Systems Based on FPGAs and Their Applications: A Survey
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2012-01-01
In this manuscript, we present a survey of designs and implementations of research sensor nodes that rely on FPGAs, either based upon standalone platforms or as a combination of microcontroller and FPGA. Several current challenges in sensor networks are distinguished and linked to the features of modern FPGAs. As it turns out, low-power optimized FPGAs are able to enhance the computation of several types of algorithms in terms of speed and power consumption in comparison to microcontrollers of commercial sensor nodes. We show that architectures based on the combination of microcontrollers and FPGA can play a key role in the future of sensor networks, in fields where processing capabilities such as strong cryptography, self-testing and data compression, among others, are paramount.
Baker, Zachary Kent; Power, John Fredrick; Tripp, Justin Leonard; Dunham, Mark Edward; Stettler, Matthew W; Jones, John Alexander
2014-10-14
Disclosed is a method and system for performing operations on at least one input data vector in order to produce at least one output vector to permit easy, scalable and fast programming of a petascale equivalent supercomputer. A PetaFlops Router may comprise one or more PetaFlops Nodes, which may be connected to each other and/or external data provider/consumers via a programmable crossbar switch external to the PetaFlops Node. Each PetaFlops Node has a FPGA and a programmable intra-FPGA crossbar switch that permits input and output variables to be configurably connected to various physical operators contained in the FPGA as desired by a user. This allows a user to specify the instruction set of the system on a per-application basis. Further, the intra-FPGA crossbar switch permits the output of one operation to be delivered as an input to a second operation. By configuring the external crossbar switch, the output of a first operation on a first PetaFlops Node may be used as the input for a second operation on a second PetaFlops Node. An embodiment may provide an ability for the system to recognize and generate pipelined functions. Streaming operators may be connected together at run-time and appropriately staged to allow data to flow through a series of functions. This allows the system to provide high throughput and parallelism when possible. The PetaFlops Router may implement the user desired instructions by appropriately configuring the intra-FPGA crossbar switch on each PetaFlops Node and the external crossbar switch.
Precise delay measurement through combinatorial logic
NASA Technical Reports Server (NTRS)
Burke, Gary R. (Inventor); Chen, Yuan (Inventor); Sheldon, Douglas J. (Inventor)
2010-01-01
A high resolution circuit and method for facilitating precise measurement of on-chip delays for FPGAs for reliability studies. The circuit embeds a pulse generator on an FPGA chip having one or more groups of LUTS (the "LUT delay chain"), also on-chip. The circuit also embeds a pulse width measurement circuit on-chip, and measures the duration of the generated pulse through the delay chain. The pulse width of the output pulse represents the delay through the delay chain without any I/O delay. The pulse width measurement circuit uses an additional asynchronous clock autonomous from the main clock and the FPGA propagation delay can be displayed on a hex display continuously for testing purposes.
A single FPGA-based portable ultrasound imaging system for point-of-care applications.
Kim, Gi-Duck; Yoon, Changhan; Kye, Sang-Bum; Lee, Youngbae; Kang, Jeeun; Yoo, Yangmo; Song, Tai-kyong
2012-07-01
We present a cost-effective portable ultrasound system based on a single field-programmable gate array (FPGA) for point-of-care applications. In the portable ultrasound system developed, all the ultrasound signal and image processing modules, including an effective 32-channel receive beamformer with pseudo-dynamic focusing, are embedded in an FPGA chip. For overall system control, a mobile processor running Linux at 667 MHz is used. The scan-converted ultrasound image data from the FPGA are directly transferred to the system controller via external direct memory access without a video processing unit. The potable ultrasound system developed can provide real-time B-mode imaging with a maximum frame rate of 30, and it has a battery life of approximately 1.5 h. These results indicate that the single FPGA-based portable ultrasound system developed is able to meet the processing requirements in medical ultrasound imaging while providing improved flexibility for adapting to emerging POC applications.
Design of an MR image processing module on an FPGA chip.
Li, Limin; Wyrwicz, Alice M
2015-06-01
We describe the design and implementation of an image processing module on a single-chip Field-Programmable Gate Array (FPGA) for real-time image processing. We also demonstrate that through graphical coding the design work can be greatly simplified. The processing module is based on a 2D FFT core. Our design is distinguished from previously reported designs in two respects. No off-chip hardware resources are required, which increases portability of the core. Direct matrix transposition usually required for execution of 2D FFT is completely avoided using our newly-designed address generation unit, which saves considerable on-chip block RAMs and clock cycles. The image processing module was tested by reconstructing multi-slice MR images from both phantom and animal data. The tests on static data show that the processing module is capable of reconstructing 128×128 images at speed of 400 frames/second. The tests on simulated real-time streaming data demonstrate that the module works properly under the timing conditions necessary for MRI experiments. Copyright © 2015 Elsevier Inc. All rights reserved.
SAD-Based Stereo Vision Machine on a System-on-Programmable-Chip (SoPC)
Zhang, Xiang; Chen, Zhangwei
2013-01-01
This paper, proposes a novel solution for a stereo vision machine based on the System-on-Programmable-Chip (SoPC) architecture. The SOPC technology provides great convenience for accessing many hardware devices such as DDRII, SSRAM, Flash, etc., by IP reuse. The system hardware is implemented in a single FPGA chip involving a 32-bit Nios II microprocessor, which is a configurable soft IP core in charge of managing the image buffer and users' configuration data. The Sum of Absolute Differences (SAD) algorithm is used for dense disparity map computation. The circuits of the algorithmic module are modeled by the Matlab-based DSP Builder. With a set of configuration interfaces, the machine can process many different sizes of stereo pair images. The maximum image size is up to 512 K pixels. This machine is designed to focus on real time stereo vision applications. The stereo vision machine offers good performance and high efficiency in real time. Considering a hardware FPGA clock of 90 MHz, 23 frames of 640 × 480 disparity maps can be obtained in one second with 5 × 5 matching window and maximum 64 disparity pixels. PMID:23459385
DDGIPS: a general image processing system in robot vision
NASA Astrophysics Data System (ADS)
Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang
2000-10-01
Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi-algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6201s each has a local memory space, and they also have a shared system memory space which enables them to intercommunicate and exchange data efficiently. At the same time, they can be directly inter-connected in star-shaped architecture. All of these are under the control of a FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
DDGIPS: a general image processing system in robot vision
NASA Astrophysics Data System (ADS)
Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang
2000-10-01
Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi- algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6210s each has a local memory space, and they also have a shared system memory space which enable them to intercommunicate and exchange data efficiently. At the same time, they can be directly interconnected in star- shaped architecture. All of these are under the control of FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
Systems-on-chip approach for real-time simulation of wheel-rail contact laws
NASA Astrophysics Data System (ADS)
Mei, T. X.; Zhou, Y. J.
2013-04-01
This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
A generic FPGA-based detector readout and real-time image processing board
NASA Astrophysics Data System (ADS)
Sarpotdar, Mayuresh; Mathew, Joice; Safonova, Margarita; Murthy, Jayant
2016-07-01
For space-based astronomical observations, it is important to have a mechanism to capture the digital output from the standard detector for further on-board analysis and storage. We have developed a generic (application- wise) field-programmable gate array (FPGA) board to interface with an image sensor, a method to generate the clocks required to read the image data from the sensor, and a real-time image processor system (on-chip) which can be used for various image processing tasks. The FPGA board is applied as the image processor board in the Lunar Ultraviolet Cosmic Imager (LUCI) and a star sensor (StarSense) - instruments developed by our group. In this paper, we discuss the various design considerations for this board and its applications in the future balloon and possible space flights.
NASA Astrophysics Data System (ADS)
Naldi, G.; Bartolini, M.; Mattana, A.; Pupillo, G.; Hickish, J.; Foster, G.; Bianchi, G.; Lingua, A.; Monari, J.; Montebugnoli, S.; Perini, F.; Rusticelli, S.; Schiaffino, M.; Virone, G.; Zarb Adami, K.
In radio astronomy Field Programmable Gate Array (FPGA) technology is largely used for the implementation of digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption and cost offered by FPGA chip compared to other technologies like ASIC, GPU and CPU. In the last years several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments like FX correlator, direct imager, beamformer, multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper we present the gained experience in this kind of applications.
A FPGA embedded web server for remote monitoring and control of smart sensors networks.
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2013-12-27
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces, the webserver that acts as an Internet interface and the other to control the network. This protocol is widely used to connecting smart sensors and actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although this protocol can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology.
A FPGA Embedded Web Server for Remote Monitoring and Control of Smart Sensors Networks
Magdaleno, Eduardo; Rodríguez, Manuel; Pérez, Fernando; Hernández, David; García, Enrique
2014-01-01
This article describes the implementation of a web server using an embedded Altera NIOS II IP core, a general purpose and configurable RISC processor which is embedded in a Cyclone FPGA. The processor uses the μCLinux operating system to support a Boa web server of dynamic pages using Common Gateway Interface (CGI). The FPGA is configured to act like the master node of a network, and also to control and monitor a network of smart sensors or instruments. In order to develop a totally functional system, the FPGA also includes an implementation of the time-triggered protocol (TTP/A). Thus, the implemented master node has two interfaces, the webserver that acts as an Internet interface and the other to control the network. This protocol is widely used to connecting smart sensors and actuators and microsystems in embedded real-time systems in different application domains, e.g., industrial, automotive, domotic, etc., although this protocol can be easily replaced by any other because of the inherent characteristics of the FPGA-based technology. PMID:24379047
Moving Horizon Estimation on a Chip
2014-06-26
description, e.g. VHDL or Verilog, for FPGA implementation . Especially for those whose main expertise is in control system design, writing algorithms in C...ditional Kalman Filter(KF) where recursive solution is available. We devel- oped various MHE designs and implemented them on the Xilinx Zynq ZC702 FPGA...practical deployment of the MHE technology. 2.2 Implementation of MHE on FPGA The next paper demonstrated the feasibility of implementing MHE algo
Solving global shallow water equations on heterogeneous supercomputers
Fu, Haohuan; Gan, Lin; Yang, Chao; Xue, Wei; Wang, Lanning; Wang, Xinliang; Huang, Xiaomeng; Yang, Guangwen
2017-01-01
The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, inclusion of more component models, more complicated physics schemes, and larger ensembles. As the recent improvements in computing power mostly come from the increasing number of nodes in a system and the integration of heterogeneous accelerators, how to scale the computing problems onto more nodes and various kinds of accelerators has become a challenge for the model development. This paper describes our efforts on developing a highly scalable framework for performing global atmospheric modeling on heterogeneous supercomputers equipped with various accelerators, such as GPU (Graphic Processing Unit), MIC (Many Integrated Core), and FPGA (Field Programmable Gate Arrays) cards. We propose a generalized partition scheme of the problem domain, so as to keep a balanced utilization of both CPU resources and accelerator resources. With optimizations on both computing and memory access patterns, we manage to achieve around 8 to 20 times speedup when comparing one hybrid GPU or MIC node with one CPU node with 12 cores. Using a customized FPGA-based data-flow engines, we see the potential to gain another 5 to 8 times improvement on performance. On heterogeneous supercomputers, such as Tianhe-1A and Tianhe-2, our framework is capable of achieving ideally linear scaling efficiency, and sustained double-precision performances of 581 Tflops on Tianhe-1A (using 3750 nodes) and 3.74 Pflops on Tianhe-2 (using 8644 nodes). Our study also provides an evaluation on the programming paradigm of various accelerator architectures (GPU, MIC, FPGA) for performing global atmospheric simulation, to form a picture about both the potential performance benefits and the programming efforts involved. PMID:28282428
Real-time implementation of camera positioning algorithm based on FPGA & SOPC
NASA Astrophysics Data System (ADS)
Yang, Mingcao; Qiu, Yuehong
2014-09-01
In recent years, with the development of positioning algorithm and FPGA, to achieve the camera positioning based on real-time implementation, rapidity, accuracy of FPGA has become a possibility by way of in-depth study of embedded hardware and dual camera positioning system, this thesis set up an infrared optical positioning system based on FPGA and SOPC system, which enables real-time positioning to mark points in space. Thesis completion include: (1) uses a CMOS sensor to extract the pixel of three objects with total feet, implemented through FPGA hardware driver, visible-light LED, used here as the target point of the instrument. (2) prior to extraction of the feature point coordinates, the image needs to be filtered to avoid affecting the physical properties of the system to bring the platform, where the median filtering. (3) Coordinate signs point to FPGA hardware circuit extraction, a new iterative threshold selection method for segmentation of images. Binary image is then segmented image tags, which calculates the coordinates of the feature points of the needle through the center of gravity method. (4) direct linear transformation (DLT) and extreme constraints method is applied to three-dimensional reconstruction of the plane array CMOS system space coordinates. using SOPC system on a chip here, taking advantage of dual-core computing systems, which let match and coordinate operations separately, thus increase processing speed.
FPGA-Based Laboratory Assignments for NoC-Based Manycore Systems
ERIC Educational Resources Information Center
Ttofis, C.; Theocharides, T.; Michael, M. K.
2012-01-01
Manycore systems have emerged as being one of the dominant architectural trends in next-generation computer systems. These highly parallel systems are expected to be interconnected via packet-based networks-on-chip (NoC). The complexity of such systems poses novel and exciting challenges in academia, as teaching their design requires the students…
Design and implementation of a programming circuit in radiation-hardened FPGA
NASA Astrophysics Data System (ADS)
Lihua, Wu; Xiaowei, Han; Yan, Zhao; Zhongli, Liu; Fang, Yu; Chen, Stanley L.
2011-08-01
We present a novel programming circuit used in our radiation-hardened field programmable gate array (FPGA) chip. This circuit provides the ability to write user-defined configuration data into an FPGA and then read it back. The proposed circuit adopts the direct-access programming point scheme instead of the typical long token shift register chain. It not only saves area but also provides more flexible configuration operations. By configuring the proposed partial configuration control register, our smallest configuration section can be conveniently configured as a single data and a flexible partial configuration can be easily implemented. The hierarchical simulation scheme, optimization of the critical path and the elaborate layout plan make this circuit work well. Also, the radiation hardened by design programming point is introduced. This circuit has been implemented in a static random access memory (SRAM)-based FPGA fabricated by a 0.5 μm partial-depletion silicon-on-insulator CMOS process. The function test results of the fabricated chip indicate that this programming circuit successfully realizes the desired functions in the configuration and read-back. Moreover, the radiation test results indicate that the programming circuit has total dose tolerance of 1 × 105 rad(Si), dose rate survivability of 1.5 × 1011 rad(Si)/s and neutron fluence immunity of 1 × 1014 n/cm2.
Driver face tracking using semantics-based feature of eyes on single FPGA
NASA Astrophysics Data System (ADS)
Yu, Ying-Hao; Chen, Ji-An; Ting, Yi-Siang; Kwok, Ngaiming
2017-06-01
Tracking driver's face is one of the essentialities for driving safety control. This kind of system is usually designed with complicated algorithms to recognize driver's face by means of powerful computers. The design problem is not only about detecting rate but also from parts damages under rigorous environments by vibration, heat, and humidity. A feasible strategy to counteract these damages is to integrate entire system into a single chip in order to achieve minimum installation dimension, weight, power consumption, and exposure to air. Meanwhile, an extraordinary methodology is also indispensable to overcome the dilemma of low-computing capability and real-time performance on a low-end chip. In this paper, a novel driver face tracking system is proposed by employing semantics-based vague image representation (SVIR) for minimum hardware resource usages on a FPGA, and the real-time performance is also guaranteed at the same time. Our experimental results have indicated that the proposed face tracking system is viable and promising for the smart car design in the future.
NASA Astrophysics Data System (ADS)
Mandal, Swagata; Saini, Jogender; Zabołotny, Wojciech M.; Sau, Suman; Chakrabarti, Amlan; Chattopadhyay, Subhasis
2017-03-01
Due to the dramatic increase of data volume in modern high energy physics (HEP) experiments, a robust high-speed data acquisition (DAQ) system is very much needed to gather the data generated during different nuclear interactions. As the DAQ works under harsh radiation environment, there is a fair chance of data corruption due to various energetic particles like alpha, beta, or neutron. Hence, a major challenge in the development of DAQ in the HEP experiment is to establish an error resilient communication system between front-end sensors or detectors and back-end data processing computing nodes. Here, we have implemented the DAQ using field-programmable gate array (FPGA) due to some of its inherent advantages over the application-specific integrated circuit. A novel orthogonal concatenated code and cyclic redundancy check (CRC) have been used to mitigate the effects of data corruption in the user data. Scrubbing with a 32-b CRC has been used against error in the configuration memory of FPGA. Data from front-end sensors will reach to the back-end processing nodes through multiple stages that may add an uncertain amount of delay to the different data packets. We have also proposed a novel memory management algorithm that helps to process the data at the back-end computing nodes removing the added path delays. To the best of our knowledge, the proposed FPGA-based DAQ utilizing optical link with channel coding and efficient memory management modules can be considered as first of its kind. Performance estimation of the implemented DAQ system is done based on resource utilization, bit error rate, efficiency, and robustness to radiation.
Using partial reconfiguration for SoC design and implementation
NASA Astrophysics Data System (ADS)
Krasteva, Yana E.; Portilla, Jorge; Tobajas Guerrero, Félix; de la Torre, Eduardo
2009-05-01
Most reconfigurable systems rely on FPGA technology. Among these ones, those which permit dynamic and partial reconfiguration, offer added benefits in flexibility, in-field device upgrade, improved design and manufacturing time, and even, in some cases, power consumption reductions. However, dynamic reconfiguration is a complex task, and the real benefits of its use in real applications have been often questioned. This paper presents an overview of the partial reconfiguration technique application, along with four original applications. The main goal of these applications is to test several architectures with different flexibility and, to search for the partial reconfiguration "killing application", that is, the application that better demonstrates the benefits of today reconfigurable systems based on commercial FPGAs. Therefore, the presented applications are rather a proof of concept, than fully operative and closed systems. First, a brief introduction to the partial reconfigurable systems application topic has been included. After that, the descriptions of the created reconfigurable systems are presented: first, an on-chip communications emulation framework, second, an on chip debugging system, third, a wireless sensor network reconfigurable node and finally, a remote reconfigurable client-server device. Each application is described in a separate section of the paper along with some test and results. General conclusions are included at the end of the paper.
Readout and trigger for the AFP detector at ATLAS experiment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kocian, M.
AFP, the ATLAS Forward Proton consists of silicon detectors at 205 m and 217 m on each side of ATLAS. In 2016 two detectors in one side were installed. The FEI4 chips are read at 160 Mbps over the optical fibers. The DAQ system uses a FPGA board with Artix chip and a mezzanine card with RCE data processing module based on a Zynq chip with ARM processor running ArchLinux. Finally, in this paper we give an overview of the AFP detector with the commissioning steps taken to integrate with the ATLAS TDAQ. Furthermore first performance results are presented.
Readout and trigger for the AFP detector at ATLAS experiment
Kocian, M.
2017-01-25
AFP, the ATLAS Forward Proton consists of silicon detectors at 205 m and 217 m on each side of ATLAS. In 2016 two detectors in one side were installed. The FEI4 chips are read at 160 Mbps over the optical fibers. The DAQ system uses a FPGA board with Artix chip and a mezzanine card with RCE data processing module based on a Zynq chip with ARM processor running ArchLinux. Finally, in this paper we give an overview of the AFP detector with the commissioning steps taken to integrate with the ATLAS TDAQ. Furthermore first performance results are presented.
Dynamically Reconfigurable Systolic Array Accelerator
NASA Technical Reports Server (NTRS)
Dasu, Aravind; Barnes, Robert
2012-01-01
A polymorphic systolic array framework has been developed that works in conjunction with an embedded microprocessor on a field-programmable gate array (FPGA), which allows for dynamic and complimentary scaling of acceleration levels of two algorithms active concurrently on the FPGA. Use is made of systolic arrays and a hardware-software co-design to obtain an efficient multi-application acceleration system. The flexible and simple framework allows hosting of a broader range of algorithms, and is extendable to more complex applications in the area of aerospace embedded systems. FPGA chips can be responsive to realtime demands for changing applications needs, but only if the electronic fabric can respond fast enough. This systolic array framework allows for rapid partial and dynamic reconfiguration of the chip in response to the real-time needs of scalability, and adaptability of executables.
Neuromorphic VLSI vision system for real-time texture segregation.
Shimonomura, Kazuhiro; Yagi, Tetsuya
2008-10-01
The visual system of the brain can perceive an external scene in real-time with extremely low power dissipation, although the response speed of an individual neuron is considerably lower than that of semiconductor devices. The neurons in the visual pathway generate their receptive fields using a parallel and hierarchical architecture. This architecture of the visual cortex is interesting and important for designing a novel perception system from an engineering perspective. The aim of this study is to develop a vision system hardware, which is designed inspired by a hierarchical visual processing in V1, for real time texture segregation. The system consists of a silicon retina, orientation chip, and field programmable gate array (FPGA) circuit. The silicon retina emulates the neural circuits of the vertebrate retina and exhibits a Laplacian-Gaussian-like receptive field. The orientation chip selectively aggregates multiple pixels of the silicon retina in order to produce Gabor-like receptive fields that are tuned to various orientations by mimicking the feed-forward model proposed by Hubel and Wiesel. The FPGA circuit receives the output of the orientation chip and computes the responses of the complex cells. Using this system, the neural images of simple cells were computed in real-time for various orientations and spatial frequencies. Using the orientation-selective outputs obtained from the multi-chip system, a real-time texture segregation was conducted based on a computational model inspired by psychophysics and neurophysiology. The texture image was filtered by the two orthogonally oriented receptive fields of the multi-chip system and the filtered images were combined to segregate the area of different texture orientation with the aid of FPGA. The present system is also useful for the investigation of the functions of the higher-order cells that can be obtained by combining the simple and complex cells.
FPGA accelerator for protein secondary structure prediction based on the GOR algorithm
2011-01-01
Background Protein is an important molecule that performs a wide range of functions in biological systems. Recently, the protein folding attracts much more attention since the function of protein can be generally derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, the execution time is still intolerable with the steep growth in protein database. Recently, FPGA chips have emerged as one promising application accelerator to accelerate bioinformatics algorithms by exploiting fine-grained custom design. Results In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. However, the power consumption is only about 30% of that of current general-propose CPUs. PMID:21342582
Multi-DSP and FPGA based Multi-channel Direct IF/RF Digital receiver for atmospheric radar
NASA Astrophysics Data System (ADS)
Yasodha, Polisetti; Jayaraman, Achuthan; Kamaraj, Pandian; Durga rao, Meka; Thriveni, A.
2016-07-01
Modern phased array radars depend highly on digital signal processing (DSP) to extract the echo signal information and to accomplish reliability along with programmability and flexibility. The advent of ASIC technology has made various digital signal processing steps to be realized in one DSP chip, which can be programmed as per the application and can handle high data rates, to be used in the radar receiver to process the received signal. Further, recent days field programmable gate array (FPGA) chips, which can be re-programmed, also present an opportunity to utilize them to process the radar signal. A multi-channel direct IF/RF digital receiver (MCDRx) is developed at NARL, taking the advantage of high speed ADCs and high performance DSP chips/FPGAs, to be used for atmospheric radars working in HF/VHF bands. Multiple channels facilitate the radar t be operated in multi-receiver modes and also to obtain the wind vector with improved time resolution, without switching the antenna beam. MCDRx has six channels, implemented on a custom built digital board, which is realized using six numbers of ADCs for simultaneous processing of the six input signals, Xilinx vertex5 FPGA and Spartan6 FPGA, and two ADSPTS201 DSP chips, each of which performs one phase of processing. MCDRx unit interfaces with the data storage/display computer via two gigabit ethernet (GbE) links. One of the six channels is used for Doppler beam swinging (DBS) mode and the other five channels are used for multi-receiver mode operations, dedicatedly. Each channel has (i) ADC block, to digitize RF/IF signal, (ii) DDC block for digital down conversion of the digitized signal, (iii) decoding block to decode the phase coded signal, and (iv) coherent integration block for integrating the data preserving phase intact. ADC block consists of Analog devices make AD9467 16-bit ADCs, to digitize the input signal at 80 MSPS. The output of ADC is centered around (80 MHz - input frequency). The digitized data is fed to DDC block, which down converts the data to base-band. The DDC block has NCO, mixer and two chains of Bessel filters (fifth order cascaded integration comb filter, two FIR filters, two half band filters and programmable FIR filters) for in-phase (I) and Quadrature phase (Q) channels. The NCO has 32 bits and is set to match the output frequency of ADC. Further, DDC down samples (decimation) the data and reduces the data rate to 16 MSPS. This data is further decimated and the data rate is reduced down to 4/2/1/0.5/0.25/0.125/0.0625 MSPS for baud lengths 0.25/0.5/1/2/4/8/16 μs respectively. The down sampled data is then fed to decoding block, which performs cross correlation to achieve pulse compression of the binary-phase coded data to obtain better range resolution with maximum possible height coverage. This step improves the signal power by a factor equal to the length of the code. Coherent integration block integrates the decoded data coherently for successive pulses, which improves the signal to noise ratio and reduces the data volume. DDC, decoding and coherent integration blocks are implemented in Xilinx vertex5 FPGA. Till this point, function of all six channels is same for DBS mode and multi-receiver modes. Data from vertex5 FPGA is transferred to PC via GbE-1 interface for multi-modes or to two Analog devices make ADSP-TS201 DSP chips (A and B), via link port for DBS mode. ADSP-TS201 chips perform the normalization, DC removal, windowing, FFT computation and spectral averaging on the data, which is transferred to storage/display PC via GbE-2 interface for real-time data display and data storing. Physical layer of GbE interface is implemented in an external chip (Marvel 88E1111) and MAC layer is implemented internal to vertex5 FPGA. The MCDRx has total 4 GB of DDR2 memory for data storage. Spartan6 FPGA is used for generating timing signals, required for basic operation of the radar and testing of the MCDRx.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-01-01
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-12-15
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
Implementation of a wireless ECG acquisition SoC for IEEE 802.15.4 (ZigBee) applications.
Wang, Liang-Hung; Chen, Tsung-Yen; Lin, Kuang-Hao; Fang, Qiang; Lee, Shuenn-Yuh
2015-01-01
This paper presents a wireless biosignal acquisition system-on-a-chip (WBSA-SoC) specialized for electrocardiogram (ECG) monitoring. The proposed system consists of three subsystems, namely, 1) the ECG acquisition node, 2) the protocol for standard IEEE 802.15.4 ZigBee system, and 3) the RF transmitter circuits. The ZigBee protocol is adopted for wireless communication to achieve high integration, applicability, and portability. A fully integrated CMOS RF front end containing a quadrature voltage-controlled oscillator and a 2.4-GHz low-IF (i.e., zero-IF) transmitter is employed to transmit ECG signals through wireless communication. The low-power WBSA-SoC is implemented by the TSMC 0.18-μm standard CMOS process. An ARM-based displayer with FPGA demodulation and an RF receiver with analog-to-digital mixed-mode circuits are constructed as verification platform to demonstrate the wireless ECG acquisition system. Measurement results on the human body show that the proposed SoC can effectively acquire ECG signals.
Multichannel FPGA based MVT system for high precision time (20 ps RMS) and charge measurement
NASA Astrophysics Data System (ADS)
Pałka, M.; Strzempek, P.; Korcyl, G.; Bednarski, T.; Niedźwiecki, Sz.; Białas, P.; Czerwiński, E.; Dulski, K.; Gajos, A.; Głowacz, B.; Gorgol, M.; Jasińska, B.; Kamińska, D.; Kajetanowicz, M.; Kowalski, P.; Kozik, T.; Krzemień, W.; Kubicz, E.; Mohhamed, M.; Raczyński, L.; Rudy, Z.; Rundel, O.; Salabura, P.; Sharma, N. G.; Silarski, M.; Smyrski, J.; Strzelecki, A.; Wieczorek, A.; Wiślicki, W.; Zieliński, M.; Zgardzińska, B.; Moskal, P.
2017-08-01
In this article it is presented an FPGA based Multi-Voltage Threshold (MVT) system which allows of sampling fast signals (1-2 ns rising and falling edge) in both voltage and time domain. It is possible to achieve a precision of time measurement of 20 ps RMS and reconstruct charge of signals, using a simple approach, with deviation from real value smaller than 10%. Utilization of the differential inputs of an FPGA chip as comparators together with an implementation of a TDC inside an FPGA allowed us to achieve a compact multi-channel system characterized by low power consumption and low production costs. This paper describes realization and functioning of the system comprising 192-channel TDC board and a four mezzanine cards which split incoming signals and discriminate them. The boards have been used to validate a newly developed Time-of-Flight Positron Emission Tomography system based on plastic scintillators. The achieved full system time resolution of σ(TOF) ≈ 68 ps is by factor of two better with respect to the current TOF-PET systems.
Research based on the SoPC platform of feature-based image registration
NASA Astrophysics Data System (ADS)
Shi, Yue-dong; Wang, Zhi-hui
2015-12-01
This paper focuses on the study of implementing feature-based image registration by System on a Programmable Chip (SoPC) hardware platform. We solidify the image registration algorithm on the FPGA chip, in which embedded soft core processor Nios II can speed up the image processing system. In this way, we can make image registration technology get rid of the PC. And, consequently, this kind of technology will be got an extensive use. The experiment result indicates that our system shows stable performance, particularly in terms of matching processing which noise immunity is good. And feature points of images show a reasonable distribution.
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2013-01-01
Typically, commercial sensor nodes are equipped with MCUsclocked at a low-frequency (i.e., within the 4–12 MHz range). Consequently, executing cryptographic algorithms in those MCUs generally requires a huge amount of time. In this respect, the required energy consumption can be higher than using a separate accelerator based on a Field-programmable Gate Array (FPGA) that is switched on when needed. In this manuscript, we present the design of a cryptographic accelerator suitable for an FPGA-based sensor node and compliant with the IEEE802.15.4 standard. All the embedded resources of the target platform (Xilinx Artix-7) have been maximized in order to provide a cost-effective solution. Moreover, we have added key negotiation capabilities to the IEEE 802.15.4 security suite based on Elliptic Curve Cryptography (ECC;. Our results suggest that tailored accelerators based on FPGA can behave better in terms of energy than contemporary software solutions for motes, such as the TinyECC and NanoECC libraries. In this regard, a point multiplication (PM) can be performed between 8.58- and 15.4-times faster, 3.40- to 23.59-times faster (Elliptic Curve Diffie-Hellman, ECDH) and between 5.45- and 34.26-times faster (Elliptic Curve Integrated Encryption Scheme, ECIES). Moreover, the energy consumption was also improved with a factor of 8.96 (PM). PMID:23899936
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2013-07-29
Typically, commercial sensor nodes are equipped with MCUsclocked at a low-frequency (i.e., within the 4-12 MHz range). Consequently, executing cryptographic algorithms in those MCUs generally requires a huge amount of time. In this respect, the required energy consumption can be higher than using a separate accelerator based on a Field-programmable Gate Array (FPGA) that is switched on when needed. In this manuscript, we present the design of a cryptographic accelerator suitable for an FPGA-based sensor node and compliant with the IEEE802.15.4 standard. All the embedded resources of the target platform (Xilinx Artix-7) have been maximized in order to provide a cost-effective solution. Moreover, we have added key negotiation capabilities to the IEEE 802.15.4 security suite based on Elliptic Curve Cryptography (ECC). Our results suggest that tailored accelerators based on FPGA can behave better in terms of energy than contemporary software solutions for motes, such as the TinyECC and NanoECC libraries. In this regard, a point multiplication (PM) can be performed between 8.58- and 15.4-times faster, 3.40- to 23.59-times faster (Elliptic Curve Diffie-Hellman, ECDH) and between 5.45- and 34.26-times faster (Elliptic Curve Integrated Encryption Scheme, ECIES). Moreover, the energy consumption was also improved with a factor of 8.96 (PM).
The research of data acquisition system for Raman spectrometer
NASA Astrophysics Data System (ADS)
Cui, Xiao; Guo, Pan; Zhang, Yinchao; Chen, Siying; Chen, He; Chen, Wenbo
2011-11-01
Raman spectrometer has been widely used as an identification tool for analyzing material structure and composition in many fields. However, Raman scattering echo signal is very weak, about dozens of photons at most in one laser plus signal. Therefore, it is a great challenge to design a Raman spectrum data acquisition system which could accurately receive the weak echo signal. The system designed in this paper receives optical signals with the principle of photon counter and could detect single photon. The whole system consists of a photoelectric conversion module H7421-40 and a photo counting card including a field programmable gate array (FPGA) chip and a PCI9054 chip. The module H7421-40 including a PMT, an amplifier and a discriminator has high sensitivity on wavelength from 300nm to 720nm. The Center Wavelength is 580nm which is close to the excitation wavelength (532nm), QE 40% at peak wavelength, Count Sensitivity is 7.8*105(S-1PW-1) and Count Linearity is 1.5MHZ. In FPGA chip, the functions are divided into three parts: parameter setting module, controlling module, data collection and storage module. All the commands, parameters and data are transmitted between FPGA and computer by PCI9054 chip through the PCI interface. The result of experiment shows that the Raman spectrum data acquisition system is reasonable and efficient. There are three primary advantages of the data acquisition system: the first one is the high sensitivity with single photon detection capability; the second one is the high integrated level which means all the operation could be done by the photo counting card; and the last one is the high expansion ability because of the smart reconfigurability of FPGA chip.
FPGA implementation of motifs-based neuronal network and synchronization analysis
NASA Astrophysics Data System (ADS)
Deng, Bin; Zhu, Zechen; Yang, Shuangming; Wei, Xile; Wang, Jiang; Yu, Haitao
2016-06-01
Motifs in complex networks play a crucial role in determining the brain functions. In this paper, 13 kinds of motifs are implemented with Field Programmable Gate Array (FPGA) to investigate the relationships between the networks properties and motifs properties. We use discretization method and pipelined architecture to construct various motifs with Hindmarsh-Rose (HR) neuron as the node model. We also build a small-world network based on these motifs and conduct the synchronization analysis of motifs as well as the constructed network. We find that the synchronization properties of motif determine that of motif-based small-world network, which demonstrates effectiveness of our proposed hardware simulation platform. By imitation of some vital nuclei in the brain to generate normal discharges, our proposed FPGA-based artificial neuronal networks have the potential to replace the injured nuclei to complete the brain function in the treatment of Parkinson's disease and epilepsy.
Dynamically programmable cache
NASA Astrophysics Data System (ADS)
Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas
1998-10-01
Reconfigurable machines have recently been used as co- processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machines. As a result DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implemented multi-context switching (Virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations resulting in faster execution time. In this paper, the speedup improvement for DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
FPGA based charge fast histogramming for GEM detector
NASA Astrophysics Data System (ADS)
Poźniak, Krzysztof T.; Byszuk, A.; Chernyshova, M.; Cieszewski, R.; Czarski, T.; Dominik, W.; Jakubowska, K.; Kasprowicz, G.; Rzadkiewicz, J.; Scholz, M.; Zabolotny, W.
2013-10-01
This article presents a fast charge histogramming method for the position sensitive X-ray GEM detector. The energy resolved measurements are carried out simultaneously for 256 channels of the GEM detector. The whole process of histogramming is performed in 21 FPGA chips (Spartan-6 series from Xilinx) . The results of the histogramming process are stored in an external DDR3 memory. The structure of an electronic measuring equipment and a firmware functionality implemented in the FPGAs is described. Examples of test measurements are presented.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
Efficient k-Winner-Take-All Competitive Learning Hardware Architecture for On-Chip Learning
Ou, Chien-Min; Li, Hui-Ya; Hwang, Wen-Jyi
2012-01-01
A novel k-winners-take-all (k-WTA) competitive learning (CL) hardware architecture is presented for on-chip learning in this paper. The architecture is based on an efficient pipeline allowing k-WTA competition processes associated with different training vectors to be performed concurrently. The pipeline architecture employs a novel codeword swapping scheme so that neurons failing the competition for a training vector are immediately available for the competitions for the subsequent training vectors. The architecture is implemented by the field programmable gate array (FPGA). It is used as a hardware accelerator in a system on programmable chip (SOPC) for realtime on-chip learning. Experimental results show that the SOPC has significantly lower training time than that of other k-WTA CL counterparts operating with or without hardware support.
Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam
2013-01-01
Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness. PMID:23867746
Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam
2013-07-17
Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.
The readout chain for the bar PANDA MVD strip detector
NASA Astrophysics Data System (ADS)
Schnell, R.; Brinkmann, K.-Th.; Di Pietro, V.; Kleines, H.; Goerres, A.; Riccardi, A.; Rivetti, A.; Rolo, M. D.; Sohlbach, H.; Zaunick, H.-G.
2015-02-01
The bar PANDA (antiProton ANnihilation at DArmstadt) experiment will study the strong interaction in annihilation reactions between an antiproton beam and a stationary gas jet target. The detector will comprise different sub-detectors for tracking, particle identification and calorimetry. The Micro-Vertex Detector (MVD) as the innermost part of the tracking system will allow precise tracking and detection of secondary vertices. For the readout of the double-sided silicon strip sensors a custom-made ASIC is being developed, employing the Time-over-Threshold (ToT) technique for digitization and utilize time-to-digital converters (TDC) to provide a high-precision time stamp of the hit. A custom-made Module Data Concentrator ASIC (MDC) will multiplex the data of all front-ends of one sensor towards the CERN-developed GBT chip set (GigaBit Transceiver). The MicroTCA-based MVD Multiplexer Board (MMB) at the off-detector site will receive and concentrate the data from the GBT links and transfer it to FPGA-based compute nodes for global event building.
Performance evaluation of multiple (32 channels) sub-nanosecond TDC implemented in low-cost FPGA
NASA Astrophysics Data System (ADS)
Lichard, P.; Konstantinou, G.; Villar Vilanueva, A.; Palladino, V.
2014-03-01
NA62 experiment Straw tracker frontend board serves as a gas-tight detector cover and integrates two CARIOCA chips, a low cost FPGA (Cyclon III, Altera) and a set of 400Mbit/s links to the backend. The FPGA houses 16 pairs of sub-nanosecond resolution TDCs with derandomizers and an output link serializer. Evaluation methods, including simulations, and performance results of the system in the lab and on a detector prototype are presented.
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing
Bravo, Ignacio; Baliñas, Javier; Gardel, Alfredo; Lázaro, José L.; Espinosa, Felipe; García, Jorge
2011-01-01
This article describes an image processing system based on an intelligent ad-hoc camera, whose two principle elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families makes it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely. PMID:22163739
Language Classification using N-grams Accelerated by FPGA-based Bloom Filters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacob, A; Gokhale, M
N-Gram (n-character sequences in text documents) counting is a well-established technique used in classifying the language of text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom Filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (HAIL algorithm) that uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x comparable software and 1.45x the competing hardware design.
A novel pipeline based FPGA implementation of a genetic algorithm
NASA Astrophysics Data System (ADS)
Thirer, Nonel
2014-05-01
To solve problems when an analytical solution is not available, more and more bio-inspired computation techniques have been applied in the last years. Thus, an efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong has higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). GA performs several processes with the population individuals to produce a new population, like in the biological evolution. To provide a high speed solution, pipelined based FPGA hardware implementations are used, with a nstages pipeline for a n-phases genetic algorithm. The FPGA pipeline implementations are constraints by the different execution time of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique to modify the crossover step by using non identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children) not only one as in classical GA. We analyze the contribution of this method to reduce the execution time in the asynchronous and synchronous pipelines and also the possibility to a cheaper FPGA implementation, by using smaller populations. The full hardware architecture for a FPGA implementation to our target ALTERA development card is presented and analyzed.
An FPGA-Based People Detection System
NASA Astrophysics Data System (ADS)
Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.
2005-12-01
This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about[InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at[InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.
Non-preconditioned conjugate gradient on cell and FPCA-based hybrid supercomputer nodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubois, David H; Dubois, Andrew J; Boorman, Thomas M
2009-03-10
This work presents a detailed implementation of a double precision, Non-Preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{trademark} in conjunction with x86 Opteron{trademark} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
Design of a system based on DSP and FPGA for video recording and replaying
NASA Astrophysics Data System (ADS)
Kang, Yan; Wang, Heng
2013-08-01
This paper brings forward a video recording and replaying system with the architecture of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA). The system achieved encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals which are displayed on a monitor during airplanes and ships' navigating. In the architecture, the DSP is a main processor which is used for a large amount of complicated calculation during digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is utilized to implement seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) in the system. This transfer mode can avoid the bottle-neck of the data transfer and simplify the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level matching chips are used to implement Advanced Technology Attachment (ATA) protocol on physical layer of the interface of an Integrated Drive Electronics (IDE) Hard Disk (HD), which has a high speed in data access and does not rely on a computer. Main functions of the logic on the FPGA are described and the screenshots of the behavioral simulation are provided in this paper. In the design of program on the DSP, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM to exert the CPU's high performance on computing without intervention by the CPU and save its time spending. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways and means of acquiring high performance for code are briefly present. The ability of data processing of the system is desirable. And smoothness of the replayed video is acceptable. By right of its design flexibility and reliable operation, the system based on DSP and FPGA for video recording and replaying has a considerable perspective in analysis after the event, simulated exercitation and so forth.
A Secure Content Delivery System Based on a Partially Reconfigurable FPGA
NASA Astrophysics Data System (ADS)
Hori, Yohei; Yokoyama, Hiroyuki; Sakane, Hirofumi; Toda, Kenji
We developed a content delivery system using a partially reconfigurable FPGA to securely distribute digital content on the Internet. With partial reconfigurability of a Xilinx Virtex-II Pro FPGA, the system provides an innovative single-chip solution for protecting digital content. In the system, a partial circuit must be downloaded from a server to the client terminal to play content. Content will be played only when the downloaded circuit is correctly combined (=interlocked) with the circuit built in the terminal. Since each circuit has a unique I/O configuration, the downloaded circuit interlocks with the corresponding built-in circuit designed for a particular terminal. Thus, the interface of the circuit itself provides a novel authentication mechanism. This paper describes the detailed architecture of the system and clarify the feasibility and effectiveness of the system. In addition, we discuss a fail-safe mechanism and future work necessary for the practical application of the system.
Heavy-Ion Microbeam Fault Injection into SRAM-Based FPGA Implementations of Cryptographic Circuits
NASA Astrophysics Data System (ADS)
Li, Huiyun; Du, Guanghua; Shao, Cuiping; Dai, Liang; Xu, Guoqing; Guo, Jinlong
2015-06-01
Transistors hit by heavy ions may conduct transiently, thereby introducing transient logic errors. Attackers can exploit these abnormal behaviors and extract sensitive information from the electronic devices. This paper demonstrates an ion irradiation fault injection attack experiment into a cryptographic field-programmable gate-array (FPGA) circuit. The experiment proved that the commercial FPGA chip is vulnerable to low-linear energy transfer carbon irradiation, and the attack can cause the leakage of secret key bits. A statistical model is established to estimate the possibility of an effective fault injection attack on cryptographic integrated circuits. The model incorporates the effects from temporal, spatial, and logical probability of an effective attack on the cryptographic circuits. The rate of successful attack calculated from the model conforms well to the experimental results. This quantitative success rate model can help evaluate security risk for designers as well as for the third-party assessment organizations.
FPGA-Based, Self-Checking, Fault-Tolerant Computers
NASA Technical Reports Server (NTRS)
Some, Raphael; Rennels, David
2004-01-01
A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and highspeed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant- computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing identical programs in lock step, with comparison of their outputs to detect errors. It would also contain various cache local memory circuits, communication circuits, and configurable special-purpose processors that would use self-checking checkers. (The basic principle of the self-checking checker method is to utilize logic circuitry that generates error signals whenever there is an error in either the checker or the circuit being checked.) The memory system would comprise a main memory and a hardware-controlled check-pointing system (CPS) based on a buffer memory denoted the recovery cache. The main memory would contain random-access memory (RAM) chips and FPGAs that would, in addition to everything else, implement double-error-detecting and single-error-correcting memory functions to enable recovery from single-bit errors.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
A phase-based stereo vision system-on-a-chip.
Díaz, Javier; Ros, Eduardo; Sabatini, Silvio P; Solari, Fabio; Mota, Sonia
2007-02-01
A simple and fast technique for depth estimation based on phase measurement has been adopted for the implementation of a real-time stereo system with sub-pixel resolution on an FPGA device. The technique avoids the attendant problem of phase warping. The designed system takes full advantage of the inherent processing parallelism and segmentation capabilities of FPGA devices to achieve a computation speed of 65megapixels/s, which can be arranged with a customized frame-grabber module to process 211frames/s at a size of 640x480 pixels. The processing speed achieved is higher than conventional camera frame rates, thus allowing the system to extract multiple estimations and be used as a platform to evaluate integration schemes of a population of neurons without increasing hardware resource demands.
NASA Astrophysics Data System (ADS)
Hu, Kun; Lu, Houbing; Wang, Xu; Li, Feng; Wang, Xinxin; Geng, Tianru; Yang, Hang; Liu, Shengquan; Han, Liang; Jin, Ge
2017-06-01
A front-end electronics prototype for the ATLAS small-strip Thin Gap Chamber (sTGC) based on gigabit Ethernet has been developed. The prototype is designed to read out signals of pads, wires, and strips of the sTGC detector. The prototype includes two VMM2 chips developed to read out the signals of the sTGC, a Xilinx Kintex-7 field-programmable gate array (FPGA) used for the VMM2 configuration and the events storage, and a gigabit Ethernet transceiver PHY chip for interfacing with a computer. The VMM2 chip is designed for the readout of the Micromegas detector and sTGC detector, which is composed of 64 linear front-end channels. Each channel integrates a charge-sensitive amplifier, a shaper, several analog-to-digital converters, and other digital functions. For a bunch-crossing interval of 25 ns, events are continuously read out by the FPGA and forwarded to the computer. The interface between the computer and the prototype has been measured to reach an error-free rate of 900 Mb/s, therefore making a very effective use of the available bandwidth. Additionally, the computer can control several prototypes of this kind simultaneously via the Ethernet interface. At present, the prototype will be used for the sTGC performance test. The features of the prototype are described in detail.
Using SRAM Based FPGAs for Power-Aware High Performance Wireless Sensor Networks
Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa
2012-01-01
While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today’s applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved by either more powerful microcontrollers, though more power consumption or, in general, any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA solutions, might appear as a candidate technology, since though power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance high capacity state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, as well as a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to proof that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements. PMID:22736971
Using SRAM based FPGAs for power-aware high performance wireless sensor networks.
Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa
2012-01-01
While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today's applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved by either more powerful microcontrollers, though more power consumption or, in general, any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA solutions, might appear as a candidate technology, since though power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance high capacity state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, as well as a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to proof that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements.
All-Digital Time-Domain CMOS Smart Temperature Sensor with On-Chip Linearity Enhancement.
Chen, Chun-Chi; Chen, Chao-Lieh; Lin, Yi
2016-01-30
This paper proposes the first all-digital on-chip linearity enhancement technique for improving the accuracy of the time-domain complementary metal-oxide semiconductor (CMOS) smart temperature sensor. To facilitate on-chip application and intellectual property reuse, an all-digital time-domain smart temperature sensor was implemented using 90 nm Field Programmable Gate Arrays (FPGAs). Although the inverter-based temperature sensor has a smaller circuit area and lower complexity, two-point calibration must be used to achieve an acceptable inaccuracy. With the help of a calibration circuit, the influence of process variations was reduced greatly for one-point calibration support, reducing the test costs and time. However, the sensor response still exhibited a large curvature, which substantially affected the accuracy of the sensor. Thus, an on-chip linearity-enhanced circuit is proposed to linearize the curve and achieve a new linearity-enhanced output. The sensor was implemented on eight different Xilinx FPGA using 118 slices per sensor in each FPGA to demonstrate the benefits of the linearization. Compared with the unlinearized version, the maximal inaccuracy of the linearized version decreased from 5 °C to 2.5 °C after one-point calibration in a range of -20 °C to 100 °C. The sensor consumed 95 μW using 1 kSa/s. The proposed linearity enhancement technique significantly improves temperature sensing accuracy, avoiding costly curvature compensation while it is fully synthesizable for future Very Large Scale Integration (VLSI) system.
All-Digital Time-Domain CMOS Smart Temperature Sensor with On-Chip Linearity Enhancement
Chen, Chun-Chi; Chen, Chao-Lieh; Lin, Yi
2016-01-01
This paper proposes the first all-digital on-chip linearity enhancement technique for improving the accuracy of the time-domain complementary metal-oxide semiconductor (CMOS) smart temperature sensor. To facilitate on-chip application and intellectual property reuse, an all-digital time-domain smart temperature sensor was implemented using 90 nm Field Programmable Gate Arrays (FPGAs). Although the inverter-based temperature sensor has a smaller circuit area and lower complexity, two-point calibration must be used to achieve an acceptable inaccuracy. With the help of a calibration circuit, the influence of process variations was reduced greatly for one-point calibration support, reducing the test costs and time. However, the sensor response still exhibited a large curvature, which substantially affected the accuracy of the sensor. Thus, an on-chip linearity-enhanced circuit is proposed to linearize the curve and achieve a new linearity-enhanced output. The sensor was implemented on eight different Xilinx FPGA using 118 slices per sensor in each FPGA to demonstrate the benefits of the linearization. Compared with the unlinearized version, the maximal inaccuracy of the linearized version decreased from 5 °C to 2.5 °C after one-point calibration in a range of −20 °C to 100 °C. The sensor consumed 95 μW using 1 kSa/s. The proposed linearity enhancement technique significantly improves temperature sensing accuracy, avoiding costly curvature compensation while it is fully synthesizable for future Very Large Scale Integration (VLSI) system. PMID:26840316
A fast one-chip event-preprocessor and sequencer for the Simbol-X Low Energy Detector
NASA Astrophysics Data System (ADS)
Schanz, T.; Tenzer, C.; Maier, D.; Kendziorra, E.; Santangelo, A.
2010-12-01
We present an FPGA-based digital camera electronics consisting of an Event-Preprocessor (EPP) for on-board data preprocessing and a related Sequencer (SEQ) to generate the necessary signals to control the readout of the detector. The device has been originally designed for the Simbol-X low energy detector (LED). The EPP operates on 64×64 pixel images and has a real-time processing capability of more than 8000 frames per second. The already working releases of the EPP and the SEQ are now combined into one Digital-Camera-Controller-Chip (D3C).
NASA Astrophysics Data System (ADS)
Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza
2013-01-01
Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.
Fast Fourier Transform Co-Processor (FFTC)- Towards Embedded GFLOPs
NASA Astrophysics Data System (ADS)
Kuehl, Christopher; Liebstueckel, Uwe; Tejerina, Isaac; Uemminghaus, Michael; Wite, Felix; Kolb, Michael; Suess, Martin; Weigand, Roland
2012-08-01
Many signal processing applications and algorithms perform their operations on the data in the transform domain to gain efficiency. The Fourier Transform Co- Processor has been developed with the aim to offload General Purpose Processors from performing these transformations and therefore to boast the overall performance of a processing module. The IP of the commercial PowerFFT processor has been selected and adapted to meet the constraints of the space environment.In frame of the ESA activity “Fast Fourier Transform DSP Co-processor (FFTC)” (ESTEC/Contract No. 15314/07/NL/LvH/ma) the objectives were the following:Production of prototypes of a space qualified version of the commercial PowerFFT chip called FFTC based on the PowerFFT IP.The development of a stand-alone FFTC Accelerator Board (FTAB) based on the FFTC including the Controller FPGA and SpaceWire Interfaces to verify the FFTC function and performance.The FFTC chip performs its calculations with floating point precision. Stand alone it is capable computing FFTs of up to 1K complex samples in length in only 10μsec. This corresponds to an equivalent processing performance of 4.7 GFlops. In this mode the maximum sustained data throughput reaches 6.4Gbit/s. When connected to up to 4 EDAC protected SDRAM memory banks the FFTC can perform long FFTs with up to 1M complex samples in length or multidimensional FFT- based processing tasks.A Controller FPGA on the FTAB takes care of the SDRAM addressing. The instructions commanded via the Controller FPGA are used to set up the data flow and generate the memory addresses.The presentation will give and overview on the project, including the results of the validation of the FFTC ASIC prototypes.
Fast Fourier Transform Co-processor (FFTC), towards embedded GFLOPs
NASA Astrophysics Data System (ADS)
Kuehl, Christopher; Liebstueckel, Uwe; Tejerina, Isaac; Uemminghaus, Michael; Witte, Felix; Kolb, Michael; Suess, Martin; Weigand, Roland; Kopp, Nicholas
2012-10-01
Many signal processing applications and algorithms perform their operations on the data in the transform domain to gain efficiency. The Fourier Transform Co-Processor has been developed with the aim to offload General Purpose Processors from performing these transformations and therefore to boast the overall performance of a processing module. The IP of the commercial PowerFFT processor has been selected and adapted to meet the constraints of the space environment. In frame of the ESA activity "Fast Fourier Transform DSP Co-processor (FFTC)" (ESTEC/Contract No. 15314/07/NL/LvH/ma) the objectives were the following: • Production of prototypes of a space qualified version of the commercial PowerFFT chip called FFTC based on the PowerFFT IP. • The development of a stand-alone FFTC Accelerator Board (FTAB) based on the FFTC including the Controller FPGA and SpaceWire Interfaces to verify the FFTC function and performance. The FFTC chip performs its calculations with floating point precision. Stand alone it is capable computing FFTs of up to 1K complex samples in length in only 10μsec. This corresponds to an equivalent processing performance of 4.7 GFlops. In this mode the maximum sustained data throughput reaches 6.4Gbit/s. When connected to up to 4 EDAC protected SDRAM memory banks the FFTC can perform long FFTs with up to 1M complex samples in length or multidimensional FFT-based processing tasks. A Controller FPGA on the FTAB takes care of the SDRAM addressing. The instructions commanded via the Controller FPGA are used to set up the data flow and generate the memory addresses. The paper will give an overview on the project, including the results of the validation of the FFTC ASIC prototypes.
The integration of FPGA TDC inside White Rabbit node
NASA Astrophysics Data System (ADS)
Li, H.; Xue, T.; Gong, G.; Li, J.
2017-04-01
White Rabbit technology is capable of delivering sub-nanosecond accuracy and picosecond precision of synchronization and normal data packets over the fiber network. Carry chain structure in FPGA is a popular way to build TDC and tens of picosecond RMS resolution has been achieved. The integration of WR technology with FPGA TDC can enhance and simplify the TDC in many aspects that includes providing a low jitter clock for TDC, a synchronized absolute UTC/TAI timestamp for coarse counter, a fancy way to calibrate the carry chain DNL and an easy to use Ethernet link for data and control information transmit. This paper presents a FPGA TDC implemented inside a normal White Rabbit node with sub-nanosecond measurement precision. The measured standard deviation reaches 50ps between two distributed TDCs. Possible applications of this distributed TDC are also discussed.
High resolution distributed time-to-digital converter (TDC) in a White Rabbit network
NASA Astrophysics Data System (ADS)
Pan, Weibin; Gong, Guanghua; Du, Qiang; Li, Hongming; Li, Jianmin
2014-02-01
The Large High Altitude Air Shower Observatory (LHAASO) project consists of a complex detector array with over 6000 detector nodes spreading over 1.2 km2 areas. The arrival times of shower particles are captured by time-to-digital converters (TDCs) in the detectors' frontend electronics, the arrival direction of the high energy cosmic ray are then to be reconstructed from the space-time information of all detector nodes. To guarantee the angular resolution of 0.5°, a time synchronization of 500 ps (RMS) accuracy and 100 ps precision must be achieved among all TDC nodes. A technology enhancing Gigabit Ethernet, called the White Rabbit (WR), has shown the capability of delivering sub-nanosecond accuracy and picoseconds precision of synchronization over the standard data packet transfer. In this paper we demonstrate a distributed TDC prototype system combining the FPGA based TDC and the WR technology. With the time synchronization and data transfer services from a compact WR node, separate FPGA-TDC nodes can be combined to provide uniform time measurement information for correlated events. The design detail and test performance will be described in the paper.
A digitalized silicon microgyroscope based on embedded FPGA.
Xia, Dunzhu; Yu, Cheng; Wang, Yuliang
2012-09-27
This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system.
A Digitalized Silicon Microgyroscope Based on Embedded FPGA
Xia, Dunzhu; Yu, Cheng; Wang, Yuliang
2012-01-01
This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system. PMID:23201990
System Design of One-chip Wave Particle Interaction Analyzer for SCOPE mission.
NASA Astrophysics Data System (ADS)
Fukuhara, Hajime; Ueda, Yoshikatsu; Kojima, Hiro; Yamakawa, Hiroshi
In past science spacecrafts such like GEOTAIL, we usually capture electric and magnetic field waveforms and observe energetic eletron and ion particles as velocity distributions by each sensor. We analyze plasma wave-particle interactions by these respective data and the discussions are sometimes restricted by the difference of time resolution and by the data loss in desired regions. One-chip Wave Particle Interaction Analyzer (OWPIA) conducts direct quantitative observations of wave-particle interaction by direct 'E dot v' calculation on-board. This new instruments have a capability to use all plasma waveform data and electron particle informations. In the OWPIA system, we have to calibrate the digital observation data and transform the same coordinate system. All necessary calculations are processed in Field Programmable Gate Array(FPGA). In our study, we introduce a basic concept of the OWPIA system and a optimization method for each calculation functions installed in FPGA. And we also discuss the process speed, the FPGA utilization efficiency, the total power consumption.
Design and realization of the baseband processor in satellite navigation and positioning receiver
NASA Astrophysics Data System (ADS)
Zhang, Dawei; Hu, Xiulin; Li, Chen
2007-11-01
The content of this paper is focused on the Design and realization of the baseband processor in satellite navigation and positioning receiver. Baseband processor is the most important part of the satellite positioning receiver. The design covers baseband processor's main functions include multi-channel digital signal DDC, acquisition, code tracking, carrier tracking, demodulation, etc. The realization is based on an Altera's FPGA device, that makes the system can be improved and upgraded without modifying the hardware. It embodies the theory of software defined radio (SDR), and puts the theory of the spread spectrum into practice. This paper puts emphasis on the realization of baseband processor in FPGA. In the order of choosing chips, design entry, debugging and synthesis, the flow is presented detailedly. Additionally the paper detailed realization of Digital PLL in order to explain a method of reducing the consumption of FPGA. Finally, the paper presents the result of Synthesis. This design has been used in BD-1, BD-2 and GPS.
Design of barrier bucket kicker control system
NASA Astrophysics Data System (ADS)
Ni, Fa-Fu; Wang, Yan-Yu; Yin, Jun; Zhou, De-Tai; Shen, Guo-Dong; Zheng, Yang-De.; Zhang, Jian-Chuan; Yin, Jia; Bai, Xiao; Ma, Xiao-Li
2018-05-01
The Heavy-Ion Research Facility in Lanzhou (HIRFL) contains two synchrotrons: the main cooler storage ring (CSRm) and the experimental cooler storage ring (CSRe). Beams are extracted from CSRm, and injected into CSRe. To apply the Barrier Bucket (BB) method on the CSRe beam accumulation, a new BB technology based kicker control system was designed and implemented. The controller of the system is implemented using an Advanced Reduced Instruction Set Computer (RISC) Machine (ARM) chip and a field-programmable gate array (FPGA) chip. Within the architecture, ARM is responsible for data presetting and floating number arithmetic processing. The FPGA computes the RF phase point of the two rings and offers more accurate control of the time delay. An online preliminary experiment on HIRFL was also designed to verify the functionalities of the control system. The result shows that the reference trigger point of two different sinusoidal RF signals for an arbitrary phase point was acquired with a matched phase error below 1° (approximately 2.1 ns), and the step delay time better than 2 ns were realized.
NASA Astrophysics Data System (ADS)
Yussup, N.; Ibrahim, M. M.; Lombigit, L.; Rahman, N. A. A.; Zin, M. R. M.
2014-02-01
Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of data acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yussup, N.; Ibrahim, M. M.; Lombigit, L.
Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of datamore » acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.« less
Architectures for Cognitive Systems
2010-02-01
highly modular many- node chip was designed which addressed power efficiency to the maximum extent possible. Each node contains an Asynchronous Field...optimization to perform complex cognitive computing operations. This project focused on the design of the core and integration across a four node chip . A...follow on project will focus on creating a 3 dimensional stack of chips that is enabled by the low power usage. The chip incorporates structures to
A real-time multi-scale 2D Gaussian filter based on FPGA
NASA Astrophysics Data System (ADS)
Luo, Haibo; Gai, Xingqin; Chang, Zheng; Hui, Bin
2014-11-01
Multi-scale 2-D Gaussian filter has been widely used in feature extraction (e.g. SIFT, edge etc.), image segmentation, image enhancement, image noise removing, multi-scale shape description etc. However, their computational complexity remains an issue for real-time image processing systems. Aimed at this problem, we propose a framework of multi-scale 2-D Gaussian filter based on FPGA in this paper. Firstly, a full-hardware architecture based on parallel pipeline was designed to achieve high throughput rate. Secondly, in order to save some multiplier, the 2-D convolution is separated into two 1-D convolutions. Thirdly, a dedicate first in first out memory named as CAFIFO (Column Addressing FIFO) was designed to avoid the error propagating induced by spark on clock. Finally, a shared memory framework was designed to reduce memory costs. As a demonstration, we realized a 3 scales 2-D Gaussian filter on a single ALTERA Cyclone III FPGA chip. Experimental results show that, the proposed framework can computing a Multi-scales 2-D Gaussian filtering within one pixel clock period, is further suitable for real-time image processing. Moreover, the main principle can be popularized to the other operators based on convolution, such as Gabor filter, Sobel operator and so on.
Research of the small satellite data management system
NASA Astrophysics Data System (ADS)
Yu, Xiaozhou; Zhou, Fengqi; Zhou, Jun
2007-11-01
Small satellite is the integration of light weight, small volume and low launch cost. It is a promising approach to realize the future space mission. A detailed study of the data management system has been carried out, with using new reconfiguration method based on System On Programmable Chip (SOPC). Compared with common structure of satellite, the Central Terminal Unit (CTU), the Remote Terminal Unit (RTU) and Serial Data Bus (SDB) of the data management are all integrated in single chip. Thus the reliability of the satellite is greatly improved. At the same time, the data management system has powerful performance owing to the modern FPGA processing ability.
NASA Astrophysics Data System (ADS)
Alvear, Andrés.; Finger, Ricardo; Fuentes, Roberto; Sapunar, Raúl; Geelen, Tom; Curotto, Franco; Rodríguez, Rafael; Monasterio, David; Reyes, Nicolás.; Mena, Patricio; Bronfman, Leonardo
2016-07-01
Field Programmable Gate Arrays (FPGAs) capacity and Analog to Digital Converters (ADCs) speed have largely increased in the last decade. Nowadays we can find one million or more logic blocks (slices) as well as several thousand arithmetic units (ALUs/DSP) available on a single FPGA chip. We can also commercially procure ADC chips reaching 10 GSPS, with 8 bits resolution or more. This unprecedented power of computing hardware has allowed the digitalization of signal processes traditionally performed by analog components. In radio astronomy, the clearest example has been the development of digital sideband separating receivers which, by replacing the IF hybrid and calibrating the system imbalances, have exhibited a sideband rejection above 40dB; this is 20 to 30dB higher than traditional analog sideband separating (2SB) receivers. In Rodriguez et al.,1 and Finger et al.,2 we have demonstrated very high digital sideband separation at 3mm and 1mm wavelengths, using laboratory setups. We here show the first implementation of such technique with a 3mm receiver integrated into a telescope, where the calibration was performed by quasi-optical injection of the test tone in front of the Cassegrain antenna. We also reported progress in digital polarization synthesis, particularly in the implementation of a calibrated Digital Ortho-Mode Transducer (DOMT) based on the Morgan et al. proof of concept.3 They showed off- line synthesis of polarization with isolation higher than 40dB. We plan to implement a digital polarimeter in a real-time FPGA-based (ROACH-2) platform, to show ultra-pure polarization isolation in a non-stop integrating spectrometer.
Implementation of a pulse coupled neural network in FPGA.
Waldemark, J; Millberg, M; Lindblad, T; Waldemark, K; Becanovic, V
2000-06-01
The Pulse Coupled neural network, PCNN, is a biologically inspired neural net and it can be used in various image analysis applications, e.g. time-critical applications in the field of image pre-processing like segmentation, filtering, etc. a VHDL implementation of the PCNN targeting FPGA was undertaken and the results presented here. The implementation contains many interesting features. By pipelining the PCNN structure a very high throughput of 55 million neuron iterations per second could be achieved. By making the coefficients re-configurable during operation, a complete recognition system could be implemented on one, or maybe two, chip(s). Reconsidering the ranges and resolutions of the constants may save a lot of hardware, since the higher resolution requires larger multipliers, adders, memories etc.
Wireless Sensor Node for Autonomous Monitoring and Alerts in Remote Environments
NASA Technical Reports Server (NTRS)
Panangadan, Anand V. (Inventor); Monacos, Steve P. (Inventor)
2015-01-01
A method, apparatus, system, and computer program products provides personal alert and tracking capabilities using one or more nodes. Each node includes radio transceiver chips operating at different frequency ranges, a power amplifier, sensors, a display, and embedded software. The chips enable the node to operate as either a mobile sensor node or a relay base station node while providing a long distance relay link between nodes. The power amplifier enables a line-of-sight communication between the one or more nodes. The sensors provide a GPS signal, temperature, and accelerometer information (used to trigger an alert condition). The embedded software captures and processes the sensor information, provides a multi-hop packet routing protocol to relay the sensor information to and receive alert information from a command center, and to display the alert information on the display.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Learn, Mark Walter
Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not availablemore » to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.« less
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System
NASA Astrophysics Data System (ADS)
Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.
2016-03-01
We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators — data rates exceeding a terabyte per second — are becoming common. The ICE system is built around a custom FPGA motherboard that makes use of an Xilinx Kintex-7 FPGA and ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array. Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio interferometers.
Research of aerial imaging spectrometer data acquisition technology based on USB 3.0
NASA Astrophysics Data System (ADS)
Huang, Junze; Wang, Yueming; He, Daogang; Yu, Yanan
2016-11-01
With the emergence of UAV (unmanned aerial vehicle) platform for aerial imaging spectrometer, research of aerial imaging spectrometer DAS(data acquisition system) faces new challenges. Due to the limitation of platform and other factors, the aerial imaging spectrometer DAS requires small-light, low-cost and universal. Traditional aerial imaging spectrometer DAS system is expensive, bulky, non-universal and unsupported plug-and-play based on PCIe. So that has been unable to meet promotion and application of the aerial imaging spectrometer. In order to solve these problems, the new data acquisition scheme bases on USB3.0 interface.USB3.0 can provide guarantee of small-light, low-cost and universal relying on the forward-looking technology advantage. USB3.0 transmission theory is up to 5Gbps.And the GPIF programming interface achieves 3.2Gbps of the effective theoretical data bandwidth.USB3.0 can fully meet the needs of the aerial imaging spectrometer data transmission rate. The scheme uses the slave FIFO asynchronous data transmission mode between FPGA and USB3014 interface chip. Firstly system collects spectral data from TLK2711 of high-speed serial interface chip. Then FPGA receives data in DDR2 cache after ping-pong data processing. Finally USB3014 interface chip transmits data via automatic-dma approach and uploads to PC by USB3.0 cable. During the manufacture of aerial imaging spectrometer, the DAS can achieve image acquisition, transmission, storage and display. All functions can provide the necessary test detection for aerial imaging spectrometer. The test shows that system performs stable and no data lose. Average transmission speed and storage speed of writing SSD can stabilize at 1.28Gbps. Consequently ,this data acquisition system can meet application requirements for aerial imaging spectrometer.
Design and implementation of projects with Xilinx Zynq FPGA: a practical case
NASA Astrophysics Data System (ADS)
Travaglini, R.; D'Antone, I.; Meneghini, S.; Rignanese, L.; Zuffa, M.
The main advantage when using FPGAs with embedded processors is the availability of additional several high-performance resources in the same physical device. Moreover, the FPGA programmability allows for connect custom peripherals. Xilinx have designed a programmable device named Zynq-7000 (simply called Zynq in the following), which integrates programmable logic (identical to the other Xilinx "serie 7" devices) with a System on Chip (SOC) based on two embedded ARM processors. Since both parts are deeply connected, the designers benefit from performance of hardware SOC and flexibility of programmability as well. In this paper a design developed by the Electronic Design Department at the Bologna Division of INFN will be presented as a practical case of project based on Zynq device. It is developed by using a commercial board called ZedBoard hosting a FMC mezzanine with a 12-bit 500 MS/s ADC. The Zynq FPGA on the ZedBoard receives digital outputs from the ADC and send them to the acquisition PC, after proper formatting, through a Gigabit Ethernet link. The major focus of the paper will be about the methodology to develop a Zynq-based design with the Xilinx Vivado software, enlightening how to configure the SOC and connect it with the programmable logic. Firmware design techniques will be presented: in particular both VHDL and IP core based strategies will be discussed. Further, the procedure to develop software for the embedded processor will be presented. Finally, some debugging tools, like the embedded Logic Analyzer, will be shown. Advantages and disadvantages with respect to adopting FPGA without embedded processors will be discussed.
[A novel biologic electricity signal measurement based on neuron chip].
Lei, Yinsheng; Wang, Mingshi; Sun, Tongjing; Zhu, Qiang; Qin, Ran
2006-06-01
Neuron chip is a multiprocessor with three pipeline CPU; its communication protocol and control processor are integrated in effect to carry out the function of communication, control, attemper, I/O, etc. A novel biologic electronic signal measurement network system is composed of intelligent measurement nodes with neuron chip at the core. In this study, the electronic signals such as ECG, EEG, EMG and BOS can be synthetically measured by those intelligent nodes, and some valuable diagnostic messages are found. Wavelet transform is employed in this system to analyze various biologic electronic signals due to its strong time-frequency ability of decomposing signal local character. Better effect is gained. This paper introduces the hardware structure of network and intelligent measurement node, the measurement theory and the signal figure of data acquisition and processing.
HALO: a reconfigurable image enhancement and multisensor fusion system
NASA Astrophysics Data System (ADS)
Wu, F.; Hickman, D. L.; Parker, Steve J.
2014-06-01
Contemporary high definition (HD) cameras and affordable infrared (IR) imagers are set to dramatically improve the effectiveness of security, surveillance and military vision systems. However, the quality of imagery is often compromised by camera shake, or poor scene visibility due to inadequate illumination or bad atmospheric conditions. A versatile vision processing system called HALO™ is presented that can address these issues, by providing flexible image processing functionality on a low size, weight and power (SWaP) platform. Example processing functions include video distortion correction, stabilisation, multi-sensor fusion and image contrast enhancement (ICE). The system is based around an all-programmable system-on-a-chip (SoC), which combines the computational power of a field-programmable gate array (FPGA) with the flexibility of a CPU. The FPGA accelerates computationally intensive real-time processes, whereas the CPU provides management and decision making functions that can automatically reconfigure the platform based on user input and scene content. These capabilities enable a HALO™ equipped reconnaissance or surveillance system to operate in poor visibility, providing potentially critical operational advantages in visually complex and challenging usage scenarios. The choice of an FPGA based SoC is discussed, and the HALO™ architecture and its implementation are described. The capabilities of image distortion correction, stabilisation, fusion and ICE are illustrated using laboratory and trials data.
On-chip visual perception of motion: a bio-inspired connectionist model on FPGA.
Torres-Huitzil, César; Girau, Bernard; Castellanos-Sánchez, Claudio
2005-01-01
Visual motion provides useful information to understand the dynamics of a scene to allow intelligent systems interact with their environment. Motion computation is usually restricted by real time requirements that need the design and implementation of specific hardware architectures. In this paper, the design of hardware architecture for a bio-inspired neural model for motion estimation is presented. The motion estimation is based on a strongly localized bio-inspired connectionist model with a particular adaptation of spatio-temporal Gabor-like filtering. The architecture is constituted by three main modules that perform spatial, temporal, and excitatory-inhibitory connectionist processing. The biomimetic architecture is modeled, simulated and validated in VHDL. The synthesis results on a Field Programmable Gate Array (FPGA) device show the potential achievement of real-time performance at an affordable silicon area.
Neuron array with plastic synapses and programmable dendrites.
Ramakrishnan, Shubha; Wunderlich, Richard; Hasler, Jennifer; George, Suma
2013-10-01
We describe a novel neuromorphic chip architecture that models neurons for efficient computation. Traditional architectures of neuron array chips consist of large scale systems that are interfaced with AER for implementing intra- or inter-chip connectivity. We present a chip that uses AER for inter-chip communication but uses fast, reconfigurable FPGA-style routing with local memory for intra-chip connectivity. We model neurons with biologically realistic channel models, synapses and dendrites. This chip is suitable for small-scale network simulations and can also be used for sequence detection, utilizing directional selectivity properties of dendrites, ultimately for use in word recognition.
Estimating the circuit delay of FPGA with a transfer learning method
NASA Astrophysics Data System (ADS)
Cui, Xiuhai; Liu, Datong; Peng, Yu; Peng, Xiyuan
2017-10-01
With the increase of FPGA (Field Programmable Gate Array, FPGA) functionality, FPGA has become an on-chip system platform. Due to increase the complexity of FPGA, estimating the delay of FPGA is a very challenge work. To solve the problems, we propose a transfer learning estimation delay (TLED) method to simplify the delay estimation of different speed grade FPGA. In fact, the same style different speed grade FPGA comes from the same process and layout. The delay has some correlation among different speed grade FPGA. Therefore, one kind of speed grade FPGA is chosen as a basic training sample in this paper. Other training samples of different speed grade can get from the basic training samples through of transfer learning. At the same time, we also select a few target FPGA samples as training samples. A general predictive model is trained by these samples. Thus one kind of estimation model is used to estimate different speed grade FPGA circuit delay. The framework of TRED includes three phases: 1) Building a basic circuit delay library which includes multipliers, adders, shifters, and so on. These circuits are used to train and build the predictive model. 2) By contrasting experiments among different algorithms, the forest random algorithm is selected to train predictive model. 3) The target circuit delay is predicted by the predictive model. The Artix-7, Kintex-7, and Virtex-7 are selected to do experiments. Each of them includes -1, -2, -2l, and -3 different speed grade. The experiments show the delay estimation accuracy score is more than 92% with the TLED method. This result shows that the TLED method is a feasible delay assessment method, especially in the high-level synthesis stage of FPGA tool, which is an efficient and effective delay assessment method.
Design of low noise imaging system
NASA Astrophysics Data System (ADS)
Hu, Bo; Chen, Xiaolai
2017-10-01
In order to meet the needs of engineering applications for low noise imaging system under the mode of global shutter, a complete imaging system is designed based on the SCMOS (Scientific CMOS) image sensor CIS2521F. The paper introduces hardware circuit and software system design. Based on the analysis of key indexes and technologies about the imaging system, the paper makes chips selection and decides SCMOS + FPGA+ DDRII+ Camera Link as processing architecture. Then it introduces the entire system workflow and power supply and distribution unit design. As for the software system, which consists of the SCMOS control module, image acquisition module, data cache control module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The imaging experimental results show that the imaging system exhibits a 2560*2160 pixel resolution, has a maximum frame frequency of 50 fps. The imaging quality of the system satisfies the requirement of the index.
High performance embedded system for real-time pattern matching
NASA Astrophysics Data System (ADS)
Sotiropoulou, C.-L.; Luciano, P.; Gkaitatzis, S.; Citraro, S.; Giannetti, P.; Dell'Orso, M.
2017-02-01
In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton-proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering) are also implemented on the FPGA. The pattern matching can be executed on a 2D or 3D space, on black and white or grayscale images, depending on the application and thus increasing exponentially the processing requirements of the system. We present the firmware implementation of the training and pattern matching algorithm, performance and results on a latest generation Xilinx Kintex Ultrascale FPGA device.
NASA Astrophysics Data System (ADS)
Gunay, Omer; Ozsarac, Ismail; Kamisli, Fatih
2017-05-01
Video recording is an essential property of new generation military imaging systems. Playback of the stored video on the same device is also desirable as it provides several operational benefits to end users. Two very important constraints for many military imaging systems, especially for hand-held devices and thermal weapon sights, are power consumption and size. To meet these constraints, it is essential to perform most of the processing applied to the video signal, such as preprocessing, compression, storing, decoding, playback and other system functions on a single programmable chip, such as FPGA, DSP, GPU or ASIC. In this work, H.264/AVC (Advanced Video Coding) compatible video compression, storage, decoding and playback blocks are efficiently designed and implemented on FPGA platforms using FPGA fabric and Altera NIOS II soft processor. Many subblocks that are used in video encoding are also used during video decoding in order to save FPGA resources and power. Computationally complex blocks are designed using FPGA fabric, while blocks such as SD card write/read, H.264 syntax decoding and CAVLC decoding are done using NIOS processor to benefit from software flexibility. In addition, to keep power consumption low, the system was designed to require limited external memory access. The design was tested using 640x480 25 fps thermal camera on CYCLONE V FPGA, which is the ALTERA's lowest power FPGA family, and consumes lower than 40% of CYCLONE V 5CEFA7 FPGA resources on average.
Toward a Dynamically Reconfigurable Computing and Communication System for Small Spacecraft
NASA Technical Reports Server (NTRS)
Kifle, Muli; Andro, Monty; Tran, Quang K.; Fujikawa, Gene; Chu, Pong P.
2003-01-01
Future science missions will require the use of multiple spacecraft with multiple sensor nodes autonomously responding and adapting to a dynamically changing space environment. The acquisition of random scientific events will require rapidly changing network topologies, distributed processing power, and a dynamic resource management strategy. Optimum utilization and configuration of spacecraft communications and navigation resources will be critical in meeting the demand of these stringent mission requirements. There are two important trends to follow with respect to NASA's (National Aeronautics and Space Administration) future scientific missions: the use of multiple satellite systems and the development of an integrated space communications network. Reconfigurable computing and communication systems may enable versatile adaptation of a spacecraft system's resources by dynamic allocation of the processor hardware to perform new operations or to maintain functionality due to malfunctions or hardware faults. Advancements in FPGA (Field Programmable Gate Array) technology make it possible to incorporate major communication and network functionalities in FPGA chips and provide the basis for a dynamically reconfigurable communication system. Advantages of higher computation speeds and accuracy are envisioned with tremendous hardware flexibility to ensure maximum survivability of future science mission spacecraft. This paper discusses the requirements, enabling technologies, and challenges associated with dynamically reconfigurable space communications systems.
A CMOS high speed imaging system design based on FPGA
NASA Astrophysics Data System (ADS)
Tang, Hong; Wang, Huawei; Cao, Jianzhong; Qiao, Mingrui
2015-10-01
CMOS sensors have more advantages than traditional CCD sensors. The imaging system based on CMOS has become a hot spot in research and development. In order to achieve the real-time data acquisition and high-speed transmission, we design a high-speed CMOS imaging system on account of FPGA. The core control chip of this system is XC6SL75T and we take advantages of CameraLink interface and AM41V4 CMOS image sensors to transmit and acquire image data. AM41V4 is a 4 Megapixel High speed 500 frames per second CMOS image sensor with global shutter and 4/3" optical format. The sensor uses column parallel A/D converters to digitize the images. The CameraLink interface adopts DS90CR287 and it can convert 28 bits of LVCMOS/LVTTL data into four LVDS data stream. The reflected light of objects is photographed by the CMOS detectors. CMOS sensors convert the light to electronic signals and then send them to FPGA. FPGA processes data it received and transmits them to upper computer which has acquisition cards through CameraLink interface configured as full models. Then PC will store, visualize and process images later. The structure and principle of the system are both explained in this paper and this paper introduces the hardware and software design of the system. FPGA introduces the driven clock of CMOS. The data in CMOS is converted to LVDS signals and then transmitted to the data acquisition cards. After simulation, the paper presents a row transfer timing sequence of CMOS. The system realized real-time image acquisition and external controls.
Energy Harvesting Chip and the Chip Based Power Supply Development for a Wireless Sensor Network.
Lee, Dasheng
2008-12-02
In this study, an energy harvesting chip was developed to scavenge energy from artificial light to charge a wireless sensor node. The chip core is a miniature transformer with a nano-ferrofluid magnetic core. The chip embedded transformer can convert harvested energy from its solar cell to variable voltage output for driving multiple loads. This chip system yields a simple, small, and more importantly, a battery-less power supply solution. The sensor node is equipped with multiple sensors that can be enabled by the energy harvesting power supply to collect information about the human body comfort degree. Compared with lab instruments, the nodes with temperature, humidity and photosensors driven by harvested energy had variation coefficient measurement precision of less than 6% deviation under low environmental light of 240 lux. The thermal comfort was affected by the air speed. A flow sensor equipped on the sensor node was used to detect airflow speed. Due to its high power consumption, this sensor node provided 15% less accuracy than the instruments, but it still can meet the requirement of analysis for predicted mean votes (PMV) measurement. The energy harvesting wireless sensor network (WSN) was deployed in a 24-hour convenience store to detect thermal comfort degree from the air conditioning control. During one year operation, the sensor network powered by the energy harvesting chip retained normal functions to collect the PMV index of the store. According to the one month statistics of communication status, the packet loss rate (PLR) is 2.3%, which is as good as the presented results of those WSNs powered by battery. Referring to the electric power records, almost 54% energy can be saved by the feedback control of an energy harvesting sensor network. These results illustrate that, scavenging energy not only creates a reliable power source for electronic devices, such as wireless sensor nodes, but can also be an energy source by building an energy efficient program.
Energy Harvesting Chip and the Chip Based Power Supply Development for a Wireless Sensor Network
Lee, Dasheng
2008-01-01
In this study, an energy harvesting chip was developed to scavenge energy from artificial light to charge a wireless sensor node. The chip core is a miniature transformer with a nano-ferrofluid magnetic core. The chip embedded transformer can convert harvested energy from its solar cell to variable voltage output for driving multiple loads. This chip system yields a simple, small, and more importantly, a battery-less power supply solution. The sensor node is equipped with multiple sensors that can be enabled by the energy harvesting power supply to collect information about the human body comfort degree. Compared with lab instruments, the nodes with temperature, humidity and photosensors driven by harvested energy had variation coefficient measurement precision of less than 6% deviation under low environmental light of 240 lux. The thermal comfort was affected by the air speed. A flow sensor equipped on the sensor node was used to detect airflow speed. Due to its high power consumption, this sensor node provided 15% less accuracy than the instruments, but it still can meet the requirement of analysis for predicted mean votes (PMV) measurement. The energy harvesting wireless sensor network (WSN) was deployed in a 24-hour convenience store to detect thermal comfort degree from the air conditioning control. During one year operation, the sensor network powered by the energy harvesting chip retained normal functions to collect the PMV index of the store. According to the one month statistics of communication status, the packet loss rate (PLR) is 2.3%, which is as good as the presented results of those WSNs powered by battery. Referring to the electric power records, almost 54% energy can be saved by the feedback control of an energy harvesting sensor network. These results illustrate that, scavenging energy not only creates a reliable power source for electronic devices, such as wireless sensor nodes, but can also be an energy source by building an energy efficient program. PMID:27873953
Fast particles identification in programmable form at level-0 trigger by means of the 3D-Flow system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crosetto, Dario B.
1998-10-30
The 3D-Flow Processor system is a new, technology-independent concept in very fast, real-time system architectures. Based on either an FPGA or an ASIC implementation, it can address, in a fully programmable manner, applications where commercially available processors would fail because of throughput requirements. Possible applications include filtering-algorithms (pattern recognition) from the input of multiple sensors, as well as moving any input validated by these filtering-algorithms to a single output channel. Both operations can easily be implemented on a 3D-Flow system to achieve a real-time processing system with a very short lag time. This system can be built either with off-the-shelfmore » FPGAs or, for higher data rates, with CMOS chips containing 4 to 16 processors each. The basic building block of the system, a 3D-Flow processor, has been successfully designed in VHDL code written in ''Generic HDL'' (mostly made of reusable blocks that are synthesizable in different technologies, or FPGAs), to produce a netlist for a four-processor ASIC featuring 0.35 micron CBA (Ceil Base Array) technology at 3.3 Volts, 884 mW power dissipation at 60 MHz and 63.75 mm sq. die size. The same VHDL code has been targeted to three FPGA manufacturers (Altera EPF10K250A, ORCA-Lucent Technologies 0R3T165 and Xilinx XCV1000). A complete set of software tools, the 3D-Flow System Manager, equally applicable to ASIC or FPGA implementations, has been produced to provide full system simulation, application development, real-time monitoring, and run-time fault recovery. Today's technology can accommodate 16 processors per chip in a medium size die, at a cost per processor of less than $5 based on the current silicon die/size technology cost.« less
The Level 0 Pixel Trigger system for the ALICE experiment
NASA Astrophysics Data System (ADS)
Aglieri Rinella, G.; Kluge, A.; Krivda, M.; ALICE Silicon Pixel Detector project
2007-01-01
The ALICE Silicon Pixel Detector contains 1200 readout chips. Fast-OR signals indicate the presence of at least one hit in the 8192 pixel matrix of each chip. The 1200 bits are transmitted every 100 ns on 120 data readout optical links using the G-Link protocol. The Pixel Trigger System extracts and processes them to deliver an input signal to the Level 0 trigger processor targeting a latency of 800 ns. The system is compact, modular and based on FPGA devices. The architecture allows the user to define and implement various trigger algorithms. The system uses advanced 12-channel parallel optical fiber modules operating at 1310 nm as optical receivers and 12 deserializer chips closely packed in small area receiver boards. Alternative solutions with multi-channel G-Link deserializers implemented directly in programmable hardware devices were investigated. The design of the system and the progress of the ALICE Pixel Trigger project are described in this paper.
NEW EPICS/RTEMS IOC BASED ON ALTERA SOC AT JEFFERSON LAB
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Jianxun; Seaton, Chad; Allison, Trent L.
A new EPICS/RTEMS IOC based on the Altera System-on-Chip (SoC) FPGA is being designed at Jefferson Lab. The Altera SoC FPGA integrates a dual ARM Cortex-A9 Hard Processor System (HPS) consisting of processor, peripherals and memory interfaces tied seamlessly with the FPGA fabric using a high-bandwidth interconnect backbone. The embedded Altera SoC IOC has features of remote network boot via U-Boot from SD card or QSPI Flash, 1Gig Ethernet, 1GB DDR3 SDRAM on HPS, UART serial ports, and ISA bus interface. RTEMS for the ARM processor BSP were built with CEXP shell, which will dynamically load the EPICS applications atmore » runtime. U-Boot is the primary bootloader to remotely load the kernel image into local memory from a DHCP/TFTP server over Ethernet, and automatically run RTEMS and EPICS. The first design of the SoC IOC will be compatible with Jefferson Lab’s current PC104 IOCs, which have been running in CEBAF 10 years. The next design would be mounting in a chassis and connected to a daughter card via standard HSMC connectors. This standard SoC IOC will become the next generation of low-level IOC for the accelerator controls at Jefferson Lab.« less
SPIDR, a general-purpose readout system for pixel ASICs
NASA Astrophysics Data System (ADS)
van der Heijden, B.; Visser, J.; van Beuzekom, M.; Boterenbrood, H.; Kulis, S.; Munneke, B.; Schreuder, F.
2017-02-01
The SPIDR (Speedy PIxel Detector Readout) system is a flexible general-purpose readout platform that can be easily adapted to test and characterize new and existing detector readout ASICs. It is originally designed for the readout of pixel ASICs from the Medipix/Timepix family, but other types of ASICs or front-end circuits can be read out as well. The SPIDR system consists of an FPGA board with memory and various communication interfaces, FPGA firmware, CPU subsystem and an API library on the PC . The FPGA firmware can be adapted to read out other ASICs by re-using IP blocks. The available IP blocks include a UDP packet builder, 1 and 10 Gigabit Ethernet MAC's and a "soft core" CPU . Currently the firmware is targeted at the Xilinx VC707 development board and at a custom board called Compact-SPIDR . The firmware can easily be ported to other Xilinx 7 series and ultra scale FPGAs. The gap between an ASIC and the data acquisition back-end is bridged by the SPIDR system. Using the high pin count VITA 57 FPGA Mezzanine Card (FMC) connector only a simple chip carrier PCB is required. A 1 and a 10 Gigabit Ethernet interface handle the connection to the back-end. These can be used simultaneously for high-speed data and configuration over separate channels. In addition to the FMC connector, configurable inputs and outputs are available for synchronization with other detectors. A high resolution (≈ 27 ps bin size) Time to Digital converter is provided for time stamping events in the detector. The SPIDR system is frequently used as readout for the Medipix3 and Timepix3 ASICs. Using the 10 Gigabit Ethernet interface it is possible to read out a single chip at full bandwidth or up to 12 chips at a reduced rate. Another recent application is the test-bed for the VeloPix ASIC, which is developed for the Vertex Detector of the LHCb experiment. In this case the SPIDR system processes the 20 Gbps scrambled data stream from the VeloPix and distributes it over four 10 Gigabit Ethernet links, and in addition provides the slow and fast control for the chip.
Integration of multi-interface conversion channel using FPGA for modular photonic network
NASA Astrophysics Data System (ADS)
Janicki, Tomasz; Pozniak, Krzysztof T.; Romaniuk, Ryszard S.
2010-09-01
The article discusses the integration of different types of interfaces with FPGA circuits using a reconfigurable communication platform. The solution has been implemented in practice in a single node of a distributed measurement system. Construction of communication platform has been presented with its selected hardware modules, described in VHDL and implemented in FPGA circuits. The graphical user interface (GUI) has been described that allows a user to control the operation of the system. In the final part of the article selected practical solutions have been introduced. The whole measurement system resides on multi-gigabit optical network. The optical network construction is highly modular, reconfigurable and scalable.
NASA Astrophysics Data System (ADS)
Ren, Y. J.; Zhu, J. G.; Yang, X. Y.; Ye, S. H.
2006-10-01
The Virtex-II Pro FPGA is applied to the vision sensor tracking system of IRB2400 robot. The hardware platform, which undertakes the task of improving SNR and compressing data, is constructed by using the high-speed image processing of FPGA. The lower level image-processing algorithm is realized by combining the FPGA frame and the embedded CPU. The velocity of image processing is accelerated due to the introduction of FPGA and CPU. The usage of the embedded CPU makes it easily to realize the logic design of interface. Some key techniques are presented in the text, such as read-write process, template matching, convolution, and some modules are simulated too. In the end, the compare among the modules using this design, using the PC computer and using the DSP, is carried out. Because the high-speed image processing system core is a chip of FPGA, the function of which can renew conveniently, therefore, to a degree, the measure system is intelligent.
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which lead to more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, need to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems, such as task distribution, node communication, system verification, are discussed.
A high-speed DAQ framework for future high-level trigger and event building clusters
NASA Astrophysics Data System (ADS)
Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.
2017-03-01
Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger system. In this paper the heterogeneous FPGA-GPU architecture will be presented and its performance be discussed.
FPGA implementation of a configurable neuromorphic CPG-based locomotion controller.
Barron-Zambrano, Jose Hugo; Torres-Huitzil, Cesar
2013-09-01
Neuromorphic engineering is a discipline devoted to the design and development of computational hardware that mimics the characteristics and capabilities of neuro-biological systems. In recent years, neuromorphic hardware systems have been implemented using a hybrid approach incorporating digital hardware so as to provide flexibility and scalability at the cost of power efficiency and some biological realism. This paper proposes an FPGA-based neuromorphic-like embedded system on a chip to generate locomotion patterns of periodic rhythmic movements inspired by Central Pattern Generators (CPGs). The proposed implementation follows a top-down approach where modularity and hierarchy are two desirable features. The locomotion controller is based on CPG models to produce rhythmic locomotion patterns or gaits for legged robots such as quadrupeds and hexapods. The architecture is configurable and scalable for robots with either different morphologies or different degrees of freedom (DOFs). Experiments performed on a real robot are presented and discussed. The obtained results demonstrate that the CPG-based controller provides the necessary flexibility to generate different rhythmic patterns at run-time suitable for adaptable locomotion. Copyright © 2013 Elsevier Ltd. All rights reserved.
Towards a Generic and Adaptive System-On-Chip Controller for Space Exploration Instrumentation
NASA Technical Reports Server (NTRS)
Iturbe, Xabier; Keymeulen, Didier; Yiu, Patrick; Berisford, Dan; Hand, Kevin; Carlson, Robert; Ozer, Emre
2015-01-01
This paper introduces one of the first efforts conducted at NASA’s Jet Propulsion Laboratory (JPL) to develop a generic System-on-Chip (SoC) platform to control science instruments that are proposed for future NASA missions. The SoC platform is named APEX-SoC, where APEX stands for Advanced Processor for space Exploration, and is based on a hybrid Xilinx Zynq that combines an FPGA and an ARM Cortex-A9 dual-core processor on a single chip. The Zynq implements a generic and customizable on-chip infrastructure that can be reused with a variety of instruments, and it has been coupled with a set of off-chip components that are necessary to deal with the different instruments. We have taken JPL’s Compositional InfraRed Imaging Spectrometer (CIRIS), which is proposed for NASA icy moons missions, as a use-case scenario to demonstrate that the entire data processing, control and interface of an instrument can be implemented on a single device using the on-chip infrastructure described in this paper. We show that the performance results achieved in this preliminary version of the instrumentation controller are sufficient to fulfill the science requirements demanded to the CIRIS instrument in future NASA missions, such as Europa.
Versatile single-chip event sequencer for atomic physics experiments
NASA Astrophysics Data System (ADS)
Eyler, Edward
2010-03-01
A very inexpensive dsPIC microcontroller with internal 32-bit counters is used to produce a flexible timing signal generator with up to 16 TTL-compatible digital outputs, with a time resolution and accuracy of 50 ns. This time resolution is easily sufficient for event sequencing in typical experiments involving cold atoms or laser spectroscopy. This single-chip device is capable of triggered operation and can also function as a sweeping delay generator. With one additional chip it can also concurrently produce accurately timed analog ramps, and another one-chip addition allows real-time control from an external computer. Compared to an FPGA-based digital pattern generator, this design is slower but simpler and more flexible, and it can be reprogrammed using ordinary `C' code without special knowledge. I will also describe the use of the same microcontroller with additional hardware to implement a digital lock-in amplifier and PID controller for laser locking, including a simple graphics-based control unit. This work is supported in part by the NSF.
FPGA Implementation of Metastability-Based True Random Number Generator
NASA Astrophysics Data System (ADS)
Hata, Hisashi; Ichikawa, Shuichi
True random number generators (TRNGs) are important as a basis for computer security. Though there are some TRNGs composed of analog circuit, the use of digital circuits is desired for the application of TRNGs to logic LSIs. Some of the digital TRNGs utilize jitter in free-running ring oscillators as a source of entropy, which consume large power. Another type of TRNG exploits the metastability of a latch to generate entropy. Although this kind of TRNG has been mostly implemented with full-custom LSI technology, this study presents an implementation based on common FPGA technology. Our TRNG is comprised of logic gates only, and can be integrated in any kind of logic LSI. The RS latch in our TRNG is implemented as a hard-macro to guarantee the quality of randomness by minimizing the signal skew and load imbalance of internal nodes. To improve the quality and throughput, the output of 64-256 latches are XOR'ed. The derived design was verified on a Xilinx Virtex-4 FPGA (XC4VFX20), and passed NIST statistical test suite without post-processing. Our TRNG with 256 latches occupies 580 slices, while achieving 12.5Mbps throughput.
Embedded System Implementation on FPGA System With μCLinux OS
NASA Astrophysics Data System (ADS)
Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna
2011-02-01
Embedded systems are taking on more complicated tasks as the processors involved become more powerful. The embedded systems have been widely used in many areas such as in industries, automotives, medical imaging, communications, speech recognition and computer vision. The complexity requirements in hardware and software nowadays need a flexibility system for further enhancement in any design without adding new hardware. Therefore, any changes in the design system will affect the processor that need to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using Field Programmable Gate Array (FPGA). A softcore processor, NIOS II 32-bit RISC, which is the microprocessor core was utilized in FPGA system together with the embedded operating system(OS), μClinux. In this paper, an example of web server is explained and demonstrated
2016-05-01
A9 CPU and 15 W for the i7 CPU. A method of accelerating this computation is by using a customized hardware unit called a field- programmable gate...implementation of custom logic to accelerate com- putational workloads. This FPGA fabric, in addition to the standard programmable logic, contains 220...chip; field- programmable gate array Daniel Gebhardt U U U U 18 (619) 553-2786 INITIAL DISTRIBUTION 84300 Library (2) 85300 Archive/Stock (1
2016-05-01
A9 CPU and 15 W for the i7 CPU. A method of accelerating this computation is by using a customized hardware unit called a field- programmable gate...implementation of custom logic to accelerate com- putational workloads. This FPGA fabric, in addition to the standard programmable logic, contains 220...chip; field- programmable gate array Daniel Gebhardt U U U U 18 (619) 553-2786 INITIAL DISTRIBUTION 84300 Library (2) 85300 Archive/Stock (1
2016-03-31
Corporation, Linthicum, Maryland *Corresponding author: Pavel.Borodulin@ngc.com Abstract: A chip -scale, highly-reconfigurable transmitter and...the technology has been used in a chip -scale, reconfigurable receiver demonstration and ongoing efforts to increase the level of performance and...circuit (RF-FPGA). It consists of a heterogeneous assembly of a SiGe BiCMOS chip with multiple 3D-integrated, low-loss, phase-change switch chiplets
Data acquisition system issues for large experiments
NASA Astrophysics Data System (ADS)
Siskind, E. J.
2007-09-01
This talk consists of personal observations on two classes of data acquisition ("DAQ") systems for Silicon trackers in large experiments with which the author has been concerned over the last three or more years. The first half is a classic "lessons learned" recital based on experience with the high-level debug and configuration of the DAQ system for the GLAST LAT detector. The second half is concerned with a discussion of the promises and pitfalls of using modern (and future) generations of "system-on-a-chip" ("SOC") or "platform" field-programmable gate arrays ("FPGAs") in future large DAQ systems. The DAQ system pipeline for the 864k channels of Si tracker in the GLAST LAT consists of five tiers of hardware buffers which ultimately feed into the main memory of the (two-active-node) level-3 trigger processor farm. The data formats and buffer volumes of these tiers are briefly described, as well as the flow control employed between successive tiers. Lessons learned regarding data formats, buffer volumes, and flow control/data discard policy are discussed. The continued development of platform FPGAs containing large amounts of configurable logic fabric, embedded PowerPC hard processor cores, digital signal processing components, large volumes of on-chip buffer memory, and multi-gigabit serial I/O capability permits DAQ system designers to vastly increase the amount of data preprocessing that can be performed in parallel within the DAQ pipeline for detector systems in large experiments. The capabilities of some currently available FPGA families are reviewed, along with the prospects for next-generation families of announced, but not yet available, platform FPGAs. Some experience with an actual implementation is presented, and reconciliation between advertised and achievable specifications is attempted. The prospects for applying these components to space-borne Si tracker detectors are briefly discussed.
Construction of Protograph LDPC Codes with Linear Minimum Distance
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; Dolinar, Sam; Jones, Christopher
2006-01-01
A construction method for protograph-based LDPC codes that simultaneously achieve low iterative decoding threshold and linear minimum distance is proposed. We start with a high-rate protograph LDPC code with variable node degrees of at least 3. Lower rate codes are obtained by splitting check nodes and connecting them by degree-2 nodes. This guarantees the linear minimum distance property for the lower-rate codes. Excluding checks connected to degree-1 nodes, we show that the number of degree-2 nodes should be at most one less than the number of checks for the protograph LDPC code to have linear minimum distance. Iterative decoding thresholds are obtained by using the reciprocal channel approximation. Thresholds are lowered by using either precoding or at least one very high-degree node in the base protograph. A family of high- to low-rate codes with minimum distance linearly increasing in block size and with capacity-approaching performance thresholds is presented. FPGA simulation results for a few example codes show that the proposed codes perform as predicted.
Hardware architecture design of a fast global motion estimation method
NASA Astrophysics Data System (ADS)
Liang, Chaobing; Sang, Hongshi; Shen, Xubang
2015-12-01
VLSI implementation of gradient-based global motion estimation (GME) faces two main challenges: irregular data access and high off-chip memory bandwidth requirement. We previously proposed a fast GME method that reduces computational complexity by choosing certain number of small patches containing corners and using them in a gradient-based framework. A hardware architecture is designed to implement this method and further reduce off-chip memory bandwidth requirement. On-chip memories are used to store coordinates of the corners and template patches, while the Gaussian pyramids of both the template and reference frame are stored in off-chip SDRAMs. By performing geometric transform only on the coordinates of the center pixel of a 3-by-3 patch in the template image, a 5-by-5 area containing the warped 3-by-3 patch in the reference image is extracted from the SDRAMs by burst read. Patched-based and burst mode data access helps to keep the off-chip memory bandwidth requirement at the minimum. Although patch size varies at different pyramid level, all patches are processed in term of 3x3 patches, so the utilization of the patch-processing circuit reaches 100%. FPGA implementation results show that the design utilizes 24,080 bits on-chip memory and for a sequence with resolution of 352x288 and frequency of 60Hz, the off-chip bandwidth requirement is only 3.96Mbyte/s, compared with 243.84Mbyte/s of the original gradient-based GME method. This design can be used in applications like video codec, video stabilization, and super-resolution, where real-time GME is a necessity and minimum memory bandwidth requirement is appreciated.
NASA Astrophysics Data System (ADS)
Burri, Samuel; Homulle, Harald; Bruschini, Claudio; Charbon, Edoardo
2016-04-01
LinoSPAD is a reconfigurable camera sensor with a 256×1 CMOS SPAD (single-photon avalanche diode) pixel array connected to a low cost Xilinx Spartan 6 FPGA. The LinoSPAD sensor's line of pixels has a pitch of 24 μm and 40% fill factor. The FPGA implements an array of 64 TDCs and histogram engines capable of processing up to 8.5 giga-photons per second. The LinoSPAD sensor measures 1.68 mm×6.8 mm and each pixel has a direct digital output to connect to the FPGA. The chip is bonded on a carrier PCB to connect to the FPGA motherboard. 64 carry chain based TDCs sampled at 400 MHz can generate a timestamp every 7.5 ns with a mean time resolution below 25 ps per code. The 64 histogram engines provide time-of-arrival histograms covering up to 50 ns. An alternative mode allows the readout of 28 bit timestamps which have a range of up to 4.5 ms. Since the FPGA TDCs have considerable non-linearity we implemented a correction module capable of increasing histogram linearity at real-time. The TDC array is interfaced to a computer using a super-speed USB3 link to transfer over 150k histograms per second for the 12.5 ns reference period used in our characterization. After characterization and subsequent programming of the post-processing we measure an instrument response histogram shorter than 100 ps FWHM using a strong laser pulse with 50 ps FWHM. A timing resolution that when combined with the high fill factor makes the sensor well suited for a wide variety of applications from fluorescence lifetime microscopy over Raman spectroscopy to 3D time-of-flight.
JPIC-Rad-Hard JPEG2000 Image Compression ASIC
NASA Astrophysics Data System (ADS)
Zervas, Nikos; Ginosar, Ran; Broyde, Amitai; Alon, Dov
2010-08-01
JPIC is a rad-hard high-performance image compression ASIC for the aerospace market. JPIC implements tier 1 of the ISO/IEC 15444-1 JPEG2000 (a.k.a. J2K) image compression standard [1] as well as the post compression rate-distortion algorithm, which is part of tier 2 coding. A modular architecture enables employing a single JPIC or multiple coordinated JPIC units. JPIC is designed to support wide data sources of imager in optical, panchromatic and multi-spectral space and airborne sensors. JPIC has been developed as a collaboration of Alma Technologies S.A. (Greece), MBT/IAI Ltd (Israel) and Ramon Chips Ltd (Israel). MBT IAI defined the system architecture requirements and interfaces, The JPEG2K-E IP core from Alma implements the compression algorithm [2]. Ramon Chips adds SERDES interfaces and host interfaces and integrates the ASIC. MBT has demonstrated the full chip on an FPGA board and created system boards employing multiple JPIC units. The ASIC implementation, based on Ramon Chips' 180nm CMOS RadSafe[TM] RH cell library enables superior radiation hardness.
Fpga based L-band pulse doppler radar design and implementation
NASA Astrophysics Data System (ADS)
Savci, Kubilay
As its name implies RADAR (Radio Detection and Ranging) is an electromagnetic sensor used for detection and locating targets from their return signals. Radar systems propagate electromagnetic energy, from the antenna which is in part intercepted by an object. Objects reradiate a portion of energy which is captured by the radar receiver. The received signal is then processed for information extraction. Radar systems are widely used for surveillance, air security, navigation, weather hazard detection, as well as remote sensing applications. In this work, an FPGA based L-band Pulse Doppler radar prototype, which is used for target detection, localization and velocity calculation has been built and a general-purpose Pulse Doppler radar processor has been developed. This radar is a ground based stationary monopulse radar, which transmits a short pulse with a certain pulse repetition frequency (PRF). Return signals from the target are processed and information about their location and velocity is extracted. Discrete components are used for the transmitter and receiver chain. The hardware solution is based on Xilinx Virtex-6 ML605 FPGA board, responsible for the control of the radar system and the digital signal processing of the received signal, which involves Constant False Alarm Rate (CFAR) detection and Pulse Doppler processing. The algorithm is implemented in MATLAB/SIMULINK using the Xilinx System Generator for DSP tool. The field programmable gate arrays (FPGA) implementation of the radar system provides the flexibility of changing parameters such as the PRF and pulse length therefore it can be used with different radar configurations as well. A VHDL design has been developed for 1Gbit Ethernet connection to transfer digitized return signal and detection results to PC. An A-Scope software has been developed with C# programming language to display time domain radar signals and detection results on PC. Data are processed both in FPGA chip and on PC. FPGA uses fixed point arithmetic operations as it is fast and facilitates source requirement as it consumes less hardware than floating point arithmetic operations. The software uses floating point arithmetic operations, which ensure precision in processing at the expense of speed. The functionality of the radar system has been tested for experimental validation in the field with a moving car and the validation of submodules are tested with synthetic data simulated on MATLAB.
Integrated test system of infrared and laser data based on USB 3.0
NASA Astrophysics Data System (ADS)
Fu, Hui Quan; Tang, Lin Bo; Zhang, Chao; Zhao, Bao Jun; Li, Mao Wen
2017-07-01
Based on USB3.0, this paper presents the design method of an integrated test system for both infrared image data and laser signal data processing module. The core of the design is FPGA logic control, the design uses dual-chip DDR3 SDRAM to achieve high-speed laser data cache, and receive parallel LVDS image data through serial-to-parallel conversion chip, and it achieves high-speed data communication between the system and host computer through the USB3.0 bus. The experimental results show that the developed PC software realizes the real-time display of 14-bit LVDS original image after 14-to-8 bit conversion and JPEG2000 compressed image after decompression in software, and can realize the real-time display of the acquired laser signal data. The correctness of the test system design is verified, indicating that the interface link is normal.
Wirtz, Sebastian F; Cunha, Adauto P A; Labusch, Marc; Marzun, Galina; Barcikowski, Stephan; Söffker, Dirk
2018-06-01
Today, the demand for continuous monitoring of valuable or safety critical equipment is increasing in many industrial applications due to safety and economical requirements. Therefore, reliable in-situ measurement techniques are required for instance in Structural Health Monitoring (SHM) as well as process monitoring and control. Here, current challenges are related to the processing of sensor data with a high data rate and low latency. In particular, measurement and analyses of Acoustic Emission (AE) are widely used for passive, in-situ inspection. Advantages of AE are related to its sensitivity to different micro-mechanical mechanisms on the material level. However, online processing of AE waveforms is computationally demanding. The related equipment is typically bulky, expensive, and not well suited for permanent installation. The contribution of this paper is the development of a Field Programmable Gate Array (FPGA)-based measurement system using ZedBoard devlopment kit with Zynq-7000 system on chip for embedded implementation of suitable online processing algorithms. This platform comprises a dual-core Advanced Reduced Instruction Set Computer Machine (ARM) architecture running a Linux operating system and FPGA fabric. A FPGA-based hardware implementation of the discrete wavelet transform is realized to accelerate processing the AE measurements. Key features of the system are low cost, small form factor, and low energy consumption, which makes it suitable to serve as field-deployed measurement and control device. For verification of the functionality, a novel automatically realized adjustment of the working distance during pulsed laser ablation in liquids is established as an example. A sample rate of 5 MHz is achieved at 16 bit resolution.
NASA Astrophysics Data System (ADS)
Zou, Liang; Fu, Zhuang; Zhao, YanZheng; Yang, JunYan
2010-07-01
This paper proposes a kind of pipelined electric circuit architecture implemented in FPGA, a very large scale integrated circuit (VLSI), which efficiently deals with the real time non-uniformity correction (NUC) algorithm for infrared focal plane arrays (IRFPA). Dual Nios II soft-core processors and a DSP with a 64+ core together constitute this image system. Each processor undertakes own systematic task, coordinating its work with each other's. The system on programmable chip (SOPC) in FPGA works steadily under the global clock frequency of 96Mhz. Adequate time allowance makes FPGA perform NUC image pre-processing algorithm with ease, which has offered favorable guarantee for the work of post image processing in DSP. And at the meantime, this paper presents a hardware (HW) and software (SW) co-design in FPGA. Thus, this systematic architecture yields an image processing system with multiprocessor, and a smart solution to the satisfaction with the performance of the system.
Kirk, Andrew G; Plant, David V; Szymanski, Ted H; Vranesic, Zvonko G; Tooley, Frank A P; Rolston, David R; Ayliffe, Michael H; Lacroix, Frederic K; Robertson, Brian; Bernier, Eric; Brosseau, Daniel F
2003-05-10
Design and implementation of a free-space optical backplane for multiprocessor applications is presented. The system is designed to interconnect four multiprocessor nodes that communicate by using multiplexed 32-bit packets. Each multiprocessor node is electrically connected to an optoelectronic VLSI chip which implements the hyperplane interconnection architecture. The chips each contain 256 optical transmitters (implemented as dual-rail multiple quantum-well modulators) and 256 optical receivers. A rigid free-space microoptical interconnection system that interconnects the transceiver chips in a 512-channel unidirectional ring is implemented. Full design, implementation, and operational details are provided.
NASA Astrophysics Data System (ADS)
Kirk, Andrew G.; Plant, David V.; Szymanski, Ted H.; Vranesic, Zvonko G.; Tooley, Frank A. P.; Rolston, David R.; Ayliffe, Michael H.; Lacroix, Frederic K.; Robertson, Brian; Bernier, Eric; Brosseau, Daniel F.
2003-05-01
Design and implementation of a free-space optical backplane for multiprocessor applications is presented. The system is designed to interconnect four multiprocessor nodes that communicate by using multiplexed 32-bit packets. Each multiprocessor node is electrically connected to an optoelectronic VLSI chip which implements the hyperplane interconnection architecture. The chips each contain 256 optical transmitters (implemented as dual-rail multiple quantum-well modulators) and 256 optical receivers. A rigid free-space microoptical interconnection system that interconnects the transceiver chips in a 512-channel unidirectional ring is implemented. Full design, implementation, and operational details are provided.
A Robust Strategy for Total Ionizing Dose Testing of Field Programmable Gate Arrays
NASA Technical Reports Server (NTRS)
Wilcox, Edward; Berg, Melanie; Friendlich, Mark; Lakeman, Joseph; KIm, Hak; Pellish, Jonathan; LaBel, Kenneth
2012-01-01
We present a novel method of FPGA TID testing that measures propagation delay between flip-flops operating at maximum speed. Measurement is performed on-chip at-speed and provides a key design metric when building system-critical synchronous designs.
Serial data acquisition for GEM-2D detector
NASA Astrophysics Data System (ADS)
Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech; Zienkiewicz, Pawel; Mazon, Didier; Malard, Philippe; Herrmann, Albrecht; Vezinet, Didier
2014-11-01
This article debates about data fast acquisition and histogramming method for the X-ray GEM detector. The whole process of histogramming is performed by FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in an internal FPGA memory and then sent to PC. In PC data is merged and processed by MATLAB. The structure of firmware functionality implemented in the FPGAs is described. Examples of test measurements and results are presented.
A space-efficient quantum computer simulator suitable for high-speed FPGA implementation
NASA Astrophysics Data System (ADS)
Frank, Michael P.; Oniciuc, Liviu; Meyer-Baese, Uwe H.; Chiorescu, Irinel
2009-05-01
Conventional vector-based simulators for quantum computers are quite limited in the size of the quantum circuits they can handle, due to the worst-case exponential growth of even sparse representations of the full quantum state vector as a function of the number of quantum operations applied. However, this exponential-space requirement can be avoided by using general space-time tradeoffs long known to complexity theorists, which can be appropriately optimized for this particular problem in a way that also illustrates some interesting reformulations of quantum mechanics. In this paper, we describe the design and empirical space/time complexity measurements of a working software prototype of a quantum computer simulator that avoids excessive space requirements. Due to its space-efficiency, this design is well-suited to embedding in single-chip environments, permitting especially fast execution that avoids access latencies to main memory. We plan to prototype our design on a standard FPGA development board.
Video sensor architecture for surveillance applications.
Sánchez, Jordi; Benet, Ginés; Simó, José E
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%.
Video Sensor Architecture for Surveillance Applications
Sánchez, Jordi; Benet, Ginés; Simó, José E.
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%. PMID:22438723
NASA Astrophysics Data System (ADS)
Song, Z.; Wang, Y.; Kuang, J.
2018-05-01
Field Programmable Gate Arrays (FPGAs) made with 28 nm and more advanced process technology have great potentials for implementation of high precision time-to-digital convertors (TDC), because the delay cells in the tapped delay line (TDL) used for time interpolation are getting smaller and smaller. However, the bubble problems in the TDL status are becoming more complicated, which make it difficult to achieve TDCs on these chips with a high time precision. In this paper, we are proposing a novel decomposition encoding scheme, which not only can solve the bubble problem easily, but also has a high encoding efficiency. The potential of these chips to realize TDC can be fully released with the scheme. In a Xilinx Kintex-7 FPGA chip, we implemented a TDC system with 256 TDC channels, which doubles the number of TDC channels that our previous technique could achieve. Performances of all these TDC channels are evaluated. The average RMS time precision among them is 10.23 ps in the time-interval measurement range of (0–10 ns), and their measurement throughput reaches 277 M measures per second.
Yang, Fan; Paindavoine, M
2003-01-01
This paper describes a real time vision system that allows us to localize faces in video sequences and verify their identity. These processes are image processing techniques based on the radial basis function (RBF) neural network approach. The robustness of this system has been evaluated quantitatively on eight video sequences. We have adapted our model for an application of face recognition using the Olivetti Research Laboratory (ORL), Cambridge, UK, database so as to compare the performance against other systems. We also describe three hardware implementations of our model on embedded systems based on the field programmable gate array (FPGA), zero instruction set computer (ZISC) chips, and digital signal processor (DSP) TMS320C62, respectively. We analyze the algorithm complexity and present results of hardware implementations in terms of the resources used and processing speed. The success rates of face tracking and identity verification are 92% (FPGA), 85% (ZISC), and 98.2% (DSP), respectively. For the three embedded systems, the processing speeds for images size of 288 /spl times/ 352 are 14 images/s, 25 images/s, and 4.8 images/s, respectively.
Optically Programmable Field Programmable Gate Arrays (FPGA) Systems
2004-01-01
VCSEL requires placing the array far enough as to overlap the entire footprint of the signal beam in order to record the hologram. Therefore, these...hologram that self-focuses, due to phase -conjugation, on the array of detectors in the chip. VC A 10 m m 10 mm 18mm 16mm SEL RRAY OPTICAL MEMORY LOGIC...the VCSEL array , the chip and the optical material, and the requirements they have to meet for their use in the OPGA system. Section
CoNNeCT Baseband Processor Module
NASA Technical Reports Server (NTRS)
Yamamoto, Clifford K; Jedrey, Thomas C.; Gutrich, Daniel G.; Goodpasture, Richard L.
2011-01-01
A document describes the CoNNeCT Baseband Processor Module (BPM) based on an updated processor, memory technology, and field-programmable gate arrays (FPGAs). The BPM was developed from a requirement to provide sufficient computing power and memory storage to conduct experiments for a Software Defined Radio (SDR) to be implemented. The flight SDR uses the AT697 SPARC processor with on-chip data and instruction cache. The non-volatile memory has been increased from a 20-Mbit EEPROM (electrically erasable programmable read only memory) to a 4-Gbit Flash, managed by the RTAX2000 Housekeeper, allowing more programs and FPGA bit-files to be stored. The volatile memory has been increased from a 20-Mbit SRAM (static random access memory) to a 1.25-Gbit SDRAM (synchronous dynamic random access memory), providing additional memory space for more complex operating systems and programs to be executed on the SPARC. All memory is EDAC (error detection and correction) protected, while the SPARC processor implements fault protection via TMR (triple modular redundancy) architecture. Further capability over prior BPM designs includes the addition of a second FPGA to implement features beyond the resources of a single FPGA. Both FPGAs are implemented with Xilinx Virtex-II and are interconnected by a 96-bit bus to facilitate data exchange. Dedicated 1.25- Gbit SDRAMs are wired to each Xilinx FPGA to accommodate high rate data buffering for SDR applications as well as independent SpaceWire interfaces. The RTAX2000 manages scrub and configuration of each Xilinx.
NASA Astrophysics Data System (ADS)
Liu, Peipei; Yang, Suyoung; Lim, Hyung Jin; Park, Hyung Chul; Ko, In Chang; Sohn, Hoon
2014-03-01
Fatigue crack is one of the main culprits for the failure of metallic structures. Recently, it has been shown that nonlinear wave modulation spectroscopy (NWMS) is effective in detecting nonlinear mechanisms produced by fatigue crack. In this study, an active wireless sensor node for fatigue crack detection is developed based on NWMS. Using PZT transducers attached to a target structure, ultrasonic waves at two distinctive frequencies are generated, and their modulation due to fatigue crack formation is detected using another PZT transducer. Furthermore, a reference-free NWMS algorithm is developed so that fatigue crack can be detected without relying on history data of the structure with minimal parameter adjustment by the end users. The algorithm is embedded into FPGA, and the diagnosis is transmitted to a base station using a commercial wireless communication system. The whole design of the sensor node is fulfilled in a low power working strategy. Finally, an experimental verification has been performed using aluminum plate specimens to show the feasibility of the developed active wireless NWMS sensor node.
Multi-element germanium detectors for synchrotron applications
NASA Astrophysics Data System (ADS)
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; Vernon, E.; Pinelli, D.; Dooryhee, E.; Ghose, S.; Caswell, T.; Siddons, D. P.; Miceli, A.; Baldwin, J.; Almer, J.; Okasinski, J.; Quaranta, O.; Woods, R.; Krings, T.; Stock, S.
2018-04-01
We have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. We will discuss the technical details of the systems, and present some of the results from them.
Nanophotonic rare-earth quantum memory with optically controlled retrieval
NASA Astrophysics Data System (ADS)
Zhong, Tian; Kindem, Jonathan M.; Bartholomew, John G.; Rochman, Jake; Craiciu, Ioana; Miyazono, Evan; Bettinelli, Marco; Cavalli, Enrico; Verma, Varun; Nam, Sae Woo; Marsili, Francesco; Shaw, Matthew D.; Beyer, Andrew D.; Faraon, Andrei
2017-09-01
Optical quantum memories are essential elements in quantum networks for long-distance distribution of quantum entanglement. Scalable development of quantum network nodes requires on-chip qubit storage functionality with control of the readout time. We demonstrate a high-fidelity nanophotonic quantum memory based on a mesoscopic neodymium ensemble coupled to a photonic crystal cavity. The nanocavity enables >95% spin polarization for efficient initialization of the atomic frequency comb memory and time bin-selective readout through an enhanced optical Stark shift of the comb frequencies. Our solid-state memory is integrable with other chip-scale photon source and detector devices for multiplexed quantum and classical information processing at the network nodes.
García, Gabriel J.; Jara, Carlos A.; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M.; Torres, Fernando
2014-01-01
The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field. PMID:24691100
García, Gabriel J; Jara, Carlos A; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M; Torres, Fernando
2014-03-31
The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field.
Sensor chip and apparatus for tactile and/or flow sensing
NASA Technical Reports Server (NTRS)
Liu, Chang (Inventor); Chen, Jack (Inventor); Engel, Jonathan (Inventor)
2008-01-01
A sensor chip, comprising a flexible, polymer-based substrate, and at least one microfabricated sensor disposed on the substrate and including a conductive element. The at least one sensor comprises at least one of a tactile sensor and a flow sensor. Other embodiments of the present invention include sensors and/or multi-modal sensor nodes.
Sensor chip and apparatus for tactile and/or flow sensing
NASA Technical Reports Server (NTRS)
Liu, Chang (Inventor); Chen, Jack (Inventor); Engel, Jonathan (Inventor)
2009-01-01
A sensor chip, comprising a flexible, polymer-based substrate, and at least one microfabricated sensor disposed on the substrate and including a conductive element. The at least one sensor comprises at least one of a tactile sensor and a flow sensor. Other embodiments of the present invention include sensors and/or multi-modal sensor nodes.
Fault Tolerant Microcontroller for the Configurable Fault Tolerant Processor
2008-09-01
many others come to mind I also wish to thank Jan Grey for providing an excellent System-on-a-Chip that formed a core component of this thesis...developed by Jan Gray as documented in his "Building a RISC CPU and System-on-a-Chip in an FPGA" series of articles that was published in Circuit Cellar...those detailed by Jan Gray in his "Getting Started with the XSOC Project v0.93" [16]. The XSOC distribution is available at <http://www.fpgacpu.org
Temperature-Adaptive Circuits on Reconfigurable Analog Arrays
NASA Technical Reports Server (NTRS)
Stoica, Adrian; Zebulum, Ricardo S.; Keymeulen, Didier; Ramesham, Rajeshuni; Neff, Joseph; Katkoori, Srinivas
2006-01-01
Demonstration of a self-reconfigurable Integrated Circuit (IC) that would operate under extreme temperature (-180 C and 120 C) and radiation (300krad), without the protection of thermal controls and radiation shields. Self-Reconfigurable Electronics platform: a) Evolutionary Processor (EP) to run reconfiguration mechanism; b) Reconfigurable chip (FPGA, FPAA, etc).
FPGA Sequencer for Radar Altimeter Applications
NASA Technical Reports Server (NTRS)
Berkun, Andrew C.; Pollard, Brian D.; Chen, Curtis W.
2011-01-01
A sequencer for a radar altimeter provides accurate attitude information for a reliable soft landing of the Mars Science Laboratory (MSL). This is a field-programmable- gate-array (FPGA)-only implementation. A table loaded externally into the FPGA controls timing, processing, and decision structures. Radar is memory-less and does not use previous acquisitions to assist in the current acquisition. All cycles complete in exactly 50 milliseconds, regardless of range or whether a target was found. A RAM (random access memory) within the FPGA holds instructions for up to 15 sets. For each set, timing is run, echoes are processed, and a comparison is made. If a target is seen, more detailed processing is run on that set. If no target is seen, the next set is tried. When all sets have been run, the FPGA terminates and waits for the next 50-millisecond event. This setup simplifies testing and improves reliability. A single vertex chip does the work of an entire assembly. Output products require minor processing to become range and velocity. This technology is the heart of the Terminal Descent Sensor, which is an integral part of the Entry Decent and Landing system for MSL. In addition, it is a strong candidate for manned landings on Mars or the Moon.
Li, Zong-Tao; Wu, Tie-Jun; Lin, Can-Long; Ma, Long-Hua
2011-01-01
A new generalized optimum strapdown algorithm with coning and sculling compensation is presented, in which the position, velocity and attitude updating operations are carried out based on the single-speed structure in which all computations are executed at a single updating rate that is sufficiently high to accurately account for high frequency angular rate and acceleration rectification effects. Different from existing algorithms, the updating rates of the coning and sculling compensations are unrelated with the number of the gyro incremental angle samples and the number of the accelerometer incremental velocity samples. When the output sampling rate of inertial sensors remains constant, this algorithm allows increasing the updating rate of the coning and sculling compensation, yet with more numbers of gyro incremental angle and accelerometer incremental velocity in order to improve the accuracy of system. Then, in order to implement the new strapdown algorithm in a single FPGA chip, the parallelization of the algorithm is designed and its computational complexity is analyzed. The performance of the proposed parallel strapdown algorithm is tested on the Xilinx ISE 12.3 software platform and the FPGA device XC6VLX550T hardware platform on the basis of some fighter data. It is shown that this parallel strapdown algorithm on the FPGA platform can greatly decrease the execution time of algorithm to meet the real-time and high precision requirements of system on the high dynamic environment, relative to the existing implemented on the DSP platform. PMID:22164058
Multi-petascale highly efficient parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.
A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five dimensional torus network that optimally maximize the throughput of packet communications between nodes and minimize latency. The network implements collective network and a global asynchronous network that provides global barrier and notification functions. Integrated in the node design include a list-based prefetcher. The memory system implements transaction memory, thread level speculation, and multiversioning cache that improves soft error rate at the same time andmore » supports DMA functionality allowing for parallel processing message-passing.« less
A new FPGA-driven P-HIFU system with harmonic cancellation technique
NASA Astrophysics Data System (ADS)
Wu, Hao; Shen, Guofeng; Su, Zhiqiang; Chen, Yazhu
2017-03-01
This paper introduces a high intensity focused ultrasound system for ablation using switch-mode power amplifiers with harmonic cancellation technique eliminating the 3rdharmonic and all even harmonics. The efficiency of the amplifier is optimized by choosing different parameters of the harmonic cancellation technique. This technique requires double driving signals, and specific signal waveform because of the full-bridge topology. The new FPGA-driven P-HIFU system has 200 channels of phase signals that can form 100 output channels. An FPGA chip is used to generate these signals, and each channel has a phase resolution of 2 ns, less than one degree. The output waveform of the amplifier, voltage waveform across the transducer, shows fewer harmonic components.
Slow Computing Simulation of Bio-plausible Control
2012-03-01
information networks, neuromorphic chips would become necessary. Small unstable flying platforms currently require RTK, GPS, or Vicon closed-circuit...Visual, and IR Sensing FPGA ASIC Neuromorphic Chip Simulation Quad Rotor Robotic Insect Uniform Independent Network Single Modality Neural Network... neuromorphic Processing across parallel computational elements =0.54 N u m b e r o f c o m p u ta tio n s - No info 14 integrated circuit
Picoradio: Communication/Computation Piconodes for Sensor Networks
2003-01-02
diagram of PicoNode III, or Quark node. It is made from two custom chips, Strange RF and Charm digital processor , and is complemented by a set of...the chipset comprising of Strange (analog OOK transceiver) and Charm (digital processor ) chips. 44 Figure 33: System block diagram of the Quark node...19 2.B PICONODE II - TWO-CHIP PICONODE IMPLEMENTATION ......................................... 21 2.B.1 Baseband processor (BBP
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA
Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang
2009-01-01
Background In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers including parallel computers or multi-core computers exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results RNAalifold shows complicated data dependences, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine grain hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981-residue running on a Personal Computer (PC) platform with Pentium 4 2.6 GHz CPU. PMID:19208138
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2017-01-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft’s FPGA deployment in its Bing search engine and Intel’s 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems—like Apache Spark and Hadoop—to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.
Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason
2016-10-01
With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems-like Apache Spark and Hadoop-to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator
Wang, Runchun M.; Thakur, Chetan S.; van Schaik, André
2018-01-01
This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks. PMID:29692702
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator.
Wang, Runchun M; Thakur, Chetan S; van Schaik, André
2018-01-01
This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.
The Design and Development of the SMEX-Lite Power System
NASA Technical Reports Server (NTRS)
Rakow, Glenn P.; Schnurr, Richard G., Jr.; Solly, Michael A.
1998-01-01
This paper describes the design and development of a 250W orbit average electrical power system electronic Power Node and software for use in Low Earth Orbit missions. The mass of the Power Node is 3.6 Kg (8 lb.). The dimensions of the Power Node are 30cm x 26cm x 7.9cm (11 in. x 10.25 in x 3.1 in.) The design was realized using software, Field Programmable Gate Array (FPGA) digital logic and surface mount technology. The design is generic enough to reduce the non-recurring engineering for different mission configurations. The Power Node charges one to five, low cost, 22-cell 4 AH D-cell battery packs independently. The battery charging algorithms are executed in the power software to reduce the mass and size of the power electronic. The Power Node implements a peak-power tracking algorithm using an innovative hardware/software approach. The power software task is hosted on the spacecraft processor. The power software task generates a MIL-STD-1553 command packet to update the Power Node control settings. The settings for the battery voltage and current limits, as well as minimum solar array voltage used to implement peak power tracking are contained in this packet. Several advanced topologies are used in the Power Node. These include synchronous rectification in the bus regulators, average current control in the battery chargers and quasi-resonant converters for the Field Effect Transistor (FET) transistor drive electronics. Lastly, the main bus regulator uses a feed-forward topology with the PWM implemented in an FPGA.
Simulation environment based on the Universal Verification Methodology
NASA Astrophysics Data System (ADS)
Fiergolski, A.
2017-01-01
Universal Verification Methodology (UVM) is a standardized approach of verifying integrated circuit designs, targeting a Coverage-Driven Verification (CDV). It combines automatic test generation, self-checking testbenches, and coverage metrics to indicate progress in the design verification. The flow of the CDV differs from the traditional directed-testing approach. With the CDV, a testbench developer, by setting the verification goals, starts with an structured plan. Those goals are targeted further by a developed testbench, which generates legal stimuli and sends them to a device under test (DUT). The progress is measured by coverage monitors added to the simulation environment. In this way, the non-exercised functionality can be identified. Moreover, the additional scoreboards indicate undesired DUT behaviour. Such verification environments were developed for three recent ASIC and FPGA projects which have successfully implemented the new work-flow: (1) the CLICpix2 65 nm CMOS hybrid pixel readout ASIC design; (2) the C3PD 180 nm HV-CMOS active sensor ASIC design; (3) the FPGA-based DAQ system of the CLICpix chip. This paper, based on the experience from the above projects, introduces briefly UVM and presents a set of tips and advices applicable at different stages of the verification process-cycle.
A Reconfigurable Communications System for Small Spacecraft
NASA Technical Reports Server (NTRS)
Chu, Pong P.; Kifle, Muli
2004-01-01
Two trends of NASA missions are the use of multiple small spacecraft and the development of an integrated space network. To achieve these goals, a robust and agile communications system is needed. Advancements in field programmable gate array (FPGA) technology have made it possible to incorporate major communication and network functionalities in FPGA chips; thus this technology has great potential as the basis for a reconfigurable communications system. This report discusses the requirements of future space communications, reviews relevant issues, and proposes a methodology to design and construct a reconfigurable communications system for small scientific spacecraft.
Providing Self-Healing Ability for Wireless Sensor Node by Using Reconfigurable Hardware
Yuan, Shenfang; Qiu, Lei; Gao, Shang; Tong, Yao; Yang, Weiwei
2012-01-01
Wireless sensor networks (WSNs) have received tremendous attention over the past ten years. In engineering applications of WSNs, a number of sensor nodes are usually spread across some specific geographical area. Some of these nodes have to work in harsh environments. Dependability of the Wireless Sensor Network (WSN) is very important for its successful applications in the engineering area. In ordinary research, when a node has a failure, it is usually discarded and the network is reorganized to ensure the normal operation of the WSN. Using appropriate WSN re-organization methods, though the sensor networks can be reorganized, this causes additional maintenance costs and sometimes still decreases the function of the networks. In those situations where the sensor networks cannot be reorganized, the performance of the whole WSN will surely be degraded. In order to ensure the reliable and low cost operation of WSNs, a method to develop a wireless sensor node with self-healing ability based on reconfigurable hardware is proposed in this paper. Two self-healing WSN node realization paradigms based on reconfigurable hardware are presented, including a redundancy-based self-healing paradigm and a whole FPAA/FPGA based self-healing paradigm. The nodes designed with the self-healing ability can dynamically change their node configurations to repair the nodes' hardware failures. To demonstrate these two paradigms, a strain sensor node is adopted as an illustration to show the concepts. Two strain WSN sensor nodes with self-healing ability are developed respectively according to the proposed self-healing paradigms. Evaluation experiments on self-healing ability and power consumption are performed. Experimental results show that the developed nodes can self-diagnose the failures and recover to a normal state automatically. The research presented can improve the robustness of WSNs and reduce the maintenance cost of WSNs in engineering applications. PMID:23202176
Research on Safety Monitoring System of Tailings Dam Based on Internet of Things
NASA Astrophysics Data System (ADS)
Wang, Ligang; Yang, Xiaocong; He, Manchao
2018-03-01
The paper designed and implemented the safety monitoring system of tailings dam based on Internet of things, completed the hardware and software design of sensor nodes, routing nodes and coordinator node by using ZigBee wireless sensor chip CC2630 and 3G/4G data transmission module, developed the software platform integrated with geographic information system. The paper achieved real-time monitoring and data collection of tailings dam dam deformation, seepage line, water level and rainfall for all-weather, the stability of tailings dam based on the Internet of things monitoring is analyzed, and realized intelligent and scientific management of tailings dam under the guidance of the remote expert system.
A Low Power IoT Sensor Node Architecture for Waste Management Within Smart Cities Context.
Cerchecci, Matteo; Luti, Francesco; Mecocci, Alessandro; Parrino, Stefano; Peruzzi, Giacomo; Pozzebon, Alessandro
2018-04-21
This paper focuses on the realization of an Internet of Things (IoT) architecture to optimize waste management in the context of Smart Cities. In particular, a novel typology of sensor node based on the use of low cost and low power components is described. This node is provided with a single-chip microcontroller, a sensor able to measure the filling level of trash bins using ultrasounds and a data transmission module based on the LoRa LPWAN (Low Power Wide Area Network) technology. Together with the node, a minimal network architecture was designed, based on a LoRa gateway, with the purpose of testing the IoT node performances. Especially, the paper analyzes in detail the node architecture, focusing on the energy saving technologies and policies, with the purpose of extending the batteries lifetime by reducing power consumption, through hardware and software optimization. Tests on sensor and radio module effectiveness are also presented.
A Low Power IoT Sensor Node Architecture for Waste Management Within Smart Cities Context
Cerchecci, Matteo; Luti, Francesco; Mecocci, Alessandro; Parrino, Stefano; Peruzzi, Giacomo
2018-01-01
This paper focuses on the realization of an Internet of Things (IoT) architecture to optimize waste management in the context of Smart Cities. In particular, a novel typology of sensor node based on the use of low cost and low power components is described. This node is provided with a single-chip microcontroller, a sensor able to measure the filling level of trash bins using ultrasounds and a data transmission module based on the LoRa LPWAN (Low Power Wide Area Network) technology. Together with the node, a minimal network architecture was designed, based on a LoRa gateway, with the purpose of testing the IoT node performances. Especially, the paper analyzes in detail the node architecture, focusing on the energy saving technologies and policies, with the purpose of extending the batteries lifetime by reducing power consumption, through hardware and software optimization. Tests on sensor and radio module effectiveness are also presented. PMID:29690552
Development of an LSI for Tactile Sensor Systems on the Whole-Body of Robots
NASA Astrophysics Data System (ADS)
Muroyama, Masanori; Makihata, Mitsutoshi; Nakano, Yoshihiro; Matsuzaki, Sakae; Yamada, Hitoshi; Yamaguchi, Ui; Nakayama, Takahiro; Nonomura, Yutaka; Fujiyoshi, Motohiro; Tanaka, Shuji; Esashi, Masayoshi
We have developed a network type tactile sensor system, which realizes high-density tactile sensors on the whole-body of nursing and communication robots. The system consists of three kinds of nodes: host, relay and sensor nodes. Roles of the sensor node are to sense forces and, to encode the sensing data and to transmit the encoded data on serial channels by interruption handling. Relay nodes and host deal with a number of the encoded sensing data from the sensor nodes. A sensor node consists of a capacitive MEMS force sensor and a signal processing/transmission LSI. In this paper, details of an LSI for the sensor node are described. We designed experimental sensor node LSI chips by a commercial 0.18µm standard CMOS process. The 0.18µm LSIs were supplied in wafer level for MEMS post-process. The LSI chip area is 2.4mm × 2.4mm, which includes logic, CF converter and memory circuits. The maximum clock frequency of the chip with a large capacitive load is 10MHz. Measured power consumption at 10MHz clock is 2.23mW. Experimental results indicate that size, response time, sensor sensitivity and power consumption are all enough for practical tactile sensor systems.
NASA Astrophysics Data System (ADS)
Zand, Ramtin; DeMara, Ronald F.
2017-12-01
In this paper, we have developed a radiation-hardened non-volatile lookup table (LUT) circuit utilizing spin Hall effect (SHE)-magnetic random access memory (MRAM) devices. The design is motivated by modeling the effect of radiation particles striking hybrid complementary metal oxide semiconductor/spin based circuits, and the resistive behavior of SHE-MRAM devices via established and precise physics equations. The models developed are leveraged in the SPICE circuit simulator to verify the functionality of the proposed design. The proposed hardening technique is based on using feedback transistors, as well as increasing the radiation capacity of the sensitive nodes. Simulation results show that our proposed LUT circuit can achieve multiple node upset (MNU) tolerance with more than 38% and 60% power-delay product improvement as well as 26% and 50% reduction in device count compared to the previous energy-efficient radiation-hardened LUT designs. Finally, we have performed a process variation analysis showing that the MNU immunity of our proposed circuit is realized at the cost of increased susceptibility to transistor and MRAM variations compared to an unprotected LUT design.
FPGA-based prototype storage system with phase change memory
NASA Astrophysics Data System (ADS)
Li, Gezi; Chen, Xiaogang; Chen, Bomy; Li, Shunfen; Zhou, Mi; Han, Wenbing; Song, Zhitang
2016-10-01
With the ever-increasing amount of data being stored via social media, mobile telephony base stations, and network devices etc. the database systems face severe bandwidth bottlenecks when moving vast amounts of data from storage to the processing nodes. At the same time, Storage Class Memory (SCM) technologies such as Phase Change Memory (PCM) with unique features like fast read access, high density, non-volatility, byte-addressability, positive response to increasing temperature, superior scalability, and zero standby leakage have changed the landscape of modern computing and storage systems. In such a scenario, we present a storage system called FLEET which can off-load partial or whole SQL queries to the storage engine from CPU. FLEET uses an FPGA rather than conventional CPUs to implement the off-load engine due to its highly parallel nature. We have implemented an initial prototype of FLEET with PCM-based storage. The results demonstrate that significant performance and CPU utilization gains can be achieved by pushing selected query processing components inside in PCM-based storage.
Crossbar Nanocomputer Development
2012-04-01
their utilization. Areas such as neuromorphic computing, signal processing, arithmetic processing, and crossbar computing are only some of the...due to its intrinsic, network-on- chip flexibility to re-route around defects. Preliminary efforts in crossbar computing have been demonstrated by...they approach their scaling limits [2]. Other applications that memristive devices are suited for include FPGA [3], encryption [4], and neuromorphic
Low-power SXGA active matrix OLED
NASA Astrophysics Data System (ADS)
Wacyk, Ihor; Prache, Olivier; Ghosh, Amal
2009-05-01
This paper presents the design and first evaluation of a full-color 1280×3×1024 pixel, active matrix organic light emitting diode (AMOLED) microdisplay that operates at a low power of 200mW under typical operating conditions of 35fL, and offers a precision 30-bit RGB digital interface in a compact size (0.78-inch diagonal active area). The new system architecture developed by eMagin for the SXGA microdisplay, based on a separate FPGA driver and AMOLED display chip, offers several benefits, including better power efficiency, cost-effectiveness, more features for improved performance, and increased system flexibility.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McKisson, John
The source code for the Java Data Acquisition suite provides interfaces to the JLab built USB FPGA ADC across a LAN network. Each jDaq node provides ListMode data from JLab built detector systems and readouts.
Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research
Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M
2008-01-01
Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation. PMID:18412963
Günthard, H F; Wong, J K; Ignacio, C C; Havlir, D V; Richman, D D
1998-07-01
The performance of the high-density oligonucleotide array methodology (GeneChip) in detecting drug resistance mutations in HIV-1 pol was compared with that of automated dideoxynucleotide sequencing (ABI) of clinical samples, viral stocks, and plasmid-derived NL4-3 clones. Sequences from 29 clinical samples (plasma RNA, n = 17; lymph node RNA, n = 5; lymph node DNA, n = 7) from 12 patients, from 6 viral stock RNA samples, and from 13 NL4-3 clones were generated by both methods. Editing was done independently by a different investigator for each method before comparing the sequences. In addition, NL4-3 wild type (WT) and mutants were mixed in varying concentrations and sequenced by both methods. Overall, a concordance of 99.1% was found for a total of 30,865 bases compared. The comparison of clinical samples (plasma RNA and lymph node RNA and DNA) showed a slightly lower match of base calls, 98.8% for 19,831 nucleotides compared (protease region, 99.5%, n = 8272; RT region, 98.3%, n = 11,316), than for viral stocks and NL4-3 clones (protease region, 99.8%; RT region, 99.5%). Artificial mixing experiments showed a bias toward calling wild-type bases by GeneChip. Discordant base calls are most likely due to differential detection of mixtures. The concordance between GeneChip and ABI was high and appeared dependent on the nature of the templates (directly amplified versus cloned) and the complexity of mixes.
Field-Programmable Gate Array Computer in Structural Analysis: An Initial Exploration
NASA Technical Reports Server (NTRS)
Singleterry, Robert C., Jr.; Sobieszczanski-Sobieski, Jaroslaw; Brown, Samuel
2002-01-01
This paper reports on an initial assessment of using a Field-Programmable Gate Array (FPGA) computational device as a new tool for solving structural mechanics problems. A FPGA is an assemblage of binary gates arranged in logical blocks that are interconnected via software in a manner dependent on the algorithm being implemented and can be reprogrammed thousands of times per second. In effect, this creates a computer specialized for the problem that automatically exploits all the potential for parallel computing intrinsic in an algorithm. This inherent parallelism is the most important feature of the FPGA computational environment. It is therefore important that if a problem offers a choice of different solution algorithms, an algorithm of a higher degree of inherent parallelism should be selected. It is found that in structural analysis, an 'analog computer' style of programming, which solves problems by direct simulation of the terms in the governing differential equations, yields a more favorable solution algorithm than current solution methods. This style of programming is facilitated by a 'drag-and-drop' graphic programming language that is supplied with the particular type of FPGA computer reported in this paper. Simple examples in structural dynamics and statics illustrate the solution approach used. The FPGA system also allows linear scalability in computing capability. As the problem grows, the number of FPGA chips can be increased with no loss of computing efficiency due to data flow or algorithmic latency that occurs when a single problem is distributed among many conventional processors that operate in parallel. This initial assessment finds the FPGA hardware and software to be in their infancy in regard to the user conveniences; however, they have enormous potential for shrinking the elapsed time of structural analysis solutions if programmed with algorithms that exhibit inherent parallelism and linear scalability. This potential warrants further development of FPGA-tailored algorithms for structural analysis.
A Two-Stage Reconstruction Processor for Human Detection in Compressive Sensing CMOS Radar.
Tsao, Kuei-Chi; Lee, Ling; Chu, Ta-Shun; Huang, Yuan-Hao
2018-04-05
Complementary metal-oxide-semiconductor (CMOS) radar has recently gained much research attraction because small and low-power CMOS devices are very suitable for deploying sensing nodes in a low-power wireless sensing system. This study focuses on the signal processing of a wireless CMOS impulse radar system that can detect humans and objects in the home-care internet-of-things sensing system. The challenges of low-power CMOS radar systems are the weakness of human signals and the high computational complexity of the target detection algorithm. The compressive sensing-based detection algorithm can relax the computational costs by avoiding the utilization of matched filters and reducing the analog-to-digital converter bandwidth requirement. The orthogonal matching pursuit (OMP) is one of the popular signal reconstruction algorithms for compressive sensing radar; however, the complexity is still very high because the high resolution of human respiration leads to high-dimension signal reconstruction. Thus, this paper proposes a two-stage reconstruction algorithm for compressive sensing radar. The proposed algorithm not only has lower complexity than the OMP algorithm by 75% but also achieves better positioning performance than the OMP algorithm especially in noisy environments. This study also designed and implemented the algorithm by using Vertex-7 FPGA chip (Xilinx, San Jose, CA, USA). The proposed reconstruction processor can support the 256 × 13 real-time radar image display with a throughput of 28.2 frames per second.
Empirical Mode Decomposition and Neural Networks on FPGA for Fault Diagnosis in Induction Motors
Garcia-Perez, Arturo; Osornio-Rios, Roque Alfredo; Romero-Troncoso, Rene de Jesus
2014-01-01
Nowadays, many industrial applications require online systems that combine several processing techniques in order to offer solutions to complex problems as the case of detection and classification of multiple faults in induction motors. In this work, a novel digital structure to implement the empirical mode decomposition (EMD) for processing nonstationary and nonlinear signals using the full spline-cubic function is presented; besides, it is combined with an adaptive linear network (ADALINE)-based frequency estimator and a feed forward neural network (FFNN)-based classifier to provide an intelligent methodology for the automatic diagnosis during the startup transient of motor faults such as: one and two broken rotor bars, bearing defects, and unbalance. Moreover, the overall methodology implementation into a field-programmable gate array (FPGA) allows an online and real-time operation, thanks to its parallelism and high-performance capabilities as a system-on-a-chip (SoC) solution. The detection and classification results show the effectiveness of the proposed fused techniques; besides, the high precision and minimum resource usage of the developed digital structures make them a suitable and low-cost solution for this and many other industrial applications. PMID:24678281
Empirical mode decomposition and neural networks on FPGA for fault diagnosis in induction motors.
Camarena-Martinez, David; Valtierra-Rodriguez, Martin; Garcia-Perez, Arturo; Osornio-Rios, Roque Alfredo; Romero-Troncoso, Rene de Jesus
2014-01-01
Nowadays, many industrial applications require online systems that combine several processing techniques in order to offer solutions to complex problems as the case of detection and classification of multiple faults in induction motors. In this work, a novel digital structure to implement the empirical mode decomposition (EMD) for processing nonstationary and nonlinear signals using the full spline-cubic function is presented; besides, it is combined with an adaptive linear network (ADALINE)-based frequency estimator and a feed forward neural network (FFNN)-based classifier to provide an intelligent methodology for the automatic diagnosis during the startup transient of motor faults such as: one and two broken rotor bars, bearing defects, and unbalance. Moreover, the overall methodology implementation into a field-programmable gate array (FPGA) allows an online and real-time operation, thanks to its parallelism and high-performance capabilities as a system-on-a-chip (SoC) solution. The detection and classification results show the effectiveness of the proposed fused techniques; besides, the high precision and minimum resource usage of the developed digital structures make them a suitable and low-cost solution for this and many other industrial applications.
Multi-element germanium detectors for synchrotron applications
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; ...
2018-04-27
In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems,more » and present some of the results from them.« less
Multi-element germanium detectors for synchrotron applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.
In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems,more » and present some of the results from them.« less
NASA Technical Reports Server (NTRS)
Keymeulen, Didier; Ferguson, Michael I.; Fink, Wolfgang; Oks, Boris; Peay, Chris; Terrile, Richard; Cheng, Yen; Kim, Dennis; MacDonald, Eric; Foor, David
2005-01-01
We propose a tuning method for MEMS gyroscopes based on evolutionary computation to efficiently increase the sensitivity of MEMS gyroscopes through tuning. The tuning method was tested for the second generation JPL/Boeing Post-resonator MEMS gyroscope using the measurement of the frequency response of the MEMS device in open-loop operation. We also report on the development of a hardware platform for integrated tuning and closed loop operation of MEMS gyroscopes. The control of this device is implemented through a digital design on a Field Programmable Gate Array (FPGA). The hardware platform easily transitions to an embedded solution that allows for the miniaturization of the system to a single chip.
NASA Astrophysics Data System (ADS)
Oh, Gyong Jin; Kim, Lyang-June; Sheen, Sue-Ho; Koo, Gyou-Phyo; Jin, Sang-Hun; Yeo, Bo-Yeon; Lee, Jong-Ho
2009-05-01
This paper presents a real time implementation of Non Uniformity Correction (NUC). Two point correction and one point correction with shutter were carried out in an uncooled imaging system which will be applied to a missile application. To design a small, light weight and high speed imaging system for a missile system, SoPC (System On a Programmable Chip) which comprises of FPGA and soft core (Micro-blaze) was used. Real time NUC and generation of control signals are implemented using FPGA. Also, three different NUC tables were made to make the operating time shorter and to reduce the power consumption in a large range of environment temperature. The imaging system consists of optics and four electronics boards which are detector interface board, Analog to Digital converter board, Detector signal generation board and Power supply board. To evaluate the imaging system, NETD was measured. The NETD was less than 160mK in three different environment temperatures.
Wireless visual sensor network resource allocation using cross-layer optimization
NASA Astrophysics Data System (ADS)
Bentley, Elizabeth S.; Matyjas, John D.; Medley, Michael J.; Kondi, Lisimachos P.
2009-01-01
In this paper, we propose an approach to manage network resources for a Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network where nodes monitor scenes with varying levels of motion. It uses cross-layer optimization across the physical layer, the link layer and the application layer. Our technique simultaneously assigns a source coding rate, a channel coding rate, and a power level to all nodes in the network based on one of two criteria that maximize the quality of video of the entire network as a whole, subject to a constraint on the total chip rate. One criterion results in the minimal average end-to-end distortion amongst all nodes, while the other criterion minimizes the maximum distortion of the network. Our approach allows one to determine the capacity of the visual sensor network based on the number of nodes and the quality of video that must be transmitted. For bandwidth-limited applications, one can also determine the minimum bandwidth needed to accommodate a number of nodes with a specific target chip rate. Video captured by a sensor node camera is encoded and decoded using the H.264 video codec by a centralized control unit at the network layer. To reduce the computational complexity of the solution, Universal Rate-Distortion Characteristics (URDCs) are obtained experimentally to relate bit error probabilities to the distortion of corrupted video. Bit error rates are found first by using Viterbi's upper bounds on the bit error probability and second, by simulating nodes transmitting data spread by Total Square Correlation (TSC) codes over a Rayleigh-faded DS-CDMA channel and receiving that data using Auxiliary Vector (AV) filtering.
NASA Technical Reports Server (NTRS)
2007-01-01
Topics covered include: High-Accuracy, High-Dynamic-Range Phase-Measurement System; Simple, Compact, Safe Impact Tester; Multi-Antenna Radar Systems for Doppler Rain Measurements; 600-GHz Electronically Tunable Vector Measurement System; Modular Architecture for the Measurement of Space Radiation; VLSI Design of a Turbo Decoder; Architecture of an Autonomous Radio Receiver; Improved On-Chip Measurement of Delay in an FPGA or ASIC; Resource Selection and Ranking; Accident/Mishap Investigation System; Simplified Identification of mRNA or DNA in Whole Cells; Printed Multi-Turn Loop Antennas for RF Biotelemetry; Making Ternary Quantum Dots From Single-Source Precursors; Improved Single-Source Precursors for Solar-Cell Absorbers; Spray CVD for Making Solar-Cell Absorber Layers; Glass/BNNT Composite for Sealing Solid Oxide Fuel Cells; A Method of Assembling Compact Coherent Fiber-Optic Bundles; Manufacturing Diamond Under Very High Pressure; Ring-Resonator/Sol-Gel Interferometric Immunosensor; Compact Fuel-Cell System Would Consume Neat Methanol; Algorithm Would Enable Robots to Solve Problems Creatively; Hypothetical Scenario Generator for Fault-Tolerant Diagnosis; Smart Data Node in the Sky; Pseudo-Waypoint Guidance for Proximity Spacecraft Maneuvers; Update on Controlling Herds of Cooperative Robots; and Simulation and Testing of Maneuvering of a Planetary Rover.
SVM classifier on chip for melanoma detection.
Afifi, Shereen; GholamHosseini, Hamid; Sinha, Roopak
2017-07-01
Support Vector Machine (SVM) is a common classifier used for efficient classification with high accuracy. SVM shows high accuracy for classifying melanoma (skin cancer) clinical images within computer-aided diagnosis systems used by skin cancer specialists to detect melanoma early and save lives. We aim to develop a medical low-cost handheld device that runs a real-time embedded SVM-based diagnosis system for use in primary care for early detection of melanoma. In this paper, an optimized SVM classifier is implemented onto a recent FPGA platform using the latest design methodology to be embedded into the proposed device for realizing online efficient melanoma detection on a single system on chip/device. The hardware implementation results demonstrate a high classification accuracy of 97.9% and a significant acceleration factor of 26 from equivalent software implementation on an embedded processor, with 34% of resources utilization and 2 watts for power consumption. Consequently, the implemented system meets crucial embedded systems constraints of high performance and low cost, resources utilization and power consumption, while achieving high classification accuracy.
Space division multiplexing chip-to-chip quantum key distribution.
Bacco, Davide; Ding, Yunhong; Dalgaard, Kjeld; Rottwitt, Karsten; Oxenløwe, Leif Katsuo
2017-09-29
Quantum cryptography is set to become a key technology for future secure communications. However, to get maximum benefit in communication networks, transmission links will need to be shared among several quantum keys for several independent users. Such links will enable switching in quantum network nodes of the quantum keys to their respective destinations. In this paper we present an experimental demonstration of a photonic integrated silicon chip quantum key distribution protocols based on space division multiplexing (SDM), through multicore fiber technology. Parallel and independent quantum keys are obtained, which are useful in crypto-systems and future quantum network.
Reconfigurable signal processor designs for advanced digital array radar systems
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan (Rockee); Yu, Xining
2017-05-01
The new challenges originated from Digital Array Radar (DAR) demands a new generation of reconfigurable backend processor in the system. The new FPGA devices can support much higher speed, more bandwidth and processing capabilities for the need of digital Line Replaceable Unit (LRU). This study focuses on using the latest Altera and Xilinx devices in an adaptive beamforming processor. The field reprogrammable RF devices from Analog Devices are used as analog front end transceivers. Different from other existing Software-Defined Radio transceivers on the market, this processor is designed for distributed adaptive beamforming in a networked environment. The following aspects of the novel radar processor will be presented: (1) A new system-on-chip architecture based on Altera's devices and adaptive processing module, especially for the adaptive beamforming and pulse compression, will be introduced, (2) Successful implementation of generation 2 serial RapidIO data links on FPGA, which supports VITA-49 radio packet format for large distributed DAR processing. (3) Demonstration of the feasibility and capabilities of the processor in a Micro-TCA based, SRIO switching backplane to support multichannel beamforming in real-time. (4) Application of this processor in ongoing radar system development projects, including OU's dual-polarized digital array radar, the planned new cylindrical array radars, and future airborne radars.
Sparse matrix-vector multiplication on network-on-chip
NASA Astrophysics Data System (ADS)
Sun, C.-C.; Götze, J.; Jheng, H.-Y.; Ruan, S.-J.
2010-12-01
In this paper, we present an idea for performing matrix-vector multiplication by using Network-on-Chip (NoC) architecture. In traditional IC design on-chip communications have been designed with dedicated point-to-point interconnections. Therefore, regular local data transfer is the major concept of many parallel implementations. However, when dealing with the parallel implementation of sparse matrix-vector multiplication (SMVM), which is the main step of all iterative algorithms for solving systems of linear equation, the required data transfers depend on the sparsity structure of the matrix and can be extremely irregular. Using the NoC architecture makes it possible to deal with arbitrary structure of the data transfers; i.e. with the irregular structure of the sparse matrices. So far, we have already implemented the proposed SMVM-NoC architecture with the size 4×4 and 5×5 in IEEE 754 single float point precision using FPGA.
Multiport Circular Polarized RFID-Tag Antenna for UHF Sensor Applications.
Zaid, Jamal; Abdulhadi, Abdulhadi; Kesavan, Arun; Belaizi, Yassin; Denidni, Tayeb A
2017-07-05
A circular polarized patch antenna for UHF RFID tag-based sensor applications is presented, with the circular polarization (CP) generated by a new antenna shape, an asymmetric stars shaped slotted microstrip patch antenna (CP-ASSSMP). Four stars etched on the patch allow the antenna's size to be reduced by close to 20%. The proposed antenna is matched with two RFID chips via inductive-loop matching. The first chip is connected to a resistive sensor and acts as a sensor node, and the second is used as a reference node. The proposed antenna is used for two targets, serving as both reference and sensor simultaneously, thereby eliminating the need for a second antenna. Its reader can read the RFID chips at any orientation of the tag due to the CP. The measured reading range is about 25 m with mismatch polarization. The operating frequency band is 902-929 MHz for the two ports, which is covered by the US RFID band, and the axial-ratio bandwidth is about 7 MHz. In addition, the reader can also detect temperature, based on the minimum difference in the power required by the reference and sensor.
Multiport Circular Polarized RFID-Tag Antenna for UHF Sensor Applications
Zaid, Jamal; Abdulhadi, Abdulhadi; Kesavan, Arun; Belaizi, Yassin; Denidni, Tayeb A.
2017-01-01
A circular polarized patch antenna for UHF RFID tag-based sensor applications is presented, with the circular polarization (CP) generated by a new antenna shape, an asymmetric stars shaped slotted microstrip patch antenna (CP-ASSSMP). Four stars etched on the patch allow the antenna’s size to be reduced by close to 20%. The proposed antenna is matched with two RFID chips via inductive-loop matching. The first chip is connected to a resistive sensor and acts as a sensor node, and the second is used as a reference node. The proposed antenna is used for two targets, serving as both reference and sensor simultaneously, thereby eliminating the need for a second antenna. Its reader can read the RFID chips at any orientation of the tag due to the CP. The measured reading range is about 25 m with mismatch polarization. The operating frequency band is 902–929 MHz for the two ports, which is covered by the US RFID band, and the axial-ratio bandwidth is about 7 MHz. In addition, the reader can also detect temperature, based on the minimum difference in the power required by the reference and sensor. PMID:28678178
A wide-range programmable frequency synthesizer based on a finite state machine filter
NASA Astrophysics Data System (ADS)
Alser, Mohammed H.; Assaad, Maher M.; Hussin, Fawnizu A.
2013-11-01
In this article, an FPGA-based design and implementation of a fully digital wide-range programmable frequency synthesizer based on a finite state machine filter is presented. The advantages of the proposed architecture are that, it simultaneously generates a high frequency signal from a low frequency reference signal (i.e. synthesising), and synchronising the two signals (signals have the same phase, or a constant difference) without jitter accumulation issue. The architecture is portable and can be easily implemented for various platforms, such as FPGAs and integrated circuits. The frequency synthesizer circuit can be used as a part of SERDES devices in intra/inter chip communication in system-on-chip (SoC). The proposed circuit is designed using Verilog language and synthesized for the Altera DE2-70 development board, with the Cyclone II (EP2C35F672C6) device on board. Simulation and experimental results are included; they prove the synthesizing and tracking features of the proposed architecture. The generated clock signal frequency of a range from 19.8 MHz to 440 MHz is synchronized to the input reference clock with a frequency step of 0.12 MHz.
Embedded Streaming Deep Neural Networks Accelerator With Applications.
Dundar, Aysegul; Jin, Jonghoon; Martini, Berin; Culurciello, Eugenio
2017-07-01
Deep convolutional neural networks (DCNNs) have become a very powerful tool in visual perception. DCNNs have applications in autonomous robots, security systems, mobile phones, and automobiles, where high throughput of the feedforward evaluation phase and power efficiency are important. Because of this increased usage, many field-programmable gate array (FPGA)-based accelerators have been proposed. In this paper, we present an optimized streaming method for DCNNs' hardware accelerator on an embedded platform. The streaming method acts as a compiler, transforming a high-level representation of DCNNs into operation codes to execute applications in a hardware accelerator. The proposed method utilizes maximum computational resources available based on a novel-scheduled routing topology that combines data reuse and data concatenation. It is tested with a hardware accelerator implemented on the Xilinx Kintex-7 XC7K325T FPGA. The system fully explores weight-level and node-level parallelizations of DCNNs and achieves a peak performance of 247 G-ops while consuming less than 4 W of power. We test our system with applications on object classification and object detection in real-world scenarios. Our results indicate high-performance efficiency, outperforming all other presented platforms while running these applications.
High-speed line-scan camera with digital time delay integration
NASA Astrophysics Data System (ADS)
Bodenstorfer, Ernst; Fürtler, Johannes; Brodersen, Jörg; Mayer, Konrad J.; Eckel, Christian; Gravogl, Klaus; Nachtnebel, Herbert
2007-02-01
Dealing with high-speed image acquisition and processing systems, the speed of operation is often limited by the amount of available light, due to short exposure times. Therefore, high-speed applications often use line-scan cameras, based on charge-coupled device (CCD) sensors with time delayed integration (TDI). Synchronous shift and accumulation of photoelectric charges on the CCD chip - according to the objects' movement - result in a longer effective exposure time without introducing additional motion blur. This paper presents a high-speed color line-scan camera based on a commercial complementary metal oxide semiconductor (CMOS) area image sensor with a Bayer filter matrix and a field programmable gate array (FPGA). The camera implements a digital equivalent to the TDI effect exploited with CCD cameras. The proposed design benefits from the high frame rates of CMOS sensors and from the possibility of arbitrarily addressing the rows of the sensor's pixel array. For the digital TDI just a small number of rows are read out from the area sensor which are then shifted and accumulated according to the movement of the inspected objects. This paper gives a detailed description of the digital TDI algorithm implemented on the FPGA. Relevant aspects for the practical application are discussed and key features of the camera are listed.
Random number generators for large-scale parallel Monte Carlo simulations on FPGA
NASA Astrophysics Data System (ADS)
Lin, Y.; Wang, F.; Liu, B.
2018-05-01
Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
Implementation of LSCMA adaptive array terminal for mobile satellite communications
NASA Astrophysics Data System (ADS)
Zhou, Shun; Wang, Huali; Xu, Zhijun
2007-11-01
This paper considers the application of adaptive array antenna based on the least squares constant modulus algorithm (LSCMA) for interference rejection in mobile SATCOM terminals. A two-element adaptive array scheme is implemented with a combination of ADI TS201S DSP chips and Altera Stratix II FPGA device, which makes a cooperating computation for adaptive beamforming. Its interference suppressing performance is verified via Matlab simulations. Digital hardware system is implemented to execute the operations of LSCMA beamforming algorithm that is represented by an algorithm flowchart. The result of simulations and test indicate that this scheme can improve the anti-jamming performance of terminals.
NASA Astrophysics Data System (ADS)
Szadkowski, Zbigniew; Fraenkel, E. D.; van den Berg, Ad M.
2013-10-01
We present the FPGA/NIOS implementation of an adaptive finite impulse response (FIR) filter based on linear prediction to suppress radio frequency interference (RFI). This technique will be used for experiments that observe coherent radio emission from extensive air showers induced by ultra-high-energy cosmic rays. These experiments are designed to make a detailed study of the development of the electromagnetic part of air showers. Therefore, these radio signals provide information that is complementary to that obtained by water-Cherenkov detectors which are predominantly sensitive to the particle content of an air shower at ground. The radio signals from air showers are caused by the coherent emission due to geomagnetic and charge-excess processes. These emissions can be observed in the frequency band between 10-100 MHz. However, this frequency range is significantly contaminated by narrow-band RFI and other human-made distortions. A FIR filter implemented in the FPGA logic segment of the front-end electronics of a radio sensor significantly improves the signal-to-noise ratio. In this paper we discuss an adaptive filter which is based on linear prediction. The coefficients for the linear predictor (LP) are dynamically refreshed and calculated in the embedded NIOS processor, which is implemented in the same FPGA chip. The Levinson recursion, used to obtain the filter coefficients, is also implemented in the NIOS and is partially supported by direct multiplication in the DSP blocks of the logic FPGA segment. Tests confirm that the LP can be an alternative to other methods involving multiple time-to-frequency domain conversions using an FFT procedure. These multiple conversions draw heavily on the power consumption of the FPGA and are avoided by the linear prediction approach. Minimization of the power consumption is an important issue because the final system will be powered by solar panels. The FIR filter has been successfully tested in the Altera development kits with the EP4CE115F29C7 from the Cyclone IV family and the EP3C120F780C7 from the Cyclone III family at a 170 MHz sampling rate, a 12-bit I/O resolution, and an internal 30-bit dynamic range. Most of the slow floating-point NIOS calculations have been moved to the FPGA logic segments as extended fixed-point operations, which significantly reduced the refreshing time of the coefficients used in the LP. We conclude that the LP is a viable alternative to other methods such as non-adaptive methods involving digital notch filters or multiple time-to-frequency domain conversions using an FFT procedure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, M.
Configuration and calibration of the front-end electronics typical of many silicon detector configurations were investigated in a lab activity based on a pair of strip sensors interfaced with FSSR2 read-out chips and an FPGA. This simple hardware configuration, originally developed for a telescope at the Fermilab Test Beam Facility, was used to measure thresholds and noise on individual readout channels and to study the influence that different configurations of the front-end electronics had on the observed levels of noise in the system. An understanding of the calibration and operation of this small detector system provided an opportunity to explore themore » architecture of larger systems such as those currently in use at LHC experiments.« less
Ultrascalable petaflop parallel supercomputer
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY
2010-07-20
A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
Design of the micro pressure multi-node measuring system for micro-fluidic chip
NASA Astrophysics Data System (ADS)
Mu, Lili; Guo, Shuheng; Rong, Li; Yin, Ke
2016-01-01
An online multi-node microfludic pressure measuring system was designed in the paper. The research focused on the design of pressure test circuit system and methods on dealing with pressure data collecting. The MPXV7002 micro-pressure sensor was selected to measure the chip inside channel pressure and installed by a silicone tube on different micro-channel measured nodes. The pressure transmission loss was estimated in the paper, and corrected by the filtering and smoothing method. The pressure test experiment was carried out and the data were analyzed. Finally, the measuring system was calibrated. The results showed that the measuring system had high testing precision.
Nanophotonic rare-earth quantum memory with optically controlled retrieval.
Zhong, Tian; Kindem, Jonathan M; Bartholomew, John G; Rochman, Jake; Craiciu, Ioana; Miyazono, Evan; Bettinelli, Marco; Cavalli, Enrico; Verma, Varun; Nam, Sae Woo; Marsili, Francesco; Shaw, Matthew D; Beyer, Andrew D; Faraon, Andrei
2017-09-29
Optical quantum memories are essential elements in quantum networks for long-distance distribution of quantum entanglement. Scalable development of quantum network nodes requires on-chip qubit storage functionality with control of the readout time. We demonstrate a high-fidelity nanophotonic quantum memory based on a mesoscopic neodymium ensemble coupled to a photonic crystal cavity. The nanocavity enables >95% spin polarization for efficient initialization of the atomic frequency comb memory and time bin-selective readout through an enhanced optical Stark shift of the comb frequencies. Our solid-state memory is integrable with other chip-scale photon source and detector devices for multiplexed quantum and classical information processing at the network nodes. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.
Flight Qualified Micro Sun Sensor
NASA Technical Reports Server (NTRS)
Liebe, Carl Christian; Mobasser, Sohrab; Wrigley, Chris; Schroeder, Jeffrey; Bae, Youngsam; Naegle, James; Katanyoutanant, Sunant; Jerebets, Sergei; Schatzel, Donald; Lee, Choonsup
2007-01-01
A prototype small, lightweight micro Sun sensor (MSS) has been flight qualified as part of the attitude-determination system of a spacecraft or for Mars surface operations. The MSS has previously been reported at a very early stage of development in NASA Tech Briefs, Vol. 28, No. 1 (January 2004). An MSS is essentially a miniature multiple-pinhole electronic camera combined with digital processing electronics that functions analogously to a sundial. A micromachined mask containing a number of microscopic pinholes is mounted in front of an active-pixel sensor (APS). Electronic circuits for controlling the operation of the APS, readout from the pixel photodetectors, and analog-to-digital conversion are all integrated onto the same chip along with the APS. The digital processing includes computation of the centroids of the pinhole Sun images on the APS. The spacecraft computer has the task of converting the Sun centroids into Sun angles utilizing a calibration polynomial. The micromachined mask comprises a 500-micron-thick silicon wafer, onto which is deposited a 57-nm-thick chromium adhesion- promotion layer followed by a 200-nm-thick gold light-absorption layer. The pinholes, 50 microns in diameter, are formed in the gold layer by photolithography. The chromium layer is thin enough to be penetrable by an amount of Sunlight adequate to form measurable pinhole images. A spacer frame between the mask and the APS maintains a gap of .1 mm between the pinhole plane and the photodetector plane of the APS. To minimize data volume, mass, and power consumption, the digital processing of the APS readouts takes place in a single field-programmable gate array (FPGA). The particular FPGA is a radiation- tolerant unit that contains .32,000 gates. No external memory is used so the FPGA calculates the centroids in real time as pixels are read off the APS with minimal internal memory. To enable the MSS to fit into a small package, the APS, the FPGA, and other components are mounted on a single two-sided board following chip-on-board design practices
NASA Astrophysics Data System (ADS)
Cieszewski, Radoslaw; Linczuk, Maciej
2016-09-01
The development of FPGA technology and the increasing complexity of applications in recent decades have forced compilers to move to higher abstraction levels. Compilers interprets an algorithmic description of a desired behavior written in High-Level Languages (HLLs) and translate it to Hardware Description Languages (HDLs). This paper presents a RPython based High-Level synthesis (HLS) compiler. The compiler get the configuration parameters and map RPython program to VHDL. Then, VHDL code can be used to program FPGA chips. In comparison of other technologies usage, FPGAs have the potential to achieve far greater performance than software as a result of omitting the fetch-decode-execute operations of General Purpose Processors (GPUs), and introduce more parallel computation. This can be exploited by utilizing many resources at the same time. Creating parallel algorithms computed with FPGAs in pure HDL is difficult and time consuming. Implementation time can be greatly reduced with High-Level Synthesis compiler. This article describes design methodologies and tools, implementation and first results of created VHDL backend for RPython compiler.
Event-driven processing for hardware-efficient neural spike sorting
NASA Astrophysics Data System (ADS)
Liu, Yan; Pereira, João L.; Constandinou, Timothy G.
2018-02-01
Objective. The prospect of real-time and on-node spike sorting provides a genuine opportunity to push the envelope of large-scale integrated neural recording systems. In such systems the hardware resources, power requirements and data bandwidth increase linearly with channel count. Event-based (or data-driven) processing can provide here a new efficient means for hardware implementation that is completely activity dependant. In this work, we investigate using continuous-time level-crossing sampling for efficient data representation and subsequent spike processing. Approach. (1) We first compare signals (synthetic neural datasets) encoded with this technique against conventional sampling. (2) We then show how such a representation can be directly exploited by extracting simple time domain features from the bitstream to perform neural spike sorting. (3) The proposed method is implemented in a low power FPGA platform to demonstrate its hardware viability. Main results. It is observed that considerably lower data rates are achievable when using 7 bits or less to represent the signals, whilst maintaining the signal fidelity. Results obtained using both MATLAB and reconfigurable logic hardware (FPGA) indicate that feature extraction and spike sorting accuracies can be achieved with comparable or better accuracy than reference methods whilst also requiring relatively low hardware resources. Significance. By effectively exploiting continuous-time data representation, neural signal processing can be achieved in a completely event-driven manner, reducing both the required resources (memory, complexity) and computations (operations). This will see future large-scale neural systems integrating on-node processing in real-time hardware.
Interface Supports Lightweight Subsystem Routing for Flight Applications
NASA Technical Reports Server (NTRS)
Lux, James P.; Block, Gary L.; Ahmad, Mohammad; Whitaker, William D.; Dillon, James W.
2010-01-01
A wireless avionics interface exploits the constrained nature of data networks in flight systems to use a lightweight routing method. This simplified routing means that a processor is not required, and the logic can be implemented as an intellectual property (IP) core in a field-programmable gate array (FPGA). The FPGA can be shared with the flight subsystem application. In addition, the router is aware of redundant subsystems, and can be configured to provide hot standby support as part of the interface. This simplifies implementation of flight applications requiring hot stand - by support. When a valid inbound packet is received from the network, the destination node address is inspected to determine whether the packet is to be processed by this node. Each node has routing tables for the next neighbor node to guide the packet to the destination node. If it is to be processed, the final packet destination is inspected to determine whether the packet is to be forwarded to another node, or routed locally. If the packet is local, it is sent to an Applications Data Interface (ADI), which is attached to a local flight application. Under this scheme, an interface can support many applications in a subsystem supporting a high level of subsystem integration. If the packet is to be forwarded to another node, it is sent to the outbound packet router. The outbound packet router receives packets from an ADI or a packet to be forwarded. It then uses a lookup table to determine the next destination for the packet. Upon detecting a remote subsystem failure, the routing table can be updated to autonomously bypass the failed subsystem.
CBM First-level Event Selector Input Interface Demonstrator
NASA Astrophysics Data System (ADS)
Hutter, Dirk; de Cuveland, Jan; Lindenstruth, Volker
2017-10-01
CBM is a heavy-ion experiment at the future FAIR facility in Darmstadt, Germany. Featuring self-triggered front-end electronics and free-streaming read-out, event selection will exclusively be done by the First Level Event Selector (FLES). Designed as an HPC cluster with several hundred nodes its task is an online analysis and selection of the physics data at a total input data rate exceeding 1 TByte/s. To allow efficient event selection, the FLES performs timeslice building, which combines the data from all given input links to self-contained, potentially overlapping processing intervals and distributes them to compute nodes. Partitioning the input data streams into specialized containers allows performing this task very efficiently. The FLES Input Interface defines the linkage between the FEE and the FLES data transport framework. A custom FPGA PCIe board, the FLES Interface Board (FLIB), is used to receive data via optical links and transfer them via DMA to the host’s memory. The current prototype of the FLIB features a Kintex-7 FPGA and provides up to eight 10 GBit/s optical links. A custom FPGA design has been developed for this board. DMA transfers and data structures are optimized for subsequent timeslice building. Index tables generated by the FPGA enable fast random access to the written data containers. In addition the DMA target buffers can directly serve as InfiniBand RDMA source buffers without copying the data. The usage of POSIX shared memory for these buffers allows data access from multiple processes. An accompanying HDL module has been developed to integrate the FLES link into the front-end FPGA designs. It implements the front-end logic interface as well as the link protocol. Prototypes of all Input Interface components have been implemented and integrated into the FLES test framework. This allows the implementation and evaluation of the foreseen CBM read-out chain.
Rapid prototyping of update algorithm of discrete Fourier transform for real-time signal processing
NASA Astrophysics Data System (ADS)
Kakad, Yogendra P.; Sherlock, Barry G.; Chatapuram, Krishnan V.; Bishop, Stephen
2001-10-01
An algorithm is developed in the companion paper, to update the existing DFT to represent the new data series that results when a new signal point is received. Updating the DFT in this way uses less computation than directly evaluating the DFT using the FFT algorithm, This reduces the computational order by a factor of log2 N. The algorithm is able to work in the presence of data window function, for use with rectangular window, the split triangular, Hanning, Hamming, and Blackman windows. In this paper, a hardware implementation of this algorithm, using FPGA technology, is outlined. Unlike traditional fully customized VLSI circuits, FPGAs represent a technical break through in the corresponding industry. The FPGA implements thousands of gates of logic in a single IC chip and it can be programmed by users at their site in a few seconds or less depending on the type of device used. The risk is low and the development time is short. The advantages have made FPGAs very popular for rapid prototyping of algorithms in the area of digital communication, digital signal processing, and image processing. Our paper addresses the related issues of implementation using hardware descriptive language in the development of the design and the subsequent downloading on the programmable hardware chip.
A Two-Stage Reconstruction Processor for Human Detection in Compressive Sensing CMOS Radar
Tsao, Kuei-Chi; Lee, Ling; Chu, Ta-Shun
2018-01-01
Complementary metal-oxide-semiconductor (CMOS) radar has recently gained much research attraction because small and low-power CMOS devices are very suitable for deploying sensing nodes in a low-power wireless sensing system. This study focuses on the signal processing of a wireless CMOS impulse radar system that can detect humans and objects in the home-care internet-of-things sensing system. The challenges of low-power CMOS radar systems are the weakness of human signals and the high computational complexity of the target detection algorithm. The compressive sensing-based detection algorithm can relax the computational costs by avoiding the utilization of matched filters and reducing the analog-to-digital converter bandwidth requirement. The orthogonal matching pursuit (OMP) is one of the popular signal reconstruction algorithms for compressive sensing radar; however, the complexity is still very high because the high resolution of human respiration leads to high-dimension signal reconstruction. Thus, this paper proposes a two-stage reconstruction algorithm for compressive sensing radar. The proposed algorithm not only has lower complexity than the OMP algorithm by 75% but also achieves better positioning performance than the OMP algorithm especially in noisy environments. This study also designed and implemented the algorithm by using Vertex-7 FPGA chip (Xilinx, San Jose, CA, USA). The proposed reconstruction processor can support the 256×13 real-time radar image display with a throughput of 28.2 frames per second. PMID:29621170
An IO block array in a radiation-hardened SOI SRAM-based FPGA
NASA Astrophysics Data System (ADS)
Yan, Zhao; Lihua, Wu; Xiaowei, Han; Yan, Li; Qianli, Zhang; Liang, Chen; Guoquan, Zhang; Jianzhong, Li; Bo, Yang; Jiantou, Gao; Jian, Wang; Ming, Li; Guizhai, Liu; Feng, Zhang; Xufeng, Guo; Kai, Zhao; Chen, Stanley L.; Fang, Yu; Zhongli, Liu
2012-01-01
We present an input/output block (IOB) array used in the radiation-hardened SRAM-based field-programmable gate array (FPGA) VS1000, which is designed and fabricated with a 0.5 μm partially depleted silicon-on-insulator (SOI) logic process at the CETC 58th Institute. Corresponding with the characteristics of the FPGA, each IOB includes a local routing pool and two IO cells composed of a signal path circuit, configurable input/output buffers and an ESD protection network. A boundary-scan path circuit can be used between the programmable buffers and the input/output circuit or as a transparent circuit when the IOB is applied in different modes. Programmable IO buffers can be used at TTL/CMOS standard levels. The local routing pool enhances the flexibility and routability of the connection between the IOB array and the core logic. Radiation-hardened designs, including A-type and H-type body-tied transistors and special D-type registers, improve the anti-radiation performance. The ESD protection network, which provides a high-impulse discharge path on a pad, prevents the breakdown of the core logic caused by the immense current. These design strategies facilitate the design of FPGAs with different capacities or architectures to form a series of FPGAs. The functionality and performance of the IOB array is proved after a functional test. The radiation test indicates that the proposed VS1000 chip with an IOB array has a total dose tolerance of 100 krad(Si), a dose survivability rate of 1.5 × 1011 rad(Si)/s, and a neutron fluence immunity of 1 × 1014 n/cm2.
Broad-Bandwidth FPGA-Based Digital Polyphase Spectrometer
NASA Technical Reports Server (NTRS)
Jamot, Robert F.; Monroe, Ryan M.
2012-01-01
With present concern for ecological sustainability ever increasing, it is desirable to model the composition of Earth s upper atmosphere accurately with regards to certain helpful and harmful chemicals, such as greenhouse gases and ozone. The microwave limb sounder (MLS) is an instrument designed to map the global day-to-day concentrations of key atmospheric constituents continuously. One important component in MLS is the spectrometer, which processes the raw data provided by the receivers into frequency-domain information that cannot only be transmitted more efficiently, but also processed directly once received. The present-generation spectrometer is fully analog. The goal is to include a fully digital spectrometer in the next-generation sensor. In a digital spectrometer, incoming analog data must be converted into a digital format, processed through a Fourier transform, and finally accumulated to reduce the impact of input noise. While the final design will be placed on an application specific integrated circuit (ASIC), the building of these chips is prohibitively expensive. To that end, this design was constructed on a field-programmable gate array (FPGA). A family of state-of-the-art digital Fourier transform spectrometers has been developed, with a combination of high bandwidth and fine resolution. Analog signals consisting of radiation emitted by constituents in planetary atmospheres or galactic sources are downconverted and subsequently digitized by a pair of interleaved analog-to-digital converters (ADCs). This 6-Gsps (gigasample per second) digital representation of the analog signal is then processed through an FPGA-based streaming fast Fourier transform (FFT). Digital spectrometers have many advantages over previously used analog spectrometers, especially in terms of accuracy and resolution, both of which are particularly important for the type of scientific questions to be addressed with next-generation radiometers.
FPGA-accelerated algorithm for the regular expression matching system
NASA Astrophysics Data System (ADS)
Russek, P.; Wiatr, K.
2015-01-01
This article describes an algorithm to support a regular expressions matching system. The goal was to achieve an attractive performance system with low energy consumption. The basic idea of the algorithm comes from a concept of the Bloom filter. It starts from the extraction of static sub-strings for strings of regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and the central processing unit (CPU). The pipelined custom processor architecture is proposed and a software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in field programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example of target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
NASA Astrophysics Data System (ADS)
Bosse, Stefan
2013-05-01
Sensorial materials consisting of high-density, miniaturized, and embedded sensor networks require new robust and reliable data processing and communication approaches. Structural health monitoring is one major field of application for sensorial materials. Each sensor node provides some kind of sensor, electronics, data processing, and communication with a strong focus on microchip-level implementation to meet the goals of miniaturization and low-power energy environments, a prerequisite for autonomous behaviour and operation. Reliability requires robustness of the entire system in the presence of node, link, data processing, and communication failures. Interaction between nodes is required to manage and distribute information. One common interaction model is the mobile agent. An agent approach provides stronger autonomy than a traditional object or remote-procedure-call based approach. Agents can decide for themselves, which actions are performed, and they are capable of flexible behaviour, reacting on the environment and other agents, providing some degree of robustness. Traditionally multi-agent systems are abstract programming models which are implemented in software and executed on program controlled computer architectures. This approach does not well scale to micro-chip level and requires full equipped computers and communication structures, and the hardware architecture does not consider and reflect the requirements for agent processing and interaction. We propose and demonstrate a novel design paradigm for reliable distributed data processing systems and a synthesis methodology and framework for multi-agent systems implementable entirely on microchip-level with resource and power constrained digital logic supporting Agent-On-Chip architectures (AoC). The agent behaviour and mobility is fully integrated on the micro-chip using pipelined communicating processes implemented with finite-state machines and register-transfer logic. The agent behaviour, interaction (communication), and mobility features are modelled and specified on a machine-independent abstract programming level using a state-based agent behaviour language (APL). With this APL a high-level agent compiler is able to synthesize a hardware model (RTL, VHDL), a software model (C, ML), or a simulation model (XML) suitable to simulate a multi-agent system using the SeSAm simulator framework. Agent communication is provided by a simple tuple-space database implemented on node level providing fault tolerant access of global data. A novel synthesis development kit (SynDK) based on a graph-structured database approach is introduced to support the rapid development of compilers and synthesis tools, used for example for the design and implementation of the APL compiler.
A natural-color mapping for single-band night-time image based on FPGA
NASA Astrophysics Data System (ADS)
Wang, Yilun; Qian, Yunsheng
2018-01-01
A natural-color mapping for single-band night-time image method based on FPGA can transmit the color of the reference image to single-band night-time image, which is consistent with human visual habits and can help observers identify the target. This paper introduces the processing of the natural-color mapping algorithm based on FPGA. Firstly, the image can be transformed based on histogram equalization, and the intensity features and standard deviation features of reference image are stored in SRAM. Then, the real-time digital images' intensity features and standard deviation features are calculated by FPGA. At last, FPGA completes the color mapping through matching pixels between images using the features in luminance channel.
A system for characterization of DEPFET silicon pixel matrices and test beam results
NASA Astrophysics Data System (ADS)
Furletov, Sergey; DEPFET Collaboration
2011-02-01
The DEPFET pixel detector offers first stage in-pixel amplification by incorporating a field effect transistor in the high resistivity silicon substrate. In this concept, a very small input capacitance can be realized thus allowing for low noise measurements. This makes DEPFET sensors a favorable technology for tracking in particle physics. Therefore a system with a DEPFET pixel matrix was developed to test DEPFET performance for an application as a vertex detector for the Belle II experiment. The system features a current based, row-wise readout of a DEPFET pixel matrix with a designated readout chip, steering chips for matrix control, a FPGA based data acquisition board, and a dedicated software package. The system was successfully operated in both test beam and lab environment. In 2009 new DEPFET matrices have been characterized in a 120 GeV pion beam at the CERN SPS. The current status of the DEPFET system and test beam results are presented.
Implementation of image transmission server system using embedded Linux
NASA Astrophysics Data System (ADS)
Park, Jong-Hyun; Jung, Yeon Sung; Nam, Boo Hee
2005-12-01
In this paper, we performed the implementation of image transmission server system using embedded system that is for the specified object and easy to install and move. Since the embedded system has lower capability than the PC, we have to reduce the quantity of calculation of the baseline JPEG image compression and transmission. We used the Redhat Linux 9.0 OS at the host PC and the target board based on embedded Linux. The image sequences are obtained from the camera attached to the FPGA (Field Programmable Gate Array) board with ALTERA cooperation chip. For effectiveness and avoiding some constraints from the vendor's own, we made the device driver using kernel module.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Szadkowski, Zbigniew
We present the new approach to a filtering of radio frequency interferences (RFI) in the Auger Engineering Radio Array (AERA) which study the electromagnetic part of the Extensive Air Showers. The radio stations can observe radio signals caused by coherent emissions due to geomagnetic radiation and charge excess processes. AERA observes frequency band from 30 to 80 MHz. This range is highly contaminated by human-made RFI. In order to improve the signal to noise ratio RFI filters are used in AERA to suppress this contamination. The first kind of filter used by AERA was the Median one, based on themore » Fast Fourier Transform (FFT) technique. The second one, which is currently in use, is the infinite impulse response (IIR) notch filter. The proposed new filter is a finite impulse response (FIR) filter based on a linear prediction (LP). A periodic contamination hidden in a registered signal (digitized in the ADC) can be extracted and next subtracted to make signal cleaner. The FIR filter requires a calculation of n=32, 64 or even 128 coefficients (dependent on a required speed or accuracy) by solving of n linear equations with coefficients built from the covariance Toeplitz matrix. This matrix can be solved by the Levinson recursion, which is much faster than the Gauss procedure. The filter has been already tested in the real AERA radio stations on Argentinean pampas with a very successful results. The linear equations were solved either in the virtual soft-core NIOSR processor (implemented in the FPGA chip as a net of logic elements) or in the external Voipac PXA270M ARM processor. The NIOS processor is relatively slow (50 MHz internal clock), calculations performed in an external processor consume a significant amount of time for data exchange between the FPGA and the processor. Test showed a very good efficiency of the RFI suppression for stationary (long-term) contaminations. However, we observed a short-time contaminations, which could not be suppressed either by the IIR-notch filter or by the FIR filter based on the linear predictions. For the LP FIR filter the refreshment time of the filter coefficients was to long and filter did not keep up with the changes of a contamination structure, mainly due to a long calculation time in a slow processors. We propose to use the Cyclone V SE chip with embedded micro-controller operating with 925 MHz internal clock to significantly reduce a refreshment time of the FIR coefficients. The lab results are promising. (authors)« less
A CMOS ASIC Design for SiPM Arrays
Dey, Samrat; Banks, Lushon; Chen, Shaw-Pin; Xu, Wenbin; Lewellen, Thomas K.; Miyaoka, Robert S.; Rudell, Jacques C.
2012-01-01
Our lab has previously reported on novel board-level readout electronics for an 8×8 silicon photomultiplier (SiPM) array featuring row/column summation technique to reduce the hardware requirements for signal processing. We are taking the next step by implementing a monolithic CMOS chip which is based on the row-column architecture. In addition, this paper explores the option of using diagonal summation as well as calibration to compensate for temperature and process variations. Further description of a timing pickoff signal which aligns all of the positioning (spatial channels) pulses in the array is described. The ASIC design is targeted to be scalable with the detector size and flexible to accommodate detectors from different vendors. This paper focuses on circuit implementation issues associated with the design of the ASIC to interface our Phase II MiCES FPGA board with a SiPM array. Moreover, a discussion is provided for strategies to eventually integrate all the analog and mixed-signal electronics with the SiPM, on either a single-silicon substrate or multi-chip module (MCM). PMID:24825923
1 mm3-sized optical neural stimulator based on CMOS integrated photovoltaic power receiver
NASA Astrophysics Data System (ADS)
Tokuda, Takashi; Ishizu, Takaaki; Nattakarn, Wuthibenjaphonchai; Haruta, Makito; Noda, Toshihiko; Sasagawa, Kiyotaka; Sawan, Mohamad; Ohta, Jun
2018-04-01
In this work, we present a simple complementary metal-oxide semiconductor (CMOS)-controlled photovoltaic power-transfer platform that is suitable for very small (less than or equal to 1-2 mm) electronic devices such as implantable health-care devices or distributed nodes for the Internet of Things. We designed a 1.25 mm × 1.25 mm CMOS power receiver chip that contains integrated photovoltaic cells. We characterized the CMOS-integrated power receiver and successfully demonstrated blue light-emitting diode (LED) operation powered by infrared light. Then, we integrated the CMOS chip and a few off-chip components into a 1-mm3 implantable optogenetic stimulator, and demonstrated the operation of the device.
Fast data transmission in dynamic data acquisition system for plasma diagnostics
NASA Astrophysics Data System (ADS)
Byszuk, Adrian; Poźniak, Krzysztof; Zabołotny, Wojciech M.; Kasprowicz, Grzegorz; Wojeński, Andrzej; Cieszewski, Radosław; Juszczyk, Bartłomiej; Kolasiński, Piotr; Zienkiewicz, Paweł; Chernyshova, Maryna; Czarski, Tomasz
2014-11-01
This paper describes architecture of a new data acquisition system (DAQ) targeted mainly at plasma diagnostic experiments. Modular architecture, in combination with selected hardware components, allows for straightforward reconfiguration of the whole system, both offline and online. Main emphasis will be put into the implementation of data transmission subsystem in said system. One of the biggest advantages of described system is modular architecture with well defined boundaries between main components: analog frontend (AFE), digital backplane and acquisition/control software. Usage of a FPGA chips allows for a high flexibility in design of analog frontends, including ADC <--> FPGA interface. Data transmission between backplane boards and user software was accomplished with the use of industry-standard PCI Express (PCIe) technology. PCIe implementation includes both FPGA firmware and Linux device driver. High flexibility of PCIe connections was accomplished due to use of configurable PCIe switch. Whenever it's possible, described DAQ system tries to make use of standard off-the-shelf (OTF) components, including typical x86 CPU & motherboard (acting as PCIe controller) and cabling.
Circuit design of an EMCCD camera
NASA Astrophysics Data System (ADS)
Li, Binhua; Song, Qian; Jin, Jianhui; He, Chun
2012-07-01
EMCCDs have been used in the astronomical observations in many ways. Recently we develop a camera using an EMCCD TX285. The CCD chip is cooled to -100°C in an LN2 dewar. The camera controller consists of a driving board, a control board and a temperature control board. Power supplies and driving clocks of the CCD are provided by the driving board, the timing generator is located in the control board. The timing generator and an embedded Nios II CPU are implemented in an FPGA. Moreover the ADC and the data transfer circuit are also in the control board, and controlled by the FPGA. The data transfer between the image workstation and the camera is done through a Camera Link frame grabber. The software of image acquisition is built using VC++ and Sapera LT. This paper describes the camera structure, the main components and circuit design for video signal processing channel, clock driver, FPGA and Camera Link interfaces, temperature metering and control system. Some testing results are presented.
NASA Astrophysics Data System (ADS)
Kelly, Jamie S.; Bowman, Hiroshi C.; Rao, Vittal S.; Pottinger, Hardy J.
1997-06-01
Implementation issues represent an unfamiliar challenge to most control engineers, and many techniques for controller design ignore these issues outright. Consequently, the design of controllers for smart structural systems usually proceeds without regard for their eventual implementation, thus resulting either in serious performance degradation or in hardware requirements that squander power, complicate integration, and drive up cost. The level of integration assumed by the Smart Patch further exacerbates these difficulties, and any design inefficiency may render the realization of a single-package sensor-controller-actuator system infeasible. The goal of this research is to automate the controller implementation process and to relieve the design engineer of implementation concerns like quantization, computational efficiency, and device selection. We specifically target Field Programmable Gate Arrays (FPGAs) as our hardware platform because these devices are highly flexible, power efficient, and reprogrammable. The current study develops an automated implementation sequence that minimizes hardware requirements while maintaining controller performance. Beginning with a state space representation of the controller, the sequence automatically generates a configuration bitstream for a suitable FPGA implementation. MATLAB functions optimize and simulate the control algorithm before translating it into the VHSIC hardware description language. These functions improve power efficiency and simplify integration in the final implementation by performing a linear transformation that renders the controller computationally friendly. The transformation favors sparse matrices in order to reduce multiply operations and the hardware necessary to support them; simultaneously, the remaining matrix elements take on values that minimize limit cycles and parameter sensitivity. The proposed controller design methodology is implemented on a simple cantilever beam test structure using FPGA hardware. The experimental closed loop response is compared with that of an automated FPGA controller implementation. Finally, we explore the integration of FPGA based controllers into a multi-chip module, which we believe represents the next step towards the realization of the Smart Patch.
Study on VCSEL laser heating chip in nuclear magnetic resonance gyroscope
NASA Astrophysics Data System (ADS)
Liang, Xiaoyang; Zhou, Binquan; Wu, Wenfeng; Jia, Yuchen; Wang, Jing
2017-10-01
In recent years, atomic gyroscope has become an important direction of inertial navigation. Nuclear magnetic resonance gyroscope has a stronger advantage in the miniaturization of the size. In atomic gyroscope, the lasers are indispensable devices which has an important effect on the improvement of the gyroscope performance. The frequency stability of the VCSEL lasers requires high precision control of temperature. However, the heating current of the laser will definitely bring in the magnetic field, and the sensitive device, alkali vapor cell, is very sensitive to the magnetic field, so that the metal pattern of the heating chip should be designed ingeniously to eliminate the magnetic field introduced by the heating current. In this paper, a heating chip was fabricated by MEMS process, i.e. depositing platinum on semiconductor substrates. Platinum has long been considered as a good resistance material used for measuring temperature The VCSEL laser chip is fixed in the center of the heating chip. The thermometer resistor measures the temperature of the heating chip, which can be considered as the same temperature of the VCSEL laser chip, by turning the temperature signal into voltage signal. The FPGA chip is used as a micro controller, and combined with PID control algorithm constitute a closed loop control circuit. The voltage applied to the heating resistor wire is modified to achieve the temperature control of the VCSEL laser. In this way, the laser frequency can be controlled stably and easily. Ultimately, the temperature stability can be achieved better than 100mK.
Design and realization of the real-time spectrograph controller for LAMOST based on FPGA
NASA Astrophysics Data System (ADS)
Wang, Jianing; Wu, Liyan; Zeng, Yizhong; Dai, Songxin; Hu, Zhongwen; Zhu, Yongtian; Wang, Lei; Wu, Zhen; Chen, Yi
2008-08-01
A large Schmitt reflector telescope, Large Sky Area Multi-Object Fiber Spectroscopic Telescope(LAMOST), is being built in China, which has effective aperture of 4 meters and can observe the spectra of as many as 4000 objects simultaneously. To fit such a large amount of observational objects, the dispersion part is composed of a set of 16 multipurpose fiber-fed double-beam Schmidt spectrographs, of which each has about ten of moveable components realtimely accommodated and manipulated by a controller. An industrial Ethernet network connects those 16 spectrograph controllers. The light from stars is fed to the entrance slits of the spectrographs with optical fibers. In this paper, we mainly introduce the design and realization of our real-time controller for the spectrograph, our design using the technique of System On Programmable Chip (SOPC) based on Field Programmable Gate Array (FPGA) and then realizing the control of the spectrographs through NIOSII Soft Core Embedded Processor. We seal the stepper motor controller as intellectual property (IP) cores and reuse it, greatly simplifying the design process and then shortening the development time. Under the embedded operating system μC/OS-II, a multi-tasks control program has been well written to realize the real-time control of the moveable parts of the spectrographs. At present, a number of such controllers have been applied in the spectrograph of LAMOST.
Partnership For Edge Physics Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parashar, Manish
In this effort, we will extend our prior work as part of CPES (i.e., DART and DataSpaces) to support in-situ tight coupling between application codes that exploits data locality and core-level parallelism to maximize on-chip data exchange and reuse. This will be accomplished by mapping coupled simulations so that the data exchanges are more localized within the nodes. Coupled simulation workflows can more effectively utilize the resources available on emerging HEC platforms if they can be mapped and executed to exploit data locality as well as the communication patterns between application components. Scheduling and running such workflows requires an extendedmore » framework that should provide a unified hybrid abstraction to enable coordination and data sharing across computation tasks that run on the heterogeneous multi-core-based systems, and develop a data-locality based dynamic tasks scheduling approach to increase on-chip or intra-node data exchanges and in-situ execution. This effort will extend our prior work as part of CPES (i.e., DART and DataSpaces), which provided a simple virtual shared-space abstraction hosted at the staging nodes, to support application coordination, data sharing and active data processing services. Moreover, it will transparently manage the low-level operations associated with the inter-application data exchange, such as data redistributions, and will enable running coupled simulation workflow on multi-cores computing platforms.« less
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.« less
Efficient full-chip SRAF placement using machine learning for best accuracy and improved consistency
NASA Astrophysics Data System (ADS)
Wang, Shibing; Baron, Stanislas; Kachwala, Nishrin; Kallingal, Chidam; Sun, Dezheng; Shu, Vincent; Fong, Weichun; Li, Zero; Elsaid, Ahmad; Gao, Jin-Wei; Su, Jing; Ser, Jung-Hoon; Zhang, Quan; Chen, Been-Der; Howell, Rafael; Hsu, Stephen; Luo, Larry; Zou, Yi; Zhang, Gary; Lu, Yen-Wen; Cao, Yu
2018-03-01
Various computational approaches from rule-based to model-based methods exist to place Sub-Resolution Assist Features (SRAF) in order to increase process window for lithography. Each method has its advantages and drawbacks, and typically requires the user to make a trade-off between time of development, accuracy, consistency and cycle time. Rule-based methods, used since the 90 nm node, require long development time and struggle to achieve good process window performance for complex patterns. Heuristically driven, their development is often iterative and involves significant engineering time from multiple disciplines (Litho, OPC and DTCO). Model-based approaches have been widely adopted since the 20 nm node. While the development of model-driven placement methods is relatively straightforward, they often become computationally expensive when high accuracy is required. Furthermore these methods tend to yield less consistent SRAFs due to the nature of the approach: they rely on a model which is sensitive to the pattern placement on the native simulation grid, and can be impacted by such related grid dependency effects. Those undesirable effects tend to become stronger when more iterations or complexity are needed in the algorithm to achieve required accuracy. ASML Brion has developed a new SRAF placement technique on the Tachyon platform that is assisted by machine learning and significantly improves the accuracy of full chip SRAF placement while keeping consistency and runtime under control. A Deep Convolutional Neural Network (DCNN) is trained using the target wafer layout and corresponding Continuous Transmission Mask (CTM) images. These CTM images have been fully optimized using the Tachyon inverse mask optimization engine. The neural network generated SRAF guidance map is then used to place SRAF on full-chip. This is different from our existing full-chip MB-SRAF approach which utilizes a SRAF guidance map (SGM) of mask sensitivity to improve the contrast of optical image at the target pattern edges. In this paper, we demonstrate that machine learning assisted SRAF placement can achieve a superior process window compared to the SGM model-based SRAF method, while keeping the full-chip runtime affordable, and maintain consistency of SRAF placement . We describe the current status of this machine learning assisted SRAF technique and demonstrate its application to full chip mask synthesis and discuss how it can extend the computational lithography roadmap.
TrackCC: A Practical Wireless Indoor Localization System Based on Less-Expensive Chips
Li, Xiaolong; Zheng, Yan; Cai, Jun; Yi, Yunfei
2017-01-01
This paper aims at proposing a new wireless indoor localization system (ILS), called TrackCC, based on a commercial type of low-power system-on-chip (SoC), nRF24LE1. This type of chip has only l output power levels and acute fluctuation for a received minimum power level in operation, which give rise to many practical challenges for designing localization algorithms. In order to address these challenges, we exploit the Markov theory to construct a (l+1)×(l+1) -sized state transition matrix to remove the fluctuation, and then propose a priority-based pattern matching algorithm to search for the most similar match in the signal map to estimate the real position of unknown nodes. The experimental results show that, compared to two existing wireless ILSs, LANDMARC and SAIL, which have meter level positioning accuracy, the proposed TrackCC can achieve the decimeter level accuracy on average in both line-of-sight (LOS) and non-line-of-sight (NLOS) senarios. PMID:28617313
Dense, Efficient Chip-to-Chip Communication at the Extremes of Computing
ERIC Educational Resources Information Center
Loh, Matthew
2013-01-01
The scalability of CMOS technology has driven computation into a diverse range of applications across the power consumption, performance and size spectra. Communication is a necessary adjunct to computation, and whether this is to push data from node-to-node in a high-performance computing cluster or from the receiver of wireless link to a neural…
Protograph LDPC Codes with Node Degrees at Least 3
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; Jones, Christopher
2006-01-01
In this paper we present protograph codes with a small number of degree-3 nodes and one high degree node. The iterative decoding threshold for proposed rate 1/2 codes are lower, by about 0.2 dB, than the best known irregular LDPC codes with degree at least 3. The main motivation is to gain linear minimum distance to achieve low error floor. Also to construct rate-compatible protograph-based LDPC codes for fixed block length that simultaneously achieves low iterative decoding threshold and linear minimum distance. We start with a rate 1/2 protograph LDPC code with degree-3 nodes and one high degree node. Higher rate codes are obtained by connecting check nodes with degree-2 non-transmitted nodes. This is equivalent to constraint combining in the protograph. The condition where all constraints are combined corresponds to the highest rate code. This constraint must be connected to nodes of degree at least three for the graph to have linear minimum distance. Thus having node degree at least 3 for rate 1/2 guarantees linear minimum distance property to be preserved for higher rates. Through examples we show that the iterative decoding threshold as low as 0.544 dB can be achieved for small protographs with node degrees at least three. A family of low- to high-rate codes with minimum distance linearly increasing in block size and with capacity-approaching performance thresholds is presented. FPGA simulation results for a few example codes show that the proposed codes perform as predicted.
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in details. The Appendix lists the kernel source code.« less
[Low-power Wireless Micro Ambulatory Electrocardiogram Node].
Cai, Zhipeng; Luo, Kan; Li, Jianqing
2016-02-01
Ambulatory electrocardiogram (ECG) monitoring can effectively reduce the risk and death rate of patients with cardiovascular diseases (CVDs). The Body Sensor Network (BSN) based ECG monitoring is a new and efficien method to protect the CVDs patients. To meet the challenges of miniaturization, low power and high signal quality of the node, we proposed a novel 50 mmX 50 mmX 10 mm, 30 g wireless ECG node, which includes the single-chip an alog front-end AD8232, ultra-low power microprocessor MSP430F1611 and Bluetooth module HM-11. The ECG signal quality is guaranteed by the on-line digital filtering. The difference threshold algorithm results in accuracy of R-wave detection and heart rate. Experiments were carried out to test the node and the results showed that the pro posed node reached the design target, and it has great potential in application of wireless ECG monitoring.
3-D integrated heterogeneous intra-chip free-space optical interconnect.
Ciftcioglu, Berkehan; Berman, Rebecca; Wang, Shang; Hu, Jianyun; Savidis, Ioannis; Jain, Manish; Moore, Duncan; Huang, Michael; Friedman, Eby G; Wicks, Gary; Wu, Hui
2012-02-13
This paper presents the first chip-scale demonstration of an intra-chip free-space optical interconnect (FSOI) we recently proposed. This interconnect system provides point-to-point free-space optical links between any two communication nodes, and hence constructs an all-to-all intra-chip communication fabric, which can be extended for inter-chip communications as well. Unlike electrical and other waveguide-based optical interconnects, FSOI exhibits low latency, high energy efficiency, and large bandwidth density, and hence can significantly improve the performance of future many-core chips. In this paper, we evaluate the performance of the proposed FSOI interconnect, and compare it to a waveguide-based optical interconnect with wavelength division multiplexing (WDM). It shows that the FSOI system can achieve significantly lower loss and higher energy efficiency than the WDM system, even with optimistic assumptions for the latter. A 1×1-cm2 chip prototype is fabricated on a germanium substrate with integrated photodetectors. Commercial 850-nm GaAs vertical-cavity-surface-emitting-lasers (VCSELs) and fabricated fused silica microlenses are 3-D integrated on top of the substrate. At 1.4-cm distance, the measured optical transmission loss is 5 dB, the crosstalk is less than -20 dB, and the electrical-to-electrical bandwidth is 3.3 GHz. The latter is mainly limited by the 5-GHz VCSEL.
Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ceriani, Marco; Palermo, Gianluca; Secchi, Simone
We present a prototype of a multi-core architecture implemented on FPGA, designed to enable efficient execution of irregular applications on distributed shared memory machines, while maintaining high performance on regular workloads. The architecture is composed of off-the-shelf soft-core cores, local interconnection and memory interface, integrated with custom components that optimize it for irregular applications. It relies on three key elements: a global address space, multithreading, and fine-grained synchronization. Global addresses are scrambled to reduce the formation of network hot-spots, while the latency of the transactions is covered by integrating an hardware scheduler within the custom load/store buffers to take advantagemore » from the availability of multiple executions threads, increasing the efficiency in a transparent way to the application. We evaluated a dual node system irregular kernels showing scalability in the number of cores and threads.« less
Radiation effects and mitigation strategies for modern FPGAs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stettler, M. W.; Caffrey, M. P.; Graham, P. S.
2004-01-01
Field Programmable Gate Array devices have become the technology of choice in small volume modern instrumentation and control systems. These devices have always offered significant advantages in flexibility, and recent advances in fabrication have greatly increased logic capacity, substantially increasing the number of applications for this technology. Unfortunately, the increased density (and corresponding shrinkage of process geometry), has made these devices more susceptible to failure due to external radiation. This has been an issue for space based systems for some time, but is now becoming an issue for terrestrial systems in elevated radiation environments and commercial avionics as well. Characterizingmore » the failure modes of Xilinx FPGAs, and developing mitigation strategies is the subject of ongoing research by a consortium of academic, industrial, and governmental laboratories. This paper presents background information of radiation effects and failure modes, as well as current and future mitigation techniques. In particular, the availability of very large FPGA devices, complete with generous amounts of RAM and embedded processor(s), has led to the implementation of complete digital systems on a single device, bringing issues of system reliability and redundancy management to the chip level. Radiation effects on a single FPGA are increasingly likely to have system level consequences, and will need to be addressed in current and future designs.« less
Tethered Forth system for FPGA applications
NASA Astrophysics Data System (ADS)
Goździkowski, Paweł; Zabołotny, Wojciech M.
2013-10-01
This paper presents the tethered Forth system dedicated for testing and debugging of FPGA based electronic systems. Use of the Forth language allows to interactively develop and run complex testing or debugging routines. The solution is based on a small, 16-bit soft core CPU, used to implement the Forth Virtual Machine. Thanks to the use of the tethered Forth model it is possible to minimize usage of the internal RAM memory in the FPGA. The function of the intelligent terminal, which is an essential part of the tethered Forth system, may be fulfilled by the standard PC computer or by the smartphone. System is implemented in Python (the software for intelligent terminal), and in VHDL (the IP core for FPGA), so it can be easily ported to different hardware platforms. The connection between the terminal and FPGA may be established and disconnected many times without disturbing the state of the FPGA based system. The presented system has been verified in the hardware, and may be used as a tool for debugging, testing and even implementing of control algorithms for FPGA based systems.
FPGA based demodulation of laser induced fluorescence in plasmas
NASA Astrophysics Data System (ADS)
Mattingly, Sean W.; Skiff, Fred
2018-04-01
We present a field programmable gate array (FPGA)-based system that counts photons from laser-induced fluorescence (LIF) on a laboratory plasma. This is accomplished with FPGA-based up/down counters that demodulate the data, giving a background-subtracted LIF signal stream that is updated with a new point as each laser amplitude modulation cycle completes. We demonstrate using the FPGA to modulate a laser at 1 MHz and demodulate the resulting LIF data stream. This data stream is used to calculate an LIF-based measurement sampled at 1 MHz of a plasma ion fluctuation spectrum.
Real-time FPGA architectures for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2000-03-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
A FPGA-based Cluster Finder for CMOS Monolithic Active Pixel Sensors of the MIMOSA-26 Family
NASA Astrophysics Data System (ADS)
Li, Qiyan; Amar-Youcef, S.; Doering, D.; Deveaux, M.; Fröhlich, I.; Koziel, M.; Krebs, E.; Linnik, B.; Michel, J.; Milanovic, B.; Müntz, C.; Stroth, J.; Tischler, T.
2014-06-01
CMOS Monolithic Active Pixel Sensors (MAPS) demonstrated excellent performances in the field of charged particle tracking. Among their strong points are an single point resolution few μm, a light material budget of 0.05% X0 in combination with a good radiation tolerance and high rate capability. Those features make the sensors a valuable technology for vertex detectors of various experiments in heavy ion and particle physics. To reduce the load on the event builders and future mass storage systems, we have developed algorithms suited for preprocessing and reducing the data streams generated by the MAPS. This real-time processing employs remaining free resources of the FPGAs of the readout controllers of the detector and complements the on-chip data reduction circuits of the MAPS.
Fault Tolerance in ZigBee Wireless Sensor Networks
NASA Technical Reports Server (NTRS)
Alena, Richard; Gilstrap, Ray; Baldwin, Jarren; Stone, Thom; Wilson, Pete
2011-01-01
Wireless sensor networks (WSN) based on the IEEE 802.15.4 Personal Area Network standard are finding increasing use in the home automation and emerging smart energy markets. The network and application layers, based on the ZigBee 2007 PRO Standard, provide a convenient framework for component-based software that supports customer solutions from multiple vendors. This technology is supported by System-on-a-Chip solutions, resulting in extremely small and low-power nodes. The Wireless Connections in Space Project addresses the aerospace flight domain for both flight-critical and non-critical avionics. WSNs provide the inherent fault tolerance required for aerospace applications utilizing such technology. The team from Ames Research Center has developed techniques for assessing the fault tolerance of ZigBee WSNs challenged by radio frequency (RF) interference or WSN node failure.
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan R.
2015-05-01
New radar applications need to perform complex algorithms and process large quantity of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementation of adaptive pulse compression for real-time transceiver optimization are presented, they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessor as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such matrix multiplication and matrix inversion which are essential units to solve the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through the high performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band tested together with a low-cost channel emulator for different types of waveforms.
Towards pattern generation and chaotic series prediction with photonic reservoir computers
NASA Astrophysics Data System (ADS)
Antonik, Piotr; Hermans, Michiel; Duport, François; Haelterman, Marc; Massar, Serge
2016-03-01
Reservoir Computing is a bio-inspired computing paradigm for processing time dependent signals that is particularly well suited for analog implementations. Our team has demonstrated several photonic reservoir computers with performance comparable to digital algorithms on a series of benchmark tasks such as channel equalisation and speech recognition. Recently, we showed that our opto-electronic reservoir computer could be trained online with a simple gradient descent algorithm programmed on an FPGA chip. This setup makes it in principle possible to feed the output signal back into the reservoir, and thus highly enrich the dynamics of the system. This will allow to tackle complex prediction tasks in hardware, such as pattern generation and chaotic and financial series prediction, which have so far only been studied in digital implementations. Here we report simulation results of our opto-electronic setup with an FPGA chip and output feedback applied to pattern generation and Mackey-Glass chaotic series prediction. The simulations take into account the major aspects of our experimental setup. We find that pattern generation can be easily implemented on the current setup with very good results. The Mackey-Glass series prediction task is more complex and requires a large reservoir and more elaborate training algorithm. With these adjustments promising result are obtained, and we now know what improvements are needed to match previously reported numerical results. These simulation results will serve as basis of comparison for experiments we will carry out in the coming months.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nash, T.; Atac, R.; Cook, A.
1989-03-06
The ACPMAPS multipocessor is a highly cost effective, local memory parallel computer with a hypercube or compound hypercube architecture. Communication requires the attention of only the two communicating nodes. The design is aimed at floating point intensive, grid like problems, particularly those with extreme computing requirements. The processing nodes of the system are single board array processors, each with a peak power of 20 Mflops, supported by 8 Mbytes of data and 2 Mbytes of instruction memory. The system currently being assembled has a peak power of 5 Gflops. The nodes are based on the Weitek XL Chip set. Themore » system delivers performance at approximately $300/Mflop. 8 refs., 4 figs.« less
Barone, Umberto; Merletti, Roberto
2013-08-01
A compact and portable system for real-time, multichannel, HD-sEMG acquisition is presented. The device is based on a modular, multiboard approach for scalability and to optimize power consumption for battery operating mode. The proposed modular approach allows us to configure the number of sEMG channels from 64 to 424. A plastic-optical-fiber-based 10/100 Ethernet link is implemented on a field-programmable gate array (FPGA)-based board for real-time, safety data transmission toward a personal computer or laptop for data storage and offline analysis. The high-performance A/D conversion stage, based on 24-bit ADC, allows us to automatically serialize the samples and transmits them on a single SPI bus connecting a sequence of up to 14 ADC chips in chain mode. The prototype is configured to work with 64 channels and a sample frequency of 2.441 ksps (derived from 25-MHz clock source), corresponding to a real data throughput of 3 Mbps. The prototype was assembled to demonstrate the available features (e.g., scalability) and evaluate the expected performances. The analog front end board could be dynamically configured to acquire sEMG signals in monopolar or single differential mode by means of FPGA I/O interface. The system can acquire continuously 64 channels for up to 5 h with a lightweight battery pack of 7.5 Vdc/2200 mAh. A PC-based application was also developed, by means of the open source Qt Development Kit from Nokia, for prototype characterization, sEMG measurements, and real-time visualization of 2-D maps.
Study on Radiation Condition in DAMPE Orbit by Analyzing the Engineering Data of BGO Calorimeter
NASA Astrophysics Data System (ADS)
Feng, Changqing; Liu, Shubin; Zhang, Yunlong; Ma, Siyuan
2016-07-01
The DAMPE (DArk Matter Particle Explorer) is a scientific satellite which was successfully launched into a 500 Km sun-synchronous orbit, on December 17th, 2015, from the Jiuquan Satellite Launch Center of China. The major scientific objectives of the DAMPE mission are primary cosmic ray, gamma ray astronomy and dark matter particles, by observing high energy primary cosmic rays, especially positrons/electrons and gamma rays with an energy range from 5 GeV to 10 TeV. The BGO calorimeter is a critical sub-detector of DAMPE payload, for measuring the energy of cosmic particles, distinguishing positrons/electrons and gamma rays from hadron background, and providing trigger information. It utilizes 308 BGO (Bismuth Germanate Oxide) crystal logs with the size of 2.5cm*2.5cm*60cm for each log, to form a total absorption electromagnetic calorimeter. All the BGO logs are stacked in 14 layers, with each layer consisting of 22 BGO crystal logs and each log is viewed by two Hamamatsu R5610A PMTs (photomultiplier tubes), from both sides respectively. In order to achieve a large dynamic range, each PMT base incorporates a three dynode (2, 5, 8) pick off, which results in 616 PMTs and 1848 signal channels. The readout electronics system, which consists of 16 FEE (Front End Electronics) modules, was developed. Its main functions are based on the Flash-based FPGA (Field Programmable Gate Array) chip and low power, 32-channel VA160 and VATA160 ASICs (Application Specific Integrated Circuits) for precisely measuring the charge of PMT signals and providing "hit" signals as well. The hit signals are sent to the trigger module of PDPU (Payload Data Process Unit) and the hit rates of each layer is real-timely recorded by counters and packed into the engineering data, which directly reflect the flux of particles which fly into or pass through the detectors. In order to mitigate the SEU (Single Event Upset) effect in radioactive space environment, certain protecting methods, such as TMR (Triple Modular Redundancy) and CRC (Cyclic Redundancy Check) for some critical registers in FPGA logic was adopted. To mitigate the SEL (Single Event Latch-up) effect for the ASICs chips, a protecting solution by monitoring the current of VA160/VATA160 chips are applied. All the SEU and SEL events are recorded by counters and transmitted to ground station in the form of engineering data. The information of hit rates, and the SEU and SEL counters in the engineering data can be used to evaluate the radiation condition and its variations in DAMPE orbit. The preliminary results are introduced in this paper, which is based on the engineering data in the first six months after launching.
Economical Implementation of a Filter Engine in an FPGA
NASA Technical Reports Server (NTRS)
Kowalski, James E.
2009-01-01
A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. For these and other applications, it could be advantageous to combine compact FPGA digital filter implementations with other application-specific logic implementations on single integrated-circuit chips. An FPGA could readily be tailored to implement a variety of filters because the filter coefficients would be loaded into memory at startup.
A low delay transmission method of multi-channel video based on FPGA
NASA Astrophysics Data System (ADS)
Fu, Weijian; Wei, Baozhi; Li, Xiaobin; Wang, Quan; Hu, Xiaofei
2018-03-01
In order to guarantee the fluency of multi-channel video transmission in video monitoring scenarios, we designed a kind of video format conversion method based on FPGA and its DMA scheduling for video data, reduces the overall video transmission delay.In order to sace the time in the conversion process, the parallel ability of FPGA is used to video format conversion. In order to improve the direct memory access (DMA) writing transmission rate of PCIe bus, a DMA scheduling method based on asynchronous command buffer is proposed. The experimental results show that this paper designs a low delay transmission method based on FPGA, which increases the DMA writing transmission rate by 34% compared with the existing method, and then the video overall delay is reduced to 23.6ms.
Region-Oriented Placement Algorithm for Coarse-Grained Power-Gating FPGA Architecture
NASA Astrophysics Data System (ADS)
Li, Ce; Dong, Yiping; Watanabe, Takahiro
An FPGA plays an essential role in industrial products due to its fast, stable and flexible features. But the power consumption of FPGAs used in portable devices is one of critical issues. Top-down hierarchical design method is commonly used in both ASIC and FPGA design. But, in the case where plural modules are integrated in an FPGA and some of them might be in sleep-mode, current FPGA architecture cannot be fully effective. In this paper, coarse-grained power gating FPGA architecture is proposed where a whole area of an FPGA is partitioned into several regions and power supply is controlled for each region, so that modules in sleep mode can be effectively power-off. We also propose a region oriented FPGA placement algorithm fitted to this user's hierarchical design based on VPR[1]. Simulation results show that this proposed method could reduce power consumption of FPGA by 38% on average by setting unused modules or regions in sleep mode.
NASA Astrophysics Data System (ADS)
Schnauber, Peter; Schall, Johannes; Bounouar, Samir; Höhne, Theresa; Park, Suk-In; Ryu, Geun-Hwan; Heindel, Tobias; Burger, Sven; Song, Jin-Dong; Rodt, Sven; Reitzenstein, Stephan
2018-04-01
The development of multi-node quantum optical circuits has attracted great attention in recent years. In particular, interfacing quantum-light sources, gates and detectors on a single chip is highly desirable for the realization of large networks. In this context, fabrication techniques that enable the deterministic integration of pre-selected quantum-light emitters into nanophotonic elements play a key role when moving forward to circuits containing multiple emitters. Here, we present the deterministic integration of an InAs quantum dot into a 50/50 multi-mode interference beamsplitter via in-situ electron beam lithography. We demonstrate the combined emitter-gate interface functionality by measuring triggered single-photon emission on-chip with $g^{(2)}(0) = 0.13\\pm 0.02$. Due to its high patterning resolution as well as spectral and spatial control, in-situ electron beam lithography allows for integration of pre-selected quantum emitters into complex photonic systems. Being a scalable single-step approach, it paves the way towards multi-node, fully integrated quantum photonic chips.
Indiveri, Giacomo
2008-01-01
Biological organisms perform complex selective attention operations continuously and effortlessly. These operations allow them to quickly determine the motor actions to take in response to combinations of external stimuli and internal states, and to pay attention to subsets of sensory inputs suppressing non salient ones. Selective attention strategies are extremely effective in both natural and artificial systems which have to cope with large amounts of input data and have limited computational resources. One of the main computational primitives used to perform these selection operations is the Winner-Take-All (WTA) network. These types of networks are formed by arrays of coupled computational nodes that selectively amplify the strongest input signals, and suppress the weaker ones. Neuromorphic circuits are an optimal medium for constructing WTA networks and for implementing efficient hardware models of selective attention systems. In this paper we present an overview of selective attention systems based on neuromorphic WTA circuits ranging from single-chip vision sensors for selecting and tracking the position of salient features, to multi-chip systems implement saliency-map based models of selective attention. PMID:27873818
Indiveri, Giacomo
2008-09-03
Biological organisms perform complex selective attention operations continuously and effortlessly. These operations allow them to quickly determine the motor actions to take in response to combinations of external stimuli and internal states, and to pay attention to subsets of sensory inputs suppressing non salient ones. Selective attention strategies are extremely effective in both natural and artificial systems which have to cope with large amounts of input data and have limited computational resources. One of the main computational primitives used to perform these selection operations is the Winner-Take-All (WTA) network. These types of networks are formed by arrays of coupled computational nodes that selectively amplify the strongest input signals, and suppress the weaker ones. Neuromorphic circuits are an optimal medium for constructing WTA networks and for implementing efficient hardware models of selective attention systems. In this paper we present an overview of selective attention systems based on neuromorphic WTA circuits ranging from single-chip vision sensors for selecting and tracking the position of salient features, to multi-chip systems implement saliency-map based models of selective attention.
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform
Claus, R.
2015-10-23
The ATLAS muon Cathode Strip Chamber (CSC) back-end readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run 2 luminosity. The readout design is based on the Reconfiguration Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the ATCA platform. The RCE design is based on the new System on Chip Xilinx Zynq series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources together with auxiliary memories to form a versatile DAQmore » building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the Zynq for G-link, S-link and TTC allowed the full system of 320 G-links from the 32 chambers to be processed by 6 COBs in one ATCA shelf through software waveform feature extraction to output 32 S-links. Furthermore, the full system was installed in Sept. 2014. We will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning towards LHC Run 2.« less
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform
NASA Astrophysics Data System (ADS)
Claus, R.; ATLAS Collaboration
2016-07-01
The ATLAS muon Cathode Strip Chamber (CSC) back-end readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run 2 luminosity. The readout design is based on the Reconfiguration Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the ATCA platform. The RCE design is based on the new System on Chip Xilinx Zynq series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources together with auxiliary memories to form a versatile DAQ building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the Zynq for G-link, S-link and TTC allowed the full system of 320 G-links from the 32 chambers to be processed by 6 COBs in one ATCA shelf through software waveform feature extraction to output 32 S-links. The full system was installed in Sept. 2014. We will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning towards LHC Run 2.
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform
NASA Astrophysics Data System (ADS)
Bartoldus, R.; Claus, R.; Garelli, N.; Herbst, R. T.; Huffer, M.; Iakovidis, G.; Iordanidou, K.; Kwan, K.; Kocian, M.; Lankford, A. J.; Moschovakos, P.; Nelson, A.; Ntekas, K.; Ruckman, L.; Russell, J.; Schernau, M.; Schlenker, S.; Su, D.; Valderanis, C.; Wittgen, M.; Yildiz, S. C.
2016-01-01
The ATLAS muon Cathode Strip Chamber (CSC) backend readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run-2 luminosity. The readout design is based on the Reconfigurable Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the Advanced Telecommunication Computing Architecture (ATCA) platform. The RCE design is based on the new System on Chip XILINX ZYNQ series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources. Together with auxiliary memories, all these components form a versatile DAQ building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the ZYNQ for high speed input and output fiberoptic links and TTC allowed the full system of 320 input links from the 32 chambers to be processed by 6 COBs in one ATCA shelf. The full system was installed in September 2014. We will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning for LHC Run 2.
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform
Bartoldus, R.; Claus, R.; Garelli, N.; ...
2016-01-25
The ATLAS muon Cathode Strip Chamber (CSC) backend readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run-2 luminosity. The readout design is based on the Reconfigurable Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the Advanced Telecommunication Computing Architecture (ATCA) platform. The RCE design is based on the new System on Chip XILINX ZYNQ series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources. Together with auxiliary memories, all ofmore » these components form a versatile DAQ building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the ZYNQ for high speed input and output fiberoptic links and TTC allowed the full system of 320 input links from the 32 chambers to be processed by 6 COBs in one ATCA shelf. The full system was installed in September 2014. In conclusion, we will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning for LHC Run 2.« less
Manufacturability of the X Architecture at the 90-nm technology node
NASA Astrophysics Data System (ADS)
Smayling, Michael C.; Sarma, Robin C.; Nagata, Toshiyuki; Arora, Narain; Duane, Michael P.; Oemardani, Shiany; Shah, Santosh
2004-05-01
In this paper, we discuss the results from a test chip that demonstrate the manufacturability and integration-worthiness of the X Architecture at the 90-nm technology node. We discuss how a collaborative effort between the design and chip making communities used the current generation of mask, lithography, wafer processing, inspection and metrology equipment to create 45 degree wires in typical metal pitches for the upper layers on a 90-nm device in a production environment. Cadence Design Systems created the test structure design and chip validation tools for the project. Canon"s KrF ES3 and ArF AS2 scanners were used for the lithography. Applied Materials used its interconnect fabrication technologies to produce the multilayer copper, low-k interconnect on 300-mm wafers. The results were confirmed for critical dimension and defect levels using Applied Materials" wafer inspection and metrology systems.
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
FPGA-based trigger system for the Fermilab SeaQuest experimentz
NASA Astrophysics Data System (ADS)
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; Chang, Ting-Hua; Chang, Wen-Chen; Chen, Yen-Chu; Gilman, Ron; Nakano, Kenichi; Peng, Jen-Chieh; Wang, Su-Yin
2015-12-01
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton-nucleon interactions in a high rate environment. The trigger system consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz
Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; ...
2015-09-10
The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
A wireless multi-channel bioimpedance measurement system for personalized healthcare and lifestyle.
Ramos, Javier; Ausín, José Luis; Lorido, Antonio Manuel; Redondo, Francisco; Duque-Carrillo, Juan Francisco
2013-01-01
Miniaturized, noninvasive, wearable sensors constitute a fundamental prerequisite for pervasive, predictive, and preventive healthcare systems. In this sense, this paper presents the design, realization, and evaluation of a wireless multi-channel measurement system based on a cost-effective high-performance integrated circuit for electrical bioimpedance (EBI) measurements in the frequency range from 1 kHz to 1 MHz. The resulting on-chip spectrometer provides high measuring EBI capabilities and together with a low-cost, commercially available radio frequency transceiver device. It provides reliable wireless communication, constitutes the basic node to build EBI wireless sensor networks (EBI-WSNs). The proposed EBI-WSN behaves as a high-performance wireless multi-channel EBI spectrometer, where the number of channels is completely scalable and independently configurable to satisfy specific measurement requirements of each individual. A prototype of the EBI node leads to a very small printed circuit board of approximately 8 cm2 including chip-antenna, which can operate several years on one 3-V coin cell battery and make it suitable for long-term preventive healthcare monitoring.
High precision, fast ultrasonic thermometer based on measurement of the speed of sound in air
NASA Astrophysics Data System (ADS)
Huang, K. N.; Huang, C. F.; Li, Y. C.; Young, M. S.
2002-11-01
This study presents a microcomputer-based ultrasonic system which measures air temperature by detecting variations in the speed of sound in the air. Changes in the speed of sound are detected by phase shift variations of a 40 kHz continuous ultrasonic wave. In a test embodiment, two 40 kHz ultrasonic transducers are set face to face at a constant distance. Phase angle differences between transmitted and received signals are determined by a FPGA digital phase detector and then analyzed in an 89C51 single-chip microcomputer. Temperature is calculated and then sent to a LCD display and, optionally, to a PC. Accuracy of measurement is within 0.05 degC at an inter-transducer distance of 10 cm. Temperature variations are displayed within 10 ms. The main advantages of the proposed system are high resolution, rapid temperature measurement, noncontact measurement and easy implementation.
Bonfanti, A; Ceravolo, M; Zambra, G; Gusmeroli, R; Spinelli, A S; Lacaita, A L; Angotzi, G N; Baranauskas, G; Fadiga, L
2010-01-01
This paper reports a multi-channel neural recording system-on-chip (SoC) with digital data compression and wireless telemetry. The circuit consists of a 16 amplifiers, an analog time division multiplexer, an 8-bit SAR AD converter, a digital signal processor (DSP) and a wireless narrowband 400-MHz binary FSK transmitter. Even though only 16 amplifiers are present in our current die version, the whole system is designed to work with 64 channels demonstrating the feasibility of a digital processing and narrowband wireless transmission of 64 neural recording channels. A digital data compression, based on the detection of action potentials and storage of correspondent waveforms, allows the use of a 1.25-Mbit/s binary FSK wireless transmission. This moderate bit-rate and a low frequency deviation, Manchester-coded modulation are crucial for exploiting a narrowband wireless link and an efficient embeddable antenna. The chip is realized in a 0.35- εm CMOS process with a power consumption of 105 εW per channel (269 εW per channel with an extended transmission range of 4 m) and an area of 3.1 × 2.7 mm(2). The transmitted signal is captured by a digital TV tuner and demodulated by a wideband phase-locked loop (PLL), and then sent to a PC via an FPGA module. The system has been tested for electrical specifications and its functionality verified in in-vivo neural recording experiments.
A FPGA-based architecture for real-time image matching
NASA Astrophysics Data System (ADS)
Wang, Jianhui; Zhong, Sheng; Xu, Wenhui; Zhang, Weijun; Cao, Zhiguo
2013-10-01
Image matching is a fundamental task in computer vision. It is used to establish correspondence between two images taken at different viewpoint or different time from the same scene. However, its large computational complexity has been a challenge to most embedded systems. This paper proposes a single FPGA-based image matching system, which consists of SIFT feature detection, BRIEF descriptor extraction and BRIEF matching. It optimizes the FPGA architecture for the SIFT feature detection to reduce the FPGA resources utilization. Moreover, we implement BRIEF description and matching on FPGA also. The proposed system can implement image matching at 30fps (frame per second) for 1280x720 images. Its processing speed can meet the demand of most real-life computer vision applications.
An embedded vision system for an unmanned four-rotor helicopter
NASA Astrophysics Data System (ADS)
Lillywhite, Kirt; Lee, Dah-Jye; Tippetts, Beau; Fowers, Spencer; Dennis, Aaron; Nelson, Brent; Archibald, James
2006-10-01
In this paper an embedded vision system and control module is introduced that is capable of controlling an unmanned four-rotor helicopter and processing live video for various law enforcement, security, military, and civilian applications. The vision system is implemented on a newly designed compact FPGA board (Helios). The Helios board contains a Xilinx Virtex-4 FPGA chip and memory making it capable of implementing real time vision algorithms. A Smooth Automated Intelligent Leveling daughter board (SAIL), attached to the Helios board, collects attitude and heading information to be processed in order to control the unmanned helicopter. The SAIL board uses an electrolytic tilt sensor, compass, voltage level converters, and analog to digital converters to perform its operations. While level flight can be maintained, problems stemming from the characteristics of the tilt sensor limits maneuverability of the helicopter. The embedded vision system has proven to give very good results in its performance of a number of real-time robotic vision algorithms.
Slow Controls Using the Axiom M5235BCC
NASA Astrophysics Data System (ADS)
Hague, Tyler
2008-10-01
The Forward Vertex Detector group at PHENIX plans to adopt the Axiom M5235 Business Card Controller for use as slow controls. It is also being evaluated for slow controls on FermiLab e906. This controller features the Freescale MCF5235 microprocessor. It also has three parallel buses, these being the MCU port, BUS port, and enhanced Time Processing Unit (eTPU) port. The BUS port uses a chip select module with three external chip selects to communicate with peripherals. This will be used to communicate with and configure Field Programmable Gate Arrays (FPGAs). The controller also has an Ethernet port which can use several different protocols such as TCP and UDP. This will be used to transfer files with computers on a network. The M5235 Business Card Controller will be placed in a VME crate along with VME card and a Spartan-3 FPGA.
FPGA-based distributed computing microarchitecture for complex physical dynamics investigation.
Borgese, Gianluca; Pace, Calogero; Pantano, Pietro; Bilotta, Eleonora
2013-09-01
In this paper, we present a distributed computing system, called DCMARK, aimed at solving partial differential equations at the basis of many investigation fields, such as solid state physics, nuclear physics, and plasma physics. This distributed architecture is based on the cellular neural network paradigm, which allows us to divide the differential equation system solving into many parallel integration operations to be executed by a custom multiprocessor system. We push the number of processors to the limit of one processor for each equation. In order to test the present idea, we choose to implement DCMARK on a single FPGA, designing the single processor in order to minimize its hardware requirements and to obtain a large number of easily interconnected processors. This approach is particularly suited to study the properties of 1-, 2- and 3-D locally interconnected dynamical systems. In order to test the computing platform, we implement a 200 cells, Korteweg-de Vries (KdV) equation solver and perform a comparison between simulations conducted on a high performance PC and on our system. Since our distributed architecture takes a constant computing time to solve the equation system, independently of the number of dynamical elements (cells) of the CNN array, it allows us to reduce the elaboration time more than other similar systems in the literature. To ensure a high level of reconfigurability, we design a compact system on programmable chip managed by a softcore processor, which controls the fast data/control communication between our system and a PC Host. An intuitively graphical user interface allows us to change the calculation parameters and plot the results.
Design of an FPGA-based electronic flow regulator (EFR) for spacecraft propulsion system
NASA Astrophysics Data System (ADS)
Manikandan, J.; Jayaraman, M.; Jayachandran, M.
2011-02-01
This paper describes a scheme for electronically regulating the flow of propellant to the thruster from a high-pressure storage tank used in spacecraft application. Precise flow delivery of propellant to thrusters ensures propulsion system operation at best efficiency by maximizing the propellant and power utilization for the mission. The proposed field programmable gate array (FPGA) based electronic flow regulator (EFR) is used to ensure precise flow of propellant to the thrusters from a high-pressure storage tank used in spacecraft application. This paper presents hardware and software design of electronic flow regulator and implementation of the regulation logic onto an FPGA.Motivation for proposed FPGA-based electronic flow regulation is on the disadvantages of conventional approach of using analog circuits. Digital flow regulation overcomes the analog equivalent as digital circuits are highly flexible, are not much affected due to noise, accurate performance is repeatable, interface is easier to computers, storing facilities are possible and finally failure rate of digital circuits is less. FPGA has certain advantages over ASIC and microprocessor/micro-controller that motivated us to opt for FPGA-based electronic flow regulator. Also the control algorithm being software, it is well modifiable without changing the hardware. This scheme is simple enough to adopt for a wide range of applications, where the flow is to be regulated for efficient operation.The proposed scheme is based on a space-qualified re-configurable field programmable gate arrays (FPGA) and hybrid micro circuit (HMC). A graphical user interface (GUI) based application software is also developed for debugging, monitoring and controlling the electronic flow regulator from PC COM port.
2010-07-22
dependent , providing a natural bandwidth match between compute cores and the memory subsystem. • High Bandwidth Dcnsity. Waveguides crossing the chip...simulate this memory access architecture on a 2S6-core chip with a concentrated 64-node network lIsing detailed traces of high-performance embedded...memory modulcs, wc placc memory access poi nts (MAPs) around the pcriphery of the chip connected to thc nctwork. These MAPs, shown in Figure 4, contain
Research in Computer Simulation of Integrated Circuits.
1983-07-31
mactore ftor eval -al-rto implementad am a single chip ae those s Lca we beoi ~~g 7he PT!2 software datatow macl*-re ihas nodes ’cr prr., incrastgly... chip grows, these tools are becoming increasingly importan The FTL2 system described in this paper is an interactive system for specifying concurrent...implemented on a single chip grows, theselools are becom- / r/ - --/ ing increasingly important. The FTL2 system described in this paper is an interactive
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe
2017-01-01
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l’information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N-th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work. PMID:28718788
Audi, Ahmad; Pierrot-Deseilligny, Marc; Meynard, Christophe; Thom, Christian
2017-07-18
Images acquired with a long exposure time using a camera embedded on UAVs (Unmanned Aerial Vehicles) exhibit motion blur due to the erratic movements of the UAV. The aim of the present work is to be able to acquire several images with a short exposure time and use an image processing algorithm to produce a stacked image with an equivalent long exposure time. Our method is based on the feature point image registration technique. The algorithm is implemented on the light-weight IGN (Institut national de l'information géographique) camera, which has an IMU (Inertial Measurement Unit) sensor and an SoC (System on Chip)/FPGA (Field-Programmable Gate Array). To obtain the correct parameters for the resampling of the images, the proposed method accurately estimates the geometrical transformation between the first and the N -th images. Feature points are detected in the first image using the FAST (Features from Accelerated Segment Test) detector, then homologous points on other images are obtained by template matching using an initial position benefiting greatly from the presence of the IMU sensor. The SoC/FPGA in the camera is used to speed up some parts of the algorithm in order to achieve real-time performance as our ultimate objective is to exclusively write the resulting image to save bandwidth on the storage device. The paper includes a detailed description of the implemented algorithm, resource usage summary, resulting processing time, resulting images and block diagrams of the described architecture. The resulting stacked image obtained for real surveys does not seem visually impaired. An interesting by-product of this algorithm is the 3D rotation estimated by a photogrammetric method between poses, which can be used to recalibrate in real time the gyrometers of the IMU. Timing results demonstrate that the image resampling part of this algorithm is the most demanding processing task and should also be accelerated in the FPGA in future work.
QCDOC: A 10-teraflops scale computer for lattice QCD
NASA Astrophysics Data System (ADS)
Chen, D.; Christ, N. H.; Cristian, C.; Dong, Z.; Gara, A.; Garg, K.; Joo, B.; Kim, C.; Levkova, L.; Liao, X.; Mawhinney, R. D.; Ohta, S.; Wettig, T.
2001-03-01
The architecture of a new class of computers, optimized for lattice QCD calculations, is described. An individual node is based on a single integrated circuit containing a PowerPC 32-bit integer processor with a 1 Gflops 64-bit IEEE floating point unit, 4 Mbyte of memory, 8 Gbit/sec nearest-neighbor communications and additional control and diagnostic circuitry. The machine's name, QCDOC, derives from "QCD On a Chip".
The Design of a Polymorphous Cognitive Agent Architecture (PCAA)
2008-05-01
tree, and search agents may search the tree for documents or clusters, depositing pheromones on the way down the tree. 46 19 The quality of SODAS...location in the lattice is a node, connected to its neighbors by links, and agents roam across the lattice, depositing pheromones . 49 21 A possible FPGA...provided by swarming, and also figure out a way for learning in ACT-R to trickle down to swarming computations, e.g., through the pheromones . Integration
A Hardware Platform for Tuning of MEMS Devices Using Closed-Loop Frequency Response
NASA Technical Reports Server (NTRS)
Ferguson, Michael I.; MacDonald, Eric; Foor, David
2005-01-01
We report on the development of a hardware platform for integrated tuning and closed-loop operation of MEMS gyroscopes. The platform was developed and tested for the second generation JPL/Boeing Post-Resonator MEMS gyroscope. The control of this device is implemented through a digital design on a Field Programmable Gate Array (FPGA). A software interface allows the user to configure, calibrate, and tune the bias voltages on the micro-gyro. The interface easily transitions to an embedded solution that allows for the miniaturization of the system to a single chip.
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA
NASA Astrophysics Data System (ADS)
Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing
Large-scale matrix inversion play an important role in many applications. However to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and the maximum power-performance of 41 can be achieved compare to Pentium Dual CPU with double SSE threads.
NASA Astrophysics Data System (ADS)
Kapur, Pawan
The miniaturization paradigm for silicon integrated circuits has resulted in a tremendous cost and performance advantage. Aggressive shrinking of devices provides faster transistors and a greater functionality for circuit design. However, scaling induced smaller wire cross-sections coupled with longer lengths owing to larger chip areas, result in a steady deterioration of interconnects. This degradation in interconnect trends threatens to slow down the rapid growth along Moore's law. This work predicts that the situation is worse than anticipated. It shows that in the light of technology and reliability constraints, scaling induced increase in electron surface scattering, fractional cross section area occupied by the highly resistive barrier, and realistic interconnect operation temperature will lead to a significant rise in effective resistivity of modern copper based interconnects. We start by discussing various technology factors affecting copper resistivity. We, next, develop simulation tools to model these effects. Using these tools, we quantify the increase in realistic copper resistivity as a function of future technology nodes, under various technology assumptions. Subsequently, we evaluate the impact of these technology effects on delay and power dissipation of global signaling interconnects. Modern long on-chip wires use repeaters, which dramatically improves their delay and bandwidth. We quantify the repeated wire delays and power dissipation using realistic resistance trends at future nodes. With the motivation of reducing power, we formalize a methodology, which trades power with delay very efficiently for repeated wires. Using this method, we find that although the repeater power comes down, the total power dissipation due to wires is still found to be very large at future nodes. Finally, we explore optical interconnects as a possible substitute, for specific interconnect applications. We model an optical receiver and waveguides. Using this we assess future optical system performance. Finally, we compare the delay and power of future metal interconnects with that of optical interconnects for global signaling application. We also compare the power dissipation of the two approaches for an upper level clock distribution application. We find that for long on-chip communication links, optical interconnects have lower latencies than future metal interconnects at comparable levels of power dissipation.
Diamond photonics for distributed quantum networks
NASA Astrophysics Data System (ADS)
Johnson, Sam; Dolan, Philip R.; Smith, Jason M.
2017-09-01
The distributed quantum network, in which nodes comprising small but well-controlled quantum states are entangled via photonic channels, has in recent years emerged as a strategy for delivering a range of quantum technologies including secure communications, enhanced sensing and scalable quantum computing. Colour centres in diamond are amongst the most promising candidates for nodes fabricated in the solid-state, offering potential for large scale production and for chip-scale integrated devices. In this review we consider the progress made and the remaining challenges in developing diamond-based nodes for quantum networks. We focus on the nitrogen-vacancy and silicon-vacancy colour centres, which have demonstrated many of the necessary attributes for these applications. We focus in particular on the use of waveguides and other photonic microstructures for increasing the efficiency with which photons emitted from these colour centres can be coupled into a network, and the use of microcavities for increasing the fraction of photons emitted that are suitable for generating entanglement between nodes.
NASA Astrophysics Data System (ADS)
Suryono, Suryono; Purnomo Putro, Sapto; Widowati; Adhy, Satriyo
2018-05-01
Experimental results of data acquisition and transmission of water surface level from the field using System on Chip (SOC) Wi-Fi microcontroller are described here. System on Chip (SOC) Wi-Fi microcontroller is useful in dealing with limitations of in situ measurement by people. It is expected to address the problem of field instrumentation such as complexities in electronic circuit, power supply, efficiency, and automation of digital data acquisition. The system developed here employs five (5) nodes consisting of ultrasonic water surface level sensor using (SOC) Wi-Fi microcontroller. The five nodes are connected to a Wi-Fi router as the gateway to send multi-station data to a computer host. Measurement of water surface level using SOC Wi-Fi microcontroller manages conduct multi-station communication via database service programming that is capable of inputting every data sent to the database record according to the identity of data sent. The system here has a measurement error of 0.65 cm, while in terms of range, communication between data node to gateway varies in distance from 25 m to 45 m. Communication has been successfully conducted from one Wi-Fi gateway to the other that further improvement for its multi-station range is a certain possibility.
Design of optical axis jitter control system for multi beam lasers based on FPGA
NASA Astrophysics Data System (ADS)
Ou, Long; Li, Guohui; Xie, Chuanlin; Zhou, Zhiqiang
2018-02-01
A design of optical axis closed-loop control system for multi beam lasers coherent combining based on FPGA was introduced. The system uses piezoelectric ceramics Fast Steering Mirrors (FSM) as actuator, the Fairfield spot detection of multi beam lasers by the high speed CMOS camera for optical detecting, a control system based on FPGA for real-time optical axis jitter suppression. The algorithm for optical axis centroid detecting and PID of anti-Integral saturation were realized by FPGA. Optimize the structure of logic circuit by reuse resource and pipeline, as a result of reducing logic resource but reduced the delay time, and the closed-loop bandwidth increases to 100Hz. The jitter of laser less than 40Hz was reduced 40dB. The cost of the system is low but it works stably.
Design of a stateless low-latency router architecture for green software-defined networking
NASA Astrophysics Data System (ADS)
Saldaña Cercós, Silvia; Ramos, Ramon M.; Ewald Eller, Ana C.; Martinello, Magnos; Ribeiro, Moisés. R. N.; Manolova Fagertun, Anna; Tafur Monroy, Idelfonso
2015-01-01
Expanding software defined networking (SDN) to transport networks requires new strategies to deal with the large number of flows that future core networks will have to face. New south-bound protocols within SDN have been proposed to benefit from having control plane detached from the data plane offering a cost- and energy-efficient forwarding engine. This paper presents an overview of a new approach named KeyFlow to simultaneously reduce latency, jitter, and power consumption in core network nodes. Results on an emulation platform indicate that round trip time (RTT) can be reduced above 50% compared to the reference protocol OpenFlow, specially when flow tables are densely populated. Jitter reduction has been demonstrated experimentally on a NetFPGA-based platform, and 57.3% power consumption reduction has been achieved.
NASA Astrophysics Data System (ADS)
Szplet, R.; Kalisz, J.; Jachna, Z.
2009-02-01
We present a time digitizer having 45 ps resolution, integrated in a field programmable gate array (FPGA) device. The time interval measurement is based on the two-stage interpolation method. A dual-edge two-phase interpolator is driven by the on-chip synthesized 250 MHz clock with precise phase adjustment. An improved dual-edge double synchronizer was developed to control the main counter. The nonlinearity of the digitizer's transfer characteristic is identified and utilized by the dedicated hardware code processor for the on-the-fly correction of the output data. Application of presented ideas has resulted in the measurement uncertainty of the digitizer below 70 ps RMS over the time interval ranging from 0 to 1 s. The use of the two-stage interpolation and a fast FIFO memory has allowed us to obtain the maximum measurement rate of five million measurements per second.
Data transmission protocol for Pi-of-the-Sky cameras
NASA Astrophysics Data System (ADS)
Uzycki, J.; Kasprowicz, G.; Mankiewicz, M.; Nawrocki, K.; Sitek, P.; Sokolowski, M.; Sulej, R.; Tlaczala, W.
2006-10-01
The large amount of data collected by the automatic astronomical cameras has to be transferred to the fast computers in a reliable way. The method chosen should ensure data streaming in both directions but in nonsymmetrical way. The Ethernet interface is very good choice because of its popularity and proven performance. However it requires TCP/IP stack implementation in devices like cameras for full compliance with existing network and operating systems. This paper describes NUDP protocol, which was made as supplement to standard UDP protocol and can be used as a simple-network protocol. The NUDP does not need TCP protocol implementation and makes it possible to run the Ethernet network with simple devices based on microcontroller and/or FPGA chips. The data transmission idea was created especially for the "Pi of the Sky" project.
Energy efficiency analysis and implementation of AES on an FPGA
NASA Astrophysics Data System (ADS)
Kenney, David
The Advanced Encryption Standard (AES) was developed by Joan Daemen and Vincent Rjimen and endorsed by the National Institute of Standards and Technology in 2001. It was designed to replace the aging Data Encryption Standard (DES) and be useful for a wide range of applications with varying throughput, area, power dissipation and energy consumption requirements. Field Programmable Gate Arrays (FPGAs) are flexible and reconfigurable integrated circuits that are useful for many different applications including the implementation of AES. Though they are highly flexible, FPGAs are often less efficient than Application Specific Integrated Circuits (ASICs); they tend to operate slower, take up more space and dissipate more power. There have been many FPGA AES implementations that focus on obtaining high throughput or low area usage, but very little research done in the area of low power or energy efficient FPGA based AES; in fact, it is rare for estimates on power dissipation to be made at all. This thesis presents a methodology to evaluate the energy efficiency of FPGA based AES designs and proposes a novel FPGA AES implementation which is highly flexible and energy efficient. The proposed methodology is implemented as part of a novel scripting tool, the AES Energy Analyzer, which is able to fully characterize the power dissipation and energy efficiency of FPGA based AES designs. Additionally, this thesis introduces a new FPGA power reduction technique called Opportunistic Combinational Operand Gating (OCOG) which is used in the proposed energy efficient implementation. The AES Energy Analyzer was able to estimate the power dissipation and energy efficiency of the proposed AES design during its most commonly performed operations. It was found that the proposed implementation consumes less energy per operation than any previous FPGA based AES implementations that included power estimations. Finally, the use of Opportunistic Combinational Operand Gating on an AES cipher was found to reduce its dynamic power consumption by up to 17% when compared to an identical design that did not employ the technique.
Moreno-Tapia, Sandra Veronica; Vera-Salas, Luis Alberto; Osornio-Rios, Roque Alfredo; Dominguez-Gonzalez, Aurelio; Stiharu, Ion; de Jesus Romero-Troncoso, Rene
2010-01-01
Computer numerically controlled (CNC) machines have evolved to adapt to increasing technological and industrial requirements. To cover these needs, new generation machines have to perform monitoring strategies by incorporating multiple sensors. Since in most of applications the online Processing of the variables is essential, the use of smart sensors is necessary. The contribution of this work is the development of a wireless network platform of reconfigurable smart sensors for CNC machine applications complying with the measurement requirements of new generation CNC machines. Four different smart sensors are put under test in the network and their corresponding signal processing techniques are implemented in a Field Programmable Gate Array (FPGA)-based sensor node. PMID:22163602
Moreno-Tapia, Sandra Veronica; Vera-Salas, Luis Alberto; Osornio-Rios, Roque Alfredo; Dominguez-Gonzalez, Aurelio; Stiharu, Ion; Romero-Troncoso, Rene de Jesus
2010-01-01
Computer numerically controlled (CNC) machines have evolved to adapt to increasing technological and industrial requirements. To cover these needs, new generation machines have to perform monitoring strategies by incorporating multiple sensors. Since in most of applications the online Processing of the variables is essential, the use of smart sensors is necessary. The contribution of this work is the development of a wireless network platform of reconfigurable smart sensors for CNC machine applications complying with the measurement requirements of new generation CNC machines. Four different smart sensors are put under test in the network and their corresponding signal processing techniques are implemented in a Field Programmable Gate Array (FPGA)-based sensor node.
A programmable microsystem using system-on-chip for real-time biotelemetry.
Wang, Lei; Johannessen, Erik A; Hammond, Paul A; Cui, Li; Reid, Stuart W J; Cooper, Jonathan M; Cumming, David R S
2005-07-01
A telemetry microsystem, including multiple sensors, integrated instrumentation and a wireless interface has been implemented. We have employed a methodology akin to that for System-on-Chip microelectronics to design an integrated circuit instrument containing several "intellectual property" blocks that will enable convenient reuse of modules in future projects. The present system was optimized for low-power and included mixed-signal sensor circuits, a programmable digital system, a feedback clock control loop and RF circuits integrated on a 5 mm x 5 mm silicon chip using a 0.6 microm, 3.3 V CMOS process. Undesirable signal coupling between circuit components has been investigated and current injection into sensitive instrumentation nodes was minimized by careful floor-planning. The chip, the sensors, a magnetic induction-based transmitter and two silver oxide cells were packaged into a 36 mm x 12 mm capsule format. A base station was built in order to retrieve the data from the microsystem in real-time. The base station was designed to be adaptive and timing tolerant since the microsystem design was simplified to reduce power consumption and size. The telemetry system was found to have a packet error rate of 10(-3) using an asynchronous simplex link. Trials in animal carcasses were carried out to show that the transmitter was as effective as a conventional RF device whilst consuming less power.
Optimization of wireless Bluetooth sensor systems.
Lonnblad, J; Castano, J; Ekstrom, M; Linden, M; Backlund, Y
2004-01-01
Within this study, three different Bluetooth sensor systems, replacing cables for transmission of biomedical sensor data, have been designed and evaluated. The three sensor architectures are built on 1-, 2- and 3-chip solutions and depending on the monitoring situation and signal character, different solutions are optimal. Essential parameters for all systems have been low physical weight and small size, resistance to interference and interoperability with other technologies as global- or local networks, PC's and mobile phones. Two different biomedical input signals, ECG and PPG (photoplethysmography), have been used to evaluate the three solutions. The study shows that it is possibly to continuously transmit an analogue signal. At low sampling rates and slowly varying parameters, as monitoring the heart rate with PPG, the 1-chip solution is the most suitable, offering low power consumption and thus a longer battery lifetime or a smaller battery, minimizing the weight of the sensor system. On the other hand, when a higher sampling rate is required, as an ECG, the 3-chip architecture, with a FPGA or micro-controller, offers the best solution and performance. Our conclusion is that Bluetooth might be useful in replacing cables of medical monitoring systems.
NASA Astrophysics Data System (ADS)
Jiang, Homin; Yu, Chen-Yu; Kubo, Derek; Chen, Ming-Tang; Guzzino, Kim
2016-11-01
In this study, a 4 bit, 10 giga-samples-per-second analog-to-digital converter (ADC) printed circuit board assembly (PCBA) was designed, manufactured, and characterized for digitizing radio telescopes. For this purpose, an Adsantec ANST7120A-KMA flash ADC chip was used. Together with the field-programmable gate array platform, developed by the Collaboration for Astronomy Signal Processing and Electronics Research community, the PCBA enables data acquisition with a wide bandwidth and simplifies the intermediate frequency section. In the current version, the PCBA and the chip exhibit an analog bandwidth of 10 GHz (3 dB loss) and 20 GHz, respectively, which facilitates second, third, and even fourth Nyquist sampling. The following average performance parameters were obtained from the first and second Nyquist zones of the three boards: a spurious-free dynamic range of 31.35/30.45 dB, a signal-to-noise and distortion ratio of 22.95/21.83 dB, and an effective number of bits of 3.65/3.43, respectively.
Configurable test bed design for nanosats to qualify commercial and customized integrated circuits
NASA Astrophysics Data System (ADS)
Guareschi, W.; Azambuja, J.; Kastensmidt, F.; Reis, R.; Durao, O.; Schuch, N.; Dessbesel, G.
The use of small satellites has increased substantially in recent years due to the reduced cost of their development and launch, as well to the flexibility offered by commercial components. The test bed is a platform that allows components to be evaluated and tested in space. It is a flexible platform, which can be adjusted to a wide quantity of components and interfaces. This work proposes the design and implementation of a test bed suitable for test and evaluation of commercial circuits used in nanosatellites. The development of such a platform allows developers to reduce the efforts in the integration of components and therefore speed up the overall system development time. The proposed test bed is a configurable platform implemented using a Field Programmable Gate Array (FPGA) that controls the communication protocols and connections to the devices under test. The Flash-based ProASIC3E FPGA from Microsemi is used as a control system. This adaptive system enables the control of new payloads and softcores for test and validation in space. Thus, the integration can be easily performed through configuration parameters. It is intended for modularity. Each component connected to the test bed can have a specific interface programmed using a hardware description language (HDL). The data of each component is stored in embedded memories. Each component has its own memory space. The size of the allocated memory can be also configured. The data transfer priority can be set and packaging can be added to the logic, when needed. Communication with peripheral devices and with the Onboard Computer (OBC) is done through the pre-implemented protocols, such as I2C (Inter-Integrated Circuit), SPI (Serial Peripheral Interface) and external memory control. In loco primary tests demonstrated the control system's functionality. The commercial ProASIC3E FPGA family is not space-flight qualified, but tests have been made under Total Ionizing Dose (TID) showing its robustness up to 25 kr- ds (Si). When considering proton and heavy ions, flash-based FPGAs provide immunity to configuration loss and low bit-flips susceptibility in flash memory. In this first version of the test bed two components are connected to the controller FPGA: a commercial magnetometer and a hardened test chip. The embedded FPGA implements a Single Event Effects (SEE) hardened microprocessor and few other soft-cores to be used in space. This test bed will be used in the NanoSatC-BR1, the first Brazilian Cubesat scheduled to be launched in mid-2013.
Assurance of Complex Electronics. What Path Do We Take?
NASA Technical Reports Server (NTRS)
Plastow, Richard A.
2007-01-01
Many of the methods used to develop software bare a close resemblance to Complex Electronics (CE) development. CE are now programmed to perform tasks that were previously handled in software, such as communication protocols. For instance, Field Programmable Gate Arrays (FPGAs) can have over a million logic gates while system-on-chip (SOC) devices can combine a microprocessor, input and output channels, and sometimes an FPGA for programmability. With this increased intricacy, the possibility of "software-like" bugs such as incorrect design, logic, and unexpected interactions within the logic is great. Since CE devices are obscuring the hardware/software boundary, we propose that mature software methodologies may be utilized with slight modifications to develop these devices. By using standardized S/W Engineering methods such as checklists, missing requirements and "bugs" can be detected earlier in the development cycle, thus creating a development process for CE that will be easily maintained and configurable based on the device used.
An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed
NASA Astrophysics Data System (ADS)
Kim, H.; Choi, Y.; Yang, Y.
In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of Shack-Hartmann sensor (SHS) and deformable mirror (DM), tip-tilt sensor (TTS), tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM 277 actuators with Fried geometry, requiring high speed parallel computing capability SPS. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instrument's (NI's) real time computer and five FPGA boards based on state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabView environment, providing flexibility when applying different algorithms for WFE correction. It also facilitates faster programming and debugging environment as compared to conventional ones. One of the five FPGA's is assigned to measure TTS and calculate control signals for TTM, while the rest four are used to receive SHS signal, calculate slops for each subaperture and correction signal for DM. With this parallel processing capabilities of the SPS the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
Software-based high-level synthesis design of FPGA beamformers for synthetic aperture imaging.
Amaro, Joao; Yiu, Billy Y S; Falcao, Gabriel; Gomes, Marco A C; Yu, Alfred C H
2015-05-01
Field-programmable gate arrays (FPGAs) can potentially be configured as beamforming platforms for ultrasound imaging, but a long design time and skilled expertise in hardware programming are typically required. In this article, we present a novel approach to the efficient design of FPGA beamformers for synthetic aperture (SA) imaging via the use of software-based high-level synthesis techniques. Software kernels (coded in OpenCL) were first developed to stage-wise handle SA beamforming operations, and their corresponding FPGA logic circuitry was emulated through a high-level synthesis framework. After design space analysis, the fine-tuned OpenCL kernels were compiled into register transfer level descriptions to configure an FPGA as a beamformer module. The processing performance of this beamformer was assessed through a series of offline emulation experiments that sought to derive beamformed images from SA channel-domain raw data (40-MHz sampling rate, 12 bit resolution). With 128 channels, our FPGA-based SA beamformer can achieve 41 frames per second (fps) processing throughput (3.44 × 10(8) pixels per second for frame size of 256 × 256 pixels) at 31.5 W power consumption (1.30 fps/W power efficiency). It utilized 86.9% of the FPGA fabric and operated at a 196.5 MHz clock frequency (after optimization). Based on these findings, we anticipate that FPGA and high-level synthesis can together foster rapid prototyping of real-time ultrasound processor modules at low power consumption budgets.
FPGA-based GEM detector signal acquisition for SXR spectroscopy system
NASA Astrophysics Data System (ADS)
Wojenski, A.; Pozniak, K. T.; Kasprowicz, G.; Kolasinski, P.; Krawczyk, R.; Zabolotny, W.; Chernyshova, M.; Czarski, T.; Malinowski, K.
2016-11-01
The presented work is related to the Gas Electron Multiplier (GEM) detector soft X-ray spectroscopy system for tokamak applications. The used GEM detector has one-dimensional, 128 channel readout structure. The channels are connected to the radiation-hard electronics with configurable analog stage and fast ADCs, supporting speeds of 125 MSPS for each channel. The digitalized data is sent directly to the FPGAs using fast serial links. The preprocessing algorithms are implemented in the FPGAs, with the data buffering made in the on-board 2Gb DDR3 memory chips. After the algorithmic stage, the data is sent to the Intel Xeon-based PC for further postprocessing using PCI-Express link Gen 2. For connection of multiple FPGAs, PCI-Express switch 8-to-1 was designed. The whole system can support up to 2048 analog channels. The scope of the work is an FPGA-based implementation of the recorder of the raw signal from GEM detector. Since the system will work in a very challenging environment (neutron radiation, intense electro-magnetic fields), the registered signals from the GEM detector can be corrupted. In the case of the very intense hot plasma radiation (e.g. laser generated plasma), the registered signals can overlap. Therefore, it is valuable to register the raw signals from the GEM detector with high number of events during soft X-ray radiation. The signal analysis will have the direct impact on the implementation of photon energy computation algorithms. As the result, the system will produce energy spectra and topological distribution of soft X-ray radiation. The advanced software was developed in order to perform complex system startup and monitoring of hardware units. Using the array of two one-dimensional GEM detectors it will be possible to perform tomographic reconstruction of plasma impurities radiation in the SXR region.
Parameterized hardware description as object oriented hardware model implementation
NASA Astrophysics Data System (ADS)
Drabik, Pawel K.
2010-09-01
The paper introduces novel model for design, visualization and management of complex, highly adaptive hardware systems. The model settles component oriented environment for both hardware modules and software application. It is developed on parameterized hardware description research. Establishment of stable link between hardware and software, as a purpose of designed and realized work, is presented. Novel programming framework model for the environment, named Graphic-Functional-Components is presented. The purpose of the paper is to present object oriented hardware modeling with mentioned features. Possible model implementation in FPGA chips and its management by object oriented software in Java is described.
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view
NASA Astrophysics Data System (ADS)
Cao, Tam P.; Deng, Guang; Elton, Darrell
2009-02-01
In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms in different lighting conditions are initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images uses significantly less hardware resources on an FPGA while maintaining comparable system's performance. The system is capable of processing 60 live video frames per second.
FPGA-based gating and logic for multichannel single photon counting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pooser, Raphael C; Earl, Dennis Duncan; Evans, Philip G
2012-01-01
We present results characterizing multichannel InGaAs single photon detectors utilizing gated passive quenching circuits (GPQC), self-differencing techniques, and field programmable gate array (FPGA)-based logic for both diode gating and coincidence counting. Utilizing FPGAs for the diode gating frontend and the logic counting backend has the advantage of low cost compared to custom built logic circuits and current off-the-shelf detector technology. Further, FPGA logic counters have been shown to work well in quantum key distribution (QKD) test beds. Our setup combines multiple independent detector channels in a reconfigurable manner via an FPGA backend and post processing in order to perform coincidencemore » measurements between any two or more detector channels simultaneously. Using this method, states from a multi-photon polarization entangled source are detected and characterized via coincidence counting on the FPGA. Photons detection events are also processed by the quantum information toolkit for application testing (QITKAT)« less
Radiation Tolerant, FPGA-Based SmallSat Computer System
NASA Technical Reports Server (NTRS)
LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew
2015-01-01
The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
FASEA: A FPGA Acquisition System and Software Event Analysis for liquid scintillation counting
NASA Astrophysics Data System (ADS)
Steele, T.; Mo, L.; Bignell, L.; Smith, M.; Alexiev, D.
2009-10-01
The FASEA (FPGA based Acquisition and Software Event Analysis) system has been developed to replace the MAC3 for coincidence pulse processing. The system uses a National Instruments Virtex 5 FPGA card (PXI-7842R) for data acquisition and a purpose developed data analysis software for data analysis. Initial comparisons to the MAC3 unit are included based on measurements of 89Sr and 3H, confirming that the system is able to accurately emulate the behaviour of the MAC3 unit.
Rapid and highly integrated FPGA-based Shack-Hartmann wavefront sensor for adaptive optics system
NASA Astrophysics Data System (ADS)
Chen, Yi-Pin; Chang, Chia-Yuan; Chen, Shean-Jen
2018-02-01
In this study, a field programmable gate array (FPGA)-based Shack-Hartmann wavefront sensor (SHWS) programmed on LabVIEW can be highly integrated into customized applications such as adaptive optics system (AOS) for performing real-time wavefront measurement. Further, a Camera Link frame grabber embedded with FPGA is adopted to enhance the sensor speed reacting to variation considering its advantage of the highest data transmission bandwidth. Instead of waiting for a frame image to be captured by the FPGA, the Shack-Hartmann algorithm are implemented in parallel processing blocks design and let the image data transmission synchronize with the wavefront reconstruction. On the other hand, we design a mechanism to control the deformable mirror in the same FPGA and verify the Shack-Hartmann sensor speed by controlling the frequency of the deformable mirror dynamic surface deformation. Currently, this FPGAbead SHWS design can achieve a 266 Hz cyclic speed limited by the camera frame rate as well as leaves 40% logic slices for additionally flexible design.
NASA Astrophysics Data System (ADS)
Alqasemi, Umar; Li, Hai; Aguirre, Andres; Zhu, Quing
2011-03-01
Co-registering ultrasound (US) and photoacoustic (PA) imaging is a logical extension to conventional ultrasound because both modalities provide complementary information of tumor morphology, tumor vasculature and hypoxia for cancer detection and characterization. In addition, both modalities are capable of providing real-time images for clinical applications. In this paper, a Field Programmable Gate Array (FPGA) and Digital Signal Processor (DSP) module-based real-time US/PA imaging system is presented. The system provides real-time US/PA data acquisition and image display for up to 5 fps* using the currently implemented DSP board. It can be upgraded to 15 fps, which is the maximum pulse repetition rate of the used laser, by implementing an advanced DSP module. Additionally, the photoacoustic RF data for each frame is saved for further off-line processing. The system frontend consists of eight 16-channel modules made of commercial and customized circuits. Each 16-channel module consists of two commercial 8-channel receiving circuitry boards and one FPGA board from Analog Devices. Each receiving board contains an IC† that combines. 8-channel low-noise amplifiers, variable-gain amplifiers, anti-aliasing filters, and ADC's‡ in a single chip with sampling frequency of 40MHz. The FPGA board captures the LVDSξ Double Data Rate (DDR) digital output of the receiving board and performs data conditioning and subbeamforming. A customized 16-channel transmission circuitry is connected to the two receiving boards for US pulseecho (PE) mode data acquisition. A DSP module uses External Memory Interface (EMIF) to interface with the eight 16-channel modules through a customized adaptor board. The DSP transfers either sub-beamformed data (US pulse-echo mode or PAI imaging mode) or raw data from FPGA boards to its DDR-2 memory through the EMIF link, then it performs additional processing, after that, it transfer the data to the PC** for further image processing. The PC code performs image processing including demodulation, beam envelope detection and scan conversion. Additionally, the PC code pre-calculates the delay coefficients used for transmission focusing and receiving dynamic focusing for different types of transducers to speed up the imaging process. To further speed up the imaging process, a multi-threads technique is implemented in order to allow formation of previous image frame data and acquisition of the next one simultaneously. The system is also capable of doing semi-real-time automated SO2 imaging at 10 seconds per frame by changing the wavelength knob of the laser automatically using a stepper motor controlled by the system. Initial in vivo experiments were performed on animal tumors to map out its vasculature and hypoxia level, which were superimposed on co-registered US images. The real-time system allows capturing co-registered US/PA images free of motion artifacts and also provides dynamitic information when contrast agents are used.
Design and Training of Limited-Interconnect Architectures
1991-07-16
and signal processing. Neuromorphic (brain like) models, allow an alternative for achieving real-time operation tor such tasks, while having a...compact and robust architecture. Neuromorphic models consist of interconnections of simple computational nodes. In this approach, each node computes a...operational performance. I1. Research Objectives The research objectives were: 1. Development of on- chip local training rules specifically designed for
Radiation Hardened 10BASE-T Ethernet Physical Layer (PHY)
NASA Technical Reports Server (NTRS)
Lin, Michael R. (Inventor); Petrick, David J. (Inventor); Ballou, Kevin M. (Inventor); Espinosa, Daniel C. (Inventor); James, Edward F. (Inventor); Kliesner, Matthew A. (Inventor)
2017-01-01
Embodiments may provide a radiation hardened 10BASE-T Ethernet interface circuit suitable for space flight and in compliance with the IEEE 802.3 standard for Ethernet. The various embodiments may provide a 10BASE-T Ethernet interface circuit, comprising a field programmable gate array (FPGA), a transmitter circuit connected to the FPGA, a receiver circuit connected to the FPGA, and a transformer connected to the transmitter circuit and the receiver circuit. In the various embodiments, the FPGA, transmitter circuit, receiver circuit, and transformer may be radiation hardened.
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Real-time field programmable gate array architecture for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2001-01-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on dedicated very- large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
High density, multi-range analog output Versa Module Europa board for control system applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singh, Kundan, E-mail: kundan@iuac.res.in; Das, Ajit Lal
2014-01-15
A new VMEDAC64, 12-bit 64 channel digital-to-analog converter, a Versa Module Europa (VME) module, features 64 analog voltage outputs with user selectable multiple ranges, has been developed for control system applications at Inter University Accelerator Centre. The FPGA (Field Programmable Gate Array) is the module's core, i.e., it implements the DAC control logic and complexity of VMEbus slave interface logic. The VMEbus slave interface and DAC control logic are completely designed and implemented on a single FPGA chip to achieve high density of 64 channels in a single width VME module and will reduce the module count in the controlmore » system applications, and hence will reduce the power consumption and cost of overall system. One of our early design goals was to develop the VME interface such that it can be easily integrated with the peripheral devices and satisfy the timing specifications of VME standard. The modular design of this module reduces the amount of time required to develop other custom modules for control system. The VME slave interface is written as a single component inside FPGA which will be used as a basic building block for any VMEbus interface project. The module offers multiple output voltage ranges depending upon the requirement. The output voltage range can be reduced or expanded by writing range selection bits in the control register. The module has programmable refresh rate and by default hold capacitors in the sample and hold circuit for each channel are charged periodically every 7.040 ms (i.e., update frequency 284 Hz). Each channel has software controlled output switch which disconnects analog output from the field. The modularity in the firmware design on FPGA makes the debugging very easy. On-board DC/DC converters are incorporated for isolated power supply for the analog section of the board.« less
NASA/BAE SYSTEMS SpaceWire Effort
NASA Technical Reports Server (NTRS)
Rakow, Glenn Parker; Schnurr, Richard G.; Kapcio, Paul
2003-01-01
This paper discusses the state of the NASA and BAE SYSTEMS developments of SpaceWire. NASA has developed intellectual property that implements SpaceWire in Register Transfer Level (RTL) VHDL for a SpaceWire link and router. This design has been extensively verified using directed tests from the SpaceWire Standard and design specification, as well as being randomly tested to flush out hard to find bugs in the code. The high level features of the design will be discussed, including the support for multiple time code masters, which will be useful for the James Webb Space Telescope electrical architecture. This design is now ready to be targeted to FPGA's and ASICs. Target utilization and performance information will be presented for Spaceflight worthy FPGA's and a discussion of the ASIC implementations will be addressed. In particular, the BAE SYSTEMS ASIC will be highlighted which will be implemented on their .25micron rad-hard line. The chip will implement a 4-port router with the ability to tie chips together to make larger routers without external glue logic. This part will have integrated LVDS drivers/receivers, include a PLL and include skew control logic. It will be targeted to run at greater than 300 MHz and include the implementation for the proposed SpaceWire transport layer. The need to provide a reliable transport mechanism for SpaceWire has been identified by both NASA And ESA, who are attempting to define a transport layer standard that utilizes a low overhead, low latency connection oriented approach that works end-to-end. This layer needs to be implemented in hardware to prevent bottlenecks.
NASA Tech Briefs, January 2013
NASA Technical Reports Server (NTRS)
2013-01-01
Topics include: Single-Photon-Sensitive HgCdTe Avalanche Photodiode Detector; Surface-Enhanced Raman Scattering Using Silica Whispering-Gallery Mode Resonators; 3D Hail Size Distribution Interpolation/Extrapolation Algorithm; Color-Changing Sensors for Detecting the Presence of Hypergolic Fuels; Artificial Intelligence Software for Assessing Postural Stability; Transformers: Shape-Changing Space Systems Built with Robotic Textiles; Fibrillar Adhesive for Climbing Robots; Using Pre-Melted Phase Change Material to Keep Payloads in Space Warm for Hours without Power; Development of a Centrifugal Technique for the Microbial Bioburden Analysis of Freon (CFC-11); Microwave Sinterator Freeform Additive Construction System (MS-FACS); DSP/FPGA Design for a High-Speed Programmable S-Band Space Transceiver; On-Chip Power-Combining for High-Power Schottky Diode-Based Frequency Multipliers; FPGA Vision Data Architecture; Memory Circuit Fault Simulator; Ultra-Compact Transputer-Based Controller for High-Level, Multi-Axis Coordination; Regolith Advanced Surface Systems Operations Robot Excavator; Magnetically Actuated Seal; Hybrid Electrostatic/Flextensional Mirror for Lightweight, Large-Aperture, and Cryogenic Space Telescopes; System for Contributing and Discovering Derived Mission and Science Data; Remote Viewer for Maritime Robotics Software; Stackfile Database; Reachability Maps for In Situ Operations; JPL Space Telecommunications Radio System Operating Environment; RFI-SIM: RFI Simulation Package; ION Configuration Editor; Dtest Testing Software; IMPaCT - Integration of Missions, Programs, and Core Technologies; Integrated Systems Health Management (ISHM) Toolkit; Wind-Driven Wireless Networked System of Mobile Sensors for Mars Exploration; In Situ Solid Particle Generator; Analysis of the Effects of Streamwise Lift Distribution on Sonic Boom Signature; Rad-Tolerant, Thermally Stable, High-Speed Fiber-Optic Network for Harsh Environments; Towed Subsurface Optical Communications Buoy; High-Collection-Efficiency Fluorescence Detection Cell; Ultra-Compact, Superconducting Spectrometer-on-a-Chip at Submillimeter Wavelengths; UV Resonant Raman Spectrometer with Multi-Line Laser Excitation; Medicine Delivery Device with Integrated Sterilization and Detection; Ionospheric Simulation System for Satellite Observations and Global Assimilative Model Experiments - ISOGAME; Airborne Tomographic Swath Ice Sounding Processing System; flexplan: Mission Planning System for the Lunar Reconnaissance Orbiter; Estimating Torque Imparted on Spacecraft Using Telemetry; PowderSim: Lagrangian Discrete and Mesh-Free Continuum Simulation Code for Cohesive Soils; Multiple-Frame Detection of Subpixel Targets in Thermal Image Sequences; Metric Learning to Enhance Hyperspectral Image Segmentation; Basic Operational Robotics Instructional System; Sheet Membrane Spacesuit Water Membrane Evaporator; Advanced Materials and Manufacturing for Low-Cost, High-Performance Liquid Rocket Combustion Chambers; Motor Qualification for Long-Duration Mars Missions.
A high data rate universal lattice decoder on FPGA
NASA Astrophysics Data System (ADS)
Ma, Jing; Huang, Xinming; Kura, Swapna
2005-06-01
This paper presents the architecture design of a high data rate universal lattice decoder for MIMO channels on FPGA platform. A phost strategy based lattice decoding algorithm is modified in this paper to reduce the complexity of the closest lattice point search. The data dependency of the improved algorithm is examined and a parallel and pipeline architecture is developed with the iterative decoding function on FPGA and the division intensive channel matrix preprocessing on DSP. Simulation results demonstrate that the improved lattice decoding algorithm provides better bit error rate and less iteration number compared with the original algorithm. The system prototype of the decoder shows that it supports data rate up to 7Mbit/s on a Virtex2-1000 FPGA, which is about 8 times faster than the original algorithm on FPGA platform and two-orders of magnitude better than its implementation on a DSP platform.
Fast data transmission from serial data acquisition for the GEM detector system
NASA Astrophysics Data System (ADS)
Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Byszuk, Adrian; Chernyshova, Maryna; Kasprowicz, Grzegorz; Krawczyk, Rafal D.; Wojenski, Andrzej; Zabolotny, Wojciech
2015-09-01
This article proposes new method of storing data and transferring it to PC in the X-ray GEM detector system. The whole process is performed by FPGA chips (Spartan-6 series from Xilinx). Comparing to previous methods, new approach allows to store much more data in the system. New, improved implementation of the communication algorithm significantly increases transfer rate between system and PC. In PC data is merged and processed by MATLAB. The structure of firmware implemented in the FPGAs is described.
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model.
Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid
2014-01-01
A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well.
FPGA in-the-loop simulations of cardiac excitation model under voltage clamp conditions
NASA Astrophysics Data System (ADS)
Othman, Norliza; Adon, Nur Atiqah; Mahmud, Farhanahani
2017-01-01
Voltage clamp technique allows the detection of single channel currents in biological membranes in identifying variety of electrophysiological problems in the cellular level. In this paper, a simulation study of the voltage clamp technique has been presented to analyse current-voltage (I-V) characteristics of ion currents based on Luo-Rudy Phase-I (LR-I) cardiac model by using a Field Programmable Gate Array (FPGA). Nowadays, cardiac models are becoming increasingly complex which can cause a vast amount of time to run the simulation. Thus, a real-time hardware implementation using FPGA could be one of the best solutions for high-performance real-time systems as it provides high configurability and performance, and able to executes in parallel mode operation. For shorter time development while retaining high confidence results, FPGA-based rapid prototyping through HDL Coder from MATLAB software has been used to construct the algorithm for the simulation system. Basically, the HDL Coder is capable to convert the designed MATLAB Simulink blocks into hardware description language (HDL) for the FPGA implementation. As a result, the voltage-clamp fixed-point design of LR-I model has been successfully conducted in MATLAB Simulink and the simulation of the I-V characteristics of the ionic currents has been verified on Xilinx FPGA Virtex-6 XC6VLX240T development board through an FPGA-in-the-loop (FIL) simulation.
SSC San Diego Biennial Review 2003. Vol 3: Intelligence, Surveillance, and Reconnaissance
2003-01-01
following: • Improving a Commander’s situational analysis, awareness, and planning by leveraging the concepts of Smart Push (time-sensitive situational...The node FPGA implements a universal asynchronous receive and transmit ( UART )-style detector to decode the data stream. The data are then briefly...hypothetical sound sources at various points within a five-dimensional search grid over a short processing epoch. Historically, this has been
NASA Astrophysics Data System (ADS)
Bhattacharyya, Kaustuve; Ke, Chih-Ming; Huang, Guo-Tsai; Chen, Kai-Hsiung; Smilde, Henk-Jan H.; Fuchs, Andreas; Jak, Martin; van Schijndel, Mark; Bozkurt, Murat; van der Schaar, Maurits; Meyer, Steffen; Un, Miranda; Morgan, Stephen; Wu, Jon; Tsai, Vincent; Liang, Frida; den Boef, Arie; ten Berge, Peter; Kubis, Michael; Wang, Cathy; Fouquet, Christophe; Terng, L. G.; Hwang, David; Cheng, Kevin; Gau, TS; Ku, Y. C.
2013-04-01
Aggressive on-product overlay requirements in advanced nodes are setting a superior challenge for the semiconductor industry. This forces the industry to look beyond the traditional way-of-working and invest in several new technologies. Integrated metrology2, in-chip overlay control, advanced sampling and process correction-mechanism (using the highest order of correction possible with scanner interface today), are a few of such technologies considered in this publication.
An underwater optical wireless communication system based on LED source
NASA Astrophysics Data System (ADS)
Rao, Jionghui; Wei, Wei; Wang, Feng; Zhang, Xiaohui
2011-11-01
Compared with other communication methods, optical wireless communication (OWC) holds the merits of higher transmitting rate and sufficient secrecy. So it is an efficacious communicating measure for data transmitting between underwater carriers. However, due to the water attenuation and the transmitter & the receiver (TX/RX) collimation, this application is restrained in underwater mobile carriers. A prototype for underwater OWC was developed, in which a high-powered green LED array was used as the light source which partly raveled the TX/RX collimation out. A small pumped-multiple-tube (PMT) was used as the detector to increase the communicating range, and FPGA chips were employed to code and decode the communicating data. The data rate of the prototype approached to 4 Mb/s at 8.4m and 1 Mb/s at 22m where voice and Morse communications were achieved in a scope of 30 degree TX/RX angle.
The RTE inversion on FPGA aboard the solar orbiter PHI instrument
NASA Astrophysics Data System (ADS)
Cobos Carrascosa, J. P.; Aparicio del Moral, B.; Ramos Mas, J. L.; Balaguer, M.; López Jiménez, A. C.; del Toro Iniesta, J. C.
2016-07-01
In this work we propose a multiprocessor architecture to reach high performance in floating point operations by using radiation tolerant FPGA devices, and under narrow time and power constraints. This architecture is used in the PHI instrument that carries out the scientific analysis aboard the ESA's Solar Orbiter mission. The proposed architecture, in a SIMD flavor, is aimed to be an accelerator within the Data Processing Unit (it is composed by a main Leon processor and two FPGAs) for carrying out the RTE inversion on board the spacecraft using a relatively slow FPGA device - Xilinx XQR4VSX55-. The proposed architecture squeezes the FPGA resources in order to reach the computational requirements and improves the ground-based system performance based on commercial CPUs regarding time and power consumption. In this work we demonstrate the feasibility of using this FPGA devices embedded in the SO/PHI instrument. With that goal in mind, we perform tests to evaluate the scientific results and to measure the processing time and power consumption for carrying out the RTE inversion.
Dolev, Danny; Függer, Matthias; Posch, Markus; Schmid, Ulrich; Steininger, Andreas; Lenzen, Christoph
2014-06-01
We present the first implementation of a distributed clock generation scheme for Systems-on-Chip that recovers from an unbounded number of arbitrary transient faults despite a large number of arbitrary permanent faults. We devise self-stabilizing hardware building blocks and a hybrid synchronous/asynchronous state machine enabling metastability-free transitions of the algorithm's states. We provide a comprehensive modeling approach that permits to prove, given correctness of the constructed low-level building blocks, the high-level properties of the synchronization algorithm (which have been established in a more abstract model). We believe this approach to be of interest in its own right, since this is the first technique permitting to mathematically verify, at manageable complexity, high-level properties of a fault-prone system in terms of its very basic components. We evaluate a prototype implementation, which has been designed in VHDL, using the Petrify tool in conjunction with some extensions, and synthesized for an Altera Cyclone FPGA.
A data transmission method for particle physics experiments based on Ethernet physical layer
NASA Astrophysics Data System (ADS)
Huang, Xi-Ru; Cao, Ping; Zheng, Jia-Jun
2015-11-01
Due to its advantages of universality, flexibility and high performance, fast Ethernet is widely used in readout system design for modern particle physics experiments. However, Ethernet is usually used together with the TCP/IP protocol stack, which makes it difficult to implement readout systems because designers have to use the operating system to process this protocol. Furthermore, TCP/IP degrades the transmission efficiency and real-time performance. To maximize the performance of Ethernet in physics experiment applications, a data readout method based on the physical layer (PHY) is proposed. In this method, TCP/IP is replaced with a customized and simple protocol, which makes it easier to implement. On each readout module, data from the front-end electronics is first fed into an FPGA for protocol processing and then sent out to a PHY chip controlled by this FPGA for transmission. This kind of data path is fully implemented by hardware. From the side of the data acquisition system (DAQ), however, the absence of a standard protocol causes problems for the network related applications. To solve this problem, in the operating system kernel space, data received by the network interface card is redirected from the traditional flow to a specified memory space by a customized program. This memory space can easily be accessed by applications in user space. For the purpose of verification, a prototype system has been designed and implemented. Preliminary test results show that this method can meet the requirements of data transmission from the readout module to the DAQ with an efficient and simple manner. Supported by National Natural Science Foundation of China (11005107) and Independent Projects of State Key Laboratory of Particle Detection and Electronics (201301)
Field-Programmable Gate Array-based fluxgate magnetometer with digital integration
NASA Astrophysics Data System (ADS)
Butta, Mattia; Janosek, Michal; Ripka, Pavel
2010-05-01
In this paper, a digital magnetometer based on printed circuit board fluxgate is presented. The fluxgate is pulse excited and the signal is extracted by gate integration. We investigate the possibility to perform integration on very narrow gates (typically 500 ns) by using digital techniques. The magnetometer is based on field-programmable gate array (FPGA) card: we will show all the advantages and disadvantages, given by digitalization of fluxgate output voltage by means of analog-to-digital converter on FPGA card, as well as digitalization performed by external digitizer. Due to very narrow gate, it is shown that a magnetometer entirely based on a FPGA card is preferable, because it avoids noise due to trigger instability. Both open loop and feedback operative mode are described and achieved results are presented.
Boundary-based cellwise OPC for standard-cell layouts
NASA Astrophysics Data System (ADS)
Pawlowski, David M.; Deng, Liang; Wong, Martin D. F.
2007-03-01
Model based optical proximity correction (OPC) has become necessary at 90nm technology node. Cellwise OPC is an attractive technique to reduce the mask data size as well as the prohibitive runtime of full-chip OPC. As feature dimensions have gotten smaller, the radius of influence for edge features has extended further into neighboring cells such that it is no longer sufficient to perform cellwise OPC independent of neighboring cells, especially for the critical layers. The methodology described in this work accounts for features in neighboring cells and allows a cellwise approach to be applied to cells with a printed gate length of 45nm with the projection that it can also be applied to future technology nodes. OPC-ready cells are generated at library creation (independent of placement) using a boundary-based technique. Each cell has a tractable number of OPC-ready versions due to an intelligent characterization of standard cell layout features. Results are very promising: the average edge placement error (EPE) for all metal1 features in 100 layouts is 0.731nm which is less than 1% of metal1 width; the maximum EPE for poly features reduced to 1/3, compared to cellwise OPC without considering boundaries, creating similar levels of lithographic accuracy while obviating any of the drawbacks inherent in layout specific full-chip model-based OPC.
A programmable controller based on CAN field bus embedded microprocessor and FPGA
NASA Astrophysics Data System (ADS)
Cai, Qizhong; Guo, Yifeng; Chen, Wenhei; Wang, Mingtao
2008-10-01
One kind of new programmable controller(PLC) is introduced in this paper. The advanced embedded microprocessor and Field-Programmable Gate Array (FPGA) device are applied in the PLC system. The PLC system structure was presented in this paper. It includes 32 bits Advanced RISC Machines (ARM) embedded microprocessor as control core, FPGA as control arithmetic coprocessor and CAN bus as data communication criteria protocol connected the host controller and its various extension modules. It is detailed given that the circuits and working principle, IiO interface circuit between ARM and FPGA and interface circuit between ARM and FPGA coprocessor. Furthermore the interface circuit diagrams between various modules are written. In addition, it is introduced that ladder chart program how to control the transfer info of control arithmetic part in FPGA coprocessor. The PLC, through nearly two months of operation to meet the design of the basic requirements.
Deterministic quantum state transfer and remote entanglement using microwave photons.
Kurpiers, P; Magnard, P; Walter, T; Royer, B; Pechal, M; Heinsoo, J; Salathé, Y; Akin, A; Storz, S; Besse, J-C; Gasparinetti, S; Blais, A; Wallraff, A
2018-06-01
Sharing information coherently between nodes of a quantum network is fundamental to distributed quantum information processing. In this scheme, the computation is divided into subroutines and performed on several smaller quantum registers that are connected by classical and quantum channels 1 . A direct quantum channel, which connects nodes deterministically rather than probabilistically, achieves larger entanglement rates between nodes and is advantageous for distributed fault-tolerant quantum computation 2 . Here we implement deterministic state-transfer and entanglement protocols between two superconducting qubits fabricated on separate chips. Superconducting circuits 3 constitute a universal quantum node 4 that is capable of sending, receiving, storing and processing quantum information 5-8 . Our implementation is based on an all-microwave cavity-assisted Raman process 9 , which entangles or transfers the qubit state of a transmon-type artificial atom 10 with a time-symmetric itinerant single photon. We transfer qubit states by absorbing these itinerant photons at the receiving node, with a probability of 98.1 ± 0.1 per cent, achieving a transfer-process fidelity of 80.02 ± 0.07 per cent for a protocol duration of only 180 nanoseconds. We also prepare remote entanglement on demand with a fidelity as high as 78.9 ± 0.1 per cent at a rate of 50 kilohertz. Our results are in excellent agreement with numerical simulations based on a master-equation description of the system. This deterministic protocol has the potential to be used for quantum computing distributed across different nodes of a cryogenic network.
Post-OPC verification using a full-chip pattern-based simulation verification method
NASA Astrophysics Data System (ADS)
Hung, Chi-Yuan; Wang, Ching-Heng; Ma, Cliff; Zhang, Gary
2005-11-01
In this paper, we evaluated and investigated techniques for performing fast full-chip post-OPC verification using a commercial product platform. A number of databases from several technology nodes, i.e. 0.13um, 0.11um and 90nm are used in the investigation. Although it has proven that for most cases, our OPC technology is robust in general, due to the variety of tape-outs with complicated design styles and technologies, it is difficult to develop a "complete or bullet-proof" OPC algorithm that would cover every possible layout patterns. In the evaluation, among dozens of databases, some OPC databases were found errors by Model-based post-OPC checking, which could cost significantly in manufacturing - reticle, wafer process, and more importantly the production delay. From such a full-chip OPC database verification, we have learned that optimizing OPC models and recipes on a limited set of test chip designs may not provide sufficient coverage across the range of designs to be produced in the process. And, fatal errors (such as pinch or bridge) or poor CD distribution and process-sensitive patterns may still occur. As a result, more than one reticle tape-out cycle is not uncommon to prove models and recipes that approach the center of process for a range of designs. So, we will describe a full-chip pattern-based simulation verification flow serves both OPC model and recipe development as well as post OPC verification after production release of the OPC. Lastly, we will discuss the differentiation of the new pattern-based and conventional edge-based verification tools and summarize the advantages of our new tool and methodology: 1). Accuracy: Superior inspection algorithms, down to 1nm accuracy with the new "pattern based" approach 2). High speed performance: Pattern-centric algorithms to give best full-chip inspection efficiency 3). Powerful analysis capability: Flexible error distribution, grouping, interactive viewing and hierarchical pattern extraction to narrow down to unique patterns/cells.
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model
Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid
2014-01-01
A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well. PMID:25484854
Standby Power Management Architecture for Deep-Submicron Systems
2006-05-19
Driver 61 5.1 Quark PicoNode System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 5.2 Power Domain Architecture... Quark system protocol stack. . . . . . . . . . . . . . . . . . . . . . . . . . . 62 5.2 Quark system block diagram...the implementation of the chip using an industry-standard place and route design flow. Lastly some measurements from the chip are presented. 5.1 Quark
Generating clock signals for a cycle accurate, cycle reproducible FPGA based hardware accelerator
Asaad, Sameth W.; Kapur, Mohit
2016-01-05
A method, system and computer program product are disclosed for generating clock signals for a cycle accurate FPGA based hardware accelerator used to simulate operations of a device-under-test (DUT). In one embodiment, the DUT includes multiple device clocks generating multiple device clock signals at multiple frequencies and at a defined frequency ratio; and the FPG hardware accelerator includes multiple accelerator clocks generating multiple accelerator clock signals to operate the FPGA hardware accelerator to simulate the operations of the DUT. In one embodiment, operations of the DUT are mapped to the FPGA hardware accelerator, and the accelerator clock signals are generated at multiple frequencies and at the defined frequency ratio of the frequencies of the multiple device clocks, to maintain cycle accuracy between the DUT and the FPGA hardware accelerator. In an embodiment, the FPGA hardware accelerator may be used to control the frequencies of the multiple device clocks.
Note: Design of FPGA based system identification module with application to atomic force microscopy
NASA Astrophysics Data System (ADS)
Ghosal, Sayan; Pradhan, Sourav; Salapaka, Murti
2018-05-01
The science of system identification is widely utilized in modeling input-output relationships of diverse systems. In this article, we report field programmable gate array (FPGA) based implementation of a real-time system identification algorithm which employs forgetting factors and bias compensation techniques. The FPGA module is employed to estimate the mechanical properties of surfaces of materials at the nano-scale with an atomic force microscope (AFM). The FPGA module is user friendly which can be interfaced with commercially available AFMs. Extensive simulation and experimental results validate the design.
Anticipating and controlling mask costs within EDA physical design
NASA Astrophysics Data System (ADS)
Rieger, Michael L.; Mayhew, Jeffrey P.; Melvin, Lawrence S.; Lugg, Robert M.; Beale, Daniel F.
2003-08-01
For low k1 lithography, more aggressive OPC is being applied to critical layers, and the number of mask layers with OPC treatments is growing rapidly. The 130 nm, process node required, on average, 8 layers containing rules- or model-based OPC. The 90 nm node will have 16 OPC layers, of which 14 layers contain aggressive model-based OPC. This escalation of mask pattern complexity, coupled with the predominant use of vector-scan e-beam (VSB) mask writers contributes to the rising costs of advanced mask sets. Writing times for OPC layouts are several times longer than for traditional layouts, making mask exposure the single largest cost component for OPC masks. Lower mask yields, another key factor in higher mask costs, is also aggravated by OPC. Historical mask set costs are plotted below. The initial cost of a 90 nm-node mask set will exceed one million dollars. The relative impact of mask cost on chip depends on how many total wafers are printed with each mask set. For many foundry chips, where unit production is often well below 1000 wafers, mask costs are larger than wafer processing costs. Further increases in NRE may begin to discourage these suppliers' adoption to 90 nm and smaller nodes. In this paper we will outline several alternatives for reducing mask costs by strategically leveraging dimensional margins. Dimensional specifications for a particular masking layer usually are applied uniformly to all features on that layer. As a practical matter, accuracy requirements on different features in the design may vary widely. Take a polysilicon layer, for example: global tolerance specifications for that layer are driven by the transistor-gate requirements; but these parameters over-specify interconnect feature requirements. By identifying features where dimensional accuracy requirements can be reduced, additional margin can be leveraged to reduce OPC complexity. Mask writing time on VSB tools will drop in nearly direct proportion to reduce shot count. By inspecting masks with reference to feature-dependent margins, instead of uniform specifications, mask yield can be effectively increased further reducing delivered mask expense.
An embedded face-classification system for infrared images on an FPGA
NASA Astrophysics Data System (ADS)
Soto, Javier E.; Figueroa, Miguel
2014-10-01
We present a face-classification architecture for long-wave infrared (IR) images implemented on a Field Programmable Gate Array (FPGA). The circuit is fast, compact and low power, can recognize faces in real time and be embedded in a larger image-processing and computer vision system operating locally on an IR camera. The algorithm uses Local Binary Patterns (LBP) to perform feature extraction on each IR image. First, each pixel in the image is represented as an LBP pattern that encodes the similarity between the pixel and its neighbors. Uniform LBP codes are then used to reduce the number of patterns to 59 while preserving more than 90% of the information contained in the original LBP representation. Then, the image is divided into 64 non-overlapping regions, and each region is represented as a 59-bin histogram of patterns. Finally, the algorithm concatenates all 64 regions to create a 3,776-bin spatially enhanced histogram. We reduce the dimensionality of this histogram using Linear Discriminant Analysis (LDA), which improves clustering and enables us to store an entire database of 53 subjects on-chip. During classification, the circuit applies LBP and LDA to each incoming IR image in real time, and compares the resulting feature vector to each pattern stored in the local database using the Manhattan distance. We implemented the circuit on a Xilinx Artix-7 XC7A100T FPGA and tested it with the UCHThermalFace database, which consists of 28 81 x 150-pixel images of 53 subjects in indoor and outdoor conditions. The circuit achieves a 98.6% hit ratio, trained with 16 images and tested with 12 images of each subject in the database. Using a 100 MHz clock, the circuit classifies 8,230 images per second, and consumes only 309mW.
NASA Astrophysics Data System (ADS)
Kunwar, Samridha
The detection of ultra-high energy cosmic rays is constrained by their flux, requiring detectors with apertures of hundreds or even thousands of square kilometers and close to one hundred percent duty cycle. The sheer scale that would be required of conventional detectors, to acquire sufficient statistics for energy, composition or anisotropy studies, means that new techniques that reduce manpower and financial resources are continually being sought. In this dissertation, the development of a remote sensing technique based observatory known as bistatic radar, which aims to achieve extensive coverage of the Earth's surface, cf. Telescope Array's 700 km2 surface detector, is discussed. Construction of the radar projects transmitter station was completed in the summer of 2013, and remote receiver stations were deployed in June and November of 2014. These stations accomplish radar echo detection using an analog signal chain. Subject to less radio interference, the remote stations add stereoscopic measurement capabilities that theoretically allow unique determination of cosmic ray geometry and core location. An FPGA is used as a distributed data processing node within the project. The FPGA provides triggering logic for data sampled at 200 MSa/s, detecting Cosmic Ray shower echoes chirping at -1 to -10 Megahertz/microsecond (depending on the geometry) for several microseconds. The data acquisition system with low power consumption at a cost that is also comparatively inexpensive is described herein.
In-die mask registration measurement on 28nm-node and beyond
NASA Astrophysics Data System (ADS)
Chen, Shen Hung; Cheng, Yung Feng; Chen, Ming Jui
2013-09-01
As semiconductor go to smaller node, the critical dimension (CD) of process become more and more small. For lithography, RET (Resolution Enhancement Technology) applications can be used for wafer printing of smaller CD/pitch on 28nm node and beyond. SMO (Source Mask Optimization), DPT (Double Patterning Technology) and SADP (Self-Align Double Patterning) can provide lower k1 value for lithography. In another way, image placement error and overlay control also become more and more important for smaller chip size (advanced node). Mask registration (image placement error) and mask overlay are important factors to affect wafer overlay control/performance especially for DPT or SADP. In traditional method, the designed registration marks (cross type, square type) with larger CD were put into scribe-line of mask frame for registration and overlay measurement. However, these patterns are far way from real patterns. It does not show the registration of real pattern directly and is not a convincing method. In this study, the in-die (in-chip) registration measurement is introduced. We extract the dummy patterns that are close to main pattern from post-OPC (Optical Proximity Correction) gds by our desired rule and choose the patterns that distribute over whole mask uniformly. The convergence test shows 100 points measurement has a reliable result.
NASA Astrophysics Data System (ADS)
Zarrabi, Nawid; Clausen, Caterina; Düser, Monika G.; Börsch, Michael
2013-02-01
Conformational changes of individual fluorescently labeled proteins can be followed in solution using a confocal microscope. Two fluorophores attached to selected domains of the protein report fluctuating conformations. Based on Förster resonance energy transfer (FRET) between these fluorophores on a single protein, sequential distance changes between the dyes provide the real time trajectories of protein conformations. However, observation times are limited for freely diffusing biomolecules by Brownian motion through the confocal detection volume. A. E. Cohen and W. E. Moerner have invented and built microfluidic devices with 4 electrodes for an Anti-Brownian Electrokinetic Trap (ABELtrap). Here we present an ABELtrap based on a laser focus pattern generated by a pair of acousto-optical beam deflectors and controlled by a programmable FPGA chip. Fluorescent 20-nm beads in solution were used to mimic freely diffusing large proteins like solubilized FoF1-ATP synthase. The ABELtrap could hold these nanobeads for about 10 seconds at the given position. Thereby, observation times of a single particle were increased by a factor of 1000.
Infrared small target tracking based on SOPC
NASA Astrophysics Data System (ADS)
Hu, Taotao; Fan, Xiang; Zhang, Yu-Jin; Cheng, Zheng-dong; Zhu, Bin
2011-01-01
The paper presents a low cost FPGA based solution for a real-time infrared small target tracking system. A specialized architecture is presented based on a soft RISC processor capable of running kernel based mean shift tracking algorithm. Mean shift tracking algorithm is realized in NIOS II soft-core with SOPC (System on a Programmable Chip) technology. Though mean shift algorithm is widely used for target tracking, the original mean shift algorithm can not be directly used for infrared small target tracking. As infrared small target only has intensity information, so an improved mean shift algorithm is presented in this paper. How to describe target will determine whether target can be tracked by mean shift algorithm. Because color target can be tracked well by mean shift algorithm, imitating color image expression, spatial component and temporal component are advanced to describe target, which forms pseudo-color image. In order to improve the processing speed parallel technology and pipeline technology are taken. Two RAM are taken to stored images separately by ping-pong technology. A FLASH is used to store mass temp data. The experimental results show that infrared small target is tracked stably in complicated background.
NASA Astrophysics Data System (ADS)
Yang, C.; Zheng, W.; Zhang, M.; Yuan, T.; Zhuang, G.; Pan, Y.
2016-06-01
Measurement and control of the plasma in real-time are critical for advanced Tokamak operation. It requires high speed real-time data acquisition and processing. ITER has designed the Fast Plant System Controllers (FPSC) for these purposes. At J-TEXT Tokamak, a real-time data acquisition and processing framework has been designed and implemented using standard ITER FPSC technologies. The main hardware components of this framework are an Industrial Personal Computer (IPC) with a real-time system and FlexRIO devices based on FPGA. With FlexRIO devices, data can be processed by FPGA in real-time before they are passed to the CPU. The software elements are based on a real-time framework which runs under Red Hat Enterprise Linux MRG-R and uses Experimental Physics and Industrial Control System (EPICS) for monitoring and configuring. That makes the framework accord with ITER FPSC standard technology. With this framework, any kind of data acquisition and processing FlexRIO FPGA program can be configured with a FPSC. An application using the framework has been implemented for the polarimeter-interferometer diagnostic system on J-TEXT. The application is able to extract phase-shift information from the intermediate frequency signal produced by the polarimeter-interferometer diagnostic system and calculate plasma density profile in real-time. Different algorithms implementations on the FlexRIO FPGA are compared in the paper.
NASA Astrophysics Data System (ADS)
Deliparaschos, Kyriakos M.; Michail, Konstantinos; Zolotas, Argyrios C.; Tzafestas, Spyros G.
2016-05-01
This work presents a field programmable gate array (FPGA)-based embedded software platform coupled with a software-based plant, forming a hardware-in-the-loop (HIL) that is used to validate a systematic sensor selection framework. The systematic sensor selection framework combines multi-objective optimization, linear-quadratic-Gaussian (LQG)-type control, and the nonlinear model of a maglev suspension. A robustness analysis of the closed-loop is followed (prior to implementation) supporting the appropriateness of the solution under parametric variation. The analysis also shows that quantization is robust under different controller gains. While the LQG controller is implemented on an FPGA, the physical process is realized in a high-level system modeling environment. FPGA technology enables rapid evaluation of the algorithms and test designs under realistic scenarios avoiding heavy time penalty associated with hardware description language (HDL) simulators. The HIL technique facilitates significant speed-up in the required execution time when compared to its software-based counterpart model.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)
Li, Isaac TS; Shum, Warren; Truong, Kevin
2007-01-01
Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).
Li, Isaac T S; Shum, Warren; Truong, Kevin
2007-06-07
To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.
Computing Models for FPGA-Based Accelerators
Herbordt, Martin C.; Gu, Yongfeng; VanCourt, Tom; Model, Josh; Sukhwani, Bharat; Chiu, Matt
2011-01-01
Field-programmable gate arrays are widely considered as accelerators for compute-intensive applications. A critical phase of FPGA application development is finding and mapping to the appropriate computing model. FPGA computing enables models with highly flexible fine-grained parallelism and associative operations such as broadcast and collective response. Several case studies demonstrate the effectiveness of using these computing models in developing FPGA applications for molecular modeling. PMID:21603152
NASA Astrophysics Data System (ADS)
Seljak, A.; Cumming, H. S.; Varner, G.; Vallerga, J.; Raffanti, R.; Virta, V.
2018-02-01
Our collaboration works on the development of a large aperture, high resolution, UV single-photon imaging detector, funded through NASA's Strategic Astrophysics Technology (SAT) program. The detector uses a microchannel plate for charge multiplication, and orthogonal cross strip (XS) anodes for charge readout. Our target is to make an advancement in the technology readiness level (TRL), which enables real scale prototypes to be tested for future NASA missions. The baseline detector has an aperture of 50×50 mm and requires 160 low-noise charge-sensitive channels, in order to extrapolate the incoming photon position with a spatial resolution of about 20 μm FWHM. Technologies involving space flight require highly integrated electronic systems operating at very low power. We have designed two ASICs which enable the construction of such readout system. First, a charge sensitive amplifier (CSAv3) ASIC provides an equivalent noise charge (ENC) of around 600 e-, and a baseline gain of 10 mV/fC. The second, a Giga Sample per Second (GSPS) ASIC, called HalfGRAPH, is a 12-bit analog to digital converter. Its architecture is based on waveform sampling capacitor arrays and has about 8 μs of analog storage memory per channel. Both chips encapsulate 16 measurement channels. Using these chips, a small scale prototype readout system has been constructed on a FPGA Mezzanine Board (FMC), equipped with 32 measurement channels for system evaluation. We describe the construction of HalfGRAPH ASIC, detector's readout system concept and obtained results from the prototype system. As part of the space flight qualification, these chips were irradiated with a Cobalt gamma-ray source, to verify functional operation under ionizing radiation exposure.
Filling the Assurance Gap on Complex Electronics
NASA Technical Reports Server (NTRS)
Plastow, Richard A.
2007-01-01
Many of the methods used to develop software bare a close resemblance to Complex Electronics (CE) development. CE are now programmed to perform tasks that were previously handled by software, such as communication protocols. For example, the James Webb Space Telescope will use Field Programmable Gate Arrays (FPGAs), which can have over a million logic gates, to send telemetry. System-on-chip (SoC) devices, another type of complex electronics, can combine a microprocessor, input and output channels, and sometimes an FPGA for programmability. With this increased intricacy, the possibility of software-like bugs such as incorrect design, logic, and unexpected interactions within the logic is great. Since CE devices are obscuring the hardware/software boundary, mature software methodologies have been proposed, with slight modifications, to develop these devices. By using standardized S/W Engineering methods such as checklists, missing requirements and bugs can be detected earlier in the development cycle, thus creating a development process for CE that can be easily maintained and configurable based on the device used.
Reconfigurable vision system for real-time applications
NASA Astrophysics Data System (ADS)
Torres-Huitzil, Cesar; Arias-Estrada, Miguel
2002-03-01
Recently, a growing community of researchers has used reconfigurable systems to solve computationally intensive problems. Reconfigurability provides optimized processors for systems on chip designs, and makes easy to import technology to a new system through reusable modules. The main objective of this work is the investigation of a reconfigurable computer system targeted for computer vision and real-time applications. The system is intended to circumvent the inherent computational load of most window-based computer vision algorithms. It aims to build a system for such tasks by providing an FPGA-based hardware architecture for task specific vision applications with enough processing power, using the minimum amount of hardware resources as possible, and a mechanism for building systems using this architecture. Regarding the software part of the system, a library of pre-designed and general-purpose modules that implement common window-based computer vision operations is being investigated. A common generic interface is established for these modules in order to define hardware/software components. These components can be interconnected to develop more complex applications, providing an efficient mechanism for transferring image and result data among modules. Some preliminary results are presented and discussed.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Monenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used proportional-integral-derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a DSP (Digital Signal Processor) or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching this goal.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)
2002-01-01
Much has been made of the capabilities of Field Programmable Gate Arrays (FPGA's) in the hardware implementation of fast digital signal processing functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used Proportional-Integral-Derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a Digital Signal Processor (DSP) device or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using DSP devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, Pulse Width Modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacemap. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive-control algorithm approaches. Radiation tolerant FPGA's are a feasible option for reaching this goal.
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.
Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan
2017-07-01
In ultrasound image analysis, the speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" still remains as a challenge for the speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values, when the tissue deformation is large. The major drawback of these methods is the high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs on handling different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphic card (GeForce GTX 580).
Functional Specification and Simulation of a Floating Point Co-Processor for SPUR
1986-08-01
depend on this state will not be stable until the next phase; this leaves the problem of how to control events that must occur on phi 1 of a cycle. The... problems with the structure of the chip description. The worst of these problems is the absence of Slang constructs for coding separate chip component...constructs such as UNK as well. Another related problem was the inability to explicitly declare the size of Slang node values. \\Vhile the correct
A Real-Time System for Lane Detection Based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Xiao, Jing; Li, Shutao; Sun, Bin
2016-12-01
This paper presents a real-time lane detection system including edge detection and improved Hough Transform based lane detection algorithm and its hardware implementation with field programmable gate array (FPGA) and digital signal processor (DSP). Firstly, gradient amplitude and direction information are combined to extract lane edge information. Then, the information is used to determine the region of interest. Finally, the lanes are extracted by using improved Hough Transform. The image processing module of the system consists of FPGA and DSP. Particularly, the algorithms implemented in FPGA are working in pipeline and processing in parallel so that the system can run in real-time. In addition, DSP realizes lane line extraction and display function with an improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
InGaAs/InP SPAD photon-counting module with auto-calibrated gate-width generation and remote control
NASA Astrophysics Data System (ADS)
Tosi, Alberto; Ruggeri, Alessandro; Bahgat Shehata, Andrea; Della Frera, Adriano; Scarcella, Carmelo; Tisa, Simone; Giudice, Andrea
2013-01-01
We present a photon-counting module based on InGaAs/InP SPAD (Single-Photon Avalanche Diode) for detecting single photons up to 1.7 μm. The module exploits a novel architecture for generating and calibrating the gate width, along with other functions (such as module supervision, counting and processing of detected photons, etc.). The gate width, i.e. the time interval when the SPAD is ON, is user-programmable in the range from 500 ps to 1.5 μs, by means of two different delay generation methods implemented with an FPGA (Field-Programmable Gate Array). In order to compensate chip-to-chip delay variation, an auto-calibration circuit picks out a combination of delays in order to match at best the selected gate width. The InGaAs/InP module accepts asynchronous and aperiodic signals and introduces very low timing jitter. Moreover the photon counting module provides other new features like a microprocessor for system supervision, a touch-screen for local user interface, and an Ethernet link for smart remote control. Thanks to the fullyprogrammable and configurable architecture, the overall instrument provides high system flexibility and can easily match all requirements set by many different applications requiring single photon-level sensitivity in the near infrared with very low photon timing jitter.
A digital frequency stabilization system of external cavity diode laser based on LabVIEW FPGA
NASA Astrophysics Data System (ADS)
Liu, Zhuohuan; Hu, Zhaohui; Qi, Lu; Wang, Tao
2015-10-01
Frequency stabilization for external cavity diode laser has played an important role in physics research. Many laser frequency locking solutions have been proposed by researchers. Traditionally, the locking process was accomplished by analog system, which has fast feedback control response speed. However, analog system is susceptible to the effects of environment. In order to improve the automation level and reliability of the frequency stabilization system, we take a grating-feedback external cavity diode laser as the laser source and set up a digital frequency stabilization system based on National Instrument's FPGA (NI FPGA). The system consists of a saturated absorption frequency stabilization of beam path, a differential photoelectric detector, a NI FPGA board and a host computer. Many functions, such as piezoelectric transducer (PZT) sweeping, atomic saturation absorption signal acquisition, signal peak identification, error signal obtaining and laser PZT voltage feedback controlling, are totally completed by LabVIEW FPGA program. Compared with the analog system, the system built by the logic gate circuits, performs stable and reliable. User interface programmed by LabVIEW is friendly. Besides, benefited from the characteristics of reconfiguration, the LabVIEW program is good at transplanting in other NI FPGA boards. Most of all, the system periodically checks the error signal. Once the abnormal error signal is detected, FPGA will restart frequency stabilization process without manual control. Through detecting the fluctuation of error signal of the atomic saturation absorption spectrum line in the frequency locking state, we can infer that the laser frequency stability can reach 1MHz.
Design of an LVDS to USB3.0 adapter and application
NASA Astrophysics Data System (ADS)
Qiu, Xiaohan; Wang, Yu; Zhao, Xin; Chang, Zhen; Zhang, Quan; Tian, Yuze; Zhang, Yunyi; Lin, Fang; Liu, Wenqing
2016-10-01
USB 3.0 specification was published in 2008. With the development of technology, USB 3.0 is becoming popular. LVDS(Low Voltage Differential Signaling) to USB 3.0 Adapter connects the communication port of spectrometer device and the USB 3.0 port of a computer, and converts the output of an LVDS spectrometer device data to USB. In order to adapt to the changing and developing of technology, LVDS to USB3.0 Adapter was designed and developed based on LVDS to USB2.0 Adapter. The CYUSB3014, a new generation of USB bus interface chip produced by Cypress and conforming to USB3.0 communication protocol, utilizes GPIF-II (GPIF, general programmable interface) to connect the FPGA and increases effective communication speed to 2Gbps. Therefore, the adapter, based on USB3.0 technology, is able to connect more spectrometers to single computer and provides technical basis for the development of the higher speed industrial camera. This article describes the design and development process of the LVDS to USB3.0 adapter.
An ultra-low power wireless sensor network for bicycle torque performance measurements.
Gharghan, Sadik K; Nordin, Rosdiadee; Ismail, Mahamod
2015-05-21
In this paper, we propose an energy-efficient transmission technique known as the sleep/wake algorithm for a bicycle torque sensor node. This paper aims to highlight the trade-off between energy efficiency and the communication range between the cyclist and coach. Two experiments were conducted. The first experiment utilised the Zigbee protocol (XBee S2), and the second experiment used the Advanced and Adaptive Network Technology (ANT) protocol based on the Nordic nRF24L01 radio transceiver chip. The current consumption of ANT was measured, simulated and compared with a torque sensor node that uses the XBee S2 protocol. In addition, an analytical model was derived to correlate the sensor node average current consumption with a crank arm cadence. The sensor node achieved 98% power savings for ANT relative to ZigBee when they were compared alone, and the power savings amounted to 30% when all components of the sensor node are considered. The achievable communication range was 65 and 50 m for ZigBee and ANT, respectively, during measurement on an outdoor cycling track (i.e., velodrome). The conclusions indicate that the ANT protocol is more suitable for use in a torque sensor node when power consumption is a crucial demand, whereas the ZigBee protocol is more convenient in ensuring data communication between cyclist and coach.
An Ultra-Low Power Wireless Sensor Network for Bicycle Torque Performance Measurements
Gharghan, Sadik K.; Nordin, Rosdiadee; Ismail, Mahamod
2015-01-01
In this paper, we propose an energy-efficient transmission technique known as the sleep/wake algorithm for a bicycle torque sensor node. This paper aims to highlight the trade-off between energy efficiency and the communication range between the cyclist and coach. Two experiments were conducted. The first experiment utilised the Zigbee protocol (XBee S2), and the second experiment used the Advanced and Adaptive Network Technology (ANT) protocol based on the Nordic nRF24L01 radio transceiver chip. The current consumption of ANT was measured, simulated and compared with a torque sensor node that uses the XBee S2 protocol. In addition, an analytical model was derived to correlate the sensor node average current consumption with a crank arm cadence. The sensor node achieved 98% power savings for ANT relative to ZigBee when they were compared alone, and the power savings amounted to 30% when all components of the sensor node are considered. The achievable communication range was 65 and 50 m for ZigBee and ANT, respectively, during measurement on an outdoor cycling track (i.e., velodrome). The conclusions indicate that the ANT protocol is more suitable for use in a torque sensor node when power consumption is a crucial demand, whereas the ZigBee protocol is more convenient in ensuring data communication between cyclist and coach. PMID:26007728
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based onmore » a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, defined by where Vin is the primary-side DC bus voltage, Vout is the secondary-side DC bus voltage, and n is the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, and user-settable switching frequencies and synchronized control loop update rates of tens of kHz.« less
ERIC Educational Resources Information Center
Zumel, P.; Fernandez, C.; Sanz, M.; Lazaro, A.; Barrado, A.
2011-01-01
In this paper, a short introductory course to introduce field-programmable gate array (FPGA)-based digital control of dc/dc switching power converters is presented. Digital control based on specific hardware has been at the leading edge of low-medium power dc/dc switching converters in recent years. Besides industry's interest in this topic, from…
NASA Astrophysics Data System (ADS)
Qiu, Mo; Yu, Simin; Wen, Yuqiong; Lü, Jinhu; He, Jianbin; Lin, Zhuosheng
In this paper, a novel design methodology and its FPGA hardware implementation for a universal chaotic signal generator is proposed via the Verilog HDL fixed-point algorithm and state machine control. According to continuous-time or discrete-time chaotic equations, a Verilog HDL fixed-point algorithm and its corresponding digital system are first designed. In the FPGA hardware platform, each operation step of Verilog HDL fixed-point algorithm is then controlled by a state machine. The generality of this method is that, for any given chaotic equation, it can be decomposed into four basic operation procedures, i.e. nonlinear function calculation, iterative sequence operation, iterative values right shifting and ceiling, and chaotic iterative sequences output, each of which corresponds to only a state via state machine control. Compared with the Verilog HDL floating-point algorithm, the Verilog HDL fixed-point algorithm can save the FPGA hardware resources and improve the operation efficiency. FPGA-based hardware experimental results validate the feasibility and reliability of the proposed approach.
Design of embedded endoscopic ultrasonic imaging system
NASA Astrophysics Data System (ADS)
Li, Ming; Zhou, Hao; Wen, Shijie; Chen, Xiodong; Yu, Daoyin
2008-12-01
Endoscopic ultrasonic imaging system is an important component in the endoscopic ultrasonography system (EUS). Through the ultrasonic probe, the characteristics of the fault histology features of digestive organs is detected by EUS, and then received by the reception circuit which making up of amplifying, gain compensation, filtering and A/D converter circuit, in the form of ultrasonic echo. Endoscopic ultrasonic imaging system is the back-end processing system of the EUS, with the function of receiving digital ultrasonic echo modulated by the digestive tract wall from the reception circuit, acquiring and showing the fault histology features in the form of image and characteristic data after digital signal processing, such as demodulation, etc. Traditional endoscopic ultrasonic imaging systems are mainly based on image acquisition and processing chips, which connecting to personal computer with USB2.0 circuit, with the faults of expensive, complicated structure, poor portability, and difficult to popularize. To against the shortcomings above, this paper presents the methods of digital signal acquisition and processing specially based on embedded technology with the core hardware structure of ARM and FPGA for substituting the traditional design with USB2.0 and personal computer. With built-in FIFO and dual-buffer, FPGA implement the ping-pong operation of data storage, simultaneously transferring the image data into ARM through the EBI bus by DMA function, which is controlled by ARM to carry out the purpose of high-speed transmission. The ARM system is being chosen to implement the responsibility of image display every time DMA transmission over and actualizing system control with the drivers and applications running on the embedded operating system Windows CE, which could provide a stable, safe and reliable running platform for the embedded device software. Profiting from the excellent graphical user interface (GUI) and good performance of Windows CE, we can not only clearly show 511×511 pixels ultrasonic echo images through application program, but also provide a simple and friendly operating interface with mouse and touch screen which is more convenient than the traditional endoscopic ultrasonic imaging system. Including core and peripheral circuits of FPGA and ARM, power network circuit and LCD display circuit, we designed the whole embedded system, achieving the desired purpose by implementing ultrasonic image display properly after the experimental verification, solving the problem of hugeness and complexity of the traditional endoscopic ultrasonic imaging system.
Meeting critical gate linewidth control needs at the 65 nm node
NASA Astrophysics Data System (ADS)
Mahorowala, Arpan; Halle, Scott; Gabor, Allen; Chu, William; Barberet, Alexandra; Samuels, Donald; Abdo, Amr; Tsou, Len; Yan, Wendy; Iseda, Seiji; Patel, Kaushal; Dirahoui, Bachir; Nomura, Asuka; Ahsan, Ishtiaq; Azam, Faisal; Berg, Gary; Brendler, Andrew; Zimmerman, Jeffrey; Faure, Tom
2006-03-01
With the nominal gate length at the 65 nm node being only 35 nm, controlling the critical dimension (CD) in polysilicon to within a few nanometers is essential to achieve a competitive power-to-performance ratio. Gate linewidths must be controlled, not only at the chip level so that the chip performs as the circuit designers and device engineers had intended, but also at the wafer level so that more chips with the optimum power-to-performance ratio are manufactured. Achieving tight across-chip linewidth variation (ACLV) and chip mean variation (CMV) is possible only if the mask-making, lithography, and etching processes are all controlled to very tight specifications. This paper identifies the various ACLV and CMV components, describes their root causes, and discusses a methodology to quantify them. For example, the site-to-site ACLV component is divided into systematic and random sub-components. The systematic component of the variation is attributed in part to pattern density variation across the field, and variation in exposure dose across the slit. The paper demonstrates our team's success in achieving the tight gate CD tolerances required for 65 nm technology. Certain key challenges faced, and methods employed to overcome them are described. For instance, the use of dose-compensation strategies to correct the small but systematic CD variations measured across the wafer, is described. Finally, the impact of immersion lithography on both ACLV and CMV is briefly discussed.
Software Process Assurance for Complex Electronics
NASA Technical Reports Server (NTRS)
Plastow, Richard A.
2007-01-01
Complex Electronics (CE) now perform tasks that were previously handled in software, such as communication protocols. Many methods used to develop software bare a close resemblance to CE development. Field Programmable Gate Arrays (FPGAs) can have over a million logic gates while system-on-chip (SOC) devices can combine a microprocessor, input and output channels, and sometimes an FPGA for programmability. With this increased intricacy, the possibility of software-like bugs such as incorrect design, logic, and unexpected interactions within the logic is great. With CE devices obscuring the hardware/software boundary, we propose that mature software methodologies may be utilized with slight modifications in the development of these devices. Software Process Assurance for Complex Electronics (SPACE) is a research project that used standardized S/W Assurance/Engineering practices to provide an assurance framework for development activities. Tools such as checklists, best practices and techniques were used to detect missing requirements and bugs earlier in the development cycle creating a development process for CE that was more easily maintained, consistent and configurable based on the device used.
Design of remote car anti-theft system based on ZigBee
NASA Astrophysics Data System (ADS)
Fang, Hong; Yan, GangFeng; Li, Hong Lian
2015-12-01
A set of remote car anti-theft system based on ZigBee and GPRS with ARM11 built-in chip S3C6410 as the controller is designed. This system can detect the alarm information of the car with vibration sensor, pyroelectric sensor and infrared sensor. When the sensor detects any alarm signal, the ZigBee node in sleep will be awakened and then directly send the alarm signal to the microcontroller chip S3C6410 in the control room of the parking lot through ZigBee wireless transceiver module. After S3C6410 processes and analyzes the alarm signal, when any two sensors of the three collect the alarm signal, the LCD will display and generate an alarm and meanwhile it will send the alarm signal to the phone of the user in a wireless manner through the form of short message through GPRS module. Thus, the wireless remote monitoring of the system is realized.
Designing Next Generation Massively Multithreaded Architectures for Irregular Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tumeo, Antonino; Secchi, Simone; Villa, Oreste
Irregular applications, such as data mining or graph-based computations, show unpredictable memory/network access patterns and control structures. Massively multi-threaded architectures with large node count, like the Cray XMT, have been shown to address their requirements better than commodity clusters. In this paper we present the approaches that we are currently pursuing to design future generations of these architectures. First, we introduce the Cray XMT and compare it to other multithreaded architectures. We then propose an evolution of the architecture, integrating multiple cores per node and next generation network interconnect. We advocate the use of hardware support for remote memory referencemore » aggregation to optimize network utilization. For this evaluation we developed a highly parallel, custom simulation infrastructure for multi-threaded systems. Our simulator executes unmodified XMT binaries with very large datasets, capturing effects due to contention and hot-spotting, while predicting execution times with greater than 90% accuracy. We also discuss the FPGA prototyping approach that we are employing to study efficient support for irregular applications in next generation manycore processors.« less
An Embedded Sensor Node Microcontroller with Crypto-Processors.
Panić, Goran; Stecklina, Oliver; Stamenković, Zoran
2016-04-27
Wireless sensor network applications range from industrial automation and control, agricultural and environmental protection, to surveillance and medicine. In most applications, data are highly sensitive and must be protected from any type of attack and abuse. Security challenges in wireless sensor networks are mainly defined by the power and computing resources of sensor devices, memory size, quality of radio channels and susceptibility to physical capture. In this article, an embedded sensor node microcontroller designed to support sensor network applications with severe security demands is presented. It features a low power 16-bitprocessor core supported by a number of hardware accelerators designed to perform complex operations required by advanced crypto algorithms. The microcontroller integrates an embedded Flash and an 8-channel 12-bit analog-to-digital converter making it a good solution for low-power sensor nodes. The article discusses the most important security topics in wireless sensor networks and presents the architecture of the proposed hardware solution. Furthermore, it gives details on the chip implementation, verification and hardware evaluation. Finally, the chip power dissipation and performance figures are estimated and analyzed.
An Embedded Sensor Node Microcontroller with Crypto-Processors
Panić, Goran; Stecklina, Oliver; Stamenković, Zoran
2016-01-01
Wireless sensor network applications range from industrial automation and control, agricultural and environmental protection, to surveillance and medicine. In most applications, data are highly sensitive and must be protected from any type of attack and abuse. Security challenges in wireless sensor networks are mainly defined by the power and computing resources of sensor devices, memory size, quality of radio channels and susceptibility to physical capture. In this article, an embedded sensor node microcontroller designed to support sensor network applications with severe security demands is presented. It features a low power 16-bitprocessor core supported by a number of hardware accelerators designed to perform complex operations required by advanced crypto algorithms. The microcontroller integrates an embedded Flash and an 8-channel 12-bit analog-to-digital converter making it a good solution for low-power sensor nodes. The article discusses the most important security topics in wireless sensor networks and presents the architecture of the proposed hardware solution. Furthermore, it gives details on the chip implementation, verification and hardware evaluation. Finally, the chip power dissipation and performance figures are estimated and analyzed. PMID:27128925
HDL Based FPGA Interface Library for Data Acquisition and Multipurpose Real Time Algorithms
NASA Astrophysics Data System (ADS)
Fernandes, Ana M.; Pereira, R. C.; Sousa, J.; Batista, A. J. N.; Combo, A.; Carvalho, B. B.; Correia, C. M. B. A.; Varandas, C. A. F.
2011-08-01
The inherent parallelism of the logic resources, the flexibility in its configuration and the performance at high processing frequencies makes the field programmable gate array (FPGA) the most suitable device to be used both for real time algorithm processing and data transfer in instrumentation modules. Moreover, the reconfigurability of these FPGA based modules enables exploiting different applications on the same module. When using a reconfigurable module for various applications, the availability of a common interface library for easier implementation of the algorithms on the FPGA leads to more efficient development. The FPGA configuration is usually specified in a hardware description language (HDL) or other higher level descriptive language. The critical paths, such as the management of internal hardware clocks that require deep knowledge of the module behavior shall be implemented in HDL to optimize the timing constraints. The common interface library should include these critical paths, freeing the application designer from hardware complexity and able to choose any of the available high-level abstraction languages for the algorithm implementation. With this purpose a modular Verilog code was developed for the Virtex 4 FPGA of the in-house Transient Recorder and Processor (TRP) hardware module, based on the Advanced Telecommunications Computing Architecture (ATCA), with eight channels sampling at up to 400 MSamples/s (MSPS). The TRP was designed to perform real time Pulse Height Analysis (PHA), Pulse Shape Discrimination (PSD) and Pile-Up Rejection (PUR) algorithms at a high count rate (few Mevent/s). A brief description of this modular code is presented and examples of its use as an interface with end user algorithms, including a PHA with PUR, are described.
An FPGA-based heterogeneous image fusion system design method
NASA Astrophysics Data System (ADS)
Song, Le; Lin, Yu-chi; Chen, Yan-hua; Zhao, Mei-rong
2011-08-01
Taking the advantages of FPGA's low cost and compact structure, an FPGA-based heterogeneous image fusion platform is established in this study. Altera's Cyclone IV series FPGA is adopted as the core processor of the platform, and the visible light CCD camera and infrared thermal imager are used as the image-capturing device in order to obtain dualchannel heterogeneous video images. Tailor-made image fusion algorithms such as gray-scale weighted averaging, maximum selection and minimum selection methods are analyzed and compared. VHDL language and the synchronous design method are utilized to perform a reliable RTL-level description. Altera's Quartus II 9.0 software is applied to simulate and implement the algorithm modules. The contrast experiments of various fusion algorithms show that, preferably image quality of the heterogeneous image fusion can be obtained on top of the proposed system. The applied range of the different fusion algorithms is also discussed.
FPGA and USB based control board for quantum random number generator
NASA Astrophysics Data System (ADS)
Wang, Jian; Wan, Xu; Zhang, Hong-Fei; Gao, Yuan; Chen, Teng-Yun; Liang, Hao
2009-09-01
The design and implementation of FPGA-and-USB-based control board for quantum experiments are discussed. The usage of quantum true random number generator, control- logic in FPGA and communication with computer through USB protocol are proposed in this paper. Programmable controlled signal input and output ports are implemented. The error-detections of data frame header and frame length are designed. This board has been used in our decoy-state based quantum key distribution (QKD) system successfully.
Implementation of random contact hole design with CPL mask by using IML technology
NASA Astrophysics Data System (ADS)
Hsu, Michael; Van Den Broeke, Doug; Hsu, Stephen; Chen, J. Fung; Shi, Xuelong; Corcoran, Noel; Yu, Linda
2005-11-01
The contact hole imaging is a very challenge task for the optical lithography process during IC manufacturing. Lots of RETs were proposed to improve the contrast of small opening hole. Scattering Bar (SB) OPC, together with optimized illumination, is no doubt one of the critical enablers for low k1 contact imaging. In this study, an effective model-based SB OPC based on IML technology is implemented for contact layer at 90nm, 65nm, and 45nm nodes. For our full-chip implementation flow, the first step is to determine the critical design area and then to proceed with NA and illumination optimization. Then, we selected the best NA in combination with optimum illumination via a Diffraction Optical Element (DOE). With optimized illumination, it is now possible to construct an interference map for the full-chip mask pattern. Utilizing the interference map, the model-based SB OPC is performed. Next, model OPC can be applied with the presence of SB for the entire chip. It is important to note that, for patterning at k1 near 0.35 or below, it may be necessary to include 3D mask effects with a high NA OPC model. With enhanced DOF by IML and immersion process, the low k1 production worthy contact process is feasible.
High speed FPGA-based Phasemeter for the far-infrared laser interferometers on EAST
NASA Astrophysics Data System (ADS)
Yao, Y.; Liu, H.; Zou, Z.; Li, W.; Lian, H.; Jie, Y.
2017-12-01
The far-infrared laser-based HCN interferometer and POlarimeter/INTerferometer\\break (POINT) system are important diagnostics for plasma density measurement on EAST tokamak. Both HCN and POINT provide high spatial and temporal resolution of electron density measurement and used for plasma density feedback control. The density is calculated by measuring the real-time phase difference between the reference beams and the probe beams. For long-pulse operations on EAST, the calculation of density has to meet the requirements of Real-Time and high precision. In this paper, a Phasemeter for far-infrared laser-based interferometers will be introduced. The FPGA-based Phasemeter leverages fast ADCs to obtain the three-frequency signals from VDI planar-diode Mixers, and realizes digital filters and an FFT algorithm in FPGA to provide real-time, high precision electron density output. Implementation of the Phasemeter will be helpful for the future plasma real-time feedback control in long-pulse discharge.
Netlist Oriented Sensitivity Evaluation (NOSE)
2017-03-01
developing methodologies to assess sensitivities of alternative chip design netlist implementations. The research is somewhat foundational in that such...Netlist-Oriented Sensitivity Evaluation (NOSE) project was to develop methodologies to assess sensitivities of alternative chip design netlist...analysis to devise a methodology for scoring the sensitivity of circuit nodes in a netlist and thus providing the raw data for any meaningful
NASA Astrophysics Data System (ADS)
Chen, Yuan-Ho
2017-05-01
In this work, we propose a counting-weighted calibration method for field-programmable-gate-array (FPGA)-based time-to-digital converter (TDC) to provide non-linearity calibration for use in positron emission tomography (PET) scanners. To deal with the non-linearity in FPGA, we developed a counting-weighted delay line (CWD) to count the delay time of the delay cells in the TDC in order to reduce the differential non-linearity (DNL) values based on code density counts. The performance of the proposed CWD-TDC with regard to linearity far exceeds that of TDC with a traditional tapped delay line (TDL) architecture, without the need for nonlinearity calibration. When implemented in a Xilinx Vertix-5 FPGA device, the proposed CWD-TDC achieved time resolution of 60 ps with integral non-linearity (INL) and DNL of [-0.54, 0.24] and [-0.66, 0.65] least-significant-bit (LSB), respectively. This is a clear indication of the suitability of the proposed FPGA-based CWD-TDC for use in PET scanners.
NASA/BAE Systems SpaceWire Efforts
NASA Technical Reports Server (NTRS)
Rakow, Glenn Parker; Schnurr, Richard G.; Kapcio, Paul
2003-01-01
This paper discusses the state of the NASA and BAE SYSTEMS developments using Spacewire. NASA has developed intellectual property that implements Spacewire in Register Transfer Level VHDL for a Spacewire link and router. This design has been extensively verified using directed tests from the Spacewire Standard and design specification, as well as being randomly tested to flush out hard to find bugs in the code. The high level features of the design will be discussed, including the support for multiple time code masters, which will be useful for the James Webb Space Telescope electrical architecture. This design is now ready to be targeted to FPGA's and ASICs. Target utilization and performance information will be presented for some spaceflight qualified FPGA's and a discussion of the ASIC implementations will be addressed. In particular, the BAE SYSTEMS ASIC will be highlighted which will be implemented in their 0.25 micron rad-hard line. The chip will implement a 4-port router with the ability to tie chips together to make larger routers without external glue logic. This part will have integrated LVDS driver/receivers, include a PLL and include skew control logic. It will be targeted to run at greater than 300 MHz and include the implementation for the proposed Spacewire transport layer. The need to provide a reliable transport mechanism for Spacewire has been identified by both NASA and ESA, who are attempting to define a transport layer standard that utilizes a low overhead, low latency connection oriented approach. The Transport layer needs to be implemented in hardware-to prevent bottlenecks.
Low-power, transparent optical network interface for high bandwidth off-chip interconnects.
Liboiron-Ladouceur, Odile; Wang, Howard; Garg, Ajay S; Bergman, Keren
2009-04-13
The recent emergence of multicore architectures and chip multiprocessors (CMPs) has accelerated the bandwidth requirements in high-performance processors for both on-chip and off-chip interconnects. For next generation computing clusters, the delivery of scalable power efficient off-chip communications to each compute node has emerged as a key bottleneck to realizing the full computational performance of these systems. The power dissipation is dominated by the off-chip interface and the necessity to drive high-speed signals over long distances. We present a scalable photonic network interface approach that fully exploits the bandwidth capacity offered by optical interconnects while offering significant power savings over traditional E/O and O/E approaches. The power-efficient interface optically aggregates electronic serial data streams into a multiple WDM channel packet structure at time-of-flight latencies. We demonstrate a scalable optical network interface with 70% improvement in power efficiency for a complete end-to-end PCI Express data transfer.
Experiences on developing digital down conversion algorithms using Xilinx system generator
NASA Astrophysics Data System (ADS)
Xu, Chengfa; Yuan, Yuan; Zhao, Lizhi
2013-07-01
The Digital Down Conversion (DDC) algorithm is a classical signal processing method which is widely used in radar and communication systems. In this paper, the DDC function is implemented by Xilinx System Generator tool on FPGA. System Generator is an FPGA design tool provided by Xilinx Inc and MathWorks Inc. It is very convenient for programmers to manipulate the design and debug the function, especially for the complex algorithm. Through the developing process of DDC function based on System Generator, the results show that System Generator is a very fast and efficient tool for FPGA design.
Designing a VMEbus FDDI adapter card
NASA Astrophysics Data System (ADS)
Venkataraman, Raman
1992-03-01
This paper presents a system architecture for a VMEbus FDDI adapter card containing a node core, FDDI block, frame buffer memory and system interface unit. Most of the functions of the PHY and MAC layers of FDDI are implemented with National's FDDI chip set and the SMT implementation is simplified with a low cost microcontroller. The factors that influence the system bus bandwidth utilization and FDDI bandwidth utilization are the data path and frame buffer memory architecture. The VRAM based frame buffer memory has two sections - - LLC frame memory and SMT frame memory. Each section with an independent serial access memory (SAM) port provides an independent access after the initial data transfer cycle on the main port and hence, the throughput is maximized on each port of the memory. The SAM port simplifies the system bus master DMA design and the VMEbus interface can be designed with low-cost off-the-shelf interface chips.
Adapting Wave-front Algorithms to Efficiently Utilize Systems with Deep Communication Hierarchies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerbyson, Darren J.; Lang, Michael; Pakin, Scott
2011-09-30
Large-scale systems increasingly exhibit a differential between intra-chip and inter-chip communication performance especially in hybrid systems using accelerators. Processorcores on the same socket are able to communicate at lower latencies, and with higher bandwidths, than cores on different sockets either within the same node or between nodes. A key challenge is to efficiently use this communication hierarchy and hence optimize performance. We consider here the class of applications that contains wavefront processing. In these applications data can only be processed after their upstream neighbors have been processed. Similar dependencies result between processors in which communication is required to pass boundarymore » data downstream and whose cost is typically impacted by the slowest communication channel in use. In this work we develop a novel hierarchical wave-front approach that reduces the use of slower communications in the hierarchy but at the cost of additional steps in the parallel computation and higher use of on-chip communications. This tradeoff is explored using a performance model. An implementation using the Reverse-acceleration programming model on the petascale Roadrunner system demonstrates a 27% performance improvement at full system-scale on a kernel application. The approach is generally applicable to large-scale multi-core and accelerated systems where a differential in system communication performance exists.« less
Adapting wave-front algorithms to efficiently utilize systems with deep communication hierarchies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerbyson, Darren J; Lang, Michael; Pakin, Scott
2009-01-01
Large-scale systems increasingly exhibit a differential between intra-chip and inter-chip communication performance. Processor-cores on the same socket are able to communicate at lower latencies, and with higher bandwidths, than cores on different sockets either within the same node or between nodes. A key challenge is to efficiently use this communication hierarchy and hence optimize performance. We consider here the class of applications that contain wave-front processing. In these applications data can only be processed after their upstream neighbors have been processed. Similar dependencies result between processors in which communication is required to pass boundary data downstream and whose cost ismore » typically impacted by the slowest communication channel in use. In this work we develop a novel hierarchical wave-front approach that reduces the use of slower communications in the hierarchy but at the cost of additional computation and higher use of on-chip communications. This tradeoff is explored using a performance model and an implementation on the Petascale Roadrunner system demonstrates a 27% performance improvement at full system-scale on a kernel application. The approach is generally applicable to large-scale multi-core and accelerated systems where a differential in system communication performance exists.« less
FPGA platform for prototyping and evaluation of neural network automotive applications
NASA Technical Reports Server (NTRS)
Aranki, N.; Tawel, R.
2002-01-01
In this paper we present an FPGA based reconfigurable computing platform for prototyping and evaluation of advanced neural network based applications for control and diagnostics in an automotive sub-systems.
Development of new data acquisition system for COMPASS experiment
NASA Astrophysics Data System (ADS)
Bodlak, M.; Frolov, V.; Jary, V.; Huber, S.; Konorov, I.; Levit, D.; Novy, J.; Salac, R.; Virius, M.
2016-04-01
This paper presents development and recent status of the new data acquisiton system of the COMPASS experiment at CERN with up to 50 kHz trigger rate and 36 kB average event size during 10 second period with beam followed by approximately 40 second period without beam. In the original DAQ, the event building is performed by software deployed on switched computer network, moreover the data readout is based on deprecated PCI technology; the new system replaces the event building network with a custom FPGA-based hardware. The custom cards are introduced and advantages of the FPGA technology for DAQ related tasks are discussed. In this paper, we focus on the software part that is mainly responsible for control and monitoring. The most of the system can run as slow control; only readout process has realtime requirements. The design of the software is built on state machines that are implemented using the Qt framework; communication between remote nodes that form the software architecture is based on the DIM library and IPBus technology. Furthermore, PHP and JS languages are used to maintain system configuration; the MySQL database was selected as storage for both configuration of the system and system messages. The system has been design with maximum throughput of 1500 MB/s and large buffering ability used to spread load on readout computers over longer period of time. Great emphasis is put on data latency, data consistency, and even timing checks which are done at each stage of event assembly. System collects results of these checks which together with special data format allows the software to localize origin of problems in data transmission process. A prototype version of the system has already been developed and tested the new system fulfills all given requirements. It is expected that the full-scale version of the system will be finalized in June 2014 and deployed on September provided that tests with cosmic run succeed.
NASA Technical Reports Server (NTRS)
Howard, J. W.; Kim, H.; Berg, M.; LaBel, K. A.; Stansberry, S.; Friendlich, M.; Irwin, T.
2006-01-01
A viewgraph presentation on the development of a low cost, high speed tester reconfigurable Field Programmable Gata Array (FPGA) is shown. The topics include: 1) Introduction; 2) Objectives; 3) Tester Descriptions; 4) Tester Validations and Demonstrations; 5) Future Work; and 6) Summary.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fernandes, Ana; Pereira, Rita C.; Sousa, Jorge
The Instituto de Plasmas e Fusao Nuclear (IPFN) has developed dedicated re-configurable modules based on field programmable gate array (FPGA) devices for several nuclear fusion machines worldwide. Moreover, new Advanced Telecommunication Computing Architecture (ATCA) based modules developed by IPFN are already included in the ITER catalogue. One of the requirements for re-configurable modules operating in future nuclear environments including ITER is the remote update capability. Accordingly, this work presents an alternative method for FPGA remote programing to be implemented in new ATCA based re-configurable modules. FPGAs are volatile devices and their programming code is usually stored in dedicated flash memoriesmore » for properly configuration during module power-on. The presented method is capable to store new FPGA codes in Serial Peripheral Interface (SPI) flash memories using the PCIexpress (PCIe) network established on the ATCA back-plane, linking data acquisition endpoints and the data switch blades. The method is based on the Xilinx Quick Boot application note, adapted to PCIe protocol and ATCA based modules. (authors)« less
FPGA-Based Reconfigurable Processor for Ultrafast Interlaced Ultrasound and Photoacoustic Imaging
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2016-01-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models. PMID:22828830
FPGA-based reconfigurable processor for ultrafast interlaced ultrasound and photoacoustic imaging.
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2012-07-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models.
A dynamically reconfigurable multi-functional PLL for SRAM-based FPGA in 65nm CMOS technology
NASA Astrophysics Data System (ADS)
Yang, Mingqian; Chen, Lei; Li, Xuewu; Zhang, Yanlong
2018-04-01
Phase-locked loops (PLL) have been widely utilized in FPGA as an important module for clock management. PLL with dynamic reconfiguration capability is always welcomed in FPGA design as it is able to decrease power consumption and simultaneously improve flexibility. In this paper, a multi-functional PLL with dynamic reconfiguration capability for 65nm SRAM-based FPGA is proposed. Firstly, configurable charge pump and loop filter are utilized to optimize the loop bandwidth. Secondly, the PLL incorporates a VCO with dual control voltages to accelerate the adjustment of oscillation frequency. Thirdly, three configurable dividers are presented for flexible frequency synthesis. Lastly, a configuration block with dynamic reconfiguration function is proposed. Simulation results demonstrate that the proposed multi-functional PLL can output clocks with configurable division ratio, phase shift and duty cycle. The PLL can also be dynamically reconfigured without affecting other parts' running or halting the FPGA device.
2006-06-14
Robert Graybill . A Raw hoard for the use of this project was provided by the Computer Architecture Croup at the Massachusetts Institute of Technology...simulator is presented by MIT as being an accurate model of the Raw chip, we have found that it does not accurately model the board. Our comparison...G4 processor, model 7410. with a 32 kbyte level-1 cache on-chip and a 2 Mbyte L2 cache connected through a 250 MH/ bus [12]. Each node has 256 Mbyte
A fast event preprocessor for the Simbol-X Low-Energy Detector
NASA Astrophysics Data System (ADS)
Schanz, T.; Tenzer, C.; Kendziorra, E.; Santangelo, A.
2008-07-01
The Simbol-X1 Low Energy Detector (LED), a 128 × 128 pixel DEPFET array, will be read out very fast (8000 frames/second). This requires a very fast onboard data preprocessing of the raw data. We present an FPGA based Event Preprocessor (EPP) which can fulfill this requirements. The design is developed in the hardware description language VHDL and can be later ported on an ASIC technology. The EPP performs a pixel related offset correction and can apply different energy thresholds to each pixel of the frame. It also provides a line related common-mode correction to reduce noise that is unavoidably caused by the analog readout chip of the DEPFET. An integrated pattern detector can block all invalid pixel patterns. The EPP has an internal pipeline structure and can perform all operation in realtime (< 2 μs per line of 64 pixel) with a base clock frequency of 100 MHz. It is utilizing a fast median-value detection algorithm for common-mode correction and a new pattern scanning algorithm to select only valid events. Both new algorithms were developed during the last year at our institute.
Embedded algorithms within an FPGA-based system to process nonlinear time series data
NASA Astrophysics Data System (ADS)
Jones, Jonathan D.; Pei, Jin-Song; Tull, Monte P.
2008-03-01
This paper presents some preliminary results of an ongoing project. A pattern classification algorithm is being developed and embedded into a Field-Programmable Gate Array (FPGA) and microprocessor-based data processing core in this project. The goal is to enable and optimize the functionality of onboard data processing of nonlinear, nonstationary data for smart wireless sensing in structural health monitoring. Compared with traditional microprocessor-based systems, fast growing FPGA technology offers a more powerful, efficient, and flexible hardware platform including on-site (field-programmable) reconfiguration capability of hardware. An existing nonlinear identification algorithm is used as the baseline in this study. The implementation within a hardware-based system is presented in this paper, detailing the design requirements, validation, tradeoffs, optimization, and challenges in embedding this algorithm. An off-the-shelf high-level abstraction tool along with the Matlab/Simulink environment is utilized to program the FPGA, rather than coding the hardware description language (HDL) manually. The implementation is validated by comparing the simulation results with those from Matlab. In particular, the Hilbert Transform is embedded into the FPGA hardware and applied to the baseline algorithm as the centerpiece in processing nonlinear time histories and extracting instantaneous features of nonstationary dynamic data. The selection of proper numerical methods for the hardware execution of the selected identification algorithm and consideration of the fixed-point representation are elaborated. Other challenges include the issues of the timing in the hardware execution cycle of the design, resource consumption, approximation accuracy, and user flexibility of input data types limited by the simplicity of this preliminary design. Future work includes making an FPGA and microprocessor operate together to embed a further developed algorithm that yields better computational and power efficiency.
V&V Plan for FPGA-based ESF-CCS Using System Engineering Approach.
NASA Astrophysics Data System (ADS)
Maerani, Restu; Mayaka, Joyce; El Akrat, Mohamed; Cheon, Jung Jae
2018-02-01
Instrumentation and Control (I&C) systems play an important role in maintaining the safety of Nuclear Power Plant (NPP) operation. However, most current I&C safety systems are based on Programmable Logic Controller (PLC) hardware, which is difficult to verify and validate, and is susceptible to software common cause failure. Therefore, a plan for the replacement of the PLC-based safety systems, such as the Engineered Safety Feature - Component Control System (ESF-CCS), with Field Programmable Gate Arrays (FPGA) is needed. By using a systems engineering approach, which ensures traceability in every phase of the life cycle, from system requirements, design implementation to verification and validation, the system development is guaranteed to be in line with the regulatory requirements. The Verification process will ensure that the customer and stakeholder’s needs are satisfied in a high quality, trustworthy, cost efficient and schedule compliant manner throughout a system’s entire life cycle. The benefit of the V&V plan is to ensure that the FPGA based ESF-CCS is correctly built, and to ensure that the measurement of performance indicators has positive feedback that “do we do the right thing” during the re-engineering process of the FPGA based ESF-CCS.
Wörgötter, F
1999-10-01
In a stereoscopic system both eyes or cameras have a slightly different view. As a consequence small variations between the projected images exist ("disparities") which are spatially evaluated in order to retrieve depth information. We will show that two related algorithmic versions can be designed which recover disparity. Both approaches are based on the comparison of filter outputs from filtering the left and the right image. The difference of the phase components between left and right filter responses encodes the disparity. One approach uses regular Gabor filters and computes the spatial phase differences in a conventional way as described already in 1988 by Sanger. Novel to this approach, however, is that we formulate it in a way which is fully compatible with neural operations in the visual cortex. The second approach uses the apparently paradoxical similarity between the analysis of visual disparities and the determination of the azimuth of a sound source. Animals determine the direction of the sound from the temporal delay between the left and right ear signals. Similarly, in our second approach we transpose the spatially defined problem of disparity analysis into the temporal domain and utilize two resonators implemented in the form of causal (electronic) filters to determine the disparity as local temporal phase differences between the left and right filter responses. This approach permits video real-time analysis of stereo image sequences (see movies at http://www.neurop.ruhr-uni-bochum.de/Real- Time-Stereo) and a FPGA-based PC-board has been developed which performs stereo-analysis at full PAL resolution in video real-time. An ASIC chip will be available in March 2000.
Remote hardware-reconfigurable robotic camera
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.
2001-10-01
In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.
NASA Astrophysics Data System (ADS)
Wojenski, Andrzej; Kasprowicz, Grzegorz; Pozniak, Krzysztof T.; Romaniuk, Ryszard
2013-10-01
The paper describes a concept of automatic firmware generation for reconfigurable measurement systems, which uses FPGA devices and measurement cards in FMC standard. Following sections are described in details: automatic HDL code generation for FPGA devices, automatic communication interfaces implementation, HDL drivers for measurement cards, automatic serial connection between multiple measurement backplane boards, automatic build of memory map (address space), automatic generated firmware management. Presented solutions are required in many advanced measurement systems, like Beam Position Monitors or GEM detectors. This work is a part of a wider project for automatic firmware generation and management of reconfigurable systems. Solutions presented in this paper are based on previous publication in SPIE.
Floating-Point Units and Algorithms for field-programmable gate arrays
DOE Office of Scientific and Technical Information (OSTI.GOV)
Underwood, Keith D.; Hemmert, K. Scott
2005-11-01
The software that we are attempting to copyright is a package of floating-point unit descriptions and example algorithm implementations using those units for use in FPGAs. The floating point units are best-in-class implementations of add, multiply, divide, and square root floating-point operations. The algorithm implementations are sample (not highly flexible) implementations of FFT, matrix multiply, matrix vector multiply, and dot product. Together, one could think of the collection as an implementation of parts of the BLAS library or something similar to the FFTW packages (without the flexibility) for FPGAs. Results from this work has been published multiple times and wemore » are working on a publication to discuss the techniques we use to implement the floating-point units, For some more background, FPGAS are programmable hardware. "Programs" for this hardware are typically created using a hardware description language (examples include Verilog, VHDL, and JHDL). Our floating-point unit descriptions are written in JHDL, which allows them to include placement constraints that make them highly optimized relative to some other implementations of floating-point units. Many vendors (Nallatech from the UK, SRC Computers in the US) have similar implementations, but our implementations seem to be somewhat higher performance. Our algorithm implementations are written in VHDL and models of the floating-point units are provided in VHDL as well. FPGA "programs" make multiple "calls" (hardware instantiations) to libraries of intellectual property (IP), such as the floating-point unit library described here. These programs are then compiled using a tool called a synthesizer (such as a tool from Synplicity, Inc.). The compiled file is a netlist of gates and flip-flops. This netlist is then mapped to a particular type of FPGA by a mapper and then a place- and-route tool. These tools assign the gates in the netlist to specific locations on the specific type of FPGA chip used and constructs the required routes between them. The result is a "bitstream" that is analogous to a compiled binary. The bitstream is loaded into the FPGA to create a specific hardware configuration.« less
Temperature Tolerant Evolvable Systems Utilizing FPGA Boards and Bias-Controlled Amplifiers
NASA Technical Reports Server (NTRS)
Kumar, Nikhil R.
2005-01-01
Space missions often require radiation and extreme-temperature hardened electronics to survive the harsh environments beyond Earth's atmosphere. Traditional approaches to preserve electronics incorporate shielding, insulation and redundancy at the expense of power and weight. However, a novel way of bypassing these problems is the concept of evolutionary hardware. A reconfigurable device, consisting of several switches interconnected with analog/digital parts, is controlled by an evolutionary processor (EP). When the EP detects degradation in the circuit it sends signals to reconfigure the switches, thus forming a new circuit with the desired output. This concept has been developed since the mid-l990s, but one problem remains-the EP cannot degrade substantially. For this reason, extensive testing at extreme temperatures (-180 to 120 C) has been done on devices found on FPGA boards (taking the role of the EP), such as the Analog to Digital and the Digital to Analog Converter. The EP is used in conjunction with a bias-controlled amplifier and a new prototype relay board, which is interconnected with 6 G4-FETs, a tri-input transistor-like element developed at JPL. The greatest improvements to be made lie in the reconfigurable device, so future design and testing of the G4-FET chip is required.
Multi-channel imaging cytometry with a single detector
NASA Astrophysics Data System (ADS)
Locknar, Sarah; Barton, John; Entwistle, Mark; Carver, Gary; Johnson, Robert
2018-02-01
Multi-channel microscopy and multi-channel flow cytometry generate high bit data streams. Multiple channels (both spectral and spatial) are important in diagnosing diseased tissue and identifying individual cells. Omega Optical has developed techniques for mapping multiple channels into the time domain for detection by a single high gain, high bandwidth detector. This approach is based on pulsed laser excitation and a serial array of optical fibers coated with spectral reflectors such that up to 15 wavelength bins are sequentially detected by a single-element detector within 2.5 μs. Our multichannel microscopy system uses firmware running on dedicated DSP and FPGA chips to synchronize the laser, scanning mirrors, and sampling clock. The signals are digitized by an NI board into 14 bits at 60MHz - allowing for 232 by 174 pixel fields in up to 15 channels with 10x over sampling. Our multi-channel imaging cytometry design adds channels for forward scattering and back scattering to the fluorescence spectral channels. All channels are detected within the 2.5 μs - which is compatible with fast cytometry. Going forward, we plan to digitize at 16 bits with an A-toD chip attached to a custom board. Processing these digital signals in custom firmware would allow an on-board graphics processing unit to display imaging flow cytometry data over configurable scanning line lengths. The scatter channels can be used to trigger data buffering when a cell is present in the beam. This approach enables a low cost mechanically robust imaging cytometer.
ICE-Based Custom Full-Mesh Network for the CHIME High Bandwidth Radio Astronomy Correlator
NASA Astrophysics Data System (ADS)
Bandura, K.; Cliche, J. F.; Dobbs, M. A.; Gilbert, A. J.; Ittah, D.; Mena Parra, J.; Smecher, G.
2016-03-01
New generation radio interferometers encode signals from thousands of antenna feeds across large bandwidth. Channelizing and correlating this data requires networking capabilities that can handle unprecedented data rates with reasonable cost. The Canadian Hydrogen Intensity Mapping Experiment (CHIME) correlator processes 8-bits from N=2,048 digitizer inputs across 400MHz of bandwidth. Measured in N2× bandwidth, it is the largest radio correlator that is currently commissioning. Its digital back-end must exchange and reorganize the 6.6terabit/s produced by its 128 digitizing and channelizing nodes, and feed it to the 256 graphics processing unit (GPU) node spatial correlator in a way that each node obtains data from all digitizer inputs but across a small fraction of the bandwidth (i.e. ‘corner-turn’). In order to maximize performance and reliability of the corner-turn system while minimizing cost, a custom networking solution has been implemented. The system makes use of Field Programmable Gate Array (FPGA) transceivers to implement direct, passive copper, full-mesh, high speed serial connections between sixteen circuit boards in a crate, to exchange data between crates, and to offload the data to a cluster of 256 GPU nodes using standard 10Gbit/s Ethernet links. The GPU nodes complete the corner-turn by combining data from all crates and then computing visibilities. Eye diagrams and frame error counters confirm error-free operation of the corner-turn network in both the currently operating CHIME Pathfinder telescope (a prototype for the full CHIME telescope) and a representative fraction of the full CHIME hardware providing an end-to-end system validation. An analysis of an equivalent corner-turn system built with Ethernet switches instead of custom passive data links is provided.
NASA Astrophysics Data System (ADS)
Passas, Georgios; Freear, Steven; Fawcett, Darren
2010-08-01
Orthogonal frequency division multiplexing (OFDM)-based feed-forward space-time trellis code (FFSTTC) encoders can be synthesised as very high speed integrated circuit hardware description language (VHDL) designs. Evaluation of their FPGA implementation can lead to conclusions that help a designer to decide the optimum implementation, given the encoder structural parameters. VLSI architectures based on 1-bit multipliers and look-up tables (LUTs) are compared in terms of FPGA slices and block RAMs (area), as well as in terms of minimum clock period (speed). Area and speed graphs versus encoder memory order are provided for quadrature phase shift keying (QPSK) and 8 phase shift keying (8-PSK) modulation and two transmit antennas, revealing best implementation under these conditions. The effect of number of modulation bits and transmit antennas on the encoder implementation complexity is also investigated.
Design of CMOS imaging system based on FPGA
NASA Astrophysics Data System (ADS)
Hu, Bo; Chen, Xiaolai
2017-10-01
In order to meet the needs of engineering applications for high dynamic range CMOS camera under the rolling shutter mode, a complete imaging system is designed based on the CMOS imaging sensor NSC1105. The paper decides CMOS+ADC+FPGA+Camera Link as processing architecture and introduces the design and implementation of the hardware system. As for camera software system, which consists of CMOS timing drive module, image acquisition module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The ISE 14.6 emulator ISim is used in the simulation of signals. The imaging experimental results show that the system exhibits a 1280*1024 pixel resolution, has a frame frequency of 25 fps and a dynamic range more than 120dB. The imaging quality of the system satisfies the requirement of the index.
A high-efficiency self-powered wireless sensor node for monitoring concerning vibratory events
NASA Astrophysics Data System (ADS)
Xu, Dacheng; Li, Suiqiong; Li, Mengyang; Xie, Danpeng; Dong, Chuan; Li, Xinxin
2017-09-01
This paper presents a self-powered wireless alarming sensor node (SWASN), which was designed to monitor the occurrence of concerning vibratory events. The major components of the sensor node include a vibration-threshold-triggered energy harvester (VTTEH) that powers the sensor node, a dual threshold voltage control circuit (DTVCC) for power management and a radio frequency (RF) signal transmitting module. The VTTEH generates significant electric energy only when the input vibration reaches certain amplitude. Thus, the VTTEH serves as both the power source and the vibration-event-sensing element for the sensor node. The DTVCC was specifically designed to utilize the limited power supply from the VTTEH to operate the sensor node. Constructed with only voltage detectors and MOSFETs, the DTVCC achieved low power consumption, which was 65% lower compared with the power management circuit designed in our previous work. Meanwhile, a RF transmit circuit was constructed based on the commercially available CC1110-F32 wireless transceiver chip and a compact planar antenna was designed to improve the signal transmission distance. The sensor node was fabricated and was characterized both in the laboratory and in the field. Experimental results showed that the SWASN could automatically send out alarming signals when the simulated concerning event occurred. The waiting time between two consecutive transmission periods is less than 125 s and the transmission distance can reach 1.31 km. The SWASN will have broad applications in field surveillances.
Real-time orthorectification by FPGA-based hardware acceleration
NASA Astrophysics Data System (ADS)
Kuo, David; Gordon, Don
2010-10-01
Orthorectification that corrects the perspective distortion of remote sensing imagery, providing accurate geolocation and ease of correlation to other images is a valuable first-step in image processing for information extraction. However, the large amount of metadata and the floating-point matrix transformations required to operate on each pixel make this a computation and I/O (Input/Output) intensive process. As result much imagery is either left unprocessed or loses timesensitive value in the long processing cycle. However, the computation on each pixel can be reduced substantially by using computational results of the neighboring pixels and accelerated by special pipelined hardware architecture in one to two orders of magnitude. A specialized coprocessor that is implemented inside an FPGA (Field Programmable Gate Array) chip and surrounded by vendorsupported hardware IP (Intellectual Property) shares the computation workload with CPU through PCI-Express interface. The ultimate speed of one pixel per clock (125 MHz) is achieved by the pipelined systolic array architecture. The optimal partition between software and hardware, the timing profile among image I/O and computation, and the highly automated GUI (Graphical User Interface) that fully exploits this speed increase to maximize overall image production throughput will also be discussed. The software that runs on a workstation with the acceleration hardware orthorectifies 16 Megapixels per second, which is 16 times faster than without the hardware. It turns the production time from months to days. A real-life successful story of an imaging satellite company that adopted such workstations for their orthorectified imagery production will be presented. The potential candidacy of the image processing computation that can be accelerated more efficiently by the same approach will also be analyzed.
NASA Astrophysics Data System (ADS)
Masuda, Nobuyuki; Sugie, Takashige; Ito, Tomoyoshi; Tanaka, Shinjiro; Hamada, Yu; Satake, Shin-ichi; Kunugi, Tomoaki; Sato, Kazuho
2010-12-01
We have designed a PC cluster system with special purpose computer boards for visualization of fluid flow using digital holographic particle tracking velocimetry (DHPTV). In this board, there is a Field Programmable Gate Array (FPGA) chip in which is installed a pipeline for calculating the intensity of an object from a hologram by fast Fourier transform (FFT). This cluster system can create 1024 reconstructed images from a 1024×1024-grid hologram in 0.77 s. It is expected that this system will contribute to the analysis of fluid flow using DHPTV.
Advanced Technology for Ultra-Low Power System-on-Chip (SoC)
2017-06-01
design at IDS=1mA/μm compared with that in experimental 14nm-node FinFET. The redistributed electric field along the channel length direction can... design can result in more uniform electron density and electron velocity distributions compared to a homojunction device. This uniform electron... design at IDS=1mA/μm compared with that in experimental 14nm-node FinFET. 14 Approved for public release, distribution is unlimited. 0 5 10 15 20
An event-based architecture for solving constraint satisfaction problems
Mostafa, Hesham; Müller, Lorenz K.; Indiveri, Giacomo
2015-01-01
Constraint satisfaction problems are ubiquitous in many domains. They are typically solved using conventional digital computing architectures that do not reflect the distributed nature of many of these problems, and are thus ill-suited for solving them. Here we present a parallel analogue/digital hardware architecture specifically designed to solve such problems. We cast constraint satisfaction problems as networks of stereotyped nodes that communicate using digital pulses, or events. Each node contains an oscillator implemented using analogue circuits. The non-repeating phase relations among the oscillators drive the exploration of the solution space. We show that this hardware architecture can yield state-of-the-art performance on random SAT problems under reasonable assumptions on the implementation. We present measurements from a prototype electronic chip to demonstrate that a physical implementation of the proposed architecture is robust to practical non-idealities and to validate the theory proposed. PMID:26642827
SCA security verification on wireless sensor network node
NASA Astrophysics Data System (ADS)
He, Wei; Pizarro, Carlos; de la Torre, Eduardo; Portilla, Jorge; Riesgo, Teresa
2011-05-01
Side Channel Attack (SCA) differs from traditional mathematic attacks. It gets around of the exhaustive mathematic calculation and precisely pin to certain points in the cryptographic algorithm to reveal confidential information from the running crypto-devices. Since the introduction of SCA by Paul Kocher et al [1], it has been considered to be one of the most critical threats to the resource restricted but security demanding applications, such as wireless sensor networks. In this paper, we focus our work on the SCA-concerned security verification on WSN (wireless sensor network). A detailed setup of the platform and an analysis of the results of DPA (power attack) and EMA (electromagnetic attack) is presented. The setup follows the way of low-cost setup to make effective SCAs. Meanwhile, surveying the weaknesses of WSNs in resisting SCA attacks, especially for the EM attack. Finally, SCA-Prevention suggestions based on Differential Security Strategy for the FPGA hardware implementation in WSN will be given, helping to get an improved compromise between security and cost.
Facilitating preemptive hardware system design using partial reconfiguration techniques.
Dondo Gazzano, Julio; Rincon, Fernando; Vaderrama, Carlos; Villanueva, Felix; Caba, Julian; Lopez, Juan Carlos
2014-01-01
In FPGA-based control system design, partial reconfiguration is especially well suited to implement preemptive systems. In real-time systems, the deadline for critical task can compel the preemption of noncritical one. Besides, an asynchronous event can demand immediate attention and, then, force launching a reconfiguration process for high-priority task implementation. If the asynchronous event is previously scheduled, an explicit activation of the reconfiguration process is performed. If the event cannot be previously programmed, such as in dynamically scheduled systems, an implicit activation to the reconfiguration process is demanded. This paper provides a hardware-based approach to explicit and implicit activation of the partial reconfiguration process in dynamically reconfigurable SoCs and includes all the necessary tasks to cope with this issue. Furthermore, the reconfiguration service introduced in this work allows remote invocation of the reconfiguration process and then the remote integration of off-chip components. A model that offers component location transparency is also presented to enhance and facilitate system integration.
Facilitating Preemptive Hardware System Design Using Partial Reconfiguration Techniques
Rincon, Fernando; Vaderrama, Carlos; Villanueva, Felix; Caba, Julian; Lopez, Juan Carlos
2014-01-01
In FPGA-based control system design, partial reconfiguration is especially well suited to implement preemptive systems. In real-time systems, the deadline for critical task can compel the preemption of noncritical one. Besides, an asynchronous event can demand immediate attention and, then, force launching a reconfiguration process for high-priority task implementation. If the asynchronous event is previously scheduled, an explicit activation of the reconfiguration process is performed. If the event cannot be previously programmed, such as in dynamically scheduled systems, an implicit activation to the reconfiguration process is demanded. This paper provides a hardware-based approach to explicit and implicit activation of the partial reconfiguration process in dynamically reconfigurable SoCs and includes all the necessary tasks to cope with this issue. Furthermore, the reconfiguration service introduced in this work allows remote invocation of the reconfiguration process and then the remote integration of off-chip components. A model that offers component location transparency is also presented to enhance and facilitate system integration. PMID:24672292
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Seyong; Kim, Jungwon; Vetter, Jeffrey S
This paper presents a directive-based, high-level programming framework for high-performance reconfigurable computing. It takes a standard, portable OpenACC C program as input and generates a hardware configuration file for execution on FPGAs. We implemented this prototype system using our open-source OpenARC compiler; it performs source-to-source translation and optimization of the input OpenACC program into an OpenCL code, which is further compiled into a FPGA program by the backend Altera Offline OpenCL compiler. Internally, the design of OpenARC uses a high- level intermediate representation that separates concerns of program representation from underlying architectures, which facilitates portability of OpenARC. In fact, thismore » design allowed us to create the OpenACC-to-FPGA translation framework with minimal extensions to our existing system. In addition, we show that our proposed FPGA-specific compiler optimizations and novel OpenACC pragma extensions assist the compiler in generating more efficient FPGA hardware configuration files. Our empirical evaluation on an Altera Stratix V FPGA with eight OpenACC benchmarks demonstrate the benefits of our strategy. To demonstrate the portability of OpenARC, we show results for the same benchmarks executing on other heterogeneous platforms, including NVIDIA GPUs, AMD GPUs, and Intel Xeon Phis. This initial evidence helps support the goal of using a directive-based, high-level programming strategy for performance portability across heterogeneous HPC architectures.« less
A Test Methodology for Determining Space-Readiness of Xilinx SRAM-Based FPGA Designs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather M; Graham, Paul S; Morgan, Keith S
2008-01-01
Using reconfigurable, static random-access memory (SRAM) based field-programmable gate arrays (FPGAs) for space-based computation has been an exciting area of research for the past decade. Since both the circuit and the circuit's state is stored in radiation-tolerant memory, both could be alterd by the harsh space radiation environment. Both the circuit and the circuit's state can be prote cted by triple-moduler redundancy (TMR), but applying TMR to FPGA user designs is often an error-prone process. Faulty application of TMR could cause the FPGA user circuit to output incorrect data. This paper will describe a three-tiered methodology for testing FPGA usermore » designs for space-readiness. We will describe the standard approach to testing FPGA user designs using a particle accelerator, as well as two methods using fault injection and a modeling tool. While accelerator testing is the current 'gold standard' for pre-launch testing, we believe the use of fault injection and modeling tools allows for easy, cheap and uniform access for discovering errors early in the design process.« less
An embedded laser marking controller based on ARM and FPGA processors.
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of the laser information processing technology. The existing laser marking machine based on PC and WINDOWS operating system, are large and inconvenient to move. Still, it cannot work outdoors or in other harsh environments. In order to compensate for the above mentioned disadvantages, this paper proposed an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller controls the galvanometers synchronously and could achieve precise marking.
NASA Astrophysics Data System (ADS)
Zhu, Jun; Zhang, David Wei; Kuo, Chinte; Wang, Qing; Wei, Fang; Zhang, Chenming; Chen, Han; He, Daquan; Hsu, Stephen D.
2017-07-01
As technology node shrinks, aggressive design rules for contact and other back end of line (BEOL) layers continue to drive the need for more effective full chip patterning optimization. Resist top loss is one of the major challenges for 28 nm and below technology nodes, which can lead to post-etch hotspots that are difficult to predict and eventually degrade the process window significantly. To tackle this problem, we used an advanced programmable illuminator (FlexRay) and Tachyon SMO (Source Mask Optimization) platform to make resistaware source optimization possible, and it is proved to greatly improve the imaging contrast, enhance focus and exposure latitude, and minimize resist top loss thus improving the yield.
Design of a real-time system of moving ship tracking on-board based on FPGA in remote sensing images
NASA Astrophysics Data System (ADS)
Yang, Tie-jun; Zhang, Shen; Zhou, Guo-qing; Jiang, Chuan-xian
2015-12-01
With the broad attention of countries in the areas of sea transportation and trade safety, the requirements of efficiency and accuracy of moving ship tracking are becoming higher. Therefore, a systematic design of moving ship tracking onboard based on FPGA is proposed, which uses the Adaptive Inter Frame Difference (AIFD) method to track a ship with different speed. For the Frame Difference method (FD) is simple but the amount of computation is very large, it is suitable for the use of FPGA to implement in parallel. But Frame Intervals (FIs) of the traditional FD method are fixed, and in remote sensing images, a ship looks very small (depicted by only dozens of pixels) and moves slowly. By applying invariant FIs, the accuracy of FD for moving ship tracking is not satisfactory and the calculation is highly redundant. So we use the adaptation of FD based on adaptive extraction of key frames for moving ship tracking. A FPGA development board of Xilinx Kintex-7 series is used for simulation. The experiments show that compared with the traditional FD method, the proposed one can achieve higher accuracy of moving ship tracking, and can meet the requirement of real-time tracking in high image resolution.
Real-time distortion correction for visual inspection systems based on FPGA
NASA Astrophysics Data System (ADS)
Liang, Danhua; Zhang, Zhaoxia; Chen, Xiaodong; Yu, Daoyin
2008-03-01
Visual inspection is a kind of new technology based on the research of computer vision, which focuses on the measurement of the object's geometry and location. It can be widely used in online measurement, and other real-time measurement process. Because of the defects of the traditional visual inspection, a new visual detection mode -all-digital intelligent acquisition and transmission is presented. The image processing, including filtering, image compression, binarization, edge detection and distortion correction, can be completed in the programmable devices -FPGA. As the wide-field angle lens is adopted in the system, the output images have serious distortion. Limited by the calculating speed of computer, software can only correct the distortion of static images but not the distortion of dynamic images. To reach the real-time need, we design a distortion correction system based on FPGA. The method of hardware distortion correction is that the spatial correction data are calculated first under software circumstance, then converted into the address of hardware storage and stored in the hardware look-up table, through which data can be read out to correct gray level. The major benefit using FPGA is that the same circuit can be used for other circularly symmetric wide-angle lenses without being modified.
Reconfigurable Processing Module
NASA Technical Reports Server (NTRS)
Somervill, Kevin; Hodson, Robert; Jones, Robert; Williams, John
2005-01-01
To accommodate a wide spectrum of applications and technologies, NASA s Exploration System's Missions Directorate has called for reconfigurable and modular technologies to support future missions to the moon and Mars. In response, Langley Research Center is leading a program entitled Reconfigurable Scaleable Computing (RSC) that is centered on the development of FPGA-based computing resources in a stackable form factor. This paper details the architecture and implementation of the Reconfigurable Processing Module (RPM), which is the key element of the RSC system. The RPM is an FPGA-based, space-qualified printed circuit assembly leveraging terrestrial/commercial design standards into the space applications domain. The form factor is similar to, and backwards compatible with, the PCI-104 standard utilizing only the PCI interface. The size is expanded to accommodate the required functionality while still better than 30% smaller than a 3U CompactPCI(TradeMark)card and without the overhead of the backplane. The architecture is built around two FPGA devices, one hosting PCI and memory interfaces, and another hosting mission application resources; both of which are connected with a high-speed data bus. The PCI interface FPGA provides access via the PCI bus to onboard SDRAM, flash PROM, and the application resources; both configuration management as well as runtime interaction. The reconfigurable FPGA, referred to as the Application FPGA - or simply "the application" - is a radiation-tolerant Xilinx Virtex-4 FX60 hosting custom application specific logic or soft microprocessor IP. The RPM implements various SEE mitigation techniques including TMR, EDAC, and configuration scrubbing of the reconfigurable FPGA. Prototype hardware and formal modeling techniques are used to explore the performability trade space. These models provide a novel way to calculate quality-of-service performance measures while simultaneously considering fault-related behavior due to SEE soft errors.
Secure Authentication for Remote Patient Monitoring with Wireless Medical Sensor Networks †
Hayajneh, Thaier; Mohd, Bassam J; Imran, Muhammad; Almashaqbeh, Ghada; Vasilakos, Athanasios V.
2016-01-01
There is broad consensus that remote health monitoring will benefit all stakeholders in the healthcare system and that it has the potential to save billions of dollars. Among the major concerns that are preventing the patients from widely adopting this technology are data privacy and security. Wireless Medical Sensor Networks (MSNs) are the building blocks for remote health monitoring systems. This paper helps to identify the most challenging security issues in the existing authentication protocols for remote patient monitoring and presents a lightweight public-key-based authentication protocol for MSNs. In MSNs, the nodes are classified into sensors that report measurements about the human body and actuators that receive commands from the medical staff and perform actions. Authenticating these commands is a critical security issue, as any alteration may lead to serious consequences. The proposed protocol is based on the Rabin authentication algorithm, which is modified in this paper to improve its signature signing process, making it suitable for delay-sensitive MSN applications. To prove the efficiency of the Rabin algorithm, we implemented the algorithm with different hardware settings using Tmote Sky motes and also programmed the algorithm on an FPGA to evaluate its design and performance. Furthermore, the proposed protocol is implemented and tested using the MIRACL (Multiprecision Integer and Rational Arithmetic C/C++) library. The results show that secure, direct, instant and authenticated commands can be delivered from the medical staff to the MSN nodes. PMID:27023540
Secure Authentication for Remote Patient Monitoring with Wireless Medical Sensor Networks.
Hayajneh, Thaier; Mohd, Bassam J; Imran, Muhammad; Almashaqbeh, Ghada; Vasilakos, Athanasios V
2016-03-24
There is broad consensus that remote health monitoring will benefit all stakeholders in the healthcare system and that it has the potential to save billions of dollars. Among the major concerns that are preventing the patients from widely adopting this technology are data privacy and security. Wireless Medical Sensor Networks (MSNs) are the building blocks for remote health monitoring systems. This paper helps to identify the most challenging security issues in the existing authentication protocols for remote patient monitoring and presents a lightweight public-key-based authentication protocol for MSNs. In MSNs, the nodes are classified into sensors that report measurements about the human body and actuators that receive commands from the medical staff and perform actions. Authenticating these commands is a critical security issue, as any alteration may lead to serious consequences. The proposed protocol is based on the Rabin authentication algorithm, which is modified in this paper to improve its signature signing process, making it suitable for delay-sensitive MSN applications. To prove the efficiency of the Rabin algorithm, we implemented the algorithm with different hardware settings using Tmote Sky motes and also programmed the algorithm on an FPGA to evaluate its design and performance. Furthermore, the proposed protocol is implemented and tested using the MIRACL (Multiprecision Integer and Rational Arithmetic C/C++) library. The results show that secure, direct, instant and authenticated commands can be delivered from the medical staff to the MSN nodes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobrek, Miljko; Albright, Austin P
This paper presents FPGA implementation of the Reed-Solomon decoder for use in IEEE 802.16 WiMAX systems. The decoder is based on RS(255,239) code, and is additionally shortened and punctured according to the WiMAX specifications. Simulink model based on Sysgen library of Xilinx blocks was used for simulation and hardware implementation. At the end, simulation results and hardware implementation performances are presented.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Ormsby, John (Technical Monitor)
2002-01-01
Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing (DSP) functions. Such capability also makes and FPGA a suitable platform for the digital implementation of closed loop controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance in a compact form-factor. Other researchers have presented the notion that a second order digital filter with proportional-integral-derivative (PID) control functionality can be implemented in an FPGA. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSF) devices. Our goal is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. Meeting our goals requires alternative compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching these goals.
Spacecube: A Family of Reconfigurable Hybrid On-Board Science Data Processors
NASA Technical Reports Server (NTRS)
Flatley, Thomas P.
2015-01-01
SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, eventfeature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected eventsfeatures in the instrument data stream.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms
NASA Astrophysics Data System (ADS)
Du, Hongtao; Qi, Hairong; Szu, Harold H.
2005-03-01
Recent advances in intelligence technology have boosted the development of micro- Unmanned Air Vehicles (UAVs) including Sliver Fox, Shadow, and Scan Eagle for various surveillance and reconnaissance applications. These affordable and reusable devices have to fit a series of size, weight, and power constraints. Cameras used on such micro-UAVs are therefore mounted directly at a fixed angle without any motion-compensated gimbals. This mounting scheme has resulted in the so-called jitter effect in which jitter is defined as sub-pixel or small amplitude vibrations. The jitter blur caused by the jitter effect needs to be corrected before any other processing algorithms can be practically applied. Jitter restoration has been solved by various optimization techniques, including Wiener approximation, maximum a-posteriori probability (MAP), etc. However, these algorithms normally assume a spatial-invariant blur model that is not the case with jitter blur. Szu et al. developed a smart real-time algorithm based on auto-regression (AR) with its natural generalization of unsupervised artificial neural network (ANN) learning to achieve restoration accuracy at the sub-pixel level. This algorithm resembles the capability of the human visual system, in which an agreement between the pair of eyes indicates "signal", otherwise, the jitter noise. Using this non-statistical method, for each single pixel, a deterministic blind sources separation (BSS) process can then be carried out independently based on a deterministic minimum of the Helmholtz free energy with a generalization of Shannon's information theory applied to open dynamic systems. From a hardware implementation point of view, the process of jitter restoration of an image using Szu's algorithm can be optimized by pixel-based parallelization. In our previous work, a parallelly structured independent component analysis (ICA) algorithm has been implemented on both Field Programmable Gate Array (FPGA) and Application-Specific Integrated Circuit (ASIC) using standard-height cells. ICA is an algorithm that can solve BSS problems by carrying out the all-order statistical, decorrelation-based transforms, in which an assumption that neighborhood pixels share the same but unknown mixing matrix A is made. In this paper, we continue our investigation on the design challenges of firmware approaches to smart algorithms. We think two levels of parallelization can be explored, including pixel-based parallelization and the parallelization of the restoration algorithm performed at each pixel. This paper focuses on the latter and we use ICA as an example to explain the design and implementation methods. It is well known that the capacity constraints of single FPGA have limited the implementation of many complex algorithms including ICA. Using the reconfigurability of FPGA, we show, in this paper, how to manipulate the FPGA-based system to provide extra computing power for the parallelized ICA algorithm with limited FPGA resources. The synthesis aiming at the pilchard re-configurable FPGA platform is reported. The pilchard board is embedded with single Xilinx VIRTEX 1000E FPGA and transfers data directly to CPU on the 64-bit memory bus at the maximum frequency of 133MHz. Both the feasibility performance evaluations and experimental results validate the effectiveness and practicality of this synthesis, which can be extended to the spatial-variant jitter restoration for micro-UAV deployment.
A real-time n/γ digital pulse shape discriminator based on FPGA.
Li, Shiping; Xu, Xiufeng; Cao, Hongrui; Yuan, Guoliang; Yang, Qingwei; Yin, Zejie
2013-02-01
A FPGA-based real-time digital pulse shape discriminator has been employed to distinguish between neutrons (n) and gammas (γ) in the Neutron Flux Monitor (NFM) for International Thermonuclear Experimental Reactor (ITER). The discriminator takes advantages of the Field Programmable Gate Array (FPGA) parallel and pipeline process capabilities to carry out the real-time sifting of neutrons in n/γ mixed radiation fields, and uses the rise time and amplitude inspection techniques simultaneously as the discrimination algorithm to observe good n/γ separation. Some experimental results have been presented which show that this discriminator can realize the anticipated goals of NFM perfectly with its excellent discrimination quality and zero dead time. Copyright © 2012 Elsevier Ltd. All rights reserved.
Uranus: a rapid prototyping tool for FPGA embedded computer vision
NASA Astrophysics Data System (ADS)
Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.
2007-01-01
The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Design of area array CCD image acquisition and display system based on FPGA
NASA Astrophysics Data System (ADS)
Li, Lei; Zhang, Ning; Li, Tianting; Pan, Yue; Dai, Yuming
2014-09-01
With the development of science and technology, CCD(Charge-coupled Device) has been widely applied in various fields and plays an important role in the modern sensing system, therefore researching a real-time image acquisition and display plan based on CCD device has great significance. This paper introduces an image data acquisition and display system of area array CCD based on FPGA. Several key technical challenges and problems of the system have also been analyzed and followed solutions put forward .The FPGA works as the core processing unit in the system that controls the integral time sequence .The ICX285AL area array CCD image sensor produced by SONY Corporation has been used in the system. The FPGA works to complete the driver of the area array CCD, then analog front end (AFE) processes the signal of the CCD image, including amplification, filtering, noise elimination, CDS correlation double sampling, etc. AD9945 produced by ADI Corporation to convert analog signal to digital signal. Developed Camera Link high-speed data transmission circuit, and completed the PC-end software design of the image acquisition, and realized the real-time display of images. The result through practical testing indicates that the system in the image acquisition and control is stable and reliable, and the indicators meet the actual project requirements.
Combine Flash-Based FPGA TID and Long-Term Retention Reliabilities Through VT Shift
NASA Astrophysics Data System (ADS)
Wang, Jih-Jong; Rezzak, Nadia; Dsilva, Durwyn; Xue, Fengliang; Samiee, Salim; Singaraju, Pavan; Jia, James; Nguyen, Victor; Hawley, Frank; Hamdy, Esmat
2016-08-01
Reliability test results of data retention and total ionizing dose (TID) in 65 nm Flash-based field programmable gate array (FPGA) are presented. Long-chain inverter design is recommended for reliability evaluation because it is the worst case design for both effects. Based on preliminary test data, both issues are unified and modeled by one natural decay equation. The relative contributions of TID induced threshold-voltage shift and retention mechanisms are evaluated by analyzing test data.
NASA Astrophysics Data System (ADS)
Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery
2016-10-01
This paper describes the implementation of the orientation estimation algorithm in FPGA-based vision system. An approach to estimate an orientation of objects lacking axial symmetry is proposed. Suggested algorithm is intended to estimate orientation of a specific known 3D object based on object 3D model. The proposed orientation estimation algorithm consists of two stages: learning and estimation. Learning stage is devoted to the exploring of studied object. Using 3D model we can gather set of training images by capturing 3D model from viewpoints evenly distributed on a sphere. Sphere points distribution is made by the geosphere principle. Gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage is focusing on matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in FPGA-based vision system was demonstrated.
Design of time interval generator based on hybrid counting method
NASA Astrophysics Data System (ADS)
Yao, Yuan; Wang, Zhaoqi; Lu, Houbing; Chen, Lian; Jin, Ge
2016-10-01
Time Interval Generators (TIGs) are frequently used for the characterizations or timing operations of instruments in particle physics experiments. Though some "off-the-shelf" TIGs can be employed, the necessity of a custom test system or control system makes the TIGs, being implemented in a programmable device desirable. Nowadays, the feasibility of using Field Programmable Gate Arrays (FPGAs) to implement particle physics instrumentation has been validated in the design of Time-to-Digital Converters (TDCs) for precise time measurement. The FPGA-TDC technique is based on the architectures of Tapped Delay Line (TDL), whose delay cells are down to few tens of picosecond. In this case, FPGA-based TIGs with high delay step are preferable allowing the implementation of customized particle physics instrumentations and other utilities on the same FPGA device. A hybrid counting method for designing TIGs with both high resolution and wide range is presented in this paper. The combination of two different counting methods realizing an integratable TIG is described in detail. A specially designed multiplexer for tap selection is emphatically introduced. The special structure of the multiplexer is devised for minimizing the different additional delays caused by the unpredictable routings from different taps to the output. A Kintex-7 FPGA is used for the hybrid counting-based implementation of a TIG, providing a resolution up to 11 ps and an interval range up to 8 s.
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.
Ly, Daniel Le; Chow, Paul
2010-11-01
Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
The upgrade of the H.E.S.S. cameras
NASA Astrophysics Data System (ADS)
Giavitto, Gianluca; Ashton, Terry; Balzer, Arnim; Berge, David; Brun, Francois; Chaminade, Thomas; Delagnes, Eric; Fontaine, Gerard; Füßling, Matthias; Giebels, Berrie; Glicenstein, Jean-Francois; Gräber, Tobias; Hinton, Jim; Jahnke, Albert; Klepser, Stefan; Kossatz, Marko; Kretzschmann, Axel; Lefranc, Valentin; Leich, Holger; Lüdecke, Hartmut; Lypova, Iryna; Manigot, Pascal; Marandon, Vincent; Moulin, Emmanuel; de Naurois, Mathieu; Nayman, Patrick; Ohm, Stefan; Penno, Marek; Ross, Duncan; Salek, David; Schade, Markus; Schwab, Thomas; Simoni, Rachel; Stegmann, Christian; Steppa, Constantin; Thornhill, Julian; Toussnel, Francois
2017-01-01
The High Energy Stereoscopic System (H.E.S.S.) is an array of five imaging atmospheric Cherenkov telescopes (IACT) located in Namibia. In order to assure the continuous operation of H.E.S.S. at its full sensitivity until and possibly beyond the advent of CTA, the older cameras, installed in 2003, are currently undergoing an extensive upgrade. Its goals are reducing the system failure rate, reducing the dead time and improving the overall performance of the array. All camera components have been upgraded, except the mechanical structure and the photo-multiplier tubes (PMTs). Novel technical solutions have been introduced: the upgraded readout electronics is based on the NECTAr analog memory chip; the control of the hardware is carried out by an FPGA coupled to an embedded ARM computer; the control software was re-written from scratch and it is based on modern C++ open source libraries. These hardware and software solutions offer very good performance, robustness and flexibility. The first camera was fielded in July 2015 and has been successfully commissioned; the rest is scheduled to be upgraded in September 2016. The present contribution describes the design, the testing and the performance of the new H.E.S.S. camera and its components.
Web-based DAQ systems: connecting the user and electronics front-ends
NASA Astrophysics Data System (ADS)
Lenzi, Thomas
2016-12-01
Web technologies are quickly evolving and are gaining in computational power and flexibility, allowing for a paradigm shift in the field of Data Acquisition (DAQ) systems design. Modern web browsers offer the possibility to create intricate user interfaces and are able to process and render complex data. Furthermore, new web standards such as WebSockets allow for fast real-time communication between the server and the user with minimal overhead. Those improvements make it possible to move the control and monitoring operations from the back-end servers directly to the user and to the front-end electronics, thus reducing the complexity of the data acquisition chain. Moreover, web-based DAQ systems offer greater flexibility, accessibility, and maintainability on the user side than traditional applications which often lack portability and ease of use. As proof of concept, we implemented a simplified DAQ system on a mid-range Spartan6 Field Programmable Gate Array (FPGA) development board coupled to a digital front-end readout chip. The system is connected to the Internet and can be accessed from any web browser. It is composed of custom code to control the front-end readout and of a dual soft-core Microblaze processor to communicate with the client.
NASA Technical Reports Server (NTRS)
Papale, William; Chullen, Cinda; Campbell, Colin; Conger, Bruce; McMillin, Summer; Jeng, Frank
2014-01-01
Development activities related to the Rapid Cycle Amine (RCA) Carbon Dioxide (CO2) and Humidity control system have progressed to the point of integrating the RCA into an advanced Primary Life Support System (PLSS 2.0) to evaluate the interaction of the RCA among other PLSS components in a ground test environment. The RCA 2.0 assembly (integrated into PLSS 2.0) consists of a valve assembly with commercial actuator motor, a sorbent canister, and a field-programmable gate array (FPGA)-based process node controller. Continued design and development activities for RCA 3.0 have been aimed at optimizing the canister size and incorporating greater fidelity in the valve actuator motor and valve position feedback design. Further, the RCA process node controller is envisioned to incorporate a higher degree of functionality to support a distributed PLSS control architecture. This paper will describe the progression of technology readiness levels of RCA 1.0, 2.0 and 3.0 along with a review of the design and manufacturing successes and challenges for 2.0 and 3.0 units. The anticipated interfaces and interactions with the PLSS 2.0/2.5/3.0 assemblies will also be discussed.
Luo, Qiang; Yan, Zhuangzhi; Gu, Dongxing; Cao, Lei
This paper proposed an image interpolation algorithm based on bilinear interpolation and a color correction algorithm based on polynomial regression on FPGA, which focused on the limited number of imaging pixels and color distortion of the ultra-thin electronic endoscope. Simulation experiment results showed that the proposed algorithm realized the real-time display of 1280 x 720@60Hz HD video, and using the X-rite color checker as standard colors, the average color difference was reduced about 30% comparing with that before color correction.
Use of Commercial FPGA-Based Evaluation Boards for Single-Event Testing of DDR2 and DDR3 SDRAMs
NASA Technical Reports Server (NTRS)
Ladbury, R. L.; Berg, M. D.; Wilcox, E. P.; LaBel, K. A.; Kim, H. S.; Phan, A. M.; Seidleck, C. M.
2013-01-01
We investigate the use of commercial FPGA based evaluation boards for radiation testing DDR2 and DDR3 SDRAMs. We evaluate the resulting data quality and the tradeoffs involved in the use of these boards.
An Embedded Laser Marking Controller Based on ARM and FPGA Processors
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of the laser information processing technology. The existing laser marking machine based on PC and WINDOWS operating system, are large and inconvenient to move. Still, it cannot work outdoors or in other harsh environments. In order to compensate for the above mentioned disadvantages, this paper proposed an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller controls the galvanometers synchronously and could achieve precise marking. PMID:24772028
FPGA-based Upgrade to RITS-6 Control System, Designed with EMP Considerations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harold D. Anderson, John T. Williams
2009-07-01
The existing control system for the RITS-6, a 20-MA 3-MV pulsed-power accelerator located at Sandia National Laboratories, was built as a system of analog switches because the operators needed to be close enough to the machine to hear pulsed-power breakdowns, yet the electromagnetic pulse (EMP) emitted would disable any processor-based solutions. The resulting system requires operators to activate and deactivate a series of 110-V relays manually in a complex order. The machine is sensitive to both the order of operation and the time taken between steps. A mistake in either case would cause a misfire and possible machine damage. Basedmore » on these constraints, a field-programmable gate array (FPGA) was chosen as the core of a proposed upgrade to the control system. An FPGA is a series of logic elements connected during programming. Based on their connections, the elements can mimic primitive logic elements, a process called synthesis. The circuit is static; all paths exist simultaneously and do not depend on a processor. This should make it less sensitive to EMP. By shielding it and using good electromagnetic interference-reduction practices, it should continue to operate well in the electrically noisy environment. The FPGA has two advantages over the existing system. In manual operation mode, the synthesized logic gates keep the operators in sequence. In addition, a clock signal and synthesized countdown circuit provides an automated sequence, with adjustable delays, for quickly executing the time-critical portions of charging and firing. The FPGA is modeled as a set of states, each state being a unique set of values for the output signals. The state is determined by the input signals, and in the automated segment by the value of the synthesized countdown timer, with the default mode placing the system in a safe configuration. Unlike a processor-based system, any system stimulus that results in an abort situation immediately executes a shutdown, with only a tens-of-nanoseconds delay to propagate across the FPGA. This paper discusses the design, installation, and testing of the proposed system upgrade, including failure statistics and modifications to the original design.« less
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor
NASA Astrophysics Data System (ADS)
Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram
2007-09-01
Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the the full capacity and performance of an FPGA-based coprocessor.
Monitoring tools of COMPASS experiment at CERN
NASA Astrophysics Data System (ADS)
Bodlak, M.; Frolov, V.; Huber, S.; Jary, V.; Konorov, I.; Levit, D.; Novy, J.; Salac, R.; Tomsa, J.; Virius, M.
2015-12-01
This paper briefly introduces the data acquisition system of the COMPASS experiment and is mainly focused on the part that is responsible for the monitoring of the nodes in the whole newly developed data acquisition system of this experiment. The COMPASS is a high energy particle experiment with a fixed target located at the SPS of the CERN laboratory in Geneva, Switzerland. The hardware of the data acquisition system has been upgraded to use FPGA cards that are responsible for data multiplexing and event building. The software counterpart of the system includes several processes deployed in heterogenous network environment. There are two processes, namely Message Logger and Message Browser, taking care of monitoring. These tools handle messages generated by nodes in the system. While Message Logger collects and saves messages to the database, the Message Browser serves as a graphical interface over the database containing these messages. For better performance, certain database optimizations have been used. Lastly, results of performance tests are presented.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.
Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir
2013-01-01
DNA sequence alignment is a cardinal process in computational biology but also is much expensive computationally when performing through traditional computational platforms like CPU. Of many off the shelf platforms explored for speeding up the computation process, FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make FPGA as the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimization over previous FPGA implementation that increases the overall speed-up achieved as well as the price incurred by the platform that was optimized. The optimizations are (1) The array of processing elements is made to run on change in input value and not on clock, so eliminating the need for tight clock synchronization, (2) the implementation is unrestrained by the size of the sequences to be aligned, (3) the waiting time required for the sequences to load to FPGA is reduced to the minimum possible and (4) an efficient method is devised to store the output matrix that make possible to save the diagonal elements to be used in next pass, in parallel with the computation of output matrix. Implemented on Spartan3 FPGA, this implementation achieved 20 times performance improvement in terms of CUPS over GPP implementation.
Montaje Experimental de Optica Adaptiva con Tecnología FPGA
NASA Astrophysics Data System (ADS)
Rodriguez Brizuela, F.; Verasay, J. P.; Recabarren, P.
An experimental platform based on FPGA devices, dedicated to implement active and adaptive optic software in HDL has been developed. The devel- oped assembly is the first of a series of works focused on this important area of instrumental astronomy. The exposed development is part of a Final Project of Electronic Engineering of the National University of Cordoba. FULL TEXT IN SPANISH
Wang, Jinlong; Lu, Mai; Hu, Yanwen; Chen, Xiaoqiang; Pan, Qiangqiang
2015-12-01
Neuron is the basic unit of the biological neural system. The Hodgkin-Huxley (HH) model is one of the most realistic neuron models on the electrophysiological characteristic description of neuron. Hardware implementation of neuron could provide new research ideas to clinical treatment of spinal cord injury, bionics and artificial intelligence. Based on the HH model neuron and the DSP Builder technology, in the present study, a single HH model neuron hardware implementation was completed in Field Programmable Gate Array (FPGA). The neuron implemented in FPGA was stimulated by different types of current, the action potential response characteristics were analyzed, and the correlation coefficient between numerical simulation result and hardware implementation result were calculated. The results showed that neuronal action potential response of FPGA was highly consistent with numerical simulation result. This work lays the foundation for hardware implementation of neural network.
Diversification of Processors Based on Redundancy in Instruction Set
NASA Astrophysics Data System (ADS)
Ichikawa, Shuichi; Sawada, Takashi; Hata, Hisashi
By diversifying processor architecture, computer software is expected to be more resistant to plagiarism, analysis, and attacks. This study presents a new method to diversify instruction set architecture (ISA) by utilizing the redundancy in the instruction set. Our method is particularly suited for embedded systems implemented with FPGA technology, and realizes a genuine instruction set randomization, which has not been provided by the preceding studies. The evaluation results on four typical ISAs indicate that our scheme can provide a far larger degree of freedom than the preceding studies. Diversified processors based on MIPS architecture were actually implemented and evaluated with Xilinx Spartan-3 FPGA. The increase of logic scale was modest: 5.1% in Specialized design and 3.6% in RAM-mapped design. The performance overhead was also modest: 3.4% in Specialized design and 11.6% in RAM-mapped design. From these results, our scheme is regarded as a practical and promising way to secure FPGA-based embedded systems.
Zou, Ding; Djordjevic, Ivan B
2016-09-05
In this paper, we propose a rate-adaptive FEC scheme based on LDPC codes together with its software reconfigurable unified FPGA architecture. By FPGA emulation, we demonstrate that the proposed class of rate-adaptive LDPC codes based on shortening with an overhead from 25% to 42.9% provides a coding gain ranging from 13.08 dB to 14.28 dB at a post-FEC BER of 10-15 for BPSK transmission. In addition, the proposed rate-adaptive LDPC coding combined with higher-order modulations have been demonstrated including QPSK, 8-QAM, 16-QAM, 32-QAM, and 64-QAM, which covers a wide range of signal-to-noise ratios. Furthermore, we apply the unequal error protection by employing different LDPC codes on different bits in 16-QAM and 64-QAM, which results in additional 0.5dB gain compared to conventional LDPC coded modulation with the same code rate of corresponding LDPC code.
Software Process Assurance for Complex Electronics (SPACE)
NASA Technical Reports Server (NTRS)
Plastow, Richard A.
2007-01-01
Complex Electronics (CE) are now programmed to perform tasks that were previously handled in software, such as communication protocols. Many of the methods used to develop software bare a close resemblance to CE development. For instance, Field Programmable Gate Arrays (FPGAs) can have over a million logic gates while system-on-chip (SOC) devices can combine a microprocessor, input and output channels, and sometimes an FPGA for programmability. With this increased intricacy, the possibility of software-like bugs such as incorrect design, logic, and unexpected interactions within the logic is great. Since CE devices are obscuring the hardware/software boundary, we propose that mature software methodologies may be utilized with slight modifications in the development of these devices. Software Process Assurance for Complex Electronics (SPACE) is a research project that looks at using standardized S/W Assurance/Engineering practices to provide an assurance framework for development activities. Tools such as checklists, best practices and techniques can be used to detect missing requirements and bugs earlier in the development cycle creating a development process for CE that will be more easily maintained, consistent and configurable based on the device used.
A high-resolution programmable Vernier delay generator based on carry chains in FPGA
NASA Astrophysics Data System (ADS)
Cui, Ke; Li, Xiangyu; Zhu, Rihong
2017-06-01
This paper presents an architecture of a high-resolution delay generator implemented in a single field programmable gate array chip by exploiting the method of utilizing dedicated carry chains. It serves as the core component in various physical instruments. The proposed delay generator contains the coarse delay step and the fine delay step to guarantee both large dynamic range and high resolution. The carry chains are organized in the Vernier delay loop style to fulfill the fine delay step with high precision and high linearity. The delay generator was implemented in the EP3SE110F1152I3 Stratix III device from Altera on a self-designed test board. Test results show that the obtained resolution is 38.6 ps, and the differential nonlinearity/integral nonlinearity is in the range of [-0.18 least significant bit (LSB), 0.24 LSB]/(-0.02 LSB, 0.01 LSB) under the nominal supply voltage of 1100 mV and environmental temperature of 2 0°C. The delay generator is rather efficient concerning resource cost, which uses only 668 look-up tables and 146 registers in total.
Lincoln, Peter; Blate, Alex; Singh, Montek; Whitted, Turner; State, Andrei; Lastra, Anselmo; Fuchs, Henry
2016-04-01
We describe an augmented reality, optical see-through display based on a DMD chip with an extremely fast (16 kHz) binary update rate. We combine the techniques of post-rendering 2-D offsets and just-in-time tracking updates with a novel modulation technique for turning binary pixels into perceived gray scale. These processing elements, implemented in an FPGA, are physically mounted along with the optical display elements in a head tracked rig through which users view synthetic imagery superimposed on their real environment. The combination of mechanical tracking at near-zero latency with reconfigurable display processing has given us a measured average of 80 µs of end-to-end latency (from head motion to change in photons from the display) and also a versatile test platform for extremely-low-latency display systems. We have used it to examine the trade-offs between image quality and cost (i.e. power and logical complexity) and have found that quality can be maintained with a fairly simple display modulation scheme.
FPGA-based Klystron linearization implementations in scope of ILC
Omet, M.; Michizono, S.; Matsumoto, T.; ...
2015-01-23
We report the development and implementation of four FPGA-based predistortion-type klystron linearization algorithms. Klystron linearization is essential for the realization of ILC, since it is required to operate the klystrons 7% in power below their saturation. The work presented was performed in international collaborations at the Fermi National Accelerator Laboratory (FNAL), USA and the Deutsches Elektronen Synchrotron (DESY), Germany. With the newly developed algorithms, the generation of correction factors on the FPGA was improved compared to past algorithms, avoiding quantization and decreasing memory requirements. At FNAL, three algorithms were tested at the Advanced Superconducting Test Accelerator (ASTA), demonstrating a successfulmore » implementation for one algorithm and a proof of principle for two algorithms. Furthermore, the functionality of the algorithm implemented at DESY was demonstrated successfully in a simulation.« less
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre- processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
NASA Technical Reports Server (NTRS)
Wagner, Raymond S.; Barton, Richard J.
2011-01-01
Wireless Sensor Networks (WSNs) can provide a substantial benefit in spacecraft systems, reducing launch weight and providing unprecedented flexibility by allowing instrumentation capabilities to grow and change over time. Achieving data transport reliability on par with that of wired systems, however, can prove extremely challenging in practice. Fortunately, much progress has been made in developing standard WSN radio protocols for applications from non-critical home automation to mission-critical industrial process control. The relative performances of candidate protocols must be compared in representative aerospace environments, however, to determine their suitability for spaceflight applications. In this paper, we will present the results of a rigorous laboratory analysis of the performance of two standards-based, low power, low data rate WSN protocols: ZigBee Pro and ISA100.11a. Both are based on IEEE 802.15.4 and augment that standard's specifications to build complete, multi-hop networking stacks. ZigBee Pro targets primarily the home and office automation markets, providing an ad-hoc protocol that is computationally lightweight and easy to implement in inexpensive system-on-a-chip components. As a result of this simplicity, however, ZigBee Pro can be susceptible to radio frequency (RF) interference. ISA100.11a, on the other hand, targets the industrial process control market, providing a robust, centrally-managed protocol capable of tolerating a significant amount of RF interference. To achieve these gains, a coordinated channel hopping mechanism is employed, which entails a greater computational complexity than ZigBee and requires more sophisticated and costly hardware. To guide future aerospace deployments, we must understand how well these standards relatively perform in analog environments under expected operating conditions. Specifically, we are interested in evaluating goodput -- application level throughput -- in a representative crewed environment in the presence of varying levels of 802.11g Wi-Fi traffic. To do so, we use the NASA Johnson Space Center Wireless Habitat Testbed (WHT), a metallic, habitation-sized module designed for co-existence testing of wireless systems. In its quiescent state, the sealed WHT provides an RF-quiet environment to which we can selectively add interfering systems; it also provides a realistic level of multi-path self-interference for systems under investigation. In our test, we deploy two representative five node networks, configured in a star topology with all nodes reporting directly to a WSN gateway. Each ZigBee network WSN node is built using a Texas Instruments (TI) CC2530 system-on-a-chip radio running TI's ZigBee Pro Z-stack. Each ISA100.11a network node is built using a Nivis VersaNode 210 system-on-a-chip radio. In both cases, radios interface with TI MSP430-F5438 microcontroller implementing a common test application. Interference is provided by a D-link 802.11g Wi-Fi router transporting traffic generated using the Iperf network testing tool. For the single-channel ZigBee network, effects of both direct and indirect Wi-Fi interference are evaluated. For the channel-hopping ISA100.11a network, effects of interference from multiple Wi-Fi routers configured in non-overlapping 802.11g channels are evaluated. Our results show that, in general, the more lightweight ZigBee network performs well at low interference levels, but performance degrades as interference increases. Conversely, the more complex and costly ISA100.11a network continues to perform well as Wi-Fi interference levels increase.
High-Speed Current dq PI Controller for Vector Controlled PMSM Drive
Reaz, Mamun Bin Ibne; Rahman, Labonnah Farzana; Chang, Tae Gyu
2014-01-01
High-speed current controller for vector controlled permanent magnet synchronous motor (PMSM) is presented. The controller is developed based on modular design for faster calculation and uses fixed-point proportional-integral (PI) method for improved accuracy. Current dq controller is usually implemented in digital signal processor (DSP) based computer. However, DSP based solutions are reaching their physical limits, which are few microseconds. Besides, digital solutions suffer from high implementation cost. In this research, the overall controller is realizing in field programmable gate array (FPGA). FPGA implementation of the overall controlling algorithm will certainly trim down the execution time significantly to guarantee the steadiness of the motor. Agilent 16821A Logic Analyzer is employed to validate the result of the implemented design in FPGA. Experimental results indicate that the proposed current dq PI controller needs only 50 ns of execution time in 40 MHz clock, which is the lowest computational cycle for the era. PMID:24574913
A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations
Mak, Terrence S. T.; Rachmuth, Guy; Lam, Kai-Pui; Poon, Chi-Sang
2008-01-01
Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. Field Programmable Gate Array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to conserve resource consumption. Performances of the various FPGA design approaches are compared theoretically and experimentally in corresponding implementations of the AMPA and NMDA synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory economic solution as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired. PMID:17190033
Kamehama, Hiroki; Kawahito, Shoji; Shrestha, Sumeet; Nakanishi, Syunta; Yasutomi, Keita; Takeda, Ayaki; Tsuru, Takeshi Go
2017-01-01
This paper presents a novel full-depletion Si X-ray detector based on silicon-on-insulator pixel (SOIPIX) technology using a pinned depleted diode structure, named the SOIPIX-PDD. The SOIPIX-PDD greatly reduces stray capacitance at the charge sensing node, the dark current of the detector, and capacitive coupling between the sensing node and SOI circuits. These features of the SOIPIX-PDD lead to low read noise, resulting high X-ray energy resolution and stable operation of the pixel. The back-gate surface pinning structure using neutralized p-well at the back-gate surface and depleted n-well underneath the p-well for all the pixel area other than the charge sensing node is also essential for preventing hole injection from the p-well by making the potential barrier to hole, reducing dark current from the Si-SiO2 interface and creating lateral drift field to gather signal electrons in the pixel area into the small charge sensing node. A prototype chip using 0.2 μm SOI technology shows very low readout noise of 11.0 e−rms, low dark current density of 56 pA/cm2 at −35 °C and the energy resolution of 200 eV(FWHM) at 5.9 keV and 280 eV (FWHM) at 13.95 keV. PMID:29295523
Kamehama, Hiroki; Kawahito, Shoji; Shrestha, Sumeet; Nakanishi, Syunta; Yasutomi, Keita; Takeda, Ayaki; Tsuru, Takeshi Go; Arai, Yasuo
2017-12-23
This paper presents a novel full-depletion Si X-ray detector based on silicon-on-insulator pixel (SOIPIX) technology using a pinned depleted diode structure, named the SOIPIX-PDD. The SOIPIX-PDD greatly reduces stray capacitance at the charge sensing node, the dark current of the detector, and capacitive coupling between the sensing node and SOI circuits. These features of the SOIPIX-PDD lead to low read noise, resulting high X-ray energy resolution and stable operation of the pixel. The back-gate surface pinning structure using neutralized p-well at the back-gate surface and depleted n-well underneath the p-well for all the pixel area other than the charge sensing node is also essential for preventing hole injection from the p-well by making the potential barrier to hole, reducing dark current from the Si-SiO₂ interface and creating lateral drift field to gather signal electrons in the pixel area into the small charge sensing node. A prototype chip using 0.2 μm SOI technology shows very low readout noise of 11.0 e - rms , low dark current density of 56 pA/cm² at -35 °C and the energy resolution of 200 eV(FWHM) at 5.9 keV and 280 eV (FWHM) at 13.95 keV.
Patterning roadmap: 2017 prospects
NASA Astrophysics Data System (ADS)
Neisser, Mark
2017-06-01
Road mapping of semiconductor chips has been underway for over 20 years, first with the International Technology Roadmap for Semiconductors (ITRS) roadmap and now with the International Roadmap for Devices and Systems (IRDS) roadmap. The original roadmap was mostly driven bottom up and was developed to ensure that the large numbers of semiconductor producers and suppliers had good information to base their research and development on. The current roadmap is generated more top-down, where the customers of semiconductor chips anticipate what will be needed in the future and the roadmap projects what will be needed to fulfill that demand. The More Moore section of the roadmap projects that advanced logic will drive higher-resolution patterning, rather than memory chips. Potential solutions for patterning future logic nodes can be derived as extensions of `next-generation' patterning technologies currently under development. Advanced patterning has made great progress, and two `next-generation' patterning technologies, EUV and nanoimprint lithography, have potential to be in production as early as 2018. The potential adoption of two different next-generation patterning technologies suggests that patterning technology is becoming more specialized. This is good for the industry in that it lowers overall costs, but may lead to slower progress in extending any one patterning technology in the future.
Design of a network for concurrent message passing systems
NASA Astrophysics Data System (ADS)
Song, Paul Y.
1988-08-01
We describe the design of the network design frame (NDF), a self-timed routing chip for a message-passing concurrent computer. The NDF uses a partitioned data path, low-voltage output drivers, and a distributed token-passing arbiter to provide a bandwidth of 450 Mbits/sec into the network. Wormhole routing and bidirectional virtual channels are used to provide low latency communications, less than 2us latency to deliver a 216 bit message across the diameter of a 1K node mess-connected machine. To support concurrent software systems, the NDF provides two logical networks, one for user messages and one for system messages. The two networks share the same set of physical wires. To facilitate the development of network nodes, the NDF is a design frame. The NDF circuitry is integrated into the pad frame of a chip leaving the center of the chip uncommitted. We define an analytic framework in which to study the effects of network size, network buffering capacity, bidirectional channels, and traffic on this class of networks. The response of the network to various combinations of these parameters are obtained through extensive simulation of the network model. Through simulation, we are able to observe the macro behavior of the network as opposed to the micro behavior of the NDF routing controller.
The GANDALF 128-Channel Time-to-Digital Converter
NASA Astrophysics Data System (ADS)
Büchele, M.; Fischer, H.; Herrmann, F.; Königsmann, K.; Schill, C.; Schopferer, S.
The GANDALF 6U-VME64x/VXS module has been designed to cope with a variety of readout tasks in high energy and nuclear physics experiments, in particular the COMPASS experiment at CERN. The exchangeable mezzanine cards allow for an employment of the system in very different applications such as analog-to-digital or time-to-digital conversions, coincidence matrix formation, fast pattern recognition or fast trigger generation. Based on this platform, we present a 128-channel TDC which is implemented in a single Xilinx Virtex-5 FPGA using a shifted clock sampling method. In this concept each input signal is continuously sampled by 16 flip-flops using equidistant phase-shifted clocks. Compared to previous FPGA designs, usually based on delay lines and comprising few TDC channels with resolutions in the order of 10 ps, our design permits the implementation of a large number of TDC channels with a resolution of 64 ps in a single FPGA. Predictable placement of logic components and uniform routing inside the FPGA fabric is a particular challenge of this design. We present measurement results for the time resolution and the nonlinearity of the TDC readout system.
Intelligent FPGA Data Acquisition Framework
NASA Astrophysics Data System (ADS)
Bai, Yunpeng; Gaisbauer, Dominic; Huber, Stefan; Konorov, Igor; Levit, Dmytro; Steffen, Dominik; Paul, Stephan
2017-06-01
In this paper, we present the field programmable gate arrays (FPGA)-based framework intelligent FPGA data acquisition (IFDAQ), which is used for the development of DAQ systems for detectors in high-energy physics. The framework supports Xilinx FPGA and provides a collection of IP cores written in very high speed integrated circuit hardware description language, which use the common interconnect interface. The IP core library offers functionality required for the development of the full DAQ chain. The library consists of Serializer/Deserializer (SERDES)-based time-to-digital conversion channels, an interface to a multichannel 80-MS/s 10-b analog-digital conversion, data transmission, and synchronization protocol between FPGAs, event builder, and slow control. The functionality is distributed among FPGA modules built in the AMC form factor: front end and data concentrator. This modular design also helps to scale and adapt the DAQ system to the needs of the particular experiment. The first application of the IFDAQ framework is the upgrade of the read-out electronics for the drift chambers and the electromagnetic calorimeters (ECALs) of the COMPASS experiment at CERN. The framework will be presented and discussed in the context of this paper.
NASA Astrophysics Data System (ADS)
Jridi, Maher; Alfalou, Ayman
2017-05-01
By this paper, the major goal is to investigate the Multi-CPU/FPGA SoC (System on Chip) design flow and to transfer a know-how and skills to rapidly design embedded real-time vision system. Our aim is to show how the use of these devices can be benefit for system level integration since they make possible simultaneous hardware and software development. We take the facial detection and pretreatments as case study since they have a great potential to be used in several applications such as video surveillance, building access control and criminal identification. The designed system use the Xilinx Zedboard platform. The last is the central element of the developed vision system. The video acquisition is performed using either standard webcam connected to the Zedboard via USB interface or several camera IP devices. The visualization of video content and intermediate results are possible with HDMI interface connected to HD display. The treatments embedded in the system are as follow: (i) pre-processing such as edge detection implemented in the ARM and in the reconfigurable logic, (ii) software implementation of motion detection and face detection using either ViolaJones or LBP (Local Binary Pattern), and (iii) application layer to select processing application and to display results in a web page. One uniquely interesting feature of the proposed system is that two functions have been developed to transmit data from and to the VDMA port. With the proposed optimization, the hardware implementation of the Sobel filter takes 27 ms and 76 ms for 640x480, and 720p resolutions, respectively. Hence, with the FPGA implementation, an acceleration of 5 times is obtained which allow the processing of 37 fps and 13 fps for 640x480, and 720p resolutions, respectively.
FPGA-Based Networked Phasemeter for a Heterodyne Interferometer
NASA Technical Reports Server (NTRS)
Rao, Shanti
2009-01-01
A document discusses a component of a laser metrology system designed to measure displacements along the line of sight with precision on the order of a tenth the diameter of an atom. This component, the phasemeter, measures the relative phase of two electrical signals and transfers that information to a computer. Because the metrology system measures the differences between two optical paths, the phasemeter has two inputs, called measure and reference. The reference signal is nominally a perfect square wave with a 50- percent duty cycle (though only rising edges are used). As the metrology system detects motion, the difference between the reference and measure signal phases is proportional to the displacement of the motion. The phasemeter, therefore, counts the elapsed time between rising edges in the two signals, and converts the time into an estimate of phase delay. The hardware consists of a circuit board that plugs into a COTS (commercial, off-the- shelf) Spartan-III FPGA (field-programmable gate array) evaluation board. It has two BNC inputs, (reference and measure), a CMOS logic chip to buffer the inputs, and an Ethernet jack for transmitting reduced-data to a PC. Two extra BNC connectors can be attached for future expandability, such as external synchronization. Each phasemeter handles one metrology channel. A bank of six phasemeters (and two zero-crossing detector cards) with an Ethernet switch can monitor the rigid body motion of an object. This device is smaller and cheaper than existing zero-crossing phasemeters. Also, because it uses Ethernet for communication with a computer, instead of a VME bridge, it is much easier to use. The phasemeter is a key part of the Precision Deployable Apertures and Structures strategic R&D effort to design large, deployable, segmented space telescopes.
NASA Astrophysics Data System (ADS)
Wang, Yonggang; Xiao, Yong; Cheng, Xinyi; Li, Deng; Wang, Liwei
2016-02-01
For the continuous crystal-based positron emission tomography (PET) detector built in our lab, a maximum likelihood algorithm adapted for implementation on a field programmable gate array (FPGA) is proposed to estimate the three-dimensional (3D) coordinate of interaction position with the single-end detected scintillation light response. The row-sum and column-sum readout scheme organizes the 64 channels of photomultiplier (PMT) into eight row signals and eight column signals to be readout for X- and Y-coordinates estimation independently. By the reference events irradiated in a known oblique angle, the probability density function (PDF) for each depth-of-interaction (DOI) segment is generated, by which the reference events in perpendicular irradiation are assigned to DOI segments for generating the PDFs for X and Y estimation in each DOI layer. Evaluated by the experimental data, the algorithm achieves an average X resolution of 1.69 mm along the central X-axis, and DOI resolution of 3.70 mm over the whole thickness (0-10 mm) of crystal. The performance improvements from 2D estimation to the 3D algorithm are also presented. Benefiting from abundant resources of FPGA and a hierarchical storage arrangement, the whole algorithm can be implemented into a middle-scale FPGA. By a parallel structure in pipelines, the 3D position estimator on the FPGA can achieve a processing throughput of 15 M events/s, which is sufficient for the requirement of real-time PET imaging.
Design Tools for Reconfigurable Hardware in Orbit (RHinO)
NASA Technical Reports Server (NTRS)
French, Mathew; Graham, Paul; Wirthlin, Michael; Larchev, Gregory; Bellows, Peter; Schott, Brian
2004-01-01
The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. These tools leverage an established FPGA design environment and focus primarily on space effects mitigation and power optimization. The project is creating software to automatically test and evaluate the single-event-upsets (SEUs) sensitivities of an FPGA design and insert mitigation techniques. Extensions into the tool suite will also allow evolvable algorithm techniques to reconfigure around single-event-latchup (SEL) events. In the power domain, tools are being created for dynamic power visualiization and optimization. Thus, this technology seeks to enable the use of Reconfigurable Hardware in Orbit, via an integrated design tool-suite aiming to reduce risk, cost, and design time of multimission reconfigurable space processors using SRAM-based FPGAs.
Using Spare Logic Resources To Create Dynamic Test Points
NASA Technical Reports Server (NTRS)
Katz, Richard; Kleyner, Igor
2011-01-01
A technique has been devised to enable creation of a dynamic set of test points in an embedded digital electronic system. As a result, electronics contained in an application specific circuit [e.g., gate array, field programmable gate array (FPGA)] can be internally probed, even when contained in a closed housing during all phases of test. In the present technique, the test points are not fixed and limited to a small number; the number of test points can vastly exceed the number of buffers or pins, resulting in a compact footprint. Test points are selected by means of spare logic resources within the ASIC(s) and/or FPGA(s). A register is programmed with a command, which is used to select the signals that are sent off-chip and out of the housing for monitoring by test engineers and external test equipment. The register can be commanded by any suitable means: for example, it could be commanded through a command port that would normally be used in the operation of the system. In the original application of the technique, commanding of the register is performed via a MIL-STD-1553B communication subsystem.
Design and FPGA implementation for MAC layer of Ethernet PON
NASA Astrophysics Data System (ADS)
Zhu, Zengxi; Lin, Rujian; Chen, Jian; Ye, Jiajun; Chen, Xinqiao
2004-04-01
Ethernet passive optical network (EPON), which represents the convergence of low-cost, high-bandwidth and supporting multiple services, appears to be one of the best candidates for the next-generation access network. The work of standardizing EPON as a solution for access network is still underway in the IEEE802.3ah Ethernet in the first mile (EFM) task force. The final release is expected in 2004. Up to now, there has been no standard application specific integrated circuit (ASIC) chip available which fulfills the functions of media access control (MAC) layer of EPON. The MAC layer in EPON system has many functions, such as point-to-point emulation (P2PE), Ethernet MAC functionality, multi-point control protocol (MPCP), network operation, administration and maintenance (OAM) and link security. To implement those functions mentioned above, an embedded real-time operating system (RTOS) and a flexible programmable logic device (PLD) with an embedded processor are used. The software and hardware functions in MAC layer are realized through programming embedded microprocessor and field programmable gate array(FPGA). Finally, some experimental results are given in this paper. The method stated here can provide a valuable reference for developing EPON MAC layer ASIC.
Power Management Integrated Circuit for Indoor Photovoltaic Energy Harvesting System
NASA Astrophysics Data System (ADS)
Jain, Vipul
In today's world, power dissipation is a main concern for battery operated mobile devices. Key design decisions are being governed by power rather than area/delay because power requirements are growing more stringent every year. Hence, a hybrid power management system is proposed, which uses both a solar panel to harvest energy from indoor lighting and a battery to power the load. The system tracks the maximum power point of the solar panel and regulates the battery and microcontroller output load voltages through the use of an on-chip switched-capacitor DC-DC converter. System performance is verified through simulation at the 180nm technology node and is made to be integrated on-chip with 0.25 second startup time, 79% efficiency, --8/+14% ripple on the load, an average 1micro A of quiescent current (3.7micro W of power) and total on-chip area of 1.8mm2 .
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal
Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication andmore » kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29W to 42W.« less
2017-03-20
sub-array, which is based on all-pass filters (APFs) is realized using 130 nm CMOS technology. Approximate- discrete Fourier transform (a-DFT...fixed beams are directed at known directions [9]. The proposed approximate- discrete Fourier transform (a-DFT) based multi-beamformer [9] yields L...to digital conversion daughter board. occurs in the discrete time domain (in ROACH-2 FPGA platform) following signal digitization (see Figs. 1(d) and
Novel intelligent real-time position tracking system using FPGA and fuzzy logic.
Soares dos Santos, Marco P; Ferreira, J A F
2014-03-01
The main aim of this paper is to test if FPGAs are able to achieve better position tracking performance than software-based soft real-time platforms. For comparison purposes, the same controller design was implemented in these architectures. A Multi-state Fuzzy Logic controller (FLC) was implemented both in a Xilinx(®) Virtex-II FPGA (XC2v1000) and in a soft real-time platform NI CompactRIO(®)-9002. The same sampling time was used. The comparative tests were conducted using a servo-pneumatic actuation system. Steady-state errors lower than 4 μm were reached for an arbitrary vertical positioning of a 6.2 kg mass when the controller was embedded into the FPGA platform. Performance gains up to 16 times in the steady-state error, up to 27 times in the overshoot and up to 19.5 times in the settling time were achieved by using the FPGA-based controller over the software-based FLC controller. © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Berdychowski, Piotr P.; Zabolotny, Wojciech M.
2010-09-01
The main goal of C to VHDL compiler project is to make FPGA platform more accessible for scientists and software developers. FPGA platform offers unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for efficient heterogeneous, computing environment. Current industry standard in development of digital systems on FPGA platform is based on HDLs. Although very effective and expressive in hands of hardware development specialists, these languages require specific knowledge and experience, unreachable for most scientists and software programmers. C to VHDL compiler project attempts to remedy that by creating an application, that derives initial VHDL description of a digital system (for further compilation and synthesis), from purely algorithmic description in C programming language. This idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many previous years, with the introduction of some new unique improvements.
NASA Astrophysics Data System (ADS)
Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki
At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene
2010-01-01
Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA).
Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene
2010-01-01
Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA). PMID:22319345
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel
2010-01-01
This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.
Using pattern enumeration to accelerate process development and ramp yield
NASA Astrophysics Data System (ADS)
Zhuang, Linda; Pang, Jenny; Xu, Jessy; Tsai, Mengfeng; Wang, Amy; Zhang, Yifan; Sweis, Jason; Lai, Ya-Chieh; Ding, Hua
2016-03-01
During a new technology node process setup phase, foundries do not initially have enough product chip designs to conduct exhaustive process development. Different operational teams use manually designed simple test keys to set up their process flows and recipes. When the very first version of the design rule manual (DRM) is ready, foundries enter the process development phase where new experiment design data is manually created based on these design rules. However, these IP/test keys contain very uniform or simple design structures. This kind of design normally does not contain critical design structures or process unfriendly design patterns that pass design rule checks but are found to be less manufacturable. It is desired to have a method to generate exhaustive test patterns allowed by design rules at development stage to verify the gap of design rule and process. This paper presents a novel method of how to generate test key patterns which contain known problematic patterns as well as any constructs which designers could possibly draw based on current design rules. The enumerated test key patterns will contain the most critical design structures which are allowed by any particular design rule. A layout profiling method is used to do design chip analysis in order to find potential weak points on new incoming products so fab can take preemptive action to avoid yield loss. It can be achieved by comparing different products and leveraging the knowledge learned from previous manufactured chips to find possible yield detractors.
A 7.4 ps FPGA-Based TDC with a 1024-Unit Measurement Matrix
Zhang, Min; Wang, Hai; Liu, Yan
2017-01-01
In this paper, a high-resolution time-to-digital converter (TDC) based on a field programmable gate array (FPGA) device is proposed and tested. During the implementation, a new architecture of TDC is proposed which consists of a measurement matrix with 1024 units. The utilization of routing resources as the delay elements distinguishes the proposed design from other existing designs, which contributes most to the device insensitivity to variations of temperature and voltage. Experimental results suggest that the measurement resolution is 7.4 ps, and the INL (integral nonlinearity) and DNL (differential nonlinearity) are 11.6 ps and 5.5 ps, which indicates that the proposed TDC offers high performance among the available TDCs. Benefitting from the FPGA platform, the proposed TDC has superiorities in easy implementation, low cost, and short development time. PMID:28420121
A novel FPGA-programmable switch matrix interconnection element in quantum-dot cellular automata
NASA Astrophysics Data System (ADS)
Hashemi, Sara; Rahimi Azghadi, Mostafa; Zakerolhosseini, Ali; Navi, Keivan
2015-04-01
The Quantum-dot cellular automata (QCA) is a novel nanotechnology, promising extra low-power, extremely dense and very high-speed structure for the construction of logical circuits at a nanoscale. In this paper, initially previous works on QCA-based FPGA's routing elements are investigated, and then an efficient, symmetric and reliable QCA programmable switch matrix (PSM) interconnection element is introduced. This element has a simple structure and offers a complete routing capability. It is implemented using a bottom-up design approach that starts from a dense and high-speed 2:1 multiplexer and utilise it to build the target PSM interconnection element. In this study, simulations of the proposed circuits are carried out using QCAdesigner, a layout and simulation tool for QCA circuits. The results demonstrate high efficiency of the proposed designs in QCA-based FPGA routing.
A 7.4 ps FPGA-Based TDC with a 1024-Unit Measurement Matrix.
Zhang, Min; Wang, Hai; Liu, Yan
2017-04-14
In this paper, a high-resolution time-to-digital converter (TDC) based on a field programmable gate array (FPGA) device is proposed and tested. During the implementation, a new architecture of TDC is proposed which consists of a measurement matrix with 1024 units. The utilization of routing resources as the delay elements distinguishes the proposed design from other existing designs, which contributes most to the device insensitivity to variations of temperature and voltage. Experimental results suggest that the measurement resolution is 7.4 ps, and the INL (integral nonlinearity) and DNL (differential nonlinearity) are 11.6 ps and 5.5 ps, which indicates that the proposed TDC offers high performance among the available TDCs. Benefitting from the FPGA platform, the proposed TDC has superiorities in easy implementation, low cost, and short development time.
Hardware-Assisted Large-Scale Neuroevolution for Multiagent Learning
2014-12-30
SECURITY CLASSIFICATION OF: This DURIP equipment award was used to purchase, install, and bring on-line two Berkeley Emulation Engines ( BEEs ) and two...mini- BEE machines to establish an FPGA-based high-performance multiagent training platform and its associated software. This acquisition of BEE4-W...Platform; Probabilistic Domain Transformation; Hardware-Assisted; FPGA; BEE ; Hive Brain; Multiagent. REPORT DOCUMENTATION PAGE 11. SPONSOR/MONITOR’S
Tateno, Takashi; Nishikawa, Jun
2014-01-01
In this report, we describe the system integration of a complementary metal oxide semiconductor (CMOS) integrated circuit (IC) chip, capable of both stimulation and recording of neurons or neural tissues, to investigate electrical signal propagation within cellular networks in vitro. The overall system consisted of three major subunits: a 5.0 × 5.0 mm CMOS IC chip, a reconfigurable logic device (field-programmable gate array, FPGA), and a PC. To test the system, microelectrode arrays (MEAs) were used to extracellularly measure the activity of cultured rat cortical neurons and mouse cortical slices. The MEA had 64 bidirectional (stimulation and recording) electrodes. In addition, the CMOS IC chip was equipped with dedicated analog filters, amplification stages, and a stimulation buffer. Signals from the electrodes were sampled at 15.6 kHz with 16-bit resolution. The measured input-referred circuitry noise was 10.1 μ V root mean square (10 Hz to 100 kHz), which allowed reliable detection of neural signals ranging from several millivolts down to approximately 33 μ Vpp. Experiments were performed involving the stimulation of neurons with several spatiotemporal patterns and the recording of the triggered activity. An advantage over current MEAs, as demonstrated by our experiments, includes the ability to stimulate (voltage stimulation, 5-bit resolution) spatiotemporal patterns in arbitrary subsets of electrodes. Furthermore, the fast stimulation reset mechanism allowed us to record neuronal signals from a stimulating electrode around 3 ms after stimulation. We demonstrate that the system can be directly applied to, for example, auditory neural prostheses in conjunction with an acoustic sensor and a sound processing system. PMID:25346683
Tateno, Takashi; Nishikawa, Jun
2014-01-01
In this report, we describe the system integration of a complementary metal oxide semiconductor (CMOS) integrated circuit (IC) chip, capable of both stimulation and recording of neurons or neural tissues, to investigate electrical signal propagation within cellular networks in vitro. The overall system consisted of three major subunits: a 5.0 × 5.0 mm CMOS IC chip, a reconfigurable logic device (field-programmable gate array, FPGA), and a PC. To test the system, microelectrode arrays (MEAs) were used to extracellularly measure the activity of cultured rat cortical neurons and mouse cortical slices. The MEA had 64 bidirectional (stimulation and recording) electrodes. In addition, the CMOS IC chip was equipped with dedicated analog filters, amplification stages, and a stimulation buffer. Signals from the electrodes were sampled at 15.6 kHz with 16-bit resolution. The measured input-referred circuitry noise was 10.1 μ V root mean square (10 Hz to 100 kHz), which allowed reliable detection of neural signals ranging from several millivolts down to approximately 33 μ Vpp. Experiments were performed involving the stimulation of neurons with several spatiotemporal patterns and the recording of the triggered activity. An advantage over current MEAs, as demonstrated by our experiments, includes the ability to stimulate (voltage stimulation, 5-bit resolution) spatiotemporal patterns in arbitrary subsets of electrodes. Furthermore, the fast stimulation reset mechanism allowed us to record neuronal signals from a stimulating electrode around 3 ms after stimulation. We demonstrate that the system can be directly applied to, for example, auditory neural prostheses in conjunction with an acoustic sensor and a sound processing system.
Fabless company mask technology approach: fabless but not fab-careless
NASA Astrophysics Data System (ADS)
Hisamura, Toshiyuki; Wu, Xin
2009-10-01
There are two different foundry-fabless working models in the aspect of mask. Some foundries have in-house mask facility while others contract with merchant mask vendors. Significant progress has been made in both kinds of situations. Xilinx as one of the pioneers of fabless semiconductor companies has been continually working very closely with both merchant mask vendors and mask facilities of foundries in past many years, contributed well in both technology development and benefited from corporations. Our involvement in manufacturing is driven by the following three elements: The first element is to understand the new fabrication and mask technologies and then find a suitable design / layout style to better utilize these new technologies and avoid potential risks. Because Xilinx has always been involved in early stage of advanced technology nodes, this early understanding and adoption is especially important. The second element is time to market. Reduction in mask and wafer manufacturing cycle-time can ensure faster time to market. The third element is quality. Commitment to quality is our highest priority for our customers. We have enough visibility on any manufacturing issues affecting the device functionality. Good correlation has consistently been observed between FPGA speed uniformity and the poly mask Critical Dimension (CD) uniformity performance. To achieve FPGA speed uniformity requirement, the manufacturing process as well as the mask and wafer CD uniformity has to be monitored. Xilinx works closely with the wafer foundries and mask suppliers to improve productivity and the yield from initial development stage of mask making operations. As an example, defect density reduction is one of the biggest challenges for mask supplier in development stage to meet the yield target satisfying the mask cost and mask turn-around-time (TAT) requirement. Historically, masks were considered to be defect free but at these advanced process nodes, that assumption no longer holds true. There is a need to be flexible enough on unrepairable defect at early stage but also a need for efficient risk management system on mask defect waivers. Mask defects are often waived in low design criticality area in favor of scrapping the mask and delaying the mask and wafer schedule. Xilinx's involvement in mask manufacturing has contributed significantly to our success in past many nodes and will continue.
Tamplin, Owen J; Cox, Brian J; Rossant, Janet
2011-12-15
The node and notochord are key tissues required for patterning of the vertebrate body plan. Understanding the gene regulatory network that drives their formation and function is therefore important. Foxa2 is a key transcription factor at the top of this genetic hierarchy and finding its targets will help us to better understand node and notochord development. We performed an extensive microarray-based gene expression screen using sorted embryonic notochord cells to identify early notochord-enriched genes. We validated their specificity to the node and notochord by whole mount in situ hybridization. This provides the largest available resource of notochord-expressed genes, and therefore candidate Foxa2 target genes in the notochord. Using existing Foxa2 ChIP-seq data from adult liver, we were able to identify a set of genes expressed in the notochord that had associated regions of Foxa2-bound chromatin. Given that Foxa2 is a pioneer transcription factor, we reasoned that these sites might represent notochord-specific enhancers. Candidate Foxa2-bound regions were tested for notochord specific enhancer function in a zebrafish reporter assay and 7 novel notochord enhancers were identified. Importantly, sequence conservation or predictive models could not have readily identified these regions. Mutation of putative Foxa2 binding elements in two of these novel enhancers abrogated reporter expression and confirmed their Foxa2 dependence. The combination of highly specific gene expression profiling and genome-wide ChIP analysis is a powerful means of understanding developmental pathways, even for small cell populations such as the notochord. Copyright © 2011 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Oztekin, Halit; Temurtas, Feyzullah; Gulbag, Ali
The Arithmetic and Logic Unit (ALU) design is one of the important topics in Computer Architecture and Organization course in Computer and Electrical Engineering departments. There are ALU designs that have non-modular nature to be used as an educational tool. As the programmable logic technology has developed rapidly, it is feasible that ALU design based on Field Programmable Gate Array (FPGA) is implemented in this course. In this paper, we have adopted the modular approach to ALU design based on FPGA. All the modules in the ALU design are realized using schematic structure on Altera's Cyclone II Development board. Under this model, the ALU content is divided into four distinct modules. These are arithmetic unit except for multiplication and division operations, logic unit, multiplication unit and division unit. User can easily design any size of ALU unit since this approach has the modular nature. Then, this approach was applied to microcomputer architecture design named BZK.SAU.FPGA10.0 instead of the current ALU unit.
NASA Astrophysics Data System (ADS)
Holik, Michael
2010-01-01
The article describes a design and the test of the datalogger unit. Main demands on the datalogger were to achieve the power consumption as low as possible and the ability to capture short-time events. The datalogger is based on a programmable logic device FPGA. VHDL language is used to design the architecture fitted into the FPGA. The results of the test confirmed low power consumption feature of the device as well as proper functionality of the unit.
High-speed polarization sensitive optical coherence tomography for retinal diagnostics
NASA Astrophysics Data System (ADS)
Yin, Biwei; Wang, Bingqing; Vemishetty, Kalyanramu; Nagle, Jim; Liu, Shuang; Wang, Tianyi; Rylander, Henry G., III; Milner, Thomas E.
2012-01-01
We report design and construction of an FPGA-based high-speed swept-source polarization-sensitive optical coherence tomography (SS-PS-OCT) system for clinical retinal imaging. Clinical application of the SS-PS-OCT system is accurate measurement and display of thickness, phase retardation and birefringence maps of the retinal nerve fiber layer (RNFL) in human subjects for early detection of glaucoma. The FPGA-based SS-PS-OCT system provides three incident polarization states on the eye and uses a bulk-optic polarization sensitive balanced detection module to record two orthogonal interference fringe signals. Interference fringe signals and relative phase retardation between two orthogonal polarization states are used to obtain Stokes vectors of light returning from each RNFL depth. We implement a Levenberg-Marquardt algorithm on a Field Programmable Gate Array (FPGA) to compute accurate phase retardation and birefringence maps. For each retinal scan, a three-state Levenberg-Marquardt nonlinear algorithm is applied to 360 clusters each consisting of 100 A-scans to determine accurate maps of phase retardation and birefringence in less than 1 second after patient measurement allowing real-time clinical imaging-a speedup of more than 300 times over previous implementations. We report application of the FPGA-based SS-PS-OCT system for real-time clinical imaging of patients enrolled in a clinical study at the Eye Institute of Austin and Duke Eye Center.
NASA Astrophysics Data System (ADS)
García, Aday; Santos, Lucana; López, Sebastián.; Callicó, Gustavo M.; Lopez, Jose F.; Sarmiento, Roberto
2014-05-01
Efficient onboard satellite hyperspectral image compression represents a necessity and a challenge for current and future space missions. Therefore, it is mandatory to provide hardware implementations for this type of algorithms in order to achieve the constraints required for onboard compression. In this work, we implement the Lossy Compression for Exomars (LCE) algorithm on an FPGA by means of high-level synthesis (HSL) in order to shorten the design cycle. Specifically, we use CatapultC HLS tool to obtain a VHDL description of the LCE algorithm from C-language specifications. Two different approaches are followed for HLS: on one hand, introducing the whole C-language description in CatapultC and on the other hand, splitting the C-language description in functional modules to be implemented independently with CatapultC, connecting and controlling them by an RTL description code without HLS. In both cases the goal is to obtain an FPGA implementation. We explain the several changes applied to the original Clanguage source code in order to optimize the results obtained by CatapultC for both approaches. Experimental results show low area occupancy of less than 15% for a SRAM-based Virtex-5 FPGA and a maximum frequency above 80 MHz. Additionally, the LCE compressor was implemented into an RTAX2000S antifuse-based FPGA, showing an area occupancy of 75% and a frequency around 53 MHz. All these serve to demonstrate that the LCE algorithm can be efficiently executed on an FPGA onboard a satellite. A comparison between both implementation approaches is also provided. The performance of the algorithm is finally compared with implementations on other technologies, specifically a graphics processing unit (GPU) and a single-threaded CPU.