fpga-based reconfigurable computing: Topics by Science.gov

Sample records for fpga-based reconfigurable computing

Reconfigurable Processing Module

NASA Technical Reports Server (NTRS)

Somervill, Kevin; Hodson, Robert; Jones, Robert; Williams, John

2005-01-01

To accommodate a wide spectrum of applications and technologies, NASA s Exploration System's Missions Directorate has called for reconfigurable and modular technologies to support future missions to the moon and Mars. In response, Langley Research Center is leading a program entitled Reconfigurable Scaleable Computing (RSC) that is centered on the development of FPGA-based computing resources in a stackable form factor. This paper details the architecture and implementation of the Reconfigurable Processing Module (RPM), which is the key element of the RSC system. The RPM is an FPGA-based, space-qualified printed circuit assembly leveraging terrestrial/commercial design standards into the space applications domain. The form factor is similar to, and backwards compatible with, the PCI-104 standard utilizing only the PCI interface. The size is expanded to accommodate the required functionality while still better than 30% smaller than a 3U CompactPCI(TradeMark)card and without the overhead of the backplane. The architecture is built around two FPGA devices, one hosting PCI and memory interfaces, and another hosting mission application resources; both of which are connected with a high-speed data bus. The PCI interface FPGA provides access via the PCI bus to onboard SDRAM, flash PROM, and the application resources; both configuration management as well as runtime interaction. The reconfigurable FPGA, referred to as the Application FPGA - or simply "the application" - is a radiation-tolerant Xilinx Virtex-4 FX60 hosting custom application specific logic or soft microprocessor IP. The RPM implements various SEE mitigation techniques including TMR, EDAC, and configuration scrubbing of the reconfigurable FPGA. Prototype hardware and formal modeling techniques are used to explore the performability trade space. These models provide a novel way to calculate quality-of-service performance measures while simultaneously considering fault-related behavior due to SEE soft errors.
Design Tools for Reconfigurable Hardware in Orbit (RHinO)

NASA Technical Reports Server (NTRS)

French, Mathew; Graham, Paul; Wirthlin, Michael; Larchev, Gregory; Bellows, Peter; Schott, Brian

2004-01-01

The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. These tools leverage an established FPGA design environment and focus primarily on space effects mitigation and power optimization. The project is creating software to automatically test and evaluate the single-event-upsets (SEUs) sensitivities of an FPGA design and insert mitigation techniques. Extensions into the tool suite will also allow evolvable algorithm techniques to reconfigure around single-event-latchup (SEL) events. In the power domain, tools are being created for dynamic power visualiization and optimization. Thus, this technology seeks to enable the use of Reconfigurable Hardware in Orbit, via an integrated design tool-suite aiming to reduce risk, cost, and design time of multimission reconfigurable space processors using SRAM-based FPGAs.
FPGA platform for prototyping and evaluation of neural network automotive applications

NASA Technical Reports Server (NTRS)

Aranki, N.; Tawel, R.

2002-01-01

In this paper we present an FPGA based reconfigurable computing platform for prototyping and evaluation of advanced neural network based applications for control and diagnostics in an automotive sub-systems.
Radiation Tolerant, FPGA-Based SmallSat Computer System

NASA Technical Reports Server (NTRS)

LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew

2015-01-01

The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm

NASA Technical Reports Server (NTRS)

Weir, John M.; Wells, B. Earl

2003-01-01

Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modem FPGA's. In this paper, a distributed genetic algorithm that can be applied to the agent based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Fernandes, Ana; Pereira, Rita C.; Sousa, Jorge

The Instituto de Plasmas e Fusao Nuclear (IPFN) has developed dedicated re-configurable modules based on field programmable gate array (FPGA) devices for several nuclear fusion machines worldwide. Moreover, new Advanced Telecommunication Computing Architecture (ATCA) based modules developed by IPFN are already included in the ITER catalogue. One of the requirements for re-configurable modules operating in future nuclear environments including ITER is the remote update capability. Accordingly, this work presents an alternative method for FPGA remote programing to be implemented in new ATCA based re-configurable modules. FPGAs are volatile devices and their programming code is usually stored in dedicated flash memoriesmore » for properly configuration during module power-on. The presented method is capable to store new FPGA codes in Serial Peripheral Interface (SPI) flash memories using the PCIexpress (PCIe) network established on the ATCA back-plane, linking data acquisition endpoints and the data switch blades. The method is based on the Xilinx Quick Boot application note, adapted to PCIe protocol and ATCA based modules. (authors)« less
Radiation Mitigation and Power Optimization Design Tools for Reconfigurable Hardware in Orbit

NASA Technical Reports Server (NTRS)

French, Matthew; Graham, Paul; Wirthlin, Michael; Wang, Li; Larchev, Gregory

2005-01-01

The Reconfigurable Hardware in Orbit (RHinO)project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. In the second year of the project, design tools that leverage an established FPGA design environment have been created to visualize and analyze an FPGA circuit for radiation weaknesses and power inefficiencies. For radiation, a single event Upset (SEU) emulator, persistence analysis tool, and a half-latch removal tool for Xilinx/Virtex-II devices have been created. Research is underway on a persistence mitigation tool and multiple bit upsets (MBU) studies. For power, synthesis level dynamic power visualization and analysis tools have been completed. Power optimization tools are under development and preliminary test results are positive.
Remote hardware-reconfigurable robotic camera

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.

2001-10-01

In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.
Spacecube: A Family of Reconfigurable Hybrid On-Board Science Data Processors

NASA Technical Reports Server (NTRS)

Flatley, Thomas P.

2015-01-01

SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, eventfeature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected eventsfeatures in the instrument data stream.
A dynamically reconfigurable multi-functional PLL for SRAM-based FPGA in 65nm CMOS technology

NASA Astrophysics Data System (ADS)

Yang, Mingqian; Chen, Lei; Li, Xuewu; Zhang, Yanlong

2018-04-01

Phase-locked loops (PLL) have been widely utilized in FPGA as an important module for clock management. PLL with dynamic reconfiguration capability is always welcomed in FPGA design as it is able to decrease power consumption and simultaneously improve flexibility. In this paper, a multi-functional PLL with dynamic reconfiguration capability for 65nm SRAM-based FPGA is proposed. Firstly, configurable charge pump and loop filter are utilized to optimize the loop bandwidth. Secondly, the PLL incorporates a VCO with dual control voltages to accelerate the adjustment of oscillation frequency. Thirdly, three configurable dividers are presented for flexible frequency synthesis. Lastly, a configuration block with dynamic reconfiguration function is proposed. Simulation results demonstrate that the proposed multi-functional PLL can output clocks with configurable division ratio, phase shift and duty cycle. The PLL can also be dynamically reconfigured without affecting other parts' running or halting the FPGA device.
An Embedded Reconfigurable Logic Module

NASA Technical Reports Server (NTRS)

Tucker, Jerry H.; Klenke, Robert H.; Shams, Qamar A. (Technical Monitor)

2002-01-01

A Miniature Embedded Reconfigurable Computer and Logic (MERCAL) module has been developed and verified. MERCAL was designed to be a general-purpose, universal module that that can provide significant hardware and software resources to meet the requirements of many of today's complex embedded applications. This is accomplished in the MERCAL module by combining a sub credit card size PC in a DIMM form factor with a XILINX Spartan I1 FPGA. The PC has the ability to download program files to the FPGA to configure it for different hardware functions and to transfer data to and from the FPGA via the PC's ISA bus during run time. The MERCAL module combines, in a compact package, the computational power of a 133 MHz PC with up to 150,000 gate equivalents of digital logic that can be reconfigured by software. The general architecture and functionality of the MERCAL hardware and system software are described.
A Survey on FPGA-Based Sensor Systems: Towards Intelligent and Reconfigurable Low-Power Sensors for Computer Vision, Control and Signal Processing

PubMed Central

García, Gabriel J.; Jara, Carlos A.; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M.; Torres, Fernando

2014-01-01

The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field. PMID:24691100
A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing.

PubMed

García, Gabriel J; Jara, Carlos A; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M; Torres, Fernando

2014-03-31

The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field.
PCI-based WILDFIRE reconfigurable computing engines

NASA Astrophysics Data System (ADS)

Fross, Bradley K.; Donaldson, Robert L.; Palmer, Douglas J.

1996-10-01

WILDFORCE is the first PCI-based custom reconfigurable computer that is based on the Splash 2 technology transferred from the National Security Agency and the Institute for Defense Analyses, Supercomputing Research Center (SRC). The WILDFORCE architecture has many of the features of the WILDFIRE computer, such as field- programmable gate array (FPGA) based processing elements, linear array and crossbar interconnection, and high- performance memory and I/O subsystems. New features introduced in the PCI-based WILDFIRE systems include memory/processor options that can be added to any processing element. These options include static and dynamic memory, digital signal processors (DSPs), FPGAs, and microprocessors. In addition to memory/processor options, many different application specific connectors can be used to extend the I/O capabilities of the system, including systolic I/O, camera input and video display output. This paper also discusses how this new PCI-based reconfigurable computing engine is used for rapid-prototyping, real-time video processing and other DSP applications.
Reconfigurable vision system for real-time applications

NASA Astrophysics Data System (ADS)

Torres-Huitzil, Cesar; Arias-Estrada, Miguel

2002-03-01

Recently, a growing community of researchers has used reconfigurable systems to solve computationally intensive problems. Reconfigurability provides optimized processors for systems on chip designs, and makes easy to import technology to a new system through reusable modules. The main objective of this work is the investigation of a reconfigurable computer system targeted for computer vision and real-time applications. The system is intended to circumvent the inherent computational load of most window-based computer vision algorithms. It aims to build a system for such tasks by providing an FPGA-based hardware architecture for task specific vision applications with enough processing power, using the minimum amount of hardware resources as possible, and a mechanism for building systems using this architecture. Regarding the software part of the system, a library of pre-designed and general-purpose modules that implement common window-based computer vision operations is being investigated. A common generic interface is established for these modules in order to define hardware/software components. These components can be interconnected to develop more complex applications, providing an efficient mechanism for transferring image and result data among modules. Some preliminary results are presented and discussed.
Compiling for Application Specific Computational Acceleration in Reconfigurable Architectures Final Report CRADA No. TSB-2033-01

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Supinski, B.; Caliga, D.

2017-09-28

The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array- ("FPGA") based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures that will be a costeffective solution for large-scale scientific computing.
Automatic HDL firmware generation for FPGA-based reconfigurable measurement and control systems with mezzanines in FMC standard

NASA Astrophysics Data System (ADS)

Wojenski, Andrzej; Kasprowicz, Grzegorz; Pozniak, Krzysztof T.; Romaniuk, Ryszard

2013-10-01

The paper describes a concept of automatic firmware generation for reconfigurable measurement systems, which uses FPGA devices and measurement cards in FMC standard. Following sections are described in details: automatic HDL code generation for FPGA devices, automatic communication interfaces implementation, HDL drivers for measurement cards, automatic serial connection between multiple measurement backplane boards, automatic build of memory map (address space), automatic generated firmware management. Presented solutions are required in many advanced measurement systems, like Beam Position Monitors or GEM detectors. This work is a part of a wider project for automatic firmware generation and management of reconfigurable systems. Solutions presented in this paper are based on previous publication in SPIE.
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform

NASA Technical Reports Server (NTRS)

White, Michael J.

2004-01-01

This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by off loading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an may of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification, then download to the FPGA, which is then itself modified. This paper will discuss methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
HALO: a reconfigurable image enhancement and multisensor fusion system

NASA Astrophysics Data System (ADS)

Wu, F.; Hickman, D. L.; Parker, Steve J.

2014-06-01

Contemporary high definition (HD) cameras and affordable infrared (IR) imagers are set to dramatically improve the effectiveness of security, surveillance and military vision systems. However, the quality of imagery is often compromised by camera shake, or poor scene visibility due to inadequate illumination or bad atmospheric conditions. A versatile vision processing system called HALO™ is presented that can address these issues, by providing flexible image processing functionality on a low size, weight and power (SWaP) platform. Example processing functions include video distortion correction, stabilisation, multi-sensor fusion and image contrast enhancement (ICE). The system is based around an all-programmable system-on-a-chip (SoC), which combines the computational power of a field-programmable gate array (FPGA) with the flexibility of a CPU. The FPGA accelerates computationally intensive real-time processes, whereas the CPU provides management and decision making functions that can automatically reconfigure the platform based on user input and scene content. These capabilities enable a HALO™ equipped reconnaissance or surveillance system to operate in poor visibility, providing potentially critical operational advantages in visually complex and challenging usage scenarios. The choice of an FPGA based SoC is discussed, and the HALO™ architecture and its implementation are described. The capabilities of image distortion correction, stabilisation, fusion and ICE are illustrated using laboratory and trials data.
An FPGA-based reconfigurable DDC algorithm

NASA Astrophysics Data System (ADS)

Juszczyk, B.; Kasprowicz, G.

2016-09-01

This paper describes implementation of reconfigurable digital down converter in an FPGA structure. System is designed to work with quadrature signals. One of the main criteria of the project was to provied wide range of reconfiguration in order to fulfill various application rage. Potential applications include: software defined radio receiver, passive noise radars and measurement data compression. This document contains general system overview, short description of hardware used in the project and gateware implementation.

Adaptive Instrument Module: Space Instrument Controller "Brain" through Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Darrin, Ann Garrison; Conde, Richard; Chern, Bobbie; Luers, Phil; Jurczyk, Steve; Mills, Carl; Day, John H. (Technical Monitor)

2001-01-01

The Adaptive Instrument Module (AIM) will be the first true demonstration of reconfigurable computing with field-programmable gate arrays (FPGAs) in space, enabling the 'brain' of the system to evolve or adapt to changing requirements. In partnership with NASA Goddard Space Flight Center and the Australian Cooperative Research Centre for Satellite Systems (CRC-SS), APL has built the flight version to be flown on the Australian university-class satellite FEDSAT. The AIM provides satellites the flexibility to adapt to changing mission requirements by reconfiguring standardized processing hardware rather than incurring the large costs associated with new builds. This ability to reconfigure the processing in response to changing mission needs leads to true evolveable computing, wherein the instrument 'brain' can learn from new science data in order to perform state-of-the-art data processing. The development of the AIM is significant in its enormous potential to reduce total life-cycle costs for future space exploration missions. The advent of RAM-based FPGAs whose configuration can be changed at any time has enabled the development of the AIM for processing tasks that could not be performed in software. The use of the AIM enables reconfiguration of the FPGA circuitry while the spacecraft is in flight, with many accompanying advantages. The AIM demonstrates the practicalities of using reconfigurable computing hardware devices by conducting a series of designed experiments. These include the demonstration of implementing data compression, data filtering, and communication message processing and inter-experiment data computation. The second generation is the Adaptive Processing Template (ADAPT) which is further described in this paper. The next step forward is to make the hardware itself adaptable and the ADAPT pursues this challenge by developing a reconfigurable module that will be capable of functioning efficiently in various applications. ADAPT will take advantage of radiation tolerant RAM-based field programmable gate array (FPGA) technology to develop a reconfigurable processor that combines the flexibility of a general purpose processor running software with the performance of application specific processing hardware for a variety of high performance computing applications.
Development of a Low-Cost and High-speed Single Event Effects Testers based on Reconfigurable Field Programmable Gate Arrays (FPGA)

NASA Technical Reports Server (NTRS)

Howard, J. W.; Kim, H.; Berg, M.; LaBel, K. A.; Stansberry, S.; Friendlich, M.; Irwin, T.

2006-01-01

A viewgraph presentation on the development of a low cost, high speed tester reconfigurable Field Programmable Gata Array (FPGA) is shown. The topics include: 1) Introduction; 2) Objectives; 3) Tester Descriptions; 4) Tester Validations and Demonstrations; 5) Future Work; and 6) Summary.
Research on NC motion controller based on SOPC technology

NASA Astrophysics Data System (ADS)

Jiang, Tingbiao; Meng, Biao

2006-11-01

With the rapid development of the digitization and informationization, the application of numerical control technology in the manufacturing industry becomes more and more important. However, the conventional numerical control system usually has some shortcomings such as the poor in system openness, character of real-time, cutability and reconfiguration. In order to solve these problems, this paper investigates the development prospect and advantage of the application in numerical control area with system-on-a-Programmable-Chip (SOPC) technology, and puts forward to a research program approach to the NC controller based on SOPC technology. Utilizing the characteristic of SOPC technology, we integrate high density logic device FPGA, memory SRAM, and embedded processor ARM into a single programmable logic device. We also combine the 32-bit RISC processor with high computing capability of the complicated algorithm with the FPGA device with strong motivable reconfiguration logic control ability. With these steps, we can greatly resolve the defect described in above existing numerical control systems. For the concrete implementation method, we use FPGA chip embedded with ARM hard nuclear processor to construct the control core of the motion controller. We also design the peripheral circuit of the controller according to the requirements of actual control functions, transplant real-time operating system into ARM, design the driver of the peripheral assisted chip, develop the application program to control and configuration of FPGA, design IP core of logic algorithm for various NC motion control to configured it into FPGA. The whole control system uses the concept of modular and structured design to develop hardware and software system. Thus the NC motion controller with the advantage of easily tailoring, highly opening, reconfigurable, and expandable can be implemented.
FPGA-Based High-Performance Embedded Systems for Adaptive Edge Computing in Cyber-Physical Systems: The ARTICo³ Framework.

PubMed

Rodríguez, Alfonso; Valverde, Juan; Portilla, Jorge; Otero, Andrés; Riesgo, Teresa; de la Torre, Eduardo

2018-06-08

Cyber-Physical Systems are experiencing a paradigm shift in which processing has been relocated to the distributed sensing layer and is no longer performed in a centralized manner. This approach, usually referred to as Edge Computing, demands the use of hardware platforms that are able to manage the steadily increasing requirements in computing performance, while keeping energy efficiency and the adaptability imposed by the interaction with the physical world. In this context, SRAM-based FPGAs and their inherent run-time reconfigurability, when coupled with smart power management strategies, are a suitable solution. However, they usually fail in user accessibility and ease of development. In this paper, an integrated framework to develop FPGA-based high-performance embedded systems for Edge Computing in Cyber-Physical Systems is presented. This framework provides a hardware-based processing architecture, an automated toolchain, and a runtime to transparently generate and manage reconfigurable systems from high-level system descriptions without additional user intervention. Moreover, it provides users with support for dynamically adapting the available computing resources to switch the working point of the architecture in a solution space defined by computing performance, energy consumption and fault tolerance. Results show that it is indeed possible to explore this solution space at run time and prove that the proposed framework is a competitive alternative to software-based edge computing platforms, being able to provide not only faster solutions, but also higher energy efficiency for computing-intensive algorithms with significant levels of data-level parallelism.
Soft-core processor study for node-based architectures.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Houten, Jonathan Roger; Jarosz, Jason P.; Welch, Benjamin James

2008-09-01

Node-based architecture (NBA) designs for future satellite projects hold the promise of decreasing system development time and costs, size, weight, and power and positioning the laboratory to address other emerging mission opportunities quickly. Reconfigurable Field Programmable Gate Array (FPGA) based modules will comprise the core of several of the NBA nodes. Microprocessing capabilities will be necessary with varying degrees of mission-specific performance requirements on these nodes. To enable the flexibility of these reconfigurable nodes, it is advantageous to incorporate the microprocessor into the FPGA itself, either as a hardcore processor built into the FPGA or as a soft-core processor builtmore » out of FPGA elements. This document describes the evaluation of three reconfigurable FPGA based processors for use in future NBA systems--two soft cores (MicroBlaze and non-fault-tolerant LEON) and one hard core (PowerPC 405). Two standard performance benchmark applications were developed for each processor. The first, Dhrystone, is a fixed-point operation metric. The second, Whetstone, is a floating-point operation metric. Several trials were run at varying code locations, loop counts, processor speeds, and cache configurations. FPGA resource utilization was recorded for each configuration. Cache configurations impacted the results greatly; for optimal processor efficiency it is necessary to enable caches on the processors. Processor caches carry a penalty; cache error mitigation is necessary when operating in a radiation environment.« less
Realization of the FPGA-based reconfigurable computing environment by the example of morphological processing of a grayscale image

NASA Astrophysics Data System (ADS)

Shatravin, V.; Shashev, D. V.

2018-05-01

Currently, robots are increasingly being used in every industry. One of the most high-tech areas is creation of completely autonomous robotic devices including vehicles. The results of various global research prove the efficiency of vision systems in autonomous robotic devices. However, the use of these systems is limited because of the computational and energy resources available in the robot device. The paper describes the results of applying the original approach for image processing on reconfigurable computing environments by the example of morphological operations over grayscale images. This approach is prospective for realizing complex image processing algorithms and real-time image analysis in autonomous robotic devices.
Single-Event Effect (SEE) Survey of Advanced Reconfigurable Field Programmable Gate Arrays: NASA Electronic Parts and Packaging (NEPP) Program Office of Safety and Mission Assurance

NASA Technical Reports Server (NTRS)

Allen, Gregory

2011-01-01

The NEPP Reconfigurable Field-Programmable Gate Array (FPGA) task has been charged to evaluate reconfigurable FPGA technologies for use in space. Under this task, the Xilinx single-event-immune, reconfigurable FPGA (SIRF) XQR5VFX130 device was evaluated for SEE. Additionally, the Altera Stratix-IV and SiliconBlue iCE65 were screened for single-event latchup (SEL).
HDL Based FPGA Interface Library for Data Acquisition and Multipurpose Real Time Algorithms

NASA Astrophysics Data System (ADS)

Fernandes, Ana M.; Pereira, R. C.; Sousa, J.; Batista, A. J. N.; Combo, A.; Carvalho, B. B.; Correia, C. M. B. A.; Varandas, C. A. F.

2011-08-01

The inherent parallelism of the logic resources, the flexibility in its configuration and the performance at high processing frequencies makes the field programmable gate array (FPGA) the most suitable device to be used both for real time algorithm processing and data transfer in instrumentation modules. Moreover, the reconfigurability of these FPGA based modules enables exploiting different applications on the same module. When using a reconfigurable module for various applications, the availability of a common interface library for easier implementation of the algorithms on the FPGA leads to more efficient development. The FPGA configuration is usually specified in a hardware description language (HDL) or other higher level descriptive language. The critical paths, such as the management of internal hardware clocks that require deep knowledge of the module behavior shall be implemented in HDL to optimize the timing constraints. The common interface library should include these critical paths, freeing the application designer from hardware complexity and able to choose any of the available high-level abstraction languages for the algorithm implementation. With this purpose a modular Verilog code was developed for the Virtex 4 FPGA of the in-house Transient Recorder and Processor (TRP) hardware module, based on the Advanced Telecommunications Computing Architecture (ATCA), with eight channels sampling at up to 400 MSamples/s (MSPS). The TRP was designed to perform real time Pulse Height Analysis (PHA), Pulse Shape Discrimination (PSD) and Pile-Up Rejection (PUR) algorithms at a high count rate (few Mevent/s). A brief description of this modular code is presented and examples of its use as an interface with end user algorithms, including a PHA with PUR, are described.
A Field Programmable Gate Array-Based Reconfigurable Smart-Sensor Network for Wireless Monitoring of New Generation Computer Numerically Controlled Machines

PubMed Central

Moreno-Tapia, Sandra Veronica; Vera-Salas, Luis Alberto; Osornio-Rios, Roque Alfredo; Dominguez-Gonzalez, Aurelio; Stiharu, Ion; de Jesus Romero-Troncoso, Rene

2010-01-01

Computer numerically controlled (CNC) machines have evolved to adapt to increasing technological and industrial requirements. To cover these needs, new generation machines have to perform monitoring strategies by incorporating multiple sensors. Since in most of applications the online Processing of the variables is essential, the use of smart sensors is necessary. The contribution of this work is the development of a wireless network platform of reconfigurable smart sensors for CNC machine applications complying with the measurement requirements of new generation CNC machines. Four different smart sensors are put under test in the network and their corresponding signal processing techniques are implemented in a Field Programmable Gate Array (FPGA)-based sensor node. PMID:22163602
A field programmable gate array-based reconfigurable smart-sensor network for wireless monitoring of new generation computer numerically controlled machines.

PubMed

Moreno-Tapia, Sandra Veronica; Vera-Salas, Luis Alberto; Osornio-Rios, Roque Alfredo; Dominguez-Gonzalez, Aurelio; Stiharu, Ion; Romero-Troncoso, Rene de Jesus

2010-01-01

Computer numerically controlled (CNC) machines have evolved to adapt to increasing technological and industrial requirements. To cover these needs, new generation machines have to perform monitoring strategies by incorporating multiple sensors. Since in most of applications the online Processing of the variables is essential, the use of smart sensors is necessary. The contribution of this work is the development of a wireless network platform of reconfigurable smart sensors for CNC machine applications complying with the measurement requirements of new generation CNC machines. Four different smart sensors are put under test in the network and their corresponding signal processing techniques are implemented in a Field Programmable Gate Array (FPGA)-based sensor node.
A Test Methodology for Determining Space-Readiness of Xilinx SRAM-Based FPGA Designs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather M; Graham, Paul S; Morgan, Keith S

2008-01-01

Using reconfigurable, static random-access memory (SRAM) based field-programmable gate arrays (FPGAs) for space-based computation has been an exciting area of research for the past decade. Since both the circuit and the circuit's state is stored in radiation-tolerant memory, both could be alterd by the harsh space radiation environment. Both the circuit and the circuit's state can be prote cted by triple-moduler redundancy (TMR), but applying TMR to FPGA user designs is often an error-prone process. Faulty application of TMR could cause the FPGA user circuit to output incorrect data. This paper will describe a three-tiered methodology for testing FPGA usermore » designs for space-readiness. We will describe the standard approach to testing FPGA user designs using a particle accelerator, as well as two methods using fault injection and a modeling tool. While accelerator testing is the current 'gold standard' for pre-launch testing, we believe the use of fault injection and modeling tools allows for easy, cheap and uniform access for discovering errors early in the design process.« less
FPGA Based Reconfigurable ATM Switch Test Bed

NASA Technical Reports Server (NTRS)

Chu, Pong P.; Jones, Robert E.

1998-01-01

Various issues associated with "FPGA Based Reconfigurable ATM Switch Test Bed" are presented in viewgraph form. Specific topics include: 1) Network performance evaluation; 2) traditional approaches; 3) software simulation; 4) hardware emulation; 5) test bed highlights; 6) design environment; 7) test bed architecture; 8) abstract sheared-memory switch; 9) detailed switch diagram; 10) traffic generator; 11) data collection circuit and user interface; 12) initial results; and 13) the following conclusions: Advances in FPGA make hardware emulation feasible for performance evaluation, hardware emulation can provide several orders of magnitude speed-up over software simulation; due to the complexity of hardware synthesis process, development in emulation is much more difficult than simulation and requires knowledge in both networks and digital design.
Toward a Dynamically Reconfigurable Computing and Communication System for Small Spacecraft

NASA Technical Reports Server (NTRS)

Kifle, Muli; Andro, Monty; Tran, Quang K.; Fujikawa, Gene; Chu, Pong P.

2003-01-01

Future science missions will require the use of multiple spacecraft with multiple sensor nodes autonomously responding and adapting to a dynamically changing space environment. The acquisition of random scientific events will require rapidly changing network topologies, distributed processing power, and a dynamic resource management strategy. Optimum utilization and configuration of spacecraft communications and navigation resources will be critical in meeting the demand of these stringent mission requirements. There are two important trends to follow with respect to NASA's (National Aeronautics and Space Administration) future scientific missions: the use of multiple satellite systems and the development of an integrated space communications network. Reconfigurable computing and communication systems may enable versatile adaptation of a spacecraft system's resources by dynamic allocation of the processor hardware to perform new operations or to maintain functionality due to malfunctions or hardware faults. Advancements in FPGA (Field Programmable Gate Array) technology make it possible to incorporate major communication and network functionalities in FPGA chips and provide the basis for a dynamically reconfigurable communication system. Advantages of higher computation speeds and accuracy are envisioned with tremendous hardware flexibility to ensure maximum survivability of future science mission spacecraft. This paper discusses the requirements, enabling technologies, and challenges associated with dynamically reconfigurable space communications systems.
A Modular Approach to Arithmetic and Logic Unit Design on a Reconfigurable Hardware Platform for Educational Purpose

NASA Astrophysics Data System (ADS)

Oztekin, Halit; Temurtas, Feyzullah; Gulbag, Ali

The Arithmetic and Logic Unit (ALU) design is one of the important topics in Computer Architecture and Organization course in Computer and Electrical Engineering departments. There are ALU designs that have non-modular nature to be used as an educational tool. As the programmable logic technology has developed rapidly, it is feasible that ALU design based on Field Programmable Gate Array (FPGA) is implemented in this course. In this paper, we have adopted the modular approach to ALU design based on FPGA. All the modules in the ALU design are realized using schematic structure on Altera's Cyclone II Development board. Under this model, the ALU content is divided into four distinct modules. These are arithmetic unit except for multiplication and division operations, logic unit, multiplication unit and division unit. User can easily design any size of ALU unit since this approach has the modular nature. Then, this approach was applied to microcomputer architecture design named BZK.SAU.FPGA10.0 instead of the current ALU unit.
Using reconfigurable hardware to accelerate multiple sequence alignment with ClustalW.

PubMed

Oliver, Tim; Schmidt, Bertil; Nathan, Darran; Clemens, Ralf; Maskell, Douglas

2005-08-15

Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model.

PubMed

Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid

2014-01-01

A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well.
Field Programmable Gate Array (FPGA) Respiratory Monitoring System Using a Flow Microsensor and an Accelerometer

NASA Astrophysics Data System (ADS)

Mellal, Idir; Laghrouche, Mourad; Bui, Hung Tien

2017-04-01

This paper describes a non-invasive system for respiratory monitoring using a Micro Electro Mechanical Systems (MEMS) flow sensor and an IMU (Inertial Measurement Unit) accelerometer. The designed system is intended to be wearable and used in a hospital or at home to assist people with respiratory disorders. To ensure the accuracy of our system, we proposed a calibration method based on ANN (Artificial Neural Network) to compensate the temperature drift of the silicon flow sensor. The sigmoid activation functions used in the ANN model were computed with the CORDIC (COordinate Rotation DIgital Computer) algorithm. This algorithm was also used to estimate the tilt angle in body position. The design was implemented on reconfigurable platform FPGA.
FPGA-Based Reconfigurable Processor for Ultrafast Interlaced Ultrasound and Photoacoustic Imaging

PubMed Central

Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing

2016-01-01

In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models. PMID:22828830
FPGA-based reconfigurable processor for ultrafast interlaced ultrasound and photoacoustic imaging.

PubMed

Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing

2012-07-01

In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models.
High-speed multiple sequence alignment on a reconfigurable platform.

PubMed

Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

2006-01-01

Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.

A Secure Content Delivery System Based on a Partially Reconfigurable FPGA

NASA Astrophysics Data System (ADS)

Hori, Yohei; Yokoyama, Hiroyuki; Sakane, Hirofumi; Toda, Kenji

We developed a content delivery system using a partially reconfigurable FPGA to securely distribute digital content on the Internet. With partial reconfigurability of a Xilinx Virtex-II Pro FPGA, the system provides an innovative single-chip solution for protecting digital content. In the system, a partial circuit must be downloaded from a server to the client terminal to play content. Content will be played only when the downloaded circuit is correctly combined (=interlocked) with the circuit built in the terminal. Since each circuit has a unique I/O configuration, the downloaded circuit interlocks with the corresponding built-in circuit designed for a particular terminal. Thus, the interface of the circuit itself provides a novel authentication mechanism. This paper describes the detailed architecture of the system and clarify the feasibility and effectiveness of the system. In addition, we discuss a fail-safe mechanism and future work necessary for the practical application of the system.
Dynamically programmable cache

NASA Astrophysics Data System (ADS)

Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas

1998-10-01

Reconfigurable machines have recently been used as co- processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machines. As a result DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implemented multi-context switching (Virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations resulting in faster execution time. In this paper, the speedup improvement for DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
FPGA implementation of a biological neural network based on the Hodgkin-Huxley neuron model

PubMed Central

Yaghini Bonabi, Safa; Asgharian, Hassan; Safari, Saeed; Nili Ahmadabadi, Majid

2014-01-01

A set of techniques for efficient implementation of Hodgkin-Huxley-based (H-H) model of a neural network on FPGA (Field Programmable Gate Array) is presented. The central implementation challenge is H-H model complexity that puts limits on the network size and on the execution speed. However, basics of the original model cannot be compromised when effect of synaptic specifications on the network behavior is the subject of study. To solve the problem, we used computational techniques such as CORDIC (Coordinate Rotation Digital Computer) algorithm and step-by-step integration in the implementation of arithmetic circuits. In addition, we employed different techniques such as sharing resources to preserve the details of model as well as increasing the network size in addition to keeping the network execution speed close to real time while having high precision. Implementation of a two mini-columns network with 120/30 excitatory/inhibitory neurons is provided to investigate the characteristic of our method in practice. The implementation techniques provide an opportunity to construct large FPGA-based network models to investigate the effect of different neurophysiological mechanisms, like voltage-gated channels and synaptic activities, on the behavior of a neural network in an appropriate execution time. Additional to inherent properties of FPGA, like parallelism and re-configurability, our approach makes the FPGA-based system a proper candidate for study on neural control of cognitive robots and systems as well. PMID:25484854
Reconfigurable fault tolerant avionics system

NASA Astrophysics Data System (ADS)

Ibrahim, M. M.; Asami, K.; Cho, Mengu

This paper presents the design of a reconfigurable avionics system based on modern Static Random Access Memory (SRAM)-based Field Programmable Gate Array (FPGA) to be used in future generations of nano satellites. A major concern in satellite systems and especially nano satellites is to build robust systems with low-power consumption profiles. The system is designed to be flexible by providing the capability of reconfiguring itself based on its orbital position. As Single Event Upsets (SEU) do not have the same severity and intensity in all orbital locations, having the maximum at the South Atlantic Anomaly (SAA) and the polar cusps, the system does not have to be fully protected all the time in its orbit. An acceptable level of protection against high-energy cosmic rays and charged particles roaming in space is provided within the majority of the orbit through software fault tolerance. Check pointing and roll back, besides control flow assertions, is used for that level of protection. In the minority part of the orbit where severe SEUs are expected to exist, a reconfiguration for the system FPGA is initiated where the processor systems are triplicated and protection through Triple Modular Redundancy (TMR) with feedback is provided. This technique of reconfiguring the system as per the level of the threat expected from SEU-induced faults helps in reducing the average dynamic power consumption of the system to one-third of its maximum. This technique can be viewed as a smart protection through system reconfiguration. The system is built on the commercial version of the (XC5VLX50) Xilinx Virtex5 FPGA on bulk silicon with 324 IO. Simulations of orbit SEU rates were carried out using the SPENVIS web-based software package.
Volumetric visualization algorithm development for an FPGA-based custom computing machine

NASA Astrophysics Data System (ADS)

Sallinen, Sami J.; Alakuijala, Jyrki; Helminen, Hannu; Laitinen, Joakim

1998-05-01

Rendering volumetric medical images is a burdensome computational task for contemporary computers due to the large size of the data sets. Custom designed reconfigurable hardware could considerably speed up volume visualization if an algorithm suitable for the platform is used. We present an algorithm and speedup techniques for visualizing volumetric medical CT and MR images with a custom-computing machine based on a Field Programmable Gate Array (FPGA). We also present simulated performance results of the proposed algorithm calculated with a software implementation running on a desktop PC. Our algorithm is capable of generating perspective projection renderings of single and multiple isosurfaces with transparency, simulated X-ray images, and Maximum Intensity Projections (MIP). Although more speedup techniques exist for parallel projection than for perspective projection, we have constrained ourselves to perspective viewing, because of its importance in the field of radiotherapy. The algorithm we have developed is based on ray casting, and the rendering is sped up by three different methods: shading speedup by gradient precalculation, a new generalized version of Ray-Acceleration by Distance Coding (RADC), and background ray elimination by speculative ray selection.
A Fixed Point VHDL Component Library for a High Efficiency Reconfigurable Radio Design Methodology

NASA Technical Reports Server (NTRS)

Hoy, Scott D.; Figueiredo, Marco A.

2006-01-01

Advances in Field Programmable Gate Array (FPGA) technologies enable the implementation of reconfigurable radio systems for both ground and space applications. The development of such systems challenges the current design paradigms and requires more robust design techniques to meet the increased system complexity. Among these techniques is the development of component libraries to reduce design cycle time and to improve design verification, consequently increasing the overall efficiency of the project development process while increasing design success rates and reducing engineering costs. This paper describes the reconfigurable radio component library developed at the Software Defined Radio Applications Research Center (SARC) at Goddard Space Flight Center (GSFC) Microwave and Communications Branch (Code 567). The library is a set of fixed-point VHDL components that link the Digital Signal Processing (DSP) simulation environment with the FPGA design tools. This provides a direct synthesis path based on the latest developments of the VHDL tools as proposed by the BEE VBDL 2004 which allows for the simulation and synthesis of fixed-point math operations while maintaining bit and cycle accuracy. The VHDL Fixed Point Reconfigurable Radio Component library does not require the use of the FPGA vendor specific automatic component generators and provide a generic path from high level DSP simulations implemented in Mathworks Simulink to any FPGA device. The access to the component synthesizable, source code provides full design verification capability:
PROGRAPE-1: A Programmable, Multi-Purpose Computer for Many-Body Simulations

NASA Astrophysics Data System (ADS)

Hamada, Tsuyoshi; Fukushige, Toshiyuki; Kawai, Atsushi; Makino, Junichiro

2000-10-01

We have developed PROGRAPE-1 (PROgrammable GRAPE-1), a programmable multi-purpose computer for many-body simulations. The main difference between PROGRAPE-1 and ``traditional'' GRAPE systems is that the former uses FPGA (Field Programmable Gate Array) chips as the processing elements, while the latter relies on a hardwired pipeline processor specialized to gravitational interactions. Since the logic implemented in FPGA chips can be reconfigured, we can use PROGRAPE-1 to calculate not only gravitational interactions, but also other forms of interactions, such as the van der Waals force, hydro\\-dynamical interactions in the SPHr calculation, and so on. PROGRAPE-1 comprises two Altera EPF10K100 FPGA chips, each of which contains nominally 100000 gates. To evaluate the programmability and performance of PROGRAPE-1, we implemented a pipeline for gravitational interactions similar to that of GRAPE-3. One pipeline is fitted into a single FPGA chip, operated at 16 MHz clock. Thus, for gravitational interactions, PROGRAPE-1 provided a speed of 0.96 Gflops-equivalent. PROGRAPE will prove to be useful for a wide-range of particle-based simulations in which the calculation cost of interactions other than gravity is high, such as the evaluation of SPH interactions.
Design of an FPGA-based electronic flow regulator (EFR) for spacecraft propulsion system

NASA Astrophysics Data System (ADS)

Manikandan, J.; Jayaraman, M.; Jayachandran, M.

2011-02-01

This paper describes a scheme for electronically regulating the flow of propellant to the thruster from a high-pressure storage tank used in spacecraft application. Precise flow delivery of propellant to thrusters ensures propulsion system operation at best efficiency by maximizing the propellant and power utilization for the mission. The proposed field programmable gate array (FPGA) based electronic flow regulator (EFR) is used to ensure precise flow of propellant to the thrusters from a high-pressure storage tank used in spacecraft application. This paper presents hardware and software design of electronic flow regulator and implementation of the regulation logic onto an FPGA.Motivation for proposed FPGA-based electronic flow regulation is on the disadvantages of conventional approach of using analog circuits. Digital flow regulation overcomes the analog equivalent as digital circuits are highly flexible, are not much affected due to noise, accurate performance is repeatable, interface is easier to computers, storing facilities are possible and finally failure rate of digital circuits is less. FPGA has certain advantages over ASIC and microprocessor/micro-controller that motivated us to opt for FPGA-based electronic flow regulator. Also the control algorithm being software, it is well modifiable without changing the hardware. This scheme is simple enough to adopt for a wide range of applications, where the flow is to be regulated for efficient operation.The proposed scheme is based on a space-qualified re-configurable field programmable gate arrays (FPGA) and hybrid micro circuit (HMC). A graphical user interface (GUI) based application software is also developed for debugging, monitoring and controlling the electronic flow regulator from PC COM port.
Fault-Tolerant, Radiation-Hard DSP

NASA Technical Reports Server (NTRS)

Czajkowski, David

2011-01-01

Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-highspeed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, thereby causing them to lose much of their reconfiguration capability and in some cases significant speed reduction. The benefits of COTS high-performance signal processing include significant increase in onboard science data processing, enabling orders of magnitude reduction in required communication bandwidth for science data return, orders of magnitude improvement in onboard mission planning and critical decision making, and the ability to rapidly respond to changing mission environments, thus enabling opportunistic science and orders of magnitude reduction in the cost of mission operations through reduction of required staff. Additional benefits of COTS-based, high-performance signal processing include the ability to leverage considerable commercial and academic investments in advanced computing tools, techniques, and infra structure, and the familiarity of the science and IT community with these computing environments.
FPGA design of correlation-based pattern recognition

NASA Astrophysics Data System (ADS)

Jridi, Maher; Alfalou, Ayman

2017-05-01

Optical/Digital pattern recognition and tracking based on optical/digital correlation are a well-known techniques to detect, identify and localize a target object in a scene. Despite the limited number of treatments required by the correlation scheme, computational time and resources are relatively high. The most computational intensive treatment required by the correlation is the transformation from spatial to spectral domain and then from spectral to spatial domain. Furthermore, these transformations are used on optical/digital encryption schemes like the double random phase encryption (DRPE). In this paper, we present a VLSI architecture for the correlation scheme based on the fast Fourier transform (FFT). One interesting feature of the proposed scheme is its ability to stream image processing in order to perform correlation for video sequences. A trade-off between the hardware consumption and the robustness of the correlation can be made in order to understand the limitations of the correlation implementation in reconfigurable and portable platforms. Experimental results obtained from HDL simulations and FPGA prototype have demonstrated the advantages of the proposed scheme.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms

NASA Astrophysics Data System (ADS)

Du, Hongtao; Qi, Hairong; Szu, Harold H.

2005-03-01

Recent advances in intelligence technology have boosted the development of micro- Unmanned Air Vehicles (UAVs) including Sliver Fox, Shadow, and Scan Eagle for various surveillance and reconnaissance applications. These affordable and reusable devices have to fit a series of size, weight, and power constraints. Cameras used on such micro-UAVs are therefore mounted directly at a fixed angle without any motion-compensated gimbals. This mounting scheme has resulted in the so-called jitter effect in which jitter is defined as sub-pixel or small amplitude vibrations. The jitter blur caused by the jitter effect needs to be corrected before any other processing algorithms can be practically applied. Jitter restoration has been solved by various optimization techniques, including Wiener approximation, maximum a-posteriori probability (MAP), etc. However, these algorithms normally assume a spatial-invariant blur model that is not the case with jitter blur. Szu et al. developed a smart real-time algorithm based on auto-regression (AR) with its natural generalization of unsupervised artificial neural network (ANN) learning to achieve restoration accuracy at the sub-pixel level. This algorithm resembles the capability of the human visual system, in which an agreement between the pair of eyes indicates "signal", otherwise, the jitter noise. Using this non-statistical method, for each single pixel, a deterministic blind sources separation (BSS) process can then be carried out independently based on a deterministic minimum of the Helmholtz free energy with a generalization of Shannon's information theory applied to open dynamic systems. From a hardware implementation point of view, the process of jitter restoration of an image using Szu's algorithm can be optimized by pixel-based parallelization. In our previous work, a parallelly structured independent component analysis (ICA) algorithm has been implemented on both Field Programmable Gate Array (FPGA) and Application-Specific Integrated Circuit (ASIC) using standard-height cells. ICA is an algorithm that can solve BSS problems by carrying out the all-order statistical, decorrelation-based transforms, in which an assumption that neighborhood pixels share the same but unknown mixing matrix A is made. In this paper, we continue our investigation on the design challenges of firmware approaches to smart algorithms. We think two levels of parallelization can be explored, including pixel-based parallelization and the parallelization of the restoration algorithm performed at each pixel. This paper focuses on the latter and we use ICA as an example to explain the design and implementation methods. It is well known that the capacity constraints of single FPGA have limited the implementation of many complex algorithms including ICA. Using the reconfigurability of FPGA, we show, in this paper, how to manipulate the FPGA-based system to provide extra computing power for the parallelized ICA algorithm with limited FPGA resources. The synthesis aiming at the pilchard re-configurable FPGA platform is reported. The pilchard board is embedded with single Xilinx VIRTEX 1000E FPGA and transfers data directly to CPU on the 64-bit memory bus at the maximum frequency of 133MHz. Both the feasibility performance evaluations and experimental results validate the effectiveness and practicality of this synthesis, which can be extended to the spatial-variant jitter restoration for micro-UAV deployment.
Colt: an experiment in wormhole run-time reconfiguration

NASA Astrophysics Data System (ADS)

Bittner, Ray; Athanas, Peter M.; Musgrove, Mark

1996-10-01

Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
Temperature-Adaptive Circuits on Reconfigurable Analog Arrays

NASA Technical Reports Server (NTRS)

Stoica, Adrian; Zebulum, Ricardo S.; Keymeulen, Didier; Ramesham, Rajeshuni; Neff, Joseph; Katkoori, Srinivas

2006-01-01

Demonstration of a self-reconfigurable Integrated Circuit (IC) that would operate under extreme temperature (-180 C and 120 C) and radiation (300krad), without the protection of thermal controls and radiation shields. Self-Reconfigurable Electronics platform: a) Evolutionary Processor (EP) to run reconfiguration mechanism; b) Reconfigurable chip (FPGA, FPAA, etc).
Spacecube V2.0 Micro Single Board Computer

NASA Technical Reports Server (NTRS)

Petrick, David J. (Inventor); Geist, Alessandro (Inventor); Lin, Michael R. (Inventor); Crum, Gary R. (Inventor)

2017-01-01

A single board computer system radiation hardened for space flight includes a printed circuit board having a top side and bottom side; a reconfigurable field programmable gate array (FPGA) processor device disposed on the top side; a connector disposed on the top side; a plurality of peripheral components mounted on the bottom side; and wherein a size of the single board computer system is not greater than approximately 7 cm.times.7 cm.
A Reconfigurable Communications System for Small Spacecraft

NASA Technical Reports Server (NTRS)

Chu, Pong P.; Kifle, Muli

2004-01-01

Two trends of NASA missions are the use of multiple small spacecraft and the development of an integrated space network. To achieve these goals, a robust and agile communications system is needed. Advancements in field programmable gate array (FPGA) technology have made it possible to incorporate major communication and network functionalities in FPGA chips; thus this technology has great potential as the basis for a reconfigurable communications system. This report discusses the requirements of future space communications, reviews relevant issues, and proposes a methodology to design and construct a reconfigurable communications system for small scientific spacecraft.
The trigger system for the external target experiment in the HIRFL cooling storage ring

NASA Astrophysics Data System (ADS)

Li, Min; Zhao, Lei; Liu, Jin-Xin; Lu, Yi-Ming; Liu, Shu-Bin; An, Qi

2016-08-01

A trigger system was designed for the external target experiment in the Cooling Storage Ring (CSR) of the Heavy Ion Research Facility in Lanzhou (HIRFL). Considering that different detectors are scattered over a large area, the trigger system is designed based on a master-slave structure and fiber-based serial data transmission technique. The trigger logic is organized in hierarchies, and flexible reconfiguration of the trigger function is achieved based on command register access or overall field-programmable gate array (FPGA) logic on-line reconfiguration controlled by remote computers. We also conducted tests to confirm the function of the trigger electronics, and the results indicate that this trigger system works well. Supported by the National Natural Science Foundation of China (11079003), the Knowledge Innovation Program of the Chinese Academy of Sciences (KJCX2-YW-N27), and the CAS Center for Excellence in Particle Physics (CCEPP).
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Seyong; Kim, Jungwon; Vetter, Jeffrey S

This paper presents a directive-based, high-level programming framework for high-performance reconfigurable computing. It takes a standard, portable OpenACC C program as input and generates a hardware configuration file for execution on FPGAs. We implemented this prototype system using our open-source OpenARC compiler; it performs source-to-source translation and optimization of the input OpenACC program into an OpenCL code, which is further compiled into a FPGA program by the backend Altera Offline OpenCL compiler. Internally, the design of OpenARC uses a high- level intermediate representation that separates concerns of program representation from underlying architectures, which facilitates portability of OpenARC. In fact, thismore » design allowed us to create the OpenACC-to-FPGA translation framework with minimal extensions to our existing system. In addition, we show that our proposed FPGA-specific compiler optimizations and novel OpenACC pragma extensions assist the compiler in generating more efficient FPGA hardware configuration files. Our empirical evaluation on an Altera Stratix V FPGA with eight OpenACC benchmarks demonstrate the benefits of our strategy. To demonstrate the portability of OpenARC, we show results for the same benchmarks executing on other heterogeneous platforms, including NVIDIA GPUs, AMD GPUs, and Intel Xeon Phis. This initial evidence helps support the goal of using a directive-based, high-level programming strategy for performance portability across heterogeneous HPC architectures.« less
A digital frequency stabilization system of external cavity diode laser based on LabVIEW FPGA

NASA Astrophysics Data System (ADS)

Liu, Zhuohuan; Hu, Zhaohui; Qi, Lu; Wang, Tao

2015-10-01

Frequency stabilization for external cavity diode laser has played an important role in physics research. Many laser frequency locking solutions have been proposed by researchers. Traditionally, the locking process was accomplished by analog system, which has fast feedback control response speed. However, analog system is susceptible to the effects of environment. In order to improve the automation level and reliability of the frequency stabilization system, we take a grating-feedback external cavity diode laser as the laser source and set up a digital frequency stabilization system based on National Instrument's FPGA (NI FPGA). The system consists of a saturated absorption frequency stabilization of beam path, a differential photoelectric detector, a NI FPGA board and a host computer. Many functions, such as piezoelectric transducer (PZT) sweeping, atomic saturation absorption signal acquisition, signal peak identification, error signal obtaining and laser PZT voltage feedback controlling, are totally completed by LabVIEW FPGA program. Compared with the analog system, the system built by the logic gate circuits, performs stable and reliable. User interface programmed by LabVIEW is friendly. Besides, benefited from the characteristics of reconfiguration, the LabVIEW program is good at transplanting in other NI FPGA boards. Most of all, the system periodically checks the error signal. Once the abnormal error signal is detected, FPGA will restart frequency stabilization process without manual control. Through detecting the fluctuation of error signal of the atomic saturation absorption spectrum line in the frequency locking state, we can infer that the laser frequency stability can reach 1MHz.
Design of the Wind Tunnel Model Communication Controller Board. Degree awarded by Christopher Newport Univ. on Dec. 1998

NASA Technical Reports Server (NTRS)

Wilson, William C.

1999-01-01

The NASA Langley Research Center's Wind Tunnel Reinvestment project plans to shrink the existing data acquisition electronics to fit inside a wind tunnel model. Space limitations within a model necessitate a distributed system of Application Specific Integrated Circuits (ASICs) rather than a centralized system based on PC boards. This thesis will focus on the design of the prototype of the communication Controller board. A portion of the communication Controller board is to be used as the basis of an ASIC design. The communication Controller board will communicate between the internal model modules and the external data acquisition computer. This board is based around an Field Programmable Gate Array (FPGA), to allow for reconfigurability. In addition to the FPGA, this board contains buffer Random Access Memory (RAM), configuration memory (EEPROM), drivers for the communications ports, and passive components.
A Gigabit-per-Second Ka-Band Demonstration Using a Reconfigurable FPGA Modulator

NASA Technical Reports Server (NTRS)

Lee, Dennis; Gray, Andrew A.; Kang, Edward C.; Tsou, Haiping; Lay, Norman E.; Fong, Wai; Fisher, Dave; Hoy, Scott

2005-01-01

Gigabit-per-second communications have been a desired target for future NASA Earth science missions, and for potential manned lunar missions. Frequency bandwidth at S-band and X-band is typically insufficient to support missions at these high data rates. In this paper, we present the results of a 1 Gbps 32-QAM end-to-end experiment at Ka-band using a reconfigurable Field Programmable Gate Array (FPGA) baseband modulator board. Bit error rate measurements of the received signal using a software receiver demonstrate the feasibility of using ultra-high data rates at Ka-band, although results indicate that error correcting coding and/or modulator predistortion must be implemented in addition. Also, results of the demonstration validate the low-cost, MOS-based reconfigurable modulator approach taken to development of a high rate modulator, as opposed to more expensive ASIC or pure analog approaches.

A CCD experimental platform for large telescope in Antarctica based on FPGA

NASA Astrophysics Data System (ADS)

Zhu, Yuhua; Qi, Yongjun

2014-07-01

The CCD , as a detector , is one of the important components of astronomical telescopes. For a large telescope in Antarctica, a set of CCD detector system with large size, high sensitivity and low noise is indispensable. Because of the extremely low temperatures and unattended, system maintenance and software and hardware upgrade become hard problems. This paper introduces a general CCD controller experiment platform, using Field programmable gate array FPGA, which is, in fact, a large-scale field reconfigurable array. Taking the advantage of convenience to modify the system, construction of driving circuit, digital signal processing module, network communication interface, control algorithm validation, and remote reconfigurable module may realize. With the concept of integrated hardware and software, the paper discusses the key technology of building scientific CCD system suitable for the special work environment in Antarctica, focusing on the method of remote reconfiguration for controller via network and then offering a feasible hardware and software solution.
STRS Compliant FPGA Waveform Development

NASA Technical Reports Server (NTRS)

Nappier, Jennifer; Downey, Joseph; Mortensen, Dale

2008-01-01

The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. The extension of STRS to the SSP hardware will promote easier waveform reconfiguration and reuse. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. A FPGA-based transmit waveform implementation of the proposed standard interfaces on a laboratory breadboard SDR will be discussed.
Application of Reconfigurable Computing Technology to Multi-KiloHertz Micro-Laser Altimeter (MMLA) Data Processing

NASA Technical Reports Server (NTRS)

Powell, Wesley; Dabney, Philip; Hicks, Edward; Pinchinat, Maxime; Day, John H. (Technical Monitor)

2002-01-01

The Multi-KiloHertz Micro-Laser Altimeter (MMLA) is an aircraft based instrument developed by NASA Goddard Space Flight Center with several potential spaceflight applications. This presentation describes how reconfigurable computing technology was employed to perform MMLA signal extraction in real-time under realistic operating constraints. The MMLA is a "single-photon-counting" airborne laser altimeter that is used to measure land surface features such as topography and vegetation canopy height. This instrument has to date flown a number of times aboard the NASA P3 aircraft acquiring data at a number of sites in the Mid-Atlantic region. This instrument pulses a relatively low-powered laser at a very high rate (10 kHz) and then measures the time-of-flight of discrete returns from the target surface. The instrument then bins these measurements into a two-dimensional array (vertical height vs. horizontal ground track) and selects the most likely signal path through the array. Return data that does not correspond to the selected signal path are classified as noise returns and are then discarded. The MMLA signal extraction algorithm is very compute intensive in that a score must be computed for every possible path through the two dimensional array in order to select the most likely signal path. Given a typical array size with 50 x 6, up to 33 arrays must be processed per second. And for each of these arrays, roughly 12,000 individual paths must be scored. Furthermore, the number of paths increases exponentially with the horizontal size of the array, and linearly with the vertical size. Yet, increasing the horizontal and vertical sizes of the array offer science advantages such as improved range, resolution, and noise rejection. Due to the volume of return data and the compute intensive signal extraction algorithm, the existing PC-based MMLA data system has been unable to perform signal extraction in real-time unless the array is limited in size to one column, This limits the ability of the MMLA to operate in environments with sparse signal returns and a high number of noise return. However, under an IR&D project, an FPGA-based, reconfigurable computing data system has been developed that has been demonstrated to perform real-time signal extraction under realistic operating constraints. This reconfigurable data system is based on the commercially available Firebird Board from Annapolis Microsystems. This PCI board consists of a Xilinx Virtex 2000E FPGA along with 36 MB of SRAM arranged in five separately addressable banks. This board is housed in a rackmount PC with dual 850MHz Pentium processors running the Windows 2000 operating system. This data system performs all signal extraction in hardware on the Firebird, but also runs the existing "software based" signal extraction in tandem for comparison purposes. Using a relatively small amount of the Virtex XCV2000E resources, the reconfigurable data system has demonstrated to improve performance improvement over the existing software based data system by an order of magnitude. Performance could be further improved by employing parallelism. Ground testing and a preliminary engineering test flight aboard the NASA P3 has been performed, during which the reconfigurable data system has been demonstrated to match the results of the existing data system.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Learn, Mark Walter

Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not availablemore » to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.« less
Implementation of Multispectral Image Classification on a Remote Adaptive Computer

NASA Technical Reports Server (NTRS)

Figueiredo, Marco A.; Gloster, Clay S.; Stephens, Mark; Graves, Corey A.; Nakkar, Mouna

1999-01-01

As the demand for higher performance computers for the processing of remote sensing science algorithms increases, the need to investigate new computing paradigms its justified. Field Programmable Gate Arrays enable the implementation of algorithms at the hardware gate level, leading to orders of m a,gnitude performance increase over microprocessor based systems. The automatic classification of spaceborne multispectral images is an example of a computation intensive application, that, can benefit from implementation on an FPGA - based custom computing machine (adaptive or reconfigurable computer). A probabilistic neural network is used here to classify pixels of of a multispectral LANDSAT-2 image. The implementation described utilizes Java client/server application programs to access the adaptive computer from a remote site. Results verify that a remote hardware version of the algorithm (implemented on an adaptive computer) is significantly faster than a local software version of the same algorithm implemented on a typical general - purpose computer).
Research, Development and Testing of a Fault-Tolerant FPGA-Based Sequencer for CubeSat Launching Applications

DTIC Science & Technology

2013-03-01

amounts of time and effort to implement. Future testing with commercial, fault-tolerant synthesis software, under a radiation environment, will yield ...initial viewpoint of the author is to take the flash-based FPGA route. This will yield a simple, reconfigurable circuit while providing the added...structure seen in Figure 30. Each of these full adder blocks were replaced in subsequent iterations to yield proper comparison with this baseline
Embedded algorithms within an FPGA-based system to process nonlinear time series data

NASA Astrophysics Data System (ADS)

Jones, Jonathan D.; Pei, Jin-Song; Tull, Monte P.

2008-03-01

This paper presents some preliminary results of an ongoing project. A pattern classification algorithm is being developed and embedded into a Field-Programmable Gate Array (FPGA) and microprocessor-based data processing core in this project. The goal is to enable and optimize the functionality of onboard data processing of nonlinear, nonstationary data for smart wireless sensing in structural health monitoring. Compared with traditional microprocessor-based systems, fast growing FPGA technology offers a more powerful, efficient, and flexible hardware platform including on-site (field-programmable) reconfiguration capability of hardware. An existing nonlinear identification algorithm is used as the baseline in this study. The implementation within a hardware-based system is presented in this paper, detailing the design requirements, validation, tradeoffs, optimization, and challenges in embedding this algorithm. An off-the-shelf high-level abstraction tool along with the Matlab/Simulink environment is utilized to program the FPGA, rather than coding the hardware description language (HDL) manually. The implementation is validated by comparing the simulation results with those from Matlab. In particular, the Hilbert Transform is embedded into the FPGA hardware and applied to the baseline algorithm as the centerpiece in processing nonlinear time histories and extracting instantaneous features of nonstationary dynamic data. The selection of proper numerical methods for the hardware execution of the selected identification algorithm and consideration of the fixed-point representation are elaborated. Other challenges include the issues of the timing in the hardware execution cycle of the design, resource consumption, approximation accuracy, and user flexibility of input data types limited by the simplicity of this preliminary design. Future work includes making an FPGA and microprocessor operate together to embed a further developed algorithm that yields better computational and power efficiency.
High-Performance, Radiation-Hardened Electronics for Space Environments

NASA Technical Reports Server (NTRS)

Keys, Andrew S.; Watson, Michael D.; Frazier, Donald O.; Adams, James H.; Johnson, Michael A.; Kolawa, Elizabeth A.

2007-01-01

The Radiation Hardened Electronics for Space Environments (RHESE) project endeavors to advance the current state-of-the-art in high-performance, radiation-hardened electronics and processors, ensuring successful performance of space systems required to operate within extreme radiation and temperature environments. Because RHESE is a project within the Exploration Technology Development Program (ETDP), RHESE's primary customers will be the human and robotic missions being developed by NASA's Exploration Systems Mission Directorate (ESMD) in partial fulfillment of the Vision for Space Exploration. Benefits are also anticipated for NASA's science missions to planetary and deep-space destinations. As a technology development effort, RHESE provides a broad-scoped, full spectrum of approaches to environmentally harden space electronics, including new materials, advanced design processes, reconfigurable hardware techniques, and software modeling of the radiation environment. The RHESE sub-project tasks are: SelfReconfigurable Electronics for Extreme Environments, Radiation Effects Predictive Modeling, Radiation Hardened Memory, Single Event Effects (SEE) Immune Reconfigurable Field Programmable Gate Array (FPGA) (SIRF), Radiation Hardening by Software, Radiation Hardened High Performance Processors (HPP), Reconfigurable Computing, Low Temperature Tolerant MEMS by Design, and Silicon-Germanium (SiGe) Integrated Electronics for Extreme Environments. These nine sub-project tasks are managed by technical leads as located across five different NASA field centers, including Ames Research Center, Goddard Space Flight Center, the Jet Propulsion Laboratory, Langley Research Center, and Marshall Space Flight Center. The overall RHESE integrated project management responsibility resides with NASA's Marshall Space Flight Center (MSFC). Initial technology development emphasis within RHESE focuses on the hardening of Field Programmable Gate Arrays (FPGA)s and Field Programmable Analog Arrays (FPAA)s for use in reconfigurable architectures. As these component/chip level technologies mature, the RHESE project emphasis shifts to focus on efforts encompassing total processor hardening techniques and board-level electronic reconfiguration techniques featuring spare and interface modularity. This phased approach to distributing emphasis between technology developments provides hardened FPGA/FPAAs for early mission infusion, then migrates to hardened, board-level, high speed processors with associated memory elements and high density storage for the longer duration missions encountered for Lunar Outpost and Mars Exploration occurring later in the Constellation schedule.
An adaptive cryptographic accelerator for network storage security on dynamically reconfigurable platform

NASA Astrophysics Data System (ADS)

Tang, Li; Liu, Jing-Ning; Feng, Dan; Tong, Wei

2008-12-01

Existing security solutions in network storage environment perform poorly because cryptographic operations (encryption and decryption) implemented in software can dramatically reduce system performance. In this paper we propose a cryptographic hardware accelerator on dynamically reconfigurable platform for the security of high performance network storage system. We employ a dynamic reconfigurable platform based on a FPGA to implement a PowerPCbased embedded system, which executes cryptographic algorithms. To reduce the reconfiguration latency, we apply prefetch scheduling. Moreover, the processing elements could be dynamically configured to support different cryptographic algorithms according to the request received by the accelerator. In the experiment, we have implemented AES (Rijndael) and 3DES cryptographic algorithms in the reconfigurable accelerator. Our proposed reconfigurable cryptographic accelerator could dramatically increase the performance comparing with the traditional software-based network storage systems.
The SMS4 cryptographic system design based on dynamic partial self-reconfiguration technology

NASA Astrophysics Data System (ADS)

Wang, Jianxin; Gao, Xianwei; Li, Xiuying; Sui, Meili

2013-03-01

This paper describes SMS4 algorithm by using dynamic partial self-reconfiguration. The design is implemented on Xilinx VirtexII-Pro XC2VP30 FPGA devices. The partial self-reconfiguration encryption/decryption module data throughput is up to 50Mb/s, key expansion and encryption/decryption modules use 1606 and 1570 slices respectively, and the resource utilization ratio of the key expansion by using partial self-reconfiguration technology is less 32.03% and slices are less 757 than the non-reconfiguration technology. SMS4 implementation gets a good balance between high performance and low complexity in area. The theoretical and practical research of dynamic partial self-reconfiguration has a broad space for development and application prospect.
Integration of multi-interface conversion channel using FPGA for modular photonic network

NASA Astrophysics Data System (ADS)

Janicki, Tomasz; Pozniak, Krzysztof T.; Romaniuk, Ryszard S.

2010-09-01

The article discusses the integration of different types of interfaces with FPGA circuits using a reconfigurable communication platform. The solution has been implemented in practice in a single node of a distributed measurement system. Construction of communication platform has been presented with its selected hardware modules, described in VHDL and implemented in FPGA circuits. The graphical user interface (GUI) has been described that allows a user to control the operation of the system. In the final part of the article selected practical solutions have been introduced. The whole measurement system resides on multi-gigabit optical network. The optical network construction is highly modular, reconfigurable and scalable.
High-performance reconfigurable coincidence counting unit based on a field programmable gate array.

PubMed

Park, Byung Kwon; Kim, Yong-Su; Kwon, Osung; Han, Sang-Wook; Moon, Sung

2015-05-20

We present a high-performance reconfigurable coincidence counting unit (CCU) using a low-end field programmable gate array (FPGA) and peripheral circuits. Because of the flexibility guaranteed by the FPGA program, we can easily change system parameters, such as internal input delays, coincidence configurations, and the coincidence time window. In spite of a low-cost implementation, the proposed CCU architecture outperforms previous ones in many aspects: it has 8 logic inputs and 4 coincidence outputs that can measure up to eight-fold coincidences. The minimum coincidence time window and the maximum input frequency are 0.47 ns and 163 MHz, respectively. The CCU will be useful in various experimental research areas, including the field of quantum optics and quantum information.
Facilitating preemptive hardware system design using partial reconfiguration techniques.

PubMed

Dondo Gazzano, Julio; Rincon, Fernando; Vaderrama, Carlos; Villanueva, Felix; Caba, Julian; Lopez, Juan Carlos

2014-01-01

In FPGA-based control system design, partial reconfiguration is especially well suited to implement preemptive systems. In real-time systems, the deadline for critical task can compel the preemption of noncritical one. Besides, an asynchronous event can demand immediate attention and, then, force launching a reconfiguration process for high-priority task implementation. If the asynchronous event is previously scheduled, an explicit activation of the reconfiguration process is performed. If the event cannot be previously programmed, such as in dynamically scheduled systems, an implicit activation to the reconfiguration process is demanded. This paper provides a hardware-based approach to explicit and implicit activation of the partial reconfiguration process in dynamically reconfigurable SoCs and includes all the necessary tasks to cope with this issue. Furthermore, the reconfiguration service introduced in this work allows remote invocation of the reconfiguration process and then the remote integration of off-chip components. A model that offers component location transparency is also presented to enhance and facilitate system integration.
Facilitating Preemptive Hardware System Design Using Partial Reconfiguration Techniques

PubMed Central

Rincon, Fernando; Vaderrama, Carlos; Villanueva, Felix; Caba, Julian; Lopez, Juan Carlos

2014-01-01

In FPGA-based control system design, partial reconfiguration is especially well suited to implement preemptive systems. In real-time systems, the deadline for critical task can compel the preemption of noncritical one. Besides, an asynchronous event can demand immediate attention and, then, force launching a reconfiguration process for high-priority task implementation. If the asynchronous event is previously scheduled, an explicit activation of the reconfiguration process is performed. If the event cannot be previously programmed, such as in dynamically scheduled systems, an implicit activation to the reconfiguration process is demanded. This paper provides a hardware-based approach to explicit and implicit activation of the partial reconfiguration process in dynamically reconfigurable SoCs and includes all the necessary tasks to cope with this issue. Furthermore, the reconfiguration service introduced in this work allows remote invocation of the reconfiguration process and then the remote integration of off-chip components. A model that offers component location transparency is also presented to enhance and facilitate system integration. PMID:24672292
PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

NASA Astrophysics Data System (ADS)

Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

2004-11-01

Recently surveillance and Automatic Target Recognition (ATR) applications are increasing as the cost of computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-Art electro-optical imaging systems to provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/compute vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval, are often important components of intelligent systems. The aim of this work is using embedded FPGA as a fast, configurable and synthesizable search engine in fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and show a low cost Content Addressable Memory (CAM)-based distributed embedded FPGA hardware architecture solution with real time recognition capabilities and computing for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers a performance advantage of an order-of-magnitude over RAM-based architecture (Random Access Memory) search for implementing high speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM based embedded architecture are described here. Other SOPC solutions/design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low cost reconfigurable fast image pattern recognition/retrieval at the hardware/software co-design level.
FPGA-based gating and logic for multichannel single photon counting

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pooser, Raphael C; Earl, Dennis Duncan; Evans, Philip G

2012-01-01

We present results characterizing multichannel InGaAs single photon detectors utilizing gated passive quenching circuits (GPQC), self-differencing techniques, and field programmable gate array (FPGA)-based logic for both diode gating and coincidence counting. Utilizing FPGAs for the diode gating frontend and the logic counting backend has the advantage of low cost compared to custom built logic circuits and current off-the-shelf detector technology. Further, FPGA logic counters have been shown to work well in quantum key distribution (QKD) test beds. Our setup combines multiple independent detector channels in a reconfigurable manner via an FPGA backend and post processing in order to perform coincidencemore » measurements between any two or more detector channels simultaneously. Using this method, states from a multi-photon polarization entangled source are detected and characterized via coincidence counting on the FPGA. Photons detection events are also processed by the quantum information toolkit for application testing (QITKAT)« less
Active vibration control of a full scale aircraft wing using a reconfigurable controller

NASA Astrophysics Data System (ADS)

Prakash, Shashikala; Renjith Kumar, T. G.; Raja, S.; Dwarakanathan, D.; Subramani, H.; Karthikeyan, C.

2016-01-01

This work highlights the design of a Reconfigurable Active Vibration Control (AVC) System for aircraft structures using adaptive techniques. The AVC system with a multichannel capability is realized using Filtered-X Least Mean Square algorithm (FxLMS) on Xilinx Virtex-4 Field Programmable Gate Array (FPGA) platform in Very High Speed Integrated Circuits Hardware Description Language, (VHDL). The HDL design is made based on Finite State Machine (FSM) model with Floating point Intellectual Property (IP) cores for arithmetic operations. The use of FPGA facilitates to modify the system parameters even during runtime depending on the changes in user's requirements. The locations of the control actuators are optimized based on dynamic modal strain approach using genetic algorithm (GA). The developed system has been successfully deployed for the AVC testing of the full-scale wing of an all composite two seater transport aircraft. Several closed loop configurations like single channel and multi-channel control have been tested. The experimental results from the studies presented here are very encouraging. They demonstrate the usefulness of the system's reconfigurability for real time applications.
Exploiting the chaotic behaviour of atmospheric models with reconfigurable architectures

NASA Astrophysics Data System (ADS)

Russell, Francis P.; Düben, Peter D.; Niu, Xinyu; Luk, Wayne; Palmer, T. N.

2017-12-01

Reconfigurable architectures are becoming mainstream: Amazon, Microsoft and IBM are supporting such architectures in their data centres. The computationally intensive nature of atmospheric modelling is an attractive target for hardware acceleration using reconfigurable computing. Performance of hardware designs can be improved through the use of reduced-precision arithmetic, but maintaining appropriate accuracy is essential. We explore reduced-precision optimisation for simulating chaotic systems, targeting atmospheric modelling, in which even minor changes in arithmetic behaviour will cause simulations to diverge quickly. The possibility of equally valid simulations having differing outcomes means that standard techniques for comparing numerical accuracy are inappropriate. We use the Hellinger distance to compare statistical behaviour between reduced-precision CPU implementations to guide reconfigurable designs of a chaotic system, then analyse accuracy, performance and power efficiency of the resulting implementations. Our results show that with only a limited loss in accuracy corresponding to less than 10% uncertainty in input parameters, the throughput and energy efficiency of a single-precision chaotic system implemented on a Xilinx Virtex-6 SX475T Field Programmable Gate Array (FPGA) can be more than doubled.
Dynamic partial reconfiguration of logic controllers implemented in FPGAs

NASA Astrophysics Data System (ADS)

Bazydło, Grzegorz; Wiśniewski, Remigiusz

2016-09-01

Technological progress in recent years benefits in digital circuits containing millions of logic gates with the capability for reprogramming and reconfiguring. On the one hand it provides the unprecedented computational power, but on the other hand the modelled systems are becoming increasingly complex, hierarchical and concurrent. Therefore, abstract modelling supported by the Computer Aided Design tools becomes a very important task. Even the higher consumption of the basic electronic components seems to be acceptable because chip manufacturing costs tend to fall over the time. The paper presents a modelling approach for logic controllers with the use of Unified Modelling Language (UML). Thanks to the Model Driven Development approach, starting with a UML state machine model, through the construction of an intermediate Hierarchical Concurrent Finite State Machine model, a collection of Verilog files is created. The system description generated in hardware description language can be synthesized and implemented in reconfigurable devices, such as FPGAs. Modular specification of the prototyped controller permits for further dynamic partial reconfiguration of the prototyped system. The idea bases on the exchanging of the functionality of the already implemented controller without stopping of the FPGA device. It means, that a part (for example a single module) of the logic controller is replaced by other version (called context), while the rest of the system is still running. The method is illustrated by a practical example by an exemplary Home Area Network system.
Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems

NASA Technical Reports Server (NTRS)

Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.

2012-01-01

Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, a FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGAs used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.

Using SRAM Based FPGAs for Power-Aware High Performance Wireless Sensor Networks

PubMed Central

Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa

2012-01-01

While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today’s applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved by either more powerful microcontrollers, though more power consumption or, in general, any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA solutions, might appear as a candidate technology, since though power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance high capacity state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, as well as a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to proof that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements. PMID:22736971
Using SRAM based FPGAs for power-aware high performance wireless sensor networks.

PubMed

Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa

2012-01-01

While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today's applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved by either more powerful microcontrollers, though more power consumption or, in general, any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA solutions, might appear as a candidate technology, since though power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance high capacity state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, as well as a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to proof that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements.
DDGIPS: a general image processing system in robot vision

NASA Astrophysics Data System (ADS)

Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang

2000-10-01

Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi-algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6201s each has a local memory space, and they also have a shared system memory space which enables them to intercommunicate and exchange data efficiently. At the same time, they can be directly inter-connected in star-shaped architecture. All of these are under the control of a FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
DDGIPS: a general image processing system in robot vision

NASA Astrophysics Data System (ADS)

Tian, Yuan; Ying, Jun; Ye, Xiuqing; Gu, Weikang

2000-10-01

Real-Time Image Processing is the key work in robot vision. With the limitation of the hardware technique, many algorithm-oriented firmware systems were designed in the past. But their architectures were not flexible enough to achieve a multi- algorithm development system. Because of the rapid development of microelectronics technique, many high performance DSP chips and high density FPGA chips have come to life, and this makes it possible to construct a more flexible architecture in real-time image processing system. In this paper, a Double DSP General Image Processing System (DDGIPS) is concerned. We try to construct a two-DSP-based FPGA-computational system with two TMS320C6201s. The TMS320C6x devices are fixed-point processors based on the advanced VLIW CPU, which has eight functional units, including two multipliers and six arithmetic logic units. These features make C6x a good candidate for a general purpose system. In our system, the two TMS320C6210s each has a local memory space, and they also have a shared system memory space which enable them to intercommunicate and exchange data efficiently. At the same time, they can be directly interconnected in star- shaped architecture. All of these are under the control of FPGA group. As the core of the system, FPGA plays a very important role: it takes charge of DPS control, DSP communication, memory space access arbitration and the communication between the system and the host machine. And taking advantage of reconfiguring FPGA, all of the interconnection between the two DSP or between DSP and FPGA can be changed. In this way, users can easily rebuild the real-time image processing system according to the data stream and the task of the application and gain great flexibility.
Laplace Transform Based Radiative Transfer Studies

NASA Astrophysics Data System (ADS)

Hu, Y.; Lin, B.; Ng, T.; Yang, P.; Wiscombe, W.; Herath, J.; Duffy, D.

2006-12-01

Multiple scattering is the major uncertainty for data analysis of space-based lidar measurements. Until now, accurate quantitative lidar data analysis has been limited to very thin objects that are dominated by single scattering, where photons from the laser beam only scatter a single time with particles in the atmosphere before reaching the receiver, and simple linear relationship between physical property and lidar signal exists. In reality, multiple scattering is always a factor in space-based lidar measurement and it dominates space- based lidar returns from clouds, dust aerosols, vegetation canopy and phytoplankton. While multiple scattering are clear signals, the lack of a fast-enough lidar multiple scattering computation tool forces us to treat the signal as unwanted "noise" and use simple multiple scattering correction scheme to remove them. Such multiple scattering treatments waste the multiple scattering signals and may cause orders of magnitude errors in retrieved physical properties. Thus the lack of fast and accurate time-dependent radiative transfer tools significantly limits lidar remote sensing capabilities. Analyzing lidar multiple scattering signals requires fast and accurate time-dependent radiative transfer computations. Currently, multiple scattering is done with Monte Carlo simulations. Monte Carlo simulations take minutes to hours and are too slow for interactive satellite data analysis processes and can only be used to help system / algorithm design and error assessment. We present an innovative physics approach to solve the time-dependent radiative transfer problem. The technique utilizes FPGA based reconfigurable computing hardware. The approach is as following, 1. Physics solution: Perform Laplace transform on the time and spatial dimensions and Fourier transform on the viewing azimuth dimension, and convert the radiative transfer differential equation solving into a fast matrix inversion problem. The majority of the radiative transfer computation goes to matrix inversion processes, FFT and inverse Laplace transforms. 2. Hardware solutions: Perform the well-defined matrix inversion, FFT and Laplace transforms on highly parallel, reconfigurable computing hardware. This physics-based computational tool leads to accurate quantitative analysis of space-based lidar signals and improves data quality of current lidar mission such as CALIPSO. This presentation will introduce the basic idea of this approach, preliminary results based on SRC's FPGA-based Mapstation, and how we may apply it to CALIPSO data analysis.
Adapting the Reconfigurable SpaceCube Processing System for Multiple Mission Applications

NASA Technical Reports Server (NTRS)

Petrick, Dave

2014-01-01

This paper will detail the use of SpaceCube in multiple space flight applications including the Hubble Space Telescope Servicing Mission 4 (HST-SM4), an International Space Station (ISS) radiation test bed experiment, and the main avionics subsystem for two separate ISS attached payloads. Each mission has had varying degrees of data processing complexities, performance requirements, and external interfaces. We will show the methodology used to minimize the changes required to the physical hardware, FPGA designs, embedded software interfaces, and testing.This paper will summarize significant results as they apply to each mission application. In the HST-SM4 application we utilized the FPGA resources to accelerate portions of the image processing algorithms more than 25 times faster than a standard space processor in order to meet computational speed requirements. For the ISS radiation on-orbit demonstration, the main goal is to show that we can rely on the commercial FPGAs and processors in a space environment. We describe our FPGA and processor radiation mitigation strategies that have resulted in our eight PowerPCs being available and error free for more than 99.99 of the time over the period of four years. This positive data and proven reliability of the SpaceCube on ISS resulted in the Department of Defense (DoD) selecting SpaceCube, which is replacing an older and slower computer currently used on ISS, as the main avionics for two upcoming ISS experiment campaigns. This paper will show how we quickly reconfigured the SpaceCube system to meet the more stringent reliability requirements
FPGA-based voltage and current dual drive system for high frame rate electrical impedance tomography.

PubMed

Khan, Shadab; Manwaring, Preston; Borsic, Andrea; Halter, Ryan

2015-04-01

Electrical impedance tomography (EIT) is used to image the electrical property distribution of a tissue under test. An EIT system comprises complex hardware and software modules, which are typically designed for a specific application. Upgrading these modules is a time-consuming process, and requires rigorous testing to ensure proper functioning of new modules with the existing ones. To this end, we developed a modular and reconfigurable data acquisition (DAQ) system using National Instruments' (NI) hardware and software modules, which offer inherent compatibility over generations of hardware and software revisions. The system can be configured to use up to 32-channels. This EIT system can be used to interchangeably apply current or voltage signal, and measure the tissue response in a semi-parallel fashion. A novel signal averaging algorithm, and 512-point fast Fourier transform (FFT) computation block was implemented on the FPGA. FFT output bins were classified as signal or noise. Signal bins constitute a tissue's response to a pure or mixed tone signal. Signal bins' data can be used for traditional applications, as well as synchronous frequency-difference imaging. Noise bins were used to compute noise power on the FPGA. Noise power represents a metric of signal quality, and can be used to ensure proper tissue-electrode contact. Allocation of these computationally expensive tasks to the FPGA reduced the required bandwidth between PC, and the FPGA for high frame rate EIT. In 16-channel configuration, with a signal-averaging factor of 8, the DAQ frame rate at 100 kHz exceeded 110 frames s (-1), and signal-to-noise ratio exceeded 90 dB across the spectrum. Reciprocity error was found to be for frequencies up to 1 MHz. Static imaging experiments were performed on a high-conductivity inclusion placed in a saline filled tank; the inclusion was clearly localized in the reconstructions obtained for both absolute current and voltage mode data.
A single-board NMR spectrometer based on a software defined radio architecture

NASA Astrophysics Data System (ADS)

Tang, Weinan; Wang, Weimin

2011-01-01

A single-board software defined radio (SDR) spectrometer for nuclear magnetic resonance (NMR) is presented. The SDR-based architecture, realized by combining a single field programmable gate array (FPGA) and a digital signal processor (DSP) with peripheral radio frequency (RF) front-end circuits, makes the spectrometer compact and reconfigurable. The DSP, working as a pulse programmer, communicates with a personal computer via a USB interface and controls the FPGA through a parallel port. The FPGA accomplishes digital processing tasks such as a numerically controlled oscillator (NCO), digital down converter (DDC) and gradient waveform generator. The NCO, with agile control of phase, frequency and amplitude, is part of a direct digital synthesizer that is used to generate an RF pulse. The DDC performs quadrature demodulation, multistage low-pass filtering and gain adjustment to produce a bandpass signal (receiver bandwidth from 3.9 kHz to 10 MHz). The gradient waveform generator is capable of outputting shaped gradient pulse waveforms and supports eddy-current compensation. The spectrometer directly acquires an NMR signal up to 30 MHz in the case of baseband sampling and is suitable for low-field (<0.7 T) application. Due to the featured SDR architecture, this prototype has flexible add-on ability and is expected to be suitable for portable NMR systems.
Dynamically Reconfigurable Systolic Array Accelerator

NASA Technical Reports Server (NTRS)

Dasu, Aravind; Barnes, Robert

2012-01-01

A polymorphic systolic array framework has been developed that works in conjunction with an embedded microprocessor on a field-programmable gate array (FPGA), which allows for dynamic and complimentary scaling of acceleration levels of two algorithms active concurrently on the FPGA. Use is made of systolic arrays and a hardware-software co-design to obtain an efficient multi-application acceleration system. The flexible and simple framework allows hosting of a broader range of algorithms, and is extendable to more complex applications in the area of aerospace embedded systems. FPGA chips can be responsive to realtime demands for changing applications needs, but only if the electronic fabric can respond fast enough. This systolic array framework allows for rapid partial and dynamic reconfiguration of the chip in response to the real-time needs of scalability, and adaptability of executables.
New design environment for defect detection in web inspection systems

NASA Astrophysics Data System (ADS)

Hajimowlana, S. Hossain; Muscedere, Roberto; Jullien, Graham A.; Roberts, James W.

1997-09-01

One of the aims of industrial machine vision is to develop computer and electronic systems destined to replace human vision in the process of quality control of industrial production. In this paper we discuss the development of a new design environment developed for real-time defect detection using reconfigurable FPGA and DSP processor mounted inside a DALSA programmable CCD camera. The FPGA is directly connected to the video data-stream and outputs data to a low bandwidth output bus. The system is targeted for web inspection but has the potential for broader application areas. We describe and show test results of the prototype system board, mounted inside a DALSA camera and discuss some of the algorithms currently simulated and implemented for web inspection applications.
Low-Cost Space Hardware and Software

NASA Technical Reports Server (NTRS)

Shea, Bradley Franklin

2013-01-01

The goal of this project is to demonstrate and support the overall vision of NASA's Rocket University (RocketU) through the design of an electrical power system (EPS) monitor for implementation on RUBICS (Rocket University Broad Initiatives CubeSat), through the support for the CHREC (Center for High-Performance Reconfigurable Computing) Space Processor, and through FPGA (Field Programmable Gate Array) design. RocketU will continue to provide low-cost innovations even with continuous cuts to the budget.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)

2002-01-01

Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
Hierarchical MFMO Circuit Modules for an Energy-Efficient SDR DBF

NASA Astrophysics Data System (ADS)

Mar, Jeich; Kuo, Chi-Cheng; Wu, Shin-Ru; Lin, You-Rong

The hierarchical multi-function matrix operation (MFMO) circuit modules are designed using coordinate rotations digital computer (CORDIC) algorithm for realizing the intensive computation of matrix operations. The paper emphasizes that the designed hierarchical MFMO circuit modules can be used to develop a power-efficient software-defined radio (SDR) digital beamformer (DBF). The formulas of the processing time for the scalable MFMO circuit modules implemented in field programmable gate array (FPGA) are derived to allocate the proper logic resources for the hardware reconfiguration. The hierarchical MFMO circuit modules are scalable to the changing number of array branches employed for the SDR DBF to achieve the purpose of power saving. The efficient reuse of the common MFMO circuit modules in the SDR DBF can also lead to energy reduction. Finally, the power dissipation and reconfiguration function in the different modes of the SDR DBF are observed from the experiment results.
LinoSPAD: a time-resolved 256×1 CMOS SPAD line sensor system featuring 64 FPGA-based TDC channels running at up to 8.5 giga-events per second

NASA Astrophysics Data System (ADS)

Burri, Samuel; Homulle, Harald; Bruschini, Claudio; Charbon, Edoardo

2016-04-01

LinoSPAD is a reconfigurable camera sensor with a 256×1 CMOS SPAD (single-photon avalanche diode) pixel array connected to a low cost Xilinx Spartan 6 FPGA. The LinoSPAD sensor's line of pixels has a pitch of 24 μm and 40% fill factor. The FPGA implements an array of 64 TDCs and histogram engines capable of processing up to 8.5 giga-photons per second. The LinoSPAD sensor measures 1.68 mm×6.8 mm and each pixel has a direct digital output to connect to the FPGA. The chip is bonded on a carrier PCB to connect to the FPGA motherboard. 64 carry chain based TDCs sampled at 400 MHz can generate a timestamp every 7.5 ns with a mean time resolution below 25 ps per code. The 64 histogram engines provide time-of-arrival histograms covering up to 50 ns. An alternative mode allows the readout of 28 bit timestamps which have a range of up to 4.5 ms. Since the FPGA TDCs have considerable non-linearity we implemented a correction module capable of increasing histogram linearity at real-time. The TDC array is interfaced to a computer using a super-speed USB3 link to transfer over 150k histograms per second for the 12.5 ns reference period used in our characterization. After characterization and subsequent programming of the post-processing we measure an instrument response histogram shorter than 100 ps FWHM using a strong laser pulse with 50 ps FWHM. A timing resolution that when combined with the high fill factor makes the sensor well suited for a wide variety of applications from fluorescence lifetime microscopy over Raman spectroscopy to 3D time-of-flight.
Real-Time Algebraic Derivative Estimations Using a Novel Low-Cost Architecture Based on Reconfigurable Logic

PubMed Central

Morales, Rafael; Rincón, Fernando; Gazzano, Julio Dondo; López, Juan Carlos

2014-01-01

Time derivative estimation of signals plays a very important role in several fields, such as signal processing and control engineering, just to name a few of them. For that purpose, a non-asymptotic algebraic procedure for the approximate estimation of the system states is used in this work. The method is based on results from differential algebra and furnishes some general formulae for the time derivatives of a measurable signal in which two algebraic derivative estimators run simultaneously, but in an overlapping fashion. The algebraic derivative algorithm presented in this paper is computed online and in real-time, offering high robustness properties with regard to corrupting noises, versatility and ease of implementation. Besides, in this work, we introduce a novel architecture to accelerate this algebraic derivative estimator using reconfigurable logic. The core of the algorithm is implemented in an FPGA, improving the speed of the system and achieving real-time performance. Finally, this work proposes a low-cost platform for the integration of hardware in the loop in MATLAB. PMID:24859033
An acceleration framework for synthetic aperture radar algorithms

NASA Astrophysics Data System (ADS)

Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.

2017-04-01

Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR) are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
A Hardware-Accelerated Quantum Monte Carlo framework (HAQMC) for N-body systems

NASA Astrophysics Data System (ADS)

Gothandaraman, Akila; Peterson, Gregory D.; Warren, G. Lee; Hinde, Robert J.; Harrison, Robert J.

2009-12-01

Interest in the study of structural and energetic properties of highly quantum clusters, such as inert gas clusters has motivated the development of a hardware-accelerated framework for Quantum Monte Carlo simulations. In the Quantum Monte Carlo method, the properties of a system of atoms, such as the ground-state energies, are averaged over a number of iterations. Our framework is aimed at accelerating the computations in each iteration of the QMC application by offloading the calculation of properties, namely energy and trial wave function, onto reconfigurable hardware. This gives a user the capability to run simulations for a large number of iterations, thereby reducing the statistical uncertainty in the properties, and for larger clusters. This framework is designed to run on the Cray XD1 high performance reconfigurable computing platform, which exploits the coarse-grained parallelism of the processor along with the fine-grained parallelism of the reconfigurable computing devices available in the form of field-programmable gate arrays. In this paper, we illustrate the functioning of the framework, which can be used to calculate the energies for a model cluster of helium atoms. In addition, we present the capabilities of the framework that allow the user to vary the chemical identities of the simulated atoms. Program summaryProgram title: Hardware Accelerated Quantum Monte Carlo (HAQMC) Catalogue identifier: AEEP_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEEP_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 691 537 No. of bytes in distributed program, including test data, etc.: 5 031 226 Distribution format: tar.gz Programming language: C/C++ for the QMC application, VHDL and Xilinx 8.1 ISE/EDK tools for FPGA design and development Computer: Cray XD1 consisting of a dual-core, dualprocessor AMD Opteron 2.2 GHz with a Xilinx Virtex-4 (V4LX160) or Xilinx Virtex-II Pro (XC2VP50) FPGA per node. We use the compute node with the Xilinx Virtex-4 FPGA Operating system: Red Hat Enterprise Linux OS Has the code been vectorised or parallelized?: Yes Classification: 6.1 Nature of problem: Quantum Monte Carlo is a practical method to solve the Schrödinger equation for large many-body systems and obtain the ground-state properties of such systems. This method involves the sampling of a number of configurations of atoms and averaging the properties of the configurations over a number of iterations. We are interested in applying the QMC method to obtain the energy and other properties of highly quantum clusters, such as inert gas clusters. Solution method: The proposed framework provides a combined hardware-software approach, in which the QMC simulation is performed on the host processor, with the computationally intensive functions such as energy and trial wave function computations mapped onto the field-programmable gate array (FPGA) logic device attached as a co-processor to the host processor. We perform the QMC simulation for a number of iterations as in the case of our original software QMC approach, to reduce the statistical uncertainty of the results. However, our proposed HAQMC framework accelerates each iteration of the simulation, by significantly reducing the time taken to calculate the ground-state properties of the configurations of atoms, thereby accelerating the overall QMC simulation. We provide a generic interpolation framework that can be extended to study a variety of pure and doped atomic clusters, irrespective of the chemical identities of the atoms. For the FPGA implementation of the properties, we use a two-region approach for accurately computing the properties over the entire domain, employ deep pipelines and fixed-point for all our calculations guaranteeing the accuracy required for our simulation.
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing

PubMed Central

Bravo, Ignacio; Baliñas, Javier; Gardel, Alfredo; Lázaro, José L.; Espinosa, Felipe; García, Jorge

2011-01-01

This article describes an image processing system based on an intelligent ad-hoc camera, whose two principle elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families makes it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely. PMID:22163739
A New Partial Reconfiguration-Based Fault-Injection System to Evaluate SEU Effects in SRAM-Based FPGAs

NASA Astrophysics Data System (ADS)

Sterpone, L.; Violante, M.

2007-08-01

Modern SRAM-based field programmable gate array (FPGA) devices offer high capability in implementing complex system. Unfortunately, SRAM-based FPGAs are extremely sensitive to single event upsets (SEUs) induced by radiation particles. In order to successfully deploy safety- or mission-critical applications, designer need to validate the correctness of the obtained designs. In this paper we describe a system based on partial-reconfiguration for running fault-injection experiments within the configuration memory of SRAM-based FPGAs. The proposed fault-injection system uses the internal configuration capabilities that modern FPGAs offer in order to inject SEU within the configuration memory. Detailed experimental results show that the technique is orders of magnitude faster than previously proposed ones.
Using partial reconfiguration for SoC design and implementation

NASA Astrophysics Data System (ADS)

Krasteva, Yana E.; Portilla, Jorge; Tobajas Guerrero, Félix; de la Torre, Eduardo

2009-05-01

Most reconfigurable systems rely on FPGA technology. Among these ones, those which permit dynamic and partial reconfiguration, offer added benefits in flexibility, in-field device upgrade, improved design and manufacturing time, and even, in some cases, power consumption reductions. However, dynamic reconfiguration is a complex task, and the real benefits of its use in real applications have been often questioned. This paper presents an overview of the partial reconfiguration technique application, along with four original applications. The main goal of these applications is to test several architectures with different flexibility and, to search for the partial reconfiguration "killing application", that is, the application that better demonstrates the benefits of today reconfigurable systems based on commercial FPGAs. Therefore, the presented applications are rather a proof of concept, than fully operative and closed systems. First, a brief introduction to the partial reconfigurable systems application topic has been included. After that, the descriptions of the created reconfigurable systems are presented: first, an on-chip communications emulation framework, second, an on chip debugging system, third, a wireless sensor network reconfigurable node and finally, a remote reconfigurable client-server device. Each application is described in a separate section of the paper along with some test and results. General conclusions are included at the end of the paper.

FPGA-based real time controller for high order correction in EDIFISE

NASA Astrophysics Data System (ADS)

Rodríguez-Ramos, L. F.; Chulani, H.; Martín, Y.; Dorta, T.; Alonso, A.; Fuensalida, J. J.

2012-07-01

EDIFISE is a technology demonstrator instrument developed at the Institute of Astrophysics of the Canary Islands (IAC), intended to explore the feasibility of combining Adaptive Optics with attenuated optical fibers in order to obtain high spatial resolution spectra at the surroundings of a star, as an alternative to coronagraphy. A simplified version with only tip tilt correction has been tested at the OGS telescope in Observatorio del Teide (Canary islands, Spain) and a complete version is intended to be tested at the OGS and at the WHT telescope in Observatorio del Roque de los Muchachos, (Canary Islands, Spain). This paper describes the FPGA-based real time control of the High Order unit, responsible of the computation of the actuation values of a 97-actuactor deformable mirror (11x11) with the information provided by a configurable wavefront sensor of up to 16x16 subpupils at 500 Hz (128x128 pixels). The reconfigurable logic hardware will allow both zonal and modal control approaches, will full access to select which mode loops should be closed and with a number of utilities for influence matrix and open loop response measurements. The system has been designed in a modular way to allow for easy upgrade to faster frame rates (1500 Hz) and bigger wavefront sensors (240x240 pixels), accepting also several interfaces from the WFS and towards the mirror driver. The FPGA-based (Field Programmable Gate Array) real time controller provides bias and flat-fielding corrections, subpupil slopes to modal matrix computation for up to 97 modes, independent servo loop controllers for each mode with user control for independent loop opening or closing, mode to actuator matrix computation and non-common path aberration correction capability. It also provides full housekeeping control via UPD/IP for matrix reloading and full system data logging.
Computing Models for FPGA-Based Accelerators

PubMed Central

Herbordt, Martin C.; Gu, Yongfeng; VanCourt, Tom; Model, Josh; Sukhwani, Bharat; Chiu, Matt

2011-01-01

Field-programmable gate arrays are widely considered as accelerators for compute-intensive applications. A critical phase of FPGA application development is finding and mapping to the appropriate computing model. FPGA computing enables models with highly flexible fine-grained parallelism and associative operations such as broadcast and collective response. Several case studies demonstrate the effectiveness of using these computing models in developing FPGA applications for molecular modeling. PMID:21603152
Development of IR imaging system simulator

NASA Astrophysics Data System (ADS)

Xiang, Xinglang; He, Guojing; Dong, Weike; Dong, Lu

2017-02-01

To overcome the disadvantages of the tradition semi-physical simulation and injection simulation equipment in the performance evaluation of the infrared imaging system (IRIS), a low-cost and reconfigurable IRIS simulator, which can simulate the realistic physical process of infrared imaging, is proposed to test and evaluate the performance of the IRIS. According to the theoretical simulation framework and the theoretical models of the IRIS, the architecture of the IRIS simulator is constructed. The 3D scenes are generated and the infrared atmospheric transmission effects are simulated using OGRE technology in real-time on the computer. The physical effects of the IRIS are classified as the signal response characteristic, modulation transfer characteristic and noise characteristic, and they are simulated on the single-board signal processing platform based on the core processor FPGA in real-time using high-speed parallel computation method.
Statechart-based design controllers for FPGA partial reconfiguration

NASA Astrophysics Data System (ADS)

Łabiak, Grzegorz; Wegrzyn, Marek; Rosado Muñoz, Alfredo

2015-09-01

Statechart diagram and UML technique can be a vital part of early conceptual modeling. At the present time there is no much support in hardware design methodologies for reconfiguration features of reprogrammable devices. Authors try to bridge the gap between imprecise UML model and formal HDL description. The key concept in author's proposal is to describe the behavior of the digital controller by statechart diagrams and to map some parts of the behavior into reprogrammable logic by means of group of states which forms sequential automaton. The whole process is illustrated by the example with experimental results.
An embedded processor for real-time atmoshperic compensation

NASA Astrophysics Data System (ADS)

Bodnar, Michael R.; Curt, Petersen F.; Ortiz, Fernando E.; Carrano, Carmen J.; Kelmelis, Eric J.

2009-05-01

Imaging over long distances is crucial to a number of defense and security applications, such as homeland security and launch tracking. However, the image quality obtained from current long-range optical systems can be severely degraded by the turbulent atmosphere in the path between the region under observation and the imager. While this obscured image information can be recovered using post-processing techniques, the computational complexity of such approaches has prohibited deployment in real-time scenarios. To overcome this limitation, we have coupled a state-of-the-art atmospheric compensation algorithm, the average-bispectrum speckle method, with a powerful FPGA-based embedded processing board. The end result is a light-weight, lower-power image processing system that improves the quality of long-range imagery in real-time, and uses modular video I/O to provide a flexible interface to most common digital and analog video transport methods. By leveraging the custom, reconfigurable nature of the FPGA, a 20x speed increase over a modern desktop PC was achieved in a form-factor that is compact, low-power, and field-deployable.
NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors.

PubMed

Cheung, Kit; Schultz, Simon R; Luk, Wayne

2015-01-01

NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
NeuroFlow: A General Purpose Spiking Neural Network Simulation Platform using Customizable Processors

PubMed Central

Cheung, Kit; Schultz, Simon R.; Luk, Wayne

2016-01-01

NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation. PMID:26834542
FPGA implementation of sparse matrix algorithm for information retrieval

NASA Astrophysics Data System (ADS)

Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

2005-06-01

Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is showed to be more efficient in terms of storage space requirement and query-processing timing over the other sparse matrix algorithms for information retrieval application. Although inverted index algorithm is treated as the de facto standard for information retrieval for years, an alternative approach to store the index of text collection in a sparse matrix structure gains more attention. This approach performs query processing using sparse matrix-vector multiplication and due to parallelization achieves a substantial efficiency over the sequential inverted index. The parallel implementations of information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Arrays (FPGAs) board from Xilinx. A recent development in scientific applications is the use of FPGA to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within processing unit.
FPGA-based rate-adaptive LDPC-coded modulation for the next generation of optical communication systems.

PubMed

Zou, Ding; Djordjevic, Ivan B

2016-09-05

In this paper, we propose a rate-adaptive FEC scheme based on LDPC codes together with its software reconfigurable unified FPGA architecture. By FPGA emulation, we demonstrate that the proposed class of rate-adaptive LDPC codes based on shortening with an overhead from 25% to 42.9% provides a coding gain ranging from 13.08 dB to 14.28 dB at a post-FEC BER of 10^-15 for BPSK transmission. In addition, the proposed rate-adaptive LDPC coding combined with higher-order modulations have been demonstrated including QPSK, 8-QAM, 16-QAM, 32-QAM, and 64-QAM, which covers a wide range of signal-to-noise ratios. Furthermore, we apply the unequal error protection by employing different LDPC codes on different bits in 16-QAM and 64-QAM, which results in additional 0.5dB gain compared to conventional LDPC coded modulation with the same code rate of corresponding LDPC code.
FPGA-based distributed computing microarchitecture for complex physical dynamics investigation.

PubMed

Borgese, Gianluca; Pace, Calogero; Pantano, Pietro; Bilotta, Eleonora

2013-09-01

In this paper, we present a distributed computing system, called DCMARK, aimed at solving partial differential equations at the basis of many investigation fields, such as solid state physics, nuclear physics, and plasma physics. This distributed architecture is based on the cellular neural network paradigm, which allows us to divide the differential equation system solving into many parallel integration operations to be executed by a custom multiprocessor system. We push the number of processors to the limit of one processor for each equation. In order to test the present idea, we choose to implement DCMARK on a single FPGA, designing the single processor in order to minimize its hardware requirements and to obtain a large number of easily interconnected processors. This approach is particularly suited to study the properties of 1-, 2- and 3-D locally interconnected dynamical systems. In order to test the computing platform, we implement a 200 cells, Korteweg-de Vries (KdV) equation solver and perform a comparison between simulations conducted on a high performance PC and on our system. Since our distributed architecture takes a constant computing time to solve the equation system, independently of the number of dynamical elements (cells) of the CNN array, it allows us to reduce the elaboration time more than other similar systems in the literature. To ensure a high level of reconfigurability, we design a compact system on programmable chip managed by a softcore processor, which controls the fast data/control communication between our system and a PC Host. An intuitively graphical user interface allows us to change the calculation parameters and plot the results.
Applying a Genetic Algorithm to Reconfigurable Hardware

NASA Technical Reports Server (NTRS)

Wells, B. Earl; Weir, John; Trevino, Luis; Patrick, Clint; Steincamp, Jim

2004-01-01

This paper investigates the feasibility of applying genetic algorithms to solve optimization problems that are implemented entirely in reconfgurable hardware. The paper highlights the pe$ormance/design space trade-offs that must be understood to effectively implement a standard genetic algorithm within a modem Field Programmable Gate Array, FPGA, reconfgurable hardware environment and presents a case-study where this stochastic search technique is applied to standard test-case problems taken from the technical literature. In this research, the targeted FPGA-based platform and high-level design environment was the Starbridge Hypercomputing platform, which incorporates multiple Xilinx Virtex II FPGAs, and the Viva TM graphical hardware description language.
Finite Element Methods for real-time Haptic Feedback of Soft-Tissue Models in Virtual Reality Simulators

NASA Technical Reports Server (NTRS)

Frank, Andreas O.; Twombly, I. Alexander; Barth, Timothy J.; Smith, Jeffrey D.; Dalton, Bonnie P. (Technical Monitor)

2001-01-01

We have applied the linear elastic finite element method to compute haptic force feedback and domain deformations of soft tissue models for use in virtual reality simulators. Our results show that, for virtual object models of high-resolution 3D data (>10,000 nodes), haptic real time computations (>500 Hz) are not currently possible using traditional methods. Current research efforts are focused in the following areas: 1) efficient implementation of fully adaptive multi-resolution methods and 2) multi-resolution methods with specialized basis functions to capture the singularity at the haptic interface (point loading). To achieve real time computations, we propose parallel processing of a Jacobi preconditioned conjugate gradient method applied to a reduced system of equations resulting from surface domain decomposition. This can effectively be achieved using reconfigurable computing systems such as field programmable gate arrays (FPGA), thereby providing a flexible solution that allows for new FPGA implementations as improved algorithms become available. The resulting soft tissue simulation system would meet NASA Virtual Glovebox requirements and, at the same time, provide a generalized simulation engine for any immersive environment application, such as biomedical/surgical procedures or interactive scientific applications.
A reconfigurable cryogenic platform for the classical control of quantum processors

NASA Astrophysics Data System (ADS)

Homulle, Harald; Visser, Stefan; Patra, Bishnu; Ferrari, Giorgio; Prati, Enrico; Sebastiano, Fabio; Charbon, Edoardo

2017-04-01

The implementation of a classical control infrastructure for large-scale quantum computers is challenging due to the need for integration and processing time, which is constrained by coherence time. We propose a cryogenic reconfigurable platform as the heart of the control infrastructure implementing the digital error-correction control loop. The platform is implemented on a field-programmable gate array (FPGA) that supports the functionality required by several qubit technologies and that can operate close to the physical qubits over a temperature range from 4 K to 300 K. This work focuses on the extensive characterization of the electronic platform over this temperature range. All major FPGA building blocks (such as look-up tables (LUTs), carry chains (CARRY4), mixed-mode clock manager (MMCM), phase-locked loop (PLL), block random access memory, and IDELAY2 (programmable delay element)) operate correctly and the logic speed is very stable. The logic speed of LUTs and CARRY4 changes less then 5%, whereas the jitter of MMCM and PLL clock managers is reduced by 20%. The stability is finally demonstrated by operating an integrated 1.2 GSa/s analog-to-digital converter (ADC) with a relatively stable performance over temperature. The ADCs effective number of bits drops from 6 to 4.5 bits when operating at 15 K.
A reconfigurable cryogenic platform for the classical control of quantum processors.

PubMed

Homulle, Harald; Visser, Stefan; Patra, Bishnu; Ferrari, Giorgio; Prati, Enrico; Sebastiano, Fabio; Charbon, Edoardo

2017-04-01

The implementation of a classical control infrastructure for large-scale quantum computers is challenging due to the need for integration and processing time, which is constrained by coherence time. We propose a cryogenic reconfigurable platform as the heart of the control infrastructure implementing the digital error-correction control loop. The platform is implemented on a field-programmable gate array (FPGA) that supports the functionality required by several qubit technologies and that can operate close to the physical qubits over a temperature range from 4 K to 300 K. This work focuses on the extensive characterization of the electronic platform over this temperature range. All major FPGA building blocks (such as look-up tables (LUTs), carry chains (CARRY4), mixed-mode clock manager (MMCM), phase-locked loop (PLL), block random access memory, and IDELAY2 (programmable delay element)) operate correctly and the logic speed is very stable. The logic speed of LUTs and CARRY4 changes less then 5%, whereas the jitter of MMCM and PLL clock managers is reduced by 20%. The stability is finally demonstrated by operating an integrated 1.2 GSa/s analog-to-digital converter (ADC) with a relatively stable performance over temperature. The ADCs effective number of bits drops from 6 to 4.5 bits when operating at 15 K.
The dynamical analysis of modified two-compartment neuron model and FPGA implementation

NASA Astrophysics Data System (ADS)

Lin, Qianjin; Wang, Jiang; Yang, Shuangming; Yi, Guosheng; Deng, Bin; Wei, Xile; Yu, Haitao

2017-10-01

The complexity of neural models is increasing with the investigation of larger biological neural network, more various ionic channels and more detailed morphologies, and the implementation of biological neural network is a task with huge computational complexity and power consumption. This paper presents an efficient digital design using piecewise linearization on field programmable gate array (FPGA), to succinctly implement the reduced two-compartment model which retains essential features of more complicated models. The design proposes an approximate neuron model which is composed of a set of piecewise linear equations, and it can reproduce different dynamical behaviors to depict the mechanisms of a single neuron model. The consistency of hardware implementation is verified in terms of dynamical behaviors and bifurcation analysis, and the simulation results including varied ion channel characteristics coincide with the biological neuron model with a high accuracy. Hardware synthesis on FPGA demonstrates that the proposed model has reliable performance and lower hardware resource compared with the original two-compartment model. These investigations are conducive to scalability of biological neural network in reconfigurable large-scale neuromorphic system.
FPGA-based RF spectrum merging and adaptive hopset selection

NASA Astrophysics Data System (ADS)

McLean, R. K.; Flatley, B. N.; Silvius, M. D.; Hopkinson, K. M.

The radio frequency (RF) spectrum is a limited resource. Spectrum allotment disputes stem from this scarcity as many radio devices are confined to a fixed frequency or frequency sequence. One alternative is to incorporate cognition within a reconfigurable radio platform, therefore enabling the radio to adapt to dynamic RF spectrum environments. In this way, the radio is able to actively sense the RF spectrum, decide, and act accordingly, thereby sharing the spectrum and operating in more flexible manner. In this paper, we present a novel solution for merging many distributed RF spectrum maps into one map and for subsequently creating an adaptive hopset. We also provide an example of our system in operation, the result of which is a pseudorandom adaptive hopset. The paper then presents a novel hardware design for the frequency merger and adaptive hopset selector, both of which are written in VHDL and implemented as a custom IP core on an FPGA-based embedded system using the Xilinx Embedded Development Kit (EDK) software tool. The design of the custom IP core is optimized for area, and it can process a high-volume digital input via a low-latency circuit architecture. The complete embedded system includes the Xilinx PowerPC microprocessor, UART serial connection, and compact flash memory card IP cores, and our custom map merging/hopset selection IP core, all of which are targeted to the Virtex IV FPGA. This system is then incorporated into a cognitive radio prototype on a Rice University Wireless Open Access Research Platform (WARP) reconfigurable radio.
Advanced Avionics and Processor Systems for a Flexible Space Exploration Architecture

NASA Technical Reports Server (NTRS)

Keys, Andrew S.; Adams, James H.; Smith, Leigh M.; Johnson, Michael A.; Cressler, John D.

2010-01-01

The Advanced Avionics and Processor Systems (AAPS) project, formerly known as the Radiation Hardened Electronics for Space Environments (RHESE) project, endeavors to develop advanced avionic and processor technologies anticipated to be used by NASA s currently evolving space exploration architectures. The AAPS project is a part of the Exploration Technology Development Program, which funds an entire suite of technologies that are aimed at enabling NASA s ability to explore beyond low earth orbit. NASA s Marshall Space Flight Center (MSFC) manages the AAPS project. AAPS uses a broad-scoped approach to developing avionic and processor systems. Investment areas include advanced electronic designs and technologies capable of providing environmental hardness, reconfigurable computing techniques, software tools for radiation effects assessment, and radiation environment modeling tools. Near-term emphasis within the multiple AAPS tasks focuses on developing prototype components using semiconductor processes and materials (such as Silicon-Germanium (SiGe)) to enhance a device s tolerance to radiation events and low temperature environments. As the SiGe technology will culminate in a delivered prototype this fiscal year, the project emphasis shifts its focus to developing low-power, high efficiency total processor hardening techniques. In addition to processor development, the project endeavors to demonstrate techniques applicable to reconfigurable computing and partially reconfigurable Field Programmable Gate Arrays (FPGAs). This capability enables avionic architectures the ability to develop FPGA-based, radiation tolerant processor boards that can serve in multiple physical locations throughout the spacecraft and perform multiple functions during the course of the mission. The individual tasks that comprise AAPS are diverse, yet united in the common endeavor to develop electronics capable of operating within the harsh environment of space. Specifically, the AAPS tasks for the Federal fiscal year of 2010 are: Silicon-Germanium (SiGe) Integrated Electronics for Extreme Environments, Modeling of Radiation Effects on Electronics, Radiation Hardened High Performance Processors (HPP), and and Reconfigurable Computing.
Spaceborne Hybrid-FPGA System for Processing FTIR Data

NASA Technical Reports Server (NTRS)

Bekker, Dmitriy; Blavier, Jean-Francois L.; Pingree, Paula J.; Lukowiak, Marcin; Shaaban, Muhammad

2008-01-01

Progress has been made in a continuing effort to develop a spaceborne computer system for processing readout data from a Fourier-transform infrared (FTIR) spectrometer to reduce the volume of data transmitted to Earth. The approach followed in this effort, oriented toward reducing design time and reducing the size and weight of the spectrometer electronics, has been to exploit the versatility of recently developed hybrid field-programmable gate arrays (FPGAs) to run diverse software on embedded processors while also taking advantage of the reconfigurable hardware resources of the FPGAs.
High-Speed On-Board Data Processing for Science Instruments

NASA Technical Reports Server (NTRS)

Beyon, Jeffrey Y.; Ng, Tak-Kwong; Lin, Bing; Hu, Yongxiang; Harrison, Wallace

2014-01-01

A new development of on-board data processing platform has been in progress at NASA Langley Research Center since April, 2012, and the overall review of such work is presented in this paper. The project is called High-Speed On-Board Data Processing for Science Instruments (HOPS) and focuses on a high-speed scalable data processing platform for three particular National Research Council's Decadal Survey missions such as Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS), Aerosol-Cloud-Ecosystems (ACE), and Doppler Aerosol Wind Lidar (DAWN) 3-D Winds. HOPS utilizes advanced general purpose computing with Field Programmable Gate Array (FPGA) based algorithm implementation techniques. The significance of HOPS is to enable high speed on-board data processing for current and future science missions with its reconfigurable and scalable data processing platform. A single HOPS processing board is expected to provide approximately 66 times faster data processing speed for ASCENDS, more than 70% reduction in both power and weight, and about two orders of cost reduction compared to the state-of-the-art (SOA) on-board data processing system. Such benchmark predictions are based on the data when HOPS was originally proposed in August, 2011. The details of these improvement measures are also presented. The two facets of HOPS development are identifying the most computationally intensive algorithm segments of each mission and implementing them in a FPGA-based data processing board. A general introduction of such facets is also the purpose of this paper.
FPGA Coprocessor for Accelerated Classification of Images

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.

2008-01-01

An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of recently developed hybrid Virtex-4FX field-programmable gate array (FPGA) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for increased onboard processing capability than previously demonstrated in software. The original C-language program demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, or cloud or unclassified. Current onboard processors, such as on EO-1, have limited computing power, extremely limited active storage capability and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program, and two newly formulated programs for a more capable expanded-linear-kernel and a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.

Architectural evaluation of dynamic and partial reconfigurable systems designed with DREAMS tool

NASA Astrophysics Data System (ADS)

Otero, Andrés.; Gallego, Ángel; de la Torre, Eduardo; Riesgo, Teresa

2013-05-01

Benefits of dynamic and partial reconfigurable systems are increasingly being more accepted by the industry. For this reason, SRAM-based FPGA manufacturers have improved, or even included for the first time, the support they offer for the design of this kind of systems. However, commercial tools still offer a poor flexibility, which leads to a limited efficiency. This is witnessed by the overhead introduced by the communication primitives, as well as by the inability to relocate reconfigurable modules, among others. For this reason, authors have proposed an academic design tool called DREAMS, which targets the design of dynamically reconfigurable systems. In this paper, main features offered by DREAMS are described, comparing them with existing commercial and academic tools. Moreover, a graphic user interface (GUI) is originally described in this work, with the aim of simplifying the design process, as well as to hide the low level device dependent details to the system designer. The overall goal is to increase the designer productivity. Using the graphic interface, different reconfigurable architectures are provided as design examples. Among them, both conventional slot-based architectures and mesh type designs have been included.
Computer vision camera with embedded FPGA processing

NASA Astrophysics Data System (ADS)

Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

2000-03-01

Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

PubMed Central

Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M

2008-01-01

Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation. PMID:18412963
Architectural design for a low cost FPGA-based traffic signal detection system in vehicles

NASA Astrophysics Data System (ADS)

López, Ignacio; Salvador, Rubén; Alarcón, Jaime; Moreno, Félix

2007-05-01

In this paper we propose an architecture for an embedded traffic signal detection system. Development of Advanced Driver Assistance Systems (ADAS) is one of the major trends of research in automotion nowadays. Examples of past and ongoing projects in the field are CHAMELEON ("Pre-Crash Application all around the vehicle" IST 1999-10108), PREVENT (Preventive and Active Safety Applications, FP6-507075, http://www.prevent-ip.org/) and AVRT in the US (Advanced Vision-Radar Threat Detection (AVRT): A Pre-Crash Detection and Active Safety System). It can be observed a major interest in systems for real-time analysis of complex driving scenarios, evaluating risk and anticipating collisions. The system will use a low cost CCD camera on the dashboard facing the road. The images will be processed by an Altera Cyclone family FPGA. The board does median and Sobel filtering of the incoming frames at PAL rate, and analyzes them for several categories of signals. The result is conveyed to the driver. The scarce resources provided by the hardware require an architecture developed for optimal use. The system will use a combination of neural networks and an adapted blackboard architecture. Several neural networks will be used in sequence for image analysis, by reconfiguring a single, generic hardware neural network in the FPGA. This generic network is optimized for speed, in order to admit several executions within the frame rate. The sequence will follow the execution cycle of the blackboard architecture. The global, blackboard architecture being developed and the hardware architecture for the generic, reconfigurable FPGA perceptron will be explained in this paper. The project is still at an early stage. However, some hardware implementation results are already available and will be offered in the paper.
Energy efficiency analysis and implementation of AES on an FPGA

NASA Astrophysics Data System (ADS)

Kenney, David

The Advanced Encryption Standard (AES) was developed by Joan Daemen and Vincent Rjimen and endorsed by the National Institute of Standards and Technology in 2001. It was designed to replace the aging Data Encryption Standard (DES) and be useful for a wide range of applications with varying throughput, area, power dissipation and energy consumption requirements. Field Programmable Gate Arrays (FPGAs) are flexible and reconfigurable integrated circuits that are useful for many different applications including the implementation of AES. Though they are highly flexible, FPGAs are often less efficient than Application Specific Integrated Circuits (ASICs); they tend to operate slower, take up more space and dissipate more power. There have been many FPGA AES implementations that focus on obtaining high throughput or low area usage, but very little research done in the area of low power or energy efficient FPGA based AES; in fact, it is rare for estimates on power dissipation to be made at all. This thesis presents a methodology to evaluate the energy efficiency of FPGA based AES designs and proposes a novel FPGA AES implementation which is highly flexible and energy efficient. The proposed methodology is implemented as part of a novel scripting tool, the AES Energy Analyzer, which is able to fully characterize the power dissipation and energy efficiency of FPGA based AES designs. Additionally, this thesis introduces a new FPGA power reduction technique called Opportunistic Combinational Operand Gating (OCOG) which is used in the proposed energy efficient implementation. The AES Energy Analyzer was able to estimate the power dissipation and energy efficiency of the proposed AES design during its most commonly performed operations. It was found that the proposed implementation consumes less energy per operation than any previous FPGA based AES implementations that included power estimations. Finally, the use of Opportunistic Combinational Operand Gating on an AES cipher was found to reduce its dynamic power consumption by up to 17% when compared to an identical design that did not employ the technique.
Radiation Tolerant Intelligent Memory Stack (RTIMS)

NASA Technical Reports Server (NTRS)

Ng, Tak-kwong; Herath, Jeffrey A.

2006-01-01

The Radiation Tolerant Intelligent Memory Stack (RTIMS), suitable for both geostationary and low earth orbit missions, has been developed. The memory module is fully functional and undergoing environmental and radiation characterization. A self-contained flight-like module is expected to be completed in 2006. RTIMS provides reconfigurable circuitry and 2 gigabits of error corrected or 1 gigabit of triple redundant digital memory in a small package. RTIMS utilizes circuit stacking of heterogeneous components and radiation shielding technologies. A reprogrammable field programmable gate array (FPGA), six synchronous dynamic random access memories, linear regulator, and the radiation mitigation circuitries are stacked into a module of 42.7mm x 42.7mm x 13.00mm. Triple module redundancy, current limiting, configuration scrubbing, and single event function interrupt detection are employed to mitigate radiation effects. The mitigation techniques significantly simplify system design. RTIMS is well suited for deployment in real-time data processing, reconfigurable computing, and memory intensive applications.
Real-time embedded atmospheric compensation for long-range imaging using the average bispectrum speckle method

NASA Astrophysics Data System (ADS)

Curt, Petersen F.; Bodnar, Michael R.; Ortiz, Fernando E.; Carrano, Carmen J.; Kelmelis, Eric J.

2009-02-01

While imaging over long distances is critical to a number of security and defense applications, such as homeland security and launch tracking, current optical systems are limited in resolving power. This is largely a result of the turbulent atmosphere in the path between the region under observation and the imaging system, which can severely degrade captured imagery. There are a variety of post-processing techniques capable of recovering this obscured image information; however, the computational complexity of such approaches has prohibited real-time deployment and hampers the usability of these technologies in many scenarios. To overcome this limitation, we have designed and manufactured an embedded image processing system based on commodity hardware which can compensate for these atmospheric disturbances in real-time. Our system consists of a reformulation of the average bispectrum speckle method coupled with a high-end FPGA processing board, and employs modular I/O capable of interfacing with most common digital and analog video transport methods (composite, component, VGA, DVI, SDI, HD-SDI, etc.). By leveraging the custom, reconfigurable nature of the FPGA, we have achieved performance twenty times faster than a modern desktop PC, in a form-factor that is compact, low-power, and field-deployable.
A FPGA-based architecture for real-time image matching

NASA Astrophysics Data System (ADS)

Wang, Jianhui; Zhong, Sheng; Xu, Wenhui; Zhang, Weijun; Cao, Zhiguo

2013-10-01

Image matching is a fundamental task in computer vision. It is used to establish correspondence between two images taken at different viewpoint or different time from the same scene. However, its large computational complexity has been a challenge to most embedded systems. This paper proposes a single FPGA-based image matching system, which consists of SIFT feature detection, BRIEF descriptor extraction and BRIEF matching. It optimizes the FPGA architecture for the SIFT feature detection to reduce the FPGA resources utilization. Moreover, we implement BRIEF description and matching on FPGA also. The proposed system can implement image matching at 30fps (frame per second) for 1280x720 images. Its processing speed can meet the demand of most real-life computer vision applications.
Neuron array with plastic synapses and programmable dendrites.

PubMed

Ramakrishnan, Shubha; Wunderlich, Richard; Hasler, Jennifer; George, Suma

2013-10-01

We describe a novel neuromorphic chip architecture that models neurons for efficient computation. Traditional architectures of neuron array chips consist of large scale systems that are interfaced with AER for implementing intra- or inter-chip connectivity. We present a chip that uses AER for inter-chip communication but uses fast, reconfigurable FPGA-style routing with local memory for intra-chip connectivity. We model neurons with biologically realistic channel models, synapses and dendrites. This chip is suitable for small-scale network simulations and can also be used for sequence detection, utilizing directional selectivity properties of dendrites, ultimately for use in word recognition.
Evaluation of the FIR Example using Xilinx Vivado High-Level Synthesis Compiler

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Finkel, Hal; Yoshii, Kazutomo

Compared to central processing units (CPUs) and graphics processing units (GPUs), field programmable gate arrays (FPGAs) have major advantages in reconfigurability and performance achieved per watt. This development flow has been augmented with high-level synthesis (HLS) flow that can convert programs written in a high-level programming language to Hardware Description Language (HDL). Using high-level programming languages such as C, C++, and OpenCL for FPGA-based development could allow software developers, who have little FPGA knowledge, to take advantage of the FPGA-based application acceleration. This improves developer productivity and makes the FPGA-based acceleration accessible to hardware and software developers. Xilinx Vivado HLSmore » compiler is a high-level synthesis tool that enables C, C++ and System C specification to be directly targeted into Xilinx FPGAs without the need to create RTL manually. The white paper [1] published recently by Xilinx uses a finite impulse response (FIR) example to demonstrate the variable-precision features in the Vivado HLS compiler and the resource and power benefits of converting floating point to fixed point for a design. To get a better understanding of variable-precision features in terms of resource usage and performance, this report presents the experimental results of evaluating the FIR example using Vivado HLS 2017.1 and a Kintex Ultrascale FPGA. In addition, we evaluated the half-precision floating-point data type against the double-precision and single-precision data type and present the detailed results.« less
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Ormsby, John (Technical Monitor)

2002-01-01

Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing (DSP) functions. Such capability also makes and FPGA a suitable platform for the digital implementation of closed loop controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance in a compact form-factor. Other researchers have presented the notion that a second order digital filter with proportional-integral-derivative (PID) control functionality can be implemented in an FPGA. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSF) devices. Our goal is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. Meeting our goals requires alternative compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching these goals.
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor

NASA Astrophysics Data System (ADS)

Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram

2007-09-01

Implementing the sum-product algorithm, in an FPGA with an embedded processor, invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the the full capacity and performance of an FPGA-based coprocessor.
Reconfigurable signal processor designs for advanced digital array radar systems

NASA Astrophysics Data System (ADS)

Suarez, Hernan; Zhang, Yan (Rockee); Yu, Xining

2017-05-01

The new challenges originated from Digital Array Radar (DAR) demands a new generation of reconfigurable backend processor in the system. The new FPGA devices can support much higher speed, more bandwidth and processing capabilities for the need of digital Line Replaceable Unit (LRU). This study focuses on using the latest Altera and Xilinx devices in an adaptive beamforming processor. The field reprogrammable RF devices from Analog Devices are used as analog front end transceivers. Different from other existing Software-Defined Radio transceivers on the market, this processor is designed for distributed adaptive beamforming in a networked environment. The following aspects of the novel radar processor will be presented: (1) A new system-on-chip architecture based on Altera's devices and adaptive processing module, especially for the adaptive beamforming and pulse compression, will be introduced, (2) Successful implementation of generation 2 serial RapidIO data links on FPGA, which supports VITA-49 radio packet format for large distributed DAR processing. (3) Demonstration of the feasibility and capabilities of the processor in a Micro-TCA based, SRIO switching backplane to support multichannel beamforming in real-time. (4) Application of this processor in ongoing radar system development projects, including OU's dual-polarized digital array radar, the planned new cylindrical array radars, and future airborne radars.
Radiation effects in reconfigurable FPGAs

NASA Astrophysics Data System (ADS)

Quinn, Heather

2017-04-01

Field-programmable gate arrays (FPGAs) are co-processing hardware used in image and signal processing. FPGA are programmed with custom implementations of an algorithm. These algorithms are highly parallel hardware designs that are faster than software implementations. This flexibility and speed has made FPGAs attractive for many space programs that need in situ, high-speed signal processing for data categorization and data compression. Most commercial FPGAs are affected by the space radiation environment, though. Problems with TID has restricted the use of flash-based FPGAs. Static random access memory based FPGAs must be mitigated to suppress errors from single-event upsets. This paper provides a review of radiation effects issues in reconfigurable FPGAs and discusses methods for mitigating these problems. With careful design it is possible to use these components effectively and resiliently.
Reconfigurable Gabor Filter For Fingerprint Recognition Using FPGA Verilog

NASA Astrophysics Data System (ADS)

Rosshidi, H. T.; Hadi, A. R.

2009-06-01

This paper present the implementations of Gabor filter for fingerprint recognition using Verilog HDL. This work demonstrates the application of Gabor Filter technique to enhance the fingerprint image. The incoming signal in form of image pixel will be filter out or convolute by the Gabor filter to define the ridge and valley regions of fingerprint. This is done with the application of a real time convolve based on Field Programmable Gate Array (FPGA) to perform the convolution operation. The main characteristic of the proposed approach are the usage of memory to store the incoming image pixel and the coefficient of the Gabor filter before the convolution matrix take place. The result was the signal convoluted with the Gabor coefficient.
Dynamically Reconfigurable Systolic Array Accelorators

NASA Technical Reports Server (NTRS)

Dasu, Aravind (Inventor); Barnes, Robert C. (Inventor)

2014-01-01

A polymorphic systolic array framework that works in conjunction with an embedded microprocessor on an FPGA, that allows for dynamic and complimentary scaling of acceleration levels of two algorithms active concurrently on the FPGA. Use is made of systolic arrays and hardware-software co-design to obtain an efficient multi-application acceleration system. The flexible and simple framework allows hosting of a broader range of algorithms and extendable to more complex applications in the area of aerospace embedded systems.
Field Programmable Gate Array Failure Rate Estimation Guidelines for Launch Vehicle Fault Tree Models

NASA Technical Reports Server (NTRS)

Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.

2017-01-01

Today's launch vehicles complex electronic and avionics systems heavily utilize Field Programmable Gate Array (FPGA) integrated circuits (IC) for their superb speed and reconfiguration capabilities. Consequently, FPGAs are prevalent ICs in communication protocols such as MILSTD- 1553B and in control signal commands such as in solenoid valve actuations. This paper will identify reliability concerns and high level guidelines to estimate FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Monenegro, Justino (Technical Monitor)

2002-01-01

Much has been made of the capabilities of FPGA's (Field Programmable Gate Arrays) in the hardware implementation of fast digital signal processing. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used proportional-integral-derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM-based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a DSP (Digital Signal Processor) or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using digital signal processor (DSP) devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, pulse width modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive control algorithm approaches. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacecraft. Radiation tolerant FPGA's are a feasible option for reaching this goal.
Implementation of Adaptive Digital Controllers on Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Gwaltney, David A.; King, Kenneth D.; Smith, Keary J.; Montenegro, Justino (Technical Monitor)

2002-01-01

Much has been made of the capabilities of Field Programmable Gate Arrays (FPGA's) in the hardware implementation of fast digital signal processing functions. Such capability also makes an FPGA a suitable platform for the digital implementation of closed loop controllers. Other researchers have implemented a variety of closed-loop digital controllers on FPGA's. Some of these controllers include the widely used Proportional-Integral-Derivative (PID) controller, state space controllers, neural network and fuzzy logic based controllers. There are myriad advantages to utilizing an FPGA for discrete-time control functions which include the capability for reconfiguration when SRAM- based FPGA's are employed, fast parallel implementation of multiple control loops and implementations that can meet space level radiation tolerance requirements in a compact form-factor. Generally, a software implementation on a Digital Signal Processor (DSP) device or microcontroller is used to implement digital controllers. At Marshall Space Flight Center, the Control Electronics Group has been studying adaptive discrete-time control of motor driven actuator systems using DSP devices. While small form factor, commercial DSP devices are now available with event capture, data conversion, Pulse Width Modulated (PWM) outputs and communication peripherals, these devices are not currently available in designs and packages which meet space level radiation requirements. In general, very few DSP devices are produced that are designed to meet any level of radiation tolerance or hardness. An alternative is required for compact implementation of such functionality to withstand the harsh environment encountered on spacemap. The goal of this effort is to create a fully digital, flight ready controller design that utilizes an FPGA for implementation of signal conditioning for control feedback signals, generation of commands to the controlled system, and hardware insertion of adaptive-control algorithm approaches. Radiation tolerant FPGA's are a feasible option for reaching this goal.
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in details. The Appendix lists the kernel source code.« less

Radiation Hardened Electronics for Space Environments (RHESE)

NASA Technical Reports Server (NTRS)

Keys, Andrew S.; Adams, James H.; Frazier, Donald O.; Patrick, Marshall C.; Watson, Michael D.; Johnson, Michael A.; Cressler, John D.; Kolawa, Elizabeth A.

2007-01-01

Radiation Environmental Modeling is crucial to proper predictive modeling and electronic response to the radiation environment. When compared to on-orbit data, CREME96 has been shown to be inaccurate in predicting the radiation environment. The NEDD bases much of its radiation environment data on CREME96 output. Close coordination and partnership with DoD radiation-hardened efforts will result in leveraged - not duplicated or independently developed - technology capabilities of: a) Radiation-hardened, reconfigurable FPGA-based electronics; and b) High Performance Processors (NOT duplication or independent development).
Real Time Fault Detection and Diagnostics Using FPGA-Based Architectures

DTIC Science & Technology

2010-03-01

vector sets sent to the DUT. The testing platform combines a myriad of testing and measuring equipment and work hours onto one small reprogrammable ...recently few reprogrammable devices have been used on spacecraft due to their sensitivity to involuntary reconfiguration due to Single Event Upsets...Determination of Nuclear Yield from Thermal Degradation of Automobile Paint MS Thesis. AFIT/GWM/ENP/10-M10. Wright-Patterson AFB OH: Graduate School of
An FPGA-based High Speed Parallel Signal Processing System for Adaptive Optics Testbed

NASA Astrophysics Data System (ADS)

Kim, H.; Choi, Y.; Yang, Y.

In this paper a state-of-the-art FPGA (Field Programmable Gate Array) based high speed parallel signal processing system (SPS) for adaptive optics (AO) testbed with 1 kHz wavefront error (WFE) correction frequency is reported. The AO system consists of Shack-Hartmann sensor (SHS) and deformable mirror (DM), tip-tilt sensor (TTS), tip-tilt mirror (TTM) and an FPGA-based high performance SPS to correct wavefront aberrations. The SHS is composed of 400 subapertures and the DM 277 actuators with Fried geometry, requiring high speed parallel computing capability SPS. In this study, the target WFE correction speed is 1 kHz; therefore, it requires massive parallel computing capabilities as well as strict hard real time constraints on measurements from sensors, matrix computation latency for correction algorithms, and output of control signals for actuators. In order to meet them, an FPGA based real-time SPS with parallel computing capabilities is proposed. In particular, the SPS is made up of a National Instrument's (NI's) real time computer and five FPGA boards based on state-of-the-art Xilinx Kintex 7 FPGA. Programming is done with NI's LabView environment, providing flexibility when applying different algorithms for WFE correction. It also facilitates faster programming and debugging environment as compared to conventional ones. One of the five FPGA's is assigned to measure TTS and calculate control signals for TTM, while the rest four are used to receive SHS signal, calculate slops for each subaperture and correction signal for DM. With this parallel processing capabilities of the SPS the overall closed-loop WFE correction speed of 1 kHz has been achieved. System requirements, architecture and implementation issues are described; furthermore, experimental results are also given.
Moving Beyond 3D Hetero-Integration and Towards Monolithic Integration of Phase-Change RF Switches with SiGe BiCMOS

DTIC Science & Technology

2016-03-31

Corporation, Linthicum, Maryland *Corresponding author: Pavel.Borodulin@ngc.com Abstract: A chip -scale, highly-reconfigurable transmitter and...the technology has been used in a chip -scale, reconfigurable receiver demonstration and ongoing efforts to increase the level of performance and...circuit (RF-FPGA). It consists of a heterogeneous assembly of a SiGe BiCMOS chip with multiple 3D-integrated, low-loss, phase-change switch chiplets
A reconfigurable computing platform for plume tracking with mobile sensor networks

NASA Astrophysics Data System (ADS)

Kim, Byung Hwa; D'Souza, Colin; Voyles, Richard M.; Hesch, Joel; Roumeliotis, Stergios I.

2006-05-01

Much work has been undertaken recently toward the development of low-power, high-performance sensor networks. There are many static remote sensing applications for which this is appropriate. The focus of this development effort is applications that require higher performance computation, but still involve severe constraints on power and other resources. Toward that end, we are developing a reconfigurable computing platform for miniature robotic and human-deployed sensor systems composed of several mobile nodes. The system provides static and dynamic reconfigurability for both software and hardware by the combination of CPU (central processing unit) and FPGA (field-programmable gate array) allowing on-the-fly reprogrammability. Static reconfigurability of the hardware manifests itself in the form of a "morphing bus" architecture that permits the modular connection of various sensors with no bus interface logic. Dynamic hardware reconfigurability provides for the reallocation of hardware resources at run-time as the mobile, resource-constrained nodes encounter unknown environmental conditions that render various sensors ineffective. This computing platform will be described in the context of work on chemical/biological/radiological plume tracking using a distributed team of mobile sensors. The objective for a dispersed team of ground and/or aerial autonomous vehicles (or hand-carried sensors) is to acquire measurements of the concentration of the chemical agent from optimal locations and estimate its source and spread. This requires appropriate distribution, coordination and communication within the team members across a potentially unknown environment. The key problem is to determine the parameters of the distribution of the harmful agent so as to use these values for determining its source and predicting its spread. The accuracy and convergence rate of this estimation process depend not only on the number and accuracy of the sensor measurements but also on their spatial distribution over time (the sampling strategy). For the safety of a human-deployed distribution of sensors, optimized trajectories to minimize human exposure are also of importance. The systems described in this paper are currently being developed. Parts of the system are already in existence and some results from these are described.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)

PubMed Central

Li, Isaac TS; Shum, Warren; Truong, Kevin

2007-01-01

Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).

PubMed

Li, Isaac T S; Shum, Warren; Truong, Kevin

2007-06-07

To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching.
Motion camera based on a custom vision sensor and an FPGA architecture

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel

1998-09-01

A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Reconfigurable environmentally adaptive computing

NASA Technical Reports Server (NTRS)

Coxe, Robin L. (Inventor); Galica, Gary E. (Inventor)

2008-01-01

Described are methods and apparatus, including computer program products, for reconfigurable environmentally adaptive computing technology. An environmental signal representative of an external environmental condition is received. A processing configuration is automatically selected, based on the environmental signal, from a plurality of processing configurations. A reconfigurable processing element is reconfigured to operate according to the selected processing configuration. In some examples, the environmental condition is detected and the environmental signal is generated based on the detected condition.
Radiation-hardened optically reconfigurable gate array exploiting holographic memory characteristics

NASA Astrophysics Data System (ADS)

Seto, Daisaku; Watanabe, Minoru

2015-09-01

In this paper, we present a proposal for a radiation-hardened optically reconfigurable gate array (ORGA). The ORGA is a type of field programmable gate array (FPGA). The ORGA configuration can be executed by the exploitation of holographic memory characteristics even if 20% of the configuration data are damaged. Moreover, the optoelectronic technology enables the high-speed reconfiguration of the programmable gate array. Such a high-speed reconfiguration can increase the radiation tolerance of its programmable gate array to 9.3 × 104 times higher than that of current FPGAs. Through experimentation, this study clarified the configuration dependability using the impulse-noise emulation and high-speed configuration capabilities of the ORGA with corrupt configuration contexts. Moreover, the radiation tolerance of the programmable gate array was confirmed theoretically through probabilistic calculation.
FPGA-Based Smart Sensor for Online Displacement Measurements Using a Heterodyne Interferometer

PubMed Central

Vera-Salas, Luis Alberto; Moreno-Tapia, Sandra Veronica; Garcia-Perez, Arturo; de Jesus Romero-Troncoso, Rene; Osornio-Rios, Roque Alfredo; Serroukh, Ibrahim; Cabal-Yepez, Eduardo

2011-01-01

The measurement of small displacements on the nanometric scale demands metrological systems of high accuracy and precision. In this context, interferometer-based displacement measurements have become the main tools used for traceable dimensional metrology. The different industrial applications in which small displacement measurements are employed requires the use of online measurements, high speed processes, open architecture control systems, as well as good adaptability to specific process conditions. The main contribution of this work is the development of a smart sensor for large displacement measurement based on phase measurement which achieves high accuracy and resolution, designed to be used with a commercial heterodyne interferometer. The system is based on a low-cost Field Programmable Gate Array (FPGA) allowing the integration of several functions in a single portable device. This system is optimal for high speed applications where online measurement is needed and the reconfigurability feature allows the addition of different modules for error compensation, as might be required by a specific application. PMID:22164040
A Plug and Play GNC Architecture Using FPGA Components

NASA Technical Reports Server (NTRS)

KrishnaKumar, K.; Kaneshige, J.; Waterman, R.; Pires, C.; Ippoloito, C.

2005-01-01

The goal of Plug and Play, or PnP, is to allow hardware and software components to work together automatically, without requiring manual setup procedures. As a result, new or replacement hardware can be plugged into a system and automatically configured with the appropriate resource assignments. However, in many cases it may not be practical or even feasible to physically replace hardware components. One method for handling these types of situations is through the incorporation of reconfigurable hardware such as Field Programmable Gate Arrays, or FPGAs. This paper describes a phased approach to developing a Guidance, Navigation, and Control (GNC) architecture that expands on the traditional concepts of PnP, in order to accommodate hardware reconfiguration without requiring detailed knowledge of the hardware. This is achieved by establishing a functional based interface that defines how the hardware will operate, and allow the hardware to reconfigure itself. The resulting system combines the flexibility of manipulating software components with the speed and efficiency of hardware.
An information-theoretic approach to motor action decoding with a reconfigurable parallel architecture.

PubMed

Craciun, Stefan; Brockmeier, Austin J; George, Alan D; Lam, Herman; Príncipe, José C

2011-01-01

Methods for decoding movements from neural spike counts using adaptive filters often rely on minimizing the mean-squared error. However, for non-Gaussian distribution of errors, this approach is not optimal for performance. Therefore, rather than using probabilistic modeling, we propose an alternate non-parametric approach. In order to extract more structure from the input signal (neuronal spike counts) we propose using minimum error entropy (MEE), an information-theoretic approach that minimizes the error entropy as part of an iterative cost function. However, the disadvantage of using MEE as the cost function for adaptive filters is the increase in computational complexity. In this paper we present a comparison between the decoding performance of the analytic Wiener filter and a linear filter trained with MEE, which is then mapped to a parallel architecture in reconfigurable hardware tailored to the computational needs of the MEE filter. We observe considerable speedup from the hardware design. The adaptation of filter weights for the multiple-input, multiple-output linear filters, necessary in motor decoding, is a highly parallelizable algorithm. It can be decomposed into many independent computational blocks with a parallel architecture readily mapped to a field-programmable gate array (FPGA) and scales to large numbers of neurons. By pipelining and parallelizing independent computations in the algorithm, the proposed parallel architecture has sublinear increases in execution time with respect to both window size and filter order.
Malleable architecture generator for FPGA computing

NASA Astrophysics Data System (ADS)

Gokhale, Maya; Kaba, James; Marks, Aaron; Kim, Jang

1996-10-01

The malleable architecture generator (MARGE) is a tool set that translates high-level parallel C to configuration bit streams for field-programmable logic based computing systems. MARGE creates an application-specific instruction set and generates the custom hardware components required to perform exactly those computations specified by the C program. In contrast to traditional fixed-instruction processors, MARGE's dynamic instruction set creation provides for efficient use of hardware resources. MARGE processes intermediate code in which each operation is annotated by the bit lengths of the operands. Each basic block (sequence of straight line code) is mapped into a single custom instruction which contains all the operations and logic inherent in the block. A synthesis phase maps the operations comprising the instructions into register transfer level structural components and control logic which have been optimized to exploit functional parallelism and function unit reuse. As a final stage, commercial technology-specific tools are used to generate configuration bit streams for the desired target hardware. Technology- specific pre-placed, pre-routed macro blocks are utilized to implement as much of the hardware as possible. MARGE currently supports the Xilinx-based Splash-2 reconfigurable accelerator and National Semiconductor's CLAy-based parallel accelerator, MAPA. The MARGE approach has been demonstrated on systolic applications such as DNA sequence comparison.
Achieving High Performance with FPGA-Based Computing

PubMed Central

Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug

2011-01-01

Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088
Leveraging FPGAs for Accelerating Short Read Alignment.

PubMed

Arram, James; Kaplan, Thomas; Luk, Wayne; Jiang, Peiyong

2017-01-01

One of the key challenges facing genomics today is how to efficiently analyze the massive amounts of data produced by next-generation sequencing platforms. With general-purpose computing systems struggling to address this challenge, specialized processors such as the Field-Programmable Gate Array (FPGA) are receiving growing interest. The means by which to leverage this technology for accelerating genomic data analysis is however largely unexplored. In this paper, we present a runtime reconfigurable architecture for accelerating short read alignment using FPGAs. This architecture exploits the reconfigurability of FPGAs to allow the development of fast yet flexible alignment designs. We apply this architecture to develop an alignment design which supports exact and approximate alignment with up to two mismatches. Our design is based on the FM-index, with optimizations to improve the alignment performance. In particular, the n-step FM-index, index oversampling, a seed-and-compare stage, and bi-directional backtracking are included. Our design is implemented and evaluated on a 1U Maxeler MPC-X2000 dataflow node with eight Altera Stratix-V FPGAs. Measurements show that our design is 28 times faster than Bowtie2 running with 16 threads on dual Intel Xeon E5-2640 CPUs, and nine times faster than Soap3-dp running on an NVIDIA Tesla C2070 GPU.
Temperature Tolerant Evolvable Systems Utilizing FPGA Boards and Bias-Controlled Amplifiers

NASA Technical Reports Server (NTRS)

Kumar, Nikhil R.

2005-01-01

Space missions often require radiation and extreme-temperature hardened electronics to survive the harsh environments beyond Earth's atmosphere. Traditional approaches to preserve electronics incorporate shielding, insulation and redundancy at the expense of power and weight. However, a novel way of bypassing these problems is the concept of evolutionary hardware. A reconfigurable device, consisting of several switches interconnected with analog/digital parts, is controlled by an evolutionary processor (EP). When the EP detects degradation in the circuit it sends signals to reconfigure the switches, thus forming a new circuit with the desired output. This concept has been developed since the mid-l990s, but one problem remains-the EP cannot degrade substantially. For this reason, extensive testing at extreme temperatures (-180 to 120 C) has been done on devices found on FPGA boards (taking the role of the EP), such as the Analog to Digital and the Digital to Analog Converter. The EP is used in conjunction with a bias-controlled amplifier and a new prototype relay board, which is interconnected with 6 G4-FETs, a tri-input transistor-like element developed at JPL. The greatest improvements to be made lie in the reconfigurable device, so future design and testing of the G4-FET chip is required.
The SpaceCube Family of Hybrid On-Board Science Data Processors: An Update

NASA Astrophysics Data System (ADS)

Flatley, T.

2012-12-01

SpaceCube is an FPGA based on-board hybrid science data processing system developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. The SpaceCube design strategy incorporates commercial rad-tolerant FPGA technology and couples it with an upset mitigation software architecture to provide "order of magnitude" improvements in computing power over traditional rad-hard flight systems. Many of the missions proposed in the Earth Science Decadal Survey (ESDS) will require "next generation" on-board processing capabilities to meet their specified mission goals. Advanced laser altimeter, radar, lidar and hyper-spectral instruments are proposed for at least ten of the ESDS missions, and all of these instrument systems will require advanced on-board processing capabilities to facilitate the timely conversion of Earth Science data into Earth Science information. Both an "order of magnitude" increase in processing power and the ability to "reconfigure on the fly" are required to implement algorithms that detect and react to events, to produce data products on-board for applications such as direct downlink, quick look, and "first responder" real-time awareness, to enable "sensor web" multi-platform collaboration, and to perform on-board "lossless" data reduction by migrating typical ground-based processing functions on-board, thus reducing on-board storage and downlink requirements. This presentation will highlight a number of SpaceCube technology developments to date and describe current and future efforts, including the collaboration with the U.S. Department of Defense - Space Test Program (DoD/STP) on the STP-H4 ISS experiment pallet (launch June 2013) that will demonstrate SpaceCube 2.0 technology on-orbit.; ;
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.

PubMed

Ly, Daniel Le; Chow, Paul

2010-11-01

Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
SpaceCube 2.0: An Advanced Hybrid Onboard Data Processor

NASA Technical Reports Server (NTRS)

Lin, Michael; Flatley, Thomas; Godfrey, John; Geist, Alessandro; Espinosa, Daniel; Petrick, David

2011-01-01

The SpaceCube 2.0 is a compact, high performance, low-power onboard processing system that takes advantage of cutting-edge hybrid (CPU/FPGA/DSP) processing elements. The SpaceCube 2.0 design concept includes two commercial Virtex-5 field-programmable gate array (FPGA) parts protected by gradiation hardened by software" technology, and possesses exceptional size, weight, and power characteristics [5x5x7 in., 3.5 lb (approximately equal to 12.7 x 12.7 x 17.8 cm, 1.6 kg) 5-25 W, depending on the application fs required clock rate]. The two Virtex-5 FPGA parts are implemented in a unique back-toback configuration to maximize data transfer and computing performance. Draft computing power specifications for the SpaceCube 2.0 unit include four PowerPC 440s (1100 DMIPS each), 500+ DSP48Es (2x580 GMACS), 100+ LVDS high-speed serial I/Os (1.25 Gbps each), and 2x190 GFLOPS single-precision (65 GFLOPS double-precision) floating point performance. The SpaceCube 2.0 includes PROM memory for CPU boot, health and safety, and basic command and telemetry functionality; RAM memory for program execution; and FLASH/EEPROM memory to store algorithms and application code for the CPU, FPGA, and DSP processing elements. Program execution can be reconfigured in real time and algorithms can be updated, modified, and/or replaced at any point during the mission. Gigabit Ethernet, Spacewire, SATA and highspeed LVDS serial/parallel I/O channels are available for instrument/sensor data ingest, and mission-unique instrument interfaces can be accommodated using a compact PCI (cPCI) expansion card interface. The SpaceCube 2.0 can be utilized in NASA Earth Science, Helio/Astrophysics and Exploration missions, and Department of Defense satellites for onboard data processing. It can also be used in commercial communication and mapping satellites.

Parallel Hough Transform-Based Straight Line Detection and Its FPGA Implementation in Embedded Vision

PubMed Central

Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam

2013-01-01

Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness. PMID:23867746
Parallel Hough Transform-based straight line detection and its FPGA implementation in embedded vision.

PubMed

Lu, Xiaofeng; Song, Li; Shen, Sumin; He, Kang; Yu, Songyu; Ling, Nam

2013-07-17

Hough Transform has been widely used for straight line detection in low-definition and still images, but it suffers from execution time and resource requirements. Field Programmable Gate Arrays (FPGA) provide a competitive alternative for hardware acceleration to reap tremendous computing performance. In this paper, we propose a novel parallel Hough Transform (PHT) and FPGA architecture-associated framework for real-time straight line detection in high-definition videos. A resource-optimized Canny edge detection method with enhanced non-maximum suppression conditions is presented to suppress most possible false edges and obtain more accurate candidate edge pixels for subsequent accelerated computation. Then, a novel PHT algorithm exploiting spatial angle-level parallelism is proposed to upgrade computational accuracy by improving the minimum computational step. Moreover, the FPGA based multi-level pipelined PHT architecture optimized by spatial parallelism ensures real-time computation for 1,024 × 768 resolution videos without any off-chip memory consumption. This framework is evaluated on ALTERA DE2-115 FPGA evaluation platform at a maximum frequency of 200 MHz, and it can calculate straight line parameters in 15.59 ms on the average for one frame. Qualitative and quantitative evaluation results have validated the system performance regarding data throughput, memory bandwidth, resource, speed and robustness.
Enabling Future Robotic Missions with Multicore Processors

NASA Technical Reports Server (NTRS)

Powell, Wesley A.; Johnson, Michael A.; Wilmot, Jonathan; Some, Raphael; Gostelow, Kim P.; Reeves, Glenn; Doyle, Richard J.

2011-01-01

Recent commercial developments in multicore processors (e.g. Tilera, Clearspeed, HyperX) have provided an option for high performance embedded computing that rivals the performance attainable with FPGA-based reconfigurable computing architectures. Furthermore, these processors offer more straightforward and streamlined application development by allowing the use of conventional programming languages and software tools in lieu of hardware design languages such as VHDL and Verilog. With these advantages, multicore processors can significantly enhance the capabilities of future robotic space missions. This paper will discuss these benefits, along with onboard processing applications where multicore processing can offer advantages over existing or competing approaches. This paper will also discuss the key artchitecural features of current commercial multicore processors. In comparison to the current art, the features and advancements necessary for spaceflight multicore processors will be identified. These include power reduction, radiation hardening, inherent fault tolerance, and support for common spacecraft bus interfaces. Lastly, this paper will explore how multicore processors might evolve with advances in electronics technology and how avionics architectures might evolve once multicore processors are inserted into NASA robotic spacecraft.
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes themore » FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.« less
Training a molecular automaton to play a game

NASA Astrophysics Data System (ADS)

Pei, Renjun; Matamoros, Elizabeth; Liu, Manhong; Stefanovic, Darko; Stojanovic, Milan N.

2010-11-01

Research at the interface between chemistry and cybernetics has led to reports of `programmable molecules', but what does it mean to say `we programmed a set of solution-phase molecules to do X'? A survey of recently implemented solution-phase circuitry indicates that this statement could be replaced with `we pre-mixed a set of molecules to do X and functional subsets of X'. These hard-wired mixtures are then exposed to a set of molecular inputs, which can be interpreted as being keyed to human moves in a game, or as assertions of logical propositions. In nucleic acids-based systems, stemming from DNA computation, these inputs can be seen as generic oligonucleotides. Here, we report using reconfigurable nucleic acid catalyst-based units to build a multipurpose reprogrammable molecular automaton that goes beyond single-purpose `hard-wired' molecular automata. The automaton covers all possible responses to two consecutive sets of four inputs (such as four first and four second moves for a generic set of trivial two-player two-move games). This is a model system for more general molecular field programmable gate array (FPGA)-like devices that can be programmed by example, which means that the operator need not have any knowledge of molecular computing methods.
Training a molecular automaton to play a game.

PubMed

Pei, Renjun; Matamoros, Elizabeth; Liu, Manhong; Stefanovic, Darko; Stojanovic, Milan N

2010-11-01

Research at the interface between chemistry and cybernetics has led to reports of 'programmable molecules', but what does it mean to say 'we programmed a set of solution-phase molecules to do X'? A survey of recently implemented solution-phase circuitry indicates that this statement could be replaced with 'we pre-mixed a set of molecules to do X and functional subsets of X'. These hard-wired mixtures are then exposed to a set of molecular inputs, which can be interpreted as being keyed to human moves in a game, or as assertions of logical propositions. In nucleic acids-based systems, stemming from DNA computation, these inputs can be seen as generic oligonucleotides. Here, we report using reconfigurable nucleic acid catalyst-based units to build a multipurpose reprogrammable molecular automaton that goes beyond single-purpose 'hard-wired' molecular automata. The automaton covers all possible responses to two consecutive sets of four inputs (such as four first and four second moves for a generic set of trivial two-player two-move games). This is a model system for more general molecular field programmable gate array (FPGA)-like devices that can be programmed by example, which means that the operator need not have any knowledge of molecular computing methods.
Trigger design for a gamma ray detector of HIRFL-ETF

NASA Astrophysics Data System (ADS)

Du, Zhong-Wei; Su, Hong; Qian, Yi; Kong, Jie

2013-10-01

The Gamma Ray Array Detector (GRAD) is one subsystem of HIRFL-ETF (the External Target Facility (ETF) of the Heavy Ion Research Facility in Lanzhou (HIRFL)). It is capable of measuring the energy of gamma-rays with 1024 CsI scintillators in in-beam nuclear experiments. The GRAD trigger should select the valid events and reject the data from the scintillators which are not hit by the gamma-ray. The GRAD trigger has been developed based on the Field Programmable Gate Array (FPGAs) and PXI interface. It makes prompt trigger decisions to select valid events by processing the hit signals from the 1024 CsI scintillators. According to the physical requirements, the GRAD trigger module supplies 12-bit trigger information for the global trigger system of ETF and supplies a trigger signal for data acquisition (DAQ) system of GRAD. In addition, the GRAD trigger generates trigger data that are packed and transmitted to the host computer via PXI bus to be saved for off-line analysis. The trigger processing is implemented in the front-end electronics of GRAD and one FPGA of the GRAD trigger module. The logic of PXI transmission and reconfiguration is implemented in another FPGA of the GRAD trigger module. During the gamma-ray experiments, the GRAD trigger performs reliably and efficiently. The function of GRAD trigger is capable of satisfying the physical requirements.
A hierarchical scheduling and management solution for dynamic reconfiguration in FPGA-based embedded systems

NASA Astrophysics Data System (ADS)

Cervero, T.; Gómez, A.; López, S.; Sarmiento, R.; Dondo, J.; Rincón, F.; López, J. C.

2013-05-01

One of the limiting factors that have prevented a widely dissemination of the reconfigurable technology is the absence of an appropriate model for certain target applications capable of offering a reliable control. Moreover, the lack of flexible and easy-to-use scheduling and management systems are also relevant drawbacks to be considered. Under static scenarios, it is relatively easy to schedule and manage the reconfiguration process since all the variations corresponding to predetermined and well-known tasks. However, the difficulty increases when the adaptation needs of the overall system change semi-randomly according to the environmental fluctuations. In this context, this work proposes a change in the paradigm of dynamically reconfigurable systems, by attending to the dynamically reconfigurable control problematic as a whole, in which the scheduling and the placement issues are packed together as a hierarchical management structure, interacting together as one entity from the system point of view, but performing their tasks with certain degree of independence each other. In this sense, the top hierarchical level corresponds with a dynamic scheduler in charge of planning and adjusting all the reconfigurable modules according to the variations of the external stimulus. The lower level interacts with the physical layer of the device by means of instantiating, relocating, removing a reconfigurable module following the scheduler's instructions. In regards to how fast is the proposed solution, the total partial reconfiguration time achieved with this proposal has been measured and compared with other two approaches: 1) using traditional Xilinx's tools; 2) using an optimized version of the Xilinx's drivers. The collected numbers demonstrate that our solution reaches a gain up to 10 times faster than the other approaches.
Stego on FPGA: An IWT Approach

PubMed Central

Ramalingam, Balakrishnan

2014-01-01

A reconfigurable hardware architecture for the implementation of integer wavelet transform (IWT) based adaptive random image steganography algorithm is proposed. The Haar-IWT was used to separate the subbands namely, LL, LH, HL, and HH, from 8 × 8 pixel blocks and the encrypted secret data is hidden in the LH, HL, and HH blocks using Moore and Hilbert space filling curve (SFC) scan patterns. Either Moore or Hilbert SFC was chosen for hiding the encrypted data in LH, HL, and HH coefficients, whichever produces the lowest mean square error (MSE) and the highest peak signal-to-noise ratio (PSNR). The fixated random walk's verdict of all blocks is registered which is nothing but the furtive key. Our system took 1.6 µs for embedding the data in coefficient blocks and consumed 34% of the logic elements, 22% of the dedicated logic register, and 2% of the embedded multiplier on Cyclone II field programmable gate array (FPGA). PMID:24723794
Real-time windowing in imaging radar using FPGA technique

NASA Astrophysics Data System (ADS)

Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique

2005-02-01

The imaging radar uses the high frequency electromagnetic waves reflected from different objects for estimating of its parameters. Pulse compression is a standard signal processing technique used to minimize the peak transmission power and to maximize SNR, and to get a better resolution. Usually the pulse compression can be achieved using a matched filter. The level of the side-lobes in the imaging radar can be reduced using the special weighting function processing. There are very known different weighting functions: Hamming, Hanning, Blackman, Chebyshev, Blackman-Harris, Kaiser-Bessel, etc., widely used in the signal processing applications. Field Programmable Gate Arrays (FPGAs) offers great benefits like instantaneous implementation, dynamic reconfiguration, design, and field programmability. This reconfiguration makes FPGAs a better solution over custom-made integrated circuits. This work aims at demonstrating a reasonably flexible implementation of FM-linear signal and pulse compression using Matlab, Simulink, and System Generator. Employing FPGA and mentioned software we have proposed the pulse compression design on FPGA using classical and novel windows technique to reduce the side-lobes level. This permits increasing the detection ability of the small or nearly placed targets in imaging radar. The advantage of FPGA that can do parallelism in real time processing permits to realize the proposed algorithms. The paper also presents the experimental results of proposed windowing procedure in the marine radar with such the parameters: signal is linear FM (Chirp); frequency deviation DF is 9.375MHz; the pulse width T is 3.2μs taps number in the matched filter is 800 taps; sampling frequency 253.125*106 MHz. It has been realized the reducing of side-lobes levels in real time permitting better resolution of the small targets.
Design Architecture and Initial Results from an FPGA Based Digital Receiver for Multistatic Meteor Measurements

NASA Astrophysics Data System (ADS)

Palo, Scott; Vaudrin, Cody

Defined by a minimal RF front-end followed by an analog-to-digital converter (ADC) and con-trolled by a reconfigurable logic device (FPGA), the digital receiver will replace conventional heterodyning analog receivers currently in use by the COBRA meteor radar. A basic hardware overview touches on the major digital receiver components, theory of operation and data han-dling strategies. We address concerns within the community regarding the implementation of digital receivers in small-scale scientific radars, and outline the numerous benefits with a focus on reconfigurability. From a remote sensing viewpoint, having complete visibility into a band of the EM spectrum allows an experiment designer to focus on parameter estimation rather than hardware limitations. Finally, we show some basic multistatic receiver configurations enabled through GPS time synchronization. Currently, the digital receiver is configured to facilitate range and radial velocity determination of meteors in the MLT region for use with the COBRA meteor radar. Initial measurements from data acquired at Platteville, Colorado and Tierra Del Fuego in Argentina will be presented. We show an improvement in detection rates compared to conventional analog systems. Scientific justification for a digital receiver is clearly made by the presentation of RTI plots created using data acquired from the receiver. These plots reveal an interesting phenomenon concerning vacillating power structures in a select number of meteor trails.
Field Programmable Gate Array Reliability Analysis Guidelines for Launch Vehicle Reliability Block Diagrams

NASA Technical Reports Server (NTRS)

Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.

2017-01-01

Field Programmable Gate Arrays (FPGAs) integrated circuits (IC) are one of the key electronic components in today's sophisticated launch and space vehicle complex avionic systems, largely due to their superb reprogrammable and reconfigurable capabilities combined with relatively low non-recurring engineering costs (NRE) and short design cycle. Consequently, FPGAs are prevalent ICs in communication protocols and control signal commands. This paper will identify reliability concerns and high level guidelines to estimate FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
Optimal reconfiguration strategy for a degradable multimodule computing system

NASA Technical Reports Server (NTRS)

Lee, Yann-Hang; Shin, Kang G.

1987-01-01

The present quantitative approach to the problem of reconfiguring a degradable multimode system assigns some modules to computation and arranges others for reliability. By using expected total reward as the optimal criterion, there emerges an active reconfiguration strategy based not only on the occurrence of failure but the progression of the given mission. This reconfiguration strategy requires specification of the times at which the system should undergo reconfiguration, and the configurations to which the system should change. The optimal reconfiguration problem is converted to integer nonlinear knapsack and fractional programming problems.
Python based high-level synthesis compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radosław; Pozniak, Krzysztof; Romaniuk, Ryszard

2014-11-01

This paper presents a python based High-Level synthesis (HLS) compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and map it to VHDL. FPGA combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be reconfigured over the lifetime of the system. FPGAs therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Creating parallel programs implemented in FPGAs is not trivial. This article describes design, implementation and first results of created Python based compiler.
A reconfigurable continuous-flow fluidic routing fabric using a modular, scalable primitive.

PubMed

Silva, Ryan; Bhatia, Swapnil; Densmore, Douglas

2016-07-05

Microfluidic devices, by definition, are required to move liquids from one physical location to another. Given a finite and frequently fixed set of physical channels to route fluids, a primitive design element that allows reconfigurable routing of that fluid from any of n input ports to any n output ports will dramatically change the paradigms by which these chips are designed and applied. Furthermore, if these elements are "regular" regarding their design, the programming and fabrication of these elements becomes scalable. This paper presents such a design element called a transposer. We illustrate the design, fabrication and operation of a single transposer. We then scale this design to create a programmable fabric towards a general-purpose, reconfigurable microfluidic platform analogous to the Field Programmable Gate Array (FPGA) found in digital electronics.
The research of data acquisition system for Raman spectrometer

NASA Astrophysics Data System (ADS)

Cui, Xiao; Guo, Pan; Zhang, Yinchao; Chen, Siying; Chen, He; Chen, Wenbo

2011-11-01

Raman spectrometer has been widely used as an identification tool for analyzing material structure and composition in many fields. However, Raman scattering echo signal is very weak, about dozens of photons at most in one laser plus signal. Therefore, it is a great challenge to design a Raman spectrum data acquisition system which could accurately receive the weak echo signal. The system designed in this paper receives optical signals with the principle of photon counter and could detect single photon. The whole system consists of a photoelectric conversion module H7421-40 and a photo counting card including a field programmable gate array (FPGA) chip and a PCI9054 chip. The module H7421-40 including a PMT, an amplifier and a discriminator has high sensitivity on wavelength from 300nm to 720nm. The Center Wavelength is 580nm which is close to the excitation wavelength (532nm), QE 40% at peak wavelength, Count Sensitivity is 7.8*105(S-1PW-1) and Count Linearity is 1.5MHZ. In FPGA chip, the functions are divided into three parts: parameter setting module, controlling module, data collection and storage module. All the commands, parameters and data are transmitted between FPGA and computer by PCI9054 chip through the PCI interface. The result of experiment shows that the Raman spectrum data acquisition system is reasonable and efficient. There are three primary advantages of the data acquisition system: the first one is the high sensitivity with single photon detection capability; the second one is the high integrated level which means all the operation could be done by the photo counting card; and the last one is the high expansion ability because of the smart reconfigurability of FPGA chip.
Real-time FPGA architectures for computer vision

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar

2000-03-01

This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
Radiation-Tolerant Intelligent Memory Stack - RTIMS

NASA Technical Reports Server (NTRS)

Ng, Tak-kwong; Herath, Jeffrey A.

2011-01-01

This innovation provides reconfigurable circuitry and 2-Gb of error-corrected or 1-Gb of triple-redundant digital memory in a small package. RTIMS uses circuit stacking of heterogeneous components and radiation shielding technologies. A reprogrammable field-programmable gate array (FPGA), six synchronous dynamic random access memories, linear regulator, and the radiation mitigation circuits are stacked into a module of 42.7 42.7 13 mm. Triple module redundancy, current limiting, configuration scrubbing, and single- event function interrupt detection are employed to mitigate radiation effects. The novel self-scrubbing and single event functional interrupt (SEFI) detection allows a relatively soft FPGA to become radiation tolerant without external scrubbing and monitoring hardware
Highly Reconfigurable Beamformer Stimulus Generator

NASA Astrophysics Data System (ADS)

Vaviļina, E.; Gaigals, G.

2018-02-01

The present paper proposes a highly reconfigurable beamformer stimulus generator of radar antenna array, which includes three main blocks: settings of antenna array, settings of objects (signal sources) and a beamforming simulator. Following from the configuration of antenna array and object settings, different stimulus can be generated as the input signal for a beamformer. This stimulus generator is developed under a greater concept with two utterly independent paths where one is the stimulus generator and the other is the hardware beamformer. Both paths can be complemented in final and in intermediate steps as well to check and improve system performance. This way the technology development process is promoted by making each of the future hardware steps more substantive. Stimulus generator configuration capabilities and test results are presented proving the application of the stimulus generator for FPGA based beamforming unit development and tuning as an alternative to an actual antenna system.
Self-Adaptive System based on Field Programmable Gate Array for Extreme Temperature Electronics

NASA Technical Reports Server (NTRS)

Keymeulen, Didier; Zebulum, Ricardo; Rajeshuni, Ramesham; Stoica, Adrian; Katkoori, Srinivas; Graves, Sharon; Novak, Frank; Antill, Charles

2006-01-01

In this work, we report the implementation of a self-adaptive system using a field programmable gate array (FPGA) and data converters. The self-adaptive system can autonomously recover the lost functionality of a reconfigurable analog array (RAA) integrated circuit (IC) [3]. Both the RAA IC and the self-adaptive system are operating in extreme temperatures (from 120 C down to -180 C). The RAA IC consists of reconfigurable analog blocks interconnected by several switches and programmable by bias voltages. It implements filters/amplifiers with bandwidth up to 20 MHz. The self-adaptive system controls the RAA IC and is realized on Commercial-Off-The-Shelf (COTS) parts. It implements a basic compensation algorithm that corrects a RAA IC in less than a few milliseconds. Experimental results for the cold temperature environment (down to -180 C) demonstrate the feasibility of this approach.

FPGA and USB based control board for quantum random number generator

NASA Astrophysics Data System (ADS)

Wang, Jian; Wan, Xu; Zhang, Hong-Fei; Gao, Yuan; Chen, Teng-Yun; Liang, Hao

2009-09-01

The design and implementation of FPGA-and-USB-based control board for quantum experiments are discussed. The usage of quantum true random number generator, control- logic in FPGA and communication with computer through USB protocol are proposed in this paper. Programmable controlled signal input and output ports are implemented. The error-detections of data frame header and frame length are designed. This board has been used in our decoy-state based quantum key distribution (QKD) system successfully.
Heterogeneous real-time computing in radio astronomy

NASA Astrophysics Data System (ADS)

Ford, John M.; Demorest, Paul; Ransom, Scott

2010-07-01

Modern computer architectures suited for general purpose computing are often not the best choice for either I/O-bound or compute-bound problems. Sometimes the best choice is not to choose a single architecture, but to take advantage of the best characteristics of different computer architectures to solve your problems. This paper examines the tradeoffs between using computer systems based on the ubiquitous X86 Central Processing Units (CPU's), Field Programmable Gate Array (FPGA) based signal processors, and Graphical Processing Units (GPU's). We will show how a heterogeneous system can be produced that blends the best of each of these technologies into a real-time signal processing system. FPGA's tightly coupled to analog-to-digital converters connect the instrument to the telescope and supply the first level of computing to the system. These FPGA's are coupled to other FPGA's to continue to provide highly efficient processing power. Data is then packaged up and shipped over fast networks to a cluster of general purpose computers equipped with GPU's, which are used for floating-point intensive computation. Finally, the data is handled by the CPU and written to disk, or further processed. Each of the elements in the system has been chosen for its specific characteristics and the role it can play in creating a system that does the most for the least, in terms of power, space, and money.
Tethered Forth system for FPGA applications

NASA Astrophysics Data System (ADS)

Goździkowski, Paweł; Zabołotny, Wojciech M.

2013-10-01

This paper presents the tethered Forth system dedicated for testing and debugging of FPGA based electronic systems. Use of the Forth language allows to interactively develop and run complex testing or debugging routines. The solution is based on a small, 16-bit soft core CPU, used to implement the Forth Virtual Machine. Thanks to the use of the tethered Forth model it is possible to minimize usage of the internal RAM memory in the FPGA. The function of the intelligent terminal, which is an essential part of the tethered Forth system, may be fulfilled by the standard PC computer or by the smartphone. System is implemented in Python (the software for intelligent terminal), and in VHDL (the IP core for FPGA), so it can be easily ported to different hardware platforms. The connection between the terminal and FPGA may be established and disconnected many times without disturbing the state of the FPGA based system. The presented system has been verified in the hardware, and may be used as a tool for debugging, testing and even implementing of control algorithms for FPGA based systems.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.

PubMed

Zierke, Stephanie; Bakos, Jason D

2010-04-12

Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
An FPGA Implementation of a Polychronous Spiking Neural Network with Delay Adaptation.

PubMed

Wang, Runchun; Cohen, Gregory; Stiefel, Klaus M; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, André

2013-01-01

We present an FPGA implementation of a re-configurable, polychronous spiking neural network with a large capacity for spatial-temporal patterns. The proposed neural network generates delay paths de novo, so that only connections that actually appear in the training patterns will be created. This allows the proposed network to use all the axons (variables) to store information. Spike Timing Dependent Delay Plasticity is used to fine-tune and add dynamics to the network. We use a time multiplexing approach allowing us to achieve 4096 (4k) neurons and up to 1.15 million programmable delay axons on a Virtex 6 FPGA. Test results show that the proposed neural network is capable of successfully recalling more than 95% of all spikes for 96% of the stored patterns. The tests also show that the neural network is robust to noise from random input spikes.
Launching GUPPI: the Green Bank Ultimate Pulsar Processing Instrument

NASA Astrophysics Data System (ADS)

DuPlain, Ron; Ransom, Scott; Demorest, Paul; Brandt, Patrick; Ford, John; Shelton, Amy L.

2008-08-01

The National Radio Astronomy Observatory (NRAO) is launching the Green Bank Ultimate Pulsar Processing Instrument (GUPPI), a prototype flexible digital signal processor designed for pulsar observations with the Robert C. Byrd Green Bank Telescope (GBT). GUPPI uses field programmable gate array (FPGA) hardware and design tools developed by the Center for Astronomy Signal Processing and Electronics Research (CASPER) at the University of California, Berkeley. The NRAO has been concurrently developing GUPPI software and hardware using minimal software resources. The software handles instrument monitor and control, data acquisition, and hardware interfacing. GUPPI is currently an expert-only spectrometer, but supports future integration with the full GBT production system. The NRAO was able to take advantage of the unique flexibility of the CASPER FPGA hardware platform, develop hardware and software in parallel, and build a suite of software tools for monitoring, controlling, and acquiring data with a new instrument over a short timeline of just a few months. The NRAO interacts regularly with CASPER and its users, and GUPPI stands as an example of what reconfigurable computing and open-source development can do for radio astronomy. GUPPI is modular for portability, and the NRAO provides the results of development as an open-source resource.
Investigation of FPGA-Based Real-Time Adaptive Digital Pulse Shaping for High-Count-Rate Applications

NASA Astrophysics Data System (ADS)

Saxena, Shefali; Hawari, Ayman I.

2017-07-01

Digital signal processing techniques have been widely used in radiation spectrometry to provide improved stability and performance with compact physical size over the traditional analog signal processing. In this paper, field-programmable gate array (FPGA)-based adaptive digital pulse shaping techniques are investigated for real-time signal processing. National Instruments (NI) NI 5761 14-bit, 250-MS/s adaptor module is used for digitizing high-purity germanium (HPGe) detector's preamplifier pulses. Digital pulse processing algorithms are implemented on the NI PXIe-7975R reconfigurable FPGA (Kintex-7) using the LabVIEW FPGA module. Based on the time separation between successive input pulses, the adaptive shaping algorithm selects the optimum shaping parameters (rise time and flattop time of trapezoid-shaping filter) for each incoming signal. A digital Sallen-Key low-pass filter is implemented to enhance signal-to-noise ratio and reduce baseline drifting in trapezoid shaping. A recursive trapezoid-shaping filter algorithm is employed for pole-zero compensation of exponentially decayed (with two-decay constants) preamplifier pulses of an HPGe detector. It allows extraction of pulse height information at the beginning of each pulse, thereby reducing the pulse pileup and increasing throughput. The algorithms for RC-CR2 timing filter, baseline restoration, pile-up rejection, and pulse height determination are digitally implemented for radiation spectroscopy. Traditionally, at high-count-rate conditions, a shorter shaping time is preferred to achieve high throughput, which deteriorates energy resolution. In this paper, experimental results are presented for varying count-rate and pulse shaping conditions. Using adaptive shaping, increased throughput is accepted while preserving the energy resolution observed using the longer shaping times.
Design and Evaluation of a Scalable and Reconfigurable Multi-Platform System for Acoustic Imaging

PubMed Central

Izquierdo, Alberto; Villacorta, Juan José; del Val Puente, Lara; Suárez, Luis

2016-01-01

This paper proposes a scalable and multi-platform framework for signal acquisition and processing, which allows for the generation of acoustic images using planar arrays of MEMS (Micro-Electro-Mechanical Systems) microphones with low development and deployment costs. Acoustic characterization of MEMS sensors was performed, and the beam pattern of a module, based on an 8 × 8 planar array and of several clusters of modules, was obtained. A flexible framework, formed by an FPGA, an embedded processor, a computer desktop, and a graphic processing unit, was defined. The processing times of the algorithms used to obtain the acoustic images, including signal processing and wideband beamforming via FFT, were evaluated in each subsystem of the framework. Based on this analysis, three frameworks are proposed, defined by the specific subsystems used and the algorithms shared. Finally, a set of acoustic images obtained from sound reflected from a person are presented as a case study in the field of biometric identification. These results reveal the feasibility of the proposed system. PMID:27727174
Remotely Powered Reconfigurable Receiver for Extreme Environment Sensing Platforms

NASA Technical Reports Server (NTRS)

Sheldon, Douglas J.

2012-01-01

Wireless sensors connected in a local network offer revolutionary exploration capabilities, but the current solutions do not work in extreme environments of low temperatures (200K) and low to moderate radiation levels (<50 krad). These sensors (temperature, radiation, infrared, etc.) would need to operate outside the spacecraft/ lander and be totally independent of power from the spacecraft/lander. Flash memory field-programmable gate arrays (FPGAs) are being used as the main signal processing and protocol generation platform in a new receiver. Flash-based FPGAs have been shown to have at least 100 reduced standby power and 10 reduction operating power when compared to normal SRAM-based FPGA technology.
Design techniques for a stable operation of cryogenic field-programmable gate arrays.

PubMed

Homulle, Harald; Visser, Stefan; Patra, Bishnu; Charbon, Edoardo

2018-01-01

In this paper, we show how a deep-submicron field-programmable gate array (FPGA) can be operated more stably at extremely low temperatures through special firmware design techniques. Stability at low temperatures is limited through long power supply wires and reduced performance of various printed circuit board components commonly employed at room temperature. Extensive characterization of these components shows that the majority of decoupling capacitor types and voltage regulators are not well behaved at cryogenic temperatures, asking for an ad hoc solution to stabilize the FPGA supply voltage, especially for sensitive applications. Therefore, we have designed a firmware that enforces a constant power consumption, so as to stabilize the supply voltage in the interior of the FPGA. The FPGA is powered with a supply at several meters distance, causing significant resistive voltage drop and thus fluctuations on the local supply voltage. To achieve the stabilization, the variation in digital logic speed, which directly corresponds to changes in supply voltage, is constantly measured and corrected for through a tunable oscillator farm, implemented on the FPGA. The impact of the stabilization technique is demonstrated together with a reconfigurable analog-to-digital converter (ADC), completely implemented in the FPGA fabric and operating at 15 K. The ADC performance can be improved by at most 1.5 bits (effective number of bits) thanks to the more stable supply voltage. The method is versatile and robust, enabling seamless porting to other FPGA families and configurations.
Design techniques for a stable operation of cryogenic field-programmable gate arrays

NASA Astrophysics Data System (ADS)

Homulle, Harald; Visser, Stefan; Patra, Bishnu; Charbon, Edoardo

2018-01-01

In this paper, we show how a deep-submicron field-programmable gate array (FPGA) can be operated more stably at extremely low temperatures through special firmware design techniques. Stability at low temperatures is limited through long power supply wires and reduced performance of various printed circuit board components commonly employed at room temperature. Extensive characterization of these components shows that the majority of decoupling capacitor types and voltage regulators are not well behaved at cryogenic temperatures, asking for an ad hoc solution to stabilize the FPGA supply voltage, especially for sensitive applications. Therefore, we have designed a firmware that enforces a constant power consumption, so as to stabilize the supply voltage in the interior of the FPGA. The FPGA is powered with a supply at several meters distance, causing significant resistive voltage drop and thus fluctuations on the local supply voltage. To achieve the stabilization, the variation in digital logic speed, which directly corresponds to changes in supply voltage, is constantly measured and corrected for through a tunable oscillator farm, implemented on the FPGA. The impact of the stabilization technique is demonstrated together with a reconfigurable analog-to-digital converter (ADC), completely implemented in the FPGA fabric and operating at 15 K. The ADC performance can be improved by at most 1.5 bits (effective number of bits) thanks to the more stable supply voltage. The method is versatile and robust, enabling seamless porting to other FPGA families and configurations.
A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations

PubMed Central

Mak, Terrence S. T.; Rachmuth, Guy; Lam, Kai-Pui; Poon, Chi-Sang

2008-01-01

Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. Field Programmable Gate Array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to conserve resource consumption. Performances of the various FPGA design approaches are compared theoretically and experimentally in corresponding implementations of the AMPA and NMDA synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory economic solution as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired. PMID:17190033
Design and development progress of a LLRF control system for a 500 MHz superconducting cavity

NASA Astrophysics Data System (ADS)

Lee, Y. S.; Kim, H. W.; Song, H. S.; Lee, J. H.; Park, K. H.; Yu, I. H.; Chai, J. S.

2012-07-01

The LLRF (low-level radio-frequency) control system which regulates the amplitude and the phase of the accelerating voltage inside a RF cavity is essential to ensure the stable operation of charged particle accelerators. Recent advances in digital signal processors and data acquisition systems have allowed the LLRF control system to be implemented in digitally and have made it possible to meet the higher demands associated with the performance of LLRF control systems, such as stability, accuracy, etc. For this reason, many accelerator laboratories have completed or are completing the developments of digital LLRF control systems. The digital LLRF control system has advantages related with flexibility and fast reconfiguration. This paper describes the design of the FPGA (field programmable gate array) based LLRF control system and the status of development for this system. The proposed LLRF control system includes an analog front-end, a digital board (ADC (analog to digital converter), DAC (digital to analog converter), FPGA, etc.) and a RF & clock generation system. The control algorithms will be implemented by using the VHDL (VHSIC (very high speed integrated circuits) hardware description language), and the EPICS (experiment physics and industrial control system) will be ported to the host computer for the communication. In addition, the purpose of this system is to control a 500 MHz RF cavity, so the system will be applied to the superconducting cavity to be installed in the PLS storage ring, and its performance will be tested.
Event-driven processing for hardware-efficient neural spike sorting

NASA Astrophysics Data System (ADS)

Liu, Yan; Pereira, João L.; Constandinou, Timothy G.

2018-02-01

Objective. The prospect of real-time and on-node spike sorting provides a genuine opportunity to push the envelope of large-scale integrated neural recording systems. In such systems the hardware resources, power requirements and data bandwidth increase linearly with channel count. Event-based (or data-driven) processing can provide here a new efficient means for hardware implementation that is completely activity dependant. In this work, we investigate using continuous-time level-crossing sampling for efficient data representation and subsequent spike processing. Approach. (1) We first compare signals (synthetic neural datasets) encoded with this technique against conventional sampling. (2) We then show how such a representation can be directly exploited by extracting simple time domain features from the bitstream to perform neural spike sorting. (3) The proposed method is implemented in a low power FPGA platform to demonstrate its hardware viability. Main results. It is observed that considerably lower data rates are achievable when using 7 bits or less to represent the signals, whilst maintaining the signal fidelity. Results obtained using both MATLAB and reconfigurable logic hardware (FPGA) indicate that feature extraction and spike sorting accuracies can be achieved with comparable or better accuracy than reference methods whilst also requiring relatively low hardware resources. Significance. By effectively exploiting continuous-time data representation, neural signal processing can be achieved in a completely event-driven manner, reducing both the required resources (memory, complexity) and computations (operations). This will see future large-scale neural systems integrating on-node processing in real-time hardware.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

PubMed

Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

2013-01-01

DNA sequence alignment is a cardinal process in computational biology but also is much expensive computationally when performing through traditional computational platforms like CPU. Of many off the shelf platforms explored for speeding up the computation process, FPGA stands as the best candidate due to its performance per dollar spent and performance per watt. These two advantages make FPGA as the most appropriate choice for realizing the aim of personal genomics. The previous implementation of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimization over previous FPGA implementation that increases the overall speed-up achieved as well as the price incurred by the platform that was optimized. The optimizations are (1) The array of processing elements is made to run on change in input value and not on clock, so eliminating the need for tight clock synchronization, (2) the implementation is unrestrained by the size of the sequences to be aligned, (3) the waiting time required for the sequences to load to FPGA is reduced to the minimum possible and (4) an efficient method is devised to store the output matrix that make possible to save the diagonal elements to be used in next pass, in parallel with the computation of output matrix. Implemented on Spartan3 FPGA, this implementation achieved 20 times performance improvement in terms of CUPS over GPP implementation.
Fine-grained parallelism accelerating for RNA secondary structure prediction with pseudoknots based on FPGA.

PubMed

Xia, Fei; Jin, Guoqing

2014-06-01

PKNOTS is a most famous benchmark program and has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard four-dimensional (4D) dynamic programming (DP) method and is the basis of many variants and improved algorithms. Unfortunately, the O(N(6)) computing requirements and complicated data dependency greatly limits the usefulness of PKNOTS package with the explosion in gene database size. In this paper, we present a fine-grained parallel PKNOTS package and prototype system for accelerating RNA folding application based on FPGA chip. We adopted a series of storage optimization strategies to resolve the "Memory Wall" problem. We aggressively exploit parallel computing strategies to improve computational efficiency. We also propose several methods that collectively reduce the storage requirements for FPGA on-chip memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating 4D DP problem for RNA folding application including pseudoknots. The experimental results show a factor of more than 50x average speedup over the PKNOTS-1.08 software running on a PC platform with Intel Core2 Q9400 Quad CPU for input RNA sequences. However, the power consumption of our FPGA accelerator is only about 50% of the general-purpose micro-processors.
Generating clock signals for a cycle accurate, cycle reproducible FPGA based hardware accelerator

DOEpatents

Asaad, Sameth W.; Kapur, Mohit

2016-01-05

A method, system and computer program product are disclosed for generating clock signals for a cycle accurate FPGA based hardware accelerator used to simulate operations of a device-under-test (DUT). In one embodiment, the DUT includes multiple device clocks generating multiple device clock signals at multiple frequencies and at a defined frequency ratio; and the FPG hardware accelerator includes multiple accelerator clocks generating multiple accelerator clock signals to operate the FPGA hardware accelerator to simulate the operations of the DUT. In one embodiment, operations of the DUT are mapped to the FPGA hardware accelerator, and the accelerator clock signals are generated at multiple frequencies and at the defined frequency ratio of the frequencies of the multiple device clocks, to maintain cycle accuracy between the DUT and the FPGA hardware accelerator. In an embodiment, the FPGA hardware accelerator may be used to control the frequencies of the multiple device clocks.
Synchronized operation by field programmable gate array based signal controller for the Thomson scattering diagnostic system in KSTAR.

PubMed

Lee, W R; Kim, H S; Park, M K; Lee, J H; Kim, K H

2012-09-01

The Thomson scattering diagnostic system is successfully installed in the Korea Superconducting Tokamak Advanced Research (KSTAR) facility. We got the electron temperature and electron density data for the first time in 2011, 4th campaign using a field programmable gate array (FPGA) based signal control board. It operates as a signal generator, a detector, a controller, and a time measuring device. This board produces two configurable trigger pulses to operate Nd:YAG laser system and receives a laser beam detection signal from a photodiode detector. It allows a trigger pulse to be delivered to a time delay module to make a scattered signal measurement, measuring an asynchronous time value between the KSTAR timing board and the laser system injection signal. All functions are controlled by the embedded processor running on operating system within a single FPGA. It provides Ethernet communication interface and is configured with standard middleware to integrate with KSTAR. This controller has operated for two experimental campaigns including commissioning and performed the reconfiguration of logic designs to accommodate varying experimental situation without hardware rebuilding.
Bio-inspired motion detection in an FPGA-based smart camera module.

PubMed

Köhler, T; Röchter, F; Lindemann, J P; Möller, R

2009-03-01

Flying insects, despite their relatively coarse vision and tiny nervous system, are capable of carrying out elegant and fast aerial manoeuvres. Studies of the fly visual system have shown that this is accomplished by the integration of signals from a large number of elementary motion detectors (EMDs) in just a few global flow detector cells. We developed an FPGA-based smart camera module with more than 10,000 single EMDs, which is closely modelled after insect motion-detection circuits with respect to overall architecture, resolution and inter-receptor spacing. Input to the EMD array is provided by a CMOS camera with a high frame rate. Designed as an adaptable solution for different engineering applications and as a testbed for biological models, the EMD detector type and parameters such as the EMD time constants, the motion-detection directions and the angle between correlated receptors are reconfigurable online. This allows a flexible and simultaneous detection of complex motion fields such as translation, rotation and looming, such that various tasks, e.g., obstacle avoidance, height/distance control or speed regulation can be performed by the same compact device.
SpaceCube v2.0 Space Flight Hybrid Reconfigurable Data Processing System

NASA Technical Reports Server (NTRS)

Petrick, Dave

2014-01-01

This paper details the design architecture, design methodology, and the advantages of the SpaceCube v2.0 high performance data processing system for space applications. The purpose in building the SpaceCube v2.0 system is to create a superior high performance, reconfigurable, hybrid data processing system that can be used in a multitude of applications including those that require a radiation hardened and reliable solution. The SpaceCube v2.0 system leverages seven years of board design, avionics systems design, and space flight application experiences. This paper shows how SpaceCube v2.0 solves the increasing computing demands of space data processing applications that cannot be attained with a standalone processor approach.The main objective during the design stage is to find a good system balance between power, size, reliability, cost, and data processing capability. These design variables directly impact each other, and it is important to understand how to achieve a suitable balance. This paper will detail how these critical design factors were managed including the construction of an Engineering Model for an experiment on the International Space Station to test out design concepts. We will describe the designs for the processor card, power card, backplane, and a mission unique interface card. The mechanical design for the box will also be detailed since it is critical in meeting the stringent thermal and structural requirements imposed by the processing system. In addition, the mechanical design uses advanced thermal conduction techniques to solve the internal thermal challenges.The SpaceCube v2.0 processing system is based on an extended version of the 3U cPCI standard form factor where each card is 190mm x 100mm in size The typical power draw of the processor card is 8 to 10W and scales with application complexity. The SpaceCube v2.0 data processing card features two Xilinx Virtex-5 QV Field Programmable Gate Arrays (FPGA), eight memory modules, a monitor FPGA with analog monitoring, Ethernet, configurable interconnect to the Xilinx FPGAs including gigabit transceivers, and the necessary voltage regulation. The processor board uses a back-to-back design methodology for common parts that maximizes the board real estate available. This paper will show how to meet the IPC 6012B Class 3A standard with a 22-layer board that has two column grid array devices with 1.0mm pitch. All layout trades such as stack-up options, via selection, and FPGA signal breakout will be discussed with feature size results. The overall board design process will be discussed including parts selection, circuit design, proper signal termination, layout placement and route planning, signal integrity design and verification, and power integrity results. The radiation mitigation techniques will also be detailed including configuration scrubbing options, Xilinx circuit mitigation and FPGA functional monitoring, and memory protection.Finally, this paper will describe how this system is being used to solve the extreme challenges of a robotic satellite servicing mission where typical space-rated processors are not sufficient enough to meet the intensive data processing requirements. The SpaceCube v2.0 is the main payload control computer and is required to control critical subsystems such as autonomous rendezvous and docking using a suite of vision sensors and object avoidance when controlling two robotic arms.

SWARM: A Compact High Resolution Correlator and Wideband VLBI Phased Array Upgrade for SMA

NASA Astrophysics Data System (ADS)

Weintroub, Jonathan

2014-06-01

A new digital back end (DBE) is being commissioned on Mauna Kea. The “SMA Wideband Astronomical ROACH2 Machine”, or SWARM, processes a 4 GHz usable band in single polarization mode and is flexibly reconfigurable for 2 GHz full Stokes dual polarization. The hardware is based on the open source Reconfigurable Open Architecture Computing Hardware 2 (ROACH2) platform from the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER). A 5 GSps quad-core analog-to-digital converter board uses a commercial chip from e2v installed on a CASPER-standard printed circuit board designed by Homin Jiang’s group at ASIAA. Two ADC channels are provided per ROACH2, each sampling a 2.3 GHz Nyquist band generated by a custom wideband block downconverter (BDC). The ROACH2 logic includes 16k-channel Polyphase Filterbank (F-engine) per input followed by a 10 GbE switch based corner-turn which feeds into correlator-accumulator logic (X-engines) co-located with the F-engines. This arrangement makes very effective use of a small amount of digital hardware (just 8 ROACH2s in 1U rack mount enclosures). The primary challenge now is to meet timing at full speed for a large and very complex FPGA bit code. Design of the VLBI phased sum and recorder interface logic is also in process. Our poster will describe the instrument design, with the focus on the particular challenges of ultra wideband signal processing. Early connected commissioning and science verification data will be presented.
Optoelectronic date acquisition system based on FPGA

NASA Astrophysics Data System (ADS)

Li, Xin; Liu, Chunyang; Song, De; Tong, Zhiguo; Liu, Xiangqing

2015-11-01

An optoelectronic date acquisition system is designed based on FPGA. FPGA chip that is EP1C3T144C8 of Cyclone devices from Altera corporation is used as the centre of logic control, XTP2046 chip is used as A/D converter, host computer that communicates with the date acquisition system through RS-232 serial communication interface are used as display device and photo resistance is used as photo sensor. We use Verilog HDL to write logic control code about FPGA. It is proved that timing sequence is correct through the simulation of ModelSim. Test results indicate that this system meets the design requirement, has fast response and stable operation by actual hardware circuit test.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication andmore » kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29W to 42W.« less
Load Forecasting Based Distribution System Network Reconfiguration -- A Distributed Data-Driven Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Huaiguang; Zhang, Yingchen; Muljadi, Eduard

In this paper, a short-term load forecasting approach based network reconfiguration is proposed in a parallel manner. Specifically, a support vector regression (SVR) based short-term load forecasting approach is designed to provide an accurate load prediction and benefit the network reconfiguration. Because of the nonconvexity of the three-phase balanced optimal power flow, a second-order cone program (SOCP) based approach is used to relax the optimal power flow problem. Then, the alternating direction method of multipliers (ADMM) is used to compute the optimal power flow in distributed manner. Considering the limited number of the switches and the increasing computation capability, themore » proposed network reconfiguration is solved in a parallel way. The numerical results demonstrate the feasible and effectiveness of the proposed approach.« less
Case for a field-programmable gate array multicore hybrid machine for an image-processing application

NASA Astrophysics Data System (ADS)

Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos

2011-01-01

General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
Acceleration of Cherenkov angle reconstruction with the new Intel Xeon/FPGA compute platform for the particle identification in the LHCb Upgrade

NASA Astrophysics Data System (ADS)

Faerber, Christian

2017-10-01

The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to readout the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collision for later storage. The architecture of such a computing farm, which can process this amount of data as efficiently as possible, is a challenging task and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). Also for the LHCb upgrade the usage of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link which is for this platform a QPI link. On the FPGA an accelerator is implemented. The used system is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Also another FPGA accelerator the Nallatech 385 PCIe accelerator with the same Stratix V FPGA were tested for performance. The results show that the Intel Xeon/FPGA platforms, which are built in general for high performance computing, are also very interesting for the High Energy Physics community.
STRS Compliant FPGA Waveform Development

NASA Technical Reports Server (NTRS)

Nappier, Jennifer; Downey, Joseph

2008-01-01

The Space Telecommunications Radio System (STRS) Architecture Standard describes a standard for NASA space software defined radios (SDRs). It provides a common framework that can be used to develop and operate a space SDR in a reconfigurable and reprogrammable manner. One goal of the STRS Architecture is to promote waveform reuse among multiple software defined radios. Many space domain waveforms are designed to run in the special signal processing (SSP) hardware. However, the STRS Architecture is currently incomplete in defining a standard for designing waveforms in the SSP hardware. Therefore, the STRS Architecture needs to be extended to encompass waveform development in the SSP hardware. A transmit waveform for space applications was developed to determine ways to extend the STRS Architecture to a field programmable gate array (FPGA). These extensions include a standard hardware abstraction layer for FPGAs and a standard interface between waveform functions running inside a FPGA. Current standards were researched and new standard interfaces were proposed. The implementation of the proposed standard interfaces on a laboratory breadboard SDR will be presented.
Fast data transmission in dynamic data acquisition system for plasma diagnostics

NASA Astrophysics Data System (ADS)

Byszuk, Adrian; Poźniak, Krzysztof; Zabołotny, Wojciech M.; Kasprowicz, Grzegorz; Wojeński, Andrzej; Cieszewski, Radosław; Juszczyk, Bartłomiej; Kolasiński, Piotr; Zienkiewicz, Paweł; Chernyshova, Maryna; Czarski, Tomasz

2014-11-01

This paper describes architecture of a new data acquisition system (DAQ) targeted mainly at plasma diagnostic experiments. Modular architecture, in combination with selected hardware components, allows for straightforward reconfiguration of the whole system, both offline and online. Main emphasis will be put into the implementation of data transmission subsystem in said system. One of the biggest advantages of described system is modular architecture with well defined boundaries between main components: analog frontend (AFE), digital backplane and acquisition/control software. Usage of a FPGA chips allows for a high flexibility in design of analog frontends, including ADC <--> FPGA interface. Data transmission between backplane boards and user software was accomplished with the use of industry-standard PCI Express (PCIe) technology. PCIe implementation includes both FPGA firmware and Linux device driver. High flexibility of PCIe connections was accomplished due to use of configurable PCIe switch. Whenever it's possible, described DAQ system tries to make use of standard off-the-shelf (OTF) components, including typical x86 CPU & motherboard (acting as PCIe controller) and cabling.
Implementation of a pulse coupled neural network in FPGA.

PubMed

Waldemark, J; Millberg, M; Lindblad, T; Waldemark, K; Becanovic, V

2000-06-01

The Pulse Coupled neural network, PCNN, is a biologically inspired neural net and it can be used in various image analysis applications, e.g. time-critical applications in the field of image pre-processing like segmentation, filtering, etc. a VHDL implementation of the PCNN targeting FPGA was undertaken and the results presented here. The implementation contains many interesting features. By pipelining the PCNN structure a very high throughput of 55 million neuron iterations per second could be achieved. By making the coefficients re-configurable during operation, a complete recognition system could be implemented on one, or maybe two, chip(s). Reconsidering the ranges and resolutions of the constants may save a lot of hardware, since the higher resolution requires larger multipliers, adders, memories etc.
SACFIR: SDN-Based Application-Aware Centralized Adaptive Flow Iterative Reconfiguring Routing Protocol for WSNs.

PubMed

Aslam, Muhammad; Hu, Xiaopeng; Wang, Fan

2017-12-13

Smart reconfiguration of a dynamic networking environment is offered by the central control of Software-Defined Networking (SDN). Centralized SDN-based management architectures are capable of retrieving global topology intelligence and decoupling the forwarding plane from the control plane. Routing protocols developed for conventional Wireless Sensor Networks (WSNs) utilize limited iterative reconfiguration methods to optimize environmental reporting. However, the challenging networking scenarios of WSNs involve a performance overhead due to constant periodic iterative reconfigurations. In this paper, we propose the SDN-based Application-aware Centralized adaptive Flow Iterative Reconfiguring (SACFIR) routing protocol with the centralized SDN iterative solver controller to maintain the load-balancing between flow reconfigurations and flow allocation cost. The proposed SACFIR's routing protocol offers a unique iterative path-selection algorithm, which initially computes suitable clustering based on residual resources at the control layer and then implements application-aware threshold-based multi-hop report transmissions on the forwarding plane. The operation of the SACFIR algorithm is centrally supervised by the SDN controller residing at the Base Station (BS). This paper extends SACFIR to SDN-based Application-aware Main-value Centralized adaptive Flow Iterative Reconfiguring (SAMCFIR) to establish both proactive and reactive reporting. The SAMCFIR transmission phase enables sensor nodes to trigger direct transmissions for main-value reports, while in the case of SACFIR, all reports follow computed routes. Our SDN-enabled proposed models adjust the reconfiguration period according to the traffic burden on sensor nodes, which results in heterogeneity awareness, load-balancing and application-specific reconfigurations of WSNs. Extensive experimental simulation-based results show that SACFIR and SAMCFIR yield the maximum scalability, network lifetime and stability period when compared to existing routing protocols.
SACFIR: SDN-Based Application-Aware Centralized Adaptive Flow Iterative Reconfiguring Routing Protocol for WSNs

PubMed Central

Hu, Xiaopeng; Wang, Fan

2017-01-01

Smart reconfiguration of a dynamic networking environment is offered by the central control of Software-Defined Networking (SDN). Centralized SDN-based management architectures are capable of retrieving global topology intelligence and decoupling the forwarding plane from the control plane. Routing protocols developed for conventional Wireless Sensor Networks (WSNs) utilize limited iterative reconfiguration methods to optimize environmental reporting. However, the challenging networking scenarios of WSNs involve a performance overhead due to constant periodic iterative reconfigurations. In this paper, we propose the SDN-based Application-aware Centralized adaptive Flow Iterative Reconfiguring (SACFIR) routing protocol with the centralized SDN iterative solver controller to maintain the load-balancing between flow reconfigurations and flow allocation cost. The proposed SACFIR’s routing protocol offers a unique iterative path-selection algorithm, which initially computes suitable clustering based on residual resources at the control layer and then implements application-aware threshold-based multi-hop report transmissions on the forwarding plane. The operation of the SACFIR algorithm is centrally supervised by the SDN controller residing at the Base Station (BS). This paper extends SACFIR to SDN-based Application-aware Main-value Centralized adaptive Flow Iterative Reconfiguring (SAMCFIR) to establish both proactive and reactive reporting. The SAMCFIR transmission phase enables sensor nodes to trigger direct transmissions for main-value reports, while in the case of SACFIR, all reports follow computed routes. Our SDN-enabled proposed models adjust the reconfiguration period according to the traffic burden on sensor nodes, which results in heterogeneity awareness, load-balancing and application-specific reconfigurations of WSNs. Extensive experimental simulation-based results show that SACFIR and SAMCFIR yield the maximum scalability, network lifetime and stability period when compared to existing routing protocols. PMID:29236031
Algorithmic synthesis using Python compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej

2015-09-01

This paper presents a python to VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translate it to VHDL. FPGA combines many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible, and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. Using higher level of abstraction and High-Level Synthesis compiler implementation time can be reduced. The compiler has been implemented using the Python language. This article describes design, implementation and results of created tools.
A Fine-Grained Pipelined Implementation for Large-Scale Matrix Inversion on FPGA

NASA Astrophysics Data System (ADS)

Zhou, Jie; Dou, Yong; Zhao, Jianxun; Xia, Fei; Lei, Yuanwu; Tang, Yuxing

Large-scale matrix inversion play an important role in many applications. However to the best of our knowledge, there is no FPGA-based implementation. In this paper, we explore the possibility of accelerating large-scale matrix inversion on FPGA. To exploit the computational potential of FPGA, we introduce a fine-grained parallel algorithm for matrix inversion. A scalable linear array processing elements (PEs), which is the core component of the FPGA accelerator, is proposed to implement this algorithm. A total of 12 PEs can be integrated into an Altera StratixII EP2S130F1020C5 FPGA on our self-designed board. Experimental results show that a factor of 2.6 speedup and the maximum power-performance of 41 can be achieved compare to Pentium Dual CPU with double SSE threads.
High-Speed Current dq PI Controller for Vector Controlled PMSM Drive

PubMed Central

Reaz, Mamun Bin Ibne; Rahman, Labonnah Farzana; Chang, Tae Gyu

2014-01-01

High-speed current controller for vector controlled permanent magnet synchronous motor (PMSM) is presented. The controller is developed based on modular design for faster calculation and uses fixed-point proportional-integral (PI) method for improved accuracy. Current dq controller is usually implemented in digital signal processor (DSP) based computer. However, DSP based solutions are reaching their physical limits, which are few microseconds. Besides, digital solutions suffer from high implementation cost. In this research, the overall controller is realizing in field programmable gate array (FPGA). FPGA implementation of the overall controlling algorithm will certainly trim down the execution time significantly to guarantee the steadiness of the motor. Agilent 16821A Logic Analyzer is employed to validate the result of the implemented design in FPGA. Experimental results indicate that the proposed current dq PI controller needs only 50 ns of execution time in 40 MHz clock, which is the lowest computational cycle for the era. PMID:24574913
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator

PubMed Central

Wang, Runchun M.; Thakur, Chetan S.; van Schaik, André

2018-01-01

This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks. PMID:29692702
An FPGA-Based Massively Parallel Neuromorphic Cortex Simulator.

PubMed

Wang, Runchun M; Thakur, Chetan S; van Schaik, André

2018-01-01

This paper presents a massively parallel and scalable neuromorphic cortex simulator designed for simulating large and structurally connected spiking neural networks, such as complex models of various areas of the cortex. The main novelty of this work is the abstraction of a neuromorphic architecture into clusters represented by minicolumns and hypercolumns, analogously to the fundamental structural units observed in neurobiology. Without this approach, simulating large-scale fully connected networks needs prohibitively large memory to store look-up tables for point-to-point connections. Instead, we use a novel architecture, based on the structural connectivity in the neocortex, such that all the required parameters and connections can be stored in on-chip memory. The cortex simulator can be easily reconfigured for simulating different neural networks without any change in hardware structure by programming the memory. A hierarchical communication scheme allows one neuron to have a fan-out of up to 200 k neurons. As a proof-of-concept, an implementation on one Altera Stratix V FPGA was able to simulate 20 million to 2.6 billion leaky-integrate-and-fire (LIF) neurons in real time. We verified the system by emulating a simplified auditory cortex (with 100 million neurons). This cortex simulator achieved a low power dissipation of 1.62 μW per neuron. With the advent of commercially available FPGA boards, our system offers an accessible and scalable tool for the design, real-time simulation, and analysis of large-scale spiking neural networks.
Research of the small satellite data management system

NASA Astrophysics Data System (ADS)

Yu, Xiaozhou; Zhou, Fengqi; Zhou, Jun

2007-11-01

Small satellite is the integration of light weight, small volume and low launch cost. It is a promising approach to realize the future space mission. A detailed study of the data management system has been carried out, with using new reconfiguration method based on System On Programmable Chip (SOPC). Compared with common structure of satellite, the Central Terminal Unit (CTU), the Remote Terminal Unit (RTU) and Serial Data Bus (SDB) of the data management are all integrated in single chip. Thus the reliability of the satellite is greatly improved. At the same time, the data management system has powerful performance owing to the modern FPGA processing ability.
FPGA-based architecture for motion recovering in real-time

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar

2002-03-01

A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher level goals in different applications. Motion estimation algorithms pose a significant computational load for the sequential processors limiting its use in practical applications. In this work we propose a hardware architecture for motion estimation in real time based on FPGA technology. The technique used for motion estimation is Optical Flow due to its accuracy, and the density of velocity estimation, however other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates near gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility to cover different applications. Some results will be presented and the real-time performance will be discussed and analyzed. The architecture is prototyped in an FPGA board with a Virtex device interfaced to a digital imager.
Real-time field programmable gate array architecture for computer vision

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar

2001-01-01

This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on dedicated very- large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
Locomotive track detection for underground

NASA Astrophysics Data System (ADS)

Ma, Zhonglei; Lang, Wenhui; Li, Xiaoming; Wei, Xing

2017-08-01

In order to improve the PC-based track detection system, this paper proposes a method to detect linear track for underground locomotive based on DSP + FPGA. Firstly, the analog signal outputted from the camera is sampled by A / D chip. Then the collected digital signal is preprocessed by FPGA. Secondly, the output signal of FPGA is transmitted to DSP via EMIF port. Subsequently, the adaptive threshold edge detection, polar angle and radius constrain based Hough transform are implemented by DSP. Lastly, the detected track information is transmitted to host computer through Ethernet interface. The experimental results show that the system can not only meet the requirements of real-time detection, but also has good robustness.

Rapid Corner Detection Using FPGAs

NASA Technical Reports Server (NTRS)

Morfopoulos, Arin C.; Metz, Brandon C.

2010-01-01

In order to perform precision landings for space missions, a control system must be accurate to within ten meters. Feature detection applied against images taken during descent and correlated against the provided base image is computationally expensive and requires tens of seconds of processing time to do just one image while the goal is to process multiple images per second. To solve this problem, this algorithm takes that processing load from the central processing unit (CPU) and gives it to a reconfigurable field programmable gate array (FPGA), which is able to compute data in parallel at very high clock speeds. The workload of the processor then becomes simpler; to read an image from a camera, it is transferred into the FPGA, and the results are read back from the FPGA. The Harris Corner Detector uses the determinant and trace to find a corner score, with each step of the computation occurring on independent clock cycles. Essentially, the image is converted into an x and y derivative map. Once three lines of pixel information have been queued up, valid pixel derivatives are clocked into the product and averaging phase of the pipeline. Each x and y derivative is squared against itself, as well as the product of the ix and iy derivative, and each value is stored in a WxN size buffer, where W represents the size of the integration window and N is the width of the image. In this particular case, a window size of 5 was chosen, and the image is 640 480. Over a WxN size window, an equidistance Gaussian is applied (to bring out the stronger corners), and then each value in the entire window is summed and stored. The required components of the equation are in place, and it is just a matter of taking the determinant and trace. It should be noted that the trace is being weighted by a constant k, a value that is found empirically to be within 0.04 to 0.15 (and in this implementation is 0.05). The constant k determines the number of corners available to be compared against a threshold sigma to mark a valid corner. After a fixed delay from when the first pixel is clocked in (to fill the pipeline), a score is achieved after each successive clock. This score corresponds with an (x,y) location within the image. If the score is higher than the predetermined threshold sigma, then a flag is set high and the location is recorded.
Real-Time Spaceborne Synthetic Aperture Radar Float-Point Imaging System Using Optimized Mapping Methodology and a Multi-Node Parallel Accelerating Technique

PubMed Central

Li, Bingyi; Chen, Liang; Yu, Wenyue; Xie, Yizhuang; Bian, Mingming; Zhang, Qingjun; Pang, Long

2018-01-01

With the development of satellite load technology and very large-scale integrated (VLSI) circuit technology, on-board real-time synthetic aperture radar (SAR) imaging systems have facilitated rapid response to disasters. A key goal of the on-board SAR imaging system design is to achieve high real-time processing performance under severe size, weight, and power consumption constraints. This paper presents a multi-node prototype system for real-time SAR imaging processing. We decompose the commonly used chirp scaling (CS) SAR imaging algorithm into two parts according to the computing features. The linearization and logic-memory optimum allocation methods are adopted to realize the nonlinear part in a reconfigurable structure, and the two-part bandwidth balance method is used to realize the linear part. Thus, float-point SAR imaging processing can be integrated into a single Field Programmable Gate Array (FPGA) chip instead of relying on distributed technologies. A single-processing node requires 10.6 s and consumes 17 W to focus on 25-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384. The design methodology of the multi-FPGA parallel accelerating system under the real-time principle is introduced. As a proof of concept, a prototype with four processing nodes and one master node is implemented using a Xilinx xc6vlx315t FPGA. The weight and volume of one single machine are 10 kg and 32 cm × 24 cm × 20 cm, respectively, and the power consumption is under 100 W. The real-time performance of the proposed design is demonstrated on Chinese Gaofen-3 stripmap continuous imaging. PMID:29495637
Field-Programmable Gate Array Computer in Structural Analysis: An Initial Exploration

NASA Technical Reports Server (NTRS)

Singleterry, Robert C., Jr.; Sobieszczanski-Sobieski, Jaroslaw; Brown, Samuel

2002-01-01

This paper reports on an initial assessment of using a Field-Programmable Gate Array (FPGA) computational device as a new tool for solving structural mechanics problems. A FPGA is an assemblage of binary gates arranged in logical blocks that are interconnected via software in a manner dependent on the algorithm being implemented and can be reprogrammed thousands of times per second. In effect, this creates a computer specialized for the problem that automatically exploits all the potential for parallel computing intrinsic in an algorithm. This inherent parallelism is the most important feature of the FPGA computational environment. It is therefore important that if a problem offers a choice of different solution algorithms, an algorithm of a higher degree of inherent parallelism should be selected. It is found that in structural analysis, an 'analog computer' style of programming, which solves problems by direct simulation of the terms in the governing differential equations, yields a more favorable solution algorithm than current solution methods. This style of programming is facilitated by a 'drag-and-drop' graphic programming language that is supplied with the particular type of FPGA computer reported in this paper. Simple examples in structural dynamics and statics illustrate the solution approach used. The FPGA system also allows linear scalability in computing capability. As the problem grows, the number of FPGA chips can be increased with no loss of computing efficiency due to data flow or algorithmic latency that occurs when a single problem is distributed among many conventional processors that operate in parallel. This initial assessment finds the FPGA hardware and software to be in their infancy in regard to the user conveniences; however, they have enormous potential for shrinking the elapsed time of structural analysis solutions if programmed with algorithms that exhibit inherent parallelism and linear scalability. This potential warrants further development of FPGA-tailored algorithms for structural analysis.
Uranus: a rapid prototyping tool for FPGA embedded computer vision

NASA Astrophysics Data System (ADS)

Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.

2007-01-01

The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Reconfigurable Computing As an Enabling Technology for Single-Photon-Counting Laser Altimetry

NASA Technical Reports Server (NTRS)

Powell, Wesley; Hicks, Edward; Pinchinat, Maxime; Dabney, Philip; McGarry, Jan; Murray, Paul

2003-01-01

Single-photon-counting laser altimetry is a new measurement technique offering significant advantages in vertical resolution, reducing instrument size, mass, and power, and reducing laser complexity as compared to analog or threshold detection laser altimetry techniques. However, these improvements come at the cost of a dramatically increased requirement for onboard real-time data processing. Reconfigurable computing has been shown to offer considerable performance advantages in performing this processing. These advantages have been demonstrated on the Multi-KiloHertz Micro-Laser Altimeter (MMLA), an aircraft based single-photon-counting laser altimeter developed by NASA Goddard Space Flight Center with several potential spaceflight applications. This paper describes how reconfigurable computing technology was employed to perform MMLA data processing in real-time under realistic operating constraints, along with the results observed. This paper also expands on these prior results to identify concepts for using reconfigurable computing to enable spaceflight single-photon-counting laser altimeter instruments.
Lunar Applications in Reconfigurable Computing

NASA Technical Reports Server (NTRS)

Somervill, Kevin

2008-01-01

NASA s Constellation Program is developing a lunar surface outpost in which reconfigurable computing will play a significant role. Reconfigurable systems provide a number of benefits over conventional software-based implementations including performance and power efficiency, while the use of standardized reconfigurable hardware provides opportunities to reduce logistical overhead. The current vision for the lunar surface architecture includes habitation, mobility, and communications systems, each of which greatly benefit from reconfigurable hardware in applications including video processing, natural feature recognition, data formatting, IP offload processing, and embedded control systems. In deploying reprogrammable hardware, considerations similar to those of software systems must be managed. There needs to be a mechanism for discovery enabling applications to locate and utilize the available resources. Also, application interfaces are needed to provide for both configuring the resources as well as transferring data between the application and the reconfigurable hardware. Each of these topics are explored in the context of deploying reconfigurable resources as an integral aspect of the lunar exploration architecture.
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

NASA Astrophysics Data System (ADS)

Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

2008-04-01

This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
Language Classification using N-grams Accelerated by FPGA-based Bloom Filters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacob, A; Gokhale, M

N-Gram (n-character sequences in text documents) counting is a well-established technique used in classifying the language of text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom Filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (HAIL algorithm) that uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x comparable software and 1.45x the competing hardware design.
Bio-Inspired Controller on an FPGA Applied to Closed-Loop Diaphragmatic Stimulation

PubMed Central

Zbrzeski, Adeline; Bornat, Yannick; Hillen, Brian; Siu, Ricardo; Abbas, James; Jung, Ranu; Renaud, Sylvie

2016-01-01

Cervical spinal cord injury can disrupt connections between the brain respiratory network and the respiratory muscles which can lead to partial or complete loss of ventilatory control and require ventilatory assistance. Unlike current open-loop technology, a closed-loop diaphragmatic pacing system could overcome the drawbacks of manual titration as well as respond to changing ventilation requirements. We present an original bio-inspired assistive technology for real-time ventilation assistance, implemented in a digital configurable Field Programmable Gate Array (FPGA). The bio-inspired controller, which is a spiking neural network (SNN) inspired by the medullary respiratory network, is as robust as a classic controller while having a flexible, low-power and low-cost hardware design. The system was simulated in MATLAB with FPGA-specific constraints and tested with a computational model of rat breathing; the model reproduced experimentally collected respiratory data in eupneic animals. The open-loop version of the bio-inspired controller was implemented on the FPGA. Electrical test bench characterizations confirmed the system functionality. Open and closed-loop paradigm simulations were simulated to test the FPGA system real-time behavior using the rat computational model. The closed-loop system monitors breathing and changes in respiratory demands to drive diaphragmatic stimulation. The simulated results inform future acute animal experiments and constitute the first step toward the development of a neuromorphic, adaptive, compact, low-power, implantable device. The bio-inspired hardware design optimizes the FPGA resource and time costs while harnessing the computational power of spike-based neuromorphic hardware. Its real-time feature makes it suitable for in vivo applications. PMID:27378844
FPGA-based real-time phase measuring profilometry algorithm design and implementation

NASA Astrophysics Data System (ADS)

Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng

2016-11-01

Phase measuring profilometry (PMP) has been widely used in many fields, like Computer Aided Verification (CAV), Flexible Manufacturing System (FMS) et al. High frame-rate (HFR) real-time vision-based feedback control will be a common demands in near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limit the efficiency of data processing. FPGA has the advantages of pipeline architecture and parallel execution, and it fit for handling PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of hardware architecture includes rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy was analyzed.
Real-time distortion correction for visual inspection systems based on FPGA

NASA Astrophysics Data System (ADS)

Liang, Danhua; Zhang, Zhaoxia; Chen, Xiaodong; Yu, Daoyin

2008-03-01

Visual inspection is a kind of new technology based on the research of computer vision, which focuses on the measurement of the object's geometry and location. It can be widely used in online measurement, and other real-time measurement process. Because of the defects of the traditional visual inspection, a new visual detection mode -all-digital intelligent acquisition and transmission is presented. The image processing, including filtering, image compression, binarization, edge detection and distortion correction, can be completed in the programmable devices -FPGA. As the wide-field angle lens is adopted in the system, the output images have serious distortion. Limited by the calculating speed of computer, software can only correct the distortion of static images but not the distortion of dynamic images. To reach the real-time need, we design a distortion correction system based on FPGA. The method of hardware distortion correction is that the spatial correction data are calculated first under software circumstance, then converted into the address of hardware storage and stored in the hardware look-up table, through which data can be read out to correct gray level. The major benefit using FPGA is that the same circuit can be used for other circularly symmetric wide-angle lenses without being modified.
Integration of the Reconfigurable Self-Healing eDNA Architecture in an Embedded System

NASA Technical Reports Server (NTRS)

Boesen, Michael Reibel; Keymeulen, Didier; Madsen, Jan; Lu, Thomas; Chao, Tien-Hsin

2011-01-01

In this work we describe the first real world case study for the self-healing eDNA (electronic DNA) architecture by implementing the control and data processing of a Fourier Transform Spectrometer (FTS) on an eDNA prototype. For this purpose the eDNA prototype has been ported from a Xilinx Virtex 5 FPGA to an embedded system consisting of a PowerPC and a Xilinx Virtex 5 FPGA. The FTS instrument features a novel liquid crystal waveguide, which consequently eliminates all moving parts from the instrument. The addition of the eDNA architecture to do the control and data processing has resulted in a highly fault-tolerant FTS instrument. The case study has shown that the early stage prototype of the autonomous self-healing eDNA architecture is expensive in terms of execution time.
Nonvolatile reconfigurable sequential logic in a HfO2 resistive random access memory array.

PubMed

Zhou, Ya-Xiong; Li, Yi; Su, Yu-Ting; Wang, Zhuo-Rui; Shih, Ling-Yi; Chang, Ting-Chang; Chang, Kuan-Chang; Long, Shi-Bing; Sze, Simon M; Miao, Xiang-Shui

2017-05-25

Resistive random access memory (RRAM) based reconfigurable logic provides a temporal programmable dimension to realize Boolean logic functions and is regarded as a promising route to build non-von Neumann computing architecture. In this work, a reconfigurable operation method is proposed to perform nonvolatile sequential logic in a HfO 2 -based RRAM array. Eight kinds of Boolean logic functions can be implemented within the same hardware fabrics. During the logic computing processes, the RRAM devices in an array are flexibly configured in a bipolar or complementary structure. The validity was demonstrated by experimentally implemented NAND and XOR logic functions and a theoretically designed 1-bit full adder. With the trade-off between temporal and spatial computing complexity, our method makes better use of limited computing resources, thus provides an attractive scheme for the construction of logic-in-memory systems.
Address-event-based platform for bioinspired spiking systems

NASA Astrophysics Data System (ADS)

Jiménez-Fernández, A.; Luján, C. D.; Linares-Barranco, A.; Gómez-Rodríguez, F.; Rivas, M.; Jiménez, G.; Civit, A.

2007-05-01

Address Event Representation (AER) is an emergent neuromorphic interchip communication protocol that allows a real-time virtual massive connectivity between huge number neurons, located on different chips. By exploiting high speed digital communication circuits (with nano-seconds timings), synaptic neural connections can be time multiplexed, while neural activity signals (with mili-seconds timings) are sampled at low frequencies. Also, neurons generate "events" according to their activity levels. More active neurons generate more events per unit time, and access the interchip communication channel more frequently, while neurons with low activity consume less communication bandwidth. When building multi-chip muti-layered AER systems, it is absolutely necessary to have a computer interface that allows (a) reading AER interchip traffic into the computer and visualizing it on the screen, and (b) converting conventional frame-based video stream in the computer into AER and injecting it at some point of the AER structure. This is necessary for test and debugging of complex AER systems. In the other hand, the use of a commercial personal computer implies to depend on software tools and operating systems that can make the system slower and un-robust. This paper addresses the problem of communicating several AER based chips to compose a powerful processing system. The problem was discussed in the Neuromorphic Engineering Workshop of 2006. The platform is based basically on an embedded computer, a powerful FPGA and serial links, to make the system faster and be stand alone (independent from a PC). A new platform is presented that allow to connect up to eight AER based chips to a Spartan 3 4000 FPGA. The FPGA is responsible of the network communication based in Address-Event and, at the same time, to map and transform the address space of the traffic to implement a pre-processing. A MMU microprocessor (Intel XScale 400MHz Gumstix Connex computer) is also connected to the FPGA to allow the platform to implement eventbased algorithms to interact to the AER system, like control algorithms, network connectivity, USB support, etc. The LVDS transceiver allows a bandwidth of up to 1.32 Gbps, around ~66 Mega events per second (Mevps).
Dual Super-Systolic Core for Real-Time Reconstructive Algorithms of High-Resolution Radar/SAR Imaging Systems

PubMed Central

Atoche, Alejandro Castillo; Castillo, Javier Vázquez

2012-01-01

A high-speed dual super-systolic core for reconstructive signal processing (SP) operations consists of a double parallel systolic array (SA) machine in which each processing element of the array is also conceptualized as another SA in a bit-level fashion. In this study, we addressed the design of a high-speed dual super-systolic array (SSA) core for the enhancement/reconstruction of remote sensing (RS) imaging of radar/synthetic aperture radar (SAR) sensor systems. The selected reconstructive SP algorithms are efficiently transformed in their parallel representation and then, they are mapped into an efficient high performance embedded computing (HPEC) architecture in reconfigurable Xilinx field programmable gate array (FPGA) platforms. As an implementation test case, the proposed approach was aggregated in a HW/SW co-design scheme in order to solve the nonlinear ill-posed inverse problem of nonparametric estimation of the power spatial spectrum pattern (SSP) from a remotely sensed scene. We show how such dual SSA core, drastically reduces the computational load of complex RS regularization techniques achieving the required real-time operational mode. PMID:22736964
Efficient architecture for spike sorting in reconfigurable hardware.

PubMed

Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying

2013-11-01

This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors share the same circuit for lowering the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented by field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design for attaining high classification correct rate and high speed computation.
The RTE inversion on FPGA aboard the solar orbiter PHI instrument

NASA Astrophysics Data System (ADS)

Cobos Carrascosa, J. P.; Aparicio del Moral, B.; Ramos Mas, J. L.; Balaguer, M.; López Jiménez, A. C.; del Toro Iniesta, J. C.

2016-07-01

In this work we propose a multiprocessor architecture to reach high performance in floating point operations by using radiation tolerant FPGA devices, and under narrow time and power constraints. This architecture is used in the PHI instrument that carries out the scientific analysis aboard the ESA's Solar Orbiter mission. The proposed architecture, in a SIMD flavor, is aimed to be an accelerator within the Data Processing Unit (it is composed by a main Leon processor and two FPGAs) for carrying out the RTE inversion on board the spacecraft using a relatively slow FPGA device - Xilinx XQR4VSX55-. The proposed architecture squeezes the FPGA resources in order to reach the computational requirements and improves the ground-based system performance based on commercial CPUs regarding time and power consumption. In this work we demonstrate the feasibility of using this FPGA devices embedded in the SO/PHI instrument. With that goal in mind, we perform tests to evaluate the scientific results and to measure the processing time and power consumption for carrying out the RTE inversion.
Evaluation of Advanced Computing Techniques and Technologies: Reconfigurable Computing

NASA Technical Reports Server (NTRS)

Wells, B. Earl

2003-01-01

The focus of this project was to survey the technology of reconfigurable computing determine its level of maturity and suitability for NASA applications. To better understand and assess the effectiveness of the reconfigurable design paradigm that is utilized within the HAL-15 reconfigurable computer system. This system was made available to NASA MSFC for this purpose, from Star Bridge Systems, Inc. To implement on at least one application that would benefit from the performance levels that are possible with reconfigurable hardware. It was originally proposed that experiments in fault tolerance and dynamically reconfigurability would be perform but time constraints mandated that these be pursued as future research.
FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

PubMed Central

2011-01-01

Background Protein is an important molecule that performs a wide range of functions in biological systems. Recently, the protein folding attracts much more attention since the function of protein can be generally derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, the execution time is still intolerable with the steep growth in protein database. Recently, FPGA chips have emerged as one promising application accelerator to accelerate bioinformatics algorithms by exploiting fine-grained custom design. Results In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. However, the power consumption is only about 30% of that of current general-propose CPUs. PMID:21342582
Reconfigurable PCI Express cards for low-latency data transport in HEP experiments

NASA Astrophysics Data System (ADS)

Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lamanna, G.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pontisso, L.; Simula, F.; Vicini, P.

2017-01-01

State-of-the-art technology supports the High Energy Physics community in addressing the problem of managing an overwhelming amount of experimental data. From the point of view of communication between the detectors' readout system and computing nodes, the critical issues are the following: latency, moving data in a deterministic and low amount of time; bandwidth, guaranteeing the maximum capability of the link and communication protocol adopted; endpoint consolidation, tight aggregation of channels on a single board. This contribution describes the status and performances of the NaNet project, whose goal is the design of a family of FPGA-based PCIe network interface cards. The efforts of the team are focused on implementing a low-latency, real-time data transport mechanism between the board network multi-channel system and CPU and GPU accelerators memories on the host. Several opportunities concerning technical solutions and scientific applications have been explored: NaNet-1 with a single GbE I/O interface, and NaNet-10, offering four 10GbE ports, for activities related to the GPU-based real-time trigger of NA62 experiment at CERN; NaNet ^3 , with four 2.5Gbit optical channels, developed for the KM3NeT-ITALIA underwater neutrino telescope.

An electrically reconfigurable logic gate intrinsically enabled by spin-orbit materials.

PubMed

Kazemi, Mohammad

2017-11-10

The spin degree of freedom in magnetic devices has been discussed widely for computing, since it could significantly reduce energy dissipation, might enable beyond Von Neumann computing, and could have applications in quantum computing. For spin-based computing to become widespread, however, energy efficient logic gates comprising as few devices as possible are required. Considerable recent progress has been reported in this area. However, proposals for spin-based logic either require ancillary charge-based devices and circuits in each individual gate or adopt principals underlying charge-based computing by employing ancillary spin-based devices, which largely negates possible advantages. Here, we show that spin-orbit materials possess an intrinsic basis for the execution of logic operations. We present a spin-orbit logic gate that performs a universal logic operation utilizing the minimum possible number of devices, that is, the essential devices required for representing the logic operands. Also, whereas the previous proposals for spin-based logic require extra devices in each individual gate to provide reconfigurability, the proposed gate is 'electrically' reconfigurable at run-time simply by setting the amplitude of the clock pulse applied to the gate. We demonstrate, analytically and numerically with experimentally benchmarked models, that the gate performs logic operations and simultaneously stores the result, realizing the 'stateful' spin-based logic scalable to ultralow energy dissipation.
Real-time implementation of camera positioning algorithm based on FPGA & SOPC

NASA Astrophysics Data System (ADS)

Yang, Mingcao; Qiu, Yuehong

2014-09-01

In recent years, with the development of positioning algorithm and FPGA, to achieve the camera positioning based on real-time implementation, rapidity, accuracy of FPGA has become a possibility by way of in-depth study of embedded hardware and dual camera positioning system, this thesis set up an infrared optical positioning system based on FPGA and SOPC system, which enables real-time positioning to mark points in space. Thesis completion include: (1) uses a CMOS sensor to extract the pixel of three objects with total feet, implemented through FPGA hardware driver, visible-light LED, used here as the target point of the instrument. (2) prior to extraction of the feature point coordinates, the image needs to be filtered to avoid affecting the physical properties of the system to bring the platform, where the median filtering. (3) Coordinate signs point to FPGA hardware circuit extraction, a new iterative threshold selection method for segmentation of images. Binary image is then segmented image tags, which calculates the coordinates of the feature points of the needle through the center of gravity method. (4) direct linear transformation (DLT) and extreme constraints method is applied to three-dimensional reconstruction of the plane array CMOS system space coordinates. using SOPC system on a chip here, taking advantage of dual-core computing systems, which let match and coordinate operations separately, thus increase processing speed.
Bridging FPGA and GPU technologies for AO real-time control

NASA Astrophysics Data System (ADS)

Perret, Denis; Lainé, Maxime; Bernard, Julien; Gratadour, Damien; Sevin, Arnaud

2016-07-01

Our team has developed a common environment for high performance simulations and real-time control of AO systems based on the use of Graphics Processors Units in the context of the COMPASS project. Such a solution, based on the ability of the real time core in the simulation to provide adequate computing performance, limits the cost of developing AO RTC systems and makes them more scalable. A code developed and validated in the context of the simulation may be injected directly into the system and tested on sky. Furthermore, the use of relatively low cost components also offers significant advantages for the system hardware platform. However, the use of GPUs in an AO loop comes with drawbacks: the traditional way of offloading computation from CPU to GPUs - involving multiple copies and unacceptable overhead in kernel launching - is not well suited in a real time context. This last application requires the implementation of a solution enabling direct memory access (DMA) to the GPU memory from a third party device, bypassing the operating system. This allows this device to communicate directly with the real-time core of the simulation feeding it with the WFS camera pixel stream. We show that DMA between a custom FPGA-based frame-grabber and a computation unit (GPU, FPGA, or Coprocessor such as Xeon-phi) across PCIe allows us to get latencies compatible with what will be needed on ELTs. As a fine-grained synchronization mechanism is not yet made available by GPU vendors, we propose the use of memory polling to avoid interrupts handling and involvement of a CPU. Network and Vision protocols are handled by the FPGA-based Network Interface Card (NIC). We present the results we obtained on a complete AO loop using camera and deformable mirror simulators.
Technology Readiness Level (TRL) Advancement of the MSPI On-Board Processing Platform for the ACE Decadal Survey Mission

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Werne, Thomas A.; Bekker, Dmitriy L.; Wilson, Thor O.

2011-01-01

The Xilinx Virtex-5QV is a new Single-event Immune Reconfigurable FPGA (SIRF) device that is targeted as the spaceborne processor for the NASA Decadal Survey Aerosol-Cloud-Ecosystem (ACE) mission's Multiangle SpectroPolarimetric Imager (MSPI) instrument, currently under development at JPL. A key technology needed for MSPI is on-board processing (OBP) to calculate polarimetry data as imaged by each of the 9 cameras forming the instrument. With funding from NASA's ESTO1 AIST2 Program, JPL is demonstrating how signal data at 95 Mbytes/sec over 16 channels for each of the 9 multi-angle cameras can be reduced to 0.45 Mbytes/sec, thereby substantially reducing the image data volume for spacecraft downlink without loss of science information. This is done via a least-squares fitting algorithm implemented on the Virtex-5 FPGA operating in real-time on the raw video data stream.
A FPGA Implementation of the CAR-FAC Cochlear Model.

PubMed

Xu, Ying; Thakur, Chetan S; Singh, Ram K; Hamilton, Tara Julia; Wang, Runchun M; van Schaik, André

2018-01-01

This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on.
A FPGA Implementation of the CAR-FAC Cochlear Model

PubMed Central

Xu, Ying; Thakur, Chetan S.; Singh, Ram K.; Hamilton, Tara Julia; Wang, Runchun M.; van Schaik, André

2018-01-01

This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on. PMID:29692700
Method and system for environmentally adaptive fault tolerant computing

NASA Technical Reports Server (NTRS)

Copenhaver, Jason L. (Inventor); Jeremy, Ramos (Inventor); Wolfe, Jeffrey M. (Inventor); Brenner, Dean (Inventor)

2010-01-01

A method and system for adapting fault tolerant computing. The method includes the steps of measuring an environmental condition representative of an environment. An on-board processing system's sensitivity to the measured environmental condition is measured. It is determined whether to reconfigure a fault tolerance of the on-board processing system based in part on the measured environmental condition. The fault tolerance of the on-board processing system may be reconfigured based in part on the measured environmental condition.
Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dubois, David H; Dubois, Andrew J; Boorman, Thomas M

2009-01-01

This work presents a detailed implementation of a double precision, non-preconditioned, Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture{sup TM} in conjunction with x86 Opteron{sup TM} processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling.

PubMed

Kim, Chang-Min; Park, Hyung-Min; Kim, Taesu; Choi, Yoon-Kyung; Lee, Soo-Young

2003-01-01

An field programmable gate array (FPGA) implementation of independent component analysis (ICA) algorithm is reported for blind signal separation (BSS) and adaptive noise canceling (ANC) in real time. In order to provide enormous computing power for ICA-based algorithms with multipath reverberation, a special digital processor is designed and implemented in FPGA. The chip design fully utilizes modular concept and several chips may be put together for complex applications with a large number of noise sources. Experimental results with a fabricated test board are reported for ANC only, BSS only, and simultaneous ANC/BSS, which demonstrates successful speech enhancement in real environments in real time.
Integrating Reconfigurable Hardware-Based Grid for High Performance Computing

PubMed Central

Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

2015-01-01

FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the easiness and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerator that need to be addressed: the need of an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. Besides, a set of services designed to facilitate the application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for deployment of high performance distributed applications simplifying development process. PMID:25874241
Accelerating artificial intelligence with reconfigurable computing

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw

Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Such a field, where there is many different algorithms which can be accelerated, is an artificial intelligence. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.

PubMed

Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan

2017-07-01

In ultrasound image analysis, the speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" still remains as a challenge for the speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values, when the tissue deformation is large. The major drawback of these methods is the high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs on handling different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphic card (GeForce GTX 580).
Developments of FPGA-based digital back-ends for low frequency antenna arrays at Medicina radio telescopes

NASA Astrophysics Data System (ADS)

Naldi, G.; Bartolini, M.; Mattana, A.; Pupillo, G.; Hickish, J.; Foster, G.; Bianchi, G.; Lingua, A.; Monari, J.; Montebugnoli, S.; Perini, F.; Rusticelli, S.; Schiaffino, M.; Virone, G.; Zarb Adami, K.

In radio astronomy Field Programmable Gate Array (FPGA) technology is largely used for the implementation of digital signal processing techniques applied to antenna arrays. This is mainly due to the good trade-off among computing resources, power consumption and cost offered by FPGA chip compared to other technologies like ASIC, GPU and CPU. In the last years several digital backend systems based on such devices have been developed at the Medicina radio astronomical station (INAF-IRA, Bologna, Italy). Instruments like FX correlator, direct imager, beamformer, multi-beam system have been successfully designed and realized on CASPER (Collaboration for Astronomy Signal Processing and Electronics Research, https://casper.berkeley.edu) processing boards. In this paper we present the gained experience in this kind of applications.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters.

PubMed

Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Serrano, Alejandro; Godoy, Jorge; Martínez-Álvarez, Antonio; Villagra, Jorge

2017-11-11

Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle.
SpaceCube Version 1.5

NASA Technical Reports Server (NTRS)

Geist, Alessandro; Lin, Michael; Flatley, Tom; Petrick, David

2013-01-01

SpaceCube 1.5 is a high-performance and low-power system in a compact form factor. It is a hybrid processing system consisting of CPU (central processing unit), FPGA (field-programmable gate array), and DSP (digital signal processor) processing elements. The primary processing engine is the Virtex- 5 FX100T FPGA, which has two embedded processors. The SpaceCube 1.5 System was a bridge to the SpaceCube 2.0 and SpaceCube 2.0 Mini processing systems. The SpaceCube 1.5 system was the primary avionics in the successful SMART (Small Rocket/Spacecraft Technology) Sounding Rocket mission that was launched in the summer of 2011. For SMART and similar missions, an avionics processor is required that is reconfigurable, has high processing capability, has multi-gigabit interfaces, is low power, and comes in a rugged/compact form factor. The original SpaceCube 1.0 met a number of the criteria, but did not possess the multi-gigabit interfaces that were required and is a higher-cost system. The SpaceCube 1.5 was designed with those mission requirements in mind. The SpaceCube 1.5 features one Xilinx Virtex-5 FX100T FPGA and has excellent size, weight, and power characteristics [4×4×3 in. (approx. = 10×10×8 cm), 3 lb (approx. = 1.4 kg), and 5 to 15 W depending on the application]. The estimated computing power of the two PowerPC 440s in the Virtex-5 FPGA is 1100 DMIPS each. The SpaceCube 1.5 includes two Gigabit Ethernet (1 Gbps) interfaces as well as two SATA-I/II interfaces (1.5 to 3.0 Gbps) for recording to data drives. The SpaceCube 1.5 also features DDR2 SDRAM (double data rate synchronous dynamic random access memory); 4- Gbit Flash for storing application code for the CPU, FPGA, and DSP processing elements; and a Xilinx Platform Flash XL to store FPGA configuration files or application code. The system also incorporates a 12 bit analog to digital converter with the ability to read 32 discrete analog sensor inputs. The SpaceCube 1.5 design also has a built-in accelerometer. In addition, the system has 12 receive and transmit RS- 422 interfaces for legacy support. The SpaceCube 1.5 processor card represents the first NASA Goddard design in a compact form factor featuring the Xilinx Virtex- 5. The SpaceCube 1.5 incorporates backward compatibility with the Space- Cube 1.0 form factor and stackable architecture. It also makes use of low-cost commercial parts, but is designed for operation in harsh environments.
Characterization of robotics parallel algorithms and mapping onto a reconfigurable SIMD machine

NASA Technical Reports Server (NTRS)

Lee, C. S. G.; Lin, C. T.

1989-01-01

The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems in the control of robot manipulators. Efficient parallel algorithms for these computations are discussed and analyzed. Their characteristics are identified and a scheme on the mapping of these algorithms to a reconfigurable parallel architecture is presented. Based on the characteristics including type of parallelism, degree of parallelism, uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is shown that most of the algorithms for robotic computations possess highly regular properties and some common structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single-instruction-stream multiple-data-stream (SIMD) computer with reconfigurable interconnection network. The model of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic procedure internal direct feedback is introduced. A systematic procedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing sequence of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are illustrated and discussed.
Adaptive reconfigurable V-BLAST type equalizer for cognitive MIMO-OFDM radios

NASA Astrophysics Data System (ADS)

Ozden, Mehmet Tahir

2015-12-01

An adaptive channel shortening equalizer design for multiple input multiple output-orthogonal frequency division multiplexing (MIMO-OFDM) radio receivers is considered in this presentation. The proposed receiver has desirable features for cognitive and software defined radio implementations. It consists of two sections: MIMO decision feedback equalizer (MIMO-DFE) and adaptive multiple Viterbi detection. In MIMO-DFE section, a complete modified Gram-Schmidt orthogonalization of multichannel input data is accomplished using sequential processing multichannel Givens lattice stages, so that a Vertical Bell Laboratories Layered Space Time (V-BLAST) type MIMO-DFE is realized at the front-end section of the channel shortening equalizer. Matrix operations, a major bottleneck for receiver operations, are accordingly avoided, and only scalar operations are used. A highly modular and regular radio receiver architecture that has a suitable structure for digital signal processing (DSP) chip and field programable gate array (FPGA) implementations, which are important for software defined radio realizations, is achieved. The MIMO-DFE section of the proposed receiver can also be reconfigured for spectrum sensing and positioning functions, which are important tasks for cognitive radio applications. In connection with adaptive multiple Viterbi detection section, a systolic array implementation for each channel is performed so that a receiver architecture with high computational concurrency is attained. The total computational complexity is given in terms of equalizer and desired response filter lengths, alphabet size, and number of antennas. The performance of the proposed receiver is presented for two-channel case by means of mean squared error (MSE) and probability of error evaluations, which are conducted for time-invariant and time-variant channel conditions, orthogonal and nonorthogonal transmissions, and two different modulation schemes.
A Course on Reconfigurable Processors

ERIC Educational Resources Information Center

Shoufan, Abdulhadi; Huss, Sorin A.

2010-01-01

Reconfigurable computing is an established field in computer science. Teaching this field to computer science students demands special attention due to limited student experience in electronics and digital system design. This article presents a compact course on reconfigurable processors, which was offered at the Technische Universitat Darmstadt,…
FPGA Based High Speed Data Acquisition System for Electrical Impedance Tomography

PubMed Central

Khan, S; Borsic, A; Manwaring, Preston; Hartov, Alexander; Halter, Ryan

2014-01-01

Electrical Impedance Tomography (EIT) systems are used to image tissue bio-impedance. EIT provides a number of features making it attractive for use as a medical imaging device including the ability to image fast physiological processes (>60 Hz), to meet a range of clinical imaging needs through varying electrode geometries and configurations, to impart only non-ionizing radiation to a patient, and to map the significant electrical property contrasts present between numerous benign and pathological tissues. To leverage these potential advantages for medical imaging, we developed a modular 32 channel data acquisition (DAQ) system using National Instruments’ PXI chassis, along with FPGA, ADC, Signal Generator and Timing and Synchronization modules. To achieve high frame rates, signal demodulation and spectral characteristics of higher order harmonics were computed using dedicated FFT-hardware built into the FPGA module. By offloading the computing onto FPGA, we were able to achieve a reduction in throughput required between the FPGA and PC by a factor of 32:1. A custom designed analog front end (AFE) was used to interface electrodes with our system. Our system is wideband, and capable of acquiring data for input signal frequencies ranging from 100 Hz to 12 MHz. The modular design of both the hardware and software will allow this system to be flexibly configured for the particular clinical application. PMID:24729790
DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, C.; Yu, G.; Wang, K.

The physical designs of the new concept reactors which have complex structure, various materials and neutronic energy spectrum, have greatly improved the requirements to the calculation methods and the corresponding computing hardware. Along with the widely used parallel algorithm, heterogeneous platforms architecture has been introduced into numerical computations in reactor physics. Because of the natural parallel characteristics, the CPU-FPGA architecture is often used to accelerate numerical computation. This paper studies the application and features of this kind of heterogeneous platforms used in numerical calculation of reactor physics through practical examples. After the designed neutron diffusion module based on CPU-FPGA architecturemore » achieves a 11.2 speed up factor, it is proved to be feasible to apply this kind of heterogeneous platform into reactor physics. (authors)« less

A novel pipeline based FPGA implementation of a genetic algorithm

NASA Astrophysics Data System (ADS)

Thirer, Nonel

2014-05-01

To solve problems when an analytical solution is not available, more and more bio-inspired computation techniques have been applied in the last years. Thus, an efficient algorithm is the Genetic Algorithm (GA), which imitates the biological evolution process, finding the solution by the mechanism of "natural selection", where the strong has higher chances to survive. A genetic algorithm is an iterative procedure which operates on a population of individuals called "chromosomes" or "possible solutions" (usually represented by a binary code). GA performs several processes with the population individuals to produce a new population, like in the biological evolution. To provide a high speed solution, pipelined based FPGA hardware implementations are used, with a nstages pipeline for a n-phases genetic algorithm. The FPGA pipeline implementations are constraints by the different execution time of each stage and by the FPGA chip resources. To minimize these difficulties, we propose a bio-inspired technique to modify the crossover step by using non identical twins. Thus two of the chosen chromosomes (parents) will build up two new chromosomes (children) not only one as in classical GA. We analyze the contribution of this method to reduce the execution time in the asynchronous and synchronous pipelines and also the possibility to a cheaper FPGA implementation, by using smaller populations. The full hardware architecture for a FPGA implementation to our target ALTERA development card is presented and analyzed.
Particle Identification on an FPGA Accelerated Compute Platform for the LHCb Upgrade

NASA Astrophysics Data System (ADS)

Fäerber, Christian; Schwemmer, Rainer; Machen, Jonathan; Neufeld, Niko

2017-07-01

The current LHCb readout system will be upgraded in 2018 to a “triggerless” readout of the entire detector at the Large Hadron Collider collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100 from currently 500 Gb/s up to 40 Tb/s. The event filter farm will preanalyze the data and will select the events on an event by event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, and the reason why different new technologies are considered and have to be investigated for the different parts of the system. For the usage in the event building farm or in the event filter farm (trigger), an experimental field programmable gate array (FPGA) accelerated computing platform is considered and, therefore, tested. FPGA compute accelerators are used more and more in standard servers such as for Microsoft Bing search or Baidu search. The platform we use hosts a general Intel CPU and a high-performance FPGA linked via the high-speed Intel QuickPath Interconnect. An accelerator is implemented on the FPGA. It is very likely that these platforms, which are built, in general, for high-performance computing, are also very interesting for the high-energy physics community. First, the performance results of smaller test cases performed at the beginning are presented. Afterward, a part of the existing LHCb RICH particle identification is tested and is ported to the experimental FPGA accelerated platform. We have compared the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm, which is running on the Xeon-FPGA compute accelerator platform.
Optimized smith waterman processor design for breast cancer early diagnosis

NASA Astrophysics Data System (ADS)

Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.

2017-09-01

This paper presents an optimized design of Processing Element (PE) of Systolic Array (SA) which implements affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources to increase number of PEs in FPGA for higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequence such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area with 15 slices in a PE and 776 PEs implemented in the Virtex - 6 FPGA.
Implementation of data acquisition interface using on-board field-programmable gate array (FPGA) universal serial bus (USB) link

NASA Astrophysics Data System (ADS)

Yussup, N.; Ibrahim, M. M.; Lombigit, L.; Rahman, N. A. A.; Zin, M. R. M.

2014-02-01

Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of data acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.
Implementation of data acquisition interface using on-board field-programmable gate array (FPGA) universal serial bus (USB) link

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yussup, N.; Ibrahim, M. M.; Lombigit, L.

Typically a system consists of hardware as the controller and software which is installed in the personal computer (PC). In the effective nuclear detection, the hardware involves the detection setup and the electronics used, with the software consisting of analysis tools and graphical display on PC. A data acquisition interface is necessary to enable the communication between the controller hardware and PC. Nowadays, Universal Serial Bus (USB) has become a standard connection method for computer peripherals and has replaced many varieties of serial and parallel ports. However the implementation of USB is complex. This paper describes the implementation of datamore » acquisition interface between a field-programmable gate array (FPGA) board and a PC by exploiting the USB link of the FPGA board. The USB link is based on an FTDI chip which allows direct access of input and output to the Joint Test Action Group (JTAG) signals from a USB host and a complex programmable logic device (CPLD) with a 24 MHz clock input to the USB link. The implementation and results of using the USB link of FPGA board as the data interfacing are discussed.« less
FPGA architecture and implementation of sparse matrix vector multiplication for the finite element method

NASA Astrophysics Data System (ADS)

Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.

2008-04-01

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs), hence increasing interest has grown in the scientific community to exploit this technology. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGAs computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside the FPGA internal memory.
Integration of digital signal processing technologies with pulsed electron paramagnetic resonance imaging

PubMed Central

Pursley, Randall H.; Salem, Ghadi; Devasahayam, Nallathamby; Subramanian, Sankaran; Koscielniak, Janusz; Krishna, Murali C.; Pohida, Thomas J.

2006-01-01

The integration of modern data acquisition and digital signal processing (DSP) technologies with Fourier transform electron paramagnetic resonance (FT-EPR) imaging at radiofrequencies (RF) is described. The FT-EPR system operates at a Larmor frequency (Lf) of 300 MHz to facilitate in vivo studies. This relatively low frequency Lf, in conjunction with our ~10 MHz signal bandwidth, enables the use of direct free induction decay time-locked subsampling (TLSS). This particular technique provides advantages by eliminating the traditional analog intermediate frequency downconversion stage along with the corresponding noise sources. TLSS also results in manageable sample rates that facilitate the design of DSP-based data acquisition and image processing platforms. More specifically, we utilize a high-speed field programmable gate array (FPGA) and a DSP processor to perform advanced real-time signal and image processing. The migration to a DSP-based configuration offers the benefits of improved EPR system performance, as well as increased adaptability to various EPR system configurations (i.e., software configurable systems instead of hardware reconfigurations). The required modifications to the FT-EPR system design are described, with focus on the addition of DSP technologies including the application-specific hardware, software, and firmware developed for the FPGA and DSP processor. The first results of using real-time DSP technologies in conjunction with direct detection bandpass sampling to implement EPR imaging at RF frequencies are presented. PMID:16243552
Remotely Powered Reconfigurable Receiver for Extreme Sensing Platforms

NASA Technical Reports Server (NTRS)

Sheldon, Douglas J. (Inventor)

2017-01-01

Unmanned space programs are currently used to enable scientists to explore and research the furthest reaches of outer space. Systems and methods for low power communication devices in accordance with embodiments of the invention are disclosed, describing a wide variety of low power communication devices capable of remotely collecting, processing, and transmitting data from outer space in order to further mankind's goal of exploring the cosmos. Many embodiments of the invention include a Flash-based FPGA, an energy-harvesting power supply module, a sensor module, and a radio module. By utilizing technologies that withstand the harsh environment of outer space, more reliable low power communication devices can be deployed, enhancing the quality and longevity of the low power communication devices, enabling more data to be gathered and aiding in the exploration of outer space.
Re-Form: FPGA-Powered True Codesign Flow for High-Performance Computing In The Post-Moore Era

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cappello, Franck; Yoshii, Kazutomo; Finkel, Hal

Multicore scaling will end soon because of practical power limits. Dark silicon is becoming a major issue even more than the end of Moore’s law. In the post-Moore era, the energy efficiency of computing will be a major concern. FPGAs could be a key to maximizing the energy efficiency. In this paper we address severe challenges in the adoption of FPGA in HPC and describe “Re-form,” an FPGA-powered codesign flow.
Accelerating object detection via a visual-feature-directed search cascade: algorithm and field programmable gate array implementation

NASA Astrophysics Data System (ADS)

Kyrkou, Christos; Theocharides, Theocharis

2016-07-01

Object detection is a major step in several computer vision applications and a requirement for most smart camera systems. Recent advances in hardware acceleration for real-time object detection feature extensive use of reconfigurable hardware [field programmable gate arrays (FPGAs)], and relevant research has produced quite fascinating results, in both the accuracy of the detection algorithms as well as the performance in terms of frames per second (fps) for use in embedded smart camera systems. Detecting objects in images, however, is a daunting task and often involves hardware-inefficient steps, both in terms of the datapath design and in terms of input/output and memory access patterns. We present how a visual-feature-directed search cascade composed of motion detection, depth computation, and edge detection, can have a significant impact in reducing the data that needs to be examined by the classification engine for the presence of an object of interest. Experimental results on a Spartan 6 FPGA platform for face detection indicate data search reduction of up to 95%, which results in the system being able to process up to 50 1024×768 pixels images per second with a significantly reduced number of false positives.
Fused smart sensor network for multi-axis forward kinematics estimation in industrial robots.

PubMed

Rodriguez-Donate, Carlos; Osornio-Rios, Roque Alfredo; Rivera-Guillen, Jesus Rooney; Romero-Troncoso, Rene de Jesus

2011-01-01

Flexible manipulator robots have a wide industrial application. Robot performance requires sensing its position and orientation adequately, known as forward kinematics. Commercially available, motion controllers use high-resolution optical encoders to sense the position of each joint which cannot detect some mechanical deformations that decrease the accuracy of the robot position and orientation. To overcome those problems, several sensor fusion methods have been proposed but at expenses of high-computational load, which avoids the online measurement of the joint's angular position and the online forward kinematics estimation. The contribution of this work is to propose a fused smart sensor network to estimate the forward kinematics of an industrial robot. The developed smart processor uses Kalman filters to filter and to fuse the information of the sensor network. Two primary sensors are used: an optical encoder, and a 3-axis accelerometer. In order to obtain the position and orientation of each joint online a field-programmable gate array (FPGA) is used in the hardware implementation taking advantage of the parallel computation capabilities and reconfigurability of this device. With the aim of evaluating the smart sensor network performance, three real-operation-oriented paths are executed and monitored in a 6-degree of freedom robot.
Circularly split-ring-resonator-based frequency-reconfigurable antenna

NASA Astrophysics Data System (ADS)

Rahman, M. A.; Faruque, M. R. I.; Islam, M. T.

2017-01-01

In this paper, an antenna with frequency configurability in light of a circularly split-ring resonator (CSRR) is introduced. The proposed reconfigurable monopole antenna consists of a microstrip-fed hook-shaped structure and a CSRR having single reconfigurable split only. A new band of radiation unlike the band radiated from monopole only is observed due to magnetic coupling between the CSRR and the monopole antenna. The resonance frequency of the CSRR can be arbitrarily chosen by varying the dimension and relative position of its gap with the monopole, which leads the antenna to become reconfigurable one. By using a single switch with perfect electric conductor at the gap of CSRR cell, the effect of CSRR can be deactivated and, hence, it is possible to suppress the corresponding resonance, resulting in a frequency-reconfigurable antenna. Commercially available Computer Simulation Technology microwave studio based on finite integration technique was adopted throughout the study.
Reconfigurable Fault Tolerance for FPGAs

NASA Technical Reports Server (NTRS)

Shuler, Robert, Jr.

2010-01-01

The invention allows a field-programmable gate array (FPGA) or similar device to be efficiently reconfigured in whole or in part to provide higher capacity, non-redundant operation. The redundant device consists of functional units such as adders or multipliers, configuration memory for the functional units, a programmable routing method, configuration memory for the routing method, and various other features such as block RAM, I/O (random access memory, input/output) capability, dedicated carry logic, etc. The redundant device has three identical sets of functional units and routing resources and majority voters that correct errors. The configuration memory may or may not be redundant, depending on need. For example, SRAM-based FPGAs will need some type of radiation-tolerant configuration memory, or they will need triple-redundant configuration memory. Flash or anti-fuse devices will generally not need redundant configuration memory. Some means of loading and verifying the configuration memory is also required. These are all components of the pre-existing redundant FPGA. This innovation modifies the voter to accept a MODE input, which specifies whether ordinary voting is to occur, or if redundancy is to be split. Generally, additional routing resources will also be required to pass data between sections of the device created by splitting the redundancy. In redundancy mode, the voters produce an output corresponding to the two inputs that agree, in the usual fashion. In the split mode, the voters select just one input and convey this to the output, ignoring the other inputs. In a dual-redundant system (as opposed to triple-redundant), instead of a voter, there is some means to latch or gate a state update only when both inputs agree. In this case, the invention would require modification of the latch or gate so that it would operate normally in redundant mode, and would separately latch or gate the inputs in non-redundant mode.
FPGA design for constrained energy minimization

NASA Astrophysics Data System (ADS)

Wang, Jianwei; Chang, Chein-I.; Cao, Mang

2004-02-01

The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has been also demonstrated. The main challenge of realizing the CEM in hardware architecture in the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Arrays (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is to use the Coordinate Rotation DIgital Computer (CORDIC) algorithm that can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, therefore in FPGA. Since the computation of the inverse of the data correlction involves a series of Givens rotations, the utility of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM will be studied and its detailed architecture will be also described.
Hardware accelerated high performance neutron transport computation based on AGENT methodology

NASA Astrophysics Data System (ADS)

Xiao, Shanjie

The spatial heterogeneity of the next generation Gen-IV nuclear reactor core designs brings challenges to the neutron transport analysis. The Arbitrary Geometry Neutron Transport (AGENT) AGENT code is a three-dimensional neutron transport analysis code being developed at the Laboratory for Neutronics and Geometry Computation (NEGE) at Purdue University. It can accurately describe the spatial heterogeneity in a hierarchical structure through the R-function solid modeler. The previous version of AGENT coupled the 2D transport MOC solver and the 1D diffusion NEM solver to solve the three dimensional Boltzmann transport equation. In this research, the 2D/1D coupling methodology was expanded to couple two transport solvers, the radial 2D MOC solver and the axial 1D MOC solver, for better accuracy. The expansion was benchmarked with the widely applied C5G7 benchmark models and two fast breeder reactor models, and showed good agreement with the reference Monte Carlo results. In practice, the accurate neutron transport analysis for a full reactor core is still time-consuming and thus limits its application. Therefore, another content of my research is focused on designing a specific hardware based on the reconfigurable computing technique in order to accelerate AGENT computations. It is the first time that the application of this type is used to the reactor physics and neutron transport for reactor design. The most time consuming part of the AGENT algorithm was identified. Moreover, the architecture of the AGENT acceleration system was designed based on the analysis. Through the parallel computation on the specially designed, highly efficient architecture, the acceleration design on FPGA acquires high performance at the much lower working frequency than CPUs. The whole design simulations show that the acceleration design would be able to speedup large scale AGENT computations about 20 times. The high performance AGENT acceleration system will drastically shortening the computation time for 3D full-core neutron transport analysis, making the AGENT methodology unique and advantageous, and thus supplies the possibility to extend the application range of neutron transport analysis in either industry engineering or academic research.
A Real-Time Data Acquisition and Processing Framework Based on FlexRIO FPGA and ITER Fast Plant System Controller

NASA Astrophysics Data System (ADS)

Yang, C.; Zheng, W.; Zhang, M.; Yuan, T.; Zhuang, G.; Pan, Y.

2016-06-01

Measurement and control of the plasma in real-time are critical for advanced Tokamak operation. It requires high speed real-time data acquisition and processing. ITER has designed the Fast Plant System Controllers (FPSC) for these purposes. At J-TEXT Tokamak, a real-time data acquisition and processing framework has been designed and implemented using standard ITER FPSC technologies. The main hardware components of this framework are an Industrial Personal Computer (IPC) with a real-time system and FlexRIO devices based on FPGA. With FlexRIO devices, data can be processed by FPGA in real-time before they are passed to the CPU. The software elements are based on a real-time framework which runs under Red Hat Enterprise Linux MRG-R and uses Experimental Physics and Industrial Control System (EPICS) for monitoring and configuring. That makes the framework accord with ITER FPSC standard technology. With this framework, any kind of data acquisition and processing FlexRIO FPGA program can be configured with a FPSC. An application using the framework has been implemented for the polarimeter-interferometer diagnostic system on J-TEXT. The application is able to extract phase-shift information from the intermediate frequency signal produced by the polarimeter-interferometer diagnostic system and calculate plasma density profile in real-time. Different algorithms implementations on the FlexRIO FPGA are compared in the paper.
A high-speed, reconfigurable, channel- and time-tagged photon arrival recording system for intensity-interferometry and quantum optics experiments

NASA Astrophysics Data System (ADS)

Girish, B. S.; Pandey, Deepak; Ramachandran, Hema

2017-08-01

We present a compact, inexpensive multichannel module, APODAS (Avalanche Photodiode Output Data Acquisition System), capable of detecting 0.8 billion photons per second and providing real-time recording on a computer hard-disk, of channel- and time-tagged information of the arrival of upto 0.4 billion photons per second. Built around a Virtex-5 Field Programmable Gate Array (FPGA) unit, APODAS offers a temporal resolution of 5 nanoseconds with zero deadtime in data acquisition, utilising an efficient scheme for time and channel tagging and employing Gigabit ethernet for the transfer of data. Analysis tools have been developed on a Linux platform for multi-fold coincidence studies and time-delayed intensity interferometry. As illustrative examples, the second-order intensity correlation function ( g 2) of light from two commonly used sources in quantum optics —a coherent laser source and a dilute atomic vapour emitting spontaneously, constituting a thermal source— are presented. With easy reconfigurability and with no restriction on the total record length, APODAS can be readily used for studies over various time scales. This is demonstrated by using APODAS to reveal Rabi oscillations on nanosecond time scales in the emission of ultracold atoms, on the one hand, and, on the other hand, to measure the second-order correlation function on the millisecond time scales from tailored light sources. The efficient and versatile performance of APODAS promises its utility in diverse fields, like quantum optics, quantum communication, nuclear physics, astrophysics and biology.
Parametric dense stereovision implementation on a system-on chip (SoC).

PubMed

Gardel, Alfredo; Montejo, Pablo; García, Jorge; Bravo, Ignacio; Lázaro, José L

2012-01-01

This paper proposes a novel hardware implementation of a dense recovery of stereovision 3D measurements. Traditionally 3D stereo systems have imposed the maximum number of stereo correspondences, introducing a large restriction on artificial vision algorithms. The proposed system-on-chip (SoC) provides great performance and efficiency, with a scalable architecture available for many different situations, addressing real time processing of stereo image flow. Using double buffering techniques properly combined with pipelined processing, the use of reconfigurable hardware achieves a parametrisable SoC which gives the designer the opportunity to decide its right dimension and features. The proposed architecture does not need any external memory because the processing is done as image flow arrives. Our SoC provides 3D data directly without the storage of whole stereo images. Our goal is to obtain high processing speed while maintaining the accuracy of 3D data using minimum resources. Configurable parameters may be controlled by later/parallel stages of the vision algorithm executed on an embedded processor. Considering hardware FPGA clock of 100 MHz, image flows up to 50 frames per second (fps) of dense stereo maps of more than 30,000 depth points could be obtained considering 2 Mpix images, with a minimum initial latency. The implementation of computer vision algorithms on reconfigurable hardware, explicitly low level processing, opens up the prospect of its use in autonomous systems, and they can act as a coprocessor to reconstruct 3D images with high density information in real time.
A Comparison of FPGA and GPGPU Designs for Bayesian Occupancy Filters

PubMed Central

Medina, Luis; Diez-Ochoa, Miguel; Correal, Raul; Cuenca-Asensi, Sergio; Godoy, Jorge; Martínez-Álvarez, Antonio

2017-01-01

Grid-based perception techniques in the automotive sector based on fusing information from different sensors and their robust perceptions of the environment are proliferating in the industry. However, one of the main drawbacks of these techniques is the traditionally prohibitive, high computing performance that is required for embedded automotive systems. In this work, the capabilities of new computing architectures that embed these algorithms are assessed in a real car. The paper compares two ad hoc optimized designs of the Bayesian Occupancy Filter; one for General Purpose Graphics Processing Unit (GPGPU) and the other for Field-Programmable Gate Array (FPGA). The resulting implementations are compared in terms of development effort, accuracy and performance, using datasets from a realistic simulator and from a real automated vehicle. PMID:29137137
FPGA-based coprocessor for matrix algorithms implementation

NASA Astrophysics Data System (ADS)

Amira, Abbes; Bensaali, Faycal

2003-03-01

Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication which is of O (N3) on a sequential computer and O (N3/p) on a parallel system with p processors complexity. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.

FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

PubMed Central

Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao

2012-01-01

This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
A real-time multi-scale 2D Gaussian filter based on FPGA

NASA Astrophysics Data System (ADS)

Luo, Haibo; Gai, Xingqin; Chang, Zheng; Hui, Bin

2014-11-01

Multi-scale 2-D Gaussian filter has been widely used in feature extraction (e.g. SIFT, edge etc.), image segmentation, image enhancement, image noise removing, multi-scale shape description etc. However, their computational complexity remains an issue for real-time image processing systems. Aimed at this problem, we propose a framework of multi-scale 2-D Gaussian filter based on FPGA in this paper. Firstly, a full-hardware architecture based on parallel pipeline was designed to achieve high throughput rate. Secondly, in order to save some multiplier, the 2-D convolution is separated into two 1-D convolutions. Thirdly, a dedicate first in first out memory named as CAFIFO (Column Addressing FIFO) was designed to avoid the error propagating induced by spark on clock. Finally, a shared memory framework was designed to reduce memory costs. As a demonstration, we realized a 3 scales 2-D Gaussian filter on a single ALTERA Cyclone III FPGA chip. Experimental results show that, the proposed framework can computing a Multi-scales 2-D Gaussian filtering within one pixel clock period, is further suitable for real-time image processing. Moreover, the main principle can be popularized to the other operators based on convolution, such as Gabor filter, Sobel operator and so on.
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform

NASA Astrophysics Data System (ADS)

Bartoldus, R.; Claus, R.; Garelli, N.; Herbst, R. T.; Huffer, M.; Iakovidis, G.; Iordanidou, K.; Kwan, K.; Kocian, M.; Lankford, A. J.; Moschovakos, P.; Nelson, A.; Ntekas, K.; Ruckman, L.; Russell, J.; Schernau, M.; Schlenker, S.; Su, D.; Valderanis, C.; Wittgen, M.; Yildiz, S. C.

2016-01-01

The ATLAS muon Cathode Strip Chamber (CSC) backend readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run-2 luminosity. The readout design is based on the Reconfigurable Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the Advanced Telecommunication Computing Architecture (ATCA) platform. The RCE design is based on the new System on Chip XILINX ZYNQ series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources. Together with auxiliary memories, all these components form a versatile DAQ building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the ZYNQ for high speed input and output fiberoptic links and TTC allowed the full system of 320 input links from the 32 chambers to be processed by 6 COBs in one ATCA shelf. The full system was installed in September 2014. We will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning for LHC Run 2.
A new ATLAS muon CSC readout system with system on chip technology on ATCA platform

DOE PAGES

Bartoldus, R.; Claus, R.; Garelli, N.; ...

2016-01-25

The ATLAS muon Cathode Strip Chamber (CSC) backend readout system has been upgraded during the LHC 2013-2015 shutdown to be able to handle the higher Level-1 trigger rate of 100 kHz and the higher occupancy at Run-2 luminosity. The readout design is based on the Reconfigurable Cluster Element (RCE) concept for high bandwidth generic DAQ implemented on the Advanced Telecommunication Computing Architecture (ATCA) platform. The RCE design is based on the new System on Chip XILINX ZYNQ series with a processor-centric architecture with ARM processor embedded in FPGA fabric and high speed I/O resources. Together with auxiliary memories, all ofmore » these components form a versatile DAQ building block that can host applications tapping into both software and firmware resources. The Cluster on Board (COB) ATCA carrier hosts RCE mezzanines and an embedded Fulcrum network switch to form an online DAQ processing cluster. More compact firmware solutions on the ZYNQ for high speed input and output fiberoptic links and TTC allowed the full system of 320 input links from the 32 chambers to be processed by 6 COBs in one ATCA shelf. The full system was installed in September 2014. In conclusion, we will present the RCE/COB design concept, the firmware and software processing architecture, and the experience from the intense commissioning for LHC Run 2.« less
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System

NASA Astrophysics Data System (ADS)

Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.

2016-03-01

We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators — data rates exceeding a terabyte per second — are becoming common. The ICE system is built around a custom FPGA motherboard that makes use of an Xilinx Kintex-7 FPGA and ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array. Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio interferometers.
The Application Design of Solar Radio Spectrometer Based on FPGA

NASA Astrophysics Data System (ADS)

Du, Q. F.; Chen, R. J.; Zhao, Y. C.; Feng, S. W.; Chen, Y.; Song, Y.

2017-10-01

The Solar radio spectrometer is the key instrument to observe solar radio. By programing the computer software, we control the AD signal acquisition card which is based on FPGA to get a mass of data. The data are transferred by using PCI-E port. This program has realized the function of timing data collection, finding data in specific time and controlling acquisition meter in real time. It can also map the solar radio power intensity graph. By doing the experiment, we verify the reliability of solar radio spectrum instrument, in the meanwhile, the instrument simplifies the operation in observing the sun.
YARR - A PCIe based Readout Concept for Current and Future ATLAS Pixel Modules

NASA Astrophysics Data System (ADS)

Heim, Timon

2017-10-01

The Yet Another Rapid Readout (YARR) system is a DAQ system designed for the readout of current generation ATLAS Pixel FE-I4 and next generation chips. It utilises a commercial-off-the-shelf PCIe FPGA card as a reconfigurable I/O interface, which acts as a simple gateway to pipe all data from the Pixel modules via the high speed PCIe connection into the host system’s memory. Relying on modern CPU architectures, which enables the usage of parallelised processing in threads and commercial high speed interfaces in everyday computers, it is possible to perform all processing on a software level in the host CPU. Although FPGAs are very powerful at parallel signal processing their firmware is hard to maintain and constrained by their connected hardware. Software, on the other hand, is very portable and upgraded frequently with new features coming at no cost. A DAQ concept which does not rely on the underlying hardware for acceleration also eases the transition from prototyping in the laboratory to the full scale implementation in the experiment. The overall concept and data flow will be outlined, as well as the challenges and possible bottlenecks which can be encountered when moving the processing from hardware to software.
An adaptive software defined radio design based on a standard space telecommunication radio system API

NASA Astrophysics Data System (ADS)

Xiong, Wenhao; Tian, Xin; Chen, Genshe; Pham, Khanh; Blasch, Erik

2017-05-01

Software defined radio (SDR) has become a popular tool for the implementation and testing for communications performance. The advantage of the SDR approach includes: a re-configurable design, adaptive response to changing conditions, efficient development, and highly versatile implementation. In order to understand the benefits of SDR, the space telecommunication radio system (STRS) was proposed by NASA Glenn research center (GRC) along with the standard application program interface (API) structure. Each component of the system uses a well-defined API to communicate with other components. The benefit of standard API is to relax the platform limitation of each component for addition options. For example, the waveform generating process can support a field programmable gate array (FPGA), personal computer (PC), or an embedded system. As long as the API defines the requirements, the generated waveform selection will work with the complete system. In this paper, we demonstrate the design and development of adaptive SDR following the STRS and standard API protocol. We introduce step by step the SDR testbed system including the controlling graphic user interface (GUI), database, GNU radio hardware control, and universal software radio peripheral (USRP) tranceiving front end. In addition, a performance evaluation in shown on the effectiveness of the SDR approach for space telecommunication.
Pulse-coupled neural network implementation in FPGA

NASA Astrophysics Data System (ADS)

Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael

1998-03-01

Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre- processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
A Lithography-Free and Field-Programmable Photonic Metacanvas.

PubMed

Dong, Kaichen; Hong, Sukjoon; Deng, Yang; Ma, He; Li, Jiachen; Wang, Xi; Yeo, Junyeob; Wang, Letian; Lou, Shuai; Tom, Kyle B; Liu, Kai; You, Zheng; Wei, Yang; Grigoropoulos, Costas P; Yao, Jie; Wu, Junqiao

2018-02-01

The unique correspondence between mathematical operators and photonic elements in wave optics enables quantitative analysis of light manipulation with individual optical devices. Phase-transition materials are able to provide real-time reconfigurability of these devices, which would create new optical functionalities via (re)compilation of photonic operators, as those achieved in other fields such as field-programmable gate arrays (FPGA). Here, by exploiting the hysteretic phase transition of vanadium dioxide, an all-solid, rewritable metacanvas on which nearly arbitrary photonic devices can be rapidly and repeatedly written and erased is presented. The writing is performed with a low-power laser and the entire process stays below 90 °C. Using the metacanvas, dynamic manipulation of optical waves is demonstrated for light propagation, polarization, and reconstruction. The metacanvas supports physical (re)compilation of photonic operators akin to that of FPGA, opening up possibilities where photonic elements can be field programmed to deliver complex, system-level functionalities. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Novel Reconfigurable Logic Unit Based on the DNA-Templated Potassium-Concentration-Dependent Supramolecular Assembly.

PubMed

Yang, Chunrong; Zou, Dan; Chen, Jianchi; Zhang, Linyan; Miao, Jiarong; Huang, Dan; Du, Yuanyuan; Yang, Shu; Yang, Qianfan; Tang, Yalin

2018-03-15

Plenty of molecular circuits with specific functions have been developed; however, logic units with reconfigurability, which could simplify the circuits and speed up the information process, are rarely reported. In this work, we designed a novel reconfigurable logic unit based on a DNA-templated, potassium-concentration-dependent, supramolecular assembly, which could respond to the input stimuli of H + and K + . By inputting different concentrations of K + , the logic unit could implement three significant functions, including a half adder, a half subtractor, and a 2-to-4 decoder. Considering its reconfigurable ability and good performance, the novel prototypes developed here may serve as a promising proof of principle in molecular computers. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Providing Self-Healing Ability for Wireless Sensor Node by Using Reconfigurable Hardware

PubMed Central

Yuan, Shenfang; Qiu, Lei; Gao, Shang; Tong, Yao; Yang, Weiwei

2012-01-01

Wireless sensor networks (WSNs) have received tremendous attention over the past ten years. In engineering applications of WSNs, a number of sensor nodes are usually spread across some specific geographical area. Some of these nodes have to work in harsh environments. Dependability of the Wireless Sensor Network (WSN) is very important for its successful applications in the engineering area. In ordinary research, when a node has a failure, it is usually discarded and the network is reorganized to ensure the normal operation of the WSN. Using appropriate WSN re-organization methods, though the sensor networks can be reorganized, this causes additional maintenance costs and sometimes still decreases the function of the networks. In those situations where the sensor networks cannot be reorganized, the performance of the whole WSN will surely be degraded. In order to ensure the reliable and low cost operation of WSNs, a method to develop a wireless sensor node with self-healing ability based on reconfigurable hardware is proposed in this paper. Two self-healing WSN node realization paradigms based on reconfigurable hardware are presented, including a redundancy-based self-healing paradigm and a whole FPAA/FPGA based self-healing paradigm. The nodes designed with the self-healing ability can dynamically change their node configurations to repair the nodes' hardware failures. To demonstrate these two paradigms, a strain sensor node is adopted as an illustration to show the concepts. Two strain WSN sensor nodes with self-healing ability are developed respectively according to the proposed self-healing paradigms. Evaluation experiments on self-healing ability and power consumption are performed. Experimental results show that the developed nodes can self-diagnose the failures and recover to a normal state automatically. The research presented can improve the robustness of WSNs and reduce the maintenance cost of WSNs in engineering applications. PMID:23202176
Methods and systems for providing reconfigurable and recoverable computing resources

NASA Technical Reports Server (NTRS)

Stange, Kent (Inventor); Hess, Richard (Inventor); Kelley, Gerald B (Inventor); Rogers, Randy (Inventor)

2010-01-01

A method for optimizing the use of digital computing resources to achieve reliability and availability of the computing resources is disclosed. The method comprises providing one or more processors with a recovery mechanism, the one or more processors executing one or more applications. A determination is made whether the one or more processors needs to be reconfigured. A rapid recovery is employed to reconfigure the one or more processors when needed. A computing system that provides reconfigurable and recoverable computing resources is also disclosed. The system comprises one or more processors with a recovery mechanism, with the one or more processors configured to execute a first application, and an additional processor configured to execute a second application different than the first application. The additional processor is reconfigurable with rapid recovery such that the additional processor can execute the first application when one of the one more processors fails.
Self-Organizing Map Neural Network-Based Nearest Neighbor Position Estimation Scheme for Continuous Crystal PET Detectors

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Li, Deng; Lu, Xiaoming; Cheng, Xinyi; Wang, Liwei

2014-10-01

Continuous crystal-based positron emission tomography (PET) detectors could be an ideal alternative for current high-resolution pixelated PET detectors if the issues of high performance γ interaction position estimation and its real-time implementation are solved. Unfortunately, existing position estimators are not very feasible for implementation on field-programmable gate array (FPGA). In this paper, we propose a new self-organizing map neural network-based nearest neighbor (SOM-NN) positioning scheme aiming not only at providing high performance, but also at being realistic for FPGA implementation. Benefitting from the SOM feature mapping mechanism, the large set of input reference events at each calibration position is approximated by a small set of prototypes, and the computation of the nearest neighbor searching for unknown events is largely reduced. Using our experimental data, the scheme was evaluated, optimized and compared with the smoothed k-NN method. The spatial resolutions of full-width-at-half-maximum (FWHM) of both methods averaged over the center axis of the detector were obtained as 1.87 ±0.17 mm and 1.92 ±0.09 mm, respectively. The test results show that the SOM-NN scheme has an equivalent positioning performance with the smoothed k-NN method, but the amount of computation is only about one-tenth of the smoothed k-NN method. In addition, the algorithm structure of the SOM-NN scheme is more feasible for implementation on FPGA. It has the potential to realize real-time position estimation on an FPGA with a high-event processing throughput.
FPGA-based fused smart sensor for dynamic and vibration parameter extraction in industrial robot links.

PubMed

Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene

2010-01-01

Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA).
FPGA-Based Fused Smart Sensor for Dynamic and Vibration Parameter Extraction in Industrial Robot Links

PubMed Central

Rodriguez-Donate, Carlos; Morales-Velazquez, Luis; Osornio-Rios, Roque Alfredo; Herrera-Ruiz, Gilberto; de Jesus Romero-Troncoso, Rene

2010-01-01

Intelligent robotics demands the integration of smart sensors that allow the controller to efficiently measure physical quantities. Industrial manipulator robots require a constant monitoring of several parameters such as motion dynamics, inclination, and vibration. This work presents a novel smart sensor to estimate motion dynamics, inclination, and vibration parameters on industrial manipulator robot links based on two primary sensors: an encoder and a triaxial accelerometer. The proposed smart sensor implements a new methodology based on an oversampling technique, averaging decimation filters, FIR filters, finite differences and linear interpolation to estimate the interest parameters, which are computed online utilizing digital hardware signal processing based on field programmable gate arrays (FPGA). PMID:22319345
A modular design for rapid-response telecoms and navigation missions

NASA Astrophysics Data System (ADS)

Davies, P.; Liddle, D.; Buckley, John; Sweeting, M.; Roussel-Dupre, Diane; Caffrey, Michael

2004-11-01

Surrey Satellite Technology Ltd and Los Alamos National Laboratory are together building the Cibola Flight Experiment (CFESat), a mission with the aim of flight-proving a reconfigurable processor payload intended for a Low Earth Orbit system. The mission will survey portions of the VHF and UHF radio spectra. The satellite will be launched by the Space Test Program in September 2006 on the USAF Evolved Expendable Launch Vehicle (EELV) using the EELV's Secondary Payload Adapter (ESPA) that allows up to six small satellites to be launched as "piggyback" passengers with larger spacecraft. The payload is based on networks of reprogrammable, Field Programmable Gate Arrays (FPGAs) to process the received signals for ionospheric and lightning studies. The objective is to validate the on-orbit use of commercial, reconfigurable FPGA technology utilizing several different single-event upset mitigation schemes. It will also detect and measure impulsive events that occur in a complex background. SSTL's satellite platform is based on a new, ESPA- compatible, structure housing subsystems and equipments with proven flight heritage from SSTL's disaster monitoring constellation (DMC) and the Topsat mission satellite due for launch in 2005. The structure is mechanically quite complex for a microsatellite having both deployed solar panels and a pair of long booms as part of the payload. The satellite design is highly constrained by the mass and volume requirements of the EELV/EPSA.
FPGA-based protein sequence alignment : A review

NASA Astrophysics Data System (ADS)

Isa, Mohd. Nazrin Md.; Muhsen, Ku Noor Dhaniah Ku; Saiful Nurdin, Dayana; Ahmad, Muhammad Imran; Anuar Zainol Murad, Sohiful; Nizam Mohyar, Shaiful; Harun, Azizi; Hussin, Razaidi

2017-11-01

Sequence alignment have been optimized using several techniques in order to accelerate the computation time to obtain the optimal score by implementing DP-based algorithm into hardware such as FPGA-based platform. During hardware implementation, there will be performance challenges such as the frequent memory access and highly data dependent in computation process. Therefore, investigation in processing element (PE) configuration where involves more on memory access in load or access the data (substitution matrix, query sequence character) and the PE configuration time will be the main focus in this paper. There are various approaches to enhance the PE configuration performance that have been done in previous works such as by using serial configuration chain and parallel configuration chain i.e. the configuration data will be loaded into each PEs sequentially and simultaneously respectively. Some researchers have proven that the performance using parallel configuration chain has optimized both the configuration time and area.
Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques

DTIC Science & Technology

2013-03-01

MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES AND CIRCUIT TECHNIQUES POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY...TECHNICAL REPORT 3. DATES COVERED (From - To) OCT 2010 – OCT 2012 4. TITLE AND SUBTITLE MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES...schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated
A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System.

PubMed

Zhang, Zhen; Ma, Cheng; Zhu, Rong

2017-08-23

Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.

A FPGA-Based, Granularity-Variable Neuromorphic Processor and Its Application in a MIMO Real-Time Control System

PubMed Central

Zhang, Zhen; Zhu, Rong

2017-01-01

Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas. PMID:28832522
Synchronized Pair Configuration in Virtualization-Based Lab for Learning Computer Networks

ERIC Educational Resources Information Center

Kongcharoen, Chaknarin; Hwang, Wu-Yuin; Ghinea, Gheorghita

2017-01-01

More studies are concentrating on using virtualization-based labs to facilitate computer or network learning concepts. Some benefits are lower hardware costs and greater flexibility in reconfiguring computer and network environments. However, few studies have investigated effective mechanisms for using virtualization fully for collaboration.…
Reconfigurable Hardware for Compressing Hyperspectral Image Data

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh; Namkung, Jeffrey; Villapando, Carlos; Kiely, Aaron; Klimesh, Matthew; Xie, Hua

2010-01-01

High-speed, low-power, reconfigurable electronic hardware has been developed to implement ICER-3D, an algorithm for compressing hyperspectral-image data. The algorithm and parts thereof have been the topics of several NASA Tech Briefs articles, including Context Modeler for Wavelet Compression of Hyperspectral Images (NPO-43239) and ICER-3D Hyperspectral Image Compression Software (NPO-43238), which appear elsewhere in this issue of NASA Tech Briefs. As described in more detail in those articles, the algorithm includes three main subalgorithms: one for computing wavelet transforms, one for context modeling, and one for entropy encoding. For the purpose of designing the hardware, these subalgorithms are treated as modules to be implemented efficiently in field-programmable gate arrays (FPGAs). The design takes advantage of industry- standard, commercially available FPGAs. The implementation targets the Xilinx Virtex II pro architecture, which has embedded PowerPC processor cores with flexible on-chip bus architecture. It incorporates an efficient parallel and pipelined architecture to compress the three-dimensional image data. The design provides for internal buffering to minimize intensive input/output operations while making efficient use of offchip memory. The design is scalable in that the subalgorithms are implemented as independent hardware modules that can be combined in parallel to increase throughput. The on-chip processor manages the overall operation of the compression system, including execution of the top-level control functions as well as scheduling, initiating, and monitoring processes. The design prototype has been demonstrated to be capable of compressing hyperspectral data at a rate of 4.5 megasamples per second at a conservative clock frequency of 50 MHz, with a potential for substantially greater throughput at a higher clock frequency. The power consumption of the prototype is less than 6.5 W. The reconfigurability (by means of reprogramming) of the FPGAs makes it possible to effectively alter the design to some extent to satisfy different requirements without adding hardware. The implementation could be easily propagated to future FPGA generations and/or to custom application-specific integrated circuits.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor

PubMed Central

Tayara, Hilal; Ham, Woonchul; Chong, Kil To

2016-01-01

This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
Diversification of Processors Based on Redundancy in Instruction Set

NASA Astrophysics Data System (ADS)

Ichikawa, Shuichi; Sawada, Takashi; Hata, Hisashi

By diversifying processor architecture, computer software is expected to be more resistant to plagiarism, analysis, and attacks. This study presents a new method to diversify instruction set architecture (ISA) by utilizing the redundancy in the instruction set. Our method is particularly suited for embedded systems implemented with FPGA technology, and realizes a genuine instruction set randomization, which has not been provided by the preceding studies. The evaluation results on four typical ISAs indicate that our scheme can provide a far larger degree of freedom than the preceding studies. Diversified processors based on MIPS architecture were actually implemented and evaluated with Xilinx Spartan-3 FPGA. The increase of logic scale was modest: 5.1% in Specialized design and 3.6% in RAM-mapped design. The performance overhead was also modest: 3.4% in Specialized design and 11.6% in RAM-mapped design. From these results, our scheme is regarded as a practical and promising way to secure FPGA-based embedded systems.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.

PubMed

Tayara, Hilal; Ham, Woonchul; Chong, Kil To

2016-12-15

This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
Efficient Graph Based Assembly of Short-Read Sequences on Hybrid Core Architecture

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sczyrba, Alex; Pratap, Abhishek; Canon, Shane

2011-03-22

Advanced architectures can deliver dramatically increased throughput for genomics and proteomics applications, reducing time-to-completion in some cases from days to minutes. One such architecture, hybrid-core computing, marries a traditional x86 environment with a reconfigurable coprocessor, based on field programmable gate array (FPGA) technology. In addition to higher throughput, increased performance can fundamentally improve research quality by allowing more accurate, previously impractical approaches. We will discuss the approach used by Convey?s de Bruijn graph constructor for short-read, de-novo assembly. Bioinformatics applications that have random access patterns to large memory spaces, such as graph-based algorithms, experience memory performance limitations on cache-based x86more » servers. Convey?s highly parallel memory subsystem allows application-specific logic to simultaneously access 8192 individual words in memory, significantly increasing effective memory bandwidth over cache-based memory systems. Many algorithms, such as Velvet and other de Bruijn graph based, short-read, de-novo assemblers, can greatly benefit from this type of memory architecture. Furthermore, small data type operations (four nucleotides can be represented in two bits) make more efficient use of logic gates than the data types dictated by conventional programming models.JGI is comparing the performance of Convey?s graph constructor and Velvet on both synthetic and real data. We will present preliminary results on memory usage and run time metrics for various data sets with different sizes, from small microbial and fungal genomes to very large cow rumen metagenome. For genomes with references we will also present assembly quality comparisons between the two assemblers.« less
C to VHDL compiler

NASA Astrophysics Data System (ADS)

Berdychowski, Piotr P.; Zabolotny, Wojciech M.

2010-09-01

The main goal of C to VHDL compiler project is to make FPGA platform more accessible for scientists and software developers. FPGA platform offers unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for efficient heterogeneous, computing environment. Current industry standard in development of digital systems on FPGA platform is based on HDLs. Although very effective and expressive in hands of hardware development specialists, these languages require specific knowledge and experience, unreachable for most scientists and software programmers. C to VHDL compiler project attempts to remedy that by creating an application, that derives initial VHDL description of a digital system (for further compilation and synthesis), from purely algorithmic description in C programming language. This idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many previous years, with the introduction of some new unique improvements.
Resource and Performance Evaluations of Fixed Point QRD-RLS Systolic Array through FPGA Implementation

NASA Astrophysics Data System (ADS)

Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki

At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
FPGA cluster for high-performance AO real-time control system

NASA Astrophysics Data System (ADS)

Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.

2006-06-01

Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which lead to more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, need to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems, such as task distribution, node communication, system verification, are discussed.
Windowing technique in FM radar realized by FPGA for better target resolution

NASA Astrophysics Data System (ADS)

Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique; Kravchenko, Victor F.

2006-09-01

Remote sensing systems, such as SAR usually apply FM signals to resolve nearly placed targets (objects) and improve SNR. Main drawbacks in the pulse compression of FM radar signal that it can add the range side-lobes in reflectivity measurements. Using weighting window processing in time domain it is possible to decrease significantly the side-lobe level (SLL) of output radar signal that permits to resolve small or low power targets those are masked by powerful ones. There are usually used classical windows such as Hamming, Hanning, Blackman-Harris, Kaiser-Bessel, Dolph-Chebyshev, Gauss, etc. in window processing. Additionally to classical ones in here we also use a novel class of windows based on atomic functions (AF) theory. For comparison of simulation and experimental results we applied the standard parameters, such as coefficient of amplification, maximum level of side-lobe, width of main lobe, etc. In this paper we also proposed to implement the compression-windowing model on a hardware level employing Field Programmable Gate Array (FPGA) that offers some benefits like instantaneous implementation, dynamic reconfiguration, design, and field programmability. It has been investigated the pulse compression design on FPGA applying classical and novel window technique to reduce the SLL in absence and presence of noise. The paper presents simulated and experimental examples of detection of small or nearly placed targets in the imaging radar. Paper also presents the experimental hardware results of windowing in FM radar demonstrating resolution of the several targets for classical rectangular, Hamming, Kaiser-Bessel, and some novel ones: Up(x), fup 4(x)•D 3(x), fup 6(x)•G 3(x), etc. It is possible to conclude that windows created on base of the AFs offer better decreasing of the SLL in cases of presence or absence of noise and when we move away of the main lobe in comparison with classical windows.
Evaluation of FPGA to PC feedback loop

NASA Astrophysics Data System (ADS)

Linczuk, Pawel; Zabolotny, Wojciech M.; Wojenski, Andrzej; Krawczyk, Rafal D.; Pozniak, Krzysztof T.; Chernyshova, Maryna; Czarski, Tomasz; Gaska, Michal; Kasprowicz, Grzegorz; Kowalska-Strzeciwilk, Ewa; Malinowski, Karol

2017-08-01

The paper presents the evaluation study of the performance of the data transmission subsystem which can be used in High Energy Physics (HEP) and other High-Performance Computing (HPC) systems. The test environment consisted of Xilinx Artix-7 FPGA and server-grade PC connected via the PCIe 4xGen2 bus. The DMA engine was based on the Xilinx DMA for PCI Express Subsystem1 controlled by the modified Xilinx XDMA kernel driver.2 The research is focused on the influence of the system configuration on achievable throughput and latency of data transfer.
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators

NASA Astrophysics Data System (ADS)

Gonzalez, Juan; Núñez, Rafael C.

2009-07-01

We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.
CRionScan: A stand-alone real time controller designed to perform ion beam imaging, dose controlled irradiation and proton beam writing

NASA Astrophysics Data System (ADS)

Daudin, L.; Barberet, Ph.; Serani, L.; Moretto, Ph.

2013-07-01

High resolution ion microbeams, usually used to perform elemental mapping, low dose targeted irradiation or ion beam lithography needs a very flexible beam control system. For this purpose, we have developed a dedicated system (called “CRionScan”), on the AIFIRA facility (Applications Interdisciplinaires des Faisceaux d'Ions en Région Aquitaine). It consists of a stand-alone real-time scanning and imaging instrument based on a Compact Reconfigurable Input/Output (Compact RIO) device from National Instruments™. It is based on a real-time controller, a Field Programmable Gate Array (FPGA), input/output modules and Ethernet connectivity. We have implemented a fast and deterministic beam scanning system interfaced with our commercial data acquisition system without any hardware development. CRionScan is built under LabVIEW™ and has been used on AIFIRA's nanobeam line since 2009 (Barberet et al., 2009, 2011) [1,2]. A Graphical User Interface (GUI) embedded in the Compact RIO as a web page is used to control the scanning parameters. In addition, a fast electrostatic beam blanking trigger has been included in the FPGA and high speed counters (15 MHz) have been implemented to perform dose controlled irradiation and on-line images on the GUI. Analog to Digital converters are used for the beam current measurement and in the near future for secondary electrons imaging. Other functionalities have been integrated in this controller like LED lighting using Pulse Width Modulation and a “NIM Wilkinson ADC” data acquisition.
Radiation Hardened Electronics for Extreme Environments

NASA Technical Reports Server (NTRS)

Keys, Andrew S.; Watson, Michael D.

2007-01-01

The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches.
Design of a real-time system of moving ship tracking on-board based on FPGA in remote sensing images

NASA Astrophysics Data System (ADS)

Yang, Tie-jun; Zhang, Shen; Zhou, Guo-qing; Jiang, Chuan-xian

2015-12-01

With the broad attention of countries in the areas of sea transportation and trade safety, the requirements of efficiency and accuracy of moving ship tracking are becoming higher. Therefore, a systematic design of moving ship tracking onboard based on FPGA is proposed, which uses the Adaptive Inter Frame Difference (AIFD) method to track a ship with different speed. For the Frame Difference method (FD) is simple but the amount of computation is very large, it is suitable for the use of FPGA to implement in parallel. But Frame Intervals (FIs) of the traditional FD method are fixed, and in remote sensing images, a ship looks very small (depicted by only dozens of pixels) and moves slowly. By applying invariant FIs, the accuracy of FD for moving ship tracking is not satisfactory and the calculation is highly redundant. So we use the adaptation of FD based on adaptive extraction of key frames for moving ship tracking. A FPGA development board of Xilinx Kintex-7 series is used for simulation. The experiments show that compared with the traditional FD method, the proposed one can achieve higher accuracy of moving ship tracking, and can meet the requirement of real-time tracking in high image resolution.
A design approach for small vision-based autonomous vehicles

NASA Astrophysics Data System (ADS)

Edwards, Barrett B.; Fife, Wade S.; Archibald, James K.; Lee, Dah-Jye; Wilde, Doran K.

2006-10-01

This paper describes the design of a small autonomous vehicle based on the Helios computing platform, a custom FPGA-based board capable of supporting on-board vision. Target applications for the Helios computing platform are those that require lightweight equipment and low power consumption. To demonstrate the capabilities of FPGAs in real-time control of autonomous vehicles, a 16 inch long R/C monster truck was outfitted with a Helios board. The platform provided by such a small vehicle is ideal for testing and development. The proof of concept application for this autonomous vehicle was a timed race through an environment with obstacles. Given the size restrictions of the vehicle and its operating environment, the only feasible on-board sensor is a small CMOS camera. The single video feed is therefore the only source of information from the surrounding environment. The image is then segmented and processed by custom logic in the FPGA that also controls direction and speed of the vehicle based on visual input.
Development of A Low-Cost FPGA-Based Measurement System for Real-Time Processing of Acoustic Emission Data: Proof of Concept Using Control of Pulsed Laser Ablation in Liquids.

PubMed

Wirtz, Sebastian F; Cunha, Adauto P A; Labusch, Marc; Marzun, Galina; Barcikowski, Stephan; Söffker, Dirk

2018-06-01

Today, the demand for continuous monitoring of valuable or safety critical equipment is increasing in many industrial applications due to safety and economical requirements. Therefore, reliable in-situ measurement techniques are required for instance in Structural Health Monitoring (SHM) as well as process monitoring and control. Here, current challenges are related to the processing of sensor data with a high data rate and low latency. In particular, measurement and analyses of Acoustic Emission (AE) are widely used for passive, in-situ inspection. Advantages of AE are related to its sensitivity to different micro-mechanical mechanisms on the material level. However, online processing of AE waveforms is computationally demanding. The related equipment is typically bulky, expensive, and not well suited for permanent installation. The contribution of this paper is the development of a Field Programmable Gate Array (FPGA)-based measurement system using ZedBoard devlopment kit with Zynq-7000 system on chip for embedded implementation of suitable online processing algorithms. This platform comprises a dual-core Advanced Reduced Instruction Set Computer Machine (ARM) architecture running a Linux operating system and FPGA fabric. A FPGA-based hardware implementation of the discrete wavelet transform is realized to accelerate processing the AE measurements. Key features of the system are low cost, small form factor, and low energy consumption, which makes it suitable to serve as field-deployed measurement and control device. For verification of the functionality, a novel automatically realized adjustment of the working distance during pulsed laser ablation in liquids is established as an example. A sample rate of 5 MHz is achieved at 16 bit resolution.
An FPGA-Based High-Speed Error Resilient Data Aggregation and Control for High Energy Physics Experiment

NASA Astrophysics Data System (ADS)

Mandal, Swagata; Saini, Jogender; Zabołotny, Wojciech M.; Sau, Suman; Chakrabarti, Amlan; Chattopadhyay, Subhasis

2017-03-01

Due to the dramatic increase of data volume in modern high energy physics (HEP) experiments, a robust high-speed data acquisition (DAQ) system is very much needed to gather the data generated during different nuclear interactions. As the DAQ works under harsh radiation environment, there is a fair chance of data corruption due to various energetic particles like alpha, beta, or neutron. Hence, a major challenge in the development of DAQ in the HEP experiment is to establish an error resilient communication system between front-end sensors or detectors and back-end data processing computing nodes. Here, we have implemented the DAQ using field-programmable gate array (FPGA) due to some of its inherent advantages over the application-specific integrated circuit. A novel orthogonal concatenated code and cyclic redundancy check (CRC) have been used to mitigate the effects of data corruption in the user data. Scrubbing with a 32-b CRC has been used against error in the configuration memory of FPGA. Data from front-end sensors will reach to the back-end processing nodes through multiple stages that may add an uncertain amount of delay to the different data packets. We have also proposed a novel memory management algorithm that helps to process the data at the back-end computing nodes removing the added path delays. To the best of our knowledge, the proposed FPGA-based DAQ utilizing optical link with channel coding and efficient memory management modules can be considered as first of its kind. Performance estimation of the implemented DAQ system is done based on resource utilization, bit error rate, efficiency, and robustness to radiation.
High-speed free-space based reconfigurable card-to-card optical interconnects with broadcast capability.

PubMed

Wang, Ke; Nirmalathas, Ampalavanapillai; Lim, Christina; Skafidas, Efstratios; Alameh, Kamal

2013-07-01

In this paper, we propose and experimentally demonstrate a free-space based high-speed reconfigurable card-to-card optical interconnect architecture with broadcast capability, which is required for control functionalities and efficient parallel computing applications. Experimental results show that 10 Gb/s data can be broadcast to all receiving channels for up to 30 cm with a worst-case receiver sensitivity better than -12.20 dBm. In addition, arbitrary multicasting with the same architecture is also investigated. 10 Gb/s reconfigurable point-to-point link and multicast channels are simultaneously demonstrated with a measured receiver sensitivity power penalty of ~1.3 dB due to crosstalk.

Systems-on-chip approach for real-time simulation of wheel-rail contact laws

NASA Astrophysics Data System (ADS)

Mei, T. X.; Zhou, Y. J.

2013-04-01

This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
FPGA Boot Loader and Scrubber

NASA Technical Reports Server (NTRS)

Wade, Randall S.; Jones, Bailey

2009-01-01

A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), reads back and verifies that code, reloads the code if an error is detected, and monitors the performance of the FPGA for errors in the presence of radiation. The program consists mainly of a set of VHDL files (wherein "VHDL" signifies "VHSIC Hardware Description Language" and "VHSIC" signifies "very-high-speed integrated circuit").
Fused Smart Sensor Network for Multi-Axis Forward Kinematics Estimation in Industrial Robots

PubMed Central

Rodriguez-Donate, Carlos; Osornio-Rios, Roque Alfredo; Rivera-Guillen, Jesus Rooney; de Jesus Romero-Troncoso, Rene

2011-01-01

Flexible manipulator robots have a wide industrial application. Robot performance requires sensing its position and orientation adequately, known as forward kinematics. Commercially available, motion controllers use high-resolution optical encoders to sense the position of each joint which cannot detect some mechanical deformations that decrease the accuracy of the robot position and orientation. To overcome those problems, several sensor fusion methods have been proposed but at expenses of high-computational load, which avoids the online measurement of the joint’s angular position and the online forward kinematics estimation. The contribution of this work is to propose a fused smart sensor network to estimate the forward kinematics of an industrial robot. The developed smart processor uses Kalman filters to filter and to fuse the information of the sensor network. Two primary sensors are used: an optical encoder, and a 3-axis accelerometer. In order to obtain the position and orientation of each joint online a field-programmable gate array (FPGA) is used in the hardware implementation taking advantage of the parallel computation capabilities and reconfigurability of this device. With the aim of evaluating the smart sensor network performance, three real-operation-oriented paths are executed and monitored in a 6-degree of freedom robot. PMID:22163850
Pipelined CPU Design with FPGA in Teaching Computer Architecture

ERIC Educational Resources Information Center

Lee, Jong Hyuk; Lee, Seung Eun; Yu, Heon Chang; Suh, Taeweon

2012-01-01

This paper presents a pipelined CPU design project with a field programmable gate array (FPGA) system in a computer architecture course. The class project is a five-stage pipelined 32-bit MIPS design with experiments on the Altera DE2 board. For proper scheduling, milestones were set every one or two weeks to help students complete the project on…
The Unified Floating Point Vector Coprocessor for Reconfigurable Hardware

NASA Astrophysics Data System (ADS)

Kathiara, Jainik

There has been an increased interest recently in using embedded cores on FPGAs. Many of the applications that make use of these cores have floating point operations. Due to the complexity and expense of floating point hardware, these algorithms are usually converted to fixed point operations or implemented using floating-point emulation in software. As the technology advances, more and more homogeneous computational resources and fixed function embedded blocks are added to FPGAs and hence implementation of floating point hardware becomes a feasible option. In this research we have implemented a high performance, autonomous floating point vector Coprocessor (FPVC) that works independently within an embedded processor system. We have presented a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The Hybrid vector/SIMD computational model of FPVC results in greater overall performance for most applications along with improved peak performance compared to other approaches. By parameterizing vector length and the number of vector lanes, we can design an application specific FPVC and take optimal advantage of the FPGA fabric. For this research we have also initiated designing a software library for various computational kernels, each of which adapts FPVC's configuration and provide maximal performance. The kernels implemented are from the area of linear algebra and include matrix multiplication and QR and Cholesky decomposition. We have demonstrated the operation of FPVC on a Xilinx Virtex 5 using the embedded PowerPC.
Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA

PubMed Central

Xia, Fei; Dou, Yong; Zhou, Xingming; Yang, Xuejun; Xu, Jiaqing; Zhang, Yang

2009-01-01

Background In the field of RNA secondary structure prediction, the RNAalifold algorithm is one of the most popular methods using free energy minimization. However, general-purpose computers including parallel computers or multi-core computers exhibit parallel efficiency of no more than 50%. Field Programmable Gate-Array (FPGA) chips provide a new approach to accelerate RNAalifold by exploiting fine-grained custom design. Results RNAalifold shows complicated data dependences, in which the dependence distance is variable, and the dependence direction is also across two dimensions. We propose a systolic array structure including one master Processing Element (PE) and multiple slave PEs for fine grain hardware implementation on FPGA. We exploit data reuse schemes to reduce the need to load energy matrices from external memory. We also propose several methods to reduce energy table parameter size by 80%. Conclusion To our knowledge, our implementation with 16 PEs is the only FPGA accelerator implementing the complete RNAalifold algorithm. The experimental results show a factor of 12.2 speedup over the RNAalifold (ViennaPackage – 1.6.5) software for a group of aligned RNA sequences with 2981-residue running on a Personal Computer (PC) platform with Pentium 4 2.6 GHz CPU. PMID:19208138
Pre-Hardware Optimization of Spacecraft Image Processing Software Algorithms and Hardware Implementation

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Petrick, David J.; Day, John H. (Technical Monitor)

2001-01-01

Spacecraft telemetry rates have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image processing application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms and re-configurable computing hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processing (DSP). It has been shown in [1] and [2] that this configuration can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft. However, since this technology is still maturing, intensive pre-hardware steps are necessary to achieve the benefits of hardware implementation. This paper describes these steps for the GOES-8 application, a software project developed using Interactive Data Language (IDL) (Trademark of Research Systems, Inc.) on a Workstation/UNIX platform. The solution involves converting the application to a PC/Windows/RC platform, selected mainly by the availability of low cost, adaptable high-speed RC hardware. In order for the hybrid system to run, the IDL software was modified to account for platform differences. It was interesting to examine the gains and losses in performance on the new platform, as well as unexpected observations before implementing hardware. After substantial pre-hardware optimization steps, the necessity of hardware implementation for bottleneck code in the PC environment became evident and solvable beginning with the methodology described in [1], [2], and implementing a novel methodology for this specific application [6]. The PC-RC interface bandwidth problem for the class of applications with moderate input-output data rates but large intermediate multi-thread data streams has been addressed and mitigated. This opens a new class of satellite image processing applications for bottleneck problems solution using RC technologies. The issue of a science algorithm level of abstraction necessary for RC hardware implementation is also described. Selected Matlab functions already implemented in hardware were investigated for their direct applicability to the GOES-8 application with the intent to create a library of Matlab and IDL RC functions for ongoing work. A complete class of spacecraft image processing applications using embedded re-configurable computing technology to meet real-time requirements, including performance results and comparison with the existing system, is described in this paper.
A space-efficient quantum computer simulator suitable for high-speed FPGA implementation

NASA Astrophysics Data System (ADS)

Frank, Michael P.; Oniciuc, Liviu; Meyer-Baese, Uwe H.; Chiorescu, Irinel

2009-05-01

Conventional vector-based simulators for quantum computers are quite limited in the size of the quantum circuits they can handle, due to the worst-case exponential growth of even sparse representations of the full quantum state vector as a function of the number of quantum operations applied. However, this exponential-space requirement can be avoided by using general space-time tradeoffs long known to complexity theorists, which can be appropriately optimized for this particular problem in a way that also illustrates some interesting reformulations of quantum mechanics. In this paper, we describe the design and empirical space/time complexity measurements of a working software prototype of a quantum computer simulator that avoids excessive space requirements. Due to its space-efficiency, this design is well-suited to embedding in single-chip environments, permitting especially fast execution that avoids access latencies to main memory. We plan to prototype our design on a standard FPGA development board.
Real-time machine vision system using FPGA and soft-core processor

NASA Astrophysics Data System (ADS)

Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad

2012-06-01

This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
Sensor Systems Based on FPGAs and Their Applications: A Survey

PubMed Central

de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah

2012-01-01

In this manuscript, we present a survey of designs and implementations of research sensor nodes that rely on FPGAs, either based upon standalone platforms or as a combination of microcontroller and FPGA. Several current challenges in sensor networks are distinguished and linked to the features of modern FPGAs. As it turns out, low-power optimized FPGAs are able to enhance the computation of several types of algorithms in terms of speed and power consumption in comparison to microcontrollers of commercial sensor nodes. We show that architectures based on the combination of microcontrollers and FPGA can play a key role in the future of sensor networks, in fields where processing capabilities such as strong cryptography, self-testing and data compression, among others, are paramount.
Materials challenges for repeatable RF wireless device reconfiguration with microfluidic channels

NASA Astrophysics Data System (ADS)

Griffin, Anthony S.; Sottos, Nancy R.; White, Scott R.

2018-03-01

Recently, adaptive wireless devices have utilized displacement of EGaIn within microchannels as an electrical switching mechanism to enable reconfigurable electronics. Device reconfiguration using EGaIn in microchannels overcomes many challenges encountered by more traditional reconfiguration mechanisms such as diodes and microelectromechanical systems (MEMS). Reconfiguration using EGaIn is severely limited by undesired permanent shorting due to retention of the liquid in microchannels caused by wetting and rapid oxide skin formation. Here, we investigate the conditions which prevent repeatable electrical switching using EGaIn in microchannels. Initial contact angle tests of EGaIn on epoxy surfaces demonstrate the wettability of EGaIn on flat surfaces. SEM cross-sections of microchannels reveal adhesion of EGaIn residue to channel walls. Micro-computed tomography (microCT) scans of provide volumetric measurements of EGaIn remaining inside channels after flow cycling. Non-wetting coatings are proposed as materials based strategy to overcome these issues in future work.
Hybrid Architectures for Evolutionary Computing Algorithms

DTIC Science & Technology

2008-01-01

other EC algorithms to FPGA Core Burns P1026/MAPLD 200532 Genetic Algorithm Hardware References S. Scott, A. Samal , and S. Seth, “HGA: A Hardware Based...on Parallel and Distributed Processing (IPPS/SPDP 󈨦), pp. 316-320, Proceedings. IEEE Computer Society 1998. [12] Scott, S. D. , Samal , A., and...Algorithm Hardware References S. Scott, A. Samal , and S. Seth, “HGA: A Hardware Based Genetic Algorithm”, Proceedings of the 1995 ACM Third
FPGA-Based Laboratory Assignments for NoC-Based Manycore Systems

ERIC Educational Resources Information Center

Ttofis, C.; Theocharides, T.; Michael, M. K.

2012-01-01

Manycore systems have emerged as being one of the dominant architectural trends in next-generation computer systems. These highly parallel systems are expected to be interconnected via packet-based networks-on-chip (NoC). The complexity of such systems poses novel and exciting challenges in academia, as teaching their design requires the students…
Computer image generation: Reconfigurability as a strategy in high fidelity space applications

NASA Technical Reports Server (NTRS)

Bartholomew, Michael J.

1989-01-01

The demand for realistic, high fidelity, computer image generation systems to support space simulation is well established. However, as the number and diversity of space applications increase, the complexity and cost of computer image generation systems also increase. One strategy used to harmonize cost with varied requirements is establishment of a reconfigurable image generation system that can be adapted rapidly and easily to meet new and changing requirements. The reconfigurability strategy through the life cycle of system conception, specification, design, implementation, operation, and support for high fidelity computer image generation systems are discussed. The discussion is limited to those issues directly associated with reconfigurability and adaptability of a specialized scene generation system in a multi-faceted space applications environment. Examples and insights gained through the recent development and installation of the Improved Multi-function Scene Generation System at Johnson Space Center, Systems Engineering Simulator are reviewed and compared with current simulator industry practices. The results are clear; the strategy of reconfigurability applied to space simulation requirements provides a viable path to supporting diverse applications with an adaptable computer image generation system.
Operateurs et engins de calcul en virgule flottante et leur application a la simulation en temps reel sur FPGA

NASA Astrophysics Data System (ADS)

Ould Bachir, Tarek

The real-time simulation of electrical networks gained a vivid industrial interest during recent years, motivated by the substantial development cost reduction that such a prototyping approach can offer. Real-time simulation allows the progressive inclusion of real hardware during its development, allowing its testing under realistic conditions. However, CPU-based simulations suffer from certain limitations such as the difficulty to reach time-steps of a few microsecond, an important challenge brought by modern power converters. Hence, industrial practitioners adopted the FPGA as a platform of choice for the implementation of calculation engines dedicated to the rapid real-time simulation of electrical networks. The reconfigurable technology broke the 5 kHz switching frequency barrier that is characteristic of CPU-based simulations. Moreover, FPGA-based real-time simulation offers many advantages, including the reduced latency of the simulation loop that is obtained thanks to a direct access to sensors and actuators. The fixed-point format is paradigmatic to FPGA-based digital signal processing. However, the format imposes a time penalty in the development process since the designer has to asses the required precision for all model variables. This fact brought an import research effort on the use of the floating-point format for the simulation of electrical networks. One of the main challenges in the use of the floating-point format are the long latencies required by the elementary arithmetic operators, particularly when an adder is used as an accumulator, an important building bloc for the implementation of integration rules such as the trapezoidal method. Hence, single-cycle floating-point accumulation forms the core of this research work. Our results help building such operators as accumulators, multiply-accumulators (MACs), and dot-product (DP) operators. These operators play a key role in the implementation of the proposed calculation engines. Therefore, this thesis contributes to the realm of FPGA-based real-time simulation in many ways. The research work proposes a new summation algorithm, which is a generalization of the so-called self-alignment technique. The new formulation is broader, simpler in its expression and hardware implementation. Our research helps formulating criteria to guarantee good accuracy, the criteria being established on a theoretical, as well as empirical basis. Moreover, the thesis offers a comprehensive analysis on the use of the redundant high radix carry-save (HRCS) format. The HRCS format is used to perform rapid additions of large mantissas. Two new HRCS operators are also proposed, namely an endomorphic adder and a HRCS to conventional converter. Once the mean to single-cycle accumulation is defined as a combination of the self-alignment technique and the HRCS format, the research focuses on the FPGA implementation of SIMD calculation engines using parallel floating-point MACs or DPs. The proposed operators are characterized by low latencies, allowing the engines to reach very low time-steps. The document finally discusses power electronic circuits modelling, and concludes with the presentation of a versatile calculation engine capable of simulating power converter with arbitrary topologies and up to 24 switches, while achieving time steps below 1 mus and allowing switching frequencies in the range of tens kilohertz. The latter realization has led to commercialization of a product by our industrial partner.
A Compact Synchronous Cellular Model of Nonlinear Calcium Dynamics: Simulation and FPGA Synthesis Results.

PubMed

Soleimani, Hamid; Drakakis, Emmanuel M

2017-06-01

Recent studies have demonstrated that calcium is a widespread intracellular ion that controls a wide range of temporal dynamics in the mammalian body. The simulation and validation of such studies using experimental data would benefit from a fast large scale simulation and modelling tool. This paper presents a compact and fully reconfigurable cellular calcium model capable of mimicking Hopf bifurcation phenomenon and various nonlinear responses of the biological calcium dynamics. The proposed cellular model is synthesized on a digital platform for a single unit and a network model. Hardware synthesis, physical implementation on FPGA, and theoretical analysis confirm that the proposed cellular model can mimic the biological calcium behaviors with considerably low hardware overhead. The approach has the potential to speed up large-scale simulations of slow intracellular dynamics by sharing more cellular units in real-time. To this end, various networks constructed by pipelining 10 k to 40 k cellular calcium units are compared with an equivalent simulation run on a standard PC workstation. Results show that the cellular hardware model is, on average, 83 times faster than the CPU version.
FPGA Control System for the Automated Test of Microshutters

NASA Technical Reports Server (NTRS)

Lyness, Eric; Rapchun, David A.; Moseley, S. Harvey

2008-01-01

The James Webb Space Telescope, scheduled to replace the Hubble in 2013, must simultaneously observe hundreds of faint galaxies. This requirement has led to the development of a programmable transmission mask which can be adapted to admit light with arbitrary pattern of galaxies into its spectrograph. This programmable mask will contain a large array of micro-electromechanical (MEMs) devices called MicroShutters. These microscopic shutters physically open and close like the shutter on a camera, except each shutter is microscopic in size and an array 365 by 171 is used to select the objects under spectroscopic observation at a given time, and to block the unwanted background light from other areas. NASA developed and is currently refining the exceptionally difficult process of manufacturing these shutters. This paper describes how the authors used LabVIEW FPGA and a reconfigurable I/O board to control the shutters in a test chamber and how the flexibility of the system allows us to continue to modify the control algorithms as NASA optimizes the performance of the MicroShutter arrays.
FPGA Control System for the Automated Test of MicroShutters

NASA Technical Reports Server (NTRS)

Lyness, Eric; Rapchun, David A.; Moseley, S. Harvey

2008-01-01

The James Webb Space Telescope, scheduled to replace the Hubble in 2013, must simultaneously observe hundreds of faint galaxies. This requirement has led to the development of a programmable transmission mask which can be adapted to admit light from an arbitrary pattern of galaxies into its spectrograph. This programmable mask will contain a large array of micro-electromechanical (MEMs) devices called MicroShutters. These microscopic shutters physically open and close like the shutter on a camera, except each shutter is microscopic in size and an array 365 by 171 is used to select the objects under spectroscopic observation at a given time, and to block the unwanted background light from other areas. NASA developed and is currently refining the exceptionally difficult process of manufacturing these shutters. This paper describes how the authors used LabVIEW FPGA and a reconfigurable I/O board to control the shutters in a test chamber and how the flexibility of the system allows us to continue to modify the control algorithms as NASA optimizes the performance of the MicroShutter arrays.
An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

PubMed Central

Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

2010-01-01

This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
An intelligent architecture based on Field Programmable Gate Arrays designed to detect moving objects by using Principal Component Analysis.

PubMed

Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

2010-01-01

This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.

An FPGA-Based People Detection System

NASA Astrophysics Data System (ADS)

Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.

2005-12-01

This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about[InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at[InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.
Multipurpose Controller with EPICS integration and data logging: BPM application for ESS Bilbao

NASA Astrophysics Data System (ADS)

Arredondo, I.; del Campo, M.; Echevarria, P.; Jugo, J.; Etxebarria, V.

2013-10-01

This work presents a multipurpose configurable control system which can be integrated in an EPICS control network, this functionality being configured through a XML configuration file. The core of the system is the so-called Hardware Controller which is in charge of the control hardware management, the set up and communication with the EPICS network and the data storage. The reconfigurable nature of the controller is based on a single XML file, allowing any final user to easily modify and adjust the control system to any specific requirement. The selected Java development environment ensures a multiplatform operation and large versatility, even regarding the control hardware to be controlled. Specifically, this paper, focused on fast control based on a high performance FPGA, describes also an application approach for the ESS Bilbao's Beam Position Monitoring system. The implementation of the XML configuration file and the satisfactory performance outcome achieved are presented, as well as a general description of the Multipurpose Controller itself.
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale

PubMed Central

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2017-01-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft’s FPGA deployment in its Bing search engine and Intel’s 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems—like Apache Spark and Hadoop—to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.

PubMed

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2016-10-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's 16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems-like Apache Spark and Hadoop-to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7 × to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting

PubMed Central

Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel

2008-01-01

Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
IDSAC-IUCAA digital sampler array controller

NASA Astrophysics Data System (ADS)

Chattopadhyay, Sabyasachi; Chordia, Pravin; Ramaprakash, A. N.; Burse, Mahesh P.; Joshi, Bhushan; Chillal, Kalpesh

2016-07-01

In order to run the large format detector arrays and mosaics that are required by most astronomical instruments, readout electronic controllers are required which can process multiple CCD outputs simultaneously at high speeds and low noise levels. These CCD controllers need to be modular and configurable, should be able to run multiple detector types to cater to a wide variety of requirements. IUCAA Digital Sampler Array Controller (IDSAC), is a generic CCD Controller based on a fully scalable architecture which is adequately flexible and powerful enough to control a wide variety of detectors used in ground based astronomy. The controller has a modular backplane architecture that consists of Single Board Controller Cards (SBCs) and can control up to 5 CCDs (mosaic or independent). Each Single Board Controller (SBC) has all the resources to a run Single large format CCD having up to four outputs. All SBCs are identical and are easily interchangeable without needing any reconfiguration. A four channel video processor on each SBC can process up to four output CCDs with or without dummy outputs at 0.5 Megapixels/Sec/Channel with 16 bit resolution. Each SBC has a USB 2.0 interface which can be connected to a host computer via optional USB to Fibre converters. The SBC uses a reconfigurable hardware (FPGA) as a Master Controller. IDSAC offers Digital Correlated Double Sampling (DCDS) to eliminate thermal kTC noise. CDS performed in Digital domain (DCDS) has several advantages over its analog counterpart, such as - less electronics, faster readout and easier post processing. It is also flexible with sampling rate and pixel throughput while maintaining the core circuit topology intact. Noise characterization of the IDSAC CDS signal chain has been performed by analytical modelling and practical measurements. Various types of noise such as white, pink, power supply, bias etc. has been considered while creating an analytical noise model tool to predict noise of a controller system like IDSAC. Several tests are performed to measure the actual noise of IDSAC. The theoretical calculation matches very well with practical measurements within 10% accuracy.
High-speed polarization sensitive optical coherence tomography for retinal diagnostics

NASA Astrophysics Data System (ADS)

Yin, Biwei; Wang, Bingqing; Vemishetty, Kalyanramu; Nagle, Jim; Liu, Shuang; Wang, Tianyi; Rylander, Henry G., III; Milner, Thomas E.

2012-01-01

We report design and construction of an FPGA-based high-speed swept-source polarization-sensitive optical coherence tomography (SS-PS-OCT) system for clinical retinal imaging. Clinical application of the SS-PS-OCT system is accurate measurement and display of thickness, phase retardation and birefringence maps of the retinal nerve fiber layer (RNFL) in human subjects for early detection of glaucoma. The FPGA-based SS-PS-OCT system provides three incident polarization states on the eye and uses a bulk-optic polarization sensitive balanced detection module to record two orthogonal interference fringe signals. Interference fringe signals and relative phase retardation between two orthogonal polarization states are used to obtain Stokes vectors of light returning from each RNFL depth. We implement a Levenberg-Marquardt algorithm on a Field Programmable Gate Array (FPGA) to compute accurate phase retardation and birefringence maps. For each retinal scan, a three-state Levenberg-Marquardt nonlinear algorithm is applied to 360 clusters each consisting of 100 A-scans to determine accurate maps of phase retardation and birefringence in less than 1 second after patient measurement allowing real-time clinical imaging-a speedup of more than 300 times over previous implementations. We report application of the FPGA-based SS-PS-OCT system for real-time clinical imaging of patients enrolled in a clinical study at the Eye Institute of Austin and Duke Eye Center.
Digital hardware implementation of a stochastic two-dimensional neuron model.

PubMed

Grassia, F; Kohno, T; Levi, T

2016-11-01

This study explores the feasibility of stochastic neuron simulation in digital systems (FPGA), which realizes an implementation of a two-dimensional neuron model. The stochasticity is added by a source of current noise in the silicon neuron using an Ornstein-Uhlenbeck process. This approach uses digital computation to emulate individual neuron behavior using fixed point arithmetic operation. The neuron model's computations are performed in arithmetic pipelines. It was designed in VHDL language and simulated prior to mapping in the FPGA. The experimental results confirmed the validity of the developed stochastic FPGA implementation, which makes the implementation of the silicon neuron more biologically plausible for future hybrid experiments. Copyright © 2017 Elsevier Ltd. All rights reserved.
Instrumentation and control of harmonic oscillators via a single-board microprocessor-FPGA device.

PubMed

Picone, Rico A R; Davis, Solomon; Devine, Cameron; Garbini, Joseph L; Sidles, John A

2017-04-01

We report the development of an instrumentation and control system instantiated on a microprocessor-field programmable gate array (FPGA) device for a harmonic oscillator comprising a portion of a magnetic resonance force microscope. The specific advantages of the system are that it minimizes computation, increases maintainability, and reduces the technical barrier required to enter the experimental field of magnetic resonance force microscopy. Heterodyne digital control and measurement yields computational advantages. A single microprocessor-FPGA device improves system maintainability by using a single programming language. The system presented requires significantly less technical expertise to instantiate than the instrumentation of previous systems, yet integrity of performance is retained and demonstrated with experimental data.
Instrumentation and control of harmonic oscillators via a single-board microprocessor-FPGA device

NASA Astrophysics Data System (ADS)

Picone, Rico A. R.; Davis, Solomon; Devine, Cameron; Garbini, Joseph L.; Sidles, John A.

2017-04-01

We report the development of an instrumentation and control system instantiated on a microprocessor-field programmable gate array (FPGA) device for a harmonic oscillator comprising a portion of a magnetic resonance force microscope. The specific advantages of the system are that it minimizes computation, increases maintainability, and reduces the technical barrier required to enter the experimental field of magnetic resonance force microscopy. Heterodyne digital control and measurement yields computational advantages. A single microprocessor-FPGA device improves system maintainability by using a single programming language. The system presented requires significantly less technical expertise to instantiate than the instrumentation of previous systems, yet integrity of performance is retained and demonstrated with experimental data.
Pre-Hardware Optimization and Implementation Of Fast Optics Closed Control Loop Algorithms

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Lyon, Richard G.; Herman, Jay R.; Abuhassan, Nader

2004-01-01

One of the main heritage tools used in scientific and engineering data spectrum analysis is the Fourier Integral Transform and its high performance digital equivalent - the Fast Fourier Transform (FFT). The FFT is particularly useful in two-dimensional (2-D) image processing (FFT2) within optical systems control. However, timing constraints of a fast optics closed control loop would require a supercomputer to run the software implementation of the FFT2 and its inverse, as well as other image processing representative algorithm, such as numerical image folding and fringe feature extraction. A laboratory supercomputer is not always available even for ground operations and is not feasible for a night project. However, the computationally intensive algorithms still warrant alternative implementation using reconfigurable computing technologies (RC) such as Digital Signal Processors (DSP) and Field Programmable Gate Arrays (FPGA), which provide low cost compact super-computing capabilities. We present a new RC hardware implementation and utilization architecture that significantly reduces the computational complexity of a few basic image-processing algorithm, such as FFT2, image folding and phase diversity for the NASA Solar Viewing Interferometer Prototype (SVIP) using a cluster of DSPs and FPGAs. The DSP cluster utilization architecture also assures avoidance of a single point of failure, while using commercially available hardware. This, combined with the control algorithms pre-hardware optimization, or the first time allows construction of image-based 800 Hertz (Hz) optics closed control loops on-board a spacecraft, based on the SVIP ground instrument. That spacecraft is the proposed Earth Atmosphere Solar Occultation Imager (EASI) to study greenhouse gases CO2, C2H, H2O, O3, O2, N2O from Lagrange-2 point in space. This paper provides an advanced insight into a new type of science capabilities for future space exploration missions based on on-board image processing for control and for robotics missions using vision sensors. It presents a top-level description of technologies required for the design and construction of SVIP and EASI and to advance the spatial-spectral imaging and large-scale space interferometry science and engineering.
Implementing Audio Digital Feedback Loop Using the National Instruments RIO System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, G.; Byrd, J. M.

2006-11-20

Development of system for high precision RF distribution and laser synchronization at Berkeley Lab has been ongoing for several years. Successful operation of these systems requires multiple audio bandwidth feedback loops running at relatively high gains. Stable operation of the feedback loops requires careful design of the feedback transfer function. To allow for flexible and compact implementation, we have developed digital feedback loops on the National Instruments Reconfigurable Input/Output (RIO) platform. This platform uses an FPGA and multiple I/Os that can provide eight parallel channels running different filters. We present the design and preliminary experimental results of this system.
Adaptive Tunable Laser Spectrometer for Space Applications

NASA Technical Reports Server (NTRS)

Flesch, Gregory; Keymeulen, Didier

2010-01-01

An architecture and process for the rapid prototyping and subsequent development of an adaptive tunable laser absorption spectrometer (TLS) are described. Our digital hardware/firmware/software platform is both reconfigurable at design time as well as autonomously adaptive in real-time for both post-integration and post-launch situations. The design expands the range of viable target environments and enhances tunable laser spectrometer performance in extreme and even unpredictable environments. Through rapid prototyping with a commercial RTOS/FPGA platform, we have implemented a fully operational tunable laser spectrometer (using a highly sensitive second harmonic technique). With this prototype, we have demonstrated autonomous real-time adaptivity in the lab with simulated extreme environments.
On Representative Spaceflight Instrument and Associated Instrument Sensor Web Framework

NASA Technical Reports Server (NTRS)

Kizhner, Semion; Patel, Umeshkumar; Vootukuru, Meg

2007-01-01

Sensor Web-based adaptation and sharing of space flight mission resources, including those of the Space-Ground and Control-User communication segment, could greatly benefit from utilization of heritage Internet Protocols and devices applied for Spaceflight (SpaceIP). This had been successfully demonstrated by a few recent spaceflight experiments. However, while terrestrial applications of Internet protocols are well developed and understood (mostly due to billions of dollars in investments by the military and industry), the spaceflight application of Internet protocols is still in its infancy. Progress in the developments of SpaceIP-enabled instrument components will largely determine the SpaceIP utilization of those investments and acceptance in years to come. Likewise SpaceIP, the development of commercial real-time and instrument colocated computational resources, data compression and storage, can be enabled on-board a spacecraft and, in turn, support a powerful application to Sensor Web-based design of a spaceflight instrument. Sensor Web-enabled reconfiguration and adaptation of structures for hardware resources and information systems will commence application of Field Programmable Arrays (FPGA) and other aerospace programmable logic devices for what this technology was intended. These are a few obvious potential benefits of Sensor Web technologies for spaceflight applications. However, they are still waiting to be explored. This is because there is a need for a new approach to spaceflight instrumentation in order to make these mature sensor web technologies applicable for spaceflight. In this paper we present an approach in developing related and enabling spaceflight instrument-level technologies based on the new concept of a representative spaceflight Instrument Sensor Web (ISW).
Programmable hardware for reconfigurable computing systems

NASA Astrophysics Data System (ADS)

Smith, Stephen

1996-10-01

In 1945 the work of J. von Neumann and H. Goldstein created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
Cycle accurate and cycle reproducible memory for an FPGA based hardware accelerator

DOEpatents

Asaad, Sameh W.; Kapur, Mohit

2016-03-15

A method, system and computer program product are disclosed for using a Field Programmable Gate Array (FPGA) to simulate operations of a device under test (DUT). The DUT includes a device memory having a number of input ports, and the FPGA is associated with a target memory having a second number of input ports, the second number being less than the first number. In one embodiment, a given set of inputs is applied to the device memory at a frequency Fd and in a defined cycle of time, and the given set of inputs is applied to the target memory at a frequency Ft. Ft is greater than Fd and cycle accuracy is maintained between the device memory and the target memory. In an embodiment, a cycle accurate model of the DUT memory is created by separating the DUT memory interface protocol from the target memory storage array.
Fast semivariogram computation using FPGA architectures

NASA Astrophysics Data System (ADS)

Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang

2015-02-01

The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices
Random number generators for large-scale parallel Monte Carlo simulations on FPGA

NASA Astrophysics Data System (ADS)

Lin, Y.; Wang, F.; Liu, B.

2018-05-01

Through parallelization, field programmable gate array (FPGA) can achieve unprecedented speeds in large-scale parallel Monte Carlo (LPMC) simulations. FPGA presents both new constraints and new opportunities for the implementations of random number generators (RNGs), which are key elements of any Monte Carlo (MC) simulation system. Using empirical and application based tests, this study evaluates all of the four RNGs used in previous FPGA based MC studies and newly proposed FPGA implementations for two well-known high-quality RNGs that are suitable for LPMC studies on FPGA. One of the newly proposed FPGA implementations: a parallel version of additive lagged Fibonacci generator (Parallel ALFG) is found to be the best among the evaluated RNGs in fulfilling the needs of LPMC simulations on FPGA.
Reconfigurable Mobile System - Ground, sea and air applications

NASA Astrophysics Data System (ADS)

Lamonica, Gary L.; Sturges, James W.

1990-11-01

The Reconfigurable Mobile System (RMS) is a highly mobile data-processing unit for military users requiring real-time access to data gathered by airborne (and other) reconnaissance data. RMS combines high-performance computation and image processing workstations with resources for command/control/communications in a single, lightweight shelter. RMS is composed of off-the-shelf components, and is easily reconfigurable to land-vehicle or shipboard versions. Mission planning, which involves an airborne sensor platform's sensor coverage, considered aircraft/sensor capabilities in conjunction with weather, terrain, and threat scenarios. RMS's man-machine interface concept facilitates user familiarization and features iron-based function selection and windowing.
STAR: FPGA-based software defined satellite transponder

NASA Astrophysics Data System (ADS)

Davalle, Daniele; Cassettari, Riccardo; Saponara, Sergio; Fanucci, Luca; Cucchi, Luca; Bigongiari, Franco; Errico, Walter

2013-05-01

This paper presents STAR, a flexible Telemetry, Tracking & Command (TT&C) transponder for Earth Observation (EO) small satellites, developed in collaboration with INTECS and SITAEL companies. With respect to state-of-the-art EO transponders, STAR includes the possibility of scientific data transfer thanks to the 40 Mbps downlink data-rate. This feature represents an important optimization in terms of hardware mass, which is important for EO small satellites. Furthermore, in-flight re-configurability of communication parameters via telecommand is important for in-orbit link optimization, which is especially useful for low orbit satellites where visibility can be as short as few hundreds of seconds. STAR exploits the principles of digital radio to minimize the analog section of the transceiver. 70MHz intermediate frequency (IF) is the interface with an external S/X band radio-frequency front-end. The system is composed of a dedicated configurable high-speed digital signal processing part, the Signal Processor (SP), described in technology-independent VHDL working with a clock frequency of 184.32MHz and a low speed control part, the Control Processor (CP), based on the 32-bit Gaisler LEON3 processor clocked at 32 MHz, with SpaceWire and CAN interfaces. The quantization parameters were fine-tailored to reach a trade-off between hardware complexity and implementation loss which is less than 0.5 dB at BER = 10-5 for the RX chain. The IF ports require 8-bit precision. The system prototype is fitted on the Xilinx Virtex 6 VLX75T-FF484 FPGA of which a space-qualified version has been announced. The total device occupation is 82 %.

Real-time object tracking based on scale-invariant features employing bio-inspired hardware.

PubMed

Yasukawa, Shinsuke; Okuno, Hirotsugu; Ishii, Kazuo; Yagi, Tetsuya

2016-09-01

We developed a vision sensor system that performs a scale-invariant feature transform (SIFT) in real time. To apply the SIFT algorithm efficiently, we focus on a two-fold process performed by the visual system: whole-image parallel filtering and frequency-band parallel processing. The vision sensor system comprises an active pixel sensor, a metal-oxide semiconductor (MOS)-based resistive network, a field-programmable gate array (FPGA), and a digital computer. We employed the MOS-based resistive network for instantaneous spatial filtering and a configurable filter size. The FPGA is used to pipeline process the frequency-band signals. The proposed system was evaluated by tracking the feature points detected on an object in a video. Copyright © 2016 Elsevier Ltd. All rights reserved.
Exploring Accelerating Science Applications with FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Storaasli, Olaf O; Strenski, Dave

2007-01-01

FPGA hardware and tools (VHDL, Viva, MitrionC and CHiMPS) are described. FPGA performance is evaluated on two Cray XD1 systems (Virtex-II Pro 50 and Virtex-4 LX160) for human genome (DNA and protein) sequence comparisons for a computational biology code (FASTA). Scalable FPGA speedups of 50X (Virtex-II) and 100X (Virtex-4) over a 2.2 GHz Opteron were achieved. Coding and IO issues faced for human genome data are described.
Invited Article: Digital beam-forming imaging riometer systems

NASA Astrophysics Data System (ADS)

Honary, Farideh; Marple, Steve R.; Barratt, Keith; Chapman, Peter; Grill, Martin; Nielsen, Erling

2011-03-01

The design and operation of a new generation of digital imaging riometer systems developed by Lancaster University are presented. In the heart of the digital imaging riometer is a field-programmable gate array (FPGA), which is used for the digital signal processing and digital beam forming, completely replacing the analog Butler matrices which have been used in previous designs. The reconfigurable nature of the FPGA has been exploited to produce tools for remote system testing and diagnosis which have proven extremely useful for operation in remote locations such as the Arctic and Antarctic. Different FPGA programs enable different instrument configurations, including a 4 × 4 antenna filled array (producing 4 × 4 beams), an 8 × 8 antenna filled array (producing 7 × 7 beams), and a Mills cross system utilizing 63 antennas producing 556 usable beams. The concept of using a Mills cross antenna array for riometry has been successfully demonstrated for the first time. The digital beam forming has been validated by comparing the received signal power from cosmic radio sources with results predicted from the theoretical beam radiation pattern. The performances of four digital imaging riometer systems are compared against each other and a traditional imaging riometer utilizing analog Butler matrices. The comparison shows that digital imaging riometer systems, with independent receivers for each antenna, can obtain much better measurement precision for filled arrays or much higher spatial resolution for the Mills cross configuration when compared to existing imaging riometer systems.
Design of the SLAC RCE Platform: A General Purpose ATCA Based Data Acquisition System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herbst, R.; Claus, R.; Freytag, M.

2015-01-23

The SLAC RCE platform is a general purpose clustered data acquisition system implemented on a custom ATCA compliant blade, called the Cluster On Board (COB). The core of the system is the Reconfigurable Cluster Element (RCE), which is a system-on-chip design based upon the Xilinx Zynq family of FPGAs, mounted on custom COB daughter-boards. The Zynq architecture couples a dual core ARM Cortex A9 based processor with a high performance 28nm FPGA. The RCE has 12 external general purpose bi-directional high speed links, each supporting serial rates of up to 12Gbps. 8 RCE nodes are included on a COB, eachmore » with a 10Gbps connection to an on-board 24-port Ethernet switch integrated circuit. The COB is designed to be used with a standard full-mesh ATCA backplane allowing multiple RCE nodes to be tightly interconnected with minimal interconnect latency. Multiple shelves can be clustered using the front panel 10-gbps connections. The COB also supports local and inter-blade timing and trigger distribution. An experiment specific Rear Transition Module adapts the 96 high speed serial links to specific experiments and allows an experiment-specific timing and busy feedback connection. This coupling of processors with a high performance FPGA fabric in a low latency, multiple node cluster allows high speed data processing that can be easily adapted to any physics experiment. RTEMS and Linux are both ported to the module. The RCE has been used or is the baseline for several current and proposed experiments (LCLS, HPS, LSST, ATLAS-CSC, LBNE, DarkSide, ILC-SiD, etc).« less
Reconfigurable, Intelligently-Adaptive, Communication System, an SDR Platform

NASA Technical Reports Server (NTRS)

Roche, Rigoberto

2016-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework to abstract the application software from the radio platform hardware. STRS aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. The Glenn Research Center (GRC) team made a software-defined radio (SDR) platform STRS compliant by adding an STRS operating environment and a field programmable gate array (FPGA) wrapper, capable of implementing each of the platforms interfaces, as well as a test waveform to exercise those interfaces. This effort serves to provide a framework toward waveform development on an STRS compliant platform to support future space communication systems for advanced exploration missions. Validated STRS compliant applications provided tested code with extensive documentation to potentially reduce risk, cost and efforts in development of space-deployable SDRs. This paper discusses the advantages of STRS, the integration of STRS onto a Reconfigurable, Intelligently-Adaptive, Communication System (RIACS) SDR platform, the sample waveform, and wrapper development efforts. The paper emphasizes the infusion of the STRS Architecture onto the RIACS platform for potential use in next generation SDRs for advance exploration missions.
Reconfigurable Resonant Regulating Rectifier With Primary Equalization for Extended Coupling- and Loading-Range in Bio-Implant Wireless Power Transfer.

PubMed

Li, Xing; Meng, Xiaodong; Tsui, Chi-Ying; Ki, Wing-Hung

2015-12-01

Wireless power transfer using reconfigurable resonant regulating (R(3)) rectification suffers from limited range in accommodating varying coupling and loading conditions. A primary-assisted regulation principle is proposed to mitigate these limitations, of which the amplitude of the rectifier input voltage on the secondary side is regulated by accordingly adjusting the voltage amplitude Veq on the primary side. A novel current-sensing method and calibration scheme track Veq on the primary side. A ramp generator simultaneously provides three clock signals for different modules. Both the primary equalizer and the R(3) rectifier are implemented as custom integrated circuits fabricated in a 0.35 μm CMOS process, with the global control implemented in FPGA. Measurements show that with the primary equalizer, the workable coupling and loading ranges are extended by 250% at 120 mW load and 300% at 1.2 cm coil distance compared to the same system without the primary equalizer. A maximum rectifier efficiency of 92.5% and a total system efficiency of 62.4% are demonstrated.
FPGA-accelerated adaptive optics wavefront control

NASA Astrophysics Data System (ADS)

Mauch, S.; Reger, J.; Reinlein, C.; Appelfelder, M.; Goy, M.; Beckert, E.; Tünnermann, A.

2014-03-01

The speed of real-time adaptive optical systems is primarily restricted by the data processing hardware and computational aspects. Furthermore, the application of mirror layouts with increasing numbers of actuators reduces the bandwidth (speed) of the system and, thus, the number of applicable control algorithms. This burden turns out a key-impediment for deformable mirrors with continuous mirror surface and highly coupled actuator influence functions. In this regard, specialized hardware is necessary for high performance real-time control applications. Our approach to overcome this challenge is an adaptive optics system based on a Shack-Hartmann wavefront sensor (SHWFS) with a CameraLink interface. The data processing is based on a high performance Intel Core i7 Quadcore hard real-time Linux system. Employing a Xilinx Kintex-7 FPGA, an own developed PCie card is outlined in order to accelerate the analysis of a Shack-Hartmann Wavefront Sensor. A recently developed real-time capable spot detection algorithm evaluates the wavefront. The main features of the presented system are the reduction of latency and the acceleration of computation For example, matrix multiplications which in general are of complexity O(n3 are accelerated by using the DSP48 slices of the field-programmable gate array (FPGA) as well as a novel hardware implementation of the SHWFS algorithm. Further benefits are the Streaming SIMD Extensions (SSE) which intensively use the parallelization capability of the processor for further reducing the latency and increasing the bandwidth of the closed-loop. Due to this approach, up to 64 actuators of a deformable mirror can be handled and controlled without noticeable restriction from computational burdens.
Reconfigurable Model Execution in the OpenMDAO Framework

NASA Technical Reports Server (NTRS)

Hwang, John T.

2017-01-01

NASA's OpenMDAO framework facilitates constructing complex models and computing their derivatives for multidisciplinary design optimization. Decomposing a model into components that follow a prescribed interface enables OpenMDAO to assemble multidisciplinary derivatives from the component derivatives using what amounts to the adjoint method, direct method, chain rule, global sensitivity equations, or any combination thereof, using the MAUD architecture. OpenMDAO also handles the distribution of processors among the disciplines by hierarchically grouping the components, and it automates the data transfer between components that are on different processors. These features have made OpenMDAO useful for applications in aircraft design, satellite design, wind turbine design, and aircraft engine design, among others. This paper presents new algorithms for OpenMDAO that enable reconfigurable model execution. This concept refers to dynamically changing, during execution, one or more of: the variable sizes, solution algorithm, parallel load balancing, or set of variables-i.e., adding and removing components, perhaps to switch to a higher-fidelity sub-model. Any component can reconfigure at any point, even when running in parallel with other components, and the reconfiguration algorithm presented here performs the synchronized updates to all other components that are affected. A reconfigurable software framework for multidisciplinary design optimization enables new adaptive solvers, adaptive parallelization, and new applications such as gradient-based optimization with overset flow solvers and adaptive mesh refinement. Benchmarking results demonstrate the time savings for reconfiguration compared to setting up the model again from scratch, which can be significant in large-scale problems. Additionally, the new reconfigurability feature is applied to a mission profile optimization problem for commercial aircraft where both the parametrization of the mission profile and the time discretization are adaptively refined, resulting in computational savings of roughly 10% and the elimination of oscillations in the optimized altitude profile.
Reconfigurable optical interconnections via dynamic computer-generated holograms

NASA Technical Reports Server (NTRS)

Liu, Hua-Kuang (Inventor); Zhou, Shaomin (Inventor)

1994-01-01

A system is proposed for optically providing one-to-many irregular interconnections, and strength-adjustable many-to-many irregular interconnections which may be provided with strengths (weights) w(sub ij) using multiple laser beams which address multiple holograms and means for combining the beams modified by the holograms to form multiple interconnections, such as a cross-bar switching network. The optical means for interconnection is based on entering a series of complex computer-generated holograms on an electrically addressed spatial light modulator for real-time reconfigurations, thus providing flexibility for interconnection networks for largescale practical use. By employing multiple sources and holograms, the number of interconnection patterns achieved is increased greatly.
Driver face tracking using semantics-based feature of eyes on single FPGA

NASA Astrophysics Data System (ADS)

Yu, Ying-Hao; Chen, Ji-An; Ting, Yi-Siang; Kwok, Ngaiming

2017-06-01

Tracking driver's face is one of the essentialities for driving safety control. This kind of system is usually designed with complicated algorithms to recognize driver's face by means of powerful computers. The design problem is not only about detecting rate but also from parts damages under rigorous environments by vibration, heat, and humidity. A feasible strategy to counteract these damages is to integrate entire system into a single chip in order to achieve minimum installation dimension, weight, power consumption, and exposure to air. Meanwhile, an extraordinary methodology is also indispensable to overcome the dilemma of low-computing capability and real-time performance on a low-end chip. In this paper, a novel driver face tracking system is proposed by employing semantics-based vague image representation (SVIR) for minimum hardware resource usages on a FPGA, and the real-time performance is also guaranteed at the same time. Our experimental results have indicated that the proposed face tracking system is viable and promising for the smart car design in the future.
Real-time dual-polarization transmission based on hybrid optical wireless communications

NASA Astrophysics Data System (ADS)

Sousa, Artur N.; Alimi, Isiaka A.; Ferreira, Ricardo M.; Shahpari, Ali; Lima, Mário; Monteiro, Paulo P.; Teixeira, António L.

2018-01-01

We present experimental work on a gigabit-capable and long-reach hybrid coherent UWDM-PON plus FSO system for supporting different applications over the same fiber infrastructure in the mobile backhaul (MBH) networks. Also, for the first time, we demonstrate a reconfigurable real-time DSP transmission/reception of DP-QPSK signals over standard single-mode fiber (SSMF) and FSO links. The receiver presented is based on a commercial field-programmable gate array (FPGA). The considered communication links are based on 20 UDWDM channels with 625 Mbaud and 2.5 GHz channel spacing. We are able to demonstrate the lowest sampling rate required for digital coherent PON by employing four 1.25 Gsa/s ADCs using an electrical front-end receiver that offers only 1 GHz analog bandwidth. We achieved this by implementing a phase and polarization diversity coherent receiver combined with the DP-QPSK modulation formats. The system performance is estimated in terms of receiver sensitivity. The results show the viability of coherent PON and flexible dual-polarization supported by software-defined transceivers for the MBH.
Design Considerations for a Computationally-Lightweight Authentication Mechanism for Passive RFID Tags

DTIC Science & Technology

2009-09-01

suffer the power and complexity requirements of a public key system. 28 In [18], a simulation of the SHA –1 algorithm is performed on a Xilinx FPGA ... 256 bits. Thus, the construction of a hash table would need 2512 independent comparisons. It is known that hash collisions of the SHA –1 algorithm... SHA –1 algorithm for small-core FPGA design. Small-core FPGA design is the process by which a circuit is adapted to use the minimal amount of logic
Acceleration of Image Segmentation Algorithm for (Breast) Mammogram Images Using High-Performance Reconfigurable Dataflow Computers

PubMed Central

Filipovic, Nenad D.

2017-01-01

Image segmentation is one of the most common procedures in medical imaging applications. It is also a very important task in breast cancer detection. Breast cancer detection procedure based on mammography can be divided into several stages. The first stage is the extraction of the region of interest from a breast image, followed by the identification of suspicious mass regions, their classification, and comparison with the existing image database. It is often the case that already existing image databases have large sets of data whose processing requires a lot of time, and thus the acceleration of each of the processing stages in breast cancer detection is a very important issue. In this paper, the implementation of the already existing algorithm for region-of-interest based image segmentation for mammogram images on High-Performance Reconfigurable Dataflow Computers (HPRDCs) is proposed. As a dataflow engine (DFE) of such HPRDC, Maxeler's acceleration card is used. The experiments for examining the acceleration of that algorithm on the Reconfigurable Dataflow Computers (RDCs) are performed with two types of mammogram images with different resolutions. There were, also, several DFE configurations and each of them gave a different acceleration value of algorithm execution. Those acceleration values are presented and experimental results showed good acceleration. PMID:28611851
Acceleration of Image Segmentation Algorithm for (Breast) Mammogram Images Using High-Performance Reconfigurable Dataflow Computers.

PubMed

Milankovic, Ivan L; Mijailovic, Nikola V; Filipovic, Nenad D; Peulic, Aleksandar S

2017-01-01

Image segmentation is one of the most common procedures in medical imaging applications. It is also a very important task in breast cancer detection. Breast cancer detection procedure based on mammography can be divided into several stages. The first stage is the extraction of the region of interest from a breast image, followed by the identification of suspicious mass regions, their classification, and comparison with the existing image database. It is often the case that already existing image databases have large sets of data whose processing requires a lot of time, and thus the acceleration of each of the processing stages in breast cancer detection is a very important issue. In this paper, the implementation of the already existing algorithm for region-of-interest based image segmentation for mammogram images on High-Performance Reconfigurable Dataflow Computers (HPRDCs) is proposed. As a dataflow engine (DFE) of such HPRDC, Maxeler's acceleration card is used. The experiments for examining the acceleration of that algorithm on the Reconfigurable Dataflow Computers (RDCs) are performed with two types of mammogram images with different resolutions. There were, also, several DFE configurations and each of them gave a different acceleration value of algorithm execution. Those acceleration values are presented and experimental results showed good acceleration.
The Application of Virtex-II Pro FPGA in High-Speed Image Processing Technology of Robot Vision Sensor

NASA Astrophysics Data System (ADS)

Ren, Y. J.; Zhu, J. G.; Yang, X. Y.; Ye, S. H.

2006-10-01

The Virtex-II Pro FPGA is applied to the vision sensor tracking system of IRB2400 robot. The hardware platform, which undertakes the task of improving SNR and compressing data, is constructed by using the high-speed image processing of FPGA. The lower level image-processing algorithm is realized by combining the FPGA frame and the embedded CPU. The velocity of image processing is accelerated due to the introduction of FPGA and CPU. The usage of the embedded CPU makes it easily to realize the logic design of interface. Some key techniques are presented in the text, such as read-write process, template matching, convolution, and some modules are simulated too. In the end, the compare among the modules using this design, using the PC computer and using the DSP, is carried out. Because the high-speed image processing system core is a chip of FPGA, the function of which can renew conveniently, therefore, to a degree, the measure system is intelligent.
Dedicated hardware processor and corresponding system-on-chip design for real-time laser speckle imaging.

PubMed

Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming

2011-11-01

Laser speckle imaging (LSI) is a noninvasive and full-field optical imaging technique which produces two-dimensional blood flow maps of tissues from the raw laser speckle images captured by a CCD camera without scanning. We present a hardware-friendly algorithm for the real-time processing of laser speckle imaging. The algorithm is developed and optimized specifically for LSI processing in the field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in FPGA. The pipeline processing scheme and parallel computing architecture are introduced into the design of this LSI hardware processor. When the LSI hardware processor is implemented in the FPGA running at the maximum frequency of 130 MHz, up to 85 raw images with the resolution of 640×480 pixels can be processed per second. Meanwhile, we also present a system on chip (SOC) solution for LSI processing by integrating the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution also can be used to produce an application specific integrated circuit for LSI processing.
A hybrid short read mapping accelerator

PubMed Central

2013-01-01

Background The rapid growth of short read datasets poses a new challenge to the short read mapping problem in terms of sensitivity and execution speed. Existing methods often use a restrictive error model for computing the alignments to improve speed, whereas more flexible error models are generally too slow for large-scale applications. A number of short read mapping software tools have been proposed. However, designs based on hardware are relatively rare. Field programmable gate arrays (FPGAs) have been successfully used in a number of specific application areas, such as the DSP and communications domains due to their outstanding parallel data processing capabilities, making them a competitive platform to solve problems that are “inherently parallel”. Results We present a hybrid system for short read mapping utilizing both FPGA-based hardware and CPU-based software. The computation intensive alignment and the seed generation operations are mapped onto an FPGA. We present a computationally efficient, parallel block-wise alignment structure (Align Core) to approximate the conventional dynamic programming algorithm. The performance is compared to the multi-threaded CPU-based GASSST and BWA software implementations. For single-end alignment, our hybrid system achieves faster processing speed than GASSST (with a similar sensitivity) and BWA (with a higher sensitivity); for pair-end alignment, our design achieves a slightly worse sensitivity than that of BWA but has a higher processing speed. Conclusions This paper shows that our hybrid system can effectively accelerate the mapping of short reads to a reference genome based on the seed-and-extend approach. The performance comparison to the GASSST and BWA software implementations under different conditions shows that our hybrid design achieves a high degree of sensitivity and requires less overall execution time with only modest FPGA resource utilization. Our hybrid system design also shows that the performance bottleneck for the short read mapping problem can be changed from the alignment stage to the seed generation stage, which provides an additional requirement for the future development of short read aligners. PMID:23441908
Comparison between Frame-Constrained Fix-Pixel-Value and Frame-Free Spiking-Dynamic-Pixel ConvNets for Visual Processing

PubMed Central

Farabet, Clément; Paz, Rafael; Pérez-Carrasco, Jose; Zamarreño-Ramos, Carlos; Linares-Barranco, Alejandro; LeCun, Yann; Culurciello, Eugenio; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe

2012-01-01

Most scene segmentation and categorization architectures for the extraction of features in images and patches make exhaustive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures that can implement general-purpose bio-inspired vision systems. In standard digital computers 2D convolutions are usually expensive in terms of resource consumption and impose severe limitations for efficient real-time applications. Nevertheless, neuro-cortex inspired solutions, like dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. These two approaches share the neural inspiration, but each of them solves the problem in different ways. Frame-Based ConvNets process frame by frame video information in a very robust and fast way that requires to use and share the available hardware resources (such as: multipliers, adders). Hardware resources are fixed- and time-multiplexed by fetching data in and out. Thus memory bandwidth and size is important for good performance. On the other hand, spike-based convolution processors are a frame-free alternative that is able to perform convolution of a spike-based source of visual information with very low latency, which makes ideal for very high-speed applications. However, hardware resources need to be available all the time and cannot be time-multiplexed. Thus, hardware should be modular, reconfigurable, and expansible. Hardware implementations in both VLSI custom integrated circuits (digital and analog) and FPGA have been already used to demonstrate the performance of these systems. In this paper we present a comparison study of these two neuro-inspired solutions. A brief description of both systems is presented and also discussions about their differences, pros and cons. PMID:22518097
Comparison between Frame-Constrained Fix-Pixel-Value and Frame-Free Spiking-Dynamic-Pixel ConvNets for Visual Processing.

PubMed

Farabet, Clément; Paz, Rafael; Pérez-Carrasco, Jose; Zamarreño-Ramos, Carlos; Linares-Barranco, Alejandro; Lecun, Yann; Culurciello, Eugenio; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe

2012-01-01

Most scene segmentation and categorization architectures for the extraction of features in images and patches make exhaustive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures that can implement general-purpose bio-inspired vision systems. In standard digital computers 2D convolutions are usually expensive in terms of resource consumption and impose severe limitations for efficient real-time applications. Nevertheless, neuro-cortex inspired solutions, like dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. These two approaches share the neural inspiration, but each of them solves the problem in different ways. Frame-Based ConvNets process frame by frame video information in a very robust and fast way that requires to use and share the available hardware resources (such as: multipliers, adders). Hardware resources are fixed- and time-multiplexed by fetching data in and out. Thus memory bandwidth and size is important for good performance. On the other hand, spike-based convolution processors are a frame-free alternative that is able to perform convolution of a spike-based source of visual information with very low latency, which makes ideal for very high-speed applications. However, hardware resources need to be available all the time and cannot be time-multiplexed. Thus, hardware should be modular, reconfigurable, and expansible. Hardware implementations in both VLSI custom integrated circuits (digital and analog) and FPGA have been already used to demonstrate the performance of these systems. In this paper we present a comparison study of these two neuro-inspired solutions. A brief description of both systems is presented and also discussions about their differences, pros and cons.
Design of a system based on DSP and FPGA for video recording and replaying

NASA Astrophysics Data System (ADS)

Kang, Yan; Wang, Heng

2013-08-01

This paper brings forward a video recording and replaying system with the architecture of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA). The system achieved encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals which are displayed on a monitor during airplanes and ships' navigating. In the architecture, the DSP is a main processor which is used for a large amount of complicated calculation during digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is utilized to implement seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) in the system. This transfer mode can avoid the bottle-neck of the data transfer and simplify the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level matching chips are used to implement Advanced Technology Attachment (ATA) protocol on physical layer of the interface of an Integrated Drive Electronics (IDE) Hard Disk (HD), which has a high speed in data access and does not rely on a computer. Main functions of the logic on the FPGA are described and the screenshots of the behavioral simulation are provided in this paper. In the design of program on the DSP, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM to exert the CPU's high performance on computing without intervention by the CPU and save its time spending. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways and means of acquiring high performance for code are briefly present. The ability of data processing of the system is desirable. And smoothness of the replayed video is acceptable. By right of its design flexibility and reliable operation, the system based on DSP and FPGA for video recording and replaying has a considerable perspective in analysis after the event, simulated exercitation and so forth.

Framework for architecture-independent run-time reconfigurable applications

NASA Astrophysics Data System (ADS)

Lehn, David I.; Hudson, Rhett D.; Athanas, Peter M.

2000-10-01

Configurable Computing Machines (CCMs) have emerged as a technology with the computational benefits of custom ASICs as well as the flexibility and reconfigurability of general-purpose microprocessors. Significant effort from the research community has focused on techniques to move this reconfigurability from a rapid application development tool to a run-time tool. This requires the ability to change the hardware design while the application is executing and is known as Run-Time Reconfiguration (RTR). Widespread acceptance of run-time reconfigurable custom computing depends upon the existence of high-level automated design tools. Such tools must reduce the designers effort to port applications between different platforms as the architecture, hardware, and software evolves. A Java implementation of a high-level application framework, called Janus, is presented here. In this environment, developers create Java classes that describe the structural behavior of an application. The framework allows hardware and software modules to be freely mixed and interchanged. A compilation phase of the development process analyzes the structure of the application and adapts it to the target platform. Janus is capable of structuring the run-time behavior of an application to take advantage of the memory and computational resources available.
Evolution of a designless nanoparticle network into reconfigurable Boolean logic

NASA Astrophysics Data System (ADS)

Bose, S. K.; Lawrence, C. P.; Liu, Z.; Makarenko, K. S.; van Damme, R. M. J.; Broersma, H. J.; van der Wiel, W. G.

2015-12-01

Natural computers exploit the emergent properties and massive parallelism of interconnected networks of locally active components. Evolution has resulted in systems that compute quickly and that use energy efficiently, utilizing whatever physical properties are exploitable. Man-made computers, on the other hand, are based on circuits of functional units that follow given design rules. Hence, potentially exploitable physical processes, such as capacitive crosstalk, to solve a problem are left out. Until now, designless nanoscale networks of inanimate matter that exhibit robust computational functionality had not been realized. Here we artificially evolve the electrical properties of a disordered nanomaterials system (by optimizing the values of control voltages using a genetic algorithm) to perform computational tasks reconfigurably. We exploit the rich behaviour that emerges from interconnected metal nanoparticles, which act as strongly nonlinear single-electron transistors, and find that this nanoscale architecture can be configured in situ into any Boolean logic gate. This universal, reconfigurable gate would require about ten transistors in a conventional circuit. Our system meets the criteria for the physical realization of (cellular) neural networks: universality (arbitrary Boolean functions), compactness, robustness and evolvability, which implies scalability to perform more advanced tasks. Our evolutionary approach works around device-to-device variations and the accompanying uncertainties in performance. Moreover, it bears a great potential for more energy-efficient computation, and for solving problems that are very hard to tackle in conventional architectures.
High speed CMOS acquisition system based on FPGA embedded image processing for electro-optical measurements

NASA Astrophysics Data System (ADS)

Rosu-Hamzescu, Mihnea; Polonschii, Cristina; Oprea, Sergiu; Popescu, Dragos; David, Sorin; Bratu, Dumitru; Gheorghiu, Eugen

2018-06-01

Electro-optical measurements, i.e., optical waveguides and plasmonic based electrochemical impedance spectroscopy (P-EIS), are based on the sensitive dependence of refractive index of electro-optical sensors on surface charge density, modulated by an AC electrical field applied to the sensor surface. Recently, P-EIS has emerged as a new analytical tool that can resolve local impedance with high, optical spatial resolution, without using microelectrodes. This study describes a high speed image acquisition and processing system for electro-optical measurements, based on a high speed complementary metal-oxide semiconductor (CMOS) sensor and a field-programmable gate array (FPGA) board. The FPGA is used to configure CMOS parameters, as well as to receive and locally process the acquired images by performing Fourier analysis for each pixel, deriving the real and imaginary parts of the Fourier coefficients for the AC field frequencies. An AC field generator, for single or multi-sine signals, is synchronized with the high speed acquisition system for phase measurements. The system was successfully used for real-time angle-resolved electro-plasmonic measurements from 30 Hz up to 10 kHz, providing results consistent to ones obtained by a conventional electrical impedance approach. The system was able to detect amplitude variations with a relative variation of ±1%, even for rather low sampling rates per period (i.e., 8 samples per period). The PC (personal computer) acquisition and control software allows synchronized acquisition for multiple FPGA boards, making it also suitable for simultaneous angle-resolved P-EIS imaging.
SWARM: A 32 GHz Correlator and VLBI Beamformer for the Submillimeter Array

NASA Astrophysics Data System (ADS)

Primiani, Rurik A.; Young, Kenneth H.; Young, André; Patel, Nimesh; Wilson, Robert W.; Vertatschitsch, Laura; Chitwood, Billie B.; Srinivasan, Ranjani; MacMahon, David; Weintroub, Jonathan

2016-03-01

A 32GHz bandwidth VLBI capable correlator and phased array has been designed and deployeda at the Smithsonian Astrophysical Observatory’s Submillimeter Array (SMA). The SMA Wideband Astronomical ROACH2 Machine (SWARM) integrates two instruments: a correlator with 140kHz spectral resolution across its full 32GHz band, used for connected interferometric observations, and a phased array summer used when the SMA participates as a station in the Event Horizon Telescope (EHT) very long baseline interferometry (VLBI) array. For each SWARM quadrant, Reconfigurable Open Architecture Computing Hardware (ROACH2) units shared under open-source from the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER) are equipped with a pair of ultra-fast analog-to-digital converters (ADCs), a field programmable gate array (FPGA) processor, and eight 10 Gigabit Ethernet (GbE) ports. A VLBI data recorder interface designated the SWARM digital back end, or SDBE, is implemented with a ninth ROACH2 per quadrant, feeding four Mark6 VLBI recorders with an aggregate recording rate of 64 Gbps. This paper describes the design and implementation of SWARM, as well as its deployment at SMA with reference to verification and science data.
Advancing interconnect density for spiking neural network hardware implementations using traffic-aware adaptive network-on-chip routers.

PubMed

Carrillo, Snaider; Harkin, Jim; McDaid, Liam; Pande, Sandeep; Cawley, Seamus; McGinley, Brian; Morgan, Fearghal

2012-09-01

The brain is highly efficient in how it processes information and tolerates faults. Arguably, the basic processing units are neurons and synapses that are interconnected in a complex pattern. Computer scientists and engineers aim to harness this efficiency and build artificial neural systems that can emulate the key information processing principles of the brain. However, existing approaches cannot provide the dense interconnect for the billions of neurons and synapses that are required. Recently a reconfigurable and biologically inspired paradigm based on network-on-chip (NoC) and spiking neural networks (SNNs) has been proposed as a new method of realising an efficient, robust computing platform. However, the use of the NoC as an interconnection fabric for large-scale SNNs demands a good trade-off between scalability, throughput, neuron/synapse ratio and power consumption. This paper presents a novel traffic-aware, adaptive NoC router, which forms part of a proposed embedded mixed-signal SNN architecture called EMBRACE (EMulating Biologically-inspiRed ArChitectures in hardwarE). The proposed adaptive NoC router provides the inter-neuron connectivity for EMBRACE, maintaining router communication and avoiding dropped router packets by adapting to router traffic congestion. Results are presented on throughput, power and area performance analysis of the adaptive router using a 90 nm CMOS technology which outperforms existing NoCs in this domain. The adaptive behaviour of the router is also verified on a Stratix II FPGA implementation of a 4 × 2 router array with real-time traffic congestion. The presented results demonstrate the feasibility of using the proposed adaptive NoC router within the EMBRACE architecture to realise large-scale SNNs on embedded hardware. Copyright © 2012 Elsevier Ltd. All rights reserved.
FPGA Implementation of Stereo Disparity with High Throughput for Mobility Applications

NASA Technical Reports Server (NTRS)

Villalpando, Carlos Y.; Morfopolous, Arin; Matthies, Larry; Goldberg, Steven

2011-01-01

High speed stereo vision can allow unmanned robotic systems to navigate safely in unstructured terrain, but the computational cost can exceed the capacity of typical embedded CPUs. In this paper, we describe an end-to-end stereo computation co-processing system optimized for fast throughput that has been implemented on a single Virtex 4 LX160 FPGA. This system is capable of operating on images from a 1024 x 768 3CCD (true RGB) camera pair at 15 Hz. Data enters the FPGA directly from the cameras via Camera Link and is rectified, pre-filtered and converted into a disparity image all within the FPGA, incurring no CPU load. Once complete, a rectified image and the final disparity image are read out over the PCI bus, for a bandwidth cost of 68 MB/sec. Within the FPGA there are 4 distinct algorithms: Camera Link capture, Bilinear rectification, Bilateral subtraction pre-filtering and the Sum of Absolute Difference (SAD) disparity. Each module will be described in brief along with the data flow and control logic for the system. The system has been successfully fielded upon the Carnegie Mellon University's National Robotics Engineering Center (NREC) Crusher system during extensive field trials in 2007 and 2008 and is being implemented for other surface mobility systems at JPL.
Survey of reconfigurable architectures for multimedia applications

NASA Astrophysics Data System (ADS)

Cervero, T.; López, S.; Callicó, G. M.; Tobajas, F.; de Armas, V.; López, J.; Sarmiento, R.

2009-05-01

In a short period of time, the multimedia sector has quickly progressed trying to overcome the exigencies of the customers in terms of transfer speeds, storage memory, image quality, and functionalities. In order to cope with this stringent situation, different hardware devices have been developed as possible choices. Despite of the fact that not every device is apt for implementing the high computational demands associated to multimedia applications; reconfigurable architectures appear as ideal candidates to achieve these necessities. As a direct consequence, worldwide universities and industries have incremented their research activity into this area, generating an important know-how base. In order to sort all the information generated about this issue, this paper reviews the most recent reconfigurable architectures for multimedia applications. As a result, this paper establishes the benefits and drawbacks of the different dynamically reconfigurable architectures for multimedia applications according to their system-level design.
Technology Developments in Radiation-Hardened Electronics for Space Environments

NASA Technical Reports Server (NTRS)

Keys, Andrew S.; Howell, Joe T.

2008-01-01

The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS, Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches. System level applications for the RHESE technology products are discussed.
A phase-based stereo vision system-on-a-chip.

PubMed

Díaz, Javier; Ros, Eduardo; Sabatini, Silvio P; Solari, Fabio; Mota, Sonia

2007-02-01

A simple and fast technique for depth estimation based on phase measurement has been adopted for the implementation of a real-time stereo system with sub-pixel resolution on an FPGA device. The technique avoids the attendant problem of phase warping. The designed system takes full advantage of the inherent processing parallelism and segmentation capabilities of FPGA devices to achieve a computation speed of 65megapixels/s, which can be arranged with a customized frame-grabber module to process 211frames/s at a size of 640x480 pixels. The processing speed achieved is higher than conventional camera frame rates, thus allowing the system to extract multiple estimations and be used as a platform to evaluate integration schemes of a population of neurons without increasing hardware resource demands.
Parallel algorithm for computation of second-order sequential best rotations

NASA Astrophysics Data System (ADS)

Redif, Soydan; Kasap, Server

2013-12-01

Algorithms for computing an approximate polynomial matrix eigenvalue decomposition of para-Hermitian systems have emerged as a powerful, generic signal processing tool. A technique that has shown much success in this regard is the sequential best rotation (SBR2) algorithm. Proposed is a scheme for parallelising SBR2 with a view to exploiting the modern architectural features and inherent parallelism of field-programmable gate array (FPGA) technology. Experiments show that the proposed scheme can achieve low execution times while requiring minimal FPGA resources.
Reconfigurable Optical Interconnections Via Dynamic Computer-Generated Holograms

NASA Technical Reports Server (NTRS)

Liu, Hua-Kuang (Inventor); Zhou, Shao-Min (Inventor)

1996-01-01

A system is presented for optically providing one-to-many irregular interconnections, and strength-adjustable many-to-many irregular interconnections which may be provided with strengths (weights) w(sub ij) using multiple laser beams which address multiple holograms and means for combining the beams modified by the holograms to form multiple interconnections, such as a cross-bar switching network. The optical means for interconnection is based on entering a series of complex computer-generated holograms on an electrically addressed spatial light modulator for real-time reconfigurations, thus providing flexibility for interconnection networks for large-scale practical use. By employing multiple sources and holograms, the number of interconnection patterns achieved is increased greatly.
A CMOS high speed imaging system design based on FPGA

NASA Astrophysics Data System (ADS)

Tang, Hong; Wang, Huawei; Cao, Jianzhong; Qiao, Mingrui

2015-10-01

CMOS sensors have more advantages than traditional CCD sensors. The imaging system based on CMOS has become a hot spot in research and development. In order to achieve the real-time data acquisition and high-speed transmission, we design a high-speed CMOS imaging system on account of FPGA. The core control chip of this system is XC6SL75T and we take advantages of CameraLink interface and AM41V4 CMOS image sensors to transmit and acquire image data. AM41V4 is a 4 Megapixel High speed 500 frames per second CMOS image sensor with global shutter and 4/3" optical format. The sensor uses column parallel A/D converters to digitize the images. The CameraLink interface adopts DS90CR287 and it can convert 28 bits of LVCMOS/LVTTL data into four LVDS data stream. The reflected light of objects is photographed by the CMOS detectors. CMOS sensors convert the light to electronic signals and then send them to FPGA. FPGA processes data it received and transmits them to upper computer which has acquisition cards through CameraLink interface configured as full models. Then PC will store, visualize and process images later. The structure and principle of the system are both explained in this paper and this paper introduces the hardware and software design of the system. FPGA introduces the driven clock of CMOS. The data in CMOS is converted to LVDS signals and then transmitted to the data acquisition cards. After simulation, the paper presents a row transfer timing sequence of CMOS. The system realized real-time image acquisition and external controls.
An Efficient Pipeline Wavefront Phase Recovery for the CAFADIS Camera for Extremely Large Telescopes

PubMed Central

Magdaleno, Eduardo; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel

2010-01-01

In this paper we show a fast, specialized hardware implementation of the wavefront phase recovery algorithm using the CAFADIS camera. The CAFADIS camera is a new plenoptic sensor patented by the Universidad de La Laguna (Canary Islands, Spain): international patent PCT/ES2007/000046 (WIPO publication number WO/2007/082975). It can simultaneously measure the wavefront phase and the distance to the light source in a real-time process. The pipeline algorithm is implemented using Field Programmable Gate Arrays (FPGA). These devices present architecture capable of handling the sensor output stream using a massively parallel approach and they are efficient enough to resolve several Adaptive Optics (AO) problems in Extremely Large Telescopes (ELTs) in terms of processing time requirements. The FPGA implementation of the wavefront phase recovery algorithm using the CAFADIS camera is based on the very fast computation of two dimensional fast Fourier Transforms (FFTs). Thus we have carried out a comparison between our very novel FPGA 2D-FFTa and other implementations. PMID:22315523
An efficient pipeline wavefront phase recovery for the CAFADIS camera for extremely large telescopes.

PubMed

Magdaleno, Eduardo; Rodríguez, Manuel; Rodríguez-Ramos, José Manuel

2010-01-01

In this paper we show a fast, specialized hardware implementation of the wavefront phase recovery algorithm using the CAFADIS camera. The CAFADIS camera is a new plenoptic sensor patented by the Universidad de La Laguna (Canary Islands, Spain): international patent PCT/ES2007/000046 (WIPO publication number WO/2007/082975). It can simultaneously measure the wavefront phase and the distance to the light source in a real-time process. The pipeline algorithm is implemented using Field Programmable Gate Arrays (FPGA). These devices present architecture capable of handling the sensor output stream using a massively parallel approach and they are efficient enough to resolve several Adaptive Optics (AO) problems in Extremely Large Telescopes (ELTs) in terms of processing time requirements. The FPGA implementation of the wavefront phase recovery algorithm using the CAFADIS camera is based on the very fast computation of two dimensional fast Fourier Transforms (FFTs). Thus we have carried out a comparison between our very novel FPGA 2D-FFTa and other implementations.
Implementing Scientific Simulation Codes Highly Tailored for Vector Architectures Using Custom Configurable Computing Machines

NASA Technical Reports Server (NTRS)

Rutishauser, David

2006-01-01

The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.
Net-aware bitstreams that upgrade FPGA hardware remotely over the Internet: creating intelligent bitstreams that know where to go, what to do when they get there, and can report back when they're done

NASA Astrophysics Data System (ADS)

Casselman, Steve; Schewel, John

2002-07-01

Success in the marketplace may well depend upon the ability to upgrade and test hardware designs instantly around the world. An upgrade management strategy requires more than just the bitstream file, email or a JTAG cable. A well-managed methodology, capable of transmitting bitstreams directly into targeted FPGAs over the network or internet is an essential element for a successful FPGA based product strategy. Virtual Computer Corporation"s HOTMan, Bitstream Management Environment combines a feature rich cross-platform API with an Object Oriented Bitstream technique for Remote Upgrading of Hardware over the Internet.
Design of an FPGA-Based Algorithm for Real-Time Solutions of Statistics-Based Positioning

PubMed Central

DeWitt, Don; Johnson-Williams, Nathan G.; Miyaoka, Robert S.; Li, Xiaoli; Lockhart, Cate; Lewellen, Tom K.; Hauck, Scott

2010-01-01

We report on the implementation of an algorithm and hardware platform to allow real-time processing of the statistics-based positioning (SBP) method for continuous miniature crystal element (cMiCE) detectors. The SBP method allows an intrinsic spatial resolution of ~1.6 mm FWHM to be achieved using our cMiCE design. Previous SBP solutions have required a postprocessing procedure due to the computation and memory intensive nature of SBP. This new implementation takes advantage of a combination of algebraic simplifications, conversion to fixed-point math, and a hierarchal search technique to greatly accelerate the algorithm. For the presented seven stage, 127 × 127 bin LUT implementation, these algorithm improvements result in a reduction from >7 × 106 floating-point operations per event for an exhaustive search to < 5 × 103 integer operations per event. Simulations show nearly identical FWHM positioning resolution for this accelerated SBP solution, and positioning differences of <0.1 mm from the exhaustive search solution. A pipelined field programmable gate array (FPGA) implementation of this optimized algorithm is able to process events in excess of 250 K events per second, which is greater than the maximum expected coincidence rate for an individual detector. In contrast with all detectors being processed at a centralized host, as in the current system, a separate FPGA is available at each detector, thus dividing the computational load. These methods allow SBP results to be calculated in real-time and to be presented to the image generation components in real-time. A hardware implementation has been developed using a commercially available prototype board. PMID:21197135
Implementing a Digital Phasemeter in an FPGA

NASA Technical Reports Server (NTRS)

Rao, Shanti R.

2008-01-01

Firmware for implementing a digital phasemeter within a field-programmable gate array (FPGA) has been devised. In the original application of this firmware, the phase that one seeks to measure is the difference between the phases of two nominally-equal-frequency heterodyne signals generated by two interferometers. In that application, zero-crossing detectors convert the heterodyne signals to trains of rectangular pulses, the two pulse trains are fed to a fringe counter (the major part of the phasemeter) controlled by a clock signal having a frequency greater than the heterodyne frequency, and the fringe counter computes a time-averaged estimate of the difference between the phases of the two pulse trains. The firmware also does the following: Causes the FPGA to compute the frequencies of the input signals; Causes the FPGA to implement an Ethernet (or equivalent) transmitter for readout of phase and frequency values; and Provides data for use in diagnosis of communication failures. The readout rate can be set, by programming, to a value between 250 Hz and 1 kHz. Network addresses can be programmed by the user.
A natural-color mapping for single-band night-time image based on FPGA

NASA Astrophysics Data System (ADS)

Wang, Yilun; Qian, Yunsheng

2018-01-01

A natural-color mapping for single-band night-time image method based on FPGA can transmit the color of the reference image to single-band night-time image, which is consistent with human visual habits and can help observers identify the target. This paper introduces the processing of the natural-color mapping algorithm based on FPGA. Firstly, the image can be transformed based on histogram equalization, and the intensity features and standard deviation features of reference image are stored in SRAM. Then, the real-time digital images' intensity features and standard deviation features are calculated by FPGA. At last, FPGA completes the color mapping through matching pixels between images using the features in luminance channel.
Airborne Advanced Reconfigurable Computer System (ARCS)

NASA Technical Reports Server (NTRS)

Bjurman, B. E.; Jenkins, G. M.; Masreliez, C. J.; Mcclellan, K. L.; Templeman, J. E.

1976-01-01

A digital computer subsystem fault-tolerant concept was defined, and the potential benefits and costs of such a subsystem were assessed when used as the central element of a new transport's flight control system. The derived advanced reconfigurable computer system (ARCS) is a triple-redundant computer subsystem that automatically reconfigures, under multiple fault conditions, from triplex to duplex to simplex operation, with redundancy recovery if the fault condition is transient. The study included criteria development covering factors at the aircraft's operation level that would influence the design of a fault-tolerant system for commercial airline use. A new reliability analysis tool was developed for evaluating redundant, fault-tolerant system availability and survivability; and a stringent digital system software design methodology was used to achieve design/implementation visibility.

Redundancy management for efficient fault recovery in NASA's distributed computing system

NASA Technical Reports Server (NTRS)

Malek, Miroslaw; Pandya, Mihir; Yau, Kitty

1991-01-01

The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
Field Programmable Gate Array Based Parallel Strapdown Algorithm Design for Strapdown Inertial Navigation Systems

PubMed Central

Li, Zong-Tao; Wu, Tie-Jun; Lin, Can-Long; Ma, Long-Hua

2011-01-01

A new generalized optimum strapdown algorithm with coning and sculling compensation is presented, in which the position, velocity and attitude updating operations are carried out based on the single-speed structure in which all computations are executed at a single updating rate that is sufficiently high to accurately account for high frequency angular rate and acceleration rectification effects. Different from existing algorithms, the updating rates of the coning and sculling compensations are unrelated with the number of the gyro incremental angle samples and the number of the accelerometer incremental velocity samples. When the output sampling rate of inertial sensors remains constant, this algorithm allows increasing the updating rate of the coning and sculling compensation, yet with more numbers of gyro incremental angle and accelerometer incremental velocity in order to improve the accuracy of system. Then, in order to implement the new strapdown algorithm in a single FPGA chip, the parallelization of the algorithm is designed and its computational complexity is analyzed. The performance of the proposed parallel strapdown algorithm is tested on the Xilinx ISE 12.3 software platform and the FPGA device XC6VLX550T hardware platform on the basis of some fighter data. It is shown that this parallel strapdown algorithm on the FPGA platform can greatly decrease the execution time of algorithm to meet the real-time and high precision requirements of system on the high dynamic environment, relative to the existing implemented on the DSP platform. PMID:22164058
Single Event Effects Test Results for Advanced Field Programmable Gate Arrays

NASA Technical Reports Server (NTRS)

Allen, Gregory R.; Swift, Gary M.

2006-01-01

Reconfigurable Field Programmable Gate Arrays (FPGAs) from Altera and Actel and an FPGA-based quick-turnApplication Specific Integrated Circuit (ASIC) from Altera were subjected to single-event testing using heavy ions. Both Altera devices (Stratix II and HardCopy II) exhibited a low latchup threshold (below an LET of 3 MeV-cm2/mg) and thus are not recommended for applications in the space radiation environment. The flash-based Actel ProASIC Plus device did not exhibit latchup to an effective LET of 75 MeV-cm2/mg at room temperature. In addition, these tests did not show flash cell charge loss (upset) or retention damage. Upset characterization of the design-level flip-flops yielded an LET threshold below 10 MeV-cm2/mg and a high LET cross section of about lxlO-6 cm2/bit for storing ones and about lxl0-7 cm2/bit for storing zeros . Thus, the ProASIC device may be suitable for critical flight applications with appropriate triple modular redundancy mitigation techniques.
Multinode reconfigurable pipeline computer

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M. (Inventor); Littman, Michael G. (Inventor)

1989-01-01

A multinode parallel-processing computer is made up of a plurality of innerconnected, large capacity nodes each including a reconfigurable pipeline of functional units such as Integer Arithmetic Logic Processors, Floating Point Arithmetic Processors, Special Purpose Processors, etc. The reconfigurable pipeline of each node is connected to a multiplane memory by a Memory-ALU switch NETwork (MASNET). The reconfigurable pipeline includes three (3) basic substructures formed from functional units which have been found to be sufficient to perform the bulk of all calculations. The MASNET controls the flow of signals from the memory planes to the reconfigurable pipeline and vice versa. the nodes are connectable together by an internode data router (hyperspace router) so as to form a hypercube configuration. The capability of the nodes to conditionally configure the pipeline at each tick of the clock, without requiring a pipeline flush, permits many powerful algorithms to be implemented directly.
Field-programmable beam reconfiguring based on digitally-controlled coding metasurface

NASA Astrophysics Data System (ADS)

Wan, Xiang; Qi, Mei Qing; Chen, Tian Yi; Cui, Tie Jun

2016-02-01

Digital phase shifters have been applied in traditional phased array antennas to realize beam steering. However, the phase shifter deals with the phase of the induced current; hence, it has to be in the path of each element of the antenna array, making the phased array antennas very expensive. Metamaterials and/or metasurfaces enable the direct modulation of electromagnetic waves by designing subwavelength structures, which opens a new way to control the beam scanning. Here, we present a direct digital mechanism to control the scattered electromagnetic waves using coding metasurface, in which each unit cell loads a pin diode to produce binary coding states of “1” and “0”. Through data lines, the instant communications are established between the coding metasurface and the internal memory of field-programmable gate arrays (FPGA). Thus, we realize the digital modulation of electromagnetic waves, from which we present the field-programmable reflective antenna with good measurement performance. The proposed mechanism and functional device have great application potential in new-concept radar and communication systems.
A reconfigurable, wearable, wireless ECG system.

PubMed

Borromeo, S; Rodriguez-Sanchez, C; Machado, F; Hernandez-Tamames, J A; de la Prieta, R

2007-01-01

New emerging concepts as "wireless hospital", "mobile healthcare" or "wearable telemonitoring" require the development of bio-signal acquisition devices to be easily integrated into the clinical routine. In this work, we present a new system for Electrocardiogram (ECG) acquisition and its processing, with wireless transmission on demand (either the complete ECG or only one alarm message, just in case a pathological heart rate detected). Size and power consumption are optimized in order to provide mobility and comfort to the patient. We have designed a modular hardware system and an autonomous platform based on a Field-Programmable Gate Array (FPGA) for developing and debugging. The modular approach allows to redesign the system in an easy way. Its adaptation to a new biomedical signal would only need small changes on it. The hardware system is composed of three layers that can be plugged/unplugged: communication layer, processing layer and sensor layer. In addition, we also present a general purpose end-user application developed for mobile phones or Personal Digital Assistant devices (PDAs).
A modularized pulse programmer for NMR spectroscopy

NASA Astrophysics Data System (ADS)

Mao, Wenping; Bao, Qingjia; Yang, Liang; Chen, Yiqun; Liu, Chaoyang; Qiu, Jianqing; Ye, Chaohui

2011-02-01

A modularized pulse programmer for a NMR spectrometer is described. It consists of a networked PCI-104 single-board computer and a field programmable gate array (FPGA). The PCI-104 is dedicated to translate the pulse sequence elements from the host computer into 48-bit binary words and download these words to the FPGA, while the FPGA functions as a sequencer to execute these binary words. High-resolution NMR spectra obtained on a home-built spectrometer with four pulse programmers working concurrently demonstrate the effectiveness of the pulse programmer. Advantages of the module include (1) once designed it can be duplicated and used to construct a scalable NMR/MRI system with multiple transmitter and receiver channels, (2) it is a totally programmable system in which all specific applications are determined by software, and (3) it provides enough reserve for possible new pulse sequences.
Definition and trade-off study of reconfigurable airborne digital computer system organizations

NASA Technical Reports Server (NTRS)

Conn, R. B.

1974-01-01

A highly-reliable, fault-tolerant reconfigurable computer system for aircraft applications was developed. The development and application reliability and fault-tolerance assessment techniques are described. Particular emphasis is placed on the needs of an all-digital, fly-by-wire control system appropriate for a passenger-carrying airplane.
MicroTCA-based Global Trigger Upgrade project for the CMS experiment at LHC

NASA Astrophysics Data System (ADS)

Rahbaran, B.; Arnold, B.; Bergauer, H.; Eichberger, M.; Rabady, D.

2011-12-01

The electronics of the first Level Global Trigger (GT) of CMS is the last stage of the Level-1 trigger system [1]. At LHC up to 40 million collisions of proton bunches occur every second, resulting in about 800 million proton collisions. The CMS Level-1 Global Trigger [1], a custom designed electronics system based on FPGA technology and the VMEbus system, performs a quick on-line analysis of each collision every 25 ns and decides whether to reject or to accept it for further analysis. The CMS trigger group of the Institute of High Energy Physics in Vienna (HEPHY) is involved in the Level-1 trigger of the CMS experiment at CERN. As part of the Trigger Upgrade, the Level-1 Global Trigger will be redesigned and implemented in MicroTCA based technology, which allows engineers to detect all possible faults on plug-in boards, in the power supply and in the cooling system. The upgraded Global Trigger will be designed to have the same basic categories of functions as the present GT, but will have more algorithms and more possibilities for combining trigger candidates. Additionally, reconfigurability and testability will be supported based on the next system generation.
A digitalized silicon microgyroscope based on embedded FPGA.

PubMed

Xia, Dunzhu; Yu, Cheng; Wang, Yuliang

2012-09-27

This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system.
A Digitalized Silicon Microgyroscope Based on Embedded FPGA

PubMed Central

Xia, Dunzhu; Yu, Cheng; Wang, Yuliang

2012-01-01

This paper presents a novel digital miniaturization method for a prototype silicon micro-gyroscope (SMG) with the symmetrical and decoupled structure. The schematic blocks of the overall system consist of high precision analog front-end interface, high-speed 18-bit analog to digital convertor, a high-performance core Field Programmable Gate Array (FPGA) chip and other peripherals such as high-speed serial ports for transmitting data. In drive mode, the closed-loop drive circuit are implemented by automatic gain control (AGC) loop and software phase-locked loop (SPLL) based on the Coordinated Rotation Digital Computer (CORDIC) algorithm. Meanwhile, the sense demodulation module based on varying step least mean square demodulation (LMSD) are addressed in detail. All kinds of algorithms are simulated by Simulink and DSPbuilder tools, which is in good agreement with the theoretical design. The experimental results have fully demonstrated the stability and flexibility of the system. PMID:23201990
Low-SWaP coincidence processing for Geiger-mode LIDAR video

NASA Astrophysics Data System (ADS)

Schultz, Steven E.; Cervino, Noel P.; Kurtz, Zachary D.; Brown, Myron Z.

2015-05-01

Photon-counting Geiger-mode lidar detector arrays provide a promising approach for producing three-dimensional (3D) video at full motion video (FMV) data rates, resolution, and image size from long ranges. However, coincidence processing required to filter raw photon counts is computationally expensive, generally requiring significant size, weight, and power (SWaP) and also time. In this paper, we describe a laboratory test-bed developed to assess the feasibility of low-SWaP, real-time processing for 3D FMV based on Geiger-mode lidar. First, we examine a design based on field programmable gate arrays (FPGA) and demonstrate proof-of-concept results. Then we examine a design based on a first-of-its-kind embedded graphical processing unit (GPU) and compare performance with the FPGA. Results indicate feasibility of real-time Geiger-mode lidar processing for 3D FMV and also suggest utility for real-time onboard processing for mapping lidar systems.
FPGA based data processing in the ALICE High Level Trigger in LHC Run 2

NASA Astrophysics Data System (ADS)

Engel, Heiko; Alt, Torsten; Kebschull, Udo; ALICE Collaboration

2017-10-01

The ALICE High Level Trigger (HLT) is a computing cluster dedicated to the online compression, reconstruction and calibration of experimental data. The HLT receives detector data via serial optical links into FPGA based readout boards that process the data on a per-link level already inside the FPGA and provide it to the host machines connected with a data transport framework. FPGA based data pre-processing is enabled for the biggest detector of ALICE, the Time Projection Chamber (TPC), with a hardware cluster finding algorithm. This algorithm was ported to the Common Read-Out Receiver Card (C-RORC) as used in the HLT for RUN 2. It was improved to handle double the input bandwidth and adjusted to the upgraded TPC Readout Control Unit (RCU2). A flexible firmware implementation in the HLT handles both the old and the new TPC data format and link rates transparently. Extended protocol and data error detection, error handling and the enhanced RCU2 data ordering scheme provide an improved physics performance of the cluster finder. The performance of the cluster finder was verified against large sets of reference data both in terms of throughput and algorithmic correctness. Comparisons with a software reference implementation confirm significant savings on CPU processing power using the hardware implementation. The C-RORC hardware with the cluster finder for RCU1 data is in use in the HLT since the start of RUN 2. The extended hardware cluster finder implementation for the RCU2 with doubled throughput is active since the upgrade of the TPC readout electronics in early 2016.
Acceleration of fluoro-CT reconstruction for a mobile C-Arm on GPU and FPGA hardware: a simulation study

NASA Astrophysics Data System (ADS)

Xue, Xinwei; Cheryauka, Arvi; Tubbs, David

2006-03-01

CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operational room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain of computational intensity with comparable hardware and program coding/testing expenses. In this paper, using a sample 2D and 3D CT problem, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms: a single CPU, a single GPU, and a solution based on FPGA technology have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with similar levels of noise and clarity of feature when compared to program execution on a CPU, but gaining a performance increase at one or more orders of magnitude faster. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.
Implementation of High Speed Distributed Data Acquisition System

NASA Astrophysics Data System (ADS)

Raju, Anju P.; Sekhar, Ambika

2012-09-01

This paper introduces a high speed distributed data acquisition system based on a field programmable gate array (FPGA). The aim is to develop a "distributed" data acquisition interface. The development of instruments such as personal computers and engineering workstations based on "standard" platforms is the motivation behind this effort. Using standard platforms as the controlling unit allows independence in hardware from a particular vendor and hardware platform. The distributed approach also has advantages from a functional point of view: acquisition resources become available to multiple instruments; the acquisition front-end can be physically remote from the rest of the instrument. High speed data acquisition system transmits data faster to a remote computer system through Ethernet interface. The data is acquired through 16 analog input channels. The input data commands are multiplexed and digitized and then the data is stored in 1K buffer for each input channel. The main control unit in this design is the 16 bit processor implemented in the FPGA. This 16 bit processor is used to set up and initialize the data source and the Ethernet controller, as well as control the flow of data from the memory element to the NIC. Using this processor we can initialize and control the different configuration registers in the Ethernet controller in a easy manner. Then these data packets are sending to the remote PC through the Ethernet interface. The main advantages of the using FPGA as standard platform are its flexibility, low power consumption, short design duration, fast time to market, programmability and high density. The main advantages of using Ethernet controller AX88796 over others are its non PCI interface, the presence of embedded SRAM where transmit and reception buffers are located and high-performance SRAM-like interface. The paper introduces the implementation of the distributed data acquisition using FPGA by VHDL. The main advantages of this system are high accuracy, high speed, real time monitoring.
Upper and lower bounds for semi-Markov reliability models of reconfigurable systems

NASA Technical Reports Server (NTRS)

White, A. L.

1984-01-01

This paper determines the information required about system recovery to compute the reliability of a class of reconfigurable systems. Upper and lower bounds are derived for these systems. The class consists of those systems that satisfy five assumptions: the components fail independently at a low constant rate, fault occurrence and system reconfiguration are independent processes, the reliability model is semi-Markov, the recovery functions which describe system configuration have small means and variances, and the system is well designed. The bounds are easy to compute, and examples are included.
DSP+FPGA-based real-time histogram equalization system of infrared image

NASA Astrophysics Data System (ADS)

Gu, Dongsheng; Yang, Nansheng; Pi, Defu; Hua, Min; Shen, Xiaoyan; Zhang, Ruolan

2001-10-01

Histogram Modification is a simple but effective method to enhance an infrared image. There are several methods to equalize an infrared image's histogram due to the different characteristics of the different infrared images, such as the traditional HE (Histogram Equalization) method, and the improved HP (Histogram Projection) and PE (Plateau Equalization) method and so on. If to realize these methods in a single system, the system must have a mass of memory and extremely fast speed. In our system, we introduce a DSP + FPGA based real-time procession technology to do these things together. FPGA is used to realize the common part of these methods while DSP is to do the different part. The choice of methods and the parameter can be input by a keyboard or a computer. By this means, the function of the system is powerful while it is easy to operate and maintain. In this article, we give out the diagram of the system and the soft flow chart of the methods. And at the end of it, we give out the infrared image and its histogram before and after the process of HE method.
FPGA Implementation of Metastability-Based True Random Number Generator

NASA Astrophysics Data System (ADS)

Hata, Hisashi; Ichikawa, Shuichi

True random number generators (TRNGs) are important as a basis for computer security. Though there are some TRNGs composed of analog circuit, the use of digital circuits is desired for the application of TRNGs to logic LSIs. Some of the digital TRNGs utilize jitter in free-running ring oscillators as a source of entropy, which consume large power. Another type of TRNG exploits the metastability of a latch to generate entropy. Although this kind of TRNG has been mostly implemented with full-custom LSI technology, this study presents an implementation based on common FPGA technology. Our TRNG is comprised of logic gates only, and can be integrated in any kind of logic LSI. The RS latch in our TRNG is implemented as a hard-macro to guarantee the quality of randomness by minimizing the signal skew and load imbalance of internal nodes. To improve the quality and throughput, the output of 64-256 latches are XOR'ed. The derived design was verified on a Xilinx Virtex-4 FPGA (XC4VFX20), and passed NIST statistical test suite without post-processing. Our TRNG with 256 latches occupies 580 slices, while achieving 12.5Mbps throughput.
FPGA based demodulation of laser induced fluorescence in plasmas

NASA Astrophysics Data System (ADS)

Mattingly, Sean W.; Skiff, Fred

2018-04-01

We present a field programmable gate array (FPGA)-based system that counts photons from laser-induced fluorescence (LIF) on a laboratory plasma. This is accomplished with FPGA-based up/down counters that demodulate the data, giving a background-subtracted LIF signal stream that is updated with a new point as each laser amplitude modulation cycle completes. We demonstrate using the FPGA to modulate a laser at 1 MHz and demodulate the resulting LIF data stream. This data stream is used to calculate an LIF-based measurement sampled at 1 MHz of a plasma ion fluctuation spectrum.
Implementation in an FPGA circuit of Edge detection algorithm based on the Discrete Wavelet Transforms

NASA Astrophysics Data System (ADS)

Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia

2017-07-01

The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many imaging systems in real time. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters on the three directions (Horizontal, Vertical and Diagonal) of the image is used to present the maximum of the existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Sparten6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.

A minimal SATA III Host Controller based on FPGA

NASA Astrophysics Data System (ADS)

Liu, Hailiang

2018-03-01

SATA (Serial Advanced Technology Attachment) is an advanced serial bus which has a outstanding performance in transmitting high speed real-time data applied in Personal Computers, Financial Industry, astronautics and aeronautics, etc. In this express, a minimal SATA III Host Controller based on Xilinx Kintex 7 serial FPGA is designed and implemented. Compared to the state-of-art, registers utilization are reduced 25.3% and LUTs utilization are reduced 65.9%. According to the experimental results, the controller works precisely and steady with the reading bandwidth of up to 536 MB per second and the writing bandwidth of up to 512 MB per second, both of which are close to the maximum bandwidth of the SSD(Solid State Disk) device. The host controller is very suitable for high speed data transmission and mass data storage.
A high-speed DAQ framework for future high-level trigger and event building clusters

NASA Astrophysics Data System (ADS)

Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.

2017-03-01

Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger system. In this paper the heterogeneous FPGA-GPU architecture will be presented and its performance be discussed.
Radiation Hardened 10BASE-T Ethernet Physical Layer (PHY)

NASA Technical Reports Server (NTRS)

Lin, Michael R. (Inventor); Petrick, David J. (Inventor); Ballou, Kevin M. (Inventor); Espinosa, Daniel C. (Inventor); James, Edward F. (Inventor); Kliesner, Matthew A. (Inventor)

2017-01-01

Embodiments may provide a radiation hardened 10BASE-T Ethernet interface circuit suitable for space flight and in compliance with the IEEE 802.3 standard for Ethernet. The various embodiments may provide a 10BASE-T Ethernet interface circuit, comprising a field programmable gate array (FPGA), a transmitter circuit connected to the FPGA, a receiver circuit connected to the FPGA, and a transformer connected to the transmitter circuit and the receiver circuit. In the various embodiments, the FPGA, transmitter circuit, receiver circuit, and transformer may be radiation hardened.
Multichannel FPGA-Based Data-Acquisition-System for Time-Resolved Synchrotron Radiation Experiments

NASA Astrophysics Data System (ADS)

Choe, Hyeokmin; Gorfman, Semen; Heidbrink, Stefan; Pietsch, Ullrich; Vogt, Marco; Winter, Jens; Ziolkowski, Michael

2017-06-01

The aim of this contribution is to describe our recent development of a novel compact field-programmable gatearray (FPGA)-based data acquisition (DAQ) system for use with multichannel X-ray detectors at synchrotron radiation facilities. The system is designed for time resolved counting of single photons arriving from several-currently 12-independent detector channels simultaneously. Detector signals of at least 2.8 ns duration are latched by asynchronous logic and then synchronized with the system clock of 100 MHz. The incoming signals are subsequently sorted out into 10 000 time-bins where they are counted. This occurs according to the arrival time of photons with respect to the trigger signal. Repeatable mode of triggered operation is used to achieve high statistic of accumulated counts. The time-bin width is adjustable from 10 ns to 1 ms. In addition, a special mode of operation with 2 ns time resolution is provided for two detector channels. The system is implemented in a pocketsize FPGA-based hardware of 10 cm × 10 cm × 3 cm and thus can easily be transported between synchrotron radiation facilities. For setup of operation and data read-out, the hardware is connected via USB interface to a portable control computer. DAQ applications are provided in both LabVIEW and MATLAB environments.
Design of polarization imaging system based on CIS and FPGA

NASA Astrophysics Data System (ADS)

Zeng, Yan-an; Liu, Li-gang; Yang, Kun-tao; Chang, Da-ding

2008-02-01

As polarization is an important characteristic of light, polarization image detecting is a new image detecting technology of combining polarimetric and image processing technology. Contrasting traditional image detecting in ray radiation, polarization image detecting could acquire a lot of very important information which traditional image detecting couldn't. Polarization image detecting will be widely used in civilian field and military field. As polarization image detecting could resolve some problem which couldn't be resolved by traditional image detecting, it has been researched widely around the world. The paper introduces polarization image detecting in physical theory at first, then especially introduces image collecting and polarization image process based on CIS (CMOS image sensor) and FPGA. There are two parts including hardware and software for polarization imaging system. The part of hardware include drive module of CMOS image sensor, VGA display module, SRAM access module and the real-time image data collecting system based on FPGA. The circuit diagram and PCB was designed. Stokes vector and polarization angle computing method are analyzed in the part of software. The float multiply of Stokes vector is optimized into just shift and addition operation. The result of the experiment shows that real time image collecting system could collect and display image data from CMOS image sensor in real-time.
Rapid algorithm prototyping and implementation for power quality measurement

NASA Astrophysics Data System (ADS)

Kołek, Krzysztof; Piątek, Krzysztof

2015-12-01

This article presents a Model-Based Design (MBD) approach to rapidly implement power quality (PQ) metering algorithms. Power supply quality is a very important aspect of modern power systems and will become even more important in future smart grids. In this case, maintaining the PQ parameters at the desired level will require efficient implementation methods of the metering algorithms. Currently, the development of new, advanced PQ metering algorithms requires new hardware with adequate computational capability and time intensive, cost-ineffective manual implementations. An alternative, considered here, is an MBD approach. The MBD approach focuses on the modelling and validation of the model by simulation, which is well-supported by a Computer-Aided Engineering (CAE) packages. This paper presents two algorithms utilized in modern PQ meters: a phase-locked loop based on an Enhanced Phase Locked Loop (EPLL), and the flicker measurement according to the IEC 61000-4-15 standard. The algorithms were chosen because of their complexity and non-trivial development. They were first modelled in the MATLAB/Simulink package, then tested and validated in a simulation environment. The models, in the form of Simulink diagrams, were next used to automatically generate C code. The code was compiled and executed in real-time on the Zynq Xilinx platform that combines a reconfigurable Field Programmable Gate Array (FPGA) with a dual-core processor. The MBD development of PQ algorithms, automatic code generation, and compilation form a rapid algorithm prototyping and implementation path for PQ measurements. The main advantage of this approach is the ability to focus on the design, validation, and testing stages while skipping over implementation issues. The code generation process renders production-ready code that can be easily used on the target hardware. This is especially important when standards for PQ measurement are in constant development, and the PQ issues in emerging smart grids will require tools for rapid development and implementation of such algorithms.
Adaptive packet switch with an optical core (demonstrator)

NASA Astrophysics Data System (ADS)

Abdo, Ahmad; Bishtein, Vadim; Clark, Stewart A.; Dicorato, Pino; Lu, David T.; Paredes, Sofia A.; Taebi, Sareh; Hall, Trevor J.

2004-11-01

A three-stage opto-electronic packet switch architecture is described consisting of a reconfigurable optical centre stage surrounded by two electronic buffering stages partitioned into sectors to ease memory contention. A Flexible Bandwidth Provision (FBP) algorithm, implemented on a soft-core processor, is used to change the configuration of the input sectors and optical centre stage to set up internal paths that will provide variable bandwidth to serve the traffic. The switch is modeled by a bipartite graph built from a service matrix, which is a function of the arriving traffic. The bipartite graph is decomposed by solving an edge-colouring problem and the resulting permutations are used to configure the switch. Simulation results show that this architecture exhibits a dramatic reduction of complexity and increased potential for scalability, at the price of only a modest spatial speed-up k, 1
Reconfigurable, Intelligently-Adaptive, Communication System, an SDR Platform

NASA Technical Reports Server (NTRS)

Roche, Rigoberto J.; Shalkhauser, Mary Jo; Hickey, Joseph P.; Briones, Janette C.

2016-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework to abstract the application software from the radio platform hardware. STRS aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. The NASA Glenn Research Center (GRC) team made a software defined radio (SDR) platform STRS compliant by adding an STRS operating environment and a field programmable gate array (FPGA) wrapper, capable of implementing each of the platforms interfaces, as well as a test waveform to exercise those interfaces. This effort serves to provide a framework toward waveform development onto an STRS compliant platform to support future space communication systems for advanced exploration missions. The use of validated STRS compliant applications provides tested code with extensive documentation to potentially reduce risk, cost and e ort in development of space-deployable SDRs. This paper discusses the advantages of STRS, the integration of STRS onto a Reconfigurable, Intelligently-Adaptive, Communication System (RIACS) SDR platform, and the test waveform and wrapper development e orts. The paper emphasizes the infusion of the STRS Architecture onto the RIACS platform for potential use in next generation flight system SDRs for advanced exploration missions.
Solving global shallow water equations on heterogeneous supercomputers

PubMed Central

Fu, Haohuan; Gan, Lin; Yang, Chao; Xue, Wei; Wang, Lanning; Wang, Xinliang; Huang, Xiaomeng; Yang, Guangwen

2017-01-01

The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, inclusion of more component models, more complicated physics schemes, and larger ensembles. As the recent improvements in computing power mostly come from the increasing number of nodes in a system and the integration of heterogeneous accelerators, how to scale the computing problems onto more nodes and various kinds of accelerators has become a challenge for the model development. This paper describes our efforts on developing a highly scalable framework for performing global atmospheric modeling on heterogeneous supercomputers equipped with various accelerators, such as GPU (Graphic Processing Unit), MIC (Many Integrated Core), and FPGA (Field Programmable Gate Arrays) cards. We propose a generalized partition scheme of the problem domain, so as to keep a balanced utilization of both CPU resources and accelerator resources. With optimizations on both computing and memory access patterns, we manage to achieve around 8 to 20 times speedup when comparing one hybrid GPU or MIC node with one CPU node with 12 cores. Using a customized FPGA-based data-flow engines, we see the potential to gain another 5 to 8 times improvement on performance. On heterogeneous supercomputers, such as Tianhe-1A and Tianhe-2, our framework is capable of achieving ideally linear scaling efficiency, and sustained double-precision performances of 581 Tflops on Tianhe-1A (using 3750 nodes) and 3.74 Pflops on Tianhe-2 (using 8644 nodes). Our study also provides an evaluation on the programming paradigm of various accelerator architectures (GPU, MIC, FPGA) for performing global atmospheric simulation, to form a picture about both the potential performance benefits and the programming efforts involved. PMID:28282428
Extending the BEAGLE library to a multi-FPGA platform.

PubMed

Jin, Zheming; Bakos, Jason D

2013-01-19

Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
Active Reconfigurable Metamaterial Unit Cell Based on Non-Foster Elements

DTIC Science & Technology

2013-10-01

Krois Ivan Bonic Aleksandar Kiricenko Damir Muha University of Zagreb Faculty of Electrical Engineering and Computing Unksa 3 Zagreb ...PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) University of Zagreb Faculty of Electrical Engineering and Computing Unksa 3 Zagreb , HR-10000 CROATIA 8...Electrical Engineering and Computing University of Zagreb Unska 3 Zagreb , HR-10000, Croatia 14 October 2013 Distribution A: Approved for
An efficient HW and SW design of H.264 video compression, storage and playback on FPGA devices for handheld thermal imaging systems

NASA Astrophysics Data System (ADS)

Gunay, Omer; Ozsarac, Ismail; Kamisli, Fatih

2017-05-01

Video recording is an essential property of new generation military imaging systems. Playback of the stored video on the same device is also desirable as it provides several operational benefits to end users. Two very important constraints for many military imaging systems, especially for hand-held devices and thermal weapon sights, are power consumption and size. To meet these constraints, it is essential to perform most of the processing applied to the video signal, such as preprocessing, compression, storing, decoding, playback and other system functions on a single programmable chip, such as FPGA, DSP, GPU or ASIC. In this work, H.264/AVC (Advanced Video Coding) compatible video compression, storage, decoding and playback blocks are efficiently designed and implemented on FPGA platforms using FPGA fabric and Altera NIOS II soft processor. Many subblocks that are used in video encoding are also used during video decoding in order to save FPGA resources and power. Computationally complex blocks are designed using FPGA fabric, while blocks such as SD card write/read, H.264 syntax decoding and CAVLC decoding are done using NIOS processor to benefit from software flexibility. In addition, to keep power consumption low, the system was designed to require limited external memory access. The design was tested using 640x480 25 fps thermal camera on CYCLONE V FPGA, which is the ALTERA's lowest power FPGA family, and consumes lower than 40% of CYCLONE V 5CEFA7 FPGA resources on average.
FPGA implementation of self organizing map with digital phase locked loops.

PubMed

Hikawa, Hiroomi

2005-01-01

The self-organizing map (SOM) has found applicability in a wide range of application areas. Recently new SOM hardware with phase modulated pulse signal and digital phase-locked loops (DPLLs) has been proposed (Hikawa, 2005). The system uses the DPLL as a computing element since the operation of the DPLL is very similar to that of SOM's computation. The system also uses square waveform phase to hold the value of the each input vector element. This paper discuss the hardware implementation of the DPLL SOM architecture. For effective hardware implementation, some components are redesigned to reduce the circuit size. The proposed SOM architecture is described in VHDL and implemented on field programmable gate array (FPGA). Its feasibility is verified by experiments. Results show that the proposed SOM implemented on the FPGA has a good quantization capability, and its circuit size very small.
Optimized FPGA Implementation of the Thyroid Hormone Secretion Mechanism Using CAD Tools.

PubMed

Alghazo, Jaafar M

2017-02-01

The goal of this paper is to implement the secretion mechanism of the Thyroid Hormone (TH) based on bio-mathematical differential eqs. (DE) on an FPGA chip. Hardware Descriptive Language (HDL) is used to develop a behavioral model of the mechanism derived from the DE. The Thyroid Hormone secretion mechanism is simulated with the interaction of the related stimulating and inhibiting hormones. Synthesis of the simulation is done with the aid of CAD tools and downloaded on a Field Programmable Gate Arrays (FPGAs) Chip. The chip output shows identical behavior to that of the designed algorithm through simulation. It is concluded that the chip mimics the Thyroid Hormone secretion mechanism. The chip, operating in real-time, is computer-independent stand-alone system.
Shaping Crystal-Crystal Phase Transitions

NASA Astrophysics Data System (ADS)

Du, Xiyu; van Anders, Greg; Dshemuchadse, Julia; Glotzer, Sharon

Previous computational and experimental studies have shown self-assembled structure depends strongly on building block shape. New synthesis techniques have led to building blocks with reconfigurable shape and it has been demonstrated that building block reconfiguration can induce bulk structural reconfiguration. However, we do not understand systematically how this transition happens as a function of building block shape. Using a recently developed ``digital alchemy'' framework, we study the thermodynamics of shape-driven crystal-crystal transitions. We find examples of shape-driven bulk reconfiguration that are accompanied by first-order phase transitions, and bulk reconfiguration that occurs without any thermodynamic phase transition. Our results suggest that for well-chosen shapes and structures, there exist facile means of bulk reconfiguration, and that shape-driven bulk reconfiguration provides a viable mechanism for developing functional materials.
A Front-End Electronics Prototype Based on Gigabit Ethernet for the ATLAS Small-Strip Thin Gap Chamber

NASA Astrophysics Data System (ADS)

Hu, Kun; Lu, Houbing; Wang, Xu; Li, Feng; Wang, Xinxin; Geng, Tianru; Yang, Hang; Liu, Shengquan; Han, Liang; Jin, Ge

2017-06-01

A front-end electronics prototype for the ATLAS small-strip Thin Gap Chamber (sTGC) based on gigabit Ethernet has been developed. The prototype is designed to read out signals of pads, wires, and strips of the sTGC detector. The prototype includes two VMM2 chips developed to read out the signals of the sTGC, a Xilinx Kintex-7 field-programmable gate array (FPGA) used for the VMM2 configuration and the events storage, and a gigabit Ethernet transceiver PHY chip for interfacing with a computer. The VMM2 chip is designed for the readout of the Micromegas detector and sTGC detector, which is composed of 64 linear front-end channels. Each channel integrates a charge-sensitive amplifier, a shaper, several analog-to-digital converters, and other digital functions. For a bunch-crossing interval of 25 ns, events are continuously read out by the FPGA and forwarded to the computer. The interface between the computer and the prototype has been measured to reach an error-free rate of 900 Mb/s, therefore making a very effective use of the available bandwidth. Additionally, the computer can control several prototypes of this kind simultaneously via the Ethernet interface. At present, the prototype will be used for the sTGC performance test. The features of the prototype are described in detail.
20-GFLOPS QR processor on a Xilinx Virtex-E FPGA

NASA Astrophysics Data System (ADS)

Walke, Richard L.; Smith, Robert W. M.; Lightbody, Gaye

2000-11-01

Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample rate systems, such as radar and comms, the calculation of adaptive weights is a very computational task that requires highly parallel solutions. For systems where low power consumption and volume are important the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement them as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve predictable dense layout and high-speed operation. We present results that show that 20GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR-processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry standard hardware description language VHDL.
Central FPGA-based destination and load control in the LHCb MHz event readout

NASA Astrophysics Data System (ADS)

Jacobsson, R.

2012-10-01

The readout strategy of the LHCb experiment is based on complete event readout at 1 MHz. A set of 320 sub-detector readout boards transmit event fragments at total rate of 24.6 MHz at a bandwidth usage of up to 70 GB/s over a commercial switching network based on Gigabit Ethernet to a distributed event building and high-level trigger processing farm with 1470 individual multi-core computer nodes. In the original specifications, the readout was based on a pure push protocol. This paper describes the proposal, implementation, and experience of a non-conventional mixture of a push and a pull protocol, akin to credit-based flow control. An FPGA-based central master module, partly operating at the LHC bunch clock frequency of 40.08 MHz and partly at a double clock speed, is in charge of the entire trigger and readout control from the front-end electronics up to the high-level trigger farm. One FPGA is dedicated to controlling the event fragment packing in the readout boards, the assignment of the farm node destination for each event, and controls the farm load based on an asynchronous pull mechanism from each farm node. This dynamic readout scheme relies on generic event requests and the concept of node credit allowing load control and trigger rate regulation as a function of the global farm load. It also allows the vital task of fast central monitoring and automatic recovery in-flight of failing nodes while maintaining dead-time and event loss at a minimum. This paper demonstrates the strength and suitability of implementing this real-time task for a very large distributed system in an FPGA where no random delays are introduced, and where extreme reliability and accurate event accounting are fundamental requirements. It was in use during the entire commissioning phase of LHCb and has been in faultless operation during the first two years of physics luminosity data taking.
Cloud-Based Virtual Laboratory for Network Security Education

ERIC Educational Resources Information Center

Xu, Le; Huang, Dijiang; Tsai, Wei-Tek

2014-01-01

Hands-on experiments are essential for computer network security education. Existing laboratory solutions usually require significant effort to build, configure, and maintain and often do not support reconfigurability, flexibility, and scalability. This paper presents a cloud-based virtual laboratory education platform called V-Lab that provides a…
FPGA-Based X-Ray Detection and Measurement for an X-Ray Polarimeter

NASA Technical Reports Server (NTRS)

Gregory, Kyle; Hill, Joanne; Black, Kevin; Baumgartner, Wayne

2013-01-01

This technology enables detection and measurement of x-rays in an x-ray polarimeter using a field-programmable gate array (FPGA). The technology was developed for the Gravitational and Extreme Magnetism Small Explorer (GEMS) mission. It performs precision energy and timing measurements, as well as rejection of non-x-ray events. It enables the GEMS polarimeter to detect precisely when an event has taken place so that additional measurements can be made. The technology also enables this function to be performed in an FPGA using limited resources so that mass and power can be minimized while reliability for a space application is maximized and precise real-time operation is achieved. This design requires a low-noise, charge-sensitive preamplifier; a highspeed analog to digital converter (ADC); and an x-ray detector with a cathode terminal. It functions by computing a sum of differences for time-samples whose difference exceeds a programmable threshold. A state machine advances through states as a programmable number of consecutive samples exceeds or fails to exceed this threshold. The pulse height is recorded as the accumulated sum. The track length is also measured based on the time from the start to the end of accumulation. For track lengths longer than a certain length, the algorithm estimates the barycenter of charge deposit by comparing the accumulator value at the midpoint to the final accumulator value. The design also employs a number of techniques for rejecting background events. This innovation enables the function to be performed in space where it can operate autonomously with a rapid response time. This implementation combines advantages of computing system-based approaches with those of pure analog approaches. The result is an implementation that is highly reliable, performs in real-time, rejects background events, and consumes minimal power.

Reconfigurable Hardware Adapts to Changing Mission Demands

NASA Technical Reports Server (NTRS)

2003-01-01

A new class of computing architectures and processing systems, which use reconfigurable hardware, is creating a revolutionary approach to implementing future spacecraft systems. With the increasing complexity of electronic components, engineers must design next-generation spacecraft systems with new technologies in both hardware and software. Derivation Systems, Inc., of Carlsbad, California, has been working through NASA s Small Business Innovation Research (SBIR) program to develop key technologies in reconfigurable computing and Intellectual Property (IP) soft cores. Founded in 1993, Derivation Systems has received several SBIR contracts from NASA s Langley Research Center and the U.S. Department of Defense Air Force Research Laboratories in support of its mission to develop hardware and software for high-assurance systems. Through these contracts, Derivation Systems began developing leading-edge technology in formal verification, embedded Java, and reconfigurable computing for its PF3100, Derivational Reasoning System (DRS ), FormalCORE IP, FormalCORE PCI/32, FormalCORE DES, and LavaCORE Configurable Java Processor, which are designed for greater flexibility and security on all space missions.
Parallel ICA and its hardware implementation in hyperspectral image analysis

NASA Astrophysics Data System (ADS)

Du, Hongtao; Qi, Hairong; Peterson, Gregory D.

2004-04-01

Advances in hyperspectral images have dramatically boosted remote sensing applications by providing abundant information using hundreds of contiguous spectral bands. However, the high volume of information also results in excessive computation burden. Since most materials have specific characteristics only at certain bands, a lot of these information is redundant. This property of hyperspectral images has motivated many researchers to study various dimensionality reduction algorithms, including Projection Pursuit (PP), Principal Component Analysis (PCA), wavelet transform, and Independent Component Analysis (ICA), where ICA is one of the most popular techniques. It searches for a linear or nonlinear transformation which minimizes the statistical dependence between spectral bands. Through this process, ICA can eliminate superfluous but retain practical information given only the observations of hyperspectral images. One hurdle of applying ICA in hyperspectral image (HSI) analysis, however, is its long computation time, especially for high volume hyperspectral data sets. Even the most efficient method, FastICA, is a very time-consuming process. In this paper, we present a parallel ICA (pICA) algorithm derived from FastICA. During the unmixing process, pICA divides the estimation of weight matrix into sub-processes which can be conducted in parallel on multiple processors. The decorrelation process is decomposed into the internal decorrelation and the external decorrelation, which perform weight vector decorrelations within individual processors and between cooperative processors, respectively. In order to further improve the performance of pICA, we seek hardware solutions in the implementation of pICA. Until now, there are very few hardware designs for ICA-related processes due to the complicated and iterant computation. This paper discusses capacity limitation of FPGA implementations for pICA in HSI analysis. A synthesis of Application-Specific Integrated Circuit (ASIC) is designed for pICA-based dimensionality reduction in HSI analysis. The pICA design is implemented using standard-height cells and aimed at TSMC 0.18 micron process. During the synthesis procedure, three ICA-related reconfigurable components are developed for the reuse and retargeting purpose. Preliminary results show that the standard-height cell based ASIC synthesis provide an effective solution for pICA and ICA-related processes in HSI analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Underwood, Keith D; Ulmer, Craig D.; Thompson, David

Field programmable gate arrays (FPGAs) have been used as alternative computational de-vices for over a decade; however, they have not been used for traditional scientific com-puting due to their perceived lack of floating-point performance. In recent years, there hasbeen a surge of interest in alternatives to traditional microprocessors for high performancecomputing. Sandia National Labs began two projects to determine whether FPGAs wouldbe a suitable alternative to microprocessors for high performance scientific computing and,if so, how they should be integrated into the system. We present results that indicate thatFPGAs could have a significant impact on future systems. FPGAs have thepotentialtohave ordermore » of magnitude levels of performance wins on several key algorithms; however,there are serious questions as to whether the system integration challenge can be met. Fur-thermore, there remain challenges in FPGA programming and system level reliability whenusing FPGA devices.4 AcknowledgmentArun Rodrigues provided valuable support and assistance in the use of the Structural Sim-ulation Toolkit within an FPGA context. Curtis Janssen and Steve Plimpton provided valu-able insights into the workings of two Sandia applications (MPQC and LAMMPS, respec-tively).5« less
FPGA-based real-time swept-source OCT systems for B-scan live-streaming or volumetric imaging

NASA Astrophysics Data System (ADS)

Bandi, Vinzenz; Goette, Josef; Jacomet, Marcel; von Niederhäusern, Tim; Bachmann, Adrian H.; Duelk, Marcus

2013-03-01

We have developed a Swept-Source Optical Coherence Tomography (Ss-OCT) system with high-speed, real-time signal processing on a commercially available Data-Acquisition (DAQ) board with a Field-Programmable Gate Array (FPGA). The Ss-OCT system simultaneously acquires OCT and k-clock reference signals at 500MS/s. From the k-clock signal of each A-scan we extract a remap vector for the k-space linearization of the OCT signal. The linear but oversampled interpolation is followed by a 2048-point FFT, additional auxiliary computations, and a data transfer to a host computer for real-time, live-streaming of B-scan or volumetric C-scan OCT visualization. We achieve a 100 kHz A-scan rate by parallelization of our hardware algorithms, which run on standard and affordable, commercially available DAQ boards. Our main development tool for signal analysis as well as for hardware synthesis is MATLAB® with add-on toolboxes and 3rd-party tools.
RPython high-level synthesis

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Linczuk, Maciej

2016-09-01

The development of FPGA technology and the increasing complexity of applications in recent decades have forced compilers to move to higher abstraction levels. Compilers interprets an algorithmic description of a desired behavior written in High-Level Languages (HLLs) and translate it to Hardware Description Languages (HDLs). This paper presents a RPython based High-Level synthesis (HLS) compiler. The compiler get the configuration parameters and map RPython program to VHDL. Then, VHDL code can be used to program FPGA chips. In comparison of other technologies usage, FPGAs have the potential to achieve far greater performance than software as a result of omitting the fetch-decode-execute operations of General Purpose Processors (GPUs), and introduce more parallel computation. This can be exploited by utilizing many resources at the same time. Creating parallel algorithms computed with FPGAs in pure HDL is difficult and time consuming. Implementation time can be greatly reduced with High-Level Synthesis compiler. This article describes design methodologies and tools, implementation and first results of created VHDL backend for RPython compiler.
A fast adaptive convex hull algorithm on two-dimensional processor arrays with a reconfigurable BUS system

NASA Technical Reports Server (NTRS)

Olariu, S.; Schwing, J.; Zhang, J.

1991-01-01

A bus system that can change dynamically to suit computational needs is referred to as reconfigurable. We present a fast adaptive convex hull algorithm on a two-dimensional processor array with a reconfigurable bus system (2-D PARBS, for short). Specifically, we show that computing the convex hull of a planar set of n points taken O(log n/log m) time on a 2-D PARBS of size mn x n with 3 less than or equal to m less than or equal to n. Our result implies that the convex hull of n points in the plane can be computed in O(1) time in a 2-D PARBS of size n(exp 1.5) x n.
FPGA-based prototype storage system with phase change memory

NASA Astrophysics Data System (ADS)

Li, Gezi; Chen, Xiaogang; Chen, Bomy; Li, Shunfen; Zhou, Mi; Han, Wenbing; Song, Zhitang

2016-10-01

With the ever-increasing amount of data being stored via social media, mobile telephony base stations, and network devices etc. the database systems face severe bandwidth bottlenecks when moving vast amounts of data from storage to the processing nodes. At the same time, Storage Class Memory (SCM) technologies such as Phase Change Memory (PCM) with unique features like fast read access, high density, non-volatility, byte-addressability, positive response to increasing temperature, superior scalability, and zero standby leakage have changed the landscape of modern computing and storage systems. In such a scenario, we present a storage system called FLEET which can off-load partial or whole SQL queries to the storage engine from CPU. FLEET uses an FPGA rather than conventional CPUs to implement the off-load engine due to its highly parallel nature. We have implemented an initial prototype of FLEET with PCM-based storage. The results demonstrate that significant performance and CPU utilization gains can be achieved by pushing selected query processing components inside in PCM-based storage.
A real-time KLT implementation for radio-SETI applications

NASA Astrophysics Data System (ADS)

Melis, Andrea; Concu, Raimondo; Pari, Pierpaolo; Maccone, Claudio; Montebugnoli, Stelio; Possenti, Andrea; Valente, Giuseppe; Antonietti, Nicoló; Perrodin, Delphine; Migoni, Carlo; Murgia, Matteo; Trois, Alessio; Barbaro, Massimo; Bocchinu, Alessandro; Casu, Silvia; Lunesu, Maria Ilaria; Monari, Jader; Navarrini, Alessandro; Pisanu, Tonino; Schilliró, Francesco; Vacca, Valentina

2016-07-01

SETI, the Search for ExtraTerrestrial Intelligence, is the search for radio signals emitted by alien civilizations living in the Galaxy. Narrow-band FFT-based approaches have been preferred in SETI, since their computation time only grows like N*lnN, where N is the number of time samples. On the contrary, a wide-band approach based on the Kahrunen-Lo`eve Transform (KLT) algorithm would be preferable, but it would scale like N*N. In this paper, we describe a hardware-software infrastructure based on FPGA boards and GPU-based PCs that circumvents this computation-time problem allowing for a real-time KLT.
Programmable Logic Device (PLD) Design Description for the Integrated Power, Avionics, and Software (iPAS) Space Telecommunications Radio System (STRS) Radio

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary Jo W.

2017-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework for software defined radios (SDRs) to abstract the application software from the radio platform hardware. The STRS standard aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. To promote the use of the STRS architecture for future NASA advanced exploration missions, NASA Glenn Research Center (GRC) developed an STRS compliant SDR on a radio platform used by the Advance Exploration System program at the Johnson Space Center (JSC) in their Integrated Power, Avionics, and Software (iPAS) laboratory. At the conclusion of the development, the software and hardware description language (HDL) code was delivered to JSC for their use in their iPAS test bed to get hands-on experience with the STRS standard, and for development of their own STRS Waveforms on the now STRS compliant platform.The iPAS STRS Radio was implemented on the Reconfigurable, Intelligently-Adaptive Communication System (RIACS) platform, currently being used for radio development at JSC. The platform consists of a Xilinx ML605 Virtex-6 FPGA board, an Analog Devices FMCOMMS1-EBZ RF transceiver board, and an Embedded PC (Axiomtek eBox 620-110-FL) running the Ubuntu 12.4 operating system. Figure 1 shows the RIACS platform hardware. The result of this development is a very low cost STRS compliant platform that can be used for waveform developments for multiple applications.The purpose of this document is to describe the design of the HDL code for the FPGA portion of the iPAS STRS Radio particularly the design of the FPGA wrapper and the test waveform.
A novel optogenetically tunable frequency modulating oscillator

PubMed Central

2018-01-01

Synthetic biology has enabled the creation of biological reconfigurable circuits, which perform multiple functions monopolizing a single biological machine; Such a system can switch between different behaviours in response to environmental cues. Previous work has demonstrated switchable dynamical behaviour employing reconfigurable logic gate genetic networks. Here we describe a computational framework for reconfigurable circuits in E.coli using combinations of logic gates, and also propose the biological implementation. The proposed system is an oscillator that can exhibit tunability of frequency and amplitude of oscillations. Further, the frequency of operation can be changed optogenetically. Insilico analysis revealed that two-component light systems, in response to light within a frequency range, can be used for modulating the frequency of the oscillator or stopping the oscillations altogether. Computational modelling reveals that mixing two colonies of E.coli oscillating at different frequencies generates spatial beat patterns. Further, we show that these oscillations more robustly respond to input perturbations compared to the base oscillator, to which the proposed oscillator is a modification. Compared to the base oscillator, the proposed system shows faster synchronization in a colony of cells for a larger region of the parameter space. Additionally, the proposed oscillator also exhibits lesser synchronization error in the transient period after input perturbations. This provides a strong basis for the construction of synthetic reconfigurable circuits in bacteria and other organisms, which can be scaled up to perform functions in the field of time dependent drug delivery with tunable dosages, and sets the stage for further development of circuits with synchronized population level behaviour. PMID:29389936
A novel optogenetically tunable frequency modulating oscillator.

PubMed

Mahajan, Tarun; Rai, Kshitij

2018-01-01

Synthetic biology has enabled the creation of biological reconfigurable circuits, which perform multiple functions monopolizing a single biological machine; Such a system can switch between different behaviours in response to environmental cues. Previous work has demonstrated switchable dynamical behaviour employing reconfigurable logic gate genetic networks. Here we describe a computational framework for reconfigurable circuits in E.coli using combinations of logic gates, and also propose the biological implementation. The proposed system is an oscillator that can exhibit tunability of frequency and amplitude of oscillations. Further, the frequency of operation can be changed optogenetically. Insilico analysis revealed that two-component light systems, in response to light within a frequency range, can be used for modulating the frequency of the oscillator or stopping the oscillations altogether. Computational modelling reveals that mixing two colonies of E.coli oscillating at different frequencies generates spatial beat patterns. Further, we show that these oscillations more robustly respond to input perturbations compared to the base oscillator, to which the proposed oscillator is a modification. Compared to the base oscillator, the proposed system shows faster synchronization in a colony of cells for a larger region of the parameter space. Additionally, the proposed oscillator also exhibits lesser synchronization error in the transient period after input perturbations. This provides a strong basis for the construction of synthetic reconfigurable circuits in bacteria and other organisms, which can be scaled up to perform functions in the field of time dependent drug delivery with tunable dosages, and sets the stage for further development of circuits with synchronized population level behaviour.
FPGA implementation for real-time background subtraction based on Horprasert model.

PubMed

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W.
Small Microprocessor for ASIC or FPGA Implementation

NASA Technical Reports Server (NTRS)

Kleyner, Igor; Katz, Richard; Blair-Smith, Hugh

2011-01-01

A small microprocessor, suitable for use in applications in which high reliability is required, was designed to be implemented in either an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA). The design is based on commercial microprocessor architecture, making it possible to use available software development tools and thereby to implement the microprocessor at relatively low cost. The design features enhancements, including trapping during execution of illegal instructions. The internal structure of the design yields relatively high performance, with a significant decrease, relative to other microprocessors that perform the same functions, in the number of microcycles needed to execute macroinstructions. The problem meant to be solved in designing this microprocessor was to provide a modest level of computational capability in a general-purpose processor while adding as little as possible to the power demand, size, and weight of a system into which the microprocessor would be incorporated. As designed, this microprocessor consumes very little power and occupies only a small portion of a typical modern ASIC or FPGA. The microprocessor operates at a rate of about 4 million instructions per second with clock frequency of 20 MHz.
FPGA Implementation for Real-Time Background Subtraction Based on Horprasert Model

PubMed Central

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J.; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W. PMID:22368487
Space Debris Detection on the HPDP, a Coarse-Grained Reconfigurable Array Architecture for Space

NASA Astrophysics Data System (ADS)

Suarez, Diego Andres; Bretz, Daniel; Helfers, Tim; Weidendorfer, Josef; Utzmann, Jens

2016-08-01

Stream processing, widely used in communications and digital signal processing applications, requires high- throughput data processing that is achieved in most cases using Application-Specific Integrated Circuit (ASIC) designs. Lack of programmability is an issue especially in space applications, which use on-board components with long life-cycles requiring applications updates. To this end, the High Performance Data Processor (HPDP) architecture integrates an array of coarse-grained reconfigurable elements to provide both flexible and efficient computational power suitable for stream-based data processing applications in space. In this work the capabilities of the HPDP architecture are demonstrated with the implementation of a real-time image processing algorithm for space debris detection in a space-based space surveillance system. The implementation challenges and alternatives are described making trade-offs to improve performance at the expense of negligible degradation of detection accuracy. The proposed implementation uses over 99% of the available computational resources. Performance estimations based on simulations show that the HPDP can amply match the application requirements.
A Hybrid FPGA-Based System for EEG- and EMG-Based Online Movement Prediction.

PubMed

Wöhrle, Hendrik; Tabie, Marc; Kim, Su Kyoung; Kirchner, Frank; Kirchner, Elsa Andrea

2017-07-03

A current trend in the development of assistive devices for rehabilitation, for example exoskeletons or active orthoses, is to utilize physiological data to enhance their functionality and usability, for example by predicting the patient's upcoming movements using electroencephalography (EEG) or electromyography (EMG). However, these modalities have different temporal properties and classification accuracies, which results in specific advantages and disadvantages. To use physiological data analysis in rehabilitation devices, the processing should be performed in real-time, guarantee close to natural movement onset support, provide high mobility, and should be performed by miniaturized systems that can be embedded into the rehabilitation device. We present a novel Field Programmable Gate Array (FPGA) -based system for real-time movement prediction using physiological data. Its parallel processing capabilities allows the combination of movement predictions based on EEG and EMG and additionally a P300 detection, which is likely evoked by instructions of the therapist. The system is evaluated in an offline and an online study with twelve healthy subjects in total. We show that it provides a high computational performance and significantly lower power consumption in comparison to a standard PC. Furthermore, despite the usage of fixed-point computations, the proposed system achieves a classification accuracy similar to systems with double precision floating-point precision.
Efficient digital implementation of a conductance-based globus pallidus neuron and the dynamics analysis

NASA Astrophysics Data System (ADS)

Yang, Shuangming; Wei, Xile; Deng, Bin; Liu, Chen; Li, Huiyan; Wang, Jiang

2018-03-01

Balance between biological plausibility of dynamical activities and computational efficiency is one of challenging problems in computational neuroscience and neural system engineering. This paper proposes a set of efficient methods for the hardware realization of the conductance-based neuron model with relevant dynamics, targeting reproducing the biological behaviors with low-cost implementation on digital programmable platform, which can be applied in wide range of conductance-based neuron models. Modified GP neuron models for efficient hardware implementation are presented to reproduce reliable pallidal dynamics, which decode the information of basal ganglia and regulate the movement disorder related voluntary activities. Implementation results on a field-programmable gate array (FPGA) demonstrate that the proposed techniques and models can reduce the resource cost significantly and reproduce the biological dynamics accurately. Besides, the biological behaviors with weak network coupling are explored on the proposed platform, and theoretical analysis is also made for the investigation of biological characteristics of the structured pallidal oscillator and network. The implementation techniques provide an essential step towards the large-scale neural network to explore the dynamical mechanisms in real time. Furthermore, the proposed methodology enables the FPGA-based system a powerful platform for the investigation on neurodegenerative diseases and real-time control of bio-inspired neuro-robotics.
A Hybrid FPGA-Based System for EEG- and EMG-Based Online Movement Prediction

PubMed Central

Wöhrle, Hendrik; Tabie, Marc; Kim, Su Kyoung; Kirchner, Frank; Kirchner, Elsa Andrea

2017-01-01

A current trend in the development of assistive devices for rehabilitation, for example exoskeletons or active orthoses, is to utilize physiological data to enhance their functionality and usability, for example by predicting the patient’s upcoming movements using electroencephalography (EEG) or electromyography (EMG). However, these modalities have different temporal properties and classification accuracies, which results in specific advantages and disadvantages. To use physiological data analysis in rehabilitation devices, the processing should be performed in real-time, guarantee close to natural movement onset support, provide high mobility, and should be performed by miniaturized systems that can be embedded into the rehabilitation device. We present a novel Field Programmable Gate Array (FPGA) -based system for real-time movement prediction using physiological data. Its parallel processing capabilities allows the combination of movement predictions based on EEG and EMG and additionally a P300 detection, which is likely evoked by instructions of the therapist. The system is evaluated in an offline and an online study with twelve healthy subjects in total. We show that it provides a high computational performance and significantly lower power consumption in comparison to a standard PC. Furthermore, despite the usage of fixed-point computations, the proposed system achieves a classification accuracy similar to systems with double precision floating-point precision. PMID:28671632
Real-time blind image deconvolution based on coordinated framework of FPGA and DSP

NASA Astrophysics Data System (ADS)

Wang, Ze; Li, Hang; Zhou, Hua; Liu, Hongjun

2015-10-01

Image restoration takes a crucial place in several important application domains. With the increasing of computation requirement as the algorithms become much more complexity, there has been a significant rise in the need for accelerating implementation. In this paper, we focus on an efficient real-time image processing system for blind iterative deconvolution method by means of the Richardson-Lucy (R-L) algorithm. We study the characteristics of algorithm, and an image restoration processing system based on the coordinated framework of FPGA and DSP (CoFD) is presented. Single precision floating-point processing units with small-scale cascade and special FFT/IFFT processing modules are adopted to guarantee the accuracy of the processing. Finally, Comparing experiments are done. The system could process a blurred image of 128×128 pixels within 32 milliseconds, and is up to three or four times faster than the traditional multi-DSPs systems.
FPGA acceleration of rigid-molecule docking codes

PubMed Central

Sukhwani, B.; Herbordt, M.C.

2011-01-01

Modelling the interactions of biological molecules, or docking, is critical both to understanding basic life processes and to designing new drugs. The field programmable gate array (FPGA) based acceleration of a recently developed, complex, production docking code is described. The authors found that it is necessary to extend their previous three-dimensional (3D) correlation structure in several ways, most significantly to support simultaneous computation of several correlation functions. The result for small-molecule docking is a 100-fold speed-up of a section of the code that represents over 95% of the original run-time. An additional 2% is accelerated through a previously described method, yielding a total acceleration of 36× over a single core and 10× over a quad-core. This approach is found to be an ideal complement to graphics processing unit (GPU) based docking, which excels in the protein–protein domain. PMID:21857870

Multicasting mesh AER: a scalable assembly approach for reconfigurable neuromorphic structured AER systems. Application to ConvNets.

PubMed

Zamarreno-Ramos, C; Linares-Barranco, A; Serrano-Gotarredona, T; Linares-Barranco, B

2013-02-01

This paper presents a modular, scalable approach to assembling hierarchically structured neuromorphic Address Event Representation (AER) systems. The method consists of arranging modules in a 2D mesh, each communicating bidirectionally with all four neighbors. Address events include a module label. Each module includes an AER router which decides how to route address events. Two routing approaches have been proposed, analyzed and tested, using either destination or source module labels. Our analyses reveal that depending on traffic conditions and network topologies either one or the other approach may result in better performance. Experimental results are given after testing the approach using high-end Virtex-6 FPGAs. The approach is proposed for both single and multiple FPGAs, in which case a special bidirectional parallel-serial AER link with flow control is exploited, using the FPGA Rocket-I/O interfaces. Extensive test results are provided exploiting convolution modules of 64 × 64 pixels with kernels with sizes up to 11 × 11, which process real sensory data from a Dynamic Vision Sensor (DVS) retina. One single Virtex-6 FPGA can hold up to 64 of these convolution modules, which is equivalent to a neural network with 262 × 10(3) neurons and almost 32 million synapses.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.

PubMed

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2017-03-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ ( h ), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h . Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O ( n 2 ) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures

PubMed Central

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2016-01-01

Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O (n2) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
Field programmable gate array-assigned complex-valued computation and its limits

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bernard-Schwarz, Maria, E-mail: maria.bernardschwarz@ni.com; Institute of Applied Physics, TU Wien, Wiedner Hauptstrasse 8, 1040 Wien; Zwick, Wolfgang

We discuss how leveraging Field Programmable Gate Array (FPGA) technology as part of a high performance computing platform reduces latency to meet the demanding real time constraints of a quantum optics simulation. Implementations of complex-valued operations using fixed point numeric on a Virtex-5 FPGA compare favorably to more conventional solutions on a central processing unit. Our investigation explores the performance of multiple fixed point options along with a traditional 64 bits floating point version. With this information, the lowest execution times can be estimated. Relative error is examined to ensure simulation accuracy is maintained.
Economical Implementation of a Filter Engine in an FPGA

NASA Technical Reports Server (NTRS)

Kowalski, James E.

2009-01-01

A logic design has been conceived for a field-programmable gate array (FPGA) that would implement a complex system of multiple digital state-space filters. The main innovative aspect of this design lies in providing for reuse of parts of the FPGA hardware to perform different parts of the filter computations at different times, in such a manner as to enable the timely performance of all required computations in the face of limitations on available FPGA hardware resources. The implementation of the digital state-space filter involves matrix vector multiplications, which, in the absence of the present innovation, would ordinarily necessitate some multiplexing of vector elements and/or routing of data flows along multiple paths. The design concept calls for implementing vector registers as shift registers to simplify operand access to multipliers and accumulators, obviating both multiplexing and routing of data along multiple paths. Each vector register would be reused for different parts of a calculation. Outputs would always be drawn from the same register, and inputs would always be loaded into the same register. A simple state machine would control each filter. The output of a given filter would be passed to the next filter, accompanied by a "valid" signal, which would start the state machine of the next filter. Multiple filter modules would share a multiplication/accumulation arithmetic unit. The filter computations would be timed by use of a clock having a frequency high enough, relative to the input and output data rate, to provide enough cycles for matrix and vector arithmetic operations. This design concept could prove beneficial in numerous applications in which digital filters are used and/or vectors are multiplied by coefficient matrices. Examples of such applications include general signal processing, filtering of signals in control systems, processing of geophysical measurements, and medical imaging. For these and other applications, it could be advantageous to combine compact FPGA digital filter implementations with other application-specific logic implementations on single integrated-circuit chips. An FPGA could readily be tailored to implement a variety of filters because the filter coefficients would be loaded into memory at startup.
Note: Design of FPGA based system identification module with application to atomic force microscopy

NASA Astrophysics Data System (ADS)

Ghosal, Sayan; Pradhan, Sourav; Salapaka, Murti

2018-05-01

The science of system identification is widely utilized in modeling input-output relationships of diverse systems. In this article, we report field programmable gate array (FPGA) based implementation of a real-time system identification algorithm which employs forgetting factors and bias compensation techniques. The FPGA module is employed to estimate the mechanical properties of surfaces of materials at the nano-scale with an atomic force microscope (AFM). The FPGA module is user friendly which can be interfaced with commercially available AFMs. Extensive simulation and experimental results validate the design.
Application of a distributed systems architecture for increased speed in image processing on an autonomous ground vehicle

NASA Astrophysics Data System (ADS)

Wright, Adam A.; Momin, Orko; Shin, Young Ho; Shakya, Rahul; Nepal, Kumud; Ahlgren, David J.

2010-01-01

This paper presents the application of a distributed systems architecture to an autonomous ground vehicle, Q, that participates in both the autonomous and navigation challenges of the Intelligent Ground Vehicle Competition. In the autonomous challenge the vehicle is required to follow a course, while avoiding obstacles and staying within the course boundaries, which are marked by white lines. For the navigation challenge, the vehicle is required to reach a set of target destinations, known as way points, with given GPS coordinates and avoid obstacles that it encounters in the process. Previously the vehicle utilized a single laptop to execute all processing activities including image processing, sensor interfacing and data processing, path planning and navigation algorithms and motor control. National Instruments' (NI) LabVIEW served as the programming language for software implementation. As an upgrade to last year's design, a NI compact Reconfigurable Input/Output system (cRIO) was incorporated to the system architecture. The cRIO is NI's solution for rapid prototyping that is equipped with a real time processor, an FPGA and modular input/output. Under the current system, the real time processor handles the path planning and navigation algorithms, the FPGA gathers and processes sensor data. This setup leaves the laptop to focus on running the image processing algorithm. Image processing as previously presented by Nepal et. al. is a multi-step line extraction algorithm and constitutes the largest processor load. This distributed approach results in a faster image processing algorithm which was previously Q's bottleneck. Additionally, the path planning and navigation algorithms are executed more reliably on the real time processor due to the deterministic nature of operation. The implementation of this architecture required exploration of various inter-system communication techniques. Data transfer between the laptop and the real time processor using UDP packets was established as the most reliable protocol after testing various options. Improvement can be made to the system by migrating more algorithms to the hardware based FPGA to further speed up the operations of the vehicle.
Region-Oriented Placement Algorithm for Coarse-Grained Power-Gating FPGA Architecture

NASA Astrophysics Data System (ADS)

Li, Ce; Dong, Yiping; Watanabe, Takahiro

An FPGA plays an essential role in industrial products due to its fast, stable and flexible features. But the power consumption of FPGAs used in portable devices is one of critical issues. Top-down hierarchical design method is commonly used in both ASIC and FPGA design. But, in the case where plural modules are integrated in an FPGA and some of them might be in sleep-mode, current FPGA architecture cannot be fully effective. In this paper, coarse-grained power gating FPGA architecture is proposed where a whole area of an FPGA is partitioned into several regions and power supply is controlled for each region, so that modules in sleep mode can be effectively power-off. We also propose a region oriented FPGA placement algorithm fitted to this user's hierarchical design based on VPR[1]. Simulation results show that this proposed method could reduce power consumption of FPGA by 38% on average by setting unused modules or regions in sleep mode.
Hardware and Software Design of FPGA-based PCIe Gen3 interface for APEnet+ network interconnect system

NASA Astrophysics Data System (ADS)

Ammendola, R.; Biagioni, A.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Rossetti, D.; Simula, F.; Tosoratto, L.; Vicini, P.

2015-12-01

In the attempt to develop an interconnection architecture optimized for hybrid HPC systems dedicated to scientific computing, we designed APEnet+, a point-to-point, low-latency and high-performance network controller supporting 6 fully bidirectional off-board links over a 3D torus topology. The first release of APEnet+ (named V4) was a board based on a 40 nm Altera FPGA, integrating 6 channels at 34 Gbps of raw bandwidth per direction and a PCIe Gen2 x8 host interface. It has been the first-of-its-kind device to implement an RDMA protocol to directly read/write data from/to Fermi and Kepler NVIDIA GPUs using NVIDIA peer-to-peer and GPUDirect RDMA protocols, obtaining real zero-copy GPU-to-GPU transfers over the network. The latest generation of APEnet+ systems (now named V5) implements a PCIe Gen3 x8 host interface on a 28 nm Altera Stratix V FPGA, with multi-standard fast transceivers (up to 14.4 Gbps) and an increased amount of configurable internal resources and hardware IP cores to support main interconnection standard protocols. Herein we present the APEnet+ V5 architecture, the status of its hardware and its system software design. Both its Linux Device Driver and the low-level libraries have been redeveloped to support the PCIe Gen3 protocol, introducing optimizations and solutions based on hardware/software co-design.
Electrically reconfigurable logic array

NASA Technical Reports Server (NTRS)

Agarwal, R. K.

1982-01-01

To compose the complicated systems using algorithmically specialized logic circuits or processors, one solution is to perform relational computations such as union, division and intersection directly on hardware. These relations can be pipelined efficiently on a network of processors having an array configuration. These processors can be designed and implemented with a few simple cells. In order to determine the state-of-the-art in Electrically Reconfigurable Logic Array (ERLA), a survey of the available programmable logic array (PLA) and the logic circuit elements used in such arrays was conducted. Based on this survey some recommendations are made for ERLA devices.
Personal pervasive environments: practice and experience.

PubMed

Ballesteros, Francisco J; Guardiola, Gorka; Soriano, Enrique

2012-01-01

In this paper we present our experience designing and developing two different systems to enable personal pervasive computing environments, Plan B and the Octopus. These systems were fully implemented and have been used on a daily basis for years. Both are based on synthetic (virtual) file system interfaces and provide mechanisms to adapt to changes in the context and reconfigure the system to support pervasive applications. We also present the main differences between them, focusing on architectural and reconfiguration aspects. Finally, we analyze the pitfalls and successes of both systems and review the lessons we learned while designing, developing, and using them.
Personal Pervasive Environments: Practice and Experience

PubMed Central

Ballesteros, Francisco J.; Guardiola, Gorka; Soriano, Enrique

2012-01-01

In this paper we present our experience designing and developing two different systems to enable personal pervasive computing environments, Plan B and the Octopus. These systems were fully implemented and have been used on a daily basis for years. Both are based on synthetic (virtual) file system interfaces and provide mechanisms to adapt to changes in the context and reconfigure the system to support pervasive applications. We also present the main differences between them, focusing on architectural and reconfiguration aspects. Finally, we analyze the pitfalls and successes of both systems and review the lessons we learned while designing, developing, and using them. PMID:22969340
Real-Time Processing System for the JET Hard X-Ray and Gamma-Ray Profile Monitor Enhancement

NASA Astrophysics Data System (ADS)

Fernandes, Ana M.; Pereira, Rita C.; Neto, André; Valcárcel, Daniel F.; Alves, Diogo; Sousa, Jorge; Carvalho, Bernardo B.; Kiptily, Vasily; Syme, Brian; Blanchard, Patrick; Murari, Andrea; Correia, Carlos M. B. A.; Varandas, Carlos A. F.; Gonçalves, Bruno

2014-06-01

The Joint European Torus (JET) is currently undertaking an enhancement program which includes tests of relevant diagnostics with real-time processing capabilities for the International Thermonuclear Experimental Reactor (ITER). Accordingly, a new real-time processing system was developed and installed at JET for the gamma-ray and hard X-ray profile monitor diagnostic. The new system is connected to 19 CsI(Tl) photodiodes in order to obtain the line-integrated profiles of the gamma-ray and hard X-ray emissions. Moreover, it was designed to overcome the former data acquisition (DAQ) limitations while exploiting the required real-time features. The new DAQ hardware, based on the Advanced Telecommunication Computer Architecture (ATCA) standard, includes reconfigurable digitizer modules with embedded field-programmable gate array (FPGA) devices capable of acquiring and simultaneously processing data in real-time from the 19 detectors. A suitable algorithm was developed and implemented in the FPGAs, which are able to deliver the corresponding energy of the acquired pulses. The processed data is sent periodically, during the discharge, through the JET real-time network and stored in the JET scientific databases at the end of the pulse. The interface between the ATCA digitizers, the JET control and data acquisition system (CODAS), and the JET real-time network is provided by the Multithreaded Application Real-Time executor (MARTe). The work developed allowed attaining two of the major milestones required by next fusion devices: the ability to process and simultaneously supply high volume data rates in real-time.
FPGA-Based, Self-Checking, Fault-Tolerant Computers

NASA Technical Reports Server (NTRS)

Some, Raphael; Rennels, David

2004-01-01

A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and highspeed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant- computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing identical programs in lock step, with comparison of their outputs to detect errors. It would also contain various cache local memory circuits, communication circuits, and configurable special-purpose processors that would use self-checking checkers. (The basic principle of the self-checking checker method is to utilize logic circuitry that generates error signals whenever there is an error in either the checker or the circuit being checked.) The memory system would comprise a main memory and a hardware-controlled check-pointing system (CPS) based on a buffer memory denoted the recovery cache. The main memory would contain random-access memory (RAM) chips and FPGAs that would, in addition to everything else, implement double-error-detecting and single-error-correcting memory functions to enable recovery from single-bit errors.
Hardware realization of an SVM algorithm implemented in FPGAs

NASA Astrophysics Data System (ADS)

Wiśniewski, Remigiusz; Bazydło, Grzegorz; Szcześniak, Paweł

2017-08-01

The paper proposes a technique of hardware realization of a space vector modulation (SVM) of state function switching in matrix converter (MC), oriented on the implementation in a single field programmable gate array (FPGA). In MC the SVM method is based on the instantaneous space-vector representation of input currents and output voltages. The traditional computation algorithms usually involve digital signal processors (DSPs) which consumes the large number of power transistors (18 transistors and 18 independent PWM outputs) and "non-standard positions of control pulses" during the switching sequence. Recently, hardware implementations become popular since computed operations may be executed much faster and efficient due to nature of the digital devices (especially concurrency). In the paper, we propose a hardware algorithm of SVM computation. In opposite to the existing techniques, the presented solution applies COordinate Rotation DIgital Computer (CORDIC) method to solve the trigonometric operations. Furthermore, adequate arithmetic modules (that is, sub-devices) used for intermediate calculations, such as code converters or proper sectors selectors (for output voltages and input current) are presented in detail. The proposed technique has been implemented as a design described with the use of Verilog hardware description language. The preliminary results of logic implementation oriented on the Xilinx FPGA (particularly, low-cost device from Artix-7 family from Xilinx was used) are also presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Choong, W. -S.; Abu-Nimeh, F.; Moses, W. W.

Here, we present a 16-channel front-end readout board for the OpenPET electronics system. A major task in developing a nuclear medical imaging system, such as a positron emission computed tomograph (PET) or a single-photon emission computed tomograph (SPECT), is the electronics system. While there are a wide variety of detector and camera design concepts, the relatively simple nature of the acquired data allows for a common set of electronics requirements that can be met by a flexible, scalable, and high-performance OpenPET electronics system. The analog signals from the different types of detectors used in medical imaging share similar characteristics, whichmore » allows for a common analog signal processing. The OpenPET electronics processes the analog signals with Detector Boards. Here we report on the development of a 16-channel Detector Board. Each signal is digitized by a continuously sampled analog-to-digital converter (ADC), which is processed by a field programmable gate array (FPGA) to extract pulse height information. A leading edge discriminator creates a timing edge that is "time stamped" by a time-to-digital converter (TDC) implemented inside the FPGA. In conclusion, this digital information from each channel is sent to an FPGA that services 16 analog channels, and then information from multiple channels is processed by this FPGA to perform logic for crystal lookup, DOI calculation, calibration, etc.« less
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
FPGA-based trigger system for the Fermilab SeaQuest experimentz

NASA Astrophysics Data System (ADS)

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; Chang, Ting-Hua; Chang, Wen-Chen; Chen, Yen-Chu; Gilman, Ron; Nakano, Kenichi; Peng, Jen-Chieh; Wang, Su-Yin

2015-12-01

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ+ and μ- produced in 120 GeV/c proton-nucleon interactions in a high rate environment. The trigger system consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is then examined against pre-determined trigger matrices to identify candidate muon tracks. Information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.
FPGA-based Trigger System for the Fermilab SeaQuest Experimentz

DOE PAGES

Shiu, Shiuan-Hal; Wu, Jinyuan; McClellan, Randall Evan; ...

2015-09-10

The SeaQuest experiment (Fermilab E906) detects pairs of energetic μ + and μ -produced in 120 GeV/c proton–nucleon interactions in a high rate environment. The trigger system we used consists of several arrays of scintillator hodoscopes and a set of field-programmable gate array (FPGA) based VMEbus modules. Signals from up to 96 channels of hodoscope are digitized by each FPGA with a 1-ns resolution using the time-to-digital convertor (TDC) firmware. The delay of the TDC output can be adjusted channel-by-channel in 1-ns step and then re-aligned with the beam RF clock. The hit pattern on the hodoscope planes is thenmore » examined against pre-determined trigger matrices to identify candidate muon tracks. Finally, information on the candidate tracks is sent to the 2nd-level FPGA-based track correlator to find candidate di-muon events. The design and implementation of the FPGA-based trigger system for SeaQuest experiment are presented.« less
Compute Element and Interface Box for the Hazard Detection System

NASA Technical Reports Server (NTRS)

Villalpando, Carlos Y.; Khanoyan, Garen; Stern, Ryan A.; Some, Raphael R.; Bailey, Erik S.; Carson, John M.; Vaughan, Geoffrey M.; Werner, Robert A.; Salomon, Phil M.; Martin, Keith E.;

2013-01-01

The Autonomous Landing and Hazard Avoidance Technology (ALHAT) program is building a sensor that enables a spacecraft to evaluate autonomously a potential landing area to generate a list of hazardous and safe landing sites. It will also provide navigation inputs relative to those safe sites. The Hazard Detection System Compute Element (HDS-CE) box combines a field-programmable gate array (FPGA) board for sensor integration and timing, with a multicore computer board for processing. The FPGA does system-level timing and data aggregation, and acts as a go-between, removing the real-time requirements from the processor and labeling events with a high resolution time. The processor manages the behavior of the system, controls the instruments connected to the HDS-CE, and services the "heavy lifting" computational requirements for analyzing the potential landing spots.

Neuromorphic VLSI vision system for real-time texture segregation.

PubMed

Shimonomura, Kazuhiro; Yagi, Tetsuya

2008-10-01

The visual system of the brain can perceive an external scene in real-time with extremely low power dissipation, although the response speed of an individual neuron is considerably lower than that of semiconductor devices. The neurons in the visual pathway generate their receptive fields using a parallel and hierarchical architecture. This architecture of the visual cortex is interesting and important for designing a novel perception system from an engineering perspective. The aim of this study is to develop a vision system hardware, which is designed inspired by a hierarchical visual processing in V1, for real time texture segregation. The system consists of a silicon retina, orientation chip, and field programmable gate array (FPGA) circuit. The silicon retina emulates the neural circuits of the vertebrate retina and exhibits a Laplacian-Gaussian-like receptive field. The orientation chip selectively aggregates multiple pixels of the silicon retina in order to produce Gabor-like receptive fields that are tuned to various orientations by mimicking the feed-forward model proposed by Hubel and Wiesel. The FPGA circuit receives the output of the orientation chip and computes the responses of the complex cells. Using this system, the neural images of simple cells were computed in real-time for various orientations and spatial frequencies. Using the orientation-selective outputs obtained from the multi-chip system, a real-time texture segregation was conducted based on a computational model inspired by psychophysics and neurophysiology. The texture image was filtered by the two orthogonally oriented receptive fields of the multi-chip system and the filtered images were combined to segregate the area of different texture orientation with the aid of FPGA. The present system is also useful for the investigation of the functions of the higher-order cells that can be obtained by combining the simple and complex cells.
Application-specific coarse-grained reconfigurable array: architecture and design methodology

NASA Astrophysics Data System (ADS)

Zhou, Li; Liu, Dongpei; Zhang, Jianfeng; Liu, Hengzhu

2015-06-01

Coarse-grained reconfigurable arrays (CGRAs) have shown potential for application in embedded systems in recent years. Numerous reconfigurable processing elements (PEs) in CGRAs provide flexibility while maintaining high performance by exploring different levels of parallelism. However, a difference remains between the CGRA and the application-specific integrated circuit (ASIC). Some application domains, such as software-defined radios (SDRs), require flexibility with performance demand increases. More effective CGRA architectures are expected to be developed. Customisation of a CGRA according to its application can improve performance and efficiency. This study proposes an application-specific CGRA architecture template composed of generic PEs (GPEs) and special PEs (SPEs). The hardware of the SPE can be customised to accelerate specific computational patterns. An automatic design methodology that includes pattern identification and application-specific function unit generation is also presented. A mapping algorithm based on ant colony optimisation is provided. Experimental results on the SDR target domain show that compared with other ordinary and application-specific reconfigurable architectures, the CGRA generated by the proposed method performs more efficiently for given applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Cohen, J; Dossa, D; Gokhale, M

Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe:more » (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared performance of software-only to GPU-accelerated. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows: SuperMicro X7DBE Xeon Dual Socket Blackford Server Motherboard; 2 Intel Xeon Dual-Core 2.66 GHz processors; 1 GB DDR2 PC2-5300 RAM (2 x 512); 80GB Hard Drive (Seagate SATA II Barracuda). The Fusion board is presently capable of 4X in a PCIe slot. The image resampling benchmark was run on a dual Xeon workstation with NVIDIA graphics card (see Chapter 5 for full specification). An XtremeData Opteron+FPGA was used for the language classification application. We observed that these benchmarks are not uniformly I/O intensive. The only benchmark that showed greater that 50% of the time in I/O was the graph algorithm when it accessed data files over NFS. When local disk was used, the graph benchmark spent at most 40% of its time in I/O. The other benchmarks were CPU dominated. The image resampling benchmark and language classification showed order of magnitude speedup over software by using co-processor technology to offload the CPU-intensive kernels. Our experiments to date suggest that emerging hardware technologies offer significant benefit to boosting the performance of data-intensive algorithms. Using GPU and FPGA co-processors, we were able to improve performance by more than an order of magnitude on the benchmark algorithms, eliminating the processor bottleneck of CPU-bound tasks. Experiments with a prototype solid state nonvolative memory available today show 10X better throughput on random reads than disk, with a 2X speedup on a graph processing benchmark when compared to the use of local SATA disk.« less
Extending the BEAGLE library to a multi-FPGA platform

PubMed Central

2013-01-01

Background Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein’s pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein’s pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. Results The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform’s peak memory bandwidth and the implementation’s memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE’s CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE’s GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. Conclusions The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor. PMID:23331707
Rapid prototyping of SoC-based real-time vision system: application to image preprocessing and face detection

NASA Astrophysics Data System (ADS)

Jridi, Maher; Alfalou, Ayman

2017-05-01

By this paper, the major goal is to investigate the Multi-CPU/FPGA SoC (System on Chip) design flow and to transfer a know-how and skills to rapidly design embedded real-time vision system. Our aim is to show how the use of these devices can be benefit for system level integration since they make possible simultaneous hardware and software development. We take the facial detection and pretreatments as case study since they have a great potential to be used in several applications such as video surveillance, building access control and criminal identification. The designed system use the Xilinx Zedboard platform. The last is the central element of the developed vision system. The video acquisition is performed using either standard webcam connected to the Zedboard via USB interface or several camera IP devices. The visualization of video content and intermediate results are possible with HDMI interface connected to HD display. The treatments embedded in the system are as follow: (i) pre-processing such as edge detection implemented in the ARM and in the reconfigurable logic, (ii) software implementation of motion detection and face detection using either ViolaJones or LBP (Local Binary Pattern), and (iii) application layer to select processing application and to display results in a web page. One uniquely interesting feature of the proposed system is that two functions have been developed to transmit data from and to the VDMA port. With the proposed optimization, the hardware implementation of the Sobel filter takes 27 ms and 76 ms for 640x480, and 720p resolutions, respectively. Hence, with the FPGA implementation, an acceleration of 5 times is obtained which allow the processing of 37 fps and 13 fps for 640x480, and 720p resolutions, respectively.
Software-based high-level synthesis design of FPGA beamformers for synthetic aperture imaging.

PubMed

Amaro, Joao; Yiu, Billy Y S; Falcao, Gabriel; Gomes, Marco A C; Yu, Alfred C H

2015-05-01

Field-programmable gate arrays (FPGAs) can potentially be configured as beamforming platforms for ultrasound imaging, but a long design time and skilled expertise in hardware programming are typically required. In this article, we present a novel approach to the efficient design of FPGA beamformers for synthetic aperture (SA) imaging via the use of software-based high-level synthesis techniques. Software kernels (coded in OpenCL) were first developed to stage-wise handle SA beamforming operations, and their corresponding FPGA logic circuitry was emulated through a high-level synthesis framework. After design space analysis, the fine-tuned OpenCL kernels were compiled into register transfer level descriptions to configure an FPGA as a beamformer module. The processing performance of this beamformer was assessed through a series of offline emulation experiments that sought to derive beamformed images from SA channel-domain raw data (40-MHz sampling rate, 12 bit resolution). With 128 channels, our FPGA-based SA beamformer can achieve 41 frames per second (fps) processing throughput (3.44 × 10(8) pixels per second for frame size of 256 × 256 pixels) at 31.5 W power consumption (1.30 fps/W power efficiency). It utilized 86.9% of the FPGA fabric and operated at a 196.5 MHz clock frequency (after optimization). Based on these findings, we anticipate that FPGA and high-level synthesis can together foster rapid prototyping of real-time ultrasound processor modules at low power consumption budgets.
A low delay transmission method of multi-channel video based on FPGA

NASA Astrophysics Data System (ADS)

Fu, Weijian; Wei, Baozhi; Li, Xiaobin; Wang, Quan; Hu, Xiaofei

2018-03-01

In order to guarantee the fluency of multi-channel video transmission in video monitoring scenarios, we designed a kind of video format conversion method based on FPGA and its DMA scheduling for video data, reduces the overall video transmission delay.In order to sace the time in the conversion process, the parallel ability of FPGA is used to video format conversion. In order to improve the direct memory access (DMA) writing transmission rate of PCIe bus, a DMA scheduling method based on asynchronous command buffer is proposed. The experimental results show that this paper designs a low delay transmission method based on FPGA, which increases the DMA writing transmission rate by 34% compared with the existing method, and then the video overall delay is reduced to 23.6ms.
High-performance electronic image stabilisation for shift and rotation correction

NASA Astrophysics Data System (ADS)

Parker, Steve C. J.; Hickman, D. L.; Wu, F.

2014-06-01

A novel low size, weight and power (SWaP) video stabiliser called HALO™ is presented that uses a SoC to combine the high processing bandwidth of an FPGA, with the signal processing flexibility of a CPU. An image based architecture is presented that can adapt the tiling of frames to cope with changing scene dynamics. A real-time implementation is then discussed that can generate several hundred optical flow vectors per video frame, to accurately calculate the unwanted rigid body translation and rotation of camera shake. The performance of the HALO™ stabiliser is comprehensively benchmarked against the respected Deshaker 3.0 off-line stabiliser plugin to VirtualDub. Eight different videos are used for benchmarking, simulating: battlefield, surveillance, security and low-level flight applications in both visible and IR wavebands. The results show that HALO™ rivals the performance of Deshaker within its operating envelope. Furthermore, HALO™ may be easily reconfigured to adapt to changing operating conditions or requirements; and can be used to host other video processing functionality like image distortion correction, fusion and contrast enhancement.
From Motion to Photons in 80 Microseconds: Towards Minimal Latency for Virtual and Augmented Reality.

PubMed

Lincoln, Peter; Blate, Alex; Singh, Montek; Whitted, Turner; State, Andrei; Lastra, Anselmo; Fuchs, Henry

2016-04-01

We describe an augmented reality, optical see-through display based on a DMD chip with an extremely fast (16 kHz) binary update rate. We combine the techniques of post-rendering 2-D offsets and just-in-time tracking updates with a novel modulation technique for turning binary pixels into perceived gray scale. These processing elements, implemented in an FPGA, are physically mounted along with the optical display elements in a head tracked rig through which users view synthetic imagery superimposed on their real environment. The combination of mechanical tracking at near-zero latency with reconfigurable display processing has given us a measured average of 80 µs of end-to-end latency (from head motion to change in photons from the display) and also a versatile test platform for extremely-low-latency display systems. We have used it to examine the trade-offs between image quality and cost (i.e. power and logical complexity) and have found that quality can be maintained with a fairly simple display modulation scheme.
Flight Model of the `Flying Laptop' OBC and Reconfiguration Unit

NASA Astrophysics Data System (ADS)

Eickhoff, Jens; Stratton, Sam; Butz, Pius; Cook, Barry; Walker, Paul; Uryu, Alexander; Lengowski, Michael; Roser, Hans-Peter

2012-08-01

As already published in papers at the DASIA conferences 2010 in Budapest [1] and 2011 in Malta [2], the University of Stuttgart, Germany, is developing an advanced 3-axis stabilized small satellite applying industry standards for command/control techniques, onboard software design and onboard computer components. The satellite has a launch mass of approx. 120kg. One of the main challenges was the development of an ultra compact and performing onboard computer (OBC), which was intended to support an RTEMS operating system, a PUS standard based onboard software (OBSW) and CCSDS standard based ground/space communication. The developed architecture is based on 4 main elements (see [1, 2] and Figure 3) which are developed in cooperation with industrial partners:• the OBC core board based on the LEON3 FT architecture,• an I/O Board for all OBC digital interfaces to S/C equipment,• a CCSDS TC/TM decoder/encoder board,• reconfiguration unit being embedded in the satellite power control and distribution unit PCDU.In the meantime the EM / Breadboard units of the computer have been tested intensively including first HW/SW integration tests in a Satellite Testbench (see Figure 2). The FM HW elements from the co-authoring suppliers are under assembly in Stuttgart.
Nanopatterned reconfigurable spin-textures for magnonics

NASA Astrophysics Data System (ADS)

Albisetti, E.; Petti, D.; Pancaldi, M.; Madami, M.; Tacchi, S.; Curtis, J.; King, W. P.; Papp, A.; Csaba, G.; Porod, W.; Vavassori, P.; Riedo, E.; Bertacco, R.

The control of spin-waves holds the promise to enable energy-efficient information transport and wave-based computing. Conventionally, the engineering of spin-waves is achieved via physically patterning magnetic structures such as magnonic crystals and micro-nanowires. We demonstrate a new concept for creating reconfigurable magnonic nanostructures, by crafting at the nanoscale the magnetic anisotropy landscape of a ferromagnet exchange-coupled to an antiferromagnet. By performing a highly localized field cooling with the hot tip of a scanning probe microscope, magnetic structures, with arbitrarily oriented magnetization and tunable unidirectional anisotropy, are patterned without modifying the film chemistry and topography. We demonstrate that, in such structures, the spin-wave excitation and propagation can be spatially controlled at remanence, and can be tuned by external magnetic fields. This opens the way to the use of nanopatterned spin-textures, such as domains and domain walls, for exciting and manipulating magnons in reconfigurable nanocircuits. Partially funded by the EC through project SWING (no. 705326).
Reconfigurable logic in nanosecond Cu/GeTe/TiN filamentary memristors for energy-efficient in-memory computing.

PubMed

Jin, Miaomiao; Cheng, Long; Li, Yi; Hu, Siyu; Lu, Ke; Chen, Jia; Duan, Nian; Wang, Zhuorui; Zhou, Yaxiong; Chang, Ting-Chang; Miao, Xiangshui

2018-06-27

Owing to the capability of integrating the information storage and computing in the same physical location, in-memory computing with memristors has become a research hotspot as a promising route for non von Neumann architecture. However, it is still a challenge to develop high performance devices as well as optimized logic methodologies to realize energy-efficient computing. Herein, filamentary Cu/GeTe/TiN memristor is reported to show satisfactory properties with nanosecond switching speed (< 60 ns), low voltage operation (< 2 V), high endurance (>104 cycles) and good retention (>104 s @85℃). It is revealed that the charge carrier conduction mechanisms in high resistance and low resistance states are Schottky emission and hopping transport between the adjacent Cu clusters, respectively, based on the analysis of current-voltage behaviors and resistance-temperature characteristics. An intuitive picture is given to describe the dynamic processes of resistive switching. Moreover, based on the basic material implication (IMP) logic circuit, we proposed a reconfigurable logic method and experimentally implemented IMP, NOT, OR, and COPY logic functions. Design of a one-bit full adder with reduction in computational sequences and its validation in simulation further demonstrate the potential practical application. The results provide important progress towards understanding of resistive switching mechanism and realization of energy-efficient in-memory computing architecture. © 2018 IOP Publishing Ltd.
Operational Dynamic Configuration Analysis

NASA Technical Reports Server (NTRS)

Lai, Chok Fung; Zelinski, Shannon

2010-01-01

Sectors may combine or split within areas of specialization in response to changing traffic patterns. This method of managing capacity and controller workload could be made more flexible by dynamically modifying sector boundaries. Much work has been done on methods for dynamically creating new sector boundaries [1-5]. Many assessments of dynamic configuration methods assume the current day baseline configuration remains fixed [6-7]. A challenging question is how to select a dynamic configuration baseline to assess potential benefits of proposed dynamic configuration concepts. Bloem used operational sector reconfigurations as a baseline [8]. The main difficulty is that operational reconfiguration data is noisy. Reconfigurations often occur frequently to accommodate staff training or breaks, or to complete a more complicated reconfiguration through a rapid sequence of simpler reconfigurations. Gupta quantified a few aspects of airspace boundary changes from this data [9]. Most of these metrics are unique to sector combining operations and not applicable to more flexible dynamic configuration concepts. To better understand what sort of reconfigurations are acceptable or beneficial, more configuration change metrics should be developed and their distribution in current practice should be computed. This paper proposes a method to select a simple sequence of configurations among operational configurations to serve as a dynamic configuration baseline for future dynamic configuration concept assessments. New configuration change metrics are applied to the operational data to establish current day thresholds for these metrics. These thresholds are then corroborated, refined, or dismissed based on airspace practitioner feedback. The dynamic configuration baseline selection method uses a k-means clustering algorithm to select the sequence of configurations and trigger times from a given day of operational sector combination data. The clustering algorithm selects a simplified schedule containing k configurations based on stability score of the sector combinations among the raw operational configurations. In addition, the number of the selected configurations is determined based on balance between accuracy and assessment complexity.
Design and implementation of digital controllers for smart structures using field-programmable gate arrays

NASA Astrophysics Data System (ADS)

Kelly, Jamie S.; Bowman, Hiroshi C.; Rao, Vittal S.; Pottinger, Hardy J.

1997-06-01

Implementation issues represent an unfamiliar challenge to most control engineers, and many techniques for controller design ignore these issues outright. Consequently, the design of controllers for smart structural systems usually proceeds without regard for their eventual implementation, thus resulting either in serious performance degradation or in hardware requirements that squander power, complicate integration, and drive up cost. The level of integration assumed by the Smart Patch further exacerbates these difficulties, and any design inefficiency may render the realization of a single-package sensor-controller-actuator system infeasible. The goal of this research is to automate the controller implementation process and to relieve the design engineer of implementation concerns like quantization, computational efficiency, and device selection. We specifically target Field Programmable Gate Arrays (FPGAs) as our hardware platform because these devices are highly flexible, power efficient, and reprogrammable. The current study develops an automated implementation sequence that minimizes hardware requirements while maintaining controller performance. Beginning with a state space representation of the controller, the sequence automatically generates a configuration bitstream for a suitable FPGA implementation. MATLAB functions optimize and simulate the control algorithm before translating it into the VHSIC hardware description language. These functions improve power efficiency and simplify integration in the final implementation by performing a linear transformation that renders the controller computationally friendly. The transformation favors sparse matrices in order to reduce multiply operations and the hardware necessary to support them; simultaneously, the remaining matrix elements take on values that minimize limit cycles and parameter sensitivity. The proposed controller design methodology is implemented on a simple cantilever beam test structure using FPGA hardware. The experimental closed loop response is compared with that of an automated FPGA controller implementation. Finally, we explore the integration of FPGA based controllers into a multi-chip module, which we believe represents the next step towards the realization of the Smart Patch.
A single FPGA-based portable ultrasound imaging system for point-of-care applications.

PubMed

Kim, Gi-Duck; Yoon, Changhan; Kye, Sang-Bum; Lee, Youngbae; Kang, Jeeun; Yoo, Yangmo; Song, Tai-kyong

2012-07-01

We present a cost-effective portable ultrasound system based on a single field-programmable gate array (FPGA) for point-of-care applications. In the portable ultrasound system developed, all the ultrasound signal and image processing modules, including an effective 32-channel receive beamformer with pseudo-dynamic focusing, are embedded in an FPGA chip. For overall system control, a mobile processor running Linux at 667 MHz is used. The scan-converted ultrasound image data from the FPGA are directly transferred to the system controller via external direct memory access without a video processing unit. The potable ultrasound system developed can provide real-time B-mode imaging with a maximum frame rate of 30, and it has a battery life of approximately 1.5 h. These results indicate that the single FPGA-based portable ultrasound system developed is able to meet the processing requirements in medical ultrasound imaging while providing improved flexibility for adapting to emerging POC applications.
On-chip visual perception of motion: a bio-inspired connectionist model on FPGA.

PubMed

Torres-Huitzil, César; Girau, Bernard; Castellanos-Sánchez, Claudio

2005-01-01

Visual motion provides useful information to understand the dynamics of a scene to allow intelligent systems interact with their environment. Motion computation is usually restricted by real time requirements that need the design and implementation of specific hardware architectures. In this paper, the design of hardware architecture for a bio-inspired neural model for motion estimation is presented. The motion estimation is based on a strongly localized bio-inspired connectionist model with a particular adaptation of spatio-temporal Gabor-like filtering. The architecture is constituted by three main modules that perform spatial, temporal, and excitatory-inhibitory connectionist processing. The biomimetic architecture is modeled, simulated and validated in VHDL. The synthesis results on a Field Programmable Gate Array (FPGA) device show the potential achievement of real-time performance at an affordable silicon area.
Design of barrier bucket kicker control system

NASA Astrophysics Data System (ADS)

Ni, Fa-Fu; Wang, Yan-Yu; Yin, Jun; Zhou, De-Tai; Shen, Guo-Dong; Zheng, Yang-De.; Zhang, Jian-Chuan; Yin, Jia; Bai, Xiao; Ma, Xiao-Li

2018-05-01

The Heavy-Ion Research Facility in Lanzhou (HIRFL) contains two synchrotrons: the main cooler storage ring (CSRm) and the experimental cooler storage ring (CSRe). Beams are extracted from CSRm, and injected into CSRe. To apply the Barrier Bucket (BB) method on the CSRe beam accumulation, a new BB technology based kicker control system was designed and implemented. The controller of the system is implemented using an Advanced Reduced Instruction Set Computer (RISC) Machine (ARM) chip and a field-programmable gate array (FPGA) chip. Within the architecture, ARM is responsible for data presetting and floating number arithmetic processing. The FPGA computes the RF phase point of the two rings and offers more accurate control of the time delay. An online preliminary experiment on HIRFL was also designed to verify the functionalities of the control system. The result shows that the reference trigger point of two different sinusoidal RF signals for an arbitrary phase point was acquired with a matched phase error below 1° (approximately 2.1 ns), and the step delay time better than 2 ns were realized.
FPGA-accelerated algorithm for the regular expression matching system

NASA Astrophysics Data System (ADS)

Russek, P.; Wiatr, K.

2015-01-01

This article describes an algorithm to support a regular expressions matching system. The goal was to achieve an attractive performance system with low energy consumption. The basic idea of the algorithm comes from a concept of the Bloom filter. It starts from the extraction of static sub-strings for strings of regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and the central processing unit (CPU). The pipelined custom processor architecture is proposed and a software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in field programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example of target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
Development of embedded real-time and high-speed vision platform

NASA Astrophysics Data System (ADS)

Ouyang, Zhenxing; Dong, Yimin; Yang, Hua

2015-12-01

Currently, high-speed vision platforms are widely used in many applications, such as robotics and automation industry. However, a personal computer (PC) whose over-large size is not suitable and applicable in compact systems is an indispensable component for human-computer interaction in traditional high-speed vision platforms. Therefore, this paper develops an embedded real-time and high-speed vision platform, ER-HVP Vision which is able to work completely out of PC. In this new platform, an embedded CPU-based board is designed as substitution for PC and a DSP and FPGA board is developed for implementing image parallel algorithms in FPGA and image sequential algorithms in DSP. Hence, the capability of ER-HVP Vision with size of 320mm x 250mm x 87mm can be presented in more compact condition. Experimental results are also given to indicate that the real-time detection and counting of the moving target at a frame rate of 200 fps at 512 x 512 pixels under the operation of this newly developed vision platform are feasible.
Radiation-Hardened Solid-State Drive

NASA Technical Reports Server (NTRS)

Sheldon, Douglas J.

2010-01-01

A method is provided for a radiationhardened (rad-hard) solid-state drive for space mission memory applications by combining rad-hard and commercial off-the-shelf (COTS) non-volatile memories (NVMs) into a hybrid architecture. The architecture is controlled by a rad-hard ASIC (application specific integrated circuit) or a FPGA (field programmable gate array). Specific error handling and data management protocols are developed for use in a rad-hard environment. The rad-hard memories are smaller in overall memory density, but are used to control and manage radiation-induced errors in the main, and much larger density, non-rad-hard COTS memory devices. Small amounts of rad-hard memory are used as error buffers and temporary caches for radiation-induced errors in the large COTS memories. The rad-hard ASIC/FPGA implements a variety of error-handling protocols to manage these radiation-induced errors. The large COTS memory is triplicated for protection, and CRC-based counters are calculated for sub-areas in each COTS NVM array. These counters are stored in the rad-hard non-volatile memory. Through monitoring, rewriting, regeneration, triplication, and long-term storage, radiation-induced errors in the large NV memory are managed. The rad-hard ASIC/FPGA also interfaces with the external computer buses.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.