xilinx virtex fpgas: Topics by Science.gov

Sample records for xilinx virtex fpgas

Mitigating Upsets in SRAM-Based FPGAs from the Xilinx Virtex 2 Family

NASA Technical Reports Server (NTRS)

Swift, G. M.; Yui, C. C.; Carmichael, C.; Koga, R.; George, J. S.

2003-01-01

Static random access memory (SRAM) upset rates in field programmable gate arrays (FPGAs) from the Xilinx Virtex 2 family have been tested for radiation effects on configuration memory, block RAM and the power-on-reset (POR) and SelectMAP single event functional interrupts (SEFIs). Dynamic testing has shown the effectiveness and value of Triple Module Redundancy (TMR) and partial reconfiguration when used in conjunction. Continuing dynamic testing for more complex designs and other Virtex 2 capabilities (e.g., I/O standards, digital clock managers (DCM), etc.) is scheduled.
Initial Single Event Effects Testing of the Xilinx Virtex-4 Field Programmable Gate Array

NASA Technical Reports Server (NTRS)

Allen, Gregory R.; Swift, Gary M.; Carmichael, C.; Tseng, C.

2007-01-01

We present initial results for the thin epitaxial Xilinx Virtex-4 Field Programmable Gate Array (FPGA), and compare to previous results obtained for the Virtex-II and Virtex-II Pro. The data presented was acquired through a consortium based effort with the common goal of providing the space community with data and mitigation methods for the use of Xilinx FPGAs in space.
20-GFLOPS QR processor on a Xilinx Virtex-E FPGA

NASA Astrophysics Data System (ADS)

Walke, Richard L.; Smith, Robert W. M.; Lightbody, Gaye

2000-11-01

Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample rate systems, such as radar and comms, the calculation of adaptive weights is a very computationally demanding task that requires highly parallel solutions. For systems where low power consumption and volume are important the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement them as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve predictable dense layout and high-speed operation. We present results that show that 20 GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR-processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry standard hardware description language VHDL.
A Genetic Representation for Evolutionary Fault Recovery in Virtex FPGAs

NASA Technical Reports Server (NTRS)

Lohn, Jason; Larchev, Greg; DeMara, Ronald; Korsmeyer, David (Technical Monitor)

2003-01-01

Most evolutionary approaches to fault recovery in FPGAs focus on evolving alternative logic configurations as opposed to evolving the intra-cell routing. Since the majority of transistors in a typical FPGA are dedicated to interconnect, nearly 80% according to one estimate, evolutionary fault-recovery systems should benefit by accommodating routing. In this paper, we propose an evolutionary fault-recovery system employing a genetic representation that takes into account both logic and routing configurations. Experiments were run using a software model of the Xilinx Virtex FPGA. We report that using four Virtex combinational logic blocks, we were able to evolve a 100% accurate quadrature decoder finite state machine in the presence of a stuck-at-zero fault.
Xilinx Virtex-5QV (V5QV) Independent SEU Data

NASA Technical Reports Server (NTRS)

Berg, Melanie D.; LaBel, Kenneth A.; Pellish, Jonathan

2014-01-01

This is an independent study to determine the single event destructive and transient susceptibility of the Xilinx Virtex-5QV (SIRF) device. A framework for evaluating complex digital systems targeted for harsh radiation environments such as space is presented.
Single event upset susceptibility testing of the Xilinx Virtex II FPGA

NASA Technical Reports Server (NTRS)

Yui, C.; Swift, G.; Carmichael, C.

2002-01-01

Heavy ion testing of the Xilinx Virtex II was conducted on the configuration, block RAM and user flip flop cells to determine their single event upset susceptibility using LETs of 1.2 to 60 MeVcm^2/mg. A software program specifically designed to count errors in the FPGA was used to reveal L1/e values and single-event-functional interrupt failures.
Single event upset susceptibility testing of the Xilinx Virtex II FPGA

NASA Technical Reports Server (NTRS)

Carmichael, C.; Swift, C.; Yui, G.

2002-01-01

Heavy ion testing of the Xilinx Virtex II was conducted on the configuration, block RAM and user flip flop cells to determine their static single-event upset susceptibility using LETs of 1.2 to 60 MeVcm^2/mg. A software program specifically designed to count errors in the FPGA was used to reveal L1/e values (the LET at which the cross section is 1/e times the saturation cross-section) and single-event functional-interrupt failures.
Tradeoffs in Flight Design Upset Mitigation in State of the Art FPGAs: Hardened by Design vs. Design Level Hardening

NASA Technical Reports Server (NTRS)

Swift, Gary M.; Roosta, Ramin

2004-01-01

This presentation compares and contrasts the effectiveness and the system/designer impacts of the two main approaches to upset hardening: the Actel approach (RTSX-S and RTAX-S) of low-level (inside each flip-flop) triplication and the Xilinx approach (Virtex and Virtex2) of design-level triplication of both functional blocks and voters. The effectiveness of these approaches is compared using measurements made in conjunction with each FPGA's manufacturer: for Actel, published data [1] and for Xilinx, recent results from the Xilinx SEE Test Consortium (note that the author is an active and founding member). The impacts involve Actel advantages in the areas of transistor-utilization efficiency and minimizing designer involvement in the triplication while the Xilinx advantages relate to the ability to custom tailor upset hardness and the flexibility of re-configurability. Additionally, there are currently clear Xilinx advantages in available features such as the number of I/Os, logic cells, and RAM blocks as well as speed. However, the advantage of the Actel anti-fuses for configuration over the Xilinx SRAM cells is that the latter need additional functionality and external circuitry (PROMs and, at least, a watchdog timer) for configuration and configuration scrubbing. Further, although effectively mitigated if done correctly, the proton upset-ability of the Xilinx FPGAs is a concern in severe proton-rich environments. Ultimately, both manufacturers' upset hardening is limited by SEFI (single-event functional interrupt) rates where it appears the Actel results are better although the Xilinx Virtex2-family result of about one SEFI in 65 device-years in solar-min GCR (the more intense part of the galactic cosmic-ray background) should be acceptable to most missions.
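
The design-level triplication with voters described in this record can be illustrated with a minimal Python sketch. It is not taken from the presentation: the names `majority_vote` and `tmr_run`, the 8-bit toy module, and the `upset_mask` fault model are all hypothetical, and a real design would implement the voter in FPGA logic rather than software.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


def tmr_run(module, x: int, upset_mask: int = 0) -> int:
    """Evaluate three copies of `module` and vote on the result.

    `upset_mask` (hypothetical fault model) flips bits in one copy's
    output to mimic a single-event upset in that copy only.
    """
    outputs = [module(x), module(x), module(x)]
    outputs[1] ^= upset_mask  # corrupt exactly one of the three copies
    return majority_vote(*outputs)


if __name__ == "__main__":
    increment = lambda v: (v + 1) & 0xFF  # toy 8-bit functional block
    print(tmr_run(increment, 41))                   # no upset injected
    print(tmr_run(increment, 41, upset_mask=0x10))  # one copy upset, voter masks it
```

The voter masks any error confined to a single copy; errors in two copies at the same bit position defeat it, which is one reason the records above pair triplication with configuration scrubbing.
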
Single Event Analysis and Fault Injection Techniques Targeting Complex Designs Implemented in Xilinx-Virtex Family Field Programmable Gate Array (FPGA) Devices

NASA Technical Reports Server (NTRS)

Berg, Melanie D.; LaBel, Kenneth; Kim, Hak

2014-01-01

An informative session regarding SRAM FPGA basics. Presenting a framework for fault injection techniques applied to Xilinx Field Programmable Gate Arrays (FPGAs). Introduce an overlooked time component that illustrates fault injection is impractical for most real designs as a stand-alone characterization tool. Demonstrate procedures that benefit from fault injection error analysis.
A Frequency Agile, Self-Adaptive Serial Link on Xilinx FPGAs

NASA Astrophysics Data System (ADS)

Aloisio, A.; Giordano, R.; Izzo, V.; Perrella, S.

2015-06-01

In this paper, we focused on the GTX transceiver modules of Xilinx Kintex 7 field-programmable gate arrays (FPGAs), which provide high bandwidth, low jitter on the recovered clock, and an equalization system on the transmitter and the receiver. We present a frequency agile, auto-adaptive serial link. The link is able to take care of the reconfiguration of the GTX parameters in order to fully benefit from the available link bandwidth, by setting the highest line rate. It is designed around an FPGA-embedded microprocessor, which drives the programmable ports of the GTX in order to control the quality of the received data and to easily calculate the bit-error rate in each sampling point of the eye diagram. We present the self-adaptive link project, the description of the test system, and the main results.
Physics of Failure Analysis of Xilinx Flip Chip CCGA Packages: Effects of Mission Environments on Properties of LP2 Underfill and ATI Lid Adhesive Materials

NASA Technical Reports Server (NTRS)

Suh, Jong-ook

2013-01-01

The Xilinx Virtex 4QV and 5QV (V4 and V5) are next-generation field-programmable gate arrays (FPGAs) for space applications. However, there have been concerns within the space community regarding the non-hermeticity of V4/V5 packages; polymeric materials such as the underfill and lid adhesive will be directly exposed to the space environment. In this study, reliability concerns associated with the non-hermeticity of V4/V5 packages were investigated by studying properties and behavior of the underfill and the lid adhesive materials used in V4/V5 packages.
Radiation Hardening by Software Techniques on FPGAs: Flight Experiment Evaluation and Results

NASA Technical Reports Server (NTRS)

Schmidt, Andrew G.; Flatley, Thomas

2017-01-01

We present our work on implementing Radiation Hardening by Software (RHBSW) techniques on the Xilinx Virtex5 FPGA's PowerPC 440 processors on the SpaceCube 2.0 platform. The techniques have been matured and tested through simulation modeling, fault emulation, laser fault injection and now in a flight experiment, as part of the Space Test Program-Houston 4-ISS SpaceCube Experiment 2.0 (STP-H4-ISE 2.0). This work leverages concepts such as heartbeat monitoring, control flow assertions, and checkpointing, commonly used in the High Performance Computing industry, and adapts them for use in remote sensing embedded systems. These techniques are extremely low overhead (typically <1.3%), enabling a 3.3x gain in processing performance as compared to the equivalent traditionally radiation hardened processor. The recently concluded STP-H4 flight experiment was an opportunity to upgrade the RHBSW techniques for the Virtex5 FPGA and demonstrate them on-board the ISS to achieve TRL 7. This work details the implementation of the RHBSW techniques that were previously developed for the Virtex4-based SpaceCube 1.0 platform, on the Virtex5-based SpaceCube 2.0 flight platform. The evaluation spans the development and integration with flight software, remotely uploading the new experiment to the ISS SpaceCube 2.0 platform, and conducting the experiment continuously for 16 days before the platform was decommissioned. The experiment was conducted on two PowerPCs embedded within the Virtex5 FPGA devices and the experiment collected 19,400 checkpoints, processed 253,482 status messages, and incurred 0 faults. These results are highly encouraging and future work is looking into longer duration testing as part of the STP-H5 flight experiment.
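
As a rough illustration of how heartbeats, assertions, and checkpointing can fit together, the following toy Python sketch rolls a task back to its last checkpoint when a consistency assertion fails. It is an assumption-laden sketch, not the RHBSW implementation: the class `CheckpointedTask`, the running-sum workload, and the `corrupt_at` bit-flip injection are hypothetical.

```python
import copy


class CheckpointedTask:
    """Toy task: sums 1..n, checkpoints periodically, rolls back on a failed assertion."""

    def __init__(self, checkpoint_every: int = 4):
        self.state = {"step": 0, "acc": 0}
        self.checkpoint_every = checkpoint_every
        self.saved = copy.deepcopy(self.state)  # last known-good checkpoint
        self.heartbeats = 0                     # one heartbeat per verified step
        self.rollbacks = 0

    def run(self, n_steps: int, corrupt_at=None) -> int:
        injected = False
        while self.state["step"] < n_steps:
            prev = self.state["step"]
            self.state["step"] = prev + 1
            self.state["acc"] += self.state["step"]
            if corrupt_at == self.state["step"] and not injected:
                self.state["acc"] ^= 0x40       # emulated single bit flip (hypothetical)
                injected = True
            step = self.state["step"]
            # Assertion: the step counter advanced by one and the sum matches its closed form.
            if step != prev + 1 or self.state["acc"] != step * (step + 1) // 2:
                self.state = copy.deepcopy(self.saved)  # roll back to the checkpoint
                self.rollbacks += 1
                continue
            self.heartbeats += 1
            if step % self.checkpoint_every == 0:
                self.saved = copy.deepcopy(self.state)
        return self.state["acc"]


if __name__ == "__main__":
    task = CheckpointedTask()
    print(task.run(10, corrupt_at=6), task.rollbacks, task.heartbeats)
```

In this sketch the work redone after a fault is bounded by the checkpoint interval, so a shorter interval trades more frequent state saves for less recomputation.
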
Thermal Interface Materials Selection and Application Guidelines: In Perspective of Xilinx Virtex-5QV Thermal Management

NASA Technical Reports Server (NTRS)

Suh, Jong-ook; Dillon, R. Peter; Tseng, Stephen

2015-01-01

The heat from high-power microdevices for space, such as Xilinx Virtex 4 and 5 (V4 and V5), has to be removed mainly through conduction in the space vacuum environment. The class-Y type packages are designed to remove the heat from the top of the package, and the most effective method to remove heat from the class-Y type packages is to attach a heat transfer device on the lid of the package and to transfer the heat to frame or chassis. When a heat transfer device is attached to the package lid, the surface roughness of the package lid and the heat transfer device reduces the effective contact area between the two. The reduced contact area results in increased thermal contact resistance, and a thermal interface material is required to reduce the thermal contact resistance by filling in the gap between the surfaces of the package lid and the heat transfer device. The current report describes JPL's FY14 NEPP task study on property requirements of TIM and impact of TIM properties on the packaging reliability. The current task also developed apparatuses to investigate the performance of TIMs in the actual mission environment.
Analyzing the effectiveness of a frame-level redundancy scrubbing technique for SRAM-based FPGAs

DOE PAGES

Tonfat, Jorge; Lima Kastensmidt, Fernanda; Rech, Paolo; ...

2015-12-17

Radiation effects such as soft errors are the major threat to the reliability of SRAM-based FPGAs. This work analyzes the effectiveness in correcting soft errors of a novel scrubbing technique using internal frame redundancy called Frame-level Redundancy Scrubbing (FLR-scrubbing). This correction technique can be implemented in a coarse grain TMR design. The FLR-scrubbing technique was implemented on a mid-size Xilinx Virtex-5 FPGA device used as a case study. The FLR-scrubbing technique was tested under neutron radiation and fault injection. Implementation results demonstrated minimum area and energy consumption overhead when compared to other techniques. The time to repair the fault is also improved by using the Internal Configuration Access Port (ICAP). Lastly, neutron radiation test results demonstrated that the proposed technique is suitable for correcting accumulated SEUs and MBUs.
Analyzing the effectiveness of a frame-level redundancy scrubbing technique for SRAM-based FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tonfat, Jorge; Lima Kastensmidt, Fernanda; Rech, Paolo

Radiation effects such as soft errors are the major threat to the reliability of SRAM-based FPGAs. This work analyzes the effectiveness in correcting soft errors of a novel scrubbing technique using internal frame redundancy called Frame-level Redundancy Scrubbing (FLR-scrubbing). This correction technique can be implemented in a coarse grain TMR design. The FLR-scrubbing technique was implemented on a mid-size Xilinx Virtex-5 FPGA device used as a case study. The FLR-scrubbing technique was tested under neutron radiation and fault injection. Implementation results demonstrated minimum area and energy consumption overhead when compared to other techniques. The time to repair the fault is also improved by using the Internal Configuration Access Port (ICAP). Lastly, neutron radiation test results demonstrated that the proposed technique is suitable for correcting accumulated SEUs and MBUs.
Minimizing energy dissipation of matrix multiplication kernel on Virtex-II

NASA Astrophysics Data System (ADS)

Choi, Seonil; Prasanna, Viktor K.; Jang, Ju-wook

2002-07-01

In this paper, we develop energy-efficient designs for matrix multiplication on FPGAs. To analyze the energy dissipation, we develop a high-level model using domain-specific modeling techniques. In this model, we identify architecture parameters that significantly affect the total energy (system-wide energy) dissipation. Then, we explore design trade-offs by varying these parameters to minimize the system-wide energy. For matrix multiplication, we consider a uniprocessor architecture and a linear array architecture to develop energy-efficient designs. For the uniprocessor architecture, the cache size is a parameter that affects the I/O complexity and the system-wide energy. For the linear array architecture, the amount of storage per processing element is a parameter affecting the system-wide energy. By using the maximum amount of storage per processing element and the minimum number of multipliers, we obtain a design that minimizes the system-wide energy. We develop several energy-efficient designs for matrix multiplication. For example, for 6×6 matrix multiplication, energy savings of up to 52% for the uniprocessor architecture and 36% for the linear array architecture are achieved over an optimized library for Virtex-II FPGA from Xilinx.
Design space exploration of high throughput finite field multipliers for channel coding on Xilinx FPGAs

NASA Astrophysics Data System (ADS)

de Schryver, C.; Weithoffer, S.; Wasenmüller, U.; Wehn, N.

2012-09-01

Channel coding is a standard technique in all wireless communication systems. In addition to the typically employed methods like convolutional coding, turbo coding or low density parity check (LDPC) coding, algebraic codes are used in many cases. For example, outer BCH coding is applied in the DVB-S2 standard for satellite TV broadcasting. A key operation for BCH and the related Reed-Solomon codes is multiplication in finite fields (Galois fields), where extension fields of prime fields are used. Many architectures for multiplication in finite fields have been published over the last decades. This paper examines in detail four different multiplier architectures that offer the potential for very high throughputs. We investigate the implementation performance of these multipliers on FPGA technology in the context of channel coding. We study the efficiency of the multipliers with respect to area, frequency and throughput, as well as configurability and scalability. The implementation data of the fully verified circuits are provided for a Xilinx Virtex-4 device after place and route.
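
For readers unfamiliar with the operation being accelerated, a bit-serial shift-and-add multiplication in GF(2^m) can be sketched in Python as follows. This is a generic textbook formulation, not one of the four architectures examined in the paper; the function name `gf_mul`, the default field size m = 8, and the default reduction polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) are assumptions chosen for illustration.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D, m: int = 8) -> int:
    """Multiply a and b in GF(2^m), reducing modulo the polynomial `poly`.

    Addition in GF(2^m) is XOR; each loop iteration handles one bit of b,
    which is what a bit-serial hardware multiplier does once per clock cycle.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a      # add the current shifted copy of a
        b >>= 1
        a <<= 1              # multiply a by x
        if a & (1 << m):
            a ^= poly        # reduce when the degree reaches m
    return result


if __name__ == "__main__":
    print(hex(gf_mul(0x02, 0x80)))         # x * x^7 reduced by the default polynomial
    print(hex(gf_mul(0x57, 0x83, 0x11B)))  # same routine with the AES polynomial
```

A fully parallel multiplier unrolls this loop into a network of AND and XOR gates, trading area for throughput, which is the kind of design-space trade-off the record describes.
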
Virtex-5QV Self Scrubber

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wojahn, Christopher K.

2015-10-20

This HDL code (hereafter referred to as "software") implements circuitry in Xilinx Virtex-5QV Field Programmable Gate Array (FPGA) hardware. This software allows the device to self-check the consistency of its own configuration memory for radiation-induced errors. The software then provides the capability to correct any single-bit errors detected in the memory using the device's inherent circuitry, or reload corrupted memory frames when larger errors occur that cannot be corrected with the device's built-in error correction and detection scheme.
Virtex-II Pro PowerPC SEE Characterization Test Methods and Results

NASA Technical Reports Server (NTRS)

Petrick, David; Powell, Wesley; LaBel, Ken; Howard, James

2005-01-01

The Xilinx Virtex-II Pro is a platform FPGA that embeds multiple microprocessors within the fabric of an SRAM-based reprogrammable FPGA. The variety and quantity of resources provided by this family of devices make them very attractive for spaceflight applications. However, these devices will be susceptible to single event effects (SEE), which must be mitigated. Observations from prior testing of the Xilinx Virtex-II Pro suggest that the PowerPC core has significant vulnerability to SEEs. However, these initial tests were not designed to exclusively target the functionality of the PowerPC, therefore making it difficult to distinguish processor upsets from fabric upsets. The main focus of this paper involves detailed SEE testing of the embedded PowerPC core. Due to the complexity of the PowerPC, various custom test applications, both static and dynamic, will be designed to isolate each unit of the processor. Collective analysis of the test results will provide insight into the exact upset mechanism of the PowerPC. With this information, mitigation schemes can be developed and tested that address the specific susceptibilities of these devices. The test bed will be the Xilinx SEE Consortium Virtex-II Pro test board, which allows for configuration scrubbing, design triplication, and ease of data collection. Testing will be performed at the Indiana University Cyclotron Facility using protons of varying energy levels and fluences. This paper will present the detailed test approach along with the results.
Mitigating Upsets in SRAM Based FPGAs from the Xilinx Virtex 2 Family

NASA Technical Reports Server (NTRS)

Swift, Gary M.; Yui, Candice C.; Carmichael, Carl; Koga, Rocky; George, Jeffrey S.

2003-01-01

This slide presentation reviews the single event upset static testing of the Virtex II field programmable gate arrays (FPGA) that were tested in protons and heavy-ions. The test designs and static and dynamic test results are reviewed.

A Re-programmable Platform for Dynamic Burn-in Test of Xilinx Virtex-II 3000 FPGA for Military and Aerospace Applications

NASA Technical Reports Server (NTRS)

Roosta, Ramin; Wang, Xinchen; Sadigursky, Michael; Tracton, Phil

2004-01-01

Field Programmable Gate Arrays (FPGAs) have played increasingly important roles in military and aerospace applications. Xilinx SRAM-based FPGAs have been extensively used in commercial applications. They have been used less frequently in space flight applications due to their susceptibility to single-event upsets. Reliability of these devices in space applications is a concern that has not been addressed. The objective of this project is to design a fully programmable hardware/software platform that allows (but is not limited to) comprehensive static/dynamic burn-in test of Virtex-II 3000 FPGAs, at-speed test and SEU test. Conventional methods test very few discrete AC parameters (primarily switching) of a given integrated circuit. This approach will test any possible configuration of the FPGA and any associated performance parameters. It allows complete or partial re-programming of the FPGA and verification of the program by using read back followed by dynamic test. Designers have full control over which functional elements of the FPGA to stress. They can completely simulate all possible types of configurations/functions. Another benefit of this platform is that it allows collecting information on elevation of the junction temperature as a function of gate utilization, operating frequency and functionality. A software tool has been implemented to demonstrate the various features of the system. The software consists of three major parts: the parallel interface driver, main system procedure and a graphical user interface (GUI).
Scalable System Design for Covert MIMO Communications

DTIC Science & Technology

2014-06-01

Sample based resolution of the QRD and equalization processes in the MIMO receiver, for NQR = 11...55 5.1 NQR calculation parameters ... 73 5.2 Resources available on Xilinx Virtex-7 FPGAs...carried out for Na ∈ [2 3 4]. Extrapolation is used to determine trends as a function of the number of QRD blocks instantiated NQR and Na. This section
Evaluation of the FIR Example using Xilinx Vivado High-Level Synthesis Compiler

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Finkel, Hal; Yoshii, Kazutomo

Compared to central processing units (CPUs) and graphics processing units (GPUs), field programmable gate arrays (FPGAs) have major advantages in reconfigurability and performance achieved per watt. This development flow has been augmented with high-level synthesis (HLS) flow that can convert programs written in a high-level programming language to Hardware Description Language (HDL). Using high-level programming languages such as C, C++, and OpenCL for FPGA-based development could allow software developers, who have little FPGA knowledge, to take advantage of the FPGA-based application acceleration. This improves developer productivity and makes the FPGA-based acceleration accessible to hardware and software developers. Xilinx Vivado HLS compiler is a high-level synthesis tool that enables C, C++ and SystemC specification to be directly targeted into Xilinx FPGAs without the need to create RTL manually. The white paper [1] published recently by Xilinx uses a finite impulse response (FIR) example to demonstrate the variable-precision features in the Vivado HLS compiler and the resource and power benefits of converting floating point to fixed point for a design. To get a better understanding of variable-precision features in terms of resource usage and performance, this report presents the experimental results of evaluating the FIR example using Vivado HLS 2017.1 and a Kintex Ultrascale FPGA. In addition, we evaluated the half-precision floating-point data type against the double-precision and single-precision data type and present the detailed results.
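
The floating-point to fixed-point conversion mentioned in this record can be illustrated with a small Python model of a FIR filter. This is only a behavioral sketch, not Vivado HLS code and not the example from the white paper: the function names, the 12 fractional bits, and the sample data are assumptions made for illustration.

```python
def fir_float(samples, coeffs):
    """Direct-form FIR filter in floating point: y[n] = sum_k c[k] * x[n-k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to a signed integer holding `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def fir_fixed(samples, coeffs, frac_bits: int = 12):
    """Same filter using integer multiply-accumulate, as fixed-point hardware would."""
    q_coeffs = [to_fixed(c, frac_bits) for c in coeffs]
    q_samples = [to_fixed(s, frac_bits) for s in samples]
    out = []
    for n in range(len(q_samples)):
        acc = 0  # integer accumulator; each product carries 2 * frac_bits fractional bits
        for k, c in enumerate(q_coeffs):
            if n - k >= 0:
                acc += c * q_samples[n - k]
        out.append(acc / float(1 << (2 * frac_bits)))  # rescale for comparison
    return out


if __name__ == "__main__":
    x = [0.9, -0.7, 0.33, 0.5]  # hypothetical input samples
    h = [0.1, 0.2, 0.3]         # hypothetical filter taps
    for f, q in zip(fir_float(x, h), fir_fixed(x, h)):
        print(f"{f:+.6f} {q:+.6f} error={abs(f - q):.2e}")
```

Changing `frac_bits` shows the precision side of the trade-off; the resource and power side reported in the record comes from synthesis results and cannot be reproduced by a software model like this one.
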
Virtex-5 CN Package Daisy Chain Evaluation Test Report

NASA Technical Reports Server (NTRS)

Suh, Jong-ook

2016-01-01

The board-level temperature cycling reliability of Xilinx Virtex-5 (V5) CN package was investigated. V5s were temperature cycled under two conditions, 0 to +100 C (0/100) and -55 to +100 C (-55/100). During the 0/100 test, no part out of 8 parts failed up to 6586 cycles. During the -55/100 test, one part out of 8 parts failed at 1236 cycles, and there were no additional failures up to 1705 cycles. The failure mode of the part that failed at 1236 cycles indicated that most likely the failure was not a solder fatigue failure, and therefore no obvious solder fatigue failure was observed throughout the tests.
MicroShell Minimalist Shell for Xilinx Microprocessors

NASA Technical Reports Server (NTRS)

Werne, Thomas A.

2011-01-01

MicroShell is a lightweight shell environment for engineers and software developers working with embedded microprocessors in Xilinx FPGAs. (MicroShell has also been successfully ported to run on ARM Cortex-M1 microprocessors in Actel ProASIC3 FPGAs, but without project-integration support.) MicroShell decreases the time spent performing initial tests of field-programmable gate array (FPGA) designs, simplifies running customizable one-time-only experiments, and provides a familiar-feeling command-line interface. The program comes with a collection of useful functions and enables the designer to add an unlimited number of custom commands, which are callable from the command-line. The commands are parameterizable (using the C-based command-line parameter idiom), so the designer can use one function to exercise hardware with different values. Also, since many hardware peripherals instantiated in FPGAs have reasonably simple register-mapped I/O interfaces, the engineer can edit and view hardware parameter settings at any time without stopping the processor. MicroShell comes with a set of support scripts that interface seamlessly with Xilinx's EDK tool. Adding an instance of MicroShell to a project is as simple as marking a check box in a library configuration dialog box and specifying a software project directory. The support scripts then examine the hardware design, build design-specific functions, conditionally include processor-specific functions, and complete the compilation process. For code-size constrained designs, most of the stock functionality can be excluded from the compiled library. When all of the configurable options are removed from the binary, MicroShell has an unoptimized memory footprint of about 4.8 kB and a size-optimized footprint of about 2.3 kB. Since MicroShell allows unfettered access to all processor-accessible memory locations, it is possible to perform live patching on a running system. This can be useful, for instance, if a bug is
DOE Office of Scientific and Technical Information (OSTI.GOV)

Learn, Mark Walter

Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not available to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.
High-Precision Pulse Generator

NASA Technical Reports Server (NTRS)

Katz, Richard; Kleyner, Igor

2011-01-01

A document discusses a pulse generator with subnanosecond resolution implemented with a low-cost field-programmable gate array (FPGA) at low power levels. The method used exploits the fast carry chains of certain FPGAs. Prototypes have been built and tested in both Actel AX and Xilinx Virtex 4 technologies. In-flight calibration or control can be performed by using a similar and related technique as a time interval measurement circuit by measuring a period of the stable oscillator, as the delays through the fast carry chains will vary as a result of manufacturing variances as well as the result of environmental conditions (voltage, aging, temperature, and radiation).
Applying a Genetic Algorithm to Reconfigurable Hardware

NASA Technical Reports Server (NTRS)

Wells, B. Earl; Weir, John; Trevino, Luis; Patrick, Clint; Steincamp, Jim

2004-01-01

This paper investigates the feasibility of applying genetic algorithms to solve optimization problems that are implemented entirely in reconfigurable hardware. The paper highlights the performance/design space trade-offs that must be understood to effectively implement a standard genetic algorithm within a modern Field Programmable Gate Array (FPGA) reconfigurable hardware environment and presents a case-study where this stochastic search technique is applied to standard test-case problems taken from the technical literature. In this research, the targeted FPGA-based platform and high-level design environment was the Starbridge Hypercomputing platform, which incorporates multiple Xilinx Virtex II FPGAs, and the Viva(TM) graphical hardware description language.
NASA Accelerates SpaceCube Technology into Orbit

NASA Technical Reports Server (NTRS)

Petrick, David

2010-01-01

On May 11, 2009, STS-125 Space Shuttle Atlantis blasted off from Kennedy Space Center on a historic mission to service the Hubble Space Telescope (HST). In addition to sending up the hardware and tools required to repair the observatory, the servicing team at NASA's Goddard Space Flight Center also sent along a complex experimental payload called Relative Navigation Sensors (RNS). The main objective of the RNS payload was to provide real-time image tracking of HST during rendezvous and docking operations. RNS was a complete success, and was brought to life by four Xilinx FPGAs (Field Programmable Gate Arrays) tightly packed into one integrated computer called SpaceCube. SpaceCube is a compact, reconfigurable, multiprocessor computing platform for space applications demanding extreme processing capabilities based on Xilinx Virtex 4 FX60 FPGAs. In a matter of months, the concept quickly went from the white board to a fully funded flight project. The 4-inch by 4-inch SpaceCube processor card was prototyped by a group of Goddard engineers using internal research funding. Once engineers were able to demonstrate the processing power of SpaceCube to NASA, HST management stood behind the product and invested in a flight qualified version, inserting it into the heart of the RNS system. With the determination of putting Xilinx into space, the team strengthened to a small army and delivered a fully functional, space qualified system to the mission.
Design and implementation of projects with Xilinx Zynq FPGA: a practical case

NASA Astrophysics Data System (ADS)

Travaglini, R.; D'Antone, I.; Meneghini, S.; Rignanese, L.; Zuffa, M.

The main advantage of using FPGAs with embedded processors is the availability of several additional high-performance resources in the same physical device. Moreover, the FPGA programmability allows custom peripherals to be connected. Xilinx has designed a programmable device named Zynq-7000 (simply called Zynq in the following), which integrates programmable logic (identical to the other Xilinx "series 7" devices) with a System on Chip (SoC) based on two embedded ARM processors. Since both parts are tightly connected, designers benefit from the performance of a hardware SoC as well as the flexibility of programmable logic. In this paper a design developed by the Electronic Design Department at the Bologna Division of INFN will be presented as a practical case of a project based on the Zynq device. It is developed using a commercial board called ZedBoard hosting an FMC mezzanine with a 12-bit 500 MS/s ADC. The Zynq FPGA on the ZedBoard receives digital outputs from the ADC and sends them to the acquisition PC, after proper formatting, through a Gigabit Ethernet link. The major focus of the paper will be the methodology for developing a Zynq-based design with the Xilinx Vivado software, highlighting how to configure the SoC and connect it with the programmable logic. Firmware design techniques will be presented: in particular, both VHDL and IP core based strategies will be discussed. Further, the procedure to develop software for the embedded processor will be presented. Finally, some debugging tools, like the embedded Logic Analyzer, will be shown. Advantages and disadvantages with respect to adopting FPGAs without embedded processors will be discussed.
A 128-channel Time-to-Digital Converter (TDC) inside a Virtex-5 FPGA on the GANDALF module

NASA Astrophysics Data System (ADS)

Büchele, M.; Fischer, H.; Gorzellik, M.; Herrmann, F.; Königsmann, K.; Schill, C.; Schopferer, S.

2012-03-01

The GANDALF 6U-VME64x/VXS module has been developed for the digitization and real time analysis of detector signals. To perform different applications such as analog-to-digital or time-to-digital conversions, coincidence matrix formation, fast pattern recognition and trigger generation, this module comes with exchangeable analog and digital mezzanine cards. Based on this platform, we present a 128-channel TDC which is implemented in a single Xilinx Virtex-5 FPGA using a shifted clock sampling method. In contrast to common TDC concepts, the input signal is sampled by 16 equidistant phase-shifted clocks. A particular challenge of the design is the minimum skew routing of the input signals to the sampling flip-flops. We present measurement results for the differential nonlinearity and the time resolution of the TDC readout system.
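The shifted clock sampling idea in the abstract above can be illustrated with a small numerical model. This is only a sketch, not the GANDALF firmware: the 1600 ps clock period and the function names are hypothetical, chosen so that 16 equidistant phases give round bin widths.

```python
def bin_width_ps(clock_period_ps: float, n_phases: int = 16) -> float:
    """Time bin width when n_phases equidistant clock phases sample the input."""
    return clock_period_ps / n_phases


def fine_bin(arrival_ps: float, clock_period_ps: float, n_phases: int = 16) -> int:
    """Index of the phase interval (0..n_phases-1) in which the signal arrives."""
    position_in_period = arrival_ps % clock_period_ps
    return int(position_in_period // bin_width_ps(clock_period_ps, n_phases))


# Hypothetical 1600 ps sampling clock, 16 phases -> 100 ps bins.
PERIOD_PS = 1600.0
width = bin_width_ps(PERIOD_PS)
bin_a = fine_bin(250.0, PERIOD_PS)    # arrival 250 ps into the period
bin_b = fine_bin(1650.0, PERIOD_PS)   # arrival 50 ps into the next period
```

In this model the achievable bin width depends only on the clock period and the number of phases, which is why skew between the input routes to the sampling flip-flops matters: any skew shifts the effective bin edges.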
A preliminary study of molecular dynamics on reconfigurable computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolinski, C.; Trouw, F. R.; Gokhale, M.

2003-01-01

In this paper we investigate the performance of platform FPGAs on a compute-intensive, floating-point-intensive supercomputing application, Molecular Dynamics (MD). MD is a popular simulation technique to track interacting particles through time by integrating their equations of motion. One part of the MD algorithm was implemented using the Fabric Generator (FG) [11] and mapped onto several reconfigurable logic arrays. FG is a Java-based toolset that greatly accelerates construction of the fabrics from an abstract technology-independent representation. Our experiments used technology-independent IEEE 32-bit floating point operators so that the design could be easily re-targeted. Experiments were performed using both non-pipelined and pipelined floating point modules. We present results for the Altera Excalibur ARM System on a Programmable Chip (SoPC), the Altera Stratix EP1S80, and the Xilinx Virtex-II Pro 2VP50. The best results obtained were 5.69 GFlops at 80 MHz (Altera Stratix EP1S80), and 4.47 GFlops at 82 MHz (Xilinx Virtex-II Pro 2VP50). Assuming a 10 W power budget, these results compare very favorably to a 4 GFlop/40 W processing/power rate for a modern Pentium, suggesting that reconfigurable logic can achieve high performance at low power on floating-point-intensive applications.
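The power-efficiency comparison in this abstract is simple arithmetic. The sketch below reads the quoted figures as 5.69 GFlops within an assumed 10 W FPGA budget and 4 GFlops at 40 W for the Pentium; the function name is illustrative only.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Sustained floating-point throughput per watt of power budget."""
    return gflops / watts


# Figures as quoted in the abstract; the 10 W FPGA budget is the abstract's own assumption.
fpga_efficiency = gflops_per_watt(5.69, 10.0)   # best FPGA result
cpu_efficiency = gflops_per_watt(4.0, 40.0)     # reference Pentium
advantage = fpga_efficiency / cpu_efficiency    # how many times more GFlops per watt
```

Under these assumptions the FPGA delivers roughly 0.57 GFlops/W against 0.1 GFlops/W, an advantage of a little under six times.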
CoNNeCT Baseband Processor Module

NASA Technical Reports Server (NTRS)

Yamamoto, Clifford K; Jedrey, Thomas C.; Gutrich, Daniel G.; Goodpasture, Richard L.

2011-01-01

A document describes the CoNNeCT Baseband Processor Module (BPM) based on an updated processor, memory technology, and field-programmable gate arrays (FPGAs). The BPM was developed from a requirement to provide sufficient computing power and memory storage to conduct experiments for a Software Defined Radio (SDR) to be implemented. The flight SDR uses the AT697 SPARC processor with on-chip data and instruction cache. The non-volatile memory has been increased from a 20-Mbit EEPROM (electrically erasable programmable read only memory) to a 4-Gbit Flash, managed by the RTAX2000 Housekeeper, allowing more programs and FPGA bit-files to be stored. The volatile memory has been increased from a 20-Mbit SRAM (static random access memory) to a 1.25-Gbit SDRAM (synchronous dynamic random access memory), providing additional memory space for more complex operating systems and programs to be executed on the SPARC. All memory is EDAC (error detection and correction) protected, while the SPARC processor implements fault protection via TMR (triple modular redundancy) architecture. Further capability over prior BPM designs includes the addition of a second FPGA to implement features beyond the resources of a single FPGA. Both FPGAs are implemented with Xilinx Virtex-II and are interconnected by a 96-bit bus to facilitate data exchange. Dedicated 1.25-Gbit SDRAMs are wired to each Xilinx FPGA to accommodate high rate data buffering for SDR applications as well as independent SpaceWire interfaces. The RTAX2000 manages scrub and configuration of each Xilinx.
Exploring Accelerating Science Applications with FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Storaasli, Olaf O; Strenski, Dave

2007-01-01

FPGA hardware and tools (VHDL, Viva, MitrionC and CHiMPS) are described. FPGA performance is evaluated on two Cray XD1 systems (Virtex-II Pro 50 and Virtex-4 LX160) for human genome (DNA and protein) sequence comparisons for a computational biology code (FASTA). Scalable FPGA speedups of 50X (Virtex-II) and 100X (Virtex-4) over a 2.2 GHz Opteron were achieved. Coding and IO issues faced for human genome data are described.
Design for Security Workshop

DTIC Science & Technology

2014-09-30

fingerprint sensor etc. Secure application execution. Trust established outwards: with normal world apps, with internet/cloud apps...Xilinx Zynq Security Components and Capabilities © Copyright 2014 Xilinx. Security Features Inherited from FPGAs Zynq Secure Boot TrustZone...2014 Xilinx. Security Features Inherited from FPGAs Zynq Secure Boot TrustZone Integration 4 Agenda © Copyright 2014 Xilinx. Device DNA and User
Virtex-II Pro SEE Test Methods and Results

NASA Technical Reports Server (NTRS)

Petrick, David; Powell, Wesley; Howard, James W., Jr.; LaBel, Kenneth A.

2004-01-01

The objective of this coarse Single Event Effect (SEE) test is to determine the suitability of the commercial Virtex-II Pro family for use in spaceflight applications. To this end, this test is primarily intended to determine any Single Event Latchup (SEL) susceptibilities for these devices. Secondly, this test is intended to measure the level of Single Event Upset (SEU) susceptibilities and, in a general sense, where they occur. The coarse SEE test was performed on a commercial XC2VP7 device, a relatively small single processor version of the Virtex-II Pro. As the XC2VP7 shares the same functional block design and fabrication process with the larger Virtex-II Pro devices, the results of this test should also be applicable to the larger devices. The XC2VP7 device was tested on a commercial Virtex-II Pro development board. The testing was performed at the Cyclotron laboratories at Texas A&M and Michigan State Universities using ions of varying energy levels and fluences.
Implementation of a Multichannel Serial Data Streaming Algorithm using the Xilinx Serial RapidIO Solution

NASA Technical Reports Server (NTRS)

Doxley, Charles A.

2016-01-01

In the current world of applications that use reconfigurable technology implemented on field programmable gate arrays (FPGAs), there is a need for flexible architectures that can grow as the systems evolve. A project has limited resources and a fixed set of requirements that development efforts are tasked to meet. Designers must develop robust solutions that practically meet the current customer demands and also have the ability to grow for future performance. This paper describes the development of a high speed serial data streaming algorithm that allows for transmission of multiple data channels over a single serial link. The technique has the ability to change to meet new applications developed for future design considerations. This approach uses the Xilinx Serial RapidIO LogiCORE Solution to implement a flexible infrastructure that meets the current project requirements and can adapt to future system designs.
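The general idea of carrying several data channels over one serial link can be sketched with a tagged-frame multiplexer. This is a generic illustration, not the paper's algorithm and not the Serial RapidIO packet format: the 3-byte header (channel id plus payload length) and the function names are assumptions made for the example.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: 1-byte channel id, 2-byte big-endian payload length.
HEADER = struct.Struct(">BH")


def mux(frames):
    """Serialize (channel_id, payload) pairs into one byte stream."""
    return b"".join(HEADER.pack(ch, len(payload)) + payload for ch, payload in frames)


def demux(stream):
    """Split a multiplexed byte stream back into per-channel byte strings."""
    channels = defaultdict(bytearray)
    offset = 0
    while offset < len(stream):
        ch, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        channels[ch] += stream[offset:offset + length]
        offset += length
    return {ch: bytes(data) for ch, data in channels.items()}


link = mux([(0, b"ab"), (1, b"xyz"), (0, b"cd")])
recovered = demux(link)
```

Adding a channel in this scheme needs no change to the link itself, only a new channel id, which is the kind of growth path the abstract argues for.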
Hardware accelerator design for change detection in smart camera

NASA Astrophysics Data System (ADS)

Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Chaudhury, Santanu; Vohra, Anil

2011-10-01

Smart cameras are important components in Human Computer Interaction. In any remote surveillance scenario, smart cameras have to make intelligent decisions to select frames with significant changes in order to minimize communication and processing overhead. Among the many algorithms for change detection, one based on a clustering scheme was proposed for smart camera systems. However, such an algorithm achieves only a low frame rate, far from real-time requirements, on the general purpose processors (like PowerPC) available on FPGAs. This paper proposes a hardware accelerator capable of detecting changes in a scene in real time, using the clustering based change detection scheme. The system is designed and simulated using VHDL and implemented on a Xilinx XUP Virtex-II Pro FPGA board. The resulting frame rate is 30 frames per second for QVGA resolution in gray scale.
Radiation effects in reconfigurable FPGAs

NASA Astrophysics Data System (ADS)

Quinn, Heather

2017-04-01

Field-programmable gate arrays (FPGAs) are co-processing hardware used in image and signal processing. FPGAs are programmed with custom implementations of an algorithm. These algorithms are highly parallel hardware designs that are faster than software implementations. This flexibility and speed have made FPGAs attractive for many space programs that need in situ, high-speed signal processing for data categorization and data compression. Most commercial FPGAs are affected by the space radiation environment, though. Problems with total ionizing dose (TID) have restricted the use of flash-based FPGAs. Static random access memory based FPGAs must be mitigated to suppress errors from single-event upsets. This paper provides a review of radiation effects issues in reconfigurable FPGAs and discusses methods for mitigating these problems. With careful design it is possible to use these components effectively and resiliently.
Design and evaluation of online arithmetic for signal processing applications on FPGAs

NASA Astrophysics Data System (ADS)

Galli, Reto; Tenca, Alexandre F.

2001-11-01

This paper shows the design and the evaluation of on-line arithmetic modules for the most common operators used in DSP applications, using FPGAs as the target technology. The designs are highly optimized for the target technology and the common range of precision in DSP. The results are based on experimental data collected using CAD tools. All designs are synthesized for the same type of devices (Xilinx XC4000) for comparison, avoiding rough estimates of the system performance, and generating a more reliable and detailed comparison of on-line signal processing solutions with other state of the art approaches, such as distributed arithmetic. We show that on-line designs have a hard stand for basic DSP applications that use only addition and multiplication. However, we also show that on-line designs are able to overtake other approaches as the applications become more sophisticated, e.g. when data dependencies exist, or when non constant multiplicands restrict the use of other approaches.

Optimizing latency in Xilinx FPGA implementations of the GBT

NASA Astrophysics Data System (ADS)

Muschter, S.; Baron, S.; Bohm, C.; Cachemiche, J.-P.; Soos, C.

2010-12-01

The GigaBit Transceiver (GBT) [1] system has been developed to replace the Timing, Trigger and Control (TTC) system [2], currently used by LHC, as well as to provide data transmission between on-detector and off-detector components in future sLHC detectors. A VHDL version of the GBT-SERDES, designed for FPGAs, was released in March 2010 as a GBT-FPGA Starter Kit for future GBT users and for off-detector GBT implementation [3]. This code was optimized for resource utilization [4], as the GBT protocol is very demanding. It was not, however, optimized for latency — which will be a critical parameter when used in the trigger path. The GBT-FPGA Starter Kit firmware was first analyzed in terms of latency by looking at the separate components of the VHDL version. Once the parts which contribute most to the latency were identified and modified, two possible optimizations were chosen, resulting in a latency reduced by a factor of three. The modifications were also analyzed in terms of logic utilization. The latency optimization results were compared with measurement results from a Virtex 6 ML605 development board [5] equipped with a XC6VLX240T with speedgrade-1 and the package FF1156. Bit error rate tests were also performed to ensure an error free operation. The two final optimizations were analyzed for utilization and compared with the original code, distributed in the Starter Kit.
Virtex-5 FPGA Implementation of Advanced Encryption Standard Algorithm

NASA Astrophysics Data System (ADS)

Rais, Muhammad H.; Qasim, Syed M.

2010-06-01

In this paper, we present an implementation of Advanced Encryption Standard (AES) cryptographic algorithm using state-of-the-art Virtex-5 Field Programmable Gate Array (FPGA). The design is coded in Very High Speed Integrated Circuit Hardware Description Language (VHDL). Timing simulation is performed to verify the functionality of the designed circuit. Performance evaluation is also done in terms of throughput and area. The design implemented on Virtex-5 (XC5VLX50FFG676-3) FPGA achieves a maximum throughput of 4.34 Gbps utilizing a total of 399 slices.
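Throughput and area figures such as these are often combined into a single throughput-per-area number when comparing designs. The sketch below computes it from the values quoted in the abstract; the metric and the function name are not taken from the paper.

```python
def throughput_per_slice_mbps(throughput_gbps: float, slices: int) -> float:
    """Throughput per occupied slice in Mbps, a common area-efficiency figure."""
    return throughput_gbps * 1000.0 / slices


# Values quoted in the abstract: 4.34 Gbps using 399 slices.
efficiency = throughput_per_slice_mbps(4.34, 399)
```

For the quoted figures this comes to roughly 10.9 Mbps per slice.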
Multiple Embedded Processors for Fault-Tolerant Computing

NASA Technical Reports Server (NTRS)

Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

2005-01-01

A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
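The difference between detecting and correcting errors can be sketched in a few lines. This is one plausible reading of the abstract, not the authors' design: two processors with one comparator can only flag a mismatch, while four processors arranged as two compared pairs (an assumed arrangement) can fall back on the pair that still agrees. All names are hypothetical.

```python
def duplex(a, b):
    """Two processors, one comparator: return (result, ok). Detection only."""
    return a, a == b


def dual_duplex(a, b, c, d):
    """Four processors, two comparators: use whichever pair agrees."""
    if a == b:
        return a
    if c == d:
        return c
    raise RuntimeError("both processor pairs disagree: uncorrectable error")


# A single upset in processor b is detected by the duplex scheme...
value, ok = duplex(0x2A, 0x2B)
# ...and masked by the dual-duplex scheme, which uses the agreeing pair.
corrected = dual_duplex(0x2A, 0x2B, 0x2A, 0x2A)
```

The sketch also shows why only some errors are correctable: simultaneous faults in both pairs leave no agreeing pair to trust.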
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.

PubMed

Ly, Daniel Le; Chow, Paul

2010-11-01

Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
A Nonlinearity Minimization-Oriented Resource-Saving Time-to-Digital Converter Implemented in a 28 nm Xilinx FPGA

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Liu, Chong

2015-10-01

Because large nonlinearity errors exist in current tapped-delay line (TDL) style field programmable gate array (FPGA)-based time-to-digital converters (TDCs), bin-by-bin calibration techniques have to be used to obtain a high measurement resolution. If the TDL in the selected FPGA is significantly affected by changes in ambient temperature, the bin-by-bin calibration table has to be updated as frequently as possible. The on-line calibration and calibration table updating increase the TDC design complexity and limit the system performance to some extent. This paper proposes a method to minimize the nonlinearity errors of TDC bins, so that bin-by-bin calibration may not be needed while maintaining a reasonably high time resolution. The method is a two-pass approach: by bin realignment, the large number of wasted zero-width bins in the original TDL is reused and the granularity of the bins is improved; by bin decimation, bin size is traded off against uniformity, and the time interpolation by the delay line becomes precise enough that bin-by-bin calibration is not necessary. Using Xilinx 28 nm FPGAs, in which the TDL property is not very sensitive to ambient temperature, the proposed TDC achieves approximately 15 ps root-mean-square (RMS) time resolution in dual-channel measurements of time intervals over the range of operating temperature. Because the calibration is removed and fewer logic resources are required for data post-processing, the method offers greater multi-channel capability.
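For context, the bin-by-bin calibration that this paper aims to avoid is commonly built from a code density test: hits uncorrelated with the clock fill each bin in proportion to its width. The sketch below shows that generic technique, not the paper's realignment or decimation method; the histogram and the 800 ps clock period are made-up example values.

```python
def bin_widths_ps(histogram, clock_period_ps):
    """Estimate each bin's width from a code density histogram of random hits."""
    total = sum(histogram)
    return [clock_period_ps * count / total for count in histogram]


def calibration_table(widths):
    """Map each bin index to the time at the center of that bin."""
    table, edge = [], 0.0
    for width in widths:
        table.append(edge + width / 2)
        edge += width
    return table


def dnl(widths):
    """Differential nonlinearity per bin, in units of the ideal bin width."""
    ideal = sum(widths) / len(widths)
    return [width / ideal - 1 for width in widths]


# Made-up histogram with one zero-width bin, and an assumed 800 ps clock period.
widths = bin_widths_ps([10, 0, 30, 40], 800.0)
table = calibration_table(widths)
nonlinearity = dnl(widths)
```

The zero-count bin in the example has a differential nonlinearity of -1, which is the kind of wasted bin that the realignment pass described in the abstract is meant to reuse.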
Mercury BLASTP: Accelerating Protein Sequence Alignment

PubMed Central

Jacob, Arpith; Lancaster, Joseph; Buhler, Jeremy; Harris, Brandon; Chamberlain, Roger D.

2008-01-01

Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this paper, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11-15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results. PMID:19492068
FPGA Implementation of the Coupled Filtering Method and the Affine Warping Method.

PubMed

Zhang, Chen; Liang, Tianzhu; Mok, Philip K T; Yu, Weichuan

2017-07-01

In ultrasound image analysis, the speckle tracking methods are widely applied to study the elasticity of body tissue. However, "feature-motion decorrelation" still remains as a challenge for the speckle tracking methods. Recently, a coupled filtering method and an affine warping method were proposed to accurately estimate strain values, when the tissue deformation is large. The major drawback of these methods is the high computational complexity. Even the graphics processing unit (GPU)-based program requires a long time to finish the analysis. In this paper, we propose field-programmable gate array (FPGA)-based implementations of both methods for further acceleration. The capability of FPGAs on handling different image processing components in these methods is discussed. A fast and memory-saving image warping approach is proposed. The algorithms are reformulated to build a highly efficient pipeline on FPGA. The final implementations on a Xilinx Virtex-7 FPGA are at least 13 times faster than the GPU implementation on the NVIDIA graphic card (GeForce GTX 580).
Radiation effects and mitigation strategies for modern FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stettler, M. W.; Caffrey, M. P.; Graham, P. S.

2004-01-01

Field Programmable Gate Array devices have become the technology of choice in small volume modern instrumentation and control systems. These devices have always offered significant advantages in flexibility, and recent advances in fabrication have greatly increased logic capacity, substantially increasing the number of applications for this technology. Unfortunately, the increased density (and corresponding shrinkage of process geometry) has made these devices more susceptible to failure due to external radiation. This has been an issue for space based systems for some time, but is now becoming an issue for terrestrial systems in elevated radiation environments and commercial avionics as well. Characterizing the failure modes of Xilinx FPGAs and developing mitigation strategies are the subject of ongoing research by a consortium of academic, industrial, and governmental laboratories. This paper presents background information on radiation effects and failure modes, as well as current and future mitigation techniques. In particular, the availability of very large FPGA devices, complete with generous amounts of RAM and embedded processor(s), has led to the implementation of complete digital systems on a single device, bringing issues of system reliability and redundancy management to the chip level. Radiation effects on a single FPGA are increasingly likely to have system level consequences, and will need to be addressed in current and future designs.
Single Event Effects in FPGA Devices 2015-2016

NASA Technical Reports Server (NTRS)

Berg, Melanie; LaBel, Kenneth; Pellish, Jonathan

2016-01-01

This presentation provides an overview of single event effects in FPGA devices 2015-2016, including commercial Xilinx V5 heavy ion accelerated testing, Xilinx Kintex-7 heavy ion accelerated testing, a mitigation study, and an investigation of various types of triple modular redundancy (TMR) for commercial SRAM based FPGAs.
Single Event Effects in FPGA Devices 2014-2015

NASA Technical Reports Server (NTRS)

Berg, Melanie D.; LaBel, Kenneth A.; Pellish, Jonathan

2015-01-01

This presentation provides an overview of single event effects in FPGA devices 2014-2015, including commercial Xilinx V5 heavy ion accelerated testing, Xilinx Kintex-7 heavy ion accelerated testing, a mitigation study, and an investigation of various types of triple modular redundancy (TMR) for commercial SRAM based FPGAs.
Novel intelligent real-time position tracking system using FPGA and fuzzy logic.

PubMed

Soares dos Santos, Marco P; Ferreira, J A F

2014-03-01

The main aim of this paper is to test whether FPGAs are able to achieve better position tracking performance than software-based soft real-time platforms. For comparison purposes, the same controller design was implemented in these architectures. A Multi-state Fuzzy Logic controller (FLC) was implemented both in a Xilinx® Virtex-II FPGA (XC2V1000) and in a soft real-time platform, the NI CompactRIO®-9002. The same sampling time was used. The comparative tests were conducted using a servo-pneumatic actuation system. Steady-state errors lower than 4 μm were reached for an arbitrary vertical positioning of a 6.2 kg mass when the controller was embedded into the FPGA platform. Performance gains of up to 16 times in the steady-state error, up to 27 times in the overshoot and up to 19.5 times in the settling time were achieved by using the FPGA-based controller over the software-based FLC controller. © 2013 ISA. Published by Elsevier Ltd. All rights reserved.
Spacecube: A Family of Reconfigurable Hybrid On-Board Science Data Processors

NASA Technical Reports Server (NTRS)

Flatley, Thomas P.

2015-01-01

SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, event/feature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected events/features in the instrument data stream.
An FPGA-Based General-Purpose Data Acquisition Controller

NASA Astrophysics Data System (ADS)

Robson, C. C. W.; Bousselham, A.; Bohm

2006-08-01

System development in advanced FPGAs allows considerable flexibility, both during development and in production use. A mixed firmware/software solution allows the developer to choose what shall be done in firmware or software, and to make that decision late in the process. However, this flexibility comes at the cost of increased complexity. We have designed a modular development framework to help overcome these issues of increased complexity. This framework comprises a generic controller that can be adapted for different systems by simply changing the software or firmware parts. The controller can use both soft and hard processors, with or without an RTOS, based on the demands of the system to be developed. The resulting system uses the Internet for both control and data acquisition. In our studies we developed the embedded system in a Xilinx Virtex-II Pro FPGA, where we used both PowerPC and MicroBlaze cores; HTTP, Java, and LabVIEW for control and communication; and the MicroC/OS-II and OSE operating systems.
Integration of the Reconfigurable Self-Healing eDNA Architecture in an Embedded System

NASA Technical Reports Server (NTRS)

Boesen, Michael Reibel; Keymeulen, Didier; Madsen, Jan; Lu, Thomas; Chao, Tien-Hsin

2011-01-01

In this work we describe the first real world case study for the self-healing eDNA (electronic DNA) architecture by implementing the control and data processing of a Fourier Transform Spectrometer (FTS) on an eDNA prototype. For this purpose the eDNA prototype has been ported from a Xilinx Virtex 5 FPGA to an embedded system consisting of a PowerPC and a Xilinx Virtex 5 FPGA. The FTS instrument features a novel liquid crystal waveguide, which consequently eliminates all moving parts from the instrument. The addition of the eDNA architecture to do the control and data processing has resulted in a highly fault-tolerant FTS instrument. The case study has shown that the early stage prototype of the autonomous self-healing eDNA architecture is expensive in terms of execution time.
L1 track trigger for the CMS HL-LHC upgrade using AM chips and FPGAs

NASA Astrophysics Data System (ADS)

Fedi, Giacomo

2017-08-01

The increase of luminosity at the HL-LHC will require the introduction of tracker information in CMS's Level-1 trigger system to maintain an acceptable trigger rate when selecting interesting events, despite the order of magnitude increase in minimum bias interactions. To meet the latency requirements, dedicated hardware has to be used. This paper presents the results of tests of a prototype system (pattern recognition mezzanine) as the core of pattern recognition and track fitting for the CMS experiment, combining the power of both associative memory custom ASICs and modern Field Programmable Gate Array (FPGA) devices. The mezzanine uses the latest available associative memory devices (AM06) and the most modern Xilinx Ultrascale FPGAs. The results of the test for a complete tower comprising about 0.5 million patterns are presented, using as input simulated events traversing the upgraded CMS detector. The paper shows the performance of the pattern matching, track finding and track fitting, along with the latency and processing time needed. The pT resolution over pT of the muons measured using the reconstruction algorithm is of the order of 1% in the range 3-100 GeV/c.
The Unified Floating Point Vector Coprocessor for Reconfigurable Hardware

NASA Astrophysics Data System (ADS)

Kathiara, Jainik

There has been increased interest recently in using embedded cores on FPGAs. Many of the applications that make use of these cores have floating point operations. Due to the complexity and expense of floating point hardware, these algorithms are usually converted to fixed point operations or implemented using floating-point emulation in software. As the technology advances, more and more homogeneous computational resources and fixed function embedded blocks are added to FPGAs, and hence implementation of floating point hardware becomes a feasible option. In this research we have implemented a high performance, autonomous floating point vector coprocessor (FPVC) that works independently within an embedded processor system. We have presented a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The hybrid vector/SIMD computational model of the FPVC results in greater overall performance for most applications along with improved peak performance compared to other approaches. By parameterizing vector length and the number of vector lanes, we can design an application specific FPVC and take optimal advantage of the FPGA fabric. For this research we have also begun designing a software library for various computational kernels, each of which adapts the FPVC's configuration and provides maximal performance. The kernels implemented are from the area of linear algebra and include matrix multiplication and QR and Cholesky decomposition. We have demonstrated the operation of the FPVC on a Xilinx Virtex 5 using the embedded PowerPC.
Flexible Architecture for FPGAs in Embedded Systems

NASA Technical Reports Server (NTRS)

Clark, Duane I.; Lim, Chester N.

2012-01-01

Commonly, field-programmable gate arrays (FPGAs) being developed in cPCI embedded systems include the bus interface in the FPGA. This complicates the development because the interface is complicated and requires a lot of development time and FPGA resources. In addition, flight qualification requires a substantial amount of time be devoted to just this interface. Another complication of putting the cPCI interface into the FPGA being developed is that configuration information loaded into the device by the cPCI microprocessor is lost when a new bit file is loaded, requiring cumbersome operations to return the system to an operational state. Finally, SRAM-based FPGAs are typically programmed via specialized cables and software, with programming files being loaded either directly into the FPGA, or into PROM devices. This can be cumbersome when doing FPGA development in an embedded environment, and does not have an easy path to flight. Currently, FPGAs used in space applications are usually programmed via multiple space-qualified PROM devices that are physically large and require extra circuitry (typically including a separate one-time programmable FPGA) to enable them to be used for this application. This technology adds a cPCI interface device with a simple, flexible, high-performance backend interface supporting multiple backend FPGAs. It includes a mechanism for programming the FPGAs directly via the microprocessor in the embedded system, eliminating specialized hardware, software, and PROM devices and their associated circuitry. It has a direct path to flight, and no extra hardware and minimal software are required to support reprogramming in flight. The device added is currently a small FPGA, but an advantage of this technology is that the design of the device does not change, regardless of the application in which it is being used. This means that it needs to be qualified for flight only once, and is suitable for one-time programmable devices or an application
Experiences on developing digital down conversion algorithms using Xilinx system generator

NASA Astrophysics Data System (ADS)

Xu, Chengfa; Yuan, Yuan; Zhao, Lizhi

2013-07-01

The Digital Down Conversion (DDC) algorithm is a classical signal processing method which is widely used in radar and communication systems. In this paper, the DDC function is implemented on an FPGA using the Xilinx System Generator tool. System Generator is an FPGA design tool provided by Xilinx Inc and MathWorks Inc. It makes it convenient for programmers to manipulate the design and debug the function, especially for complex algorithms. The development of the DDC function with System Generator shows that it is a very fast and efficient tool for FPGA design.
A Mathematical Approach for Compiling and Optimizing Hardware Implementations of DSP Transforms

DTIC Science & Technology

2010-08-01

[Plot residue: DFT 64 (floating point) on Xilinx Virtex-6 FPGA; axes: area [slices], throughput [billion samples per second], performance [Gflop/s].]

Sensor Systems Based on FPGAs and Their Applications: A Survey

PubMed Central

de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah

2012-01-01

In this manuscript, we present a survey of designs and implementations of research sensor nodes that rely on FPGAs, either based upon standalone platforms or as a combination of microcontroller and FPGA. Several current challenges in sensor networks are distinguished and linked to the features of modern FPGAs. As it turns out, low-power optimized FPGAs are able to enhance the computation of several types of algorithms in terms of speed and power consumption in comparison to microcontrollers of commercial sensor nodes. We show that architectures based on the combination of microcontrollers and FPGA can play a key role in the future of sensor networks, in fields where processing capabilities such as strong cryptography, self-testing and data compression, among others, are paramount.
Validation techniques for fault emulation of SRAM-based FPGAs

DOE PAGES

Quinn, Heather; Wirthlin, Michael

2015-08-07

A variety of fault emulation systems have been created to study the effect of single-event effects (SEEs) in static random access memory (SRAM) based field-programmable gate arrays (FPGAs). These systems are useful for augmenting radiation-hardness assurance (RHA) methodologies for verifying the effectiveness of mitigation techniques; understanding error signatures and failure modes in FPGAs; and failure rate estimation. For radiation effects researchers, it is important that these systems properly emulate how SEEs manifest in FPGAs. If the fault emulation system does not mimic the radiation environment, the system will generate erroneous data and incorrect predictions of behavior of the FPGA in a radiation environment. Validation determines whether the emulated faults are reasonable analogs to the radiation-induced faults. In this study we present methods for validating fault emulation systems and provide several examples of validated FPGA fault emulation systems.
Leveraging FPGAs for Accelerating Short Read Alignment.

PubMed

Arram, James; Kaplan, Thomas; Luk, Wayne; Jiang, Peiyong

2017-01-01

One of the key challenges facing genomics today is how to efficiently analyze the massive amounts of data produced by next-generation sequencing platforms. With general-purpose computing systems struggling to address this challenge, specialized processors such as the Field-Programmable Gate Array (FPGA) are receiving growing interest. The means by which to leverage this technology for accelerating genomic data analysis is however largely unexplored. In this paper, we present a runtime reconfigurable architecture for accelerating short read alignment using FPGAs. This architecture exploits the reconfigurability of FPGAs to allow the development of fast yet flexible alignment designs. We apply this architecture to develop an alignment design which supports exact and approximate alignment with up to two mismatches. Our design is based on the FM-index, with optimizations to improve the alignment performance. In particular, the n-step FM-index, index oversampling, a seed-and-compare stage, and bi-directional backtracking are included. Our design is implemented and evaluated on a 1U Maxeler MPC-X2000 dataflow node with eight Altera Stratix-V FPGAs. Measurements show that our design is 28 times faster than Bowtie2 running with 16 threads on dual Intel Xeon E5-2640 CPUs, and nine times faster than Soap3-dp running on an NVIDIA Tesla C2070 GPU.
Reconfigurable Hardware for Compressing Hyperspectral Image Data

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh; Namkung, Jeffrey; Villapando, Carlos; Kiely, Aaron; Klimesh, Matthew; Xie, Hua

2010-01-01

High-speed, low-power, reconfigurable electronic hardware has been developed to implement ICER-3D, an algorithm for compressing hyperspectral-image data. The algorithm and parts thereof have been the topics of several NASA Tech Briefs articles, including Context Modeler for Wavelet Compression of Hyperspectral Images (NPO-43239) and ICER-3D Hyperspectral Image Compression Software (NPO-43238), which appear elsewhere in this issue of NASA Tech Briefs. As described in more detail in those articles, the algorithm includes three main subalgorithms: one for computing wavelet transforms, one for context modeling, and one for entropy encoding. For the purpose of designing the hardware, these subalgorithms are treated as modules to be implemented efficiently in field-programmable gate arrays (FPGAs). The design takes advantage of industry-standard, commercially available FPGAs. The implementation targets the Xilinx Virtex II Pro architecture, which has embedded PowerPC processor cores with flexible on-chip bus architecture. It incorporates an efficient parallel and pipelined architecture to compress the three-dimensional image data. The design provides for internal buffering to minimize intensive input/output operations while making efficient use of off-chip memory. The design is scalable in that the subalgorithms are implemented as independent hardware modules that can be combined in parallel to increase throughput. The on-chip processor manages the overall operation of the compression system, including execution of the top-level control functions as well as scheduling, initiating, and monitoring processes. The design prototype has been demonstrated to be capable of compressing hyperspectral data at a rate of 4.5 megasamples per second at a conservative clock frequency of 50 MHz, with a potential for substantially greater throughput at a higher clock frequency. The power consumption of the prototype is less than 6.5 W. The reconfigurability (by means of reprogramming) of
An FPGA Platform for Real-Time Simulation of Spiking Neuronal Networks

PubMed Central

Pani, Danilo; Meloni, Paolo; Tuveri, Giuseppe; Palumbo, Francesca; Massobrio, Paolo; Raffo, Luigi

2017-01-01

In the last years, the idea to dynamically interface biological neurons with artificial ones has become more and more urgent. The reason is essentially due to the design of innovative neuroprostheses where biological cell assemblies of the brain can be substituted by artificial ones. For closed-loop experiments with biological neuronal networks interfaced with in silico modeled networks, several technological challenges need to be faced, from the low-level interfacing between the living tissue and the computational model to the implementation of the latter in a suitable form for real-time processing. Field programmable gate arrays (FPGAs) can improve flexibility when simple neuronal models are required, obtaining good accuracy, real-time performance, and the possibility to create a hybrid system without any custom hardware, just programming the hardware to achieve the required functionality. In this paper, this possibility is explored presenting a modular and efficient FPGA design of an in silico spiking neural network exploiting the Izhikevich model. The proposed system, prototypically implemented on a Xilinx Virtex 6 device, is able to simulate a fully connected network counting up to 1,440 neurons, in real-time, at a sampling rate of 10 kHz, which is reasonable for small to medium scale extra-cellular closed-loop experiments. PMID:28293163
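The Izhikevich model named in this abstract can be illustrated with a minimal software sketch. It is not the authors' FPGA design. It assumes the standard published model equations, the commonly quoted "regular spiking" parameters, and a forward-Euler step of 0.1 ms chosen to match the 10 kHz sampling rate mentioned above. The function name `simulate_izhikevich` and its arguments are hypothetical.

```python
def simulate_izhikevich(current, steps, dt_ms=0.1, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler simulation of a single Izhikevich neuron.

    Returns the list of step indices at which the neuron spiked.
    Parameter defaults are the usual "regular spiking" values (an assumption,
    not taken from the paper); dt_ms=0.1 corresponds to a 10 kHz update rate.
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spikes = []
    for step in range(steps):
        if v >= 30.0:
            # Spike: record it, then reset v and bump the recovery variable.
            spikes.append(step)
            v = c
            u += d
        # Both derivatives are evaluated from the state at the start of the step.
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt_ms * dv
        u += dt_ms * du
    return spikes


if __name__ == "__main__":
    # One simulated second (10,000 steps of 0.1 ms) with a constant input.
    print(len(simulate_izhikevich(current=10.0, steps=10_000)), "spikes")
```

A hardware implementation would typically use fixed-point arithmetic and update many neurons per sampling period; the floating-point loop here only shows the per-neuron update rule.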
Optimized smith waterman processor design for breast cancer early diagnosis

NASA Astrophysics Data System (ADS)

Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.

2017-09-01

This paper presents an optimized design of the Processing Element (PE) of a Systolic Array (SA) which implements the affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources to increase the number of PEs in the FPGA for a higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequences such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area with 15 slices in a PE and 776 PEs implemented in the Virtex-6 FPGA.
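The recurrence that each processing element evaluates can be sketched in software. The following is a minimal sequential sketch of Smith-Waterman scoring with affine gap penalties (the Gotoh formulation), not the paper's PE design. The scoring values and the function name `smith_waterman_affine` are illustrative assumptions. Here a gap of length k costs `gap_open + (k - 1) * gap_extend`.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Return the best local alignment score of strings a and b.

    H is the best score ending at (i, j); E and F are the best scores ending
    in a gap along the row and along the column, respectively. Only the
    previous row is kept, since each cell depends on its three neighbours.
    """
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0] * cols
    f_prev = [neg_inf] * cols
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf  # gap state along the current row
        for j in range(1, cols):
            # Open a new gap from H or extend an existing one.
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h_cur[j] = max(0, h_prev[j - 1] + score, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(smith_waterman_affine("ACGTACGT", "ACGTTACGT"))
```

In a systolic array, cells on the same anti-diagonal have no mutual dependency, which is why adding PEs increases the number of cells computed per clock cycle.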
A Primer for Telemetry Interfacing in Accordance with NASA Standards Using Low Cost FPGAs

NASA Astrophysics Data System (ADS)

McCoy, Jake; Schultz, Ted; Tutt, James; Rogers, Thomas; Miles, Drew; McEntaffer, Randall

2016-03-01

Photon counting detector systems on sounding rocket payloads often require interfacing asynchronous outputs with a synchronously clocked telemetry (TM) stream. Though this can be handled with an on-board computer, there are several low cost alternatives including custom hardware, microcontrollers and field-programmable gate arrays (FPGAs). This paper outlines how a TM interface (TMIF) for detectors on a sounding rocket with asynchronous parallel digital output can be implemented using low cost FPGAs and minimal custom hardware. Low power consumption and high speed FPGAs are available as commercial off-the-shelf (COTS) products and can be used to develop the main component of the TMIF. Then, only a small amount of additional hardware is required for signal buffering and level translating. This paper also discusses how this system can be tested with a simulated TM chain in the small laboratory setting using FPGAs and COTS specialized data acquisition products.
Fast Inference of Deep Neural Networks in FPGAs for Particle Physics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duarte, Javier; Han, Song; Harris, Philip

Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA hardware has only just begun. FPGA-based trigger and data acquisition (DAQ) systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. We develop a package based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.
Radiation testing campaign results for understanding the suitability of FPGAs in detector electronics

DOE PAGES

Citterio, M.; Camplani, A.; Cannon, M.; ...

2015-11-19

SRAM based Field Programmable Gate Arrays (FPGAs) have been rarely used in High Energy Physics (HEP) due to their sensitivity to radiation. The last generation of commercial FPGAs based on 28 nm feature size and on Silicon On Insulator (SOI) technologies are more tolerant to radiation to the level that their use in front-end electronics is now feasible. FPGAs provide re-programmability, high-speed computation and fast data transmission through the embedded serial transceivers. They could replace custom application specific integrated circuits in front-end electronics in locations with a moderate radiation field. Finally, the use of an FPGA in HEP experiments is only limited by our ability to mitigate single event effects induced by the high energy hadrons present in the radiation field.
Applied Digital Logic Exercises Using FPGAs

NASA Astrophysics Data System (ADS)

Wick, Kurt

2017-09-01

Applied Digital Logic Exercises Using FPGAs is appropriate for anyone interested in digital logic who needs to learn how to implement it through detailed exercises with state-of-the-art digital design tools and components. The book exposes readers to combinational and sequential digital logic concepts and implements them with hands-on exercises using the Verilog Hardware Description Language (HDL) and a Field Programmable Gate Array (FPGA) teaching board.
Implementation of a Loosely-Coupled Lockstep Approach in the Xilinx Zynq-7000 All Programmable SoC for High Consequence Applications

DTIC Science & Technology

2017-03-01

Implementation of a Loosely-Coupled Lockstep Approach in the Xilinx Zynq-7000 All Programmable SoC™ for High Consequence Applications Ryan D...sandia.gov Abstract: For high consequence applications requiring information assurance, the architecture of the Xilinx Zynq-7000 All Programmable ...transaction checker residing in the Programmable Logic portion of the Zynq device will be discussed along with implementation results and latency
Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems

NASA Technical Reports Server (NTRS)

Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.

2012-01-01

Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, an FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGA used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.
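For readers unfamiliar with the operation being accelerated, the multi-spectral Euclidean distance between two pixels is the ordinary Euclidean distance between their per-band value vectors. The sketch below is a plain software reference under that assumption, not either of the paper's hardware architectures. The function names and the sample values are hypothetical.

```python
import math


def squared_spectral_distance(pixel_a, pixel_b):
    """Sum of squared per-band differences between two pixels.

    Comparing squared distances gives the same ordering as comparing
    distances, so the square root can be skipped when only ranking matters.
    """
    if len(pixel_a) != len(pixel_b):
        raise ValueError("pixels must have the same number of spectral bands")
    return sum((x - y) ** 2 for x, y in zip(pixel_a, pixel_b))


def spectral_distance(pixel_a, pixel_b):
    """Euclidean distance between two pixels across all spectral bands."""
    return math.sqrt(squared_spectral_distance(pixel_a, pixel_b))


if __name__ == "__main__":
    # Two made-up 4-band pixels.
    print(spectral_distance([120, 80, 60, 200], [118, 83, 60, 196]))
```

The per-band subtract, square, and accumulate steps are independent across bands, which is the property a parallel hardware implementation can exploit.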
A New Partial Reconfiguration-Based Fault-Injection System to Evaluate SEU Effects in SRAM-Based FPGAs

NASA Astrophysics Data System (ADS)

Sterpone, L.; Violante, M.

2007-08-01

Modern SRAM-based field programmable gate array (FPGA) devices offer high capability in implementing complex systems. Unfortunately, SRAM-based FPGAs are extremely sensitive to single event upsets (SEUs) induced by radiation particles. In order to successfully deploy safety- or mission-critical applications, designers need to validate the correctness of the obtained designs. In this paper we describe a system based on partial reconfiguration for running fault-injection experiments within the configuration memory of SRAM-based FPGAs. The proposed fault-injection system uses the internal configuration capabilities that modern FPGAs offer in order to inject SEUs within the configuration memory. Detailed experimental results show that the technique is orders of magnitude faster than previously proposed ones.
75 FR 7031 - Xilinx, Inc., Albuquerque, NM; Notice of Affirmative Determination Regarding Application for...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-02-16

... DEPARTMENT OF LABOR Employment and Training Administration [TA-W-71,608] Xilinx, Inc., Albuquerque, NM; Notice of Affirmative Determination Regarding Application for Reconsideration By application... After careful review of the application, I conclude that the claim is of sufficient weight to justify...
Start Up Application Concerns with Field Programmable Gate Arrays (FPGAs)

NASA Technical Reports Server (NTRS)

Katz, Richard B.

1999-01-01

This note is being published to improve the visibility of this subject, as we continue to see problems surface in designs, as well as to add additional information to the previously published note for design engineers. The original application note focused on designing systems with no single point failures using Actel Field Programmable Gate Arrays (FPGAs) for critical applications. Included in that note were the basic principles of operation of the Actel FPGA and a discussion of potential single-point failures. The note also discussed the issue of startup transients for that class of device. It is unfortunate that we continue to see some design problems using these devices. This note will focus on the startup properties of certain electronic components, in general, and current Actel FPGAs, in particular. Devices that are "power-on friendly" are currently being developed by Actel, as a variant of the new SX series of FPGAs. In the ideal world, electronic components would behave much differently than they do in the real world. The chain, of course, starts with the power supply. Ideally, the voltage will immediately rise to a stable Vcc level; of course, it does not. Aside from practical design considerations, inrush current limits of certain capacitors must be observed and the power supply's output may be intentionally slew rate limited to prevent a large current spike on the system power bus. In any event, power supply rise time may range from less than 1 msec to 100 msec or more.
SpaceCube Version 1.5

NASA Technical Reports Server (NTRS)

Geist, Alessandro; Lin, Michael; Flatley, Tom; Petrick, David

2013-01-01

SpaceCube 1.5 is a high-performance and low-power system in a compact form factor. It is a hybrid processing system consisting of CPU (central processing unit), FPGA (field-programmable gate array), and DSP (digital signal processor) processing elements. The primary processing engine is the Virtex-5 FX100T FPGA, which has two embedded processors. The SpaceCube 1.5 System was a bridge to the SpaceCube 2.0 and SpaceCube 2.0 Mini processing systems. The SpaceCube 1.5 system was the primary avionics in the successful SMART (Small Rocket/Spacecraft Technology) Sounding Rocket mission that was launched in the summer of 2011. For SMART and similar missions, an avionics processor is required that is reconfigurable, has high processing capability, has multi-gigabit interfaces, is low power, and comes in a rugged/compact form factor. The original SpaceCube 1.0 met a number of the criteria, but did not possess the multi-gigabit interfaces that were required and is a higher-cost system. The SpaceCube 1.5 was designed with those mission requirements in mind. The SpaceCube 1.5 features one Xilinx Virtex-5 FX100T FPGA and has excellent size, weight, and power characteristics [4×4×3 in. (approx. 10×10×8 cm), 3 lb (approx. 1.4 kg), and 5 to 15 W depending on the application]. The estimated computing power of the two PowerPC 440s in the Virtex-5 FPGA is 1100 DMIPS each. The SpaceCube 1.5 includes two Gigabit Ethernet (1 Gbps) interfaces as well as two SATA-I/II interfaces (1.5 to 3.0 Gbps) for recording to data drives. The SpaceCube 1.5 also features DDR2 SDRAM (double data rate synchronous dynamic random access memory); 4-Gbit Flash for storing application code for the CPU, FPGA, and DSP processing elements; and a Xilinx Platform Flash XL to store FPGA configuration files or application code. The system also incorporates a 12-bit analog-to-digital converter with the ability to read 32 discrete analog sensor inputs. The SpaceCube 1.5 design also has a built
Multicasting mesh AER: a scalable assembly approach for reconfigurable neuromorphic structured AER systems. Application to ConvNets.

PubMed

Zamarreno-Ramos, C; Linares-Barranco, A; Serrano-Gotarredona, T; Linares-Barranco, B

2013-02-01

This paper presents a modular, scalable approach to assembling hierarchically structured neuromorphic Address Event Representation (AER) systems. The method consists of arranging modules in a 2D mesh, each communicating bidirectionally with all four neighbors. Address events include a module label. Each module includes an AER router which decides how to route address events. Two routing approaches have been proposed, analyzed and tested, using either destination or source module labels. Our analyses reveal that depending on traffic conditions and network topologies either one or the other approach may result in better performance. Experimental results are given after testing the approach using high-end Virtex-6 FPGAs. The approach is proposed for both single and multiple FPGAs, in which case a special bidirectional parallel-serial AER link with flow control is exploited, using the FPGA Rocket-I/O interfaces. Extensive test results are provided exploiting convolution modules of 64 × 64 pixels with kernels with sizes up to 11 × 11, which process real sensory data from a Dynamic Vision Sensor (DVS) retina. One single Virtex-6 FPGA can hold up to 64 of these convolution modules, which is equivalent to a neural network with 262 × 10^3 neurons and almost 32 million synapses.
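The neuron and synapse figures quoted in this abstract can be reproduced with a short calculation. The sketch assumes one neuron per pixel of each convolution module and one synapse per kernel weight per neuron at the largest kernel size. That interpretation is an assumption made here, and the function name `network_size` is hypothetical.

```python
def network_size(modules, side, kernel):
    """Return (neurons, synapses) for `modules` convolution modules.

    Assumes each module has side*side neurons (one per pixel) and each
    neuron has kernel*kernel synapses (one per kernel weight).
    """
    neurons = modules * side * side
    synapses = neurons * kernel * kernel
    return neurons, synapses


if __name__ == "__main__":
    neurons, synapses = network_size(modules=64, side=64, kernel=11)
    print(neurons)   # 262144, i.e. about 262 × 10^3
    print(synapses)  # 31719424, i.e. almost 32 million
```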
FPGA implementation of sparse matrix algorithm for information retrieval

NASA Astrophysics Data System (ADS)

Bojanic, Slobodan; Jevtic, Ruzica; Nieto-Taladriz, Octavio

2005-06-01

Information text data retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper the solution to this problem is proposed via hardware supported information retrieval algorithms. Reconfigurable computing may adopt frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of the parallelism can be tuned for data. In this work we implemented a standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR) that is shown to be more efficient in terms of storage space requirement and query-processing timing than the other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining more attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves a substantial efficiency gain over the sequential inverted index. The parallel implementations of the information retrieval kernel are presented in this work targeting the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
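The Compressed Sparse Row layout and the sparse matrix-vector product that this abstract relies on can be sketched in a few lines. This is a generic software illustration of CSR, not the paper's FPGA kernel. The function names and the toy term-document matrix are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert a dense row-major matrix to CSR arrays.

    values  : non-zero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, entry in enumerate(row):
            if entry != 0:
                values.append(entry)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by the dense vector x."""
    result = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        # Only the stored non-zeros of row r are visited.
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        result.append(acc)
    return result


if __name__ == "__main__":
    # Rows are documents, columns are terms; the query vector selects terms.
    docs = [[1, 0, 2], [0, 0, 0], [0, 3, 0]]
    query = [1, 1, 1]
    print(csr_matvec(*dense_to_csr(docs), query))
```

Each row's dot product is independent of the others, so rows can be distributed across parallel processing units.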
Hardware Implementation of Lossless Adaptive and Scalable Hyperspectral Data Compression for Space

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh; Keymeulen, Didier; Bakhshi, Alireza; Klimesh, Matthew

2009-01-01

On-board lossless hyperspectral data compression reduces data volume in order to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware. A modified form of the algorithm that is better suited for data from pushbroom instruments is generally appropriate for flight implementation. A scalable field programmable gate array (FPGA) hardware implementation was developed. The FPGA implementation achieves a throughput performance of 58 Msamples/sec, which can be increased to over 100 Msamples/sec in a parallel implementation that uses twice the hardware resources. This paper describes the hardware implementation of the 'Modified Fast Lossless' compression algorithm on an FPGA. The FPGA implementation targets the current state-of-the-art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.
Electronics for CMS Endcap Muon Level-1 Trigger System Phase-1 and HL LHC upgrades

NASA Astrophysics Data System (ADS)

Madorsky, A.

2017-07-01

To accommodate high-luminosity LHC operation at a 13 TeV collision energy, the CMS Endcap Muon Level-1 Trigger system had to be significantly modified. To provide robust track reconstruction, the trigger system must now import all available trigger primitives generated by the Cathode Strip Chambers and by certain other subsystems, such as Resistive Plate Chambers (RPC). In addition to massive input bandwidth, this also required a significant increase in logic and memory resources. To satisfy these requirements, a new Sector Processor unit has been designed. It consists of three modules. The Core Logic module houses the large FPGA that contains the track-finding logic and multi-gigabit serial links for data exchange. The Optical module contains optical receivers and transmitters; it communicates with the Core Logic module via a custom backplane section. The Pt Lookup table (PTLUT) module contains 1 GB of low-latency memory that is used to assign the final Pt to reconstructed muon tracks. The μTCA architecture (adopted by CMS) was used for this design. The talk presents the details of the hardware and firmware design of the production system based on the Xilinx Virtex-7 FPGA family. The next round of LHC and CMS upgrades starts in 2019, followed by a major High-Luminosity (HL) LHC upgrade starting in 2024. In the course of these upgrades, new Gas Electron Multiplier (GEM) detectors and more RPC chambers will be added to the Endcap Muon system. In order to keep up with all these changes, a new Advanced Processor unit is being designed. This device will be based on Xilinx UltraScale+ FPGAs. It will be able to accommodate up to 100 serial links with bit rates of up to 25 Gb/s, and provide up to 2.5 times more logic resources than the device used currently. The amount of PTLUT memory will be significantly increased to provide more flexibility for the Pt assignment algorithm. The talk presents preliminary details of the hardware design program.

Effectiveness of Internal vs. External SEU Scrubbing Mitigation Strategies in a Xilinx FPGA: Design, Test, and Analysis

NASA Technical Reports Server (NTRS)

Berg, Melanie; Poivey C.; Petrick, D.; Espinosa, D.; Lesea, Austin; LaBel, K. A.; Friendlich, M; Kim, H; Phan, A.

2008-01-01

We compare two scrubbing mitigation schemes for Xilinx FPGA devices. The design of the scrubbers is briefly discussed along with an examination of mitigation limitations. Proton and Heavy Ion data are then presented and analyzed.
Hardware realization of an SVM algorithm implemented in FPGAs

NASA Astrophysics Data System (ADS)

Wiśniewski, Remigiusz; Bazydło, Grzegorz; Szcześniak, Paweł

2017-08-01

The paper proposes a technique for the hardware realization of space vector modulation (SVM) of state function switching in a matrix converter (MC), oriented toward implementation in a single field programmable gate array (FPGA). In an MC the SVM method is based on the instantaneous space-vector representation of input currents and output voltages. The traditional computation algorithms usually involve digital signal processors (DSPs) which consumes the large number of power transistors (18 transistors and 18 independent PWM outputs) and "non-standard positions of control pulses" during the switching sequence. Recently, hardware implementations have become popular since computed operations may be executed much faster and more efficiently due to the nature of digital devices (especially concurrency). In the paper, we propose a hardware algorithm for SVM computation. In contrast to the existing techniques, the presented solution applies the COordinate Rotation DIgital Computer (CORDIC) method to solve the trigonometric operations. Furthermore, adequate arithmetic modules (that is, sub-devices) used for intermediate calculations, such as code converters or proper sector selectors (for output voltages and input current), are presented in detail. The proposed technique has been implemented as a design described in the Verilog hardware description language. The preliminary results of a logic implementation oriented toward a Xilinx FPGA (in particular, a low-cost device from the Artix-7 family was used) are also presented.
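The CORDIC method mentioned in this abstract computes sine and cosine using only shifts, additions, and a small table of arctangents. The sketch below shows the classic rotation-mode iteration in floating point for readability. It is a generic illustration rather than the paper's Verilog design, and a hardware version would use fixed-point values with the multiplications by 2^-i realized as bit shifts. The function name `cordic_sin_cos` and the default iteration count are assumptions.

```python
import math


def cordic_sin_cos(angle, iterations=16):
    """Approximate (sin, cos) of `angle` (radians, within [-pi/2, pi/2]).

    Rotation-mode CORDIC: the vector (x, y) is rotated by +/- atan(2^-i)
    at each step so that the residual angle z is driven toward zero.
    """
    atan_table = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(iterations):
        direction = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - direction * y * shift, y + direction * x * shift
        z -= direction * atan_table[i]
    return y, x


if __name__ == "__main__":
    sine, cosine = cordic_sin_cos(math.pi / 6)
    print(sine, cosine)
```

Accuracy improves by roughly one bit per iteration, so the iteration count is normally chosen to match the target word length.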
A flexible 32-channel time-to-digital converter implemented in a Xilinx Zynq-7000 field programmable gate array

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Kuang, Jie; Liu, Chong; Cao, Qiang; Li, Deng

2017-03-01

A high performance multi-channel time-to-digital converter (TDC) is implemented in a Xilinx Zynq-7000 field programmable gate array (FPGA). It can be flexibly configured as either 32 TDC channels with 9.9 ps time-interval RMS precision, 16 TDC channels with 6.9 ps RMS precision, or 8 TDC channels with 5.8 ps RMS precision. All TDCs have a 380 M Samples/second measurement throughput and a 2.63 ns measurement dead time. The performance consistency and temperature dependence of TDC channels are also evaluated. Because Zynq-7000 FPGA family integrates a feature-rich dual-core ARM based processing system and 28 nm Xilinx programmable logic in a single device, the realization of high performance TDCs on it will make the platform more widely used in time-measuring related applications.
Gaining Insight Into Femtosecond-scale CMOS Effects using FPGAs

DTIC Science & Technology

2015-03-24

paths or detecting gross path delay faults , but for characterizing subtle aging effects, there is a need to isolate very short paths and detect very...data using COTS FPGAs and novel self-test. Hardware experiments using a 28 nm FPGA demonstrate isolation of small sets of transistors, detection of...hold the static configuration data specifying the LUT function. A set of inverters drive the SRAM contents into a pass-gate multiplexor tree; we
Technology Readiness Level (TRL) Advancement of the MSPI On-Board Processing Platform for the ACE Decadal Survey Mission

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Werne, Thomas A.; Bekker, Dmitriy L.; Wilson, Thor O.

2011-01-01

The Xilinx Virtex-5QV is a new Single-event Immune Reconfigurable FPGA (SIRF) device that is targeted as the spaceborne processor for the NASA Decadal Survey Aerosol-Cloud-Ecosystem (ACE) mission's Multiangle SpectroPolarimetric Imager (MSPI) instrument, currently under development at JPL. A key technology needed for MSPI is on-board processing (OBP) to calculate polarimetry data as imaged by each of the 9 cameras forming the instrument. With funding from NASA's ESTO AIST Program, JPL is demonstrating how signal data at 95 Mbytes/sec over 16 channels for each of the 9 multi-angle cameras can be reduced to 0.45 Mbytes/sec, thereby substantially reducing the image data volume for spacecraft downlink without loss of science information. This is done via a least-squares fitting algorithm implemented on the Virtex-5 FPGA operating in real-time on the raw video data stream.
TOT measurement implemented in FPGA TDC

NASA Astrophysics Data System (ADS)

Fan, Huan-Huan; Cao, Ping; Liu, Shu-Bin; An, Qi

2015-11-01

Time measurement plays a crucial role for the purpose of particle identification in high energy physics experiments. With increasingly demanding physics goals and the development of electronics, modern time measurement systems need to meet the requirement of excellent resolution specification as well as high integrity. Based on Field Programmable Gate Arrays (FPGAs), FPGA time-to-digital converters (TDCs) have become one of the most mature and prominent time measurement methods in recent years. For correcting the time-walk effect caused by leading timing, a time-over-threshold (TOT) measurement should be added to the FPGA TDC. TOT can be obtained by measuring the interval between the signal leading and trailing edges. Unfortunately, a traditional TDC can recognize only one kind of signal edge, the leading or the trailing. Generally, to measure the interval, two TDC channels need to be used at the same time, one for leading, the other for trailing. However, this method unavoidably increases the amount of FPGA resources used and reduces the TDC's integrity. This paper presents one method of TOT measurement implemented in a Xilinx Virtex-5 FPGA. In this method, TOT measurement can be achieved using only one TDC input channel. The consumed resources and time resolution can both be guaranteed. Testing shows that this TDC can achieve resolution better than 15 ps for leading edge measurement and 37 ps for TOT measurement. Furthermore, the TDC measurement dead time is about two clock cycles, which makes it good for applications with higher physics event rates. Supported by National Natural Science Foundation of China (11079003, 10979003).
The characterization and application of a low resource FPGA-based time to digital converter

NASA Astrophysics Data System (ADS)

Balla, Alessandro; Mario Beretta, Matteo; Ciambrone, Paolo; Gatta, Maurizio; Gonnella, Francesco; Iafolla, Lorenzo; Mascolo, Matteo; Messi, Roberto; Moricciani, Dario; Riondino, Domenico

2014-03-01

Time to Digital Converters (TDCs) are very common devices in particle physics experiments. Many "off-the-shelf" TDCs can be employed, but the need for a custom DAta acQuisition (DAQ) system makes TDCs implemented on Field-Programmable Gate Arrays (FPGAs) desirable. Most of the architectures developed so far are based on tapped delay lines with precision down to 10 ps, obtained at the cost of high FPGA resource usage and non-linearity issues that must be managed. Often such precision is not necessary; in this case TDC architectures with low resource occupancy are preferable, allowing data processing systems and other utilities to be implemented on the same device. In order to reconstruct γγ physics events tagged with the High Energy Tagger (HET) in KLOE-2 (K LOng Experiment 2), we need to measure the Time Of Flight (TOF) of the electrons and positrons from the KLOE-2 Interaction Point (IP) to our tagging stations (11 m apart). The required resolution must be better than the bunch spacing (2.7 ns). We have developed and implemented on a Xilinx Virtex-5 FPGA a 32 channel TDC with a precision of 255 ps and low non-linearity effects, along with an embedded data acquisition system and the interface to the online FARM of KLOE-2. The TDC is based on a low resource occupancy technique: the 4×Oversampling technique which, in this work, is pushed to its best resolution and whose performance was exhaustively measured.
Fast and Adaptive Lossless On-Board Hyperspectral Data Compression System for Space Applications

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh; Bakhshi, Alireza; Keymeulen, Didier; Klimesh, Matthew

2009-01-01

Efficient on-board lossless hyperspectral data compression reduces the data volume necessary to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware, which makes it practical for flight implementations of pushbroom instruments. A prototype of the compressor (and decompressor) of the algorithm is available in software, but this implementation may not meet speed and real-time requirements of some space applications. Hardware acceleration provides performance improvements of 10x-100x vs. the software implementation (about 1M samples/sec on a Pentium IV machine). This paper describes a hardware implementation of the JPL-developed 'Fast Lossless' compression algorithm on a Field Programmable Gate Array (FPGA). The FPGA implementation targets the current state of the art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.
A radiation tolerant Data link board for the ATLAS Tile Cal upgrade

NASA Astrophysics Data System (ADS)

Åkerstedt, H.; Bohm, C.; Muschter, S.; Silverstein, S.; Valdes, E.

2016-01-01

This paper describes the latest, full-functionality revision of the high-speed data link board developed for the Phase-2 upgrade of ATLAS hadronic Tile Calorimeter. The link board design is highly redundant, with digital functionality implemented in two Xilinx Kintex-7 FPGAs, and two Molex QSFP+ electro-optic modules with uplinks run at 10 Gbps. The FPGAs are remotely configured through two radiation-hard CERN GBTx deserialisers (GBTx), which also provide the LHC-synchronous system clock. The redundant design eliminates virtually all single-point error modes, and a combination of triple-mode redundancy (TMR), internal and external scrubbing will provide adequate protection against radiation-induced errors. The small portion of the FPGA design that cannot be protected by TMR will be the dominant source of radiation-induced errors, even if that area is small.
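The TMR protection described above rests on a bitwise majority vote across three redundant copies of a signal. The sketch below shows that voting logic in Python for illustration only; the function names are hypothetical and are not taken from the link board firmware.

```python
def tmr_vote(a, b, c):
    """Bitwise majority vote over three redundant copies of a word.

    Each output bit takes the value held by at least two of the inputs,
    so a single upset copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)

def tmr_mismatch(a, b, c):
    """Return True if the three copies disagree, i.e. an upset was masked
    and the corrupted copy should be repaired (for example by scrubbing)."""
    return not (a == b == c)
```

A single corrupted copy is corrected by the vote, but two corrupted copies of the same bit are not, which is why TMR is combined with scrubbing to stop errors from accumulating.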
FPGA based hardware optimized implementation of signal processing system for LFM pulsed radar

NASA Astrophysics Data System (ADS)

Azim, Noor ul; Jun, Wang

2016-11-01

Signal processing is one of the main parts of any radar system. Different signal processing algorithms are used to extract information about target parameters such as range, speed, and direction. This paper presents LFM (Linear Frequency Modulation) pulsed radar signal processing algorithms that are used to improve target detection and range resolution and to estimate the speed of a target. First, these algorithms are simulated in MATLAB to verify the concept and theory. After this conceptual verification, the simulation is converted into a hardware implementation on a Xilinx FPGA. The chosen FPGA is a Xilinx Virtex-6 (XC6LVX75T). For the hardware implementation, pipeline optimization is adopted, and other factors are considered to optimize resource usage. The algorithms this work focuses on for improving target detection, range resolution, and speed estimation are hardware-optimized pulse compression based on fast convolution processing, and pulse Doppler processing.
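Pulse compression by fast convolution, as named in the abstract above, can be sketched as a matched filter evaluated in the frequency domain. The Python below is a software illustration under stated assumptions, not the paper's FPGA pipeline: the radix-2 FFT, the function names, and the chirp parameters are all chosen here for the example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def lfm_chirp(n_samples, bandwidth=0.5):
    """Baseband LFM pulse; `bandwidth` is the swept band in cycles per sample
    (an assumed, illustrative parameter)."""
    mid = n_samples / 2.0
    return [cmath.exp(1j * cmath.pi * bandwidth * (n - mid) ** 2 / n_samples)
            for n in range(n_samples)]

def pulse_compress(rx, ref):
    """Matched filtering via FFT: IFFT(FFT(rx) * conj(FFT(ref))).

    Both inputs are zero-padded to a power of two that holds the full
    linear correlation, so index m of the result is the lag-m output.
    """
    size = 1
    while size < len(rx) + len(ref) - 1:
        size *= 2
    rx_f = fft(list(rx) + [0j] * (size - len(rx)))
    ref_f = fft(list(ref) + [0j] * (size - len(ref)))
    prod = [a * b.conjugate() for a, b in zip(rx_f, ref_f)]
    return [v / size for v in fft(prod, inverse=True)]
```

For an echo that is a delayed copy of the reference chirp, the magnitude of the output peaks at the sample index equal to the delay, which is what gives the range estimate.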
Serial data acquisition for GEM-2D detector

NASA Astrophysics Data System (ADS)

Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech; Zienkiewicz, Pawel; Mazon, Didier; Malard, Philippe; Herrmann, Albrecht; Vezinet, Didier

2014-11-01

This article discusses a fast data acquisition and histogramming method for the X-ray GEM detector. The whole process of histogramming is performed by FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in internal FPGA memory and then sent to a PC. On the PC, the data is merged and processed by MATLAB. The structure of the firmware functionality implemented in the FPGAs is described. Examples of test measurements and results are presented.
Real Time Coincidence Processing Algorithm for Geiger Mode LADAR using FPGAs

DTIC Science & Technology

2017-01-09

Defense for Research and Engineering. Real Time Coincidence Processing Algorithm for Geiger-Mode Ladar using FPGAs Rufo A. Antonio, Alexandru N...the first ever Geiger-mode ladar processing algorithm that is suitable for implementation on an FPGA enabling real time processing and data...developed embedded FPGA real time processing algorithms that take noisy raw data, streaming at upwards of 1GB/sec, and filters the data to obtain a nearly
Hardware mapping of a video-based method for real-time evaluation of angular histograms onto a modular coprocessor architecture

NASA Astrophysics Data System (ADS)

Flatt, H.; Tarnowsky, A.; Blume, H.; Pirsch, P.

2010-10-01

This paper presents the mapping of a video-based approach for real-time evaluation of angular histograms onto a modular coprocessor architecture. The architecture comprises several dedicated processing elements for parallel processing of computation-intensive image processing tasks and is coupled with a RISC processor. A configurable architecture extension, namely a processing element for evaluating angular histograms of objects, enables real-time classification in conjunction with the RISC processor. Depending on the configuration of the architecture extension, 3 300 to 12 000 look-up tables are required for a Xilinx Virtex-5 FPGA implementation. At a clock frequency of 100 MHz, up to 100 objects of size 256×256 pixels can be analyzed per frame in a 25 Hz video stream, independently of the image resolution.
Fast semivariogram computation using FPGA architectures

NASA Astrophysics Data System (ADS)

Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang

2015-02-01

The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n²). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments.
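The definition of γ(h) given above can be made concrete with a short software sketch. The Python below is an illustration, not the paper's VHDL architecture: it estimates γ(h) as half the mean squared difference over pixel pairs, and as a simplifying assumption it pairs pixels only along rows and columns at integer lag h rather than over all directions.

```python
def semivariance(image, lag):
    """Empirical semivariance gamma(h) of a 2-D image window.

    Pairs are taken `lag` pixels apart along rows and along columns
    (a simplifying assumption for this sketch).
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    rows, cols = len(image), len(image[0])
    total = 0.0
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            if c + lag < cols:                      # horizontal neighbour
                total += (image[r][c] - image[r][c + lag]) ** 2
                pairs += 1
            if r + lag < rows:                      # vertical neighbour
                total += (image[r][c] - image[r + lag][c]) ** 2
                pairs += 1
    if pairs == 0:
        raise ValueError("lag is larger than the image window")
    return total / (2.0 * pairs)

def semivariogram(image, max_lag):
    """Semivariances for lags 1..max_lag, i.e. the points of the plot."""
    return [semivariance(image, h) for h in range(1, max_lag + 1)]
```

Every pixel pair contributes an independent squared difference, so the inner loop body is the unit of work that a parallel architecture can replicate.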
Parallel Fixed Point Implementation of a Radial Basis Function Network in an FPGA

PubMed Central

de Souza, Alisson C. D.; Fernandes, Marcelo A. C.

2014-01-01

This paper proposes a parallel fixed point radial basis function (RBF) artificial neural network (ANN), implemented in a field programmable gate array (FPGA) trained online with a least mean square (LMS) algorithm. The processing time and occupied area were analyzed for various fixed point formats. The problems of precision of the ANN response for nonlinear classification using the XOR gate and interpolation using the sine function were also analyzed in a hardware implementation. The entire project was developed using the System Generator platform (Xilinx), with a Virtex-6 xc6vcx240t-1ff1156 as the target FPGA. PMID:25268918
Moving target detection for frequency agility radar by sparse reconstruction

NASA Astrophysics Data System (ADS)

Quan, Yinghui; Li, YaChao; Wu, Yaojun; Ran, Lei; Xing, Mengdao; Liu, Mengqi

2016-09-01

Frequency agility radar, with randomly varied carrier frequency from pulse to pulse, exhibits superior performance compared to the conventional fixed carrier frequency pulse-Doppler radar against the electromagnetic interference. A novel moving target detection (MTD) method is proposed for the estimation of the target's velocity of frequency agility radar based on pulses within a coherent processing interval by using sparse reconstruction. Hardware implementation of orthogonal matching pursuit algorithm is executed on Xilinx Virtex-7 Field Programmable Gate Array (FPGA) to perform sparse optimization. Finally, a series of experiments are performed to evaluate the performance of proposed MTD method for frequency agility radar systems.
A Scalable Architecture of a Structured LDPC Decoder

NASA Technical Reports Server (NTRS)

Lee, Jason Kwok-San; Lee, Benjamin; Thorpe, Jeremy; Andrews, Kenneth; Dolinar, Sam; Hamkins, Jon

2004-01-01

We present a scalable decoding architecture for a certain class of structured LDPC codes. The codes are designed using a small (n,r) protograph that is replicated Z times to produce a decoding graph for a (Z x n, Z x r) code. Using this architecture, we have implemented a decoder for a (4096,2048) LDPC code on a Xilinx Virtex-II 2000 FPGA, and achieved decoding speeds of 31 Mbps with 10 fixed iterations. The implemented message-passing algorithm uses an optimized 3-bit non-uniform quantizer that operates with 0.2dB implementation loss relative to a floating point decoder.
Moving target detection for frequency agility radar by sparse reconstruction.

PubMed

Quan, Yinghui; Li, YaChao; Wu, Yaojun; Ran, Lei; Xing, Mengdao; Liu, Mengqi

2016-09-01

Frequency agility radar, with randomly varied carrier frequency from pulse to pulse, exhibits superior performance compared to the conventional fixed carrier frequency pulse-Doppler radar against the electromagnetic interference. A novel moving target detection (MTD) method is proposed for the estimation of the target's velocity of frequency agility radar based on pulses within a coherent processing interval by using sparse reconstruction. Hardware implementation of orthogonal matching pursuit algorithm is executed on Xilinx Virtex-7 Field Programmable Gate Array (FPGA) to perform sparse optimization. Finally, a series of experiments are performed to evaluate the performance of proposed MTD method for frequency agility radar systems.
FPGA implemented testbed in 8-by-8 and 2-by-2 OFDM-MIMO channel estimation and design of baseband transceiver.

PubMed

Ramesh, S; Seshasayanan, R

2016-01-01

In this study, a baseband OFDM-MIMO system with channel estimation and timing synchronization is designed and implemented using FPGA technology. The system is prototyped based on the IEEE 802.11a standard, and signals are transmitted and received using a bandwidth of 20 MHz. With QPSK modulation, the system can achieve a throughput of 24 Mbps. In addition, the LS algorithm is implemented and the estimation of a frequency-selective fading channel is demonstrated. For coarse timing estimation, the MNC scheme is examined and implemented. First, the whole system is modeled in MATLAB and a floating-point model is established. Then, the fixed-point model is created with the help of Simulink and Xilinx's System Generator for DSP. The system is then synthesized and implemented within Xilinx's ISE tools and targeted to a Xilinx Virtex 5 board. In addition, a hardware co-simulation is devised to reduce the processing time while computing the BER of the fixed-point model. The work is a first step toward further investigation of innovative channel estimation techniques for fourth-generation (4G) mobile communication systems.
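For pilot-based OFDM, least-squares (LS) channel estimation reduces to dividing each received pilot by the known transmitted pilot on the same subcarrier. The Python sketch below shows that step for a single antenna pair; it is an illustration under that assumption, and the function name and pilot values are hypothetical rather than taken from the study.

```python
def ls_channel_estimate(rx_pilots, tx_pilots):
    """Per-subcarrier LS estimate H[k] = Y[k] / X[k].

    rx_pilots: received frequency-domain pilot symbols Y[k]
    tx_pilots: known transmitted pilot symbols X[k] (must be nonzero)
    """
    if len(rx_pilots) != len(tx_pilots):
        raise ValueError("pilot sequences must have the same length")
    estimates = []
    for y, x in zip(rx_pilots, tx_pilots):
        if x == 0:
            raise ValueError("transmitted pilots must be nonzero")
        estimates.append(y / x)
    return estimates
```

With constant-modulus pilots such as QPSK, the division can be replaced by a multiplication with the conjugate pilot and a fixed scale factor, which is cheaper in fixed-point hardware.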
SpaceCube Mini

NASA Technical Reports Server (NTRS)

Lin, Michael; Petrick, David; Geist, Alessandro; Flatley, Thomas

2012-01-01

This version of the SpaceCube will be a full-fledged, onboard space processing system capable of 2500+ MIPS, and featuring a number of plug-and-play gigabit and standard interfaces, all in a condensed 3x3x3 form factor [less than 10 watts and less than 3 lb (approximately equal to 1.4 kg)]. The main processing engine is the Xilinx SIRF radiation-hardened-by-design Virtex-5 FX-130T field-programmable gate array (FPGA). Even as the SpaceCube 2.0 version (currently under test) is being targeted as the platform of choice for a number of the upcoming Earth Science Decadal Survey missions, GSFC has been contacted by customers who wish to see a system that incorporates key features of the version 2.0 architecture in an even smaller form factor. In order to fulfill that need, the SpaceCube Mini is being designed, and will be a very compact and low-power system. A similar flight system with this combination of small size, low power, low cost, adaptability, and extremely high processing power does not otherwise exist, and the SpaceCube Mini will be of tremendous benefit to GSFC and its partners. The SpaceCube Mini will utilize space-grade components. The primary processing engine of the Mini is the Xilinx Virtex-5 SIRF FX-130T radiation-hardened-by-design FPGA for critical flight applications in high-radiation environments. The Mini can also be equipped with a commercial Xilinx Virtex-5 FPGA with integrated PowerPCs for a low-cost, high-power computing platform for use in the relatively radiation-benign LEOs (low-Earth orbits). In either case, this version of the SpaceCube will weigh less than 3 pounds (approximately 1.4 kg), conform to the CubeSat form-factor (10x10x10 cm), and will be low power (less than 10 watts for typical applications). The SpaceCube Mini will have a radiation-hardened Aeroflex FPGA for configuring and scrubbing the Xilinx FPGA by utilizing the onboard FLASH memory to store the configuration files. The FLASH memory will also be used for storing algorithm and

Fast data transmission from serial data acquisition for the GEM detector system

NASA Astrophysics Data System (ADS)

Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Byszuk, Adrian; Chernyshova, Maryna; Kasprowicz, Grzegorz; Krawczyk, Rafal D.; Wojenski, Andrzej; Zabolotny, Wojciech

2015-09-01

This article proposes a new method of storing data and transferring it to a PC in the X-ray GEM detector system. The whole process is performed by FPGA chips (Spartan-6 series from Xilinx). Compared to previous methods, the new approach allows much more data to be stored in the system. A new, improved implementation of the communication algorithm significantly increases the transfer rate between the system and the PC. On the PC, the data is merged and processed by MATLAB. The structure of the firmware implemented in the FPGAs is described.
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based on a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, which is defined in terms of Vin, the primary-side DC bus voltage; Vout, the secondary-side DC bus voltage; and n, the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, and user-settable switching frequencies and synchronized control loop update rates of tens of kHz.
High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting

PubMed Central

Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel

2008-01-01

Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
Embedded system of image storage based on fiber channel

NASA Astrophysics Data System (ADS)

Chen, Xiaodong; Su, Wanxin; Xing, Zhongbao; Wang, Hualong

2008-03-01

In domains such as aerospace, aviation, aiming, and optical measurement, an embedded system for imaging, processing, and recording with small volume, high processing speed, and high resolution is essential. However, embedded storage technology has developed slowly and has become the system bottleneck. RAID is commonly used to increase storage speed, but its large volume makes it unsuitable for embedded systems. Fiber channel (FC) technology offers a new way to build high-speed, portable storage systems. To meet the need for a high storage rate, this work proposes an embedded digital image storage system that uses a powerful Virtex-4 FPGA and high-speed fiber channel and is based on the Xilinx Fiber Channel Arbitrated Loop LogiCORE. The design uses Virtex-4 RocketIO MGT transceivers to transmit data serially and can optionally connect many Fiber Channel hard drives through an Arbitrated Loop. It achieves a 400 MBps storage rate, breaks through the bottleneck of the PCI interface, and offers high speed, real-time operation, portability, and massive capacity.
FPGA based charge fast histogramming for GEM detector

NASA Astrophysics Data System (ADS)

Poźniak, Krzysztof T.; Byszuk, A.; Chernyshova, M.; Cieszewski, R.; Czarski, T.; Dominik, W.; Jakubowska, K.; Kasprowicz, G.; Rzadkiewicz, J.; Scholz, M.; Zabolotny, W.

2013-10-01

This article presents a fast charge histogramming method for the position sensitive X-ray GEM detector. The energy resolved measurements are carried out simultaneously for 256 channels of the GEM detector. The whole process of histogramming is performed in 21 FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in an external DDR3 memory. The structure of the electronic measuring equipment and the firmware functionality implemented in the FPGAs are described. Examples of test measurements are presented.
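Per-channel charge histogramming of the kind described above amounts to incrementing one counter per (channel, charge bin) for every detected event. The Python sketch below illustrates that bookkeeping only; the 256 channels come from the abstract, while the bin count, the charge range, and the function name are assumptions for the example.

```python
def build_histograms(events, n_channels=256, n_bins=64, max_charge=1024):
    """Accumulate one charge histogram per detector channel.

    events: iterable of (channel, charge) pairs with integer charge values.
    Events outside the channel or charge range are ignored.
    n_bins and max_charge are illustrative assumptions.
    """
    histograms = [[0] * n_bins for _ in range(n_channels)]
    for channel, charge in events:
        if 0 <= channel < n_channels and 0 <= charge < max_charge:
            bin_index = charge * n_bins // max_charge   # uniform bin width
            histograms[channel][bin_index] += 1
    return histograms
```

In hardware, each increment is a read-modify-write on a memory word, so channels stored in separate memories can be updated concurrently.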
Qualification Strategies of Field Programmable Gate Arrays (FPGAs) for Space Application

NASA Technical Reports Server (NTRS)

Sheldon, Douglas; Schone, Harald

2005-01-01

This viewgraph document reviews the use of Field Programmable Gate Arrays (FPGAs) in space applications and some of the strategies for qualifying FPGAs. Qualification and risk management of such complex systems require new approaches. The paper presents a matrix approach to qualification that: - Complements historical specifications. - Highlights the importance of device physics as a cornerstone of qualification. - Provides levels of risk management that expressly document trade-offs. - Stresses the role of the FPGA vendor as a team member in the development of modern spacecraft.
A FPGA-based Measurement System for Nonvolatile Semiconductor Memory Characterization

NASA Astrophysics Data System (ADS)

Bu, Jiankang; White, Marvin

2002-03-01

Low voltage, long retention, high density SONOS nonvolatile semiconductor memory (NVSM) devices are ideally suited for PCMCIA, FLASH and 'smart' cards. The SONOS memory transistor requires characterization with an accurate, rapid measurement system with minimum disturbance to the device. The FPGA-based measurement system includes three parts: 1) a pattern generator implemented with XILINX FPGAs and corresponding software, 2) a high-speed, constant-current, threshold voltage detection circuit, and 3) a data evaluation program, implemented with a LABVIEW program. Fig. 1 shows the general block diagram of the FPGA-based measurement system. The function generator is designed and simulated with XILINX Foundation Software. Under the control of the specific erase/write/read pulses, the analog detect circuit applies operational modes to the SONOS device under test (DUT) and determines the change of the memory-state of the SONOS nonvolatile memory transistor. The TEK460 digitizes the analog threshold voltage output and sends it to the PC. The data is filtered and averaged with a LABVIEW™ program running on the PC and displayed on the monitor in real time. We have implemented the pattern generator with XILINX FPGAs. Fig. 2 shows the block diagram of the pattern generator. We realized the logic control by a method of state machine design. Fig. 3 shows a small part of the state machine. The flexibility of the FPGAs enhances the capabilities of this system and allows measurement variations without hardware changes. The characterization of the nonvolatile memory transistor device under test (DUT), as function of programming voltage and time, is achieved by a high-speed, constant-current threshold voltage detection circuit. The analog detection circuit incorporates fast analog switches controlled digitally with the FPGAs. The schematic circuit diagram is shown in Fig. 4. The various operational modes for the DUT are realized with control signals applied to the
Upset Characterization of the PowerPC405 Hard-core Processor Embedded in Virtex-II Pro Field Programmable Gate Arrays

NASA Technical Reports Server (NTRS)

Swift, Gary M.; Allen, Gregory S.; Farmanesh, Farhad; George, Jeffrey; Petrick, David J.; Chayab, Fayez

2006-01-01

Shown in this presentation are recent results for the upset susceptibility of the various types of memory elements in the embedded PowerPC405 in the Xilinx V2P40 FPGA. For critical flight designs where configuration upsets are mitigated effectively through appropriate design triplication and configuration scrubbing, these upsets of processor elements can dominate the system error rate. Data from irradiations with both protons and heavy ions are given and compared using available models.
Soft error evaluation and vulnerability analysis in Xilinx Zynq-7010 system-on chip

NASA Astrophysics Data System (ADS)

Du, Xuecheng; He, Chaohui; Liu, Shuhuan; Zhang, Yao; Li, Yonghong; Xiong, Ceng; Tan, Pengkang

2016-09-01

Radiation-induced soft errors are an increasingly important threat to the reliability of modern electronic systems. In this work, fault tree analysis was used to evaluate the reliability and soft error vulnerability of a system-on-chip. The system fault tree was constructed based on the Xilinx Zynq-7010 All Programmable SoC. The soft error rates of different components in the Zynq-7010 SoC were measured using an americium-241 alpha radiation source. Parameters used to evaluate the system's reliability and safety, such as failure rate, unavailability and mean time to failure (MTTF), were calculated using Isograph Reliability Workbench 11.0. Based on the fault tree analysis of the system-on-chip, the critical blocks and system reliability were evaluated through qualitative and quantitative analysis.
PCI bus content-addressable-memory (CAM) implementation on FPGA for pattern recognition/image retrieval in a distributed environment

NASA Astrophysics Data System (ADS)

Megherbi, Dalila B.; Yan, Yin; Tanmay, Parikh; Khoury, Jed; Woods, C. L.

2004-11-01

Surveillance and Automatic Target Recognition (ATR) applications have recently been increasing as the cost of the computing power needed to process the massive amount of information continues to fall. This computing power has been made possible partly by the latest advances in FPGAs and SOPCs. In particular, to design and implement state-of-the-art electro-optical imaging systems that provide advanced surveillance capabilities, there is a need to integrate several technologies (e.g. telescope, precise optics, cameras, image/computer vision algorithms, which can be geographically distributed or sharing distributed resources) into a programmable system and DSP systems. Additionally, pattern recognition techniques and fast information retrieval are often important components of intelligent systems. The aim of this work is to use an embedded FPGA as a fast, configurable and synthesizable search engine for fast image pattern recognition/retrieval in a distributed hardware/software co-design environment. In particular, we propose and demonstrate a low-cost, Content Addressable Memory (CAM)-based, distributed embedded FPGA hardware architecture with real-time recognition capabilities for pattern look-up, pattern recognition, and image retrieval. We show how the distributed CAM-based architecture offers an order-of-magnitude performance advantage over a Random Access Memory (RAM)-based search architecture for implementing high-speed pattern recognition for image retrieval. The methods of designing, implementing, and analyzing the proposed CAM-based embedded architecture are described here. Other SOPC solutions and design issues are covered. Finally, experimental results, hardware verification, and performance evaluations using both the Xilinx Virtex-II and the Altera Apex20k are provided to show the potential and power of the proposed method for low-cost, reconfigurable, fast image pattern recognition/retrieval at the hardware/software co-design level.
Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System

NASA Technical Reports Server (NTRS)

Aranki, Nazeeh I.; Keymeulen, Didier; Klimesh, Matthew A.

2012-01-01

Modern hyperspectral imaging systems are able to acquire far more data than can be downlinked from a spacecraft. Onboard data compression helps to alleviate this problem, but requires a system capable of power efficiency and high throughput. Software solutions have limited throughput performance and are power-hungry. Dedicated hardware solutions can provide both high throughput and power efficiency, while taking the load off of the main processor. Thus a hardware compression system was developed. The implementation uses a field-programmable gate array (FPGA). The implementation is based on the fast lossless (FL) compression algorithm reported in Fast Lossless Compression of Multispectral-Image Data (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), page 26, which achieves excellent compression performance and has low complexity. This algorithm performs predictive compression using an adaptive filtering method, and uses adaptive Golomb coding. The implementation also packetizes the coded data. The FL algorithm is well suited for implementation in hardware. In the FPGA implementation, one sample is compressed every clock cycle, which makes for a fast and practical realtime solution for space applications. Benefits of this implementation are: 1) The underlying algorithm achieves a combination of low complexity and compression effectiveness that exceeds that of techniques currently in use. 2) The algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. 3) Hardware acceleration provides a throughput improvement of 10 to 100 times vs. the software implementation. A prototype of the compressor is available in software, but it runs at a speed that does not meet spacecraft requirements. The hardware implementation targets the Xilinx Virtex IV FPGAs, and makes the use of this compressor practical for Earth satellites as well as beyond-Earth missions with hyperspectral instruments.
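As a rough illustration of the Golomb coding step mentioned in this abstract, the sketch below encodes non-negative integers with a Golomb-Rice code (a Golomb code whose parameter is a power of two) and picks the parameter from running statistics. The function names and the adaptation rule in `choose_k` are assumptions made for illustration; the abstract does not give the FL algorithm's actual coding rules.

```python
def rice_encode(value: int, k: int) -> str:
    """Golomb-Rice code: quotient in unary, then a k-bit binary remainder."""
    q = value >> k                      # quotient, sent as q ones and a zero
    r = value & ((1 << k) - 1)          # remainder, sent in exactly k bits
    remainder_bits = format(r, "b").zfill(k) if k > 0 else ""
    return "1" * q + "0" + remainder_bits


def rice_decode(bits: str, k: int) -> int:
    """Inverse of rice_encode for a single codeword."""
    q = bits.index("0")                 # number of leading ones
    r = int(bits[q + 1 : q + 1 + k], 2) if k > 0 else 0
    return (q << k) | r


def choose_k(total: int, count: int) -> int:
    """Hypothetical adaptation rule: smallest k with count * 2**k >= total,
    where total and count are running sums over previously coded values."""
    k = 0
    while (count << k) < total:
        k += 1
    return k
```

In a hardware pipeline that compresses one sample per clock cycle, a power-of-two parameter is attractive because the quotient and remainder reduce to a shift and a mask.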
Efficient FIR Filter Implementations for Multichannel BCIs Using Xilinx System Generator.

PubMed

Ghani, Usman; Wasim, Muhammad; Khan, Umar Shahbaz; Mubasher Saleem, Muhammad; Hassan, Ali; Rashid, Nasir; Islam Tiwana, Mohsin; Hamza, Amir; Kashif, Amir

2018-01-01

Background. A brain computer interface (BCI) is a combination of software and hardware communication protocols that allow the brain to control external devices. The main purpose of BCI-controlled external devices is to provide a communication medium for disabled persons. These devices are now considered a new way to rehabilitate patients with impairments. Certain potentials present in the electroencephalogram (EEG) correspond to specific events. The main issue is to detect such event-related potentials online at such a low signal-to-noise ratio (SNR). In this paper we propose a method that facilitates online processing by providing an efficient filtering implementation in a hardware-friendly environment by switching to finite impulse response (FIR) filters. The main focus of this research is to minimize the latency and computational delay of the preprocessing related to any BCI application. Four different FIR implementations along with a large Laplacian filter are implemented in Xilinx System Generator. An efficiency gain of 25% is achieved in terms of reduced number of coefficients and multiplications, which in turn reduces computational delays accordingly.
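To make the link between coefficient count and multiplications concrete, here is a minimal sketch of a direct-form FIR filter next to a folded variant that exploits coefficient symmetry. The folding trick is a standard technique for linear-phase FIR filters and is used here only as an assumption; the abstract does not say which four implementations the authors compared.

```python
def fir_direct(samples, coeffs):
    """Direct form: y[n] = sum_k h[k] * x[n-k], one multiply per tap."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_folded(samples, coeffs):
    """Folded form for symmetric coeffs (h[k] == h[N-1-k]):
    add the two mirrored samples first, then multiply once per pair."""
    n_taps = len(coeffs)
    half = n_taps // 2

    def x(i):
        return samples[i] if i >= 0 else 0.0   # zero initial conditions

    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k in range(half):
            acc += coeffs[k] * (x(n - k) + x(n - (n_taps - 1 - k)))
        if n_taps % 2:                          # unpaired middle tap
            acc += coeffs[half] * x(n - half)
        out.append(acc)
    return out
```

For an N-tap symmetric filter the folded form needs about N/2 multiplications per output sample instead of N, at the cost of extra additions.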
HyspIRI Intelligent Payload Module (IPM) and Benchmarking Algorithms for Upload

NASA Technical Reports Server (NTRS)

Mandl, Daniel

2010-01-01

Features: Hardware: a) Xilinx Virtex-5 (GSFC Space Cube 2); b) 2 x 400MHz PPC; c) 100MHz Bus; d) 2 x 512MB SDRAM; e) Dual Gigabit Ethernet. Supports Linux kernel 2.6.31 (gcc version 4.2.2). Supports software running in stand-alone mode for better performance. Can stream raw data up to 800 Mbps. Ready for operations. Software Application Examples: Band-stripping Algorithms: cloud, sulfur, flood, thermal, SWIL, NDVI, NDWI, SIWI, oil spills, algae blooms, etc. Corrections: geometric, radiometric, atmospheric. Core Flight System/dynamic software bus. CCSDS File Delivery Protocol. Delay Tolerant Network. CASPER/onboard planning. Fault monitoring/recovery software. S/C command and telemetry software. Data compression. Sensor Web for Autonomous Mission Operations.
The SMS4 cryptographic system design based on dynamic partial self-reconfiguration technology

NASA Astrophysics Data System (ADS)

Wang, Jianxin; Gao, Xianwei; Li, Xiuying; Sui, Meili

2013-03-01

This paper describes an implementation of the SMS4 algorithm using dynamic partial self-reconfiguration. The design is implemented on Xilinx Virtex-II Pro XC2VP30 FPGA devices. The data throughput of the partially self-reconfigured encryption/decryption module is up to 50 Mb/s. The key expansion and encryption/decryption modules use 1606 and 1570 slices respectively. With partial self-reconfiguration, the resource utilization ratio of the key expansion is 32.03% lower, and 757 fewer slices are used, than without reconfiguration. The SMS4 implementation achieves a good balance between high performance and low area complexity. Theoretical and practical research on dynamic partial self-reconfiguration has broad scope for development and application.
A framework for porting the NeuroBayes machine learning algorithm to FPGAs

NASA Astrophysics Data System (ADS)

Baehr, S.; Sander, O.; Heck, M.; Feindt, M.; Becker, J.

2016-01-01

The NeuroBayes machine learning algorithm is deployed for online data reduction at the pixel detector of Belle II. In order to test, characterize and easily adapt its implementation on FPGAs, a framework was developed. Within the framework, an HDL model written in Python using MyHDL is used for fast exploration of possible configurations. Using input data from physics simulations, figures of merit such as throughput, accuracy and resource demand of the implementation are evaluated in a fast and flexible way. Functional validation is supported by unit tests and HDL simulation for chosen configurations.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.

PubMed

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2017-03-01

implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor.
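For readers unfamiliar with the quantity being computed, the sketch below evaluates an empirical semivariogram, gamma(h) = (1 / (2 N(h))) * sum of squared differences over the N(h) pixel pairs separated by lag h. As a simplifying assumption it pools only horizontal and vertical pairs; the function name and this pairing choice are illustrative and are not taken from the paper's architectures.

```python
def semivariogram(image, lag):
    """Empirical semivariogram of a 2-D image (list of rows) at an integer lag.

    Pools horizontal and vertical pixel pairs, a simplification of a fully
    isotropic estimate that would pool pairs from all directions.
    """
    rows, cols = len(image), len(image[0])
    total, pairs = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            if c + lag < cols:                      # horizontal neighbour
                d = image[r][c] - image[r][c + lag]
                total += d * d
                pairs += 1
            if r + lag < rows:                      # vertical neighbour
                d = image[r][c] - image[r + lag][c]
                total += d * d
                pairs += 1
    if pairs == 0:
        raise ValueError("lag is too large for this image")
    return total / (2 * pairs)
```

Each pair contributes one subtraction, one multiplication and one accumulation, and pairs do not depend on one another, which is the kind of regular structure that tends to map well onto parallel FPGA logic.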
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures

PubMed Central

Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang

2016-01-01

is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
Hardware Implementation of Lossless Adaptive Compression of Data From a Hyperspectral Imager

NASA Technical Reports Server (NTRS)

Keymeulen, Didier; Aranki, Nazeeh I.; Klimesh, Matthew A.; Bakhshi, Alireza

2012-01-01

Virtex IV LX25 device, and ported to a Xilinx prototype board. The current implementation has a critical path of 29.5 ns, which dictated a clock speed of 33 MHz. The critical path delay is end-to-end measurement between the uncompressed input data and the output compression data stream. The implementation compresses one sample every clock cycle, which results in a speed of 33 Msample/s. The implementation has a rather low device use of the Xilinx Virtex IV LX25, making the total power consumption of the implementation about 1.27 W.
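The relation between the quoted critical path and the clock rate can be checked with a short calculation. The helper names below are hypothetical. The timing bound is 1 / 29.5 ns, or about 33.9 MHz, so the quoted 33 MHz sits just under that limit; why exactly 33 MHz was chosen is not stated in the abstract.

```python
def max_clock_mhz(critical_path_ns: float) -> float:
    """Upper bound on clock frequency set by the longest register-to-register path."""
    return 1000.0 / critical_path_ns        # 1 / ns expressed in MHz


def throughput_msamples(clock_mhz: float, samples_per_cycle: float = 1.0) -> float:
    """Sample rate in Msample/s for a design that emits samples every cycle."""
    return clock_mhz * samples_per_cycle


bound = max_clock_mhz(29.5)                 # timing limit implied by the critical path
rate = throughput_msamples(33.0)            # one sample per cycle at the quoted 33 MHz
print(f"timing bound: {bound:.1f} MHz, throughput at 33 MHz: {rate:.0f} Msample/s")
```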
Hardware accelerator design for tracking in smart camera

NASA Astrophysics Data System (ADS)

Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil

2011-10-01

Smart cameras are important components in video analysis. For video analysis, smart cameras need to detect interesting moving objects, track such objects from frame to frame, and analyze the object tracks in real time. Real-time tracking is therefore a prominent requirement in smart cameras. A software implementation of a tracking algorithm on a general-purpose processor (such as a PowerPC) achieves only a low frame rate, far from real-time requirements. This paper presents a hardware accelerator based on the SIMD approach, designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on a Xilinx XUP Virtex-II Pro FPGA. The resulting frame rate is 30 frames per second for 250x200 resolution grayscale video.
An Efficient VLSI Architecture of the Enhanced Three Step Search Algorithm

NASA Astrophysics Data System (ADS)

Biswas, Baishik; Mukherjee, Rohan; Saha, Priyabrata; Chakrabarti, Indrajit

2016-09-01

The intense computational complexity of any video codec is largely due to the motion estimation unit. The Enhanced Three Step Search is a popular technique that can be adopted for fast motion estimation. This paper proposes a novel VLSI architecture for the implementation of the Enhanced Three Step Search Technique. A new addressing mechanism has been introduced which enhances the speed of operation and reduces the area requirements. The proposed architecture when implemented in Verilog HDL on Virtex-5 Technology and synthesized using Xilinx ISE Design Suite 14.1 achieves a critical path delay of 4.8 ns while the area comes out to be 2.9K gate equivalent. It can be incorporated in commercial devices like smart-phones, camcorders, video conferencing systems etc.

The Application of Virtex-II Pro FPGA in High-Speed Image Processing Technology of Robot Vision Sensor

NASA Astrophysics Data System (ADS)

Ren, Y. J.; Zhu, J. G.; Yang, X. Y.; Ye, S. H.

2006-10-01

The Virtex-II Pro FPGA is applied to the vision sensor tracking system of the IRB2400 robot. The hardware platform, which undertakes the tasks of improving SNR and compressing data, is built on the high-speed image processing capability of the FPGA. The low-level image-processing algorithm is realized by combining the FPGA fabric and the embedded CPU. Image processing is accelerated by the introduction of the FPGA and CPU. The use of the embedded CPU makes it easy to realize the logic design of the interface. Some key techniques are presented in the text, such as the read-write process, template matching and convolution, and some modules are also simulated. Finally, a comparison is carried out among modules using this design, a PC and a DSP. Because the core of the high-speed image processing system is an FPGA chip whose function can be conveniently updated, the measurement system is, to a degree, intelligent.
Implementation of a high precision multi-measurement time-to-digital convertor on a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Kuang, Jie; Wang, Yonggang; Cao, Qiang; Liu, Chong

2018-05-01

Time-to-digital convertors (TDCs) based on field programmable gate arrays (FPGAs) are becoming more and more popular. Multi-measurement is an effective method to improve TDC precision beyond the cell delay limitation. However, the implementation of TDCs with multi-measurement on FPGAs manufactured with 28 nm and more advanced processes faces new challenges. Benefiting from the ones-counter encoding scheme, which was developed in our previous work, we implement a ring oscillator multi-measurement TDC on a Xilinx Kintex-7 FPGA. Using the two TDC channels to measure time intervals in the range 0 ns to 30 ns, the average RMS precision can be improved to 5.76 ps, while the logic resource usage remains the same as for the one-measurement TDC, and the TDC dead time is only 22 ns. The investigation demonstrates that multi-measurement methods remain applicable to current mainstream FPGAs. Furthermore, the new implementation in this paper achieves a better trade-off among time precision, resource usage and TDC dead time than previous designs.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Batista, Antonio J. N.; Santos, Bruno; Fernandes, Ana

The data acquisition and control instrumentation cubicles room of the ITER tokamak will be irradiated with neutrons during the fusion reactor operation. A Virtex-6 FPGA from Xilinx (XC6VLX365T-1FFG1156C) is used on the ATCA-IO-PROCESSOR board, included in the ITER Catalog of I&C products - Fast Controllers. The Virtex-6 is a re-programmable logic device where the configuration is stored in Static RAM (SRAM), functional data is stored in dedicated Block RAM (BRAM) and functional state logic in Flip-Flops. Single Event Upsets (SEU) due to the ionizing radiation of neutrons cause soft errors, unintended changes (bit-flips) to the values stored in state elements of the FPGA. SEU monitoring and soft error repair, when possible, were explored in this work. An FPGA built-in Soft Error Mitigation (SEM) controller detects and corrects soft errors in the FPGA configuration memory. Novel SEU sensors with Error Correction Code (ECC) detect and repair errors in the BRAM memories. Proper management of SEU can increase reliability and availability of control instrumentation hardware for nuclear applications. The results of the tests performed using the SEM controller and the BRAM SEU sensors are presented for a Virtex-6 FPGA (XC6VLX240T-1FFG1156C) when irradiated with neutrons from the Portuguese Research Reactor (RPI), a 1 MW nuclear fission reactor operated by IST in the neighborhood of Lisbon. Results show that the proposed SEU mitigation technique is able to repair the majority of the detected SEU errors in the configuration and BRAM memories. (authors)
An integrated framework for high level design of high performance signal processing circuits on FPGAs

NASA Astrophysics Data System (ADS)

Benkrid, K.; Belkacemi, S.; Sukhsawas, S.

2005-06-01

This paper proposes an integrated framework for the high level design of high performance signal processing algorithms' implementations on FPGAs. The framework emerged from a constant need to rapidly implement increasingly complicated algorithms on FPGAs while maintaining the high performance needed in many real time digital signal processing applications. This is particularly important for application developers who often rely on iterative and interactive development methodologies. The central idea behind the proposed framework is to dynamically integrate high performance structural hardware description languages with higher level hardware languages in order to help satisfy the dual requirement of high level design and high performance implementation. The paper illustrates this by integrating two environments: Celoxica's Handel-C language, and HIDE, a structural hardware environment developed at the Queen's University of Belfast. On the one hand, Handel-C has been proven to be very useful in the rapid design and prototyping of FPGA circuits, especially control intensive ones. On the other hand, HIDE has been used extensively, and successfully, in the generation of highly optimised parameterisable FPGA cores. In this paper, this is illustrated in the construction of a scalable and fully parameterisable core for image algebra's five core neighbourhood operations, where fully floorplanned efficient FPGA configurations, in the form of EDIF netlists, are generated automatically for instances of the core. In the proposed combined framework, highly optimised data paths are invoked dynamically from within Handel-C, and are synthesized using HIDE. Although the idea might seem simple prima facie, it could have serious implications on the design of future generations of hardware description languages.
A multi-rate DPSK modem for free-space laser communications

NASA Astrophysics Data System (ADS)

Spellmeyer, N. W.; Browne, C. A.; Caplan, D. O.; Carney, J. J.; Chavez, M. L.; Fletcher, A. S.; Fitzgerald, J. J.; Kaminsky, R. D.; Lund, G.; Hamilton, S. A.; Magliocco, R. J.; Mikulina, O. V.; Murphy, R. J.; Rao, H. G.; Scheinbart, M. S.; Seaver, M. M.; Wang, J. P.

2014-03-01

The multi-rate DPSK format, which enables efficient free-space laser communications over a wide range of data rates, is finding applications in NASA's Laser Communications Relay Demonstration. We discuss the design and testing of an efficient and robust multi-rate DPSK modem, including aspects of the electrical, mechanical, thermal, and optical design. The modem includes an optically preamplified receiver, a 0.5-W average power transmitter, a LEON3 rad-hard microcontroller that provides the command and telemetry interface and supervisory control, and a Xilinx Virtex-5 rad-hard reprogrammable FPGA that both supports the high-speed data flow to and from the modem and controls the modem's analog and digital subsystems. For additional flexibility, the transmitter and receiver can be configured to support operation with multi-rate PPM waveforms.
Radiation Mitigation and Power Optimization Design Tools for Reconfigurable Hardware in Orbit

NASA Technical Reports Server (NTRS)

French, Matthew; Graham, Paul; Wirthlin, Michael; Wang, Li; Larchev, Gregory

2005-01-01

The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable gate array (FPGA) technology. In the second year of the project, design tools that leverage an established FPGA design environment have been created to visualize and analyze an FPGA circuit for radiation weaknesses and power inefficiencies. For radiation, a single event upset (SEU) emulator, a persistence analysis tool, and a half-latch removal tool for Xilinx Virtex-II devices have been created. Research is underway on a persistence mitigation tool and on multiple bit upset (MBU) studies. For power, synthesis-level dynamic power visualization and analysis tools have been completed. Power optimization tools are under development and preliminary test results are positive.
Replication of Space-Shuttle Computers in FPGAs and ASICs

NASA Technical Reports Server (NTRS)

Ferguson, Roscoe C.

2008-01-01

A document discusses the replication of the functionality of the onboard space-shuttle general-purpose computers (GPCs) in field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). The purpose of the replication effort is to enable utilization of proven space-shuttle flight software and software-development facilities to the extent possible during development of software for flight computers for a new generation of launch vehicles derived from the space shuttles. The replication involves specifying the instruction set of the central processing unit and the input/output processor (IOP) of the space-shuttle GPC in a hardware description language (HDL). The HDL is synthesized to form a "core" processor in an FPGA or, less preferably, in an ASIC. The core processor can be used to create a flight-control card to be inserted into a new avionics computer. The IOP of the GPC as implemented in the core processor could be designed to support data-bus protocols other than that of a multiplexer interface adapter (MIA) used in the space shuttle. Hence, a computer containing the core processor could be tailored to communicate via the space-shuttle GPC bus and/or one or more other buses.
Reconfigurable Fault Tolerance for FPGAs

NASA Technical Reports Server (NTRS)

Shuler, Robert, Jr.

2010-01-01

The invention allows a field-programmable gate array (FPGA) or similar device to be efficiently reconfigured in whole or in part to provide higher capacity, non-redundant operation. The redundant device consists of functional units such as adders or multipliers, configuration memory for the functional units, a programmable routing method, configuration memory for the routing method, and various other features such as block RAM (random access memory), I/O (input/output) capability, dedicated carry logic, etc. The redundant device has three identical sets of functional units and routing resources and majority voters that correct errors. The configuration memory may or may not be redundant, depending on need. For example, SRAM-based FPGAs will need some type of radiation-tolerant configuration memory, or they will need triple-redundant configuration memory. Flash or anti-fuse devices will generally not need redundant configuration memory. Some means of loading and verifying the configuration memory is also required. These are all components of the pre-existing redundant FPGA. This innovation modifies the voter to accept a MODE input, which specifies whether ordinary voting is to occur, or if redundancy is to be split. Generally, additional routing resources will also be required to pass data between sections of the device created by splitting the redundancy. In redundancy mode, the voters produce an output corresponding to the two inputs that agree, in the usual fashion. In the split mode, the voters select just one input and convey this to the output, ignoring the other inputs. In a dual-redundant system (as opposed to triple-redundant), instead of a voter, there is some means to latch or gate a state update only when both inputs agree. In this case, the invention would require modification of the latch or gate so that it would operate normally in redundant mode, and would separately latch or gate the inputs in non-redundant mode.
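The voter behaviour described in this abstract can be modelled in a few lines. This is a behavioural sketch only: the function names and the `select` argument (which input is passed through in split mode) are assumptions for illustration, since the abstract does not say how the selected input is chosen.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote, as in a conventional TMR voter."""
    return (a & b) | (a & c) | (b & c)


def mode_voter(a: int, b: int, c: int, split: bool, select: int = 0) -> int:
    """Voter with a MODE input.

    split=False: redundancy mode, output follows the two inputs that agree.
    split=True:  split mode, one input is passed through and the others ignored.
    """
    if split:
        return (a, b, c)[select]
    return majority(a, b, c)


# Redundancy mode masks a single upset on input c.
print(mode_voter(0b1010, 0b1010, 0b0010, split=False))   # prints 10
# Split mode treats the inputs as independent data paths.
print(mode_voter(0b1010, 0b0111, 0b0001, split=True, select=1))   # prints 7
```

In split mode the three formerly redundant sections carry independent data, which is where the higher-capacity, non-redundant operation comes from.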
A Test Methodology for Determining Space-Readiness of Xilinx SRAM-Based FPGA Designs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather M; Graham, Paul S; Morgan, Keith S

2008-01-01

Using reconfigurable, static random-access memory (SRAM) based field-programmable gate arrays (FPGAs) for space-based computation has been an exciting area of research for the past decade. Since both the circuit and the circuit's state are stored in radiation-tolerant memory, both could be altered by the harsh space radiation environment. Both the circuit and the circuit's state can be protected by triple-modular redundancy (TMR), but applying TMR to FPGA user designs is often an error-prone process. Faulty application of TMR could cause the FPGA user circuit to output incorrect data. This paper will describe a three-tiered methodology for testing FPGA user designs for space-readiness. We will describe the standard approach to testing FPGA user designs using a particle accelerator, as well as two methods using fault injection and a modeling tool. While accelerator testing is the current 'gold standard' for pre-launch testing, we believe the use of fault injection and modeling tools allows for easy, cheap and uniform access for discovering errors early in the design process.
A Hardware-Accelerated Quantum Monte Carlo framework (HAQMC) for N-body systems

NASA Astrophysics Data System (ADS)

Gothandaraman, Akila; Peterson, Gregory D.; Warren, G. Lee; Hinde, Robert J.; Harrison, Robert J.

2009-12-01

1 consisting of a dual-core, dual-processor AMD Opteron 2.2 GHz with a Xilinx Virtex-4 (V4LX160) or Xilinx Virtex-II Pro (XC2VP50) FPGA per node. We use the compute node with the Xilinx Virtex-4 FPGA. Operating system: Red Hat Enterprise Linux OS. Has the code been vectorised or parallelized?: Yes. Classification: 6.1. Nature of problem: Quantum Monte Carlo is a practical method to solve the Schrödinger equation for large many-body systems and obtain the ground-state properties of such systems. This method involves the sampling of a number of configurations of atoms and averaging the properties of the configurations over a number of iterations. We are interested in applying the QMC method to obtain the energy and other properties of highly quantum clusters, such as inert gas clusters. Solution method: The proposed framework provides a combined hardware-software approach, in which the QMC simulation is performed on the host processor, with the computationally intensive functions such as energy and trial wave function computations mapped onto the field-programmable gate array (FPGA) logic device attached as a co-processor to the host processor. We perform the QMC simulation for a number of iterations as in the case of our original software QMC approach, to reduce the statistical uncertainty of the results. However, our proposed HAQMC framework accelerates each iteration of the simulation, by significantly reducing the time taken to calculate the ground-state properties of the configurations of atoms, thereby accelerating the overall QMC simulation. We provide a generic interpolation framework that can be extended to study a variety of pure and doped atomic clusters, irrespective of the chemical identities of the atoms. For the FPGA implementation of the properties, we use a two-region approach for accurately computing the properties over the entire domain, employ deep pipelines and fixed-point for all our calculations guaranteeing the accuracy required for our simulation.
Field-programmable gate array implementation of an all-digital IEEE 802.15.4-compliant transceiver

NASA Astrophysics Data System (ADS)

Cornetta, Gianluca; Touhafi, Abdellah; Santos, David J.; Vázquez, José M.

2010-12-01

An architecture for a low-cost, low-complexity digital transceiver is presented in this article. The proposed architecture targets the IEEE 802.15.4 standard for short-range wireless personal area networks and has been implemented as a synthesisable VHDL register transfer level description. The system has been evaluated and tested using a Xilinx 90 nm Virtex-4 field-programmable gate array as the target technology. Bit error rate (BER) and error vector magnitude (EVM) have been used as the figures of merit for modem performance. Simulations show that the recommended minimum BER is achieved at E_b/N_0 = 8.7 dB, whereas the EVM is 19.5%. The implemented device occupies 10% of the target FPGA and has a normalised maximum power consumption of 44 mW in transmit mode and 53 mW in receive mode.
Upgrading the Digital Electronics of the PEP-II Bunch Current Monitors at the Stanford Linear Accelerator Center

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kline, Josh; /SLAC

2006-08-28

The testing of the upgrade prototype for the bunch current monitors (BCMs) in the PEP-II storage rings at the Stanford Linear Accelerator Center (SLAC) is the topic of this paper. Bunch current monitors are used to measure the charge in the electron/positron bunches traveling in particle storage rings. The BCMs in the PEP-II storage rings need to be upgraded because components of the current system have failed and are known to be failure prone with age, and several of the integrated chips are no longer produced, making repairs difficult if not impossible. The main upgrade is replacing twelve old (1995) field programmable gate arrays (FPGAs) with a single Virtex II FPGA. The prototype was tested using computer synthesis tools, a commercial signal generator, and a fast pulse generator.
In-beam experience with a highly granular DAQ and control network: TrbNet

NASA Astrophysics Data System (ADS)

Michel, J.; Korcyl, G.; Maier, L.; Traxler, M.

2013-02-01

Virtually all Data Acquisition Systems (DAQ) for nuclear and particle physics experiments use a large number of Field Programmable Gate Arrays (FPGAs) for data transport and more complex tasks such as pattern recognition and data reduction. All these FPGAs in a large system have to share a common state, like a trigger number or an epoch counter, to keep the system synchronized for consistent event/epoch building. Additionally, the collected data has to be transported with high bandwidth, optionally via the ubiquitous Ethernet protocol. Furthermore, the FPGAs' internal states and configuration memories have to be accessed for control and monitoring purposes. Another requirement for a modern DAQ network is fault tolerance for intermittent data errors in the form of automatic retransmission of faulty data. As FPGAs suffer from Single Event Effects when exposed to ionizing particles, the system has to deal with failing FPGAs. The TrbNet protocol was developed taking all these requirements into account. Three virtual channels are merged on one physical medium: the trigger/epoch information is transported with the highest priority, the data channel is second in the priority order, and the control channel is last. Combined with a small frame size of 80 bit, this guarantees a low-latency data transport: a system with 100 front-ends can be built with a one-way latency of 2.2 µs. The TrbNet protocol was implemented in each of the 550 FPGAs of the HADES upgrade project and has been successfully used during the Au+Au campaign in April 2012. With 2·10^6/s Au ions and a 3% interaction ratio, the accepted trigger rate is 10 kHz while data is written to storage at 150 MBytes/s. Errors are reliably mitigated via the implemented retransmission of packets and auto-shut-down of individual links. TrbNet was also used for full monitoring of the FEE status. The network stack is written in VHDL and was successfully deployed on various Lattice and Xilinx devices. The TrbNet is also
FPGA implementation of neuro-fuzzy system with improved PSO learning.

PubMed

Karakuzu, Cihan; Karakaya, Fuat; Çavuşlu, Mehmet Ali

2016-07-01

This paper presents the first hardware implementation of a neuro-fuzzy system (NFS) with its metaheuristic learning ability on a field programmable gate array (FPGA). Metaheuristic learning of all NFS parameters is accomplished by using improved particle swarm optimization (iPSO). As a second novelty, a new functional approach, which does not require any memory or multiplier usage, is proposed for the Gaussian membership functions of the NFS. The NFS and its learning using iPSO are implemented on a Xilinx Virtex5 xc5vlx110-3ff1153, and the efficiency of the proposed implementation is tested on two dynamic system identification problems and on a licence plate detection problem as a practical application. Results indicate that the proposed NFS implementation and membership function approximation are as effective as the other approaches available in the literature but require fewer hardware resources. Copyright © 2016 Elsevier Ltd. All rights reserved.
FPGA Online Tracking Algorithm for the PANDA Straw Tube Tracker

NASA Astrophysics Data System (ADS)

Liang, Yutie; Ye, Hua; Galuska, Martin J.; Gessler, Thomas; Kuhn, Wolfgang; Lange, Jens Soren; Wagner, Milan N.; Liu, Zhen'an; Zhao, Jingzhou

2017-06-01

A novel FPGA-based online tracking algorithm for helix track reconstruction in a solenoidal field, developed for the PANDA spectrometer, is described. Employing the Straw Tube Tracker detector with 4636 straw tubes, the algorithm includes a complex track finder and a track fitter. Implemented in VHDL, the algorithm is tested on a Xilinx Virtex-4 FX60 FPGA chip with different types of events, at different event rates. A processing time of 7 $\mu$s per event for an average of 6 charged tracks is obtained. The momentum resolution is about 3% (4%) for $p_t$ ($p_z$) at 1 GeV/c. Compared to the algorithm running on a CPU chip (single core Intel Xeon E5520 at 2.26 GHz), an improvement of 3 orders of magnitude in processing time is obtained. The algorithm can handle severe overlapping of events, which is typical for interaction rates above 10 MHz.
Image processing for a tactile/vision substitution system using digital CNN.

PubMed

Lin, Chien-Nan; Yu, Sung-Nien; Hu, Jin-Cheng

2006-01-01

In view of the parallel processing and easy implementation properties of CNN, we propose to use a digital CNN as the image processor of a tactile/vision substitution system (TVSS). The digital CNN processor is used to execute the wavelet down-sampling filtering and the half-toning operations, aiming to extract important features from the images. A template combination method is used to embed the two image processing functions into a single CNN processor. The digital CNN processor is designed as an intellectual property (IP) core and is implemented on a XILINX VIRTEX II 2000 FPGA board. Experiments are designed to test the capability of the CNN processor in the recognition of characters and human subjects in different environments. The experiments demonstrate impressive results, which show the proposed digital CNN processor to be a powerful component in the design of efficient tactile/vision substitution systems for visually impaired people.
Research of x-ray nondestructive detector for high-speed running conveyor belt with steel wire ropes

NASA Astrophysics Data System (ADS)

Wang, Junfeng; Miao, Changyun; Wang, Wei; Lu, Xiaocui

2008-03-01

An X-ray nondestructive detector for high-speed running conveyor belts with steel wire ropes is studied in this paper. The principle of X-ray nondestructive testing (NDT) is analyzed, the general scheme of the X-ray nondestructive testing system is proposed, and the nondestructive detector for high-speed running conveyor belts with steel wire ropes is developed. The hardware of the system is designed around Xilinx's VIRTEX-4 FPGA, which embeds a PowerPC and a MAC IP core, and its network communication software, based on the TCP/IP protocol, is programmed by loading LwIP onto the PowerPC. The nondestructive testing of high-speed conveyor belts with steel wire ropes and the network transfer function are implemented. It is a strongly real-time system with rapid scanning speed, high reliability and a remote nondestructive testing function. The nondestructive detector can be applied to the inspection of production lines in industry.
A Practical, Hardware Friendly MMSE Detector for MIMO-OFDM-Based Systems

NASA Astrophysics Data System (ADS)

Kim, Hun Seok; Zhu, Weijun; Bhatia, Jatin; Mohammed, Karim; Shah, Anish; Daneshrad, Babak

2008-12-01

Design and implementation of a highly optimized MIMO (multiple-input multiple-output) detector requires co-optimization of the algorithm with the underlying hardware architecture. Special attention must be paid to application requirements such as throughput, latency, and resource constraints. In this work, we focus on a highly optimized matrix inversion free [InlineEquation not available: see fulltext.] MMSE (minimum mean square error) MIMO detector implementation. The work has resulted in a real-time field-programmable gate array (FPGA)-based implementation on a Xilinx Virtex-2 6000 using only 9003 logic slices, 66 multipliers, and 24 Block RAMs (less than 33% of the overall resources of this part). The design delivers over 420 Mbps sustained throughput with a small 2.77-microsecond latency. The designed [InlineEquation not available: see fulltext.] linear MMSE MIMO detector is capable of complying with the proposed IEEE 802.11n standard.
Subnanosecond time-to-digital converter implemented in a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Sano, Y.; Horii, Y.; Ikeno, M.; Sasaki, O.; Tomoto, M.; Uchida, T.

2017-12-01

Time-to-digital converters (TDCs) are used in various fields, including high-energy physics. One advantage of implementing TDCs in field-programmable gate arrays (FPGAs) is the flexibility on the modification of the logics, which is useful to cope with the changes in the experimental conditions. Recent FPGAs make it possible to implement TDCs with a time resolution less than 10 ps. On the other hand, various drift chambers require a time resolution of O(0.1) ns, and a simple and easy-to-implement TDC is useful for a robust operation. Herein an eight-channel TDC with a variable bin size down to 0.28 ns is implemented in a Xilinx Kintex-7 FPGA and tested. The TDC is based on a multisampling scheme with quad phase clocks synchronised with an external reference clock. Calibration of the bin size is unnecessary if a stable reference clock is available, which is common in high-energy physics experiments. Depending on the channel, the standard deviation of the differential nonlinearity for a 0.28 ns bin size is 0.13-0.31. The performance has a negligible dependence on the temperature. The power consumption and the potential to extend the number of channels are also discussed.
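The multisampling idea in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions: the function name `multisample_tdc`, the integer-picosecond time base, and the 1120 ps clock period (chosen so that four phases give a 280 ps bin) are hypothetical and are not taken from the paper. It assumes the bin size equals the clock period divided by the number of sampling phases.

```python
def multisample_tdc(hit_time_ps, clock_period_ps, n_phases=4):
    """Software model of a multi-phase sampling TDC (illustrative only).

    The input is sampled by n_phases clocks that are evenly spaced in
    phase, so the effective bin size is clock_period_ps / n_phases.
    Returns (timestamp_in_bins, bin_size_ps).
    """
    if clock_period_ps % n_phases != 0:
        raise ValueError("clock period must be divisible by the number of phases")
    bin_ps = clock_period_ps // n_phases
    # Coarse count: number of full clock periods before the hit.
    coarse, offset_ps = divmod(hit_time_ps, clock_period_ps)
    # Fine count: which phase window inside the period contains the hit.
    fine = offset_ps // bin_ps
    return coarse * n_phases + fine, bin_ps


# Assumed example: 1120 ps period with quad-phase clocks gives 280 ps bins.
timestamp, bin_ps = multisample_tdc(3000, 1120)
print(timestamp, bin_ps)
```

In this model a hit at 3000 ps falls in bin 10, which covers 2800 ps to 3080 ps.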
A self-timed multipurpose delay sensor for Field Programmable Gate Arrays (FPGAs).

PubMed

Osuna, Carlos Gómez; Ituero, Pablo; López-Vallejo, Marisa

2013-12-20

This paper presents a novel self-timed multi-purpose sensor especially conceived for Field Programmable Gate Arrays (FPGAs). The aim of the sensor is to measure performance variations during the life-cycle of the device, such as process variability, critical path timing and temperature variations. The proposed topology, through the use of both combinational and sequential FPGA elements, amplifies the time of a signal traversing a delay chain to produce a pulse whose width is the sensor's measurement. The sensor is fully self-timed, avoiding the need for clock distribution networks and eliminating the limitations imposed by the system clock. One single off- or on-chip time-to-digital converter is able to perform digitization of several sensors in a single operation. These features allow for a simplified approach for designers wanting to intertwine a multi-purpose sensor network with their application logic. Employed as a temperature sensor, it has been measured to have an error of ±0.67 °C, over the range of 20-100 °C, employing 20 logic elements with a 2-point calibration.

A Self-Timed Multipurpose Delay Sensor for Field Programmable Gate Arrays (FPGAs)

PubMed Central

Osuna, Carlos Gómez; Ituero, Pablo; López-Vallejo, Marisa

2014-01-01

This paper presents a novel self-timed multi-purpose sensor especially conceived for Field Programmable Gate Arrays (FPGAs). The aim of the sensor is to measure performance variations during the life-cycle of the device, such as process variability, critical path timing and temperature variations. The proposed topology, through the use of both combinational and sequential FPGA elements, amplifies the time of a signal traversing a delay chain to produce a pulse whose width is the sensor's measurement. The sensor is fully self-timed, avoiding the need for clock distribution networks and eliminating the limitations imposed by the system clock. One single off- or on-chip time-to-digital converter is able to perform digitization of several sensors in a single operation. These features allow for a simplified approach for designers wanting to intertwine a multi-purpose sensor network with their application logic. Employed as a temperature sensor, it has been measured to have an error of ±0.67 °C, over the range of 20–100 °C, employing 20 logic elements with a 2-point calibration. PMID:24361927
Dynamic partial reconfiguration of logic controllers implemented in FPGAs

NASA Astrophysics Data System (ADS)

Bazydło, Grzegorz; Wiśniewski, Remigiusz

2016-09-01

Technological progress in recent years has resulted in digital circuits containing millions of logic gates with the capability for reprogramming and reconfiguring. On the one hand this provides unprecedented computational power, but on the other hand the modelled systems are becoming increasingly complex, hierarchical and concurrent. Therefore, abstract modelling supported by Computer Aided Design tools becomes a very important task. Even the higher consumption of basic electronic components seems to be acceptable, because chip manufacturing costs tend to fall over time. The paper presents a modelling approach for logic controllers using the Unified Modelling Language (UML). Following the Model Driven Development approach, a collection of Verilog files is created, starting with a UML state machine model and proceeding through the construction of an intermediate Hierarchical Concurrent Finite State Machine model. The system description generated in the hardware description language can be synthesized and implemented in reconfigurable devices, such as FPGAs. Modular specification of the prototyped controller permits further dynamic partial reconfiguration of the prototyped system. The idea is based on exchanging the functionality of the already implemented controller without stopping the FPGA device. This means that a part (for example a single module) of the logic controller is replaced by another version (called a context), while the rest of the system is still running. The method is illustrated by a practical example of a Home Area Network system.
Efficient Smart CMOS Camera Based on FPGAs Oriented to Embedded Image Processing

PubMed Central

Bravo, Ignacio; Baliñas, Javier; Gardel, Alfredo; Lázaro, José L.; Espinosa, Felipe; García, Jorge

2011-01-01

This article describes an image processing system based on an intelligent ad-hoc camera, whose two principal elements are a high speed 1.2 megapixel Complementary Metal Oxide Semiconductor (CMOS) sensor and a Field Programmable Gate Array (FPGA). The latter is used to control the various sensor parameter configurations and, where desired, to receive and process the images captured by the CMOS sensor. The flexibility and versatility offered by the new FPGA families make it possible to incorporate microprocessors into these reconfigurable devices, and these are normally used for highly sequential tasks unsuitable for parallelization in hardware. For the present study, we used a Xilinx XC4VFX12 FPGA, which contains an internal Power PC (PPC) microprocessor. In turn, this contains a standalone system which manages the FPGA image processing hardware and endows the system with multiple software options for processing the images captured by the CMOS sensor. The system also incorporates an Ethernet channel for sending processed and unprocessed images from the FPGA to a remote node. Consequently, it is possible to visualize and configure system operation and captured and/or processed images remotely. PMID:22163739
SpaceCube v2.0 Space Flight Hybrid Reconfigurable Data Processing System

NASA Technical Reports Server (NTRS)

Petrick, Dave

2014-01-01

This paper details the design architecture, design methodology, and the advantages of the SpaceCube v2.0 high performance data processing system for space applications. The purpose in building the SpaceCube v2.0 system is to create a superior high performance, reconfigurable, hybrid data processing system that can be used in a multitude of applications including those that require a radiation hardened and reliable solution. The SpaceCube v2.0 system leverages seven years of board design, avionics systems design, and space flight application experiences. This paper shows how SpaceCube v2.0 solves the increasing computing demands of space data processing applications that cannot be attained with a standalone processor approach. The main objective during the design stage is to find a good system balance between power, size, reliability, cost, and data processing capability. These design variables directly impact each other, and it is important to understand how to achieve a suitable balance. This paper will detail how these critical design factors were managed including the construction of an Engineering Model for an experiment on the International Space Station to test out design concepts. We will describe the designs for the processor card, power card, backplane, and a mission unique interface card. The mechanical design for the box will also be detailed since it is critical in meeting the stringent thermal and structural requirements imposed by the processing system. In addition, the mechanical design uses advanced thermal conduction techniques to solve the internal thermal challenges. The SpaceCube v2.0 processing system is based on an extended version of the 3U cPCI standard form factor where each card is 190mm x 100mm in size. The typical power draw of the processor card is 8 to 10W and scales with application complexity. The SpaceCube v2.0 data processing card features two Xilinx Virtex-5 QV Field Programmable Gate Arrays (FPGA), eight memory modules, a monitor
Digital Radar-Signal Processors Implemented in FPGAs

NASA Technical Reports Server (NTRS)

Berkun, Andrew; Andraka, Ray

2004-01-01

High-performance digital electronic circuits for onboard processing of return signals in an airborne precipitation-measuring radar system have been implemented in commercially available field-programmable gate arrays (FPGAs). Previously, it was standard practice to downlink the radar-return data to a ground station for postprocessing, a costly practice that prevents the nearly-real-time use of the data for automated targeting. In principle, the onboard processing could be performed by a system of about 20 personal-computer-type microprocessors; relative to such a system, the present FPGA-based processor is much smaller and consumes much less power. Alternatively, the onboard processing could be performed by an application-specific integrated circuit (ASIC), but in comparison with an ASIC implementation, the present FPGA implementation offers the advantages of (1) greater flexibility for research applications like the present one and (2) lower cost in the small production volumes typical of research applications. The generation and processing of signals in the airborne precipitation-measuring radar system in question involves the following especially notable steps: The system utilizes a total of four channels (two carrier frequencies and two polarizations at each frequency). The system uses pulse compression: that is, the transmitted pulse is spread out in time and the received echo of the pulse is processed with a matched filter to despread it. The return signal is band-limited and digitally demodulated to a complex baseband signal that, for each pulse, comprises a large number of samples. Each complex pair of samples (denoted a range gate in radar terminology) is associated with a numerical index that corresponds to a specific time offset from the beginning of the radar pulse, so that each such pair represents the energy reflected from a specific range. This energy and the average echo power are computed. The phase of each range bin is compared to the previous echo
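The pulse-compression step described in this record (a pulse spread out in time, then despread with a matched filter) can be sketched in a few lines of software. This is an illustrative model only, not the FPGA design from the record: the linear-chirp waveform, the pulse length of 32 samples, the echo delay of 40 samples and the function names are all assumptions introduced here.

```python
import cmath


def linear_chirp(n_samples):
    """Complex baseband linear-FM pulse (an assumed waveform)."""
    return [
        cmath.exp(1j * cmath.pi * 0.5 * (k - n_samples / 2) ** 2 / n_samples)
        for k in range(n_samples)
    ]


def pulse_compress(echo, pulse):
    """Matched filter: correlate the echo with the transmitted pulse.

    Output index d is the range gate, i.e. the delay in samples.
    """
    m = len(pulse)
    compressed = []
    for d in range(len(echo) - m + 1):
        acc = sum(echo[d + j] * pulse[j].conjugate() for j in range(m))
        compressed.append(acc)
    return compressed


# Assumed scenario: one reflector at a delay of 40 samples, amplitude 0.5.
pulse = linear_chirp(32)
echo = [0j] * 128
for j, sample in enumerate(pulse):
    echo[40 + j] += 0.5 * sample

gates = pulse_compress(echo, pulse)
power = [abs(g) ** 2 for g in gates]        # energy per range gate
peak_gate = max(range(len(power)), key=power.__getitem__)
print(peak_gate)
```

The 32-sample echo collapses to a peak at range gate 40, which is the despreading effect the abstract refers to.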
An embedded vision system for an unmanned four-rotor helicopter

NASA Astrophysics Data System (ADS)

Lillywhite, Kirt; Lee, Dah-Jye; Tippetts, Beau; Fowers, Spencer; Dennis, Aaron; Nelson, Brent; Archibald, James

2006-10-01

In this paper an embedded vision system and control module is introduced that is capable of controlling an unmanned four-rotor helicopter and processing live video for various law enforcement, security, military, and civilian applications. The vision system is implemented on a newly designed compact FPGA board (Helios). The Helios board contains a Xilinx Virtex-4 FPGA chip and memory, making it capable of implementing real-time vision algorithms. A Smooth Automated Intelligent Leveling daughter board (SAIL), attached to the Helios board, collects attitude and heading information to be processed in order to control the unmanned helicopter. The SAIL board uses an electrolytic tilt sensor, compass, voltage level converters, and analog to digital converters to perform its operations. While level flight can be maintained, problems stemming from the characteristics of the tilt sensor limit the maneuverability of the helicopter. The embedded vision system has proven to give very good results in its performance of a number of real-time robotic vision algorithms.
A Secure Content Delivery System Based on a Partially Reconfigurable FPGA

NASA Astrophysics Data System (ADS)

Hori, Yohei; Yokoyama, Hiroyuki; Sakane, Hirofumi; Toda, Kenji

We developed a content delivery system using a partially reconfigurable FPGA to securely distribute digital content on the Internet. With the partial reconfigurability of a Xilinx Virtex-II Pro FPGA, the system provides an innovative single-chip solution for protecting digital content. In the system, a partial circuit must be downloaded from a server to the client terminal to play content. Content will be played only when the downloaded circuit is correctly combined (=interlocked) with the circuit built into the terminal. Since each circuit has a unique I/O configuration, the downloaded circuit interlocks only with the corresponding built-in circuit designed for a particular terminal. Thus, the interface of the circuit itself provides a novel authentication mechanism. This paper describes the detailed architecture of the system and clarifies the feasibility and effectiveness of the system. In addition, we discuss a fail-safe mechanism and future work necessary for the practical application of the system.
Electronics design of a multi-rate DPSK modem for free-space optical communications

NASA Astrophysics Data System (ADS)

Rao, H. G.; Browne, C. A.; Caplan, D. O.; Carney, J. J.; Chavez, M. L.; Fletcher, A. S.; Fitzgerald, J. J.; Kaminsky, R. D.; Lund, G.; Hamilton, S. A.; Magliocco, R. J.; Mikulina, O. V.; Murphy, R. J.; Seaver, M. M.; Scheinbart, M. S.; Spellmeyer, N. W.; Wang, J. P.

2014-03-01

We have designed and experimentally demonstrated a radiation-hardened modem suitable for NASA's Laser Communications Relay Demonstration. The modem supports free-space DPSK communication over a wide range of channel rates, from 72 Mb/s up to 2.88 Gb/s. The modem transmitter electronics generate a bursty DPSK waveform, such that only one optical modulator is required. The receiver clock recovery is capable of operating over all channel rates at average optical signal levels below -70 dBm. The modem incorporates a radiation-hardened Xilinx Virtex 5 FPGA and a radiation-hardened Aeroflex UT699 CPU. The design leverages unique capabilities of each device, such as the FPGA's multi-gigabit transceivers. The modem scrubs itself against radiation events, but does not require pervasive triple-mode redundant logic. The modem electronics include automatic stabilization functions for its optical components, and software to control its initialization and operation. The design allows the modem to be put into a low-power standby mode.
Board Saver for Use with Developmental FPGAs

NASA Technical Reports Server (NTRS)

Berkun, Andrew

2009-01-01

A device denoted a board saver has been developed as a means of reducing wear and tear of a printed-circuit board onto which an antifuse field programmable gate array (FPGA) is to be eventually soldered permanently after a number of design iterations. The need for the board saver or a similar device arises because (1) antifuse-FPGA design iterations are common and (2) repeated soldering and unsoldering of FPGAs on the printed-circuit board to accommodate design iterations can wear out the printed-circuit board. The board saver is basically a solderable/unsolderable FPGA receptacle that is installed temporarily on the printed-circuit board. The board saver is, more specifically, a smaller, square-ring-shaped, printed-circuit board (see figure) that contains half via holes, one for each contact pad, along its periphery. As initially fabricated, the board saver is a wider ring containing full via holes, but then it is milled along its outer edges, cutting the via holes in half and laterally exposing their interiors. The board saver is positioned in registration with the designated FPGA footprint and each via hole is soldered to the outer portion of the corresponding FPGA contact pad on the first-mentioned printed-circuit board. The via-hole/contact joints can be inspected visually and can be easily unsoldered later. The square hole in the middle of the board saver is sized to accommodate the FPGA, and the thickness of the board saver is the same as that of the FPGA. Hence, when a non-final FPGA is placed in the square hole, the combination of the non-final FPGA and the board saver occupies no more area and thickness than would a final FPGA soldered directly into its designated position on the first-mentioned circuit board. The contact leads of a non-final FPGA are not bent and are soldered, at the top of the board saver, to the corresponding via holes. A non-final FPGA can readily be unsoldered from the board saver and replaced by another one. Once the final FPGA design
Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation.

PubMed

Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

2017-03-10

Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing it, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities is targeted.
FPGA-based RF spectrum merging and adaptive hopset selection

NASA Astrophysics Data System (ADS)

McLean, R. K.; Flatley, B. N.; Silvius, M. D.; Hopkinson, K. M.

The radio frequency (RF) spectrum is a limited resource. Spectrum allotment disputes stem from this scarcity, as many radio devices are confined to a fixed frequency or frequency sequence. One alternative is to incorporate cognition within a reconfigurable radio platform, thereby enabling the radio to adapt to dynamic RF spectrum environments. In this way, the radio is able to actively sense the RF spectrum, decide, and act accordingly, thereby sharing the spectrum and operating in a more flexible manner. In this paper, we present a novel solution for merging many distributed RF spectrum maps into one map and for subsequently creating an adaptive hopset. We also provide an example of our system in operation, the result of which is a pseudorandom adaptive hopset. The paper then presents a novel hardware design for the frequency merger and adaptive hopset selector, both of which are written in VHDL and implemented as a custom IP core on an FPGA-based embedded system using the Xilinx Embedded Development Kit (EDK) software tool. The design of the custom IP core is optimized for area, and it can process a high-volume digital input via a low-latency circuit architecture. The complete embedded system includes the Xilinx PowerPC microprocessor, UART serial connection, and compact flash memory card IP cores, and our custom map merging/hopset selection IP core, all of which are targeted to the Virtex IV FPGA. This system is then incorporated into a cognitive radio prototype on a Rice University Wireless Open Access Research Platform (WARP) reconfigurable radio.
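One way to picture the map merging and hopset selection described above is the following sketch. The abstract does not give the merge rule or the selection method, so everything here is an assumption: each spectrum map is modelled as a list of booleans (True meaning the channel was sensed as occupied), maps are merged with a logical OR, and the hop sequence is drawn pseudorandomly from the remaining free channels. The function names are hypothetical.

```python
import random


def merge_spectrum_maps(maps):
    """Merge per-node occupancy maps; a channel is busy if any node saw it busy.

    Assumed rule (logical OR); the paper's actual merge rule may differ.
    """
    return [any(channel_reports) for channel_reports in zip(*maps)]


def adaptive_hopset(merged_map, length, seed):
    """Draw a pseudorandom hop sequence restricted to free channels."""
    free_channels = [ch for ch, busy in enumerate(merged_map) if not busy]
    if not free_channels:
        raise ValueError("no free channels available")
    rng = random.Random(seed)  # seeded so all radios can derive the same hopset
    return [rng.choice(free_channels) for _ in range(length)]


# Two nodes report occupancy for four channels.
maps = [
    [True, False, False, False],
    [False, False, True, False],
]
merged = merge_spectrum_maps(maps)
hops = adaptive_hopset(merged, length=8, seed=1)
print(merged, hops)
```

Seeding the generator is a design choice of this sketch: radios that share the merged map and the seed would arrive at the same hop sequence without exchanging it.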
FPGA Coprocessor for Accelerated Classification of Images

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.

2008-01-01

An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of recently developed hybrid Virtex-4FX field-programmable gate arrays (FPGAs) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for greater onboard processing capability than previously demonstrated in software. The original C-language program, demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite, implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, cloud, or unclassified. Current onboard processors, such as those on EO-1, have limited computing power and extremely limited active storage capability, and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program and two newly formulated programs, for a more capable expanded-linear-kernel and a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.
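A linear-kernel SVM classifier of the kind mentioned above reduces, at inference time, to one dot product and one bias per class. The sketch below shows that computation for a single pixel. It is not the EO-1 code: the one-vs-rest arrangement, the rule that a pixel stays "unclassified" when no score is positive, the two-band feature vector, and the weights are all hypothetical values chosen for illustration.

```python
def classify_pixel(features, models, default="unclassified"):
    """Pick the class whose linear decision function w.x + b is largest.

    models maps a label to (weights, bias). A pixel keeps the default
    label when no class produces a positive score (assumed rule).
    """
    best_label, best_score = default, 0.0
    for label, (weights, bias) in models.items():
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical two-band models; real weights would come from training.
models = {
    "snow": ([1.0, 0.0], -0.5),
    "water": ([0.0, 1.0], -0.5),
}
print(classify_pixel([0.9, 0.1], models))
print(classify_pixel([0.1, 0.1], models))
```

Because each pixel is scored independently with a fixed number of multiply-accumulate operations, this kind of computation maps naturally onto parallel hardware.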
Real-Time Digital Signal Processing Based on FPGAs for Electronic Skin Implementation

PubMed Central

Ibrahim, Ali; Gastaldo, Paolo; Chible, Hussein; Valle, Maurizio

2017-01-01

Enabling touch-sensing capability would help appliances understand interaction behaviors with their surroundings. Many recent studies are focusing on the development of electronic skin because of its necessity in various application domains, namely autonomous artificial intelligence (e.g., robots), biomedical instrumentation, and replacement prosthetic devices. An essential task of the electronic skin system is to locally process the tactile data and send structured information either to mimic human skin or to respond to the application demands. The electronic skin must be fabricated together with an embedded electronic system which has the role of acquiring the tactile data, processing it, and extracting structured information. On the other hand, processing tactile data requires efficient methods to extract meaningful information from raw sensor data. Machine learning represents an effective method for data analysis in many domains: it has recently demonstrated its effectiveness in processing tactile sensor data. In this framework, this paper presents the implementation of digital signal processing based on FPGAs for tactile data processing. It provides the implementation of a tensorial kernel function for a machine learning approach. Implementation results are assessed by highlighting the FPGA resource utilization and power consumption. Results demonstrate the feasibility of the proposed implementation when real-time classification of input touch modalities is targeted. PMID:28287448
A Method of Sky Ripple Residual Nonuniformity Reduction for a Cooled Infrared Imager and Hardware Implementation.

PubMed

Li, Yiyang; Jin, Weiqi; Li, Shuo; Zhang, Xu; Zhu, Jin

2017-05-08

Cooled infrared detector arrays always suffer from undesired ripple residual nonuniformity (RNU) in sky scene observations. The ripple residual nonuniformity seriously affects the imaging quality, especially for small target detection. It is difficult to eliminate using calibration-based techniques or current scene-based nonuniformity correction algorithms. In this paper, we present a modified temporal high-pass nonuniformity correction algorithm using fuzzy scene classification. The fuzzy scene classification is designed to control the correction threshold so that the algorithm can remove ripple RNU without degrading the scene details. We test the algorithm on a real infrared sequence by comparing it to several well-established methods. The result shows that the algorithm has obvious advantages compared with the tested methods in terms of detail conservation and convergence speed for ripple RNU correction. Furthermore, we demonstrate our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA), which has two advantages: (1) low resource consumption; and (2) small hardware delay (less than 10 image rows). It has been successfully applied in an actual system.
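For orientation, the classic temporal high-pass correction that the paper modifies can be sketched as follows. This is the textbook baseline only, written under assumptions: each pixel keeps a recursive low-pass estimate of its own slowly varying offset and subtracts it from the incoming value. The fuzzy scene classification and the correction threshold described in the abstract are not reproduced, and the function name and `time_constant` parameter are hypothetical.

```python
def temporal_highpass_nuc(frames, time_constant=16):
    """Baseline temporal high-pass nonuniformity correction (illustrative).

    frames: list of frames, each a flat list of pixel values.
    For every pixel, low[i] tracks the temporal mean (fixed-pattern offset
    plus static scene); the corrected output is the input minus that mean.
    """
    low = list(frames[0])  # initialise the low-pass state with the first frame
    corrected_frames = []
    for frame in frames:
        corrected = []
        for i, value in enumerate(frame):
            low[i] += (value - low[i]) / time_constant  # recursive low-pass
            corrected.append(value - low[i])            # high-pass output
        corrected_frames.append(corrected)
    return corrected_frames


# A step of 8 counts on one pixel, with an assumed time constant of 4 frames.
out = temporal_highpass_nuc([[0.0], [8.0]], time_constant=4)
print(out)
```

The baseline also suppresses static scene content along with the fixed-pattern offset, which is the kind of detail loss that a scene-dependent correction threshold is meant to limit.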
Real-Time On-Board Processing Validation of MSPI Ground Camera Images

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Werne, Thomas A.; Bekker, Dmitriy L.

2010-01-01

The Earth Sciences Decadal Survey identifies a multiangle, multispectral, high-accuracy polarization imager as one requirement for the Aerosol-Cloud-Ecosystem (ACE) mission. JPL has been developing a Multiangle SpectroPolarimetric Imager (MSPI) as a candidate to fill this need. A key technology development needed for MSPI is on-board signal processing to calculate polarimetry data as imaged by each of the 9 cameras forming the instrument. With funding from NASA's Advanced Information Systems Technology (AIST) Program, JPL is addressing the real-time data processing requirements to demonstrate, for the first time, how signal data at 95 Mbytes/sec over 16 channels for each of the 9 multiangle cameras in the spaceborne instrument can be reduced on-board to 0.45 Mbytes/sec. This will produce the intensity and polarization data needed to characterize aerosol and cloud microphysical properties. Using the Xilinx Virtex-5 FPGA, which includes PowerPC440 processors, we have implemented a least squares fitting algorithm that extracts intensity and polarimetric parameters in real time, thereby substantially reducing the image data volume for spacecraft downlink without loss of science information.
Exploiting the chaotic behaviour of atmospheric models with reconfigurable architectures

NASA Astrophysics Data System (ADS)

Russell, Francis P.; Düben, Peter D.; Niu, Xinyu; Luk, Wayne; Palmer, T. N.

2017-12-01

Reconfigurable architectures are becoming mainstream: Amazon, Microsoft and IBM are supporting such architectures in their data centres. The computationally intensive nature of atmospheric modelling is an attractive target for hardware acceleration using reconfigurable computing. Performance of hardware designs can be improved through the use of reduced-precision arithmetic, but maintaining appropriate accuracy is essential. We explore reduced-precision optimisation for simulating chaotic systems, targeting atmospheric modelling, in which even minor changes in arithmetic behaviour will cause simulations to diverge quickly. The possibility of equally valid simulations having differing outcomes means that standard techniques for comparing numerical accuracy are inappropriate. We use the Hellinger distance to compare statistical behaviour between reduced-precision CPU implementations to guide reconfigurable designs of a chaotic system, then analyse accuracy, performance and power efficiency of the resulting implementations. Our results show that with only a limited loss in accuracy corresponding to less than 10% uncertainty in input parameters, the throughput and energy efficiency of a single-precision chaotic system implemented on a Xilinx Virtex-6 SX475T Field Programmable Gate Array (FPGA) can be more than doubled.
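The Hellinger distance mentioned in the abstract above compares two probability distributions rather than two individual trajectories, which is why it suits chaotic systems. The following is a minimal sketch for discrete histograms; the function name and the idea of histogramming a model variable are assumptions for illustration, not the authors' code.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of non-negative bin weights
    (for example histogram counts); each is normalised internally.
    The result lies in [0, 1]: 0 for identical distributions and
    1 for distributions with disjoint support.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    squared = sum(
        (math.sqrt(a / total_p) - math.sqrt(b / total_q)) ** 2
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / 2.0)
```

In a setting like the one described, `p` might hold the binned values of a state variable from a reference run and `q` the same bins from a reduced-precision run; a small distance would suggest that the statistical behaviour is preserved even though the trajectories diverge.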
A high-throughput two channel discrete wavelet transform architecture for the JPEG2000 standard

NASA Astrophysics Data System (ADS)

Badakhshannoory, Hossein; Hashemi, Mahmoud R.; Aminlou, Alireza; Fatemi, Omid

2005-07-01

The Discrete Wavelet Transform (DWT) is increasingly recognized in image and video compression standards, as indicated by its use in JPEG2000. The lifting scheme algorithm is an alternative DWT implementation that has a lower computational complexity and reduced resource requirement. In the JPEG2000 standard two lifting scheme based filter banks are introduced: the 5/3 and 9/7. In this paper a high throughput, two channel DWT architecture for both of the JPEG2000 DWT filters is presented. The proposed pipelined architecture has two separate input channels that process the incoming samples simultaneously with minimum memory requirement for each channel. The architecture has been implemented in VHDL and synthesized on a Xilinx Virtex2 XCV1000. The proposed architecture applies DWT on a 2K by 1K image at 33 fps with a 75 MHz clock frequency. This performance is achieved with 70% fewer resources than two independent single channel modules. The high throughput and reduced resource requirement make this architecture a suitable choice for real-time applications such as Digital Cinema.
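As a software illustration of the lifting scheme named in the abstract above, the sketch below implements one level of the reversible integer 5/3 transform on a one-dimensional, even-length signal, using symmetric extension at the borders. It is a reference model only: the function names are hypothetical and it says nothing about the two-channel pipelined hardware the paper describes.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting DWT on an even-length
    list of integers. Returns (low_pass, high_pass) coefficient lists."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even-length signal with >= 2 samples")
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    high = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]  # mirror at right edge
        high.append(odd[n] - ((even[n] + right) >> 1))
    # Update step: each even sample plus a rounded quarter of neighbouring details.
    low = []
    for n in range(half):
        left = high[n - 1] if n > 0 else high[0]  # mirror at left edge
        low.append(even[n] + ((left + high[n] + 2) >> 2))
    return low, high


def dwt53_inverse(low, high):
    """Exact inverse of dwt53_forward: undo the update step, then the predict step."""
    half = len(low)
    even = []
    for n in range(half):
        left = high[n - 1] if n > 0 else high[0]
        even.append(low[n] - ((left + high[n] + 2) >> 2))
    x = []
    for n in range(half):
        right = even[n + 1] if n + 1 < half else even[n]
        x.extend([even[n], high[n] + ((even[n] + right) >> 1)])
    return x
```

Because both steps use only integer additions and shifts, the inverse recovers the input exactly, which is the property that makes the 5/3 filter usable for lossless coding.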
Active vibration control of a full scale aircraft wing using a reconfigurable controller

NASA Astrophysics Data System (ADS)

Prakash, Shashikala; Renjith Kumar, T. G.; Raja, S.; Dwarakanathan, D.; Subramani, H.; Karthikeyan, C.

2016-01-01

This work highlights the design of a Reconfigurable Active Vibration Control (AVC) System for aircraft structures using adaptive techniques. The AVC system with a multichannel capability is realized using the Filtered-X Least Mean Square (FxLMS) algorithm on a Xilinx Virtex-4 Field Programmable Gate Array (FPGA) platform in Very High Speed Integrated Circuits Hardware Description Language (VHDL). The HDL design is based on a Finite State Machine (FSM) model with floating-point Intellectual Property (IP) cores for arithmetic operations. Using an FPGA makes it possible to modify the system parameters even at runtime as the user's requirements change. The locations of the control actuators are optimized based on a dynamic modal strain approach using a genetic algorithm (GA). The developed system has been successfully deployed for the AVC testing of the full-scale wing of an all-composite two-seater transport aircraft. Several closed loop configurations like single channel and multi-channel control have been tested. The experimental results from the studies presented here are very encouraging. They demonstrate the usefulness of the system's reconfigurability for real time applications.
S-Band POSIX Device Drivers for RTEMS

NASA Technical Reports Server (NTRS)

Lux, James P.; Lang, Minh; Peters, Kenneth J.; Taylor, Gregory H.

2011-01-01

This is a set of POSIX device driver level abstractions in the RTEMS RTOS (Real-Time Executive for Multiprocessor Systems real-time operating system) to S-band radio hardware devices that have been instantiated in an FPGA (field-programmable gate array). These include A/D (analog-to-digital) sample capture, D/A (digital-to-analog) sample playback, PLL (phase-locked-loop) tuning, and PWM (pulse-width-modulation)-controlled gain. This software interfaces to S-band radio hardware in an attached Xilinx Virtex-2 FPGA. It uses plug-and-play device discovery to map memory to device IDs. Instead of interacting with hardware devices directly, using direct-memory mapped access at the application level, this driver provides an application programming interface (API) that uses standard POSIX function calls. This simplifies application programming, enables portability, and offers an additional level of protection to the hardware. There are three separate device drivers included in this package: sband_device (ADC capture and DAC playback), pll_device (RF front end PLL tuning), and pwm_device (RF front end AGC control).
A neuro-inspired spike-based PID motor controller for multi-motor robots with low cost FPGAs.

PubMed

Jimenez-Fernandez, Angel; Jimenez-Moreno, Gabriel; Linares-Barranco, Alejandro; Dominguez-Morales, Manuel J; Paz-Vicente, Rafael; Civit-Balcells, Anton

2012-01-01

In this paper we present a neuro-inspired spike-based close-loop controller written in VHDL and implemented for FPGAs. This controller has been focused on controlling a DC motor speed, but only using spikes for information representation, processing and DC motor driving. It could be applied to other motors with proper driver adaptation. This controller architecture represents one of the latest layers in a Spiking Neural Network (SNN), which implements a bridge between robotics actuators and spike-based processing layers and sensors. The presented control system fuses actuation and sensors information as spikes streams, processing these spikes in hard real-time, implementing a massively parallel information processing system, through specialized spike-based circuits. This spike-based close-loop controller has been implemented into an AER platform, designed in our labs, that allows direct control of DC motors: the AER-Robot. Experimental results evidence the viability of the implementation of spike-based controllers, and hardware synthesis denotes low hardware requirements that allow replicating this controller in a high number of parallel controllers working together to allow a real-time robot control.

A Neuro-Inspired Spike-Based PID Motor Controller for Multi-Motor Robots with Low Cost FPGAs

PubMed Central

Jimenez-Fernandez, Angel; Jimenez-Moreno, Gabriel; Linares-Barranco, Alejandro; Dominguez-Morales, Manuel J.; Paz-Vicente, Rafael; Civit-Balcells, Anton

2012-01-01

In this paper we present a neuro-inspired spike-based close-loop controller written in VHDL and implemented for FPGAs. This controller has been focused on controlling a DC motor speed, but only using spikes for information representation, processing and DC motor driving. It could be applied to other motors with proper driver adaptation. This controller architecture represents one of the latest layers in a Spiking Neural Network (SNN), which implements a bridge between robotics actuators and spike-based processing layers and sensors. The presented control system fuses actuation and sensors information as spikes streams, processing these spikes in hard real-time, implementing a massively parallel information processing system, through specialized spike-based circuits. This spike-based close-loop controller has been implemented into an AER platform, designed in our labs, that allows direct control of DC motors: the AER-Robot. Experimental results evidence the viability of the implementation of spike-based controllers, and hardware synthesis denotes low hardware requirements that allow replicating this controller in a high number of parallel controllers working together to allow a real-time robot control. PMID:22666004
A new MicroTCA-based waveform digitizer for the Muon g-2 experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sweigart, David A.

We present the design of a new $\mu$TCA-based waveform digitizer, which will be deployed in the Muon g-2 experiment at Fermilab and will allow our pileup identification requirement to be met. This digitizer features five independent channels, each with 12-bit, 800-MSPS digitization and a 1-Gbit memory buffer. The data storage and readout along with configuration are handled by six Xilinx Kintex-7 FPGAs. In addition, the digitizer is equipped with a mezzanine card for analog signal conditioning prior to digitization, further widening its range of possible applications. The performance results of this design are also presented, highlighting its $0.51 \pm 0.13$ mV intrinsic noise level and $< 22$ ps intrinsic timing resolution between channels. We believe that its performance, together with its flexible design, could be of interest to future experiments in search of a cost-effective waveform digitizer.
High-channel-count, high-density microelectrode array for closed-loop investigation of neuronal networks.

PubMed

Tsai, David; John, Esha; Chari, Tarun; Yuste, Rafael; Shepard, Kenneth

2015-01-01

We present a system for large-scale electrophysiological recording and stimulation of neural tissue with a planar topology. The recording system has 65,536 electrodes arranged in a 256 × 256 grid, with 25.5 μm pitch, and covering an area of approximately 42.6 mm². The recording chain has 8.66 μV rms input-referred noise over a 100 Hz to 10 kHz bandwidth while providing up to 66 dB of voltage gain. When recording from all electrodes in the array, it is capable of 10-kHz sampling per electrode. All electrodes can also perform patterned electrical microstimulation. The system produces ~1 GB/s of data when recording from the full array. To handle, store, and perform nearly real-time analyses of this large data stream, we developed a framework based around Xilinx FPGAs, Intel x86 CPUs and the NVIDIA Streaming Multiprocessors to interface with the electrode array.
A High-Linearity, Ring-Oscillator-Based, Vernier Time-to-Digital Converter Utilizing Carry Chains in FPGAs

NASA Astrophysics Data System (ADS)

Cui, Ke; Ren, Zhongjie; Li, Xiangyu; Liu, Zongkai; Zhu, Rihong

2017-01-01

Time-to-digital converters (TDCs) using dedicated carry chains of field programmable gate arrays (FPGAs) are usually organized as tapped delay lines, which have been intensively researched in recent years. However, this method incurs poor differential nonlinearity (DNL), which arises from the inherent uneven bin granularity. This paper proposes a TDC architecture which utilizes the carry chains in a quite different manner in order to alleviate this long-standing problem. Two independent carry chains working as the delay lines for the fine time interpolation are organized in a ring-oscillator-based Vernier style, and the time difference between them is finely adjusted by assigning different numbers of basic delay cells. A specific design flow is described to obtain the desired delay difference. The TDC was implemented on a Stratix III FPGA. Test results show that the obtained resolution is 31 ps and that the DNL and INL are in the ranges (-0.080 LSB, 0.073 LSB) and (-0.087 LSB, 0.091 LSB), respectively. This demonstrates that the proposed architecture greatly improves linearity compared to previous techniques. Additionally, the resource cost is rather low: only 319 LUTs and 104 registers per TDC channel.
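The Vernier principle behind this kind of TDC can be shown with a few lines of arithmetic. In the generic textbook form, a slow oscillator starts on the START edge, a slightly faster one starts on the STOP edge, and the number of cycles until the fast edges catch up encodes the interval in units of the period difference. The sketch below models only that idea; the oscillator periods in the usage note are invented values chosen so that their difference matches the 31 ps resolution quoted above, and nothing here reflects the paper's carry-chain design flow.

```python
def vernier_count(interval_ps, t_slow_ps, t_fast_ps):
    """Count fast-oscillator cycles until its edge no longer lags the
    slow oscillator's edge. All times are integers in picoseconds."""
    if interval_ps < 0:
        raise ValueError("interval must be non-negative")
    if t_slow_ps <= t_fast_ps:
        raise ValueError("the slow period must exceed the fast period")
    n = 0
    # Slow edge n occurs at n * t_slow; fast edge n at interval + n * t_fast.
    while interval_ps + n * t_fast_ps > n * t_slow_ps:
        n += 1
    return n


def vernier_measure(interval_ps, t_slow_ps, t_fast_ps):
    """Quantised interval estimate: cycle count times the period difference."""
    n = vernier_count(interval_ps, t_slow_ps, t_fast_ps)
    return n * (t_slow_ps - t_fast_ps)
```

With hypothetical periods of 5031 ps and 5000 ps, a 100 ps interval is reported as 124 ps (four cycles of 31 ps), so the quantisation error stays below one period difference.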
A Method of Sky Ripple Residual Nonuniformity Reduction for a Cooled Infrared Imager and Hardware Implementation

PubMed Central

Li, Yiyang; Jin, Weiqi; Li, Shuo; Zhang, Xu; Zhu, Jin

2017-01-01

Cooled infrared detector arrays always suffer from undesired ripple residual nonuniformity (RNU) in sky scene observations. The ripple residual nonuniformity seriously affects the imaging quality, especially for small target detection. It is difficult to eliminate it using the calibration-based techniques and the current scene-based nonuniformity algorithms. In this paper, we present a modified temporal high-pass nonuniformity correction algorithm using fuzzy scene classification. The fuzzy scene classification is designed to control the correction threshold so that the algorithm can remove ripple RNU without degrading the scene details. We test the algorithm on a real infrared sequence by comparing it to several well-established methods. The result shows that the algorithm has obvious advantages compared with the tested methods in terms of detail conservation and convergence speed for ripple RNU correction. Furthermore, we display our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA), which has two advantages: (1) low resources consumption; and (2) small hardware delay (less than 10 image rows). It has been successfully applied in an actual system. PMID:28481320
The GANDALF 128-Channel Time-to-Digital Converter

NASA Astrophysics Data System (ADS)

Büchele, M.; Fischer, H.; Herrmann, F.; Königsmann, K.; Schill, C.; Schopferer, S.

The GANDALF 6U-VME64x/VXS module has been designed to cope with a variety of readout tasks in high energy and nuclear physics experiments, in particular the COMPASS experiment at CERN. The exchangeable mezzanine cards allow for an employment of the system in very different applications such as analog-to-digital or time-to-digital conversions, coincidence matrix formation, fast pattern recognition or fast trigger generation. Based on this platform, we present a 128-channel TDC which is implemented in a single Xilinx Virtex-5 FPGA using a shifted clock sampling method. In this concept each input signal is continuously sampled by 16 flip-flops using equidistant phase-shifted clocks. Compared to previous FPGA designs, usually based on delay lines and comprising few TDC channels with resolutions in the order of 10 ps, our design permits the implementation of a large number of TDC channels with a resolution of 64 ps in a single FPGA. Predictable placement of logic components and uniform routing inside the FPGA fabric is a particular challenge of this design. We present measurement results for the time resolution and the nonlinearity of the TDC readout system.
DeepX: Deep Learning Accelerator for Restricted Boltzmann Machine Artificial Neural Networks.

PubMed

Kim, Lok-Won

2018-05-01

Although there have been many decades of research and commercial presence on high performance general purpose processors, there are still many applications that require fully customized hardware architectures for further computational acceleration. Recently, deep learning has been successfully used to learn in a wide variety of applications, but its heavy computation demand has considerably limited its practical applications. This paper proposes a fully pipelined acceleration architecture to alleviate the high computational demand of restricted Boltzmann machine (RBM) artificial neural networks (ANNs). The implemented RBM ANN accelerator (integrating network size, using 128 input cases per batch, and running at a 303-MHz clock frequency) integrated in a state-of-the-art field-programmable gate array (FPGA) (Xilinx Virtex 7 XC7V-2000T) provides a computational performance of 301-billion connection-updates-per-second and about 193 times higher performance than a software solution running on general purpose processors. Most importantly, the architecture enables over 4 times (12 times in batch learning) higher performance compared with a previous work when both are implemented in an FPGA device (XC2VP70).
Ripple FPN reduced algorithm based on temporal high-pass filter and hardware implementation

NASA Astrophysics Data System (ADS)

Li, Yiyang; Li, Shuo; Zhang, Zhipeng; Jin, Weiqi; Wu, Lei; Jin, Minglei

2016-11-01

Cooled infrared detector arrays always suffer from undesired ripple fixed-pattern noise (FPN) when observing sky scenes. Ripple FPN seriously affects the imaging quality of a thermal imager, especially for small target detection and tracking. It is hard to eliminate the FPN with calibration-based techniques or the current scene-based nonuniformity algorithms. In this paper, we present a modified space low-pass and temporal high-pass nonuniformity correction algorithm using an adaptive time-domain threshold (THP&GM). The threshold is designed to significantly reduce ghosting artifacts. We test the algorithm on real infrared sequences in comparison with several previously published methods. The algorithm not only effectively corrects common FPN such as stripes, but also has an obvious advantage over the current methods in terms of detail protection and convergence speed, especially for ripple FPN correction. Furthermore, we display our architecture with a prototype built on a Xilinx Virtex-5 XC5VLX50T field-programmable gate array (FPGA). The FPGA-based hardware implementation of the algorithm has two advantages: (1) low resource consumption, and (2) small hardware delay (less than 20 lines). The hardware has been successfully applied in an actual system.
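For orientation, the classic temporal high-pass correction that such methods build on can be sketched as a per-pixel recursive low-pass estimate that is subtracted from each frame. The version below is a plain software model, not the authors' THP&GM algorithm: the optional `threshold` argument is a fixed gate added purely for illustration (the paper's threshold is adaptive), and the function name and parameters are assumptions.

```python
def thp_nuc(frames, time_constant=8.0, threshold=None):
    """Classic temporal high-pass nonuniformity correction.

    frames: sequence of equally sized 2D lists of pixel values.
    time_constant: averaging length M of the recursive low-pass filter.
    threshold: if given, a pixel's low-pass estimate is updated only when
               the new sample differs from it by at most this amount
               (illustrative fixed gate, not an adaptive one).
    Returns a list of corrected frames (input minus low-pass estimate).
    """
    low = None
    corrected = []
    for frame in frames:
        if low is None:
            # Initialise the low-pass estimate with the first frame.
            low = [[float(v) for v in row] for row in frame]
        else:
            for r, row in enumerate(frame):
                for c, value in enumerate(row):
                    diff = value - low[r][c]
                    if threshold is None or abs(diff) <= threshold:
                        low[r][c] += diff / time_constant
        corrected.append(
            [[value - low[r][c] for c, value in enumerate(row)]
             for r, row in enumerate(frame)]
        )
    return corrected
```

A fixed offset pattern that never changes is absorbed into the low-pass estimate and removed, but so is any static scene content. Gating the update when a pixel changes sharply is one common way to limit the resulting ghosting.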
FPGA Implementation of Metastability-Based True Random Number Generator

NASA Astrophysics Data System (ADS)

Hata, Hisashi; Ichikawa, Shuichi

True random number generators (TRNGs) are important as a basis for computer security. Though there are some TRNGs composed of analog circuit, the use of digital circuits is desired for the application of TRNGs to logic LSIs. Some of the digital TRNGs utilize jitter in free-running ring oscillators as a source of entropy, which consume large power. Another type of TRNG exploits the metastability of a latch to generate entropy. Although this kind of TRNG has been mostly implemented with full-custom LSI technology, this study presents an implementation based on common FPGA technology. Our TRNG is comprised of logic gates only, and can be integrated in any kind of logic LSI. The RS latch in our TRNG is implemented as a hard-macro to guarantee the quality of randomness by minimizing the signal skew and load imbalance of internal nodes. To improve the quality and throughput, the output of 64-256 latches are XOR'ed. The derived design was verified on a Xilinx Virtex-4 FPGA (XC4VFX20), and passed NIST statistical test suite without post-processing. Our TRNG with 256 latches occupies 580 slices, while achieving 12.5Mbps throughput.
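The benefit of XOR-ing many latch outputs can be estimated with a short calculation. Assuming the latches are statistically independent (an idealisation that real hardware only approximates), the probability that the XOR of several biased bits equals 1 can be accumulated one bit at a time, and it approaches 1/2 rapidly as bits are added. The function name and the bias values below are illustrative, not measurements from the paper.

```python
def xor_prob_one(probs):
    """Probability that the XOR of independent bits equals 1.

    probs: iterable with each bit's probability of being 1.
    """
    p = 0.0  # the XOR of zero bits is always 0
    for q in probs:
        # The running XOR is 1 if exactly one of (previous XOR, new bit) is 1.
        p = p * (1.0 - q) + (1.0 - p) * q
    return p
```

For example, two bits that are each 1 with probability 0.6 give an XOR that is 1 with probability 0.48, so the deviation from 1/2 shrinks from 0.1 to 0.02. With 64 such bits the residual deviation is far below double-precision resolution.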
DOE Office of Scientific and Technical Information (OSTI.GOV)

Fernandes, Ana; Pereira, Rita C.; Sousa, Jorge

The Instituto de Plasmas e Fusao Nuclear (IPFN) has developed dedicated re-configurable modules based on field programmable gate array (FPGA) devices for several nuclear fusion machines worldwide. Moreover, new Advanced Telecommunication Computing Architecture (ATCA) based modules developed by IPFN are already included in the ITER catalogue. One of the requirements for re-configurable modules operating in future nuclear environments including ITER is the remote update capability. Accordingly, this work presents an alternative method for FPGA remote programming to be implemented in new ATCA based re-configurable modules. FPGAs are volatile devices and their programming code is usually stored in dedicated flash memories for proper configuration during module power-on. The presented method can store new FPGA codes in Serial Peripheral Interface (SPI) flash memories using the PCIexpress (PCIe) network established on the ATCA back-plane, linking data acquisition endpoints and the data switch blades. The method is based on the Xilinx Quick Boot application note, adapted to PCIe protocol and ATCA based modules. (authors)
The RTE inversion on FPGA aboard the solar orbiter PHI instrument

NASA Astrophysics Data System (ADS)

Cobos Carrascosa, J. P.; Aparicio del Moral, B.; Ramos Mas, J. L.; Balaguer, M.; López Jiménez, A. C.; del Toro Iniesta, J. C.

2016-07-01

In this work we propose a multiprocessor architecture to reach high performance in floating point operations by using radiation tolerant FPGA devices, and under narrow time and power constraints. This architecture is used in the PHI instrument that carries out the scientific analysis aboard the ESA's Solar Orbiter mission. The proposed architecture, in a SIMD flavor, is intended to be an accelerator within the Data Processing Unit (which is composed of a main Leon processor and two FPGAs) for carrying out the RTE inversion on board the spacecraft using a relatively slow FPGA device, the Xilinx XQR4VSX55. The proposed architecture squeezes the FPGA resources in order to reach the computational requirements and improves on the performance of the ground-based system based on commercial CPUs regarding time and power consumption. In this work we demonstrate the feasibility of using these FPGA devices embedded in the SO/PHI instrument. With that goal in mind, we perform tests to evaluate the scientific results and to measure the processing time and power consumption for carrying out the RTE inversion.
Modular design and implementation of field-programmable-gate-array-based Gaussian noise generator

NASA Astrophysics Data System (ADS)

Li, Yuan-Ping; Lee, Ta-Sung; Hwang, Jeng-Kuang

2016-05-01

The modular design of a Gaussian noise generator (GNG) based on field-programmable gate array (FPGA) technology was studied. A new range reduction architecture was included in a series of elementary function evaluation modules and was integrated into the GNG system. The approximation and quantisation errors for the square root module with a first polynomial approximation were high; therefore, we used the central limit theorem (CLT) to improve the noise quality. This resulted in an output rate of one sample per clock cycle. We subsequently applied Newton's method for the square root module, thus eliminating the need for the use of the CLT because applying the CLT resulted in an output rate of two samples per clock cycle (>200 million samples per second). Two statistical tests confirmed that our GNG is of high quality. Furthermore, the range reduction, which is used to solve a limited interval of the function approximation algorithms of the System Generator platform using Xilinx FPGAs, appeared to have a higher numerical accuracy, was operated at >350 MHz, and can be suitably applied for any function evaluation.
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

PubMed

Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

2013-08-01

Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
DESDynI Quad First Stage Processor - A Four Channel Digitizer and Digital Beam Forming Processor

NASA Technical Reports Server (NTRS)

Chuang, Chung-Lun; Shaffer, Scott; Smythe, Robert; Niamsuwan, Noppasin; Li, Samuel; Liao, Eric; Lim, Chester; Morfopolous, Arin; Veilleux, Louise

2013-01-01

The proposed Deformation, Eco-Systems, and Dynamics of Ice Radar (DESDynI-R) L-band SAR instrument employs multiple digital channels to optimize resolution while keeping a large swath on a single pass. High-speed digitization with very fine synchronization and digital beam forming are necessary in order to facilitate this new technique. The Quad First Stage Processor (qFSP) was developed to achieve both the processing performance as well as the digitizing fidelity in order to accomplish this sweeping SAR technique. The qFSP utilizes high precision and high-speed analog to digital converters (ADCs), each with a finely adjustable clock distribution network to digitize the channels at the fidelity necessary to allow for digital beam forming. The Xilinx produced FX130T Virtex 5 part handles the processing to digitally calibrate each channel as well as filter and beam form the receive signals. Demonstrating the digital processing required for digital beam forming and digital calibration is instrumental to the viability of the proposed DESDynI instrument. The qFSP development brings this implementation to Technology Readiness Level (TRL) 6. This paper will detail the design and development of the prototype qFSP as well as the preliminary results from hardware tests.
An FPGA-Based Rapid Wheezing Detection System

PubMed Central

Lin, Bor-Shing; Yen, Tian-Shiue

2014-01-01

Wheezing is often treated as a crucial indicator in the diagnosis of obstructive pulmonary diseases. A rapid wheezing detection system may help physicians to monitor patients over the long-term. In this study, a portable wheezing detection system based on a field-programmable gate array (FPGA) is proposed. This system accelerates wheezing detection, and can be used as either a single-process system, or as an integrated part of another biomedical signal detection system. The system segments sound signals into 2-second units. A short-time Fourier transform was used to determine the relationship between the time and frequency components of wheezing sound data. A spectrogram was processed using 2D bilateral filtering, edge detection, multithreshold image segmentation, morphological image processing, and image labeling, to extract wheezing features according to computerized respiratory sound analysis (CORSA) standards. These features were then used to train the support vector machine (SVM) and build the classification models. The trained model was used to analyze sound data to detect wheezing. The system runs on a Xilinx Virtex-6 FPGA ML605 platform. The experimental results revealed that the system offered excellent wheezing recognition performance (0.912). The detection process can be used with a clock frequency of 51.97 MHz, and is able to perform rapid wheezing classification. PMID:24481034
Bitstream decoding processor for fast entropy decoding of variable length coding-based multiformat videos

NASA Astrophysics Data System (ADS)

Jo, Hyunho; Sim, Donggyu

2014-06-01

We present a bitstream decoding processor for entropy decoding of variable length coding-based multiformat videos. Since most of the computational complexity of entropy decoders comes from bitstream accesses and table look-up process, the developed bitstream processing unit (BsPU) has several designated instructions to access bitstreams and to minimize branch operations in the table look-up process. In addition, the instruction for bitstream access has the capability to remove emulation prevention bytes (EPBs) of H.264/AVC without initial delay, repeated memory accesses, and additional buffer. Experimental results show that the proposed method for EPB removal achieves a speed-up of 1.23 times compared to the conventional EPB removal method. In addition, the BsPU achieves speed-ups of 5.6 and 3.5 times in entropy decoding of H.264/AVC and MPEG-4 Visual bitstreams, respectively, compared to an existing processor without designated instructions and a new table mapping algorithm. The BsPU is implemented on a Xilinx Virtex5 LX330 field-programmable gate array. The MPEG-4 Visual (ASP, Level 5) and H.264/AVC (Main Profile, Level 4) are processed using the developed BsPU with a core clock speed of under 250 MHz in real time.
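For readers unfamiliar with emulation prevention bytes: in an H.264/AVC NAL unit, a 0x03 byte is inserted after any two consecutive 0x00 bytes so that the payload cannot imitate a start code, and a decoder must drop those bytes before parsing. The sketch below shows the conventional byte-by-byte removal in software. It is the kind of baseline the abstract compares against, not the BsPU instruction itself, and the function name is hypothetical.

```python
def remove_epb(nal_payload: bytes) -> bytes:
    """Strip H.264/AVC emulation prevention bytes from a NAL unit payload.

    Every 0x03 that directly follows two consecutive 0x00 bytes is dropped;
    all other bytes are copied through unchanged.
    """
    out = bytearray()
    zero_run = 0  # number of consecutive 0x00 bytes seen just before this one
    for byte in nal_payload:
        if zero_run >= 2 and byte == 0x03:
            zero_run = 0  # discard the emulation prevention byte
            continue
        out.append(byte)
        zero_run = zero_run + 1 if byte == 0x00 else 0
    return bytes(out)
```

This software form needs a pass over the whole payload and an extra output buffer, which is the overhead that a dedicated bitstream-access instruction aims to avoid.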
Fast particles identification in programmable form at level-0 trigger by means of the 3D-Flow system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crosetto, Dario B.

1998-10-30

The 3D-Flow Processor system is a new, technology-independent concept in very fast, real-time system architectures. Based on either an FPGA or an ASIC implementation, it can address, in a fully programmable manner, applications where commercially available processors would fail because of throughput requirements. Possible applications include filtering-algorithms (pattern recognition) from the input of multiple sensors, as well as moving any input validated by these filtering-algorithms to a single output channel. Both operations can easily be implemented on a 3D-Flow system to achieve a real-time processing system with a very short lag time. This system can be built either with off-the-shelf FPGAs or, for higher data rates, with CMOS chips containing 4 to 16 processors each. The basic building block of the system, a 3D-Flow processor, has been successfully designed in VHDL code written in "Generic HDL" (mostly made of reusable blocks that are synthesizable in different technologies, or FPGAs), to produce a netlist for a four-processor ASIC featuring 0.35 micron CBA (Cell Based Array) technology at 3.3 Volts, 884 mW power dissipation at 60 MHz and 63.75 mm sq. die size. The same VHDL code has been targeted to three FPGA manufacturers (Altera EPF10K250A, ORCA-Lucent Technologies OR3T165 and Xilinx XCV1000). A complete set of software tools, the 3D-Flow System Manager, equally applicable to ASIC or FPGA implementations, has been produced to provide full system simulation, application development, real-time monitoring, and run-time fault recovery. Today's technology can accommodate 16 processors per chip in a medium size die, at a cost per processor of less than $5 based on the current silicon die/size technology cost.
First results of the silicon telescope using an 'artificial retina' for fast track finding

DOE Office of Scientific and Technical Information (OSTI.GOV)

Neri, N.; Abba, A.; Caponio, F.

We present the first results of the prototype of a silicon tracker with trigger capabilities based on a novel approach for fast track finding. The working principle of the 'artificial retina' is inspired by the processing of visual images by the brain and it is based on extensive parallelization of data distribution and pattern recognition. The algorithm has been implemented in commercial FPGAs in three main logic modules: a switch for the routing of the detector hits, a pool of engines for the digital processing of the hits, and a block for the calculation of the track parameters. The architecture is fully pipelined and allows the reconstruction of real-time tracks with a latency of less than 100 clock cycles, corresponding to 0.25 microsecond at 400 MHz clock. The silicon telescope consists of 8 layers of single-sided silicon strip detectors with 512 strips each. The detector size is about 10 cm x 10 cm and the strip pitch is 183 μm. The detectors are read out by the Beetle chip, a custom ASIC developed for LHCb, which provides the measurement of the hit position and pulse height of 128 channels. The 'artificial retina' algorithm has been implemented on custom data acquisition boards based on Xilinx Kintex 7 lx160 FPGAs. The parameters of the tracks detected are finally transferred to the host PC via USB 3.0. The boards manage the read-out ASICs and the sampling of the analog channels. The read-out is performed at 40 MHz on 4 channels for each ASIC that corresponds to a decoding of the telescope information at 1.1 MHz. We report on the first results of the fast tracking device and compare with simulations. (authors)
FPGA implementation for real-time background subtraction based on Horprasert model.

PubMed

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining the moving objects in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes a novel embedded FPGA architecture which is able to extract the background in resource-limited environments and offers low degradation (caused by the hardware-friendly modification of the model). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance on Xilinx Spartan-3 FPGAs and compared them with other works available in the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resource utilization. Using less than 65% of the resources of an XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz, reaching 32.8 fps at a resolution of 1,024 × 1,024 pixels, with an estimated power consumption of 5.76 W.
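For readers unfamiliar with the Horprasert model named in the title, the following minimal sketch computes its two per-pixel quantities, brightness distortion and chromaticity distortion, in floating point. It follows the textbook form of the original model, not the hardware-friendly modification described in the abstract. The classification thresholds (`cd_max`, `alpha_min`, `alpha_shadow`, `alpha_highlight`) are hypothetical values chosen for illustration, and the original model's normalization of the distortions is omitted.

```python
import math

def horprasert_distortions(pixel, mean, std):
    """Return (brightness distortion alpha, chromaticity distortion cd).

    pixel, mean, std are (R, G, B) tuples: the observed colour, the
    background mean and the background standard deviation of one pixel.
    """
    num = sum(i * m / (s * s) for i, m, s in zip(pixel, mean, std))
    den = sum((m / s) ** 2 for m, s in zip(mean, std))
    alpha = num / den  # scale factor that brings the mean closest to the pixel
    cd = math.sqrt(sum(((i - alpha * m) / s) ** 2
                       for i, m, s in zip(pixel, mean, std)))
    return alpha, cd

def classify(pixel, mean, std, cd_max=10.0, alpha_min=0.4,
             alpha_shadow=0.9, alpha_highlight=1.1):
    """Simplified four-way labelling; all thresholds are hypothetical."""
    alpha, cd = horprasert_distortions(pixel, mean, std)
    if cd > cd_max or alpha < alpha_min:
        return "foreground"   # colour changed, or far too dark for a shadow
    if alpha < alpha_shadow:
        return "shadow"       # same chromaticity, darker
    if alpha > alpha_highlight:
        return "highlight"    # same chromaticity, brighter
    return "background"

mean, std = (100.0, 120.0, 140.0), (2.0, 2.0, 2.0)
print(classify((100.0, 120.0, 140.0), mean, std))
print(classify((60.0, 72.0, 84.0), mean, std))
print(classify((200.0, 40.0, 40.0), mean, std))
```

The division and square root in this form are costly in FPGA logic, which is one plausible motivation for a hardware-friendly variant of the model.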
Intelligent FPGA Data Acquisition Framework

NASA Astrophysics Data System (ADS)

Bai, Yunpeng; Gaisbauer, Dominic; Huber, Stefan; Konorov, Igor; Levit, Dmytro; Steffen, Dominik; Paul, Stephan

2017-06-01

In this paper, we present the field programmable gate array (FPGA)-based framework intelligent FPGA data acquisition (IFDAQ), which is used for the development of DAQ systems for detectors in high-energy physics. The framework supports Xilinx FPGAs and provides a collection of IP cores written in very high speed integrated circuit hardware description language (VHDL), which use a common interconnect interface. The IP core library offers the functionality required for the development of the full DAQ chain. The library consists of Serializer/Deserializer (SERDES)-based time-to-digital conversion channels, an interface to a multichannel 80-MS/s 10-b analog-digital conversion, data transmission, and synchronization protocol between FPGAs, event builder, and slow control. The functionality is distributed among FPGA modules built in the AMC form factor: front end and data concentrator. This modular design also helps to scale and adapt the DAQ system to the needs of the particular experiment. The first application of the IFDAQ framework is the upgrade of the read-out electronics for the drift chambers and the electromagnetic calorimeters (ECALs) of the COMPASS experiment at CERN. The framework will be presented and discussed in the context of this paper.

A 256-channel, high throughput and precision time-to-digital converter with a decomposition encoding scheme in a Kintex-7 FPGA

NASA Astrophysics Data System (ADS)

Song, Z.; Wang, Y.; Kuang, J.

2018-05-01

Field Programmable Gate Arrays (FPGAs) made with 28 nm and more advanced process technology have great potential for the implementation of high precision time-to-digital converters (TDCs), because the delay cells in the tapped delay line (TDL) used for time interpolation are getting smaller and smaller. However, the bubble problems in the TDL status are becoming more complicated, which makes it difficult to achieve TDCs with a high time precision on these chips. In this paper, we propose a novel decomposition encoding scheme, which not only can solve the bubble problem easily, but also has a high encoding efficiency. With this scheme, the potential of these chips for realizing TDCs can be fully exploited. In a Xilinx Kintex-7 FPGA chip, we implemented a TDC system with 256 TDC channels, which doubles the number of TDC channels that our previous technique could achieve. The performance of all these TDC channels is evaluated. The average RMS time precision among them is 10.23 ps in the time-interval measurement range of 0–10 ns, and their measurement throughput reaches 277 M measurements per second.
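The "bubble problem" mentioned above can be illustrated with a toy tapped-delay-line snapshot. The sketch below is not the decomposition encoding scheme of this paper, whose details the abstract does not give; it contrasts a naive first-transition encoder with a ones-counter encoder, a commonly used bubble-tolerant alternative. The example tap patterns and function names are made up for illustration.

```python
def transition_position(taps):
    """Naive encoder: index just past the first 1 -> 0 transition."""
    for idx in range(len(taps) - 1):
        if taps[idx] == 1 and taps[idx + 1] == 0:
            return idx + 1
    # No transition found: the line is either all ones or all zeros.
    return len(taps) if taps and taps[-1] == 1 else 0

def ones_count(taps):
    """Bubble-tolerant encoder: count how many taps the signal reached."""
    return sum(taps)

# Ideal thermometer code: the hit propagated through 5 of 8 taps.
clean = [1, 1, 1, 1, 1, 0, 0, 0]
# Same hit, but taps 3 and 5 were sampled out of order (a "bubble").
bubbled = [1, 1, 1, 0, 1, 1, 0, 0]

for name, taps in (("clean", clean), ("bubbled", bubbled)):
    print(name, transition_position(taps), ones_count(taps))
```

On the bubbled pattern the naive encoder reports the wrong bin, while the ones count still recovers the same value as for the clean pattern.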
FPGA Implementation for Real-Time Background Subtraction Based on Horprasert Model

PubMed Central

Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J.; Diaz, Javier; Ros, Eduardo

2012-01-01

Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining the moving objects in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes a novel embedded FPGA architecture which is able to extract the background in resource-limited environments and offers low degradation (caused by the hardware-friendly modification of the model). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance on Xilinx Spartan-3 FPGAs and compared them with other works available in the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resource utilization. Using less than 65% of the resources of an XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz, reaching 32.8 fps at a resolution of 1,024 × 1,024 pixels, with an estimated power consumption of 5.76 W. PMID:22368487
Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

NASA Astrophysics Data System (ADS)

Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

2006-05-01

The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.
Area and power efficient DCT architecture for image compression

NASA Astrophysics Data System (ADS)

Dhandapani, Vaithiyanathan; Ramachandran, Seshasayanan

2014-12-01

The discrete cosine transform (DCT) is one of the major components in image and video compression systems. The final output of these systems is interpreted by the human visual system (HVS), which is not perfect. The limited perception of the HVS allows the algorithm to be numerically approximate rather than exact. In this paper, we propose a new matrix for the discrete cosine transform. The proposed 8 × 8 transformation matrix contains only zeros and ones and therefore requires only adders, thus avoiding the need for multiplication and shift operations. The new class of transform requires only 12 additions, which greatly reduces the computational complexity and achieves a performance in image compression that is comparable to that of the existing approximated DCT. Another important aspect of the proposed transform is that it provides efficient area and power optimization when implemented in hardware. To ensure the versatility of the proposal and to further evaluate the performance and correctness of the structure in terms of speed, area, and power consumption, the model is implemented on a Xilinx Virtex 7 field programmable gate array (FPGA) device and synthesized with Cadence® RTL Compiler® using the UMC 90 nm standard cell library. The analysis obtained from the implementation indicates that the proposed structure is superior to the existing approximation techniques, with a 30% reduction in power and a 12% reduction in area.
Waveform Developer's Guide for the Integrated Power, Avionics, and Software (iPAS) Space Telecommunications Radio System (STRS) Radio

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary Jo W.; Roche, Rigoberto

2017-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework for software defined radios (SDRs) to abstract the application software from the radio platform hardware. The STRS standard aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. To promote the use of the STRS architecture for future NASA advanced exploration missions, NASA Glenn Research Center (GRC) developed an STRS-compliant SDR on a radio platform used by the Advance Exploration System program at the Johnson Space Center (JSC) in their Integrated Power, Avionics, and Software (iPAS) laboratory. The iPAS STRS Radio was implemented on the Reconfigurable, Intelligently-Adaptive Communication System (RIACS) platform, currently being used for radio development at JSC. The platform consists of a Xilinx ML605 Virtex-6 FPGA board, an Analog Devices FMCOMMS1-EBZ RF transceiver board, and an Embedded PC (Axiomtek eBox 620-110-FL) running the Ubuntu 12.4 operating system. The result of this development is a very low cost STRS compliant platform that can be used for waveform developments for multiple applications. The purpose of this document is to describe how to develop a new waveform using the RIACS platform and the Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (VHDL) FPGA wrapper code and the STRS implementation on the Axiomtek processor.
Hardware Interface Description for the Integrated Power, Avionics, and Software (iPAS) Space Telecommunications Radio System (STRS) Radio

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary Jo W.; Roche, Rigoberto

2017-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework for software defined radios (SDRs) to abstract the application software from the radio platform hardware. The STRS standard aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. To promote the use of the STRS architecture for future NASA advanced exploration missions, NASA Glenn Research Center (GRC) developed an STRS-compliant SDR on a radio platform used by the Advance Exploration System program at the Johnson Space Center (JSC) in their Integrated Power, Avionics, and Software (iPAS) laboratory. The iPAS STRS Radio was implemented on the Reconfigurable, Intelligently-Adaptive Communication System (RIACS) platform, currently being used for radio development at JSC. The platform consists of a Xilinx ML605 Virtex-6 FPGA board, an Analog Devices FMCOMMS1-EBZ RF transceiver board, and an Embedded PC (Axiomtek eBox 620-110-FL) running the Ubuntu 12.4 operating system. Figure 1 shows the RIACS platform hardware. The result of this development is a very low cost STRS compliant platform that can be used for waveform developments for multiple applications. The purpose of this document is to describe how to develop a new waveform using the RIACS platform and the Very High Speed Integrated Circuits (VHSIC) Hardware Description Language (VHDL) FPGA wrapper code and the STRS implementation on the Axiomtek processor.
Configurable Crossbar Switch for Deterministic, Low-latency Inter-blade Communications in a MicroTCA Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karamooz, Saeed; Breeding, John Eric; Justice, T Alan

As MicroTCA expands into applications beyond the telecommunications industry from which it originated, it faces new challenges in the area of inter-blade communications. The ability to achieve deterministic, low-latency communications between blades is critical to realizing a scalable architecture. In the past, legacy bus architectures accomplished inter-blade communications using dedicated parallel buses across the backplane. Because of limited fabric resources on its backplane, MicroTCA uses the carrier hub (MCH) for this purpose. Unfortunately, MCH products from commercial vendors are limited to standard bus protocols such as PCI Express, Serial Rapid IO and 10/40GbE. While these protocols have exceptional throughput capability, they are neither deterministic nor necessarily low-latency. To overcome this limitation, an MCH has been developed based on the Xilinx Virtex-7 690T FPGA. This MCH provides the system architect/developer complete flexibility in both the interface protocol and routing of information between blades. In this paper, we present the application of this configurable MCH concept to the Machine Protection System under development for the Spallation Neutron Source's proton accelerator. Specifically, we demonstrate the use of the configurable MCH as a 12x4-lane crossbar switch using the Aurora protocol to achieve a deterministic, low-latency data link. In this configuration, the crossbar has an aggregate bandwidth of 48 GB/s.
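As a conceptual aid only, the routing behaviour of a configurable crossbar can be modelled in a few lines. The `Crossbar` class below is a hypothetical software model, not the MCH firmware or the Aurora protocol: each output port is driven by at most one input port, an input may fan out to several outputs, and the route table is fixed once configured.

```python
class Crossbar:
    """Toy model of an N-port crossbar with a static route table."""

    def __init__(self, ports: int):
        self.ports = ports
        self.route = {}  # output port -> input port that drives it

    def connect(self, src: int, dst: int) -> None:
        """Drive output `dst` from input `src`, replacing any earlier route."""
        for port in (src, dst):
            if not 0 <= port < self.ports:
                raise ValueError(f"port {port} out of range")
        self.route[dst] = src

    def forward(self, inputs):
        """Return what appears on each output; unrouted outputs carry None."""
        return [inputs[self.route[d]] if d in self.route else None
                for d in range(self.ports)]

xbar = Crossbar(12)   # 12 blade-facing ports, as in the abstract
xbar.connect(0, 5)    # blade 0 -> blade 5
xbar.connect(0, 7)    # fan-out: blade 0 -> blade 7 as well
xbar.connect(3, 0)    # blade 3 -> blade 0

outputs = xbar.forward([f"blade{i}" for i in range(12)])
print(outputs)
```

Because the routes are set by configuration and never arbitrated per packet, a switch of this kind can offer a fixed, predictable path between blades.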
FPGA in-the-loop simulations of cardiac excitation model under voltage clamp conditions

NASA Astrophysics Data System (ADS)

Othman, Norliza; Adon, Nur Atiqah; Mahmud, Farhanahani

2017-01-01

The voltage clamp technique allows the detection of single-channel currents in biological membranes, helping to identify a variety of electrophysiological problems at the cellular level. In this paper, a simulation study of the voltage clamp technique is presented to analyse the current-voltage (I-V) characteristics of ion currents based on the Luo-Rudy Phase-I (LR-I) cardiac model using a Field Programmable Gate Array (FPGA). Cardiac models are becoming increasingly complex, which can make simulations take a very long time to run. A real-time hardware implementation on an FPGA could therefore be one of the best solutions for high-performance real-time systems, as it provides high configurability and performance and is able to execute operations in parallel. To shorten development time while retaining high-confidence results, FPGA-based rapid prototyping through HDL Coder from MATLAB software has been used to construct the algorithm for the simulation system. HDL Coder can convert the designed MATLAB Simulink blocks into hardware description language (HDL) for the FPGA implementation. As a result, the voltage-clamp fixed-point design of the LR-I model has been successfully built in MATLAB Simulink, and the simulation of the I-V characteristics of the ionic currents has been verified on a Xilinx Virtex-6 XC6VLX240T FPGA development board through an FPGA-in-the-loop (FIL) simulation.
Motion-sensor fusion-based gesture recognition and its VLSI architecture design for mobile devices

NASA Astrophysics Data System (ADS)

Zhu, Wenping; Liu, Leibo; Yin, Shouyi; Hu, Siqi; Tang, Eugene Y.; Wei, Shaojun

2014-05-01

With the rapid proliferation of smartphones and tablets, various embedded sensors are incorporated into these platforms to enable multimodal human-computer interfaces. Gesture recognition, as an intuitive interaction approach, has been extensively explored in the mobile computing community. However, most gesture recognition implementations to date are user-dependent and rely only on an accelerometer. In order to achieve competitive accuracy, users are required to hold the devices in a predefined manner during operation. In this paper, a high-accuracy human gesture recognition system is proposed based on multiple motion sensor fusion. Furthermore, to reduce the energy overhead resulting from frequent sensor sampling and data processing, a highly energy-efficient VLSI architecture implemented on a Xilinx Virtex-5 FPGA board is also proposed. Compared with the pure software implementation, a speed-up of approximately 45 times is achieved while operating at 20 MHz. The experiments show that the average accuracy for 10 gestures reaches 93.98% for the user-independent case and 96.14% for the user-dependent case when subjects hold the device randomly while completing the specified gestures. Although a few percent lower than the conventional best result, it still provides competitive accuracy acceptable for practical usage. Most importantly, the proposed system allows users to hold the device randomly while performing the predefined gestures, which substantially enhances the user experience.
RFI Risk Reduction Activities Using New Goddard Digital Radiometry Capabilities

NASA Technical Reports Server (NTRS)

Bradley, Damon; Kim, Ed; Young, Peter; Miles, Lynn; Wong, Mark; Morris, Joel

2012-01-01

The Goddard Radio-Frequency Explorer (GREX) is the latest fast-sampling radiometer digital back-end processor that will be used for radiometry and radio-frequency interference (RFI) surveying at Goddard Space Flight Center. The system is compact and deployable, with a mass of about 40 kilograms. It is intended to be flown on aircraft. GREX is compatible with almost any aircraft, including P-3, Twin Otter, C-23, C-130, G3, and G5 types. At a minimum, the system can function as a clone of the Soil Moisture Active Passive (SMAP) ground-based development unit [1], or can be a completely independent system that is interfaced to any radiometer, provided that frequency shifting to GREX's intermediate frequency is performed prior to sampling. If the radiometer RF is less than 200 MHz, then the band can be sampled and acquired directly by the system. A key feature of GREX is its ability to sample two polarization channels simultaneously at up to 400 MSPS, 14-bit resolution each. The sampled signals can be recorded continuously to a 23 TB solid-state RAID storage array. Data captures can be analyzed offline using the supercomputing facilities at Goddard Space Flight Center. In addition, various Field Programmable Gate Array (FPGA)-amenable radiometer signal processing and RFI detection algorithms can be implemented directly on the GREX system because it includes a high-capacity Xilinx Virtex-5 FPGA prototyping system that is user customizable.
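The sampling and storage figures in this abstract imply a raw data rate and a maximum continuous recording time, estimated in the rough sketch below. It assumes that the 14-bit samples are packed with no padding or framing overhead and that "23 TB" means 23 × 10^12 bytes; neither assumption is stated in the abstract, so the result is an order-of-magnitude estimate only.

```python
CHANNELS = 2             # two polarization channels
SAMPLE_RATE_HZ = 400e6   # up to 400 MSPS per channel
BITS_PER_SAMPLE = 14
STORAGE_BYTES = 23e12    # assumption: decimal terabytes

# Widest band that can be sampled directly (Nyquist limit).
nyquist_hz = SAMPLE_RATE_HZ / 2

# Raw rate, assuming tightly packed samples with no overhead.
rate_bytes_per_s = CHANNELS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 8

# Time to fill the storage array at that rate.
fill_time_h = STORAGE_BYTES / rate_bytes_per_s / 3600

print(f"Nyquist bandwidth: {nyquist_hz / 1e6:.0f} MHz")
print(f"raw data rate: {rate_bytes_per_s / 1e9:.2f} GB/s")
print(f"time to fill storage: {fill_time_h:.2f} h")
```

The 200 MHz Nyquist limit matches the abstract's statement that radiometer bands below 200 MHz can be sampled directly.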
Hand veins feature extraction using DT-CNNS

NASA Astrophysics Data System (ADS)

Malki, Suleyman; Spaanenburg, Lambert

2007-05-01

As the identification process is based on the unique patterns of the users, biometrics technologies are expected to provide highly secure authentication systems. The existing systems using fingerprints or retina patterns are, however, very vulnerable. One's fingerprints are accessible as soon as the person touches a surface, while a high resolution camera easily captures the retina pattern. Thus, both patterns can easily be "stolen" and forged. Besides, technical considerations decrease the usability of these methods. Due to the direct contact with the finger, the sensor gets dirty, which decreases the authentication success ratio. Aligning the eye with a camera to capture the retina pattern is uncomfortable. On the other hand, vein patterns of either the palm of the hand or a single finger offer stable, unique and repeatable biometric features. A fingerprint-based identification system using Cellular Neural Networks has already been proposed by Gao. His system covers all stages of a typical fingerprint verification procedure from Image Preprocessing to Feature Matching. This paper performs a critical review of the individual algorithmic steps. Notably, the operation of False Feature Elimination is applied only once instead of 3 times. Furthermore, the number of iterations is limited to 1 for all used templates. Hence, the computational need of the feedback contribution is removed. Consequently, the computational effort is drastically reduced without a notable change in quality. This allows a full integration of the detection mechanism. The system is prototyped on a Xilinx Virtex II Pro P30 FPGA.
Research and design of portable photoelectric rotary table data-acquisition and analysis system

NASA Astrophysics Data System (ADS)

Yang, Dawei; Yang, Xiufang; Han, Junfeng; Yan, Xiaoxu

2015-02-01

The photoelectric rotary table, as the main tracking measurement platform, is widely used in shooting ranges and aerospace applications. To meet the demands of laboratory and field use of photoelectric test instruments and equipment in range photoelectric tracking measurement systems, a portable photoelectric rotary table data acquisition and analysis system was researched and designed. This paper introduces the hardware design of the system, based on a Xilinx Virtex-4 series FPGA and its peripheral modules, and the software design of the host computer, developed on the VC++ 6.0 programming platform using the MFC class libraries. The system integrates data acquisition, display and storage, commissioning control, analysis, laboratory waveform playback, transmission, fault diagnosis, and other functions into an organic whole, and has the advantages of small volume, embeddability, high speed, portability, and simple operation. With a photoelectric tracking turntable as the experimental object, joint debugging of the system software and hardware was carried out. The experimental results show that the system can perform data acquisition, analysis, and processing for photoelectric tracking equipment as well as control and debugging of the turntable, and that the measurement results are accurate and reliable, with good maintainability and extensibility. The design is of great significance for advancing the debugging, diagnosis, condition monitoring, and fault analysis of photoelectric tracking measurement equipment, for standardizing and normalizing its interfaces, and for improving its maintainability, and it has innovative and practical value.
Unified transform architecture for AVC, AVS, VC-1 and HEVC high-performance codecs

NASA Astrophysics Data System (ADS)

Dias, Tiago; Roma, Nuno; Sousa, Leonel

2014-12-01

A unified architecture for fast and efficient computation of the set of two-dimensional (2-D) transforms adopted by the most recent state-of-the-art digital video standards is presented in this paper. In contrast to other designs with similar functionality, the presented architecture is built on a scalable, modular and completely configurable processing structure. This flexible structure not only allows the architecture to be easily reconfigured to support different transform kernels, but also permits its resizing to efficiently support transforms of different orders (e.g. order-4, order-8, order-16 and order-32). Consequently, not only is it highly suitable for realizing high-performance multi-standard transform cores, but it also offers highly efficient implementations of specialized processing structures addressing only a reduced subset of transforms that are used by a specific video standard. The experimental results that were obtained by prototyping several configurations of this processing structure in a Xilinx Virtex-7 FPGA show the superior performance and hardware efficiency levels provided by the proposed unified architecture for the implementation of transform cores for the Advanced Video Coding (AVC), Audio Video coding Standard (AVS), VC-1 and High Efficiency Video Coding (HEVC) standards. In addition, such results also demonstrate the ability of this processing structure to realize multi-standard transform cores that support all the standards mentioned above and are capable of processing the 8k Ultra High Definition Television (UHDTV) video format (7,680 × 4,320 at 30 fps) in real time.
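To give a sense of the throughput that real-time 8k processing demands, the short calculation below derives the pixel rate and the number of transform blocks per second for each transform order named in the abstract. It is a back-of-the-envelope sketch that counts luma samples only and assumes the frame is tiled entirely with blocks of a single order; real encoders mix block sizes and also process chroma.

```python
WIDTH, HEIGHT, FPS = 7680, 4320, 30   # 8k UHDTV format from the abstract

def pixel_rate(width: int, height: int, fps: int) -> int:
    """Luma samples per second."""
    return width * height * fps

def blocks_per_second(width: int, height: int, fps: int, order: int) -> int:
    """Transform blocks per second if every block is order x order (luma only)."""
    return (width // order) * (height // order) * fps

print(f"pixel rate: {pixel_rate(WIDTH, HEIGHT, FPS):,} samples/s")
for order in (4, 8, 16, 32):
    rate = blocks_per_second(WIDTH, HEIGHT, FPS, order)
    print(f"order-{order}: {rate:,} blocks/s")
```

At close to one billion luma samples per second, a core clocked at a few hundred MHz has to process several samples per clock cycle, which motivates the scalable, parallel structure described above.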
On Multiple AER Handshaking Channels Over High-Speed Bit-Serial Bidirectional LVDS Links With Flow-Control and Clock-Correction on Commercial FPGAs for Scalable Neuromorphic Systems.

PubMed

Yousefzadeh, Amirreza; Jablonski, Miroslaw; Iakymchuk, Taras; Linares-Barranco, Alejandro; Rosado, Alfredo; Plana, Luis A; Temple, Steve; Serrano-Gotarredona, Teresa; Furber, Steve B; Linares-Barranco, Bernabe

2017-10-01

Address event representation (AER) is a widely employed asynchronous technique for interchanging "neural spikes" between different hardware elements in neuromorphic systems. Each neuron or cell in a chip or a system is assigned an address (or ID), which is typically communicated through a high-speed digital bus, thus time-multiplexing a high number of neural connections. Conventional AER links use parallel physical wires together with a pair of handshaking signals (request and acknowledge). In this paper, we present a fully serial implementation using bidirectional SATA connectors with a pair of low-voltage differential signaling (LVDS) wires for each direction. The proposed implementation can multiplex a number of conventional parallel AER links for each physical LVDS connection. It uses flow control, clock correction, and byte alignment techniques to transmit 32-bit address events reliably over multiplexed serial connections. The setup has been tested using commercial Spartan-6 FPGAs, attaining a maximum event transmission speed of 75 Meps (Mega events per second) for 32-bit events at a line rate of 3.0 Gbps. Full HDL code (VHDL/Verilog) and example demonstration code for the SpiNNaker platform will be made available.
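The reported event rate and line rate can be related by a simple efficiency calculation, sketched below. The observation that the resulting ratio equals the payload fraction of 8b/10b line coding is a hedged remark only: the abstract does not say which line code is used, so this should not be read as a description of the authors' link layer.

```python
EVENT_RATE_EPS = 75e6    # 75 Meps, from the abstract
BITS_PER_EVENT = 32
LINE_RATE_BPS = 3.0e9    # 3.0 Gbps, from the abstract

# Payload bits actually carried per second.
payload_bps = EVENT_RATE_EPS * BITS_PER_EVENT

# Fraction of the raw line rate that carries event payload.
link_efficiency = payload_bps / LINE_RATE_BPS

# Payload fraction of 8b/10b coding, shown for comparison only (assumption).
EIGHT_B_TEN_B = 8 / 10

print(f"payload rate: {payload_bps / 1e9:.1f} Gbps")
print(f"link efficiency: {link_efficiency:.0%}")
print(f"8b/10b payload fraction: {EIGHT_B_TEN_B:.0%}")
```

If a 10-bits-per-byte line code is indeed in use, the quoted 75 Meps would correspond to the link running at essentially full payload capacity.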
Radiation tolerance of readout electronics for Belle II

NASA Astrophysics Data System (ADS)

Higuchi, T.; Nakao, M.; Nakano, E.

2012-02-01

We plan to start the Belle II experiment in 2015 and to continue data taking for more than ten years. Because some of the front-end electronics cards of Belle II are located inside the detector, radiation effects on their components will be a severe problem. Using experimental exposure facilities for neutrons and γ rays, we study the radiation effects of these particles on the Virtex-5 FPGA, optical transceivers, and voltage regulators. The Virtex-5 FPGA is found to keep operating after irradiation with more than a 20-year-equivalent neutron flux of Belle II and an 88-year-equivalent γ-ray dose. We observe single event upsets (SEUs) and multiple bit upsets (MBUs) in the Virtex-5 FPGA under neutron irradiation. We also find almost doubled SEU counts in the Virtex-5 FPGA when it is bombarded from its tail side rather than its head side. We extrapolate the observed SEU and MBU counts in the Virtex-5 FPGA to the entire readout system of the Belle II central drift chamber, and expect SEU and MBU rates of one SEU per four minutes and one MBU per 11.5 hours, respectively. The optical transceivers are found to keep operating after integration of a 12-year-equivalent neutron flux, while they are killed by about a 3-year-equivalent γ-ray dose, a problem which should be solved in future research. The voltage regulators are found to keep operating for more than a 10-year-equivalent γ-ray dose.
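The extrapolated upset rates can be turned into daily counts and short-interval probabilities with the small sketch below. It treats upsets as a Poisson process with a constant rate, which is an assumption made here for illustration and is not stated in the abstract.

```python
import math

SEU_INTERVAL_MIN = 4.0          # one SEU per four minutes (system-wide)
MBU_INTERVAL_MIN = 11.5 * 60    # one MBU per 11.5 hours

MINUTES_PER_DAY = 24 * 60
seu_per_day = MINUTES_PER_DAY / SEU_INTERVAL_MIN
mbu_per_day = MINUTES_PER_DAY / MBU_INTERVAL_MIN
seu_per_mbu = MBU_INTERVAL_MIN / SEU_INTERVAL_MIN

def prob_at_least_one(window_min: float, mean_interval_min: float) -> float:
    """P(at least one upset in the window), assuming a Poisson process."""
    return 1.0 - math.exp(-window_min / mean_interval_min)

p_seu_one_minute = prob_at_least_one(1.0, SEU_INTERVAL_MIN)

print(f"expected SEUs per day: {seu_per_day:.0f}")
print(f"expected MBUs per day: {mbu_per_day:.2f}")
print(f"SEUs per MBU: {seu_per_mbu:.1f}")
print(f"P(>=1 SEU in one minute): {p_seu_one_minute:.3f}")
```

Rates of this size suggest why periodic detection and repair of configuration memory would be needed rather than occasional manual reconfiguration.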
Three-phase Four-leg Inverter LabVIEW FPGA Control Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

in parallel with other voltage regulating devices on the AC or DC buses. This flexibility allows the Inverter to operate as a stand-alone voltage source, connected to the grid, or in parallel with other controllable voltage sources as part of a microgrid or remote power system. In addition, as the inverter is expected to operate under severe unbalanced conditions, the software includes algorithms to accurately compute real and reactive power for each phase based on definitions provided in the IEEE Standard 1459: IEEE Standard Definitions for the Measurement of Electric Power Quantities Under Sinusoidal, Nonsinusoidal, Balanced, or Unbalanced Conditions. Finally, the software includes code to output analog signals for debugging and for tuning of control loops. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, user-settable switching frequencies and synchronized control loop update rates of tens of kHz, and reference waveform generation, including Phase Lock Loop (PLL), update rate of 100 kHz.
A 3.9 ps Time-Interval RMS Precision Time-to-Digital Converter Using a Dual-Sampling Method in an UltraScale FPGA

NASA Astrophysics Data System (ADS)

Wang, Yonggang; Liu, Chong

2016-10-01

Field programmable gate arrays (FPGAs) manufactured with more advanced processing technology have faster carry chains and smaller delay elements, which are favorable for the design of tapped delay line (TDL)-style time-to-digital converters (TDCs) in FPGA. However, new challenges are posed in using them to implement TDCs with a high time precision. In this paper, we propose a bin realignment method and a dual-sampling method for TDC implementation in a Xilinx UltraScale FPGA. The former realigns the disordered time delay taps so that the TDC precision can approach the limit of its delay granularity, while the latter doubles the number of taps in the delay line so that the TDC precision beyond the cell delay limitation can be expected. Two TDC channels were implemented in a Kintex UltraScale FPGA, and the effectiveness of the new methods was evaluated. For fixed time intervals in the range from 0 to 440 ns, the average RMS precision measured by the two TDC channels reaches 5.8 ps using the bin realignment, and it further improves to 3.9 ps by using the dual-sampling method. The time precision has a 5.6% variation in the measured temperature range. Every part of the TDC, including dual-sampling, encoding, and on-line calibration, could run at a 500 MHz clock frequency. The system measurement dead time is only 4 ns.
SPIDR, a general-purpose readout system for pixel ASICs

NASA Astrophysics Data System (ADS)

van der Heijden, B.; Visser, J.; van Beuzekom, M.; Boterenbrood, H.; Kulis, S.; Munneke, B.; Schreuder, F.

2017-02-01

The SPIDR (Speedy PIxel Detector Readout) system is a flexible general-purpose readout platform that can be easily adapted to test and characterize new and existing detector readout ASICs. It was originally designed for the readout of pixel ASICs from the Medipix/Timepix family, but other types of ASICs or front-end circuits can be read out as well. The SPIDR system consists of an FPGA board with memory and various communication interfaces, FPGA firmware, a CPU subsystem and an API library on the PC. The FPGA firmware can be adapted to read out other ASICs by re-using IP blocks. The available IP blocks include a UDP packet builder, 1 and 10 Gigabit Ethernet MACs and a "soft core" CPU. Currently the firmware is targeted at the Xilinx VC707 development board and at a custom board called Compact-SPIDR. The firmware can easily be ported to other Xilinx 7 series and UltraScale FPGAs. The gap between an ASIC and the data acquisition back-end is bridged by the SPIDR system. Using the high pin count VITA 57 FPGA Mezzanine Card (FMC) connector, only a simple chip carrier PCB is required. A 1 and a 10 Gigabit Ethernet interface handle the connection to the back-end. These can be used simultaneously for high-speed data and configuration over separate channels. In addition to the FMC connector, configurable inputs and outputs are available for synchronization with other detectors. A high resolution (≈ 27 ps bin size) time-to-digital converter is provided for time stamping events in the detector. The SPIDR system is frequently used as readout for the Medipix3 and Timepix3 ASICs. Using the 10 Gigabit Ethernet interface it is possible to read out a single chip at full bandwidth or up to 12 chips at a reduced rate. Another recent application is the test-bed for the VeloPix ASIC, which is developed for the Vertex Detector of the LHCb experiment. In this case the SPIDR system processes the 20 Gbps scrambled data stream from the VeloPix and distributes it over four 10 Gigabit
NASA Tech Briefs, February 2012

NASA Technical Reports Server (NTRS)

2012-01-01

This issue contains the following briefs: (1) Optical Comb from a Whispering Gallery Mode Resonator for Spectroscopy and Astronomy Instruments Calibration (2) Real-Time Flight Envelope Monitoring System (3) Nemesis Autonomous Test System (4) Mirror Metrology Using Nano-Probe Supports (5) Automated Lab-on-a-Chip Electrophoresis System (6) Techniques for Down-Sampling a Measured Surface Height Map for Model Validation (7) Multi-Component, Multi-Point Interferometric Rayleigh/Mie Doppler Velocimeter (8) Frequency to Voltage Converter Analog Front-End Prototype (9) Dust-Tolerant Intelligent Electrical Connection System (10) Gigabit Ethernet Asynchronous Clock Compensation FIFO (11) High-Speed, Multi-Channel Serial ADC LVDS Interface for Xilinx Virtex-5 FPGA (12) Glovebox for GeoLab Subsystem in HDU1-PEM (13) Modified Process Reduces Porosity when Soldering in Reduced Gravity Environments (14) Use of Functionalized Carbon Nanotubes for Covalent Attachment of Nanotubes to Silicon (15) Flexible Plug Repair for Shuttle Wing Leading Edge (16) Three Dimensionally Interlinked, Dense, Solid Form of Single-Walled CNT Ropes (17) Axel Robotic Platform for Crater and Extreme Terrain Exploration (18) Site Tamper and Material Plow Tool - STAMP (19) Magnetic Interface for Segmented Mirror Assembly (20) Transpiration-Cooled Spacecraft-Insulation-Repair Fasteners (21) Fluorescence-Based Sensor for Monitoring Activation of Lunar Dust (22) Aperture Ion Source (23) Virtual Ultrasound Guidance for Inexperienced Operators (24) Model-Based Fault Diagnosis: Performing Root Cause and Impact Analyses in Real Time (25) Interactive Schematic Integration Within the Propellant System Modeling Environment (26) Magnetic and Electric Field Polarizations of Oblique Magnetospheric Chorus Waves (27) Variable Sampling Mapping.
The KLOE-2 high energy taggers

NASA Astrophysics Data System (ADS)

Curciarello, F.

2017-06-01

The precision measurement of the π0 → γγ width gives insight into low-energy QCD dynamics. A way to achieve the precision needed (1%) to test theory predictions is to study π0 production through γγ fusion in the e+e- → e+e-γ*γ* → e+e-π0 reaction. The KLOE-2 experiment, currently running at the DAΦNE facility in Frascati, aims to perform this measurement. For this reason, new detectors that tag final-state leptons have been installed along the DAΦNE beam line in order to reduce the background coming from phi-meson decays. The High Energy Tagger (HET) detector measures the deviation of leptons from their main orbit by determining their position and timing. The HET detectors are placed in roman pots just at the exit of the DAΦNE dipole magnets, 11 m away from the IP, on both the positron and electron sides. The HET sensitive area is made up of a set of 28 plastic scintillators. A dedicated DAQ electronic board, based on a Xilinx Virtex-5 FPGA, has been developed for this detector. It provides a MultiHit TDC with a time resolution of 550(1) ps and the possibility to clearly identify the correct bunch crossing (ΔTbunch ~ 2.7 ns). The most relevant features of the KLOE-2 tagging system operation, such as time performance, stability and the techniques used to determine the time overlap between the KLOE and HET asynchronous DAQs, will be presented.

Parallel point-multiplication architecture using combined group operations for high-speed cryptographic applications.

PubMed

Hossain, Md Selim; Saeedi, Ehsan; Kong, Yinan

2017-01-01

In this paper, we propose a novel parallel architecture for fast hardware implementation of elliptic curve point multiplication (ECPM), which is the key operation of an elliptic curve cryptography processor. The point multiplication over binary fields is synthesized on both FPGA and ASIC technology by designing fast elliptic curve group operations in Jacobian projective coordinates. A novel combined point doubling and point addition (PDPA) architecture is proposed for group operations to achieve high speed and low hardware requirements for ECPM. It has been implemented over the binary field which is recommended by the National Institute of Standards and Technology (NIST). The proposed ECPM supports two Koblitz and random curves for the key sizes 233 and 163 bits. For group operations, a finite-field arithmetic operation, e.g. multiplication, is designed on a polynomial basis. The delay of a 233-bit point multiplication is only 3.05 and 3.56 μs, in a Xilinx Virtex-7 FPGA, for Koblitz and random curves, respectively, and 0.81 μs in an ASIC 65-nm technology, which are the fastest hardware implementation results reported in the literature to date. In addition, a 163-bit point multiplication is also implemented in FPGA and ASIC for fair comparison which takes around 0.33 and 0.46 μs, respectively. The area-time product of the proposed point multiplication is very low compared to similar designs. The performance (1/(Area × Time) = 1/AT) and Area × Time × Energy (ATE) product of the proposed design are far better than the most significant studies found in the literature.
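As a rough illustration of the polynomial-basis field multiplication mentioned in the abstract above, the following Python sketch multiplies two elements of the 163-bit binary field by shift-and-XOR, reducing by the NIST polynomial x^163 + x^7 + x^6 + x^3 + 1. It is a bit-serial software model written for this listing, not the authors' parallel hardware architecture; the name gf2m_mul and the integer encoding of field elements are assumptions made here.

```python
# Bit-serial polynomial-basis multiplication in GF(2^m) (illustrative sketch).
# A field element is a Python int whose bit i is the coefficient of x^i.

M = 163
# NIST reduction polynomial for the 163-bit binary field:
# f(x) = x^163 + x^7 + x^6 + x^3 + 1
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_mul(a: int, b: int, m: int = M, f: int = F) -> int:
    """Return a*b mod f(x); both inputs must already be reduced (< 2^m)."""
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1: add (XOR) a
            result ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if (a >> m) & 1:   # degree reached m: subtract (XOR) f(x)
            a ^= f
    return result
```

For example, gf2m_mul(2, 1 << 162) multiplies x by x^162; the product x^163 reduces to x^7 + x^6 + x^3 + 1, which is 0xC9 in this encoding.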
Aeroflex Technology as Class-Y Demonstrator

NASA Technical Reports Server (NTRS)

Suh, Jong-ook; Agarwal, Shri; Popelar, Scott

2014-01-01

Modern space field programmable gate array (FPGA) devices with increased functional density and operational frequency, such as Xilinx Virtex 4 (V4) and 5 (V5), are packaged in non-hermetic ceramic flip chip forms. These next generation space parts were not qualified to the MIL-PRF-38535 Qualified Manufacturer Listing (QML) class-V when they were released because class-V was only intended for hermetic parts. In order to bring Xilinx V5 type packages into the QML system, it was suggested that class-Y be set up as a new category. From 2010 through 2014, a JEDEC G12 task group developed screening and qualification requirements for Class-Y products. The Document Standardization Division of the Defense Logistics Agency (DLA) has completed an engineering practice study. In parallel with the class-Y efforts, the NASA Electronic Parts and Packaging (NEPP) program has funded JPL to study potential reliability issues of the class-Y products. The major hurdle of this task was the absence of adequate research samples. Figure 1-1 shows schematic diagrams of typical structures of class-Y type products. Typically, class-Y products are either in ceramic flip chip column grid array (CGA) or land grid array (LGA) form. In class-Y packages, underfill and heat spread adhesive materials are directly exposed to the spacecraft environment due to their non-hermeticity. One of the concerns originally raised was that the underfill material could degrade due to the spacecraft environment and negatively impact the reliability of the package. In order to study such issues, it was necessary to use ceramic daisy chain flip chip package samples so that continuity of flip chip solder bumps could be monitored during the reliability tests. However, none of the commercially available class-Y daisy chain parts had electrical connections through flip chip solder bumps; only solder columns were daisy chained, which made it impossible to test continuity of flip chip solder bumps without using extremely
FPGA based L-band pulse Doppler radar design and implementation

NASA Astrophysics Data System (ADS)

Savci, Kubilay

As its name implies, RADAR (Radio Detection and Ranging) is an electromagnetic sensor used for detecting and locating targets from their return signals. Radar systems radiate electromagnetic energy from the antenna, part of which is intercepted by an object. Objects reradiate a portion of the energy, which is captured by the radar receiver. The received signal is then processed for information extraction. Radar systems are widely used for surveillance, air security, navigation, weather hazard detection, as well as remote sensing applications. In this work, an FPGA based L-band Pulse Doppler radar prototype, which is used for target detection, localization and velocity calculation, has been built and a general-purpose Pulse Doppler radar processor has been developed. This radar is a ground based stationary monopulse radar, which transmits a short pulse with a certain pulse repetition frequency (PRF). Return signals from the target are processed and information about their location and velocity is extracted. Discrete components are used for the transmitter and receiver chain. The hardware solution is based on the Xilinx Virtex-6 ML605 FPGA board, responsible for the control of the radar system and the digital signal processing of the received signal, which involves Constant False Alarm Rate (CFAR) detection and Pulse Doppler processing. The algorithm is implemented in MATLAB/SIMULINK using the Xilinx System Generator for DSP tool. The field programmable gate array (FPGA) implementation of the radar system provides the flexibility of changing parameters such as the PRF and pulse length; therefore it can be used with different radar configurations as well. A VHDL design has been developed for a 1 Gbit Ethernet connection to transfer the digitized return signal and detection results to a PC. A-Scope software has been developed in the C# programming language to display time domain radar signals and detection results on the PC. Data are processed both in the FPGA chip and on the PC. FPGA uses fixed
Primary investigation the impacts of the external memory (DDR3) failures on the performance of Xilinx Zynq-7010 SoC based system (MicroZed) using laser irradiation

NASA Astrophysics Data System (ADS)

Liu, Shuhuan; Du, Xuecheng; Du, Xiaozhi; Zhang, Yao; Mubashiru, Lawal Olarewaju; Luo, Dongyang; Yuan, Yuan; Deng, Tianxiang; Li, Zhuoqi; Zang, Hang; Li, Yonghong; He, Chaohui; Ma, Yingqi; Shangguan, Shipeng

2017-09-01

The impacts of the external dynamic memory (DDR3) failures on the performance of 28 nm Xilinx Zynq-7010 SoC based system (MicroZed) were investigated with two sets of 1064 nm laser platforms. The failure-sensitive area distributions on the back surface of the test DDR3 were primarily localized with a CW laser irradiation platform. During the CW laser scanning on the back surface of the DDR3 of the test board system, various failure modes except SEU and SEL (MBU, SEFI, data storage address error, rebooting, etc.) were found in the tested embedded modules (ALU, PL, Register, Cache and DMA, etc.) of the SoC. Moreover, the experimental results demonstrated that there were 16 failure-sensitive blocks symmetrically distributed on the back surface of the DDR3, each measuring about 1 mm × 0.5 mm. The factors influencing the failure modes of the embedded modules were primarily analyzed and the SEE characteristics of DDR3 induced by the picosecond pulsed laser were tested. The failure modes of DDR3 found were SEU, SEFI, SEL, test board rebooting by itself, unknown data, etc. Furthermore, the changes in the time interval distributions of failure occurrence in DDR3 with the pulsed laser irradiation energy and the CPU operating frequency were measured and compared. Meanwhile, the failure characteristics of DDR3 induced by pulsed laser irradiation were primarily explored. The measured results and the testing techniques designed in this paper provide some reference information for evaluating the reliability of the test system or other similar electronic systems in harsh environments.
Evaluation of FPGA to PC feedback loop

NASA Astrophysics Data System (ADS)

Linczuk, Pawel; Zabolotny, Wojciech M.; Wojenski, Andrzej; Krawczyk, Rafal D.; Pozniak, Krzysztof T.; Chernyshova, Maryna; Czarski, Tomasz; Gaska, Michal; Kasprowicz, Grzegorz; Kowalska-Strzeciwilk, Ewa; Malinowski, Karol

2017-08-01

The paper presents an evaluation study of the performance of a data transmission subsystem that can be used in High Energy Physics (HEP) and other High-Performance Computing (HPC) systems. The test environment consisted of a Xilinx Artix-7 FPGA and a server-grade PC connected via a PCIe Gen2 x4 bus. The DMA engine was based on the Xilinx DMA for PCI Express Subsystem, controlled by the modified Xilinx XDMA kernel driver. The research is focused on the influence of the system configuration on achievable throughput and latency of data transfer.
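For context on the throughput figures such a study measures against, the short Python sketch below works out the theoretical line-rate ceiling of a four-lane PCIe Gen2 link like the one in the test environment above. It relies only on the standard Gen2 parameters (5 GT/s per lane, 8b/10b encoding); the function name is hypothetical, and the result ignores packet headers, flow control and DMA descriptor overhead, so measured throughput will be lower.

```python
def pcie_raw_bandwidth_bytes_per_s(gigatransfers_per_s: float,
                                   lanes: int,
                                   payload_bits: int,
                                   coded_bits: int) -> float:
    """Upper bound on one-direction link bandwidth in bytes per second.

    gigatransfers_per_s: signalling rate per lane (5.0 for PCIe Gen2)
    payload_bits/coded_bits: line-code efficiency (8/10 for 8b/10b)
    """
    bits_per_s = gigatransfers_per_s * 1e9 * lanes * payload_bits / coded_bits
    return bits_per_s / 8


# PCIe Gen2 x4: 5 GT/s per lane, 4 lanes, 8b/10b encoding
gen2_x4 = pcie_raw_bandwidth_bytes_per_s(5.0, 4, 8, 10)
print(f"PCIe Gen2 x4 ceiling: {gen2_x4 / 1e9:.1f} GB/s per direction")
```

The ceiling comes out at 2 GB/s per direction, that is 500 MB/s per lane after 8b/10b coding.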
Software Defined GPS Receiver for International Space Station

NASA Technical Reports Server (NTRS)

Duncan, Courtney B.; Robison, David E.; Koelewyn, Cynthia Lee

2011-01-01

JPL is providing a software defined radio (SDR) that will fly on the International Space Station (ISS) as part of the CoNNeCT project under NASA's SCaN program. The SDR consists of several modules including a Baseband Processor Module (BPM) and a GPS Module (GPSM). The BPM executes applications (waveforms) consisting of software components for the embedded SPARC processor and logic for two Virtex II Field Programmable Gate Arrays (FPGAs) that operate on data received from the GPSM. GPS waveforms on the SDR are enabled by an L-Band antenna, low noise amplifier (LNA), and the GPSM that performs quadrature downconversion at L1, L2, and L5. The GPS waveform for the JPL SDR will acquire and track L1 C/A, L2C, and L5 GPS signals from a CoNNeCT platform on ISS, providing the best GPS-based positioning of ISS achieved to date, the first use of multiple frequency GPS on ISS, and potentially the first L5 signal tracking from space. The system will also enable various radiometric investigations on ISS such as local multipath or ISS dynamic behavior characterization. In following the software-defined model, this work will create a highly portable GPS software and firmware package that can be adapted to another platform with the necessary processor and FPGA capability. This paper also describes ISS applications for the JPL CoNNeCT SDR GPS waveform, possibilities for future global navigation satellite system (GNSS) tracking development, and the applicability of the waveform components to other space navigation applications.
All-Digital Time-Domain CMOS Smart Temperature Sensor with On-Chip Linearity Enhancement.

PubMed

Chen, Chun-Chi; Chen, Chao-Lieh; Lin, Yi

2016-01-30

This paper proposes the first all-digital on-chip linearity enhancement technique for improving the accuracy of the time-domain complementary metal-oxide semiconductor (CMOS) smart temperature sensor. To facilitate on-chip application and intellectual property reuse, an all-digital time-domain smart temperature sensor was implemented using 90 nm Field Programmable Gate Arrays (FPGAs). Although the inverter-based temperature sensor has a smaller circuit area and lower complexity, two-point calibration must be used to achieve an acceptable inaccuracy. With the help of a calibration circuit, the influence of process variations was reduced greatly for one-point calibration support, reducing the test costs and time. However, the sensor response still exhibited a large curvature, which substantially affected the accuracy of the sensor. Thus, an on-chip linearity-enhanced circuit is proposed to linearize the curve and achieve a new linearity-enhanced output. The sensor was implemented on eight different Xilinx FPGAs using 118 slices per sensor in each FPGA to demonstrate the benefits of the linearization. Compared with the unlinearized version, the maximal inaccuracy of the linearized version decreased from 5 °C to 2.5 °C after one-point calibration in a range of -20 °C to 100 °C. The sensor consumed 95 μW at 1 kSa/s. The proposed linearity enhancement technique significantly improves temperature sensing accuracy, avoiding costly curvature compensation, while being fully synthesizable for future Very Large Scale Integration (VLSI) systems.
All-Digital Time-Domain CMOS Smart Temperature Sensor with On-Chip Linearity Enhancement

PubMed Central

Chen, Chun-Chi; Chen, Chao-Lieh; Lin, Yi

2016-01-01

This paper proposes the first all-digital on-chip linearity enhancement technique for improving the accuracy of the time-domain complementary metal-oxide semiconductor (CMOS) smart temperature sensor. To facilitate on-chip application and intellectual property reuse, an all-digital time-domain smart temperature sensor was implemented using 90 nm Field Programmable Gate Arrays (FPGAs). Although the inverter-based temperature sensor has a smaller circuit area and lower complexity, two-point calibration must be used to achieve an acceptable inaccuracy. With the help of a calibration circuit, the influence of process variations was reduced greatly for one-point calibration support, reducing the test costs and time. However, the sensor response still exhibited a large curvature, which substantially affected the accuracy of the sensor. Thus, an on-chip linearity-enhanced circuit is proposed to linearize the curve and achieve a new linearity-enhanced output. The sensor was implemented on eight different Xilinx FPGAs using 118 slices per sensor in each FPGA to demonstrate the benefits of the linearization. Compared with the unlinearized version, the maximal inaccuracy of the linearized version decreased from 5 °C to 2.5 °C after one-point calibration in a range of −20 °C to 100 °C. The sensor consumed 95 μW at 1 kSa/s. The proposed linearity enhancement technique significantly improves temperature sensing accuracy, avoiding costly curvature compensation, while being fully synthesizable for future Very Large Scale Integration (VLSI) systems. PMID:26840316
JPEG XS, a new standard for visually lossless low-latency lightweight image compression

NASA Astrophysics Data System (ADS)

Descampe, Antonin; Keinert, Joachim; Richter, Thomas; Fößel, Siegfried; Rouvroy, Gaël.

2017-09-01

JPEG XS is an upcoming standard from the JPEG Committee (formally known as ISO/IEC SC29 WG1). It aims to provide an interoperable visually lossless low-latency lightweight codec for a wide range of applications including mezzanine compression in broadcast and Pro-AV markets. This requires optimal support of a wide range of implementation technologies such as FPGAs, CPUs and GPUs. Targeted use cases are professional video links, IP transport, Ethernet transport, real-time video storage, video memory buffers, and omnidirectional video capture and rendering. In addition to the evaluation of the visual transparency of the selected technologies, a detailed analysis of the hardware and software complexity as well as the latency has been done to make sure that the new codec meets the requirements of the above-mentioned use cases. In particular, the end-to-end latency has been constrained to a maximum of 32 lines. Concerning the hardware complexity, neither encoder nor decoder should require more than 50% of an FPGA similar to Xilinx Artix 7 or 25% of an FPGA similar to Altera Cyclone 5. This process resulted in a coding scheme made of an optional color transform, a wavelet transform, the entropy coding of the highest magnitude level of groups of coefficients, and the raw inclusion of the truncated wavelet coefficients. This paper presents the details and status of the standardization process, a technical description of the future standard, and the latest performance evaluation results.
Reliable and redundant FPGA based read-out design in the ATLAS TileCal Demonstrator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Akerstedt, Henrik; Muschter, Steffen; Drake, Gary

The Tile Calorimeter at ATLAS [1] is a hadron calorimeter based on steel plates and scintillating tiles read out by PMTs. The current read-out system uses standard ADCs and custom ASICs to digitize and temporarily store the data on the detector. However, only a subset of the data is actually read out to the counting room. The on-detector electronics will be replaced around 2023. To achieve the required reliability the upgraded system will be highly redundant. Here the ASICs will be replaced with Kintex-7 FPGAs from Xilinx. This, in addition to the use of multiple 10 Gbps optical read-out links, will allow a full read-out of all detector data. Due to the higher radiation levels expected when the beam luminosity is increased, opportunities for repairs will be less frequent. The circuitry and firmware must therefore be designed for sufficiently high reliability using redundancy and radiation tolerant components. Within a year, a hybrid demonstrator including the new readout system will be installed in one slice of the ATLAS Tile Calorimeter. This will allow the proposed upgrade to be thoroughly evaluated well before the planned 2023 deployment in all slices, especially with regard to long term reliability. Different firmware strategies alongside with their integration in the demonstrator are presented in the context of high reliability protection against hardware malfunction and radiation induced errors.
Multiplier less high-speed squaring circuit for binary numbers

NASA Astrophysics Data System (ADS)

Sethi, Kabiraj; Panda, Rutuparna

2015-03-01

The squaring operation is important in many applications in signal processing, cryptography, etc. In general, squaring circuits reported in the literature use fast multipliers. A novel idea of a squaring circuit without using multipliers is proposed in this paper. An ancient Indian method used for squaring decimal numbers is extended here for binary numbers. The key to our success is that no multiplier is used. Instead, one squaring circuit is used. The hardware architecture of the proposed squaring circuit is presented. The design is coded in VHDL and synthesised and simulated in Xilinx ISE Design Suite 10.1 (Xilinx Inc., San Jose, CA, USA). It is implemented in a Xilinx Virtex 4vlx15sf363-12 device (Xilinx Inc.). The results in terms of time delay and area are compared with both modified Booth's algorithm and a squaring circuit using Vedic multipliers. Our proposed squaring circuit seems to have better performance in terms of both speed and area.
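The abstract above does not spell out the squaring method, so the Python sketch below only illustrates that a binary square can be formed without any general multiplier. It uses a standard identity, assumed here for illustration and not taken from the paper: writing x = 2^k + r, where 2^k is the leading one of x, gives x² = 2^(2k) + 2^(k+1)·r + r², which needs only shifts, additions and a smaller squaring. The function name is hypothetical.

```python
def square_no_multiply(x: int) -> int:
    """Square a non-negative integer using only shifts, adds and recursion.

    With x = 2^k + r (2^k is the leading one of x):
        x^2 = 2^(2k) + 2^(k+1) * r + r^2
    Multiplying by a power of two is a left shift, so no multiplier is needed.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0
    k = x.bit_length() - 1          # position of the leading one
    r = x - (1 << k)                # bits below the leading one
    return (1 << (k << 1)) + (r << (k + 1)) + square_no_multiply(r)
```

The recursion depth equals the number of one bits in x, so each step maps naturally onto one shift-and-add stage.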
Parallel point-multiplication architecture using combined group operations for high-speed cryptographic applications

PubMed Central

Saeedi, Ehsan; Kong, Yinan

2017-01-01

In this paper, we propose a novel parallel architecture for fast hardware implementation of elliptic curve point multiplication (ECPM), which is the key operation of an elliptic curve cryptography processor. The point multiplication over binary fields is synthesized on both FPGA and ASIC technology by designing fast elliptic curve group operations in Jacobian projective coordinates. A novel combined point doubling and point addition (PDPA) architecture is proposed for group operations to achieve high speed and low hardware requirements for ECPM. It has been implemented over the binary field which is recommended by the National Institute of Standards and Technology (NIST). The proposed ECPM supports two Koblitz and random curves for the key sizes 233 and 163 bits. For group operations, a finite-field arithmetic operation, e.g. multiplication, is designed on a polynomial basis. The delay of a 233-bit point multiplication is only 3.05 and 3.56 μs, in a Xilinx Virtex-7 FPGA, for Koblitz and random curves, respectively, and 0.81 μs in an ASIC 65-nm technology, which are the fastest hardware implementation results reported in the literature to date. In addition, a 163-bit point multiplication is also implemented in FPGA and ASIC for fair comparison which takes around 0.33 and 0.46 μs, respectively. The area-time product of the proposed point multiplication is very low compared to similar designs. The performance (1/(Area × Time) = 1/AT) and Area × Time × Energy (ATE) product of the proposed design are far better than the most significant studies found in the literature. PMID:28459831
STAR: FPGA-based software defined satellite transponder

NASA Astrophysics Data System (ADS)

Davalle, Daniele; Cassettari, Riccardo; Saponara, Sergio; Fanucci, Luca; Cucchi, Luca; Bigongiari, Franco; Errico, Walter

2013-05-01

This paper presents STAR, a flexible Telemetry, Tracking & Command (TT&C) transponder for Earth Observation (EO) small satellites, developed in collaboration with INTECS and SITAEL companies. With respect to state-of-the-art EO transponders, STAR includes the possibility of scientific data transfer thanks to the 40 Mbps downlink data-rate. This feature represents an important optimization in terms of hardware mass, which is important for EO small satellites. Furthermore, in-flight re-configurability of communication parameters via telecommand is important for in-orbit link optimization, which is especially useful for low orbit satellites where visibility can be as short as a few hundred seconds. STAR exploits the principles of digital radio to minimize the analog section of the transceiver. A 70 MHz intermediate frequency (IF) is the interface with an external S/X band radio-frequency front-end. The system is composed of a dedicated configurable high-speed digital signal processing part, the Signal Processor (SP), described in technology-independent VHDL working with a clock frequency of 184.32 MHz, and a low speed control part, the Control Processor (CP), based on the 32-bit Gaisler LEON3 processor clocked at 32 MHz, with SpaceWire and CAN interfaces. The quantization parameters were fine-tailored to reach a trade-off between hardware complexity and implementation loss, which is less than 0.5 dB at BER = 10^-5 for the RX chain. The IF ports require 8-bit precision. The system prototype is fitted on the Xilinx Virtex 6 VLX75T-FF484 FPGA, of which a space-qualified version has been announced. The total device occupation is 82%.
Programmable Logic Device (PLD) Design Description for the Integrated Power, Avionics, and Software (iPAS) Space Telecommunications Radio System (STRS) Radio

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary Jo W.

2017-01-01

The Space Telecommunications Radio System (STRS) provides a common, consistent framework for software defined radios (SDRs) to abstract the application software from the radio platform hardware. The STRS standard aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. To promote the use of the STRS architecture for future NASA advanced exploration missions, NASA Glenn Research Center (GRC) developed an STRS compliant SDR on a radio platform used by the Advance Exploration System program at the Johnson Space Center (JSC) in their Integrated Power, Avionics, and Software (iPAS) laboratory. At the conclusion of the development, the software and hardware description language (HDL) code was delivered to JSC for their use in their iPAS test bed to get hands-on experience with the STRS standard, and for development of their own STRS Waveforms on the now STRS compliant platform. The iPAS STRS Radio was implemented on the Reconfigurable, Intelligently-Adaptive Communication System (RIACS) platform, currently being used for radio development at JSC. The platform consists of a Xilinx ML605 Virtex-6 FPGA board, an Analog Devices FMCOMMS1-EBZ RF transceiver board, and an Embedded PC (Axiomtek eBox 620-110-FL) running the Ubuntu 12.4 operating system. Figure 1 shows the RIACS platform hardware. The result of this development is a very low cost STRS compliant platform that can be used for waveform developments for multiple applications. The purpose of this document is to describe the design of the HDL code for the FPGA portion of the iPAS STRS Radio, particularly the design of the FPGA wrapper and the test waveform.
FPGAs and HPC

DTIC Science & Technology

2007-01-01

Ridge Technology, internal unpublished document. 10. Byoungro, S.; Diniz , P. C.; Hall, M. W. Using Estimates From Behavioral Synthesis Tools in...WIERSCHKE OLAC PL/RKFE 10 E SATURN BLVD EDWARDS AFB CA 93524-7680 1 NVL RSRCH LAB D PAPCONSTANTOPOULOS WASHINGTON DC 20375-5000 1
Field Programmable Gate Array for Implementation of Redundant Advanced Digital Feedback Control

NASA Technical Reports Server (NTRS)

King, K. D.

2003-01-01

The goal of this effort was to develop a digital motor controller using field programmable gate arrays (FPGAs). This is a more rugged approach than a conventional microprocessor digital controller. FPGAs typically have higher radiation (rad) tolerance than both the microprocessor and memory required for a conventional digital controller. Furthermore, FPGAs can typically operate at higher speeds. (While speed is usually not an issue for motor controllers, it can be for other system controllers.) Other than motor power, only a 3.3-V digital power supply was used in the controller; no analog bias supplies were used. Since most of the circuit was implemented in the FPGA, no additional parts were needed other than the power transistors to drive the motor. The benefits that FPGAs provide over conventional designs (lower power and fewer parts) allow for smaller packaging and reduced weight and cost.
Test Waveform Applications for JPL STRS Operating Environment

NASA Technical Reports Server (NTRS)

Lux, James P.; Peters, Kenneth J.; Taylor, Gregory H.; Lang, Minh; Stern, Ryan A.; Duncan, Courtney B.

2013-01-01

This software demonstrates use of the JPL Space Telecommunications Radio System (STRS) Operating Environment (OE), tests APIs (application programming interfaces) presented by JPL STRS OE, and allows for basic testing of the underlying hardware platform. This software uses the JPL STRS Operating Environment ["JPL Space Telecommunications Radio System Operating Environment," (NPO-4776) NASA Tech Briefs, commercial edition, Vol. 37, No. 1 (January 2013), p. 47] to interact with the JPL-SDR Software Defined Radio developed for the CoNNeCT (COmmunications, Navigation, and Networking rEconfigurable Testbed) Project as part of the SCaN Testbed installed on the International Space Station (ISS). These are the first applications that are compliant with the new NASA STRS Architecture Standard. Several example waveform applications are provided to demonstrate use of the JPL STRS OE for the JPL-SDR platform used for the CoNNeCT Project. The waveforms provide a simple digitizer and playback capability for the S-Band RF slice, and a simple digitizer for the GPS slice [CoNNeCT Global Positioning System RF Module, (NPO-47764) NASA Tech Briefs, commercial edition, Vol. 36, No. 3 (March 2012), p. 36]. These waveforms may be used for hardware test, as well as for on-orbit or laboratory checkout. Additional example waveforms implement SpaceWire and timer modules, which can be used for time transfer and demonstration of communication between the two Xilinx FPGAs in the JPL-SDR. The waveforms are also compatible with ground-based use of the JPL STRS OE on radio breadboards and Linux.
An FPGA-based instrumentation platform for use at deep cryogenic temperatures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Conway Lamb, I. D.; Colless, J. I.; Hornibrook, J. M.

2016-01-15

We describe the operation of a cryogenic instrumentation platform incorporating commercially available field-programmable gate arrays (FPGAs). The functionality of the FPGAs at temperatures approaching 4 K enables signal routing, multiplexing, and complex digital signal processing in close proximity to cooled devices or detectors within the cryostat. The performance of the FPGAs in a cryogenic environment is evaluated, including clock speed, error rates, and power consumption. Although constructed for the purpose of controlling and reading out quantum computing devices with low latency, the instrument is generic enough to be of broad use in a range of cryogenic applications.
Reconfigurable Processing Module

NASA Technical Reports Server (NTRS)

Somervill, Kevin; Hodson, Robert; Jones, Robert; Williams, John

2005-01-01

To accommodate a wide spectrum of applications and technologies, NASA's Exploration Systems Mission Directorate has called for reconfigurable and modular technologies to support future missions to the moon and Mars. In response, Langley Research Center is leading a program entitled Reconfigurable Scaleable Computing (RSC) that is centered on the development of FPGA-based computing resources in a stackable form factor. This paper details the architecture and implementation of the Reconfigurable Processing Module (RPM), which is the key element of the RSC system. The RPM is an FPGA-based, space-qualified printed circuit assembly leveraging terrestrial/commercial design standards into the space applications domain. The form factor is similar to, and backwards compatible with, the PCI-104 standard utilizing only the PCI interface. The size is expanded to accommodate the required functionality while still being more than 30% smaller than a 3U CompactPCI™ card and without the overhead of the backplane. The architecture is built around two FPGA devices, one hosting PCI and memory interfaces, and another hosting mission application resources; the two are connected with a high-speed data bus. The PCI interface FPGA provides access via the PCI bus to onboard SDRAM, flash PROM, and the application resources, for both configuration management and runtime interaction. The reconfigurable FPGA, referred to as the Application FPGA (or simply "the application"), is a radiation-tolerant Xilinx Virtex-4 FX60 hosting custom application specific logic or soft microprocessor IP. The RPM implements various SEE mitigation techniques including TMR, EDAC, and configuration scrubbing of the reconfigurable FPGA. Prototype hardware and formal modeling techniques are used to explore the performability trade space. These models provide a novel way to calculate quality-of-service performance measures while simultaneously considering fault-related behavior due to SEE soft errors.
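Of the SEE mitigation techniques named in the abstract above, triple modular redundancy (TMR) is the simplest to show in a few lines. The Python sketch below models a bitwise two-of-three majority voter over three redundant copies of a word and reports which copy disagrees with the vote. It is a generic software model with hypothetical function names, not the RPM's actual voter logic.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def faulty_copies(a: int, b: int, c: int) -> list:
    """Indices (0, 1, 2) of the copies that differ from the voted value."""
    voted = tmr_vote(a, b, c)
    return [i for i, copy in enumerate((a, b, c)) if copy != voted]


# A single-bit upset in one copy is outvoted by the other two.
good = 0b10110010
upset = good ^ 0b00001000          # flip one bit in the second copy
print(bin(tmr_vote(good, upset, good)), faulty_copies(good, upset, good))
```

The voter masks any error confined to a single copy; upsets accumulating in two copies at the same bit position defeat it, which is one reason TMR is commonly paired with configuration scrubbing.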
Path planning on cellular nonlinear network using active wave computing technique

NASA Astrophysics Data System (ADS)

Yeniçeri, Ramazan; Yalçın, Müstak E.

2009-05-01

This paper introduces a simple algorithm to solve the robot path-finding problem using active wave computing techniques. A two-dimensional Cellular Neural/Nonlinear Network (CNN), consisting of relaxation oscillators, has been used to generate active waves and to process the visual information. The network, which has been implemented on a Field Programmable Gate Array (FPGA) chip, can be programmed, controlled and observed by a host computer. The arena of the robot is modelled as the medium of the active waves on the network. Active waves are employed to cover the whole medium with their own dynamics, starting from an initial point. The proposed algorithm works by observing the motion of the wave-front of the active waves. The host program first loads the arena model onto the active wave generator network and commands it to start the generation. It then periodically pulls the network image from the generator hardware to analyze the evolution of the active waves. When the algorithm is completed, a vectorial data image is generated. The path from any pixel on this image to the active wave generating pixel is drawn by the vectors on this image. The robot arena may be a complicated labyrinth or may have a simple geometry, but the arena surface must always be flat. Our Autowave Generator CNN implementation, which is hosted on the Xilinx University Program Virtex-II Pro Development System, is operated by a MATLAB program running on the host computer. As the active wave generator hardware has 16,384 neurons, an arena with 128 × 128 pixels can be modeled and solved by the algorithm. The system also has a monitor, and the network image is displayed on the monitor simultaneously.
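The wave-front idea in this abstract can be illustrated in software. The following is only a rough analogue under stated assumptions, not the authors' CNN hardware or MATLAB host program: a breadth-first "wave" spreads over a grid arena, and each cell records the neighbour from which the front arrived, which plays the role of the vector image. The names `wavefront_vectors` and `path_to_source` and the 0/1 arena encoding are hypothetical.

```python
from collections import deque

def wavefront_vectors(arena, source):
    """Spread a wave from `source` over free cells (0); walls are 1.

    Returns a dict mapping each reached cell to the neighbouring cell from
    which the wave front arrived, i.e. one step back toward the source.
    """
    rows, cols = len(arena), len(arena[0])
    toward_source = {source: None}
    frontier = deque([source])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and arena[nr][nc] == 0 and (nr, nc) not in toward_source:
                toward_source[(nr, nc)] = (r, c)
                frontier.append((nr, nc))
    return toward_source

def path_to_source(toward_source, start):
    """Follow the stored vectors from `start` back to the wave source."""
    path = []
    cell = start
    while cell is not None:
        path.append(cell)
        cell = toward_source[cell]
    return path

# Small labyrinth: the only route from the bottom-left runs around the wall.
arena = [[0, 0, 0],
         [1, 1, 0],
         [0, 0, 0]]
vectors = wavefront_vectors(arena, source=(0, 0))
path = path_to_source(vectors, start=(2, 0))
```

Because every reached cell stores a single step toward the source, a path can be read off from any start pixel without rerunning the wave, which matches the abstract's description of the vector image.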

Design of a reversible single precision floating point subtractor.

PubMed

Anantha Lakshmi, Av; Sudha, Gf

2014-01-04

In recent years, reversible logic has emerged as a major area of research due to its ability to reduce power dissipation, which is the main requirement in low-power digital circuit design. It has wide applications such as low-power CMOS design, nanotechnology, digital signal processing, communication, DNA computing and optical computing. Floating-point operations are needed very frequently in nearly all computing disciplines, and studies have shown floating-point addition/subtraction to be the most used floating-point operation. However, while a few designs exist for efficient reversible BCD subtractors, there is no work on a reversible floating-point subtractor. This paper presents an efficient reversible single precision floating-point subtractor. The proposed design requires reversible designs of an 8-bit and a 24-bit comparator unit, an 8-bit and a 24-bit subtractor, and a normalization unit. For normalization, a 24-bit reversible leading zero detector and a 24-bit reversible shift register are implemented to shift the mantissas. To realize a reversible 1-bit comparator, two new 3x3 reversible gates are proposed. The proposed reversible 1-bit comparator is better and optimized in terms of the number of reversible gates used, the transistor count and the number of garbage outputs. The proposed work is analysed in terms of number of reversible gates, garbage outputs, constant inputs and quantum costs. Using these modules, an efficient design of a reversible single precision floating point subtractor is proposed. The proposed circuits have been simulated using Modelsim and synthesized using Xilinx Virtex5vlx30tff665-3. The total on-chip power consumed by the proposed 32-bit reversible floating point subtractor is 0.410 W.
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction

NASA Astrophysics Data System (ADS)

Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh

2014-05-01

Compressive Sensing (CS) is a novel scheme in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and slow, which makes them impractical for real-time processing applications. The paper presents novel architectures for the Orthogonal Matching Pursuit (OMP) algorithm, one of the popular CS reconstruction algorithms. We show the implementation results of the proposed architectures on FPGA, ASIC and on a custom many-core platform. For the FPGA and ASIC implementations, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%, whereas for the custom many-core platform, efficient parallelization techniques are applied to reconstruct signals with variant signal lengths of N and sparsity of m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently parallelized by taking advantage of blocked algorithms. For demonstration purposes, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. Implementation on a Xilinx Virtex-5 FPGA requires 27.14 μs to reconstruct the signal using basic OMP, whereas with the thresholding method it requires 18 μs. The ASIC implementation reconstructs the signal in 13 μs. However, our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that, compared to previously published work on the same algorithm and matrix size, the proposed architectures for FPGA and ASIC implementations perform 1.3x and 1.8x faster, respectively. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
Reconfigurable fault tolerant avionics system

NASA Astrophysics Data System (ADS)

Ibrahim, M. M.; Asami, K.; Cho, Mengu

This paper presents the design of a reconfigurable avionics system based on modern Static Random Access Memory (SRAM)-based Field Programmable Gate Array (FPGA) to be used in future generations of nano satellites. A major concern in satellite systems and especially nano satellites is to build robust systems with low-power consumption profiles. The system is designed to be flexible by providing the capability of reconfiguring itself based on its orbital position. As Single Event Upsets (SEU) do not have the same severity and intensity in all orbital locations, having the maximum at the South Atlantic Anomaly (SAA) and the polar cusps, the system does not have to be fully protected all the time in its orbit. An acceptable level of protection against high-energy cosmic rays and charged particles roaming in space is provided within the majority of the orbit through software fault tolerance. Check pointing and roll back, besides control flow assertions, is used for that level of protection. In the minority part of the orbit where severe SEUs are expected to exist, a reconfiguration for the system FPGA is initiated where the processor systems are triplicated and protection through Triple Modular Redundancy (TMR) with feedback is provided. This technique of reconfiguring the system as per the level of the threat expected from SEU-induced faults helps in reducing the average dynamic power consumption of the system to one-third of its maximum. This technique can be viewed as a smart protection through system reconfiguration. The system is built on the commercial version of the (XC5VLX50) Xilinx Virtex5 FPGA on bulk silicon with 324 IO. Simulations of orbit SEU rates were carried out using the SPENVIS web-based software package.
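As a small illustration of the Triple Modular Redundancy mentioned in this abstract, the sketch below shows a bitwise 2-of-3 majority voter in software. It is a generic textbook voter, not the authors' FPGA design; the function name `tmr_vote` and the example word are assumptions. In a scheme "with feedback", the voted value would also be written back to all three copies so that a corrupted copy is repaired.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)

golden = 0b1011_0010
upset = golden ^ 0b0000_1000      # a single bit flipped in one copy (an SEU)
voted = tmr_vote(golden, upset, golden)
```

A single upset in any one copy is outvoted by the other two; two copies upset in the same bit position would defeat the voter, which is one reason periodic correction of the copies matters.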
The NASA Electronic Parts and Packaging (NEPP) Program: Insertion of New Electronics Technologies

NASA Technical Reports Server (NTRS)

LaBel, Kenneth A.; Sampson, Michael J.

2007-01-01

This viewgraph presentation gives an overview of NASA Electronic Parts and Packaging (NEPP) Program's new electronics technology trends. The topics include: 1) The Changing World of Radiation Testing of Memories; 2) Even Application-Specific Tests are Costly!; 3) Hypothetical New Technology Part Qualification Cost; 4) Where we are; 5) Approaching FPGAs as a More Than a "Part" for Reliability; 6) FPGAs Beget Novel Radiation Test Setups; 7) Understanding the Complex Radiation Data; 8) Tracking Packaging Complexity and Reliability for FPGAs; 9) Devices Supporting the FPGA Need to be Considered; 10) Summary of the New Electronic Technologies and Insertion into Flight Programs Workshop; and 11) Highlights of Panel Notes and Comments
NEPP Update of Independent Single Event Upset Field Programmable Gate Array Testing

NASA Technical Reports Server (NTRS)

Berg, Melanie; Label, Kenneth; Campola, Michael; Pellish, Jonathan

2017-01-01

This presentation provides a NASA Electronic Parts and Packaging (NEPP) Program update of independent Single Event Upset (SEU) Field Programmable Gate Array (FPGA) testing including FPGA test guidelines, Microsemi RTG4 heavy-ion results, Xilinx Kintex-UltraScale heavy-ion results, Xilinx UltraScale+ single event effect (SEE) test plans, development of a new methodology for characterizing SEU system response, and NEPP involvement with FPGA security and trust.
The electronics system for the LBNL positron emission mammography (PEM) camera

NASA Astrophysics Data System (ADS)

Moses, W. W.; Young, J. W.; Baker, K.; Jones, W.; Lenox, M.; Ho, M. H.; Weng, M.

2001-06-01

Describes the electronics for a high-performance positron emission mammography (PEM) camera. It is based on the electronics for a human brain positron emission tomography (PET) camera (the Siemens/CTI HRRT), modified to use a detector module that incorporates a photodiode (PD) array. An application-specified integrated circuit (ASIC) services the photodetector (PD) array, amplifying its signal and identifying the crystal of interaction. Another ASIC services the photomultiplier tube (PMT), measuring its output and providing a timing signal. Field-programmable gate arrays (FPGAs) and lookup RAMs are used to apply crystal-by-crystal correction factors and measure the energy deposit and the interaction depth (based on the PD/PMT ratio). Additional FPGAs provide event multiplexing, derandomization, coincidence detection, and real-time rebinning. Embedded PC/104 microprocessors provide communication, real-time control, and configure the system. Extensive use of FPGAs make the overall design extremely flexible, allowing many different functions (or design modifications) to be realized without hardware changes. Incorporation of extensive onboard diagnostics, implemented in the FPGAs, is required by the very high level of integration and density achieved by this system.
A Survey on FPGA-Based Sensor Systems: Towards Intelligent and Reconfigurable Low-Power Sensors for Computer Vision, Control and Signal Processing

PubMed Central

García, Gabriel J.; Jara, Carlos A.; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M.; Torres, Fernando

2014-01-01

The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field. PMID:24691100
A survey on FPGA-based sensor systems: towards intelligent and reconfigurable low-power sensors for computer vision, control and signal processing.

PubMed

García, Gabriel J; Jara, Carlos A; Pomares, Jorge; Alabdo, Aiman; Poggi, Lucas M; Torres, Fernando

2014-03-31

The current trend in the evolution of sensor systems seeks ways to provide more accuracy and resolution, while at the same time decreasing the size and power consumption. The use of Field Programmable Gate Arrays (FPGAs) provides specific reprogrammable hardware technology that can be properly exploited to obtain a reconfigurable sensor system. This adaptation capability enables the implementation of complex applications using the partial reconfigurability at a very low-power consumption. For highly demanding tasks FPGAs have been favored due to the high efficiency provided by their architectural flexibility (parallelism, on-chip memory, etc.), reconfigurability and superb performance in the development of algorithms. FPGAs have improved the performance of sensor systems and have triggered a clear increase in their use in new fields of application. A new generation of smarter, reconfigurable and lower power consumption sensors is being developed in Spain based on FPGAs. In this paper, a review of these developments is presented, describing as well the FPGA technologies employed by the different research groups and providing an overview of future research within this field.
From OO to FPGA:

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kou, Stephen; Palsberg, Jens; Brooks, Jeffrey

Consumer electronics today such as cell phones often have one or more low-power FPGAs to assist with energy-intensive operations in order to reduce overall energy consumption and increase battery life. However, current techniques for programming FPGAs require people to be specially trained to do so. Ideally, software engineers can more readily take advantage of the benefits FPGAs offer by being able to program them using their existing skills, a common one being object-oriented programming. However, traditional techniques for compiling object-oriented languages are at odds with today's FPGA tools, which support neither pointers nor complex data structures. Open until now is the problem of compiling an object-oriented language to an FPGA in a way that harnesses this potential for huge energy savings. In this paper, we present a new compilation technique that feeds into an existing FPGA tool chain and produces FPGAs with up to almost an order of magnitude in energy savings compared to a low-power microprocessor while still retaining comparable performance and area usage.
Generation of Custom DSP Transform IP Cores: Case Study Walsh-Hadamard Transform

DTIC Science & Technology

2002-09-01

mathematics and hardware design What I know: Finite state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing...state machine Pipelining Systolic array … What I know: Linear algebra Digital signal processing Adaptive filter theory … A math guy A hardware engineer...Synthesis Technology Libary Bit-width (8) HF factor (1,2,3,6) VF factor (1,2,4, ... 32) Xilinx FPGA Place&Route Xilinx FPGA Place&Route Performance
Open Component Portability Infrastructure (OPENCPI)

DTIC Science & Technology

2009-11-01

Disk Drive 7 1 www.antec.com P182 $120. ATX Mid Tower Computer Case 8 1 www.xilinx.com HW-V5-ML555-G $2200. Xilinx ML555 V5 Dev Kit Notes: Cost...s/ GEORGE RAMSEYER EDWARD J. JONES, Deputy Chief Work Unit Manager Advanced Computing ...uniquely positioned to meet the goals of the Software Systems Stockroom (S3) since in some sense component-based systems are computer -science’s
Design of the SLAC RCE Platform: A General Purpose ATCA Based Data Acquisition System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Herbst, R.; Claus, R.; Freytag, M.

2015-01-23

The SLAC RCE platform is a general purpose clustered data acquisition system implemented on a custom ATCA compliant blade, called the Cluster On Board (COB). The core of the system is the Reconfigurable Cluster Element (RCE), which is a system-on-chip design based upon the Xilinx Zynq family of FPGAs, mounted on custom COB daughter-boards. The Zynq architecture couples a dual core ARM Cortex A9 based processor with a high performance 28nm FPGA. The RCE has 12 external general purpose bi-directional high speed links, each supporting serial rates of up to 12Gbps. 8 RCE nodes are included on a COB, each with a 10Gbps connection to an on-board 24-port Ethernet switch integrated circuit. The COB is designed to be used with a standard full-mesh ATCA backplane allowing multiple RCE nodes to be tightly interconnected with minimal interconnect latency. Multiple shelves can be clustered using the front panel 10Gbps connections. The COB also supports local and inter-blade timing and trigger distribution. An experiment specific Rear Transition Module adapts the 96 high speed serial links to specific experiments and allows an experiment-specific timing and busy feedback connection. This coupling of processors with a high performance FPGA fabric in a low latency, multiple node cluster allows high speed data processing that can be easily adapted to any physics experiment. RTEMS and Linux are both ported to the module. The RCE has been used or is the baseline for several current and proposed experiments (LCLS, HPS, LSST, ATLAS-CSC, LBNE, DarkSide, ILC-SiD, etc).
DOE Office of Scientific and Technical Information (OSTI.GOV)

Underwood, Keith D; Ulmer, Craig D.; Thompson, David

Field programmable gate arrays (FPGAs) have been used as alternative computational devices for over a decade; however, they have not been used for traditional scientific computing due to their perceived lack of floating-point performance. In recent years, there has been a surge of interest in alternatives to traditional microprocessors for high performance computing. Sandia National Labs began two projects to determine whether FPGAs would be a suitable alternative to microprocessors for high performance scientific computing and, if so, how they should be integrated into the system. We present results that indicate that FPGAs could have a significant impact on future systems. FPGAs have the potential to have order of magnitude levels of performance wins on several key algorithms; however, there are serious questions as to whether the system integration challenge can be met. Furthermore, there remain challenges in FPGA programming and system level reliability when using FPGA devices. Acknowledgment: Arun Rodrigues provided valuable support and assistance in the use of the Structural Simulation Toolkit within an FPGA context. Curtis Janssen and Steve Plimpton provided valuable insights into the workings of two Sandia applications (MPQC and LAMMPS, respectively).
DOE Office of Scientific and Technical Information (OSTI.GOV)

Citterio, M.; Camplani, A.; Cannon, M.

SRAM-based Field Programmable Gate Arrays (FPGAs) have rarely been used in High Energy Physics (HEP) due to their sensitivity to radiation. The latest generation of commercial FPGAs, based on 28 nm feature size and on Silicon On Insulator (SOI) technologies, is more tolerant to radiation, to the level that their use in front-end electronics is now feasible. FPGAs provide re-programmability, high-speed computation and fast data transmission through the embedded serial transceivers. They could replace custom application specific integrated circuits in front-end electronics in locations with a moderate radiation field. Finally, the use of an FPGA in HEP experiments is only limited by our ability to mitigate single event effects induced by the high energy hadrons present in the radiation field.
Defense Industrial Base Assessment: U.S. Integrated Circuit Design and Fabrication Capability

DTIC Science & Technology

2009-05-01

in the U.S for the period 2003-2006, with projections to 2011.6 The resulting draft OTE survey was field tested for accuracy and usability with a...custom application specific integrated circuits (ASICs) to field programmable gate arrays (FPGAs). Companies of all sizes can manufacture these IC...able to design one-time Electronically Programmable Gate Arrays (EPGAs) while nine are able to design Field Programmable Gate Arrays (FPGAs). Eight
FPGA applications for single dish activity at Medicina radio telescopes

NASA Astrophysics Data System (ADS)

Bartolini, M.; Naldi, G.; Mattana, A.; Maccaferri, A.; De Biaggi, M.

FPGA technologies have been gaining major attention in recent years in the field of radio astronomy. At the Medicina radio telescopes, FPGAs have been used over the last ten years for a number of purposes. In this article we examine the applications developed and installed for the Medicina Single Dish 32m Antenna: these range from high performance digital signal processing to instrument control developed on top of smaller FPGAs.
FPGA wavelet processor design using language for instruction-set architectures (LISA)

NASA Astrophysics Data System (ADS)

Meyer-Bäse, Uwe; Vera, Alonzo; Rao, Suhasini; Lenk, Karl; Pattichis, Marios

2007-04-01

The design of a microprocessor is a long, tedious, and error-prone task consisting of typically three design phases: architecture exploration, software design (assembler, linker, loader, profiler), architecture implementation (RTL generation for FPGA or cell-based ASIC) and verification. The Language for Instruction-Set Architectures (LISA) allows a microprocessor to be modeled not only from its instruction set but also from an architecture description, including pipelining behavior, which gives design and development tool consistency over all levels of the design. To explore the capability of the LISA processor design platform, a.k.a. CoWare Processor Designer, we present in this paper three microprocessor designs that implement an 8/8 wavelet transform processor that is typically used in today's FBI fingerprint compression scheme. We have designed a 3-stage pipelined 16-bit RISC processor (NanoBlaze). Although RISC μPs are usually considered "fast" processors due to design concepts like constant instruction word size, deep pipelines and many general purpose registers, it turns out that DSP operations consume essential processing time in a RISC processor. In a second step we have used design principles from programmable digital signal processors (PDSPs) to improve the throughput of the DWT processor. A multiply-accumulate operation along with an indirect addressing operation were the key to achieving higher throughput. A further improvement is possible with today's FPGA technology. Today's FPGAs offer a large number of embedded array multipliers and it is now feasible to design a "true" vector processor (TVP). A multiplication of two vectors can be done in just one clock cycle with our TVP, a complete scalar product in two clock cycles. Code profiling and Xilinx FPGA ISE synthesis results are provided that demonstrate the essential improvement that a TVP has compared with traditional RISC or PDSP designs.
Risk Reduction for Use of Complex Devices in Space Projects

NASA Technical Reports Server (NTRS)

Berg, Melanie; Poivey, Christian; Friendlich, Mark; Petrick, Dave; LaBel, Kenneth; Stansberry, Scott

2007-01-01

We present guidelines to reduce risk to an acceptable level when using complex devices in space applications. Application to the Virtex 4 Field Programmable Gate Array (FPGA) on the Express Logistic Carrier (ELC) project is presented.
In-situ FPGA debug driven by on-board microcontroller

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Zachary Kent

2009-01-01

Often we are faced with the situation that the behavior of a circuit changes in an unpredictable way when the chassis cover is attached or the system is not easily accessible. For instance, in a deployed environment, such as space, hardware can malfunction in unpredictable ways. What can a designer do to ascertain the cause of the problem? Register interrogations only go so far, and sometimes the problem being debugged is register transactions themselves, or the problem lies in FPGA programming. This work provides a solution to this; namely, the ability to drive a JTAG chain via an on-board microcontroller and use a simple clone of the Xilinx Chipscope core without a Xilinx JTAG cable or any external interfaces required. We have demonstrated the functionality of the prototype system using a Xilinx Spartan 3E FPGA and a Microchip PIC18j2550 microcontroller. This paper will discuss the implementation details as well as present case studies describing how the tools have aided satellite hardware development.
Implementing a Microcontroller Watchdog with a Field-Programmable Gate Array (FPGA)

NASA Technical Reports Server (NTRS)

Straka, Bartholomew

2013-01-01

Reliability is crucial to safety. Redundancy of important system components greatly enhances reliability and hence safety. Field-Programmable Gate Arrays (FPGAs) are useful for monitoring systems and handling the logic necessary to keep them running with minimal interruption when individual components fail. A complete microcontroller watchdog with logic for failure handling can be implemented in a hardware description language (HDL). HDL-based designs are vendor-independent and can be used on many FPGAs with low overhead.

Robust Fuzzy Controllers Using FPGAs

NASA Technical Reports Server (NTRS)

Monroe, Author Gene S., Jr.

2007-01-01

Electro-mechanical device controllers typically come in one of three forms: Proportional (P), Proportional Derivative (PD), and Proportional Integral Derivative (PID). Two methods of control are discussed in this paper: (1) the classical technique, which requires an in-depth mathematical use of poles and zeros, and (2) the fuzzy logic (FL) technique, which is similar to the way humans think and make decisions. FL controllers are used in multiple industries; examples include control engineering, computer vision, pattern recognition, statistics, and data analysis. Presented is a study on the development of a PD motor controller written in very high speed hardware description language (VHDL) and implemented in FL. Four distinct abstractions compose the FL controller: the fuzzifier, the rule-base, the fuzzy inference system (FIS), and the defuzzifier. FL is similar to, but different from, Boolean logic, in that the output value may be equal to 0 or 1, but could also be equal to any decimal value between them. This controller is unique because of its VHDL implementation, which uses integer mathematics. To compensate for VHDL's inability to synthesize floating point numbers, a scale factor equal to 10^(N/4) is utilized, where N is equal to the data word size. The scaling factor shifts the decimal digits to the left of the decimal point for increased precision. PD controllers are ideal for use with servo motors, where position control is effective. This paper discusses control methods for motion-base platforms where a constant velocity equivalent to a spectral resolution of 0.25 cm^-1 is required; however, the control capability of this controller extends to various other platforms.
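The integer scaling described in this abstract can be sketched as follows. This is a minimal illustration assuming a 16-bit data word (so the scale factor is 10^(16/4) = 10,000) and a plain PD law; the fuzzy-logic stages of the paper are not modelled, and the names `to_fixed` and `pd_step` and the gain values are hypothetical.

```python
WORD_SIZE = 16                      # N: assumed data word size in bits
SCALE = 10 ** (WORD_SIZE // 4)      # 10^(N/4) = 10_000 for N = 16

def to_fixed(value: float) -> int:
    """Shift decimal digits left of the decimal point: 0.25 -> 2500."""
    return round(value * SCALE)

def pd_step(error: int, prev_error: int, kp: int, kd: int) -> int:
    """One PD update using integer arithmetic only.

    All arguments are scaled by SCALE. The products carry SCALE twice,
    so one floor division brings the result back to a single SCALE.
    """
    return (kp * error + kd * (error - prev_error)) // SCALE

kp = to_fixed(1.5)
kd = to_fixed(0.25)
output = pd_step(to_fixed(0.2), to_fixed(0.1), kp, kd)   # scaled 0.325
```

Note that the intermediate products are much larger than the operands, so a hardware version would need a wider word for them than for the inputs.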
Rapid Corner Detection Using FPGAs

NASA Technical Reports Server (NTRS)

Morfopoulos, Arin C.; Metz, Brandon C.

2010-01-01

In order to perform precision landings for space missions, a control system must be accurate to within ten meters. Feature detection applied against images taken during descent and correlated against the provided base image is computationally expensive and requires tens of seconds of processing time for just one image, while the goal is to process multiple images per second. To solve this problem, this algorithm takes that processing load from the central processing unit (CPU) and gives it to a reconfigurable field programmable gate array (FPGA), which is able to compute data in parallel at very high clock speeds. The workload of the processor then becomes simpler: an image is read from a camera and transferred into the FPGA, and the results are read back from the FPGA. The Harris Corner Detector uses the determinant and trace to find a corner score, with each step of the computation occurring on independent clock cycles. Essentially, the image is converted into an x and y derivative map. Once three lines of pixel information have been queued up, valid pixel derivatives are clocked into the product and averaging phase of the pipeline. Each x and y derivative is squared, the product of the x and y derivatives is formed, and each value is stored in a WxN size buffer, where W represents the size of the integration window and N is the width of the image. In this particular case, a window size of 5 was chosen, and the image is 640 × 480. Over a WxN size window, an equidistance Gaussian is applied (to bring out the stronger corners), and then each value in the entire window is summed and stored. The required components of the equation are in place, and it is just a matter of taking the determinant and trace. It should be noted that the trace is being weighted by a constant k, a value that is found empirically to be within 0.04 to 0.15 (and in this implementation is 0.05).
The constant k determines the number of corners available to be compared against a threshold sigma to mark a valid corner. After a fixed delay from when the first pixel is clocked in (to fill the pipeline), a score is achieved after each successive clock. This score corresponds with an (x,y) location within the image. If the score is higher than the predetermined threshold sigma, then a flag is set high and the location is recorded.
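The score computation described above can be sketched in a few lines. This uses the standard Harris response, R = det(M) - k * trace(M)^2, where M is built from the windowed sums of the squared and cross derivatives. It is a software illustration only: uniform window weights stand in for the Gaussian, and the function names and the sample derivative values are assumptions, not the pipeline from the abstract.

```python
def window_sums(ix, iy):
    """Sum Ix^2, Iy^2 and Ix*Iy over the derivatives inside one window."""
    sxx = sum(x * x for x in ix)
    syy = sum(y * y for y in iy)
    sxy = sum(x * y for x, y in zip(ix, iy))
    return sxx, syy, sxy

def harris_score(sxx, syy, sxy, k=0.05):
    """Standard Harris response: determinant minus k times trace squared."""
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Gradients in both directions (corner-like) versus one direction (edge-like).
corner = harris_score(*window_sums([1, 0, 1, 0], [0, 1, 0, 1]))
edge = harris_score(*window_sums([1, 1, 1, 1], [0, 0, 0, 0]))
```

A corner-like window scores positive and an edge-like window scores negative, so comparing the score against a threshold, as the abstract describes, separates the two.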
DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather; Wirthlin, Michael

A variety of fault emulation systems have been created to study the effect of single-event effects (SEEs) in static random access memory (SRAM) based field-programmable gate arrays (FPGAs). These systems are useful for augmenting radiation-hardness assurance (RHA) methodologies for verifying the effectiveness of mitigation techniques; understanding error signatures and failure modes in FPGAs; and failure rate estimation. For radiation effects researchers, it is important that these systems properly emulate how SEEs manifest in FPGAs. If the fault emulation system does not mimic the radiation environment, the system will generate erroneous data and incorrect predictions of the behavior of the FPGA in a radiation environment. Validation determines whether the emulated faults are reasonable analogs to the radiation-induced faults. In this study we present methods for validating fault emulation systems and provide several examples of validated FPGA fault emulation systems.
Digital intermediate frequency QAM modulator using parallel processing

DOEpatents

Pao, Hsueh-Yuan [Livermore, CA; Tran, Binh-Nien [San Ramon, CA

2008-05-27

The digital Intermediate Frequency (IF) modulator applies to various modulation types and offers a simple and low cost method to implement a high-speed digital IF modulator using field programmable gate arrays (FPGAs). The architecture eliminates multipliers and sequential processing by storing the pre-computed modulated cosine and sine carriers in ROM look-up-tables (LUTs). The high-speed input data stream is parallel processed using the corresponding LUTs, which reduces the main processing speed, allowing the use of low cost FPGAs.
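The look-up-table idea can be illustrated in software. The sketch below is a simplified model under stated assumptions (16-QAM amplitude levels, one carrier cycle of 8 samples per symbol, floating-point tables) and is not the patented architecture: the amplitude-scaled cosine and sine carriers are computed once, so producing output samples needs only table reads and a subtraction. All names and constants are hypothetical.

```python
import math

SAMPLES_PER_SYMBOL = 8            # assumed: one carrier cycle per symbol
LEVELS = (-3, -1, 1, 3)           # assumed 16-QAM amplitude levels

# "ROM" contents: one row of pre-modulated carrier samples per level.
COS_LUT = {a: [a * math.cos(2 * math.pi * n / SAMPLES_PER_SYMBOL)
               for n in range(SAMPLES_PER_SYMBOL)] for a in LEVELS}
SIN_LUT = {a: [a * math.sin(2 * math.pi * n / SAMPLES_PER_SYMBOL)
               for n in range(SAMPLES_PER_SYMBOL)] for a in LEVELS}

def modulate(symbols):
    """Map (I, Q) level pairs to IF samples I*cos - Q*sin via table reads."""
    out = []
    for i_level, q_level in symbols:
        cos_row = COS_LUT[i_level]
        sin_row = SIN_LUT[q_level]
        out.extend(c - s for c, s in zip(cos_row, sin_row))
    return out

wave = modulate([(3, -1), (1, 1)])
```

Since each symbol's row is independent of the others, several symbols could be looked up side by side, which is the kind of parallel processing the abstract credits with lowering the required clock rate.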
FPGA implementation of adaptive beamforming in hearing aids.

PubMed

Samtani, Kartik; Thomas, Jobin; Varma, G Abhinav; Sumam, David S; Deepu, S P

2017-07-01

Beamforming is a spatial filtering technique used in hearing aids to improve target sound reception by reducing interference from other directions. In this paper we propose improvements to an existing architecture for adaptive beamforming with an array of two omnidirectional microphones for hearing aid applications, and implement it on a Xilinx Artix 7 FPGA using VHDL and Xilinx Vivado® 2015.2. Nulls are introduced in particular directions by combining two fixed polar patterns. This combination can be adaptively controlled to steer the null in the direction of the noise. The beamform patterns and the improvements in SNR obtained from experiments in a conference room environment are analyzed.
Spaceborne Hybrid-FPGA System for Processing FTIR Data

NASA Technical Reports Server (NTRS)

Bekker, Dmitriy; Blavier, Jean-Francois L.; Pingree, Paula J.; Lukowiak, Marcin; Shaaban, Muhammad

2008-01-01

Progress has been made in a continuing effort to develop a spaceborne computer system for processing readout data from a Fourier-transform infrared (FTIR) spectrometer to reduce the volume of data transmitted to Earth. The approach followed in this effort, oriented toward reducing design time and reducing the size and weight of the spectrometer electronics, has been to exploit the versatility of recently developed hybrid field-programmable gate arrays (FPGAs) to run diverse software on embedded processors while also taking advantage of the reconfigurable hardware resources of the FPGAs.
SAD-Based Stereo Matching Using FPGAs

NASA Astrophysics Data System (ADS)

Ambrosch, Kristian; Humenberger, Martin; Kubinger, Wilfried; Steininger, Andreas

In this chapter we present a field-programmable gate array (FPGA) based stereo matching architecture. This architecture uses the sum of absolute differences (SAD) algorithm and is targeted at automotive and robotics applications. The disparity maps are calculated using 450×375 input images and a disparity range of up to 150 pixels. We discuss two different implementation approaches for the SAD and analyze their resource usage. Furthermore, block sizes ranging from 3×3 up to 11×11 and their impact on the consumed logic elements as well as on the disparity map quality are discussed. The stereo matching architecture enables a frame rate of up to 600 fps by calculating the data in a highly parallel and pipelined fashion. This way, a software solution optimized by using Intel's Open Source Computer Vision Library running on an Intel Pentium 4 with 3 GHz clock frequency is outperformed by a factor of 400.
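
As a software reference for the sum of absolute differences (SAD) matching cost named in the abstract above, the following minimal sketch computes a disparity map by exhaustive search. It is a sequential illustration under assumed conventions (rectified images as lists of rows, a left-image pixel at column x matching the right image at column x - d, ties resolved toward the smaller disparity); the function names are hypothetical, and the parallel, pipelined hardware structure described by the authors is not modelled.

```python
def sad(left, right, x, y, d, half):
    """Sum of absolute differences between the block centred at (x, y) in the
    left image and the block centred at (x - d, y) in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def disparity_map(left, right, block=3, max_disp=4):
    """Return, for every pixel with a full block around it, the disparity
    with the lowest SAD cost. Border pixels are left at 0."""
    half = block // 2
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_d, best_cost = 0, None
            for d in range(max_disp + 1):
                if x - d - half < 0:      # shifted block would leave the image
                    break
                cost = sad(left, right, x, y, d, half)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            disp[y][x] = best_d
    return disp
```

The cost of this search grows with the product of image size, block area and disparity range, which is why larger blocks and wider disparity ranges translate directly into more logic in a fully parallel implementation.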
An FPGA computing demo core for space charge simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Jinyuan; Huang, Yifei; /Fermilab

2009-01-01

In accelerator physics, space charge simulation requires a large amount of computing power. In a particle system, each calculation requires time- and resource-consuming operations such as multiplications, divisions, and square roots. Because of the flexibility of field programmable gate arrays (FPGAs), we implemented this task with efficient use of the available computing resources and completely eliminated non-calculating operations that are indispensable in regular micro-processors (e.g. instruction fetch, instruction decoding, etc.). We designed and tested a 16-bit demo core for computing Coulomb's force in an Altera Cyclone II FPGA device. To save resources, the inverse square-root cube operation in our design is computed using a memory look-up table addressed with the nine to ten most significant non-zero bits. At a 200 MHz internal clock, our demo core reaches a throughput of 200 M pairs/s/core, faster than a typical 2 GHz micro-processor by about a factor of 10. Temperature and power consumption of FPGAs were also lower than those of micro-processors. Fast and convenient, FPGAs can serve as alternatives to time-consuming micro-processors for space charge simulation.
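
One plausible reading of the table look-up described in the abstract above is sketched here; it is an interpretation, not the authors' design. The assumption is that the squared distance r2 is shifted right by an even number of bits until nine or ten significant bits remain, the table returns m^(-3/2) for that mantissa m, and the result is rescaled by a power of two. Python floats stand in for the fixed-point values a 16-bit core would use, and all names are hypothetical.

```python
# Table of m ** -1.5 for every possible 10-bit address (entry 0 is unused).
INV_CUBE_LUT = [0.0] + [m ** -1.5 for m in range(1, 1024)]


def inv_r_cubed(r2):
    """Approximate r2 ** -1.5, i.e. 1 / r**3, for a positive integer r2 = r**2.

    r2 is reduced to a mantissa below 1024 by an even right shift, so large
    inputs keep nine or ten significant bits. Since
    (m * 2**shift) ** -1.5 == m ** -1.5 * 2 ** (-3 * shift / 2),
    an even shift makes the correction an exact power of two.
    """
    if r2 <= 0:
        raise ValueError("r2 must be a positive integer")
    shift = 0
    while (r2 >> shift) >= 1024:
        shift += 2
    mantissa = r2 >> shift
    return INV_CUBE_LUT[mantissa] * 2.0 ** (-3 * (shift // 2))


def coulomb_force_component(q1q2, delta, r2):
    """One Cartesian component of q1*q2 * delta / r**3 (constants omitted)."""
    return q1q2 * delta * inv_r_cubed(r2)
```

With at least nine significant bits retained, truncating the low bits changes r2 by less than one part in 256, so the relative error of the looked-up value stays below about 0.6 percent in this sketch.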
Python based high-level synthesis compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radosław; Pozniak, Krzysztof; Romaniuk, Ryszard

2014-11-01

This paper presents a Python-based high-level synthesis (HLS) compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and maps it to VHDL. FPGAs combine many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible and can be reconfigured over the lifetime of the system. FPGAs also have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Creating parallel programs implemented in FPGAs is not trivial. This article describes the design, implementation and first results of the Python-based compiler.
FPGA implementation of bit controller in double-tick architecture

NASA Astrophysics Data System (ADS)

Kobylecki, Michał; Kania, Dariusz

2017-11-01

This paper presents a comparison of two original architectures of programmable bit controllers built on FPGAs. Programmable Logic Controllers (which include, among other things, programmable bit controllers) built on FPGAs provide an efficient alternative to controllers based on microprocessors, which are expensive and often too slow. The presented and compared methods allow for the efficient implementation of any bit control algorithm written in the Ladder Diagram language into the programmable logic system in accordance with IEC61131-3. In both cases, we have compared the effect of the applied architecture on the performance of executing the same bit control program in relation to its own size.
Remotely Powered Reconfigurable Receiver for Extreme Environment Sensing Platforms

NASA Technical Reports Server (NTRS)

Sheldon, Douglas J.

2012-01-01

Wireless sensors connected in a local network offer revolutionary exploration capabilities, but the current solutions do not work in extreme environments of low temperatures (200K) and low to moderate radiation levels (<50 krad). These sensors (temperature, radiation, infrared, etc.) would need to operate outside the spacecraft/lander and be totally independent of power from the spacecraft/lander. Flash memory field-programmable gate arrays (FPGAs) are being used as the main signal processing and protocol generation platform in a new receiver. Flash-based FPGAs have been shown to have at least 100x lower standby power and 10x lower operating power when compared to normal SRAM-based FPGA technology.
Splash 2

NASA Technical Reports Server (NTRS)

Arnold, Jeffrey M.; Buell, Duncan A.; Kleinfelder, Walter J.

1993-01-01

Splash 2 is an attached processor system for Sun SPARC 2 workstations that uses Xilinx 4010 Field Programmable Gate Arrays (FPGAs) as its processing elements. The purpose of this paper is to describe Splash 2. The predecessor system, Splash 1, was designed to be used as a systolic processing system. Although it was very successful in that mode, there were many other applications that were not systolic but were nonetheless successful on Splash 1, and others that were not implemented successfully because of one or more architectural limitations, most notably I/O bandwidth and interprocessor communication. Although other uses to increase computational performance have been found for the Xilinx FPGAs that are Splash's processing elements, Splash is unique in its goal to be programmable in a general sense.
Application of Reconfigurable Computing Technology to Multi-KiloHertz Micro-Laser Altimeter (MMLA) Data Processing

NASA Technical Reports Server (NTRS)

Powell, Wesley; Dabney, Philip; Hicks, Edward; Pinchinat, Maxime; Day, John H. (Technical Monitor)

2002-01-01

limits the ability of the MMLA to operate in environments with sparse signal returns and a high number of noise returns. However, under an IR&D project, an FPGA-based, reconfigurable computing data system has been developed that has been demonstrated to perform real-time signal extraction under realistic operating constraints. This reconfigurable data system is based on the commercially available Firebird Board from Annapolis Microsystems. This PCI board consists of a Xilinx Virtex 2000E FPGA along with 36 MB of SRAM arranged in five separately addressable banks. This board is housed in a rackmount PC with dual 850 MHz Pentium processors running the Windows 2000 operating system. This data system performs all signal extraction in hardware on the Firebird, but also runs the existing "software based" signal extraction in tandem for comparison purposes. Using a relatively small amount of the Virtex XCV2000E resources, the reconfigurable data system has been demonstrated to improve performance over the existing software based data system by an order of magnitude. Performance could be further improved by employing parallelism. Ground testing and a preliminary engineering test flight aboard the NASA P3 have been performed, during which the reconfigurable data system has been demonstrated to match the results of the existing data system.
Biologically inspired collision avoidance system for unmanned vehicles

NASA Astrophysics Data System (ADS)

Ortiz, Fernando E.; Graham, Brett; Spagnoli, Kyle; Kelmelis, Eric J.

2009-05-01

In this project, we collaborate with researchers in the neuroscience department at the University of Delaware to develop a Field Programmable Gate Array (FPGA)-based embedded computer, inspired by the brains of small vertebrates (fish). The mechanisms of object detection and avoidance in fish have been extensively studied by our Delaware collaborators. The midbrain optic tectum is a biological multimodal navigation controller capable of processing input from all senses that convey spatial information, including vision, audition, touch, and lateral-line (water current sensing in fish). Unfortunately, computational complexity makes these models too slow for use in real-time applications. These simulations are run offline on state-of-the-art desktop computers, presenting a gap between the application and the target platform: a low-power embedded device. EM Photonics has expertise in developing high-performance computers based on commodity platforms such as graphics cards (GPUs) and FPGAs. FPGAs offer (1) high computational power, low power consumption and small footprint (in line with typical autonomous vehicle constraints), and (2) the ability to implement massively parallel computational architectures, which can be leveraged to closely emulate biological systems. Combining UD's brain modeling algorithms and the power of FPGAs, this computer enables autonomous navigation in complex environments, and further types of onboard neural processing in future applications.
FPGA-Based Front-End Electronics for Positron Emission Tomography

PubMed Central

Haselman, Michael; DeWitt, Don; McDougald, Wendy; Lewellen, Thomas K.; Miyaoka, Robert; Hauck, Scott

2010-01-01

Modern Field Programmable Gate Arrays (FPGAs) are capable of performing complex discrete signal processing algorithms with clock rates above 100 MHz. This, combined with FPGAs' low expense, ease of use, and selected dedicated hardware, makes them an ideal technology for a data acquisition system for positron emission tomography (PET) scanners. Our laboratory is producing a high-resolution, small-animal PET scanner that utilizes FPGAs as the core of the front-end electronics. For this next generation scanner, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilized to add significant signal processing power to produce higher resolution images. In this paper two such processes, sub-clock rate pulse timing and event localization, will be discussed in detail. We show that timing performed in the FPGA can achieve a resolution that is suitable for small-animal scanners, and will outperform the analog version given a low enough sampling period for the ADC. We will also show that the position of events in the scanner can be determined in real time using a statistical positioning based algorithm. PMID:21961085
CoNNeCT Baseband Processor Module Boot Code SoftWare (BCSW)

NASA Technical Reports Server (NTRS)

Yamamoto, Clifford K.; Orozco, David S.; Byrne, D. J.; Allen, Steven J.; Sahasrabudhe, Adit; Lang, Minh

2012-01-01

This software provides essential startup and initialization routines for the CoNNeCT baseband processor module (BPM) hardware upon power-up. A command and data handling (C&DH) interface is provided via 1553 and diagnostic serial interfaces to invoke operational, reconfiguration, and test commands within the code. The BCSW has features unique to the hardware it is responsible for managing. In this case, the CoNNeCT BPM is configured with an updated CPU (Atmel AT697 SPARC processor) and a unique set of memory and I/O peripherals that require customized software to operate. These features include configuration of new AT697 registers, interfacing to a new HouseKeeper with a flash controller interface, a new dual Xilinx configuration/scrub interface, and an updated 1553 remote terminal (RT) core. The BCSW is intended to provide a "safe" mode for the BPM when initially powered on or when an unexpected trap occurs, causing the processor to reset. The BCSW allows the 1553 bus controller in the spacecraft or payload controller to operate the BPM over 1553 to upload code; upload Xilinx bit files; perform rudimentary tests; read, write, and copy the non-volatile flash memory; and configure the Xilinx interface. Commands also exist over 1553 to cause the CPU to jump or call a specified address to begin execution of user-supplied code. This may be in the form of a real-time operating system, test routine, or specific application code to run on the BPM.
First light on a new fully digital camera based on SiPM for CTA SST-1M telescope

NASA Astrophysics Data System (ADS)

della Volpe, Domenico; Al Samarai, Imen; Alispach, Cyril; Bulik, Tomasz; Borkowski, Jerzy; Cadoux, Franck; Coco, Victor; Favre, Yannick; Grudzińska, Mira; Heller, Matthieu; Jamrozy, Marek; Kasperek, Jerzy; Lyard, Etienne; Mach, Emil; Mandat, Dusan; Michałowski, Jerzy; Moderski, Rafal; Montaruli, Teresa; Neronov, Andrii; Niemiec, Jacek; Njoh Ekoume, T. R. S.; Ostrowski, Michal; Paśko, Paweł; Pech, Miroslav; Rajda, Pawel; Rafalski, Jakub; Schovanek, Petr; Seweryn, Karol; Skowron, Krzysztof; Sliusar, Vitalii; Stawarz, Łukasz; Stodulska, Magdalena; Stodulski, Marek; Travnicek, Petr; Troyano Pujadas, Isaac; Walter, Roland; Zagdański, Adam; Zietara, Krzysztof

2017-08-01

The Cherenkov Telescope Array (CTA) will explore the Universe in the gamma-ray domain with unprecedented precision, covering an energy range from 50 GeV to more than 300 TeV. To cover such a broad range with a sensitivity ten times better than that of current instruments, different types of telescopes are needed: the Large Size Telescopes (LSTs), with a ~24 m diameter mirror, the Medium Size Telescopes (MSTs), with a ~12 m mirror, and the Small Size Telescopes (SSTs), with a ~4 m diameter mirror. The single mirror small size telescope (SST-1M), one of the proposed solutions to become part of the small-size telescopes of CTA, will be equipped with an innovative camera. The SST-1M has a Davies-Cotton optical design with a mirror dish of 4 m diameter and focal ratio 1.4, focussing the Cherenkov light produced in atmospheric showers onto a 90 cm wide hexagonal camera providing a FoV of 9 degrees. The camera is an innovative design based on silicon photomultipliers (SiPMs) and adopting a fully digital trigger and readout architecture. The camera features 1296 custom designed large area hexagonal SiPMs coupled to hollow optical concentrators to achieve a pixel size of almost 2.4 cm. The SiPM is a custom design developed with Hamamatsu and, with its active area of almost 1 cm², is one of the largest monolithic SiPMs in existence. The optical concentrators are also innovative, being light funnels made of a polycarbonate substrate coated with a custom designed UV-enhancing coating. The analog signals coming from the SiPMs are fed into the fully digital readout electronics, where digital data are processed by high-speed FPGAs both for trigger and readout. The trigger logic, implemented in a Virtex 7 FPGA, uses the digital data to elaborate a trigger decision by matching data against predefined patterns. This approach is extremely flexible and allows improvements and continued evolution of the system. The prototype camera is being tested in laboratory prior to its installation
Super-Resolution in Plenoptic Cameras Using FPGAs

PubMed Central

Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime

2014-01-01

Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras, allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable gate array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes. PMID:24841246
Implementation of a watershed algorithm on FPGAs

NASA Astrophysics Data System (ADS)

Zahirazami, Shahram; Akil, Mohamed

1998-10-01

In this article we present an implementation of a watershed algorithm on a multi-FPGA architecture. The implementation is based on a hierarchical FIFO (H-FIFO), with a separate FIFO for each gray level. The gray-scale value of a pixel is taken as the altitude of the point, so the image is viewed as a relief. We proceed by a flooding step, as if the relief were immersed in a lake. The water rises, and where the waters of two different catchment basins meet, a separator, or `watershed', is constructed. This approach is data dependent, hence the processing time differs from image to image. The H-FIFO is used to guarantee the nature of the immersion, which requires two types of priority: all the points of altitude `n' are processed before any point of altitude `n + 1', and within one altitude the water propagates with a constant velocity in all directions from the source. The operator needs two images as input: the original image or its gradient, and the marker image. A classic way to construct the marker image is to build an image of minimal regions. Each minimal region has its own unique label. This label is the color of the water and is used to detect whether two different waters touch each other. The algorithm first fills the hierarchical FIFO with those neighbors of all the regions that are not yet colored. Next it fetches the first pixel from the first non-empty FIFO and treats this pixel. This pixel takes the color of its neighbor, and all its neighbors that are not already in the H-FIFO are put in their corresponding FIFOs. The process is over when the H-FIFO is empty. The result is a segmented and labeled image.
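
The flooding procedure in the abstract above can be sketched in software as follows. This is a minimal sequential illustration, not the multi-FPGA implementation: 4-connectivity, the use of label 0 for "not yet colored", and every function name are assumptions, and no explicit separator pixels are drawn (each pixel simply takes the color of an already colored neighbor, as the abstract describes).

```python
from collections import deque


def neighbours(y, x, h, w):
    """Yield the 4-connected neighbours of (y, x) that lie inside the image."""
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def watershed(image, markers):
    """Flood a gray-level image from labelled markers using a hierarchical FIFO.

    image   -- rows of non-negative integer gray levels (the altitude)
    markers -- rows of integer labels, 0 meaning "not yet colored"
    Returns a new label image; the inputs are not modified.
    """
    h, w = len(image), len(image[0])
    labels = [row[:] for row in markers]
    levels = max(max(row) for row in image) + 1
    hfifo = [deque() for _ in range(levels)]      # one FIFO per gray level
    queued = [[False] * w for _ in range(h)]

    def enqueue_uncolored_neighbours(y, x):
        for ny, nx in neighbours(y, x, h, w):
            if labels[ny][nx] == 0 and not queued[ny][nx]:
                hfifo[image[ny][nx]].append((ny, nx))
                queued[ny][nx] = True

    # Initial fill: uncolored neighbours of every marker region.
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 0:
                enqueue_uncolored_neighbours(y, x)

    while True:
        # Lowest altitude first; FIFO order within one altitude.
        level = next((g for g in range(levels) if hfifo[g]), None)
        if level is None:
            break                                  # H-FIFO empty: done
        y, x = hfifo[level].popleft()
        for ny, nx in neighbours(y, x, h, w):
            if labels[ny][nx] != 0:
                labels[y][x] = labels[ny][nx]      # take the neighbour's color
                break
        enqueue_uncolored_neighbours(y, x)
    return labels
```

For a one-row relief such as `[[0, 1, 2, 3, 2, 1, 0]]` with markers `[[1, 0, 0, 0, 0, 0, 2]]`, the two basins grow inward from both ends and meet at the peak pixel, which ends up with whichever colored neighbor is examined first.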
FPGAs in Space Environment and Design Techniques

NASA Technical Reports Server (NTRS)

Katz, Richard B.; Day, John H. (Technical Monitor)

2001-01-01

This viewgraph presentation gives an overview of Field Programmable Gate Arrays (FPGA) in the space environment and design techniques. Details are given on the effects of the space radiation environment, total radiation dose, single event upset, single event latchup, single event transient, antifuse technology and gate rupture, proton upsets and sensitivity, and loss of functionality.

Super-resolution in plenoptic cameras using FPGAs.

PubMed

Pérez, Joel; Magdaleno, Eduardo; Pérez, Fernando; Rodríguez, Manuel; Hernández, David; Corrales, Jaime

2014-05-16

Plenoptic cameras are a new type of sensor that extend the possibilities of current commercial cameras, allowing 3D refocusing or the capture of 3D depths. One of the limitations of plenoptic cameras is their limited spatial resolution. In this paper we describe a fast, specialized hardware implementation of a super-resolution algorithm for plenoptic cameras. The algorithm has been designed for field programmable gate array (FPGA) devices using VHDL (very high speed integrated circuit (VHSIC) hardware description language). With this technology, we obtain an acceleration of several orders of magnitude using its extremely high-performance signal processing capability through parallelism and pipeline architecture. The system has been developed using generics of the VHDL language. This allows a very versatile and parameterizable system. The system user can easily modify parameters such as data width, number of microlenses of the plenoptic camera, their size and shape, and the super-resolution factor. The speed of the algorithm in FPGA has been successfully compared with the execution using a conventional computer for several image sizes and different 3D refocusing planes.
Single Pass Streaming BLAST on FPGAs

PubMed Central

Herbordt, Martin C.; Model, Josh; Sukhwani, Bharat; Gu, Yongfeng; VanCourt, Tom

2008-01-01

Approximate string matching is fundamental to bioinformatics and has been the subject of numerous FPGA acceleration studies. We address issues with respect to FPGA implementations of both BLAST-based and dynamic-programming (DP) based methods. Our primary contribution is a new algorithm for emulating the seeding and extension phases of BLAST. This operates in a single pass through a database at streaming rate, and with no preprocessing other than loading the query string. Moreover, it emulates parameters tuned to maximum possible sensitivity with no slowdown. While current DP-based methods also operate at streaming rate, generating results can be cumbersome. We address this with a new structure for data extraction. We present results from several implementations showing order of magnitude acceleration over serial reference code. A simple extension assures compatibility with NCBI BLAST. PMID:19081828
Evolutionary Based Techniques for Fault Tolerant Field Programmable Gate Arrays

NASA Technical Reports Server (NTRS)

Larchev, Gregory V.; Lohn, Jason D.

2006-01-01

The use of SRAM-based Field Programmable Gate Arrays (FPGAs) is becoming more and more prevalent in space applications. Commercial-grade FPGAs are potentially susceptible to permanently debilitating Single-Event Latchups (SELs). Repair methods based on Evolutionary Algorithms may be applied to FPGA circuits to enable successful fault recovery. This paper presents the experimental results of applying such methods to repair four commonly used circuits (quadrature decoder, 3-by-3-bit multiplier, 3-by-3-bit adder, 440-7 decoder) into which a number of simulated faults have been introduced. The results suggest that evolutionary repair techniques can improve the process of fault recovery when used instead of or as a supplement to Triple Modular Redundancy (TMR), which is currently the predominant method for mitigating FPGA faults.
Fabless company mask technology approach: fabless but not fab-careless

NASA Astrophysics Data System (ADS)

Hisamura, Toshiyuki; Wu, Xin

2009-10-01

There are two different foundry-fabless working models with respect to masks. Some foundries have an in-house mask facility while others contract with merchant mask vendors. Significant progress has been made in both situations. Xilinx, as one of the pioneers among fabless semiconductor companies, has worked very closely with both merchant mask vendors and the mask facilities of foundries for many years, contributing to technology development and benefiting from this cooperation. Our involvement in manufacturing is driven by the following three elements. The first element is to understand the new fabrication and mask technologies and then find a suitable design/layout style to better utilize these new technologies and avoid potential risks. Because Xilinx has always been involved in the early stages of advanced technology nodes, this early understanding and adoption is especially important. The second element is time to market. Reduction in mask and wafer manufacturing cycle time can ensure faster time to market. The third element is quality. Commitment to quality is our highest priority for our customers. We have enough visibility on any manufacturing issues affecting device functionality. Good correlation has consistently been observed between FPGA speed uniformity and the poly mask Critical Dimension (CD) uniformity performance. To achieve the FPGA speed uniformity requirement, the manufacturing process as well as the mask and wafer CD uniformity has to be monitored. Xilinx works closely with the wafer foundries and mask suppliers to improve productivity and yield from the initial development stage of mask making operations. As an example, defect density reduction is one of the biggest challenges for a mask supplier in the development stage to meet the yield target satisfying the mask cost and mask turn-around-time (TAT) requirements. Historically, masks were considered to be defect free but at these advanced process nodes, that assumption no longer
77 FR 26045 - Notice Pursuant to the National Cooperative Research and Production Act of 1993-Accellera Systems...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-05-02

..., IRELAND; Freescale Semiconductor, Austin, TX; IBM, Hopewell Junction, NY; Jasper Design Automation..., San Jose, CA; Vayavya Labs, Belguam, INDIA; Verilab, Austin, TX; and Xilinx, Inc., San Jose, CA, have... DEPARTMENT OF JUSTICE Antitrust Division Notice Pursuant to the National Cooperative Research and...
Modeling and Analysis of a Constant Power Series-Loaded Resonant Converter

DTIC Science & Technology

2011-06-01

Paperwork Reduction Project (0704-0188) Washington DC 20503. 1 . AGENCY USE ONLY (Leave blank) 2 . REPORT DATE June 2011 3. REPORT TYPE AND DATES...CONVERTER THEORY .......................8 1 . Converter Topology .............................................................................8 2 . Modes of...25 1 . Fixed-Point Numbers. ........................................................................25 2 . Xilinx
FPGA-Based Pulse Pile-Up Correction With Energy and Timing Recovery.

PubMed

Haselman, M D; Pasko, J; Hauck, S; Lewellen, T K; Miyaoka, R S

2012-10-01

Modern field programmable gate arrays (FPGAs) are capable of performing complex discrete signal processing algorithms with clock rates well above 100 MHz. This, combined with FPGAs' low expense, ease of use, and selected dedicated hardware, makes them an ideal technology for a data acquisition system for a positron emission tomography (PET) scanner. The University of Washington is producing a high-resolution, small-animal PET scanner that utilizes FPGAs as the core of the front-end electronics. For this scanner, functions that are typically performed in dedicated circuits, or offline, are being migrated to the FPGA. This will not only simplify the electronics, but the features of modern FPGAs can be utilized to add significant signal processing power to produce higher quality images. In this paper we report on an all-digital pulse pile-up correction algorithm that has been developed for the FPGA. The pile-up mitigation algorithm will allow the scanner to run at higher count rates without incurring large data losses due to the overlapping of scintillation signals. This correction technique utilizes a reference pulse to extract timing and energy information for most pile-up events. Using pulses acquired from a Zecotech Photonics MAPD-N with an LFS-3 scintillator, we show that good timing and energy information can be achieved in the presence of pile-up utilizing a moderate amount of FPGA resources.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Dondero, Rachel Elizabeth

The increased use of Field Programmable Gate Arrays (FPGAs) in critical systems brings new challenges in securing the diversely programmable fabric from cyber-attacks. FPGAs are an inexpensive, efficient, and flexible alternative to Application Specific Integrated Circuits (ASICs), which are becoming increasingly expensive and impractical for low volume manufacturing as technology nodes continue to shrink. Unfortunately, FPGAs are not designed for high security applications, and their high flexibility lends itself to low security and vulnerability to malicious attacks. Similar to securing an ASIC's functionality, FPGA programmers can exploit the inherent randomness introduced into hardware structures during fabrication for security applications. Physically Unclonable Functions (PUFs) are one such solution that uses the die-specific variability in hardware fabrication for both secret key generation and verification. PUFs strive to be random, unique, and reliable. In recent years many PUF structures have been presented to try to maximize these three design constraints, reliability being the most difficult of the three to achieve. This thesis presents a new PUF structure that combines two elementary PUF concepts (a bi-stable SRAM PUF and a delay-based arbiter PUF) to create a PUF with increased reliability, while maintaining both random and unique qualities. Properties of the new PUF will be discussed as well as the various design modifications that can be made to tweak the desired performance and overhead.
Field Programmable Gate Array Reliability Analysis Guidelines for Launch Vehicle Reliability Block Diagrams

NASA Technical Reports Server (NTRS)

Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.

2017-01-01

Field Programmable Gate Array (FPGA) integrated circuits (ICs) are among the key electronic components in the complex avionic systems of today's sophisticated launch and space vehicles, largely due to their superb reprogrammable and reconfigurable capabilities combined with relatively low non-recurring engineering (NRE) costs and short design cycles. Consequently, FPGAs are prevalent ICs in communication protocols and control signal commands. This paper will identify reliability concerns and high level guidelines to estimate FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
Design and implementation of power efficient 10-bit dual port SRAM on 28 nm technology

NASA Astrophysics Data System (ADS)

Gulati, Anmol; Gupta, Ashutosh; Murgai, Shruti; Bhaskar, Lala

2016-03-01

In this paper, a 10-bit synchronous clock gated dual port RAM has been designed. The negative latch based clock gating technique has been employed to optimize the power of the design. The design has been implemented on an XC7K70T device (-3 speed grade) of the Kintex-7 FPGA family with Xilinx ISE Design Suite 14.7 using 28 nm technology. The design has been synthesized using Verilog HDL. We have been successful in achieving approximately 55% reduction in total clock power, 81.55% reduction in BRAM power, 82.65%, 0.07%, 1.04% and 11.31% reduction in static power, 72.32%, 38.60%, 68.74% and 71.97% reduction in dynamic power and 72.44%, 16.96%, 60.88% and 71.06% reduction in total supply power at 1 THz, 1 GHz, 100 GHz and 1000 GHz frequency respectively. The power of the device has been calculated using the XPower Analyzer tool of Xilinx ISE Design Suite 14.7.
Power efficient, clock gated multiplexer based full adder cell using 28 nm technology

NASA Astrophysics Data System (ADS)

Gupta, Ashutosh; Murgai, Shruti; Gulati, Anmol; Kumar, Pradeep

2016-03-01

Clock gating is a leading technique used for power saving. The full adder is one of the basic circuits found in most VLSI circuits. In this paper, a clock-gated multiplexer-based full adder cell is implemented in 28 nm technology. We have designed a full adder cell using a multiplexer with a gated clock without degrading the performance of the cell. We use a negative-latch circuit to generate the gated clock. This gated clock is used to control the multiplexer-based full adder cell. The circuit has been synthesized on a Kintex FPGA through Xilinx ISE Design Suite 14.7 using 28 nm technology in Verilog HDL. The circuit has been simulated in ModelSim 10.3c. The design is verified using SystemVerilog on QuestaSim in a UVM environment. The total power of the circuit has been reduced by 7.41% without degrading the performance of the original circuit. The power has been calculated using the XPower Analyzer tool of Xilinx ISE Design Suite 14.3.
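A multiplexer-based full adder can be sketched behaviorally as follows. The abstract does not describe the cell's internal structure, so this is one common textbook arrangement built from 2:1 multiplexers, not necessarily the authors' circuit; `mux2` and `full_adder_mux` are hypothetical names.

```python
from itertools import product


def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def full_adder_mux(a, b, cin):
    """Full adder built only from 2:1 multiplexers and inverted inputs."""
    p = mux2(a, b, 1 - b)        # a XOR b: pass b when a=0, NOT b when a=1
    s = mux2(p, cin, 1 - cin)    # sum = p XOR cin
    cout = mux2(p, a, cin)       # p=0 means a==b, so carry=a; otherwise carry=cin
    return s, cout


# Exhaustive truth table: (a, b, cin) -> (sum, carry)
truth_table = {bits: full_adder_mux(*bits) for bits in product((0, 1), repeat=3)}
```

The carry multiplexer reuses the propagate signal `p`, which is why this style of cell needs few gates.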
Susceptibility of Redundant Versus Singular Clock Domains Implemented in SRAM-Based FPGA TMR Designs

NASA Technical Reports Server (NTRS)

Berg, Melanie D.; LaBel, Kenneth A.; Pellish, Jonathan

2016-01-01

We present the challenges that arise when using redundant clock domains due to their clock-skew. Radiation data show that a singular clock domain (DTMR) provides an improved TMR methodology for SRAM-based FPGAs over redundant clocks.
Magnetics and Power System Upgrades for the Pegasus-U Experiment

NASA Astrophysics Data System (ADS)

Preston, R. C.; Bongard, M. W.; Fonck, R. J.; Lewicki, B. T.

2014-10-01

To support the missions of developing local helicity injection startup and exploiting advanced tokamak physics studies at near unity aspect ratio, the proposed Pegasus-U will include expanded magnetic systems and associated power supplies. A new centerstack increases the toroidal field seven times to 1 T and the volt-seconds by a factor of six while maintaining operation at an aspect ratio of 1.2. The poloidal field magnet system is expanded to support improved shape control and robust double or single null divertor operation at the full plasma current of 0.3 MA. An integrated digital control system based on Field Programmable Gate Arrays (FPGAs) provides active feedback control of all magnet currents. Implementation of the FPGAs is achieved with modular noise reducing electronics. The digital feedback controllers replace the existing analog systems and switch multiplexing technology. This will reduce noise sensitivity and allow the operational Ohmic power supply voltage to increase from 2100 V to its maximum capacity of 2400 V. The feedback controller replacement also allows frequency control for "freewheeling", that is, stopping the switching for a short interval and allowing the current to coast. The FPGAs assist in optimizing pulse length by having programmable switching events to minimize energy losses. They also allow for more efficient switching topologies that provide improved stored energy utilization, and support increasing the pulse length from 25 ms to 50-100 ms. Work supported by US DOE Grant DE-FG02-96ER54375.
Adaptation of the Electra Radio to Support Multiple Receive Channels

NASA Technical Reports Server (NTRS)

Satorius, Edgar H.; Shah, Biren N.; Bruvold, Kristoffer N.; Bell, David J.

2011-01-01

Proposed future Mars missions plan communication between multiple assets (rovers). This paper presents the results of a study carried out to assess the potential adaptation of the Electra radio to a multi-channel transceiver. The basic concept is a Frequency Division multiplexing (FDM) communications scheme wherein different receiver architectures are examined. Options considered include: (1) multiple IF slices, A/D and FPGAs each programmed with an Electra baseband modem; (2) common IF but multiple A/Ds and FPGAs and (3) common IF, single A/D and single or multiple FPGAs programmed to accommodate the FDM signals. These options represent the usual tradeoff between analog and digital complexity. Given the space application, a common IF is preferable; however, multiple users present dynamic range challenges (e.g., near-far constraints) that would favor multiple IF slices (Option 1). Vice versa, with a common IF and multiple A/Ds (Option 2), individual AGC control of the A/Ds would be an important consideration. Option 3 would require a common AGC control strategy and would entail multiple digital down conversion paths within the FPGA. In this paper, both FDM parameters as well as the different Electra design options will be examined. In particular, signal channel spacing as a function of user data rates and transmit powers will be evaluated. In addition, tradeoffs between the different Electra design options will be presented with the ultimate goal of defining an augmented Electra radio architecture for potential future missions.
Novel processor architecture for onboard infrared sensors

NASA Astrophysics Data System (ADS)

Hihara, Hiroki; Iwasaki, Akira; Tamagawa, Nobuo; Kuribayashi, Mitsunobu; Hashimoto, Masanori; Mitsuyama, Yukio; Ochi, Hiroyuki; Onodera, Hidetoshi; Kanbara, Hiroyuki; Wakabayashi, Kazutoshi; Tada, Munehiro

2016-09-01

The infrared sensor system is a major concern for inter-planetary missions that investigate the nature and the formation processes of planets and asteroids. The infrared sensor system requires signal preprocessing functions that compensate for the intensity of infrared image sensors to obtain high-quality data and a high compression ratio through the limited capacity of transmission channels to ground stations. For those implementations, combinations of Field Programmable Gate Arrays (FPGAs) and microprocessors are employed by AKATSUKI, the Venus Climate Orbiter, and HAYABUSA2, the asteroid probe. On the other hand, much smaller size and lower power consumption are demanded for future missions to accommodate more sensors. To fulfill this future demand, we developed a novel processor architecture which consists of reconfigurable cluster cores and programmable-logic cells with complementary atom switches. The complementary atom switches enable hardware programming without configuration memories, and thus soft errors on logic circuit connections are completely eliminated. This is a noteworthy advantage for space applications which cannot be found in conventional re-writable FPGAs. Power consumption of almost one-tenth that of conventional re-writable FPGAs is expected because of the elimination of configuration memories. The proposed processor architecture can be reconfigured by behavioral synthesis with a higher-level language specification. Consequently, compensation functions are implemented in a single chip without the program memories that accompany conventional microprocessors, while maintaining comparable performance. This enables us to embed a processor element on each infrared signal detector output channel.
The Effects of Race Conditions when Implementing Single-Source Redundant Clock Trees in Triple Modular Redundant Synchronous Architectures

NASA Technical Reports Server (NTRS)

Berg, Melanie D.; Label, Kenneth A.; Pellish, Jonathan

2016-01-01

We present the challenges that arise when using redundant clock domains due to their clock-skew. Heavy-ion radiation data show that a singular clock domain (DTMR) provides an improved TMR methodology for SRAM-based FPGAs over redundant clocks.
Pixel Perfect

DOE Office of Scientific and Technical Information (OSTI.GOV)

Perrine, Kenneth A.; Hopkins, Derek F.; Lamarche, Brian L.

2005-09-01

Biologists and computer engineers at Pacific Northwest National Laboratory have specified, designed, and implemented a hardware/software system for performing real-time, multispectral image processing on a confocal microscope. This solution is intended to extend the capabilities of the microscope, enabling scientists to conduct advanced experiments on cell signaling and other kinds of protein interactions. FRET (fluorescence resonance energy transfer) techniques are used to locate and monitor protein activity. In FRET, it is critical that spectral images be precisely aligned with each other despite disturbances in the physical imaging path caused by imperfections in lenses and cameras, and expansion and contraction of materials due to temperature changes. The central importance of this work is therefore automatic image registration. This runs in a framework that guarantees real-time performance (processing pairs of 1024x1024, 8-bit images at 15 frames per second) and enables the addition of other types of advanced image processing algorithms such as image feature characterization. The supporting system architecture consists of a Visual Basic front-end containing a series of on-screen interfaces for controlling various aspects of the microscope and a script engine for automation. One of the controls is an ActiveX component written in C++ for handling the control and transfer of images. This component interfaces with a pair of LVDS image capture boards and a PCI board containing a 6-million gate Xilinx Virtex-II FPGA. Several types of image processing are performed on the FPGA in a pipelined fashion, including the image registration. The FPGA offloads work that would otherwise need to be performed by the main CPU and has a guaranteed real-time throughput. Image registration is performed in the FPGA by applying a cubic warp on one image to precisely align it with the other image. Before each experiment, an automated calibration procedure is run in order to set
Spaceborne synthetic aperture radar signal processing using FPGAs

NASA Astrophysics Data System (ADS)

Sugimoto, Yohei; Ozawa, Satoru; Inaba, Noriyasu

2017-10-01

Synthetic Aperture Radar (SAR) imagery requires image reproduction through successive signal processing of received data before browsing images and extracting information. The received signal data records of the ALOS-2/PALSAR-2 are stored in the onboard mission data storage and transmitted to the ground. In order to balance the storage usage against the capacity of data transmission through the mission data communication networks, the operation duty of the PALSAR-2 is limited. This balance strongly relies on the network availability. The observation operations of the present spaceborne SAR systems are rigorously planned by simulating the mission data balance, given conflicting user demands. This problem should be solved such that we do not have to compromise the operations and the potential of the next-generation spaceborne SAR systems. One of the solutions is to compress the SAR data through onboard image reproduction and information extraction from the reproduced images. This is also beneficial for fast delivery of information products and event-driven observations by constellation. The Emergence Studio (Sōhatsu kōbō in Japanese) with Japan Aerospace Exploration Agency is developing evaluation models of an FPGA-based signal processing system for onboard SAR image reproduction. The model, namely the "Fast L1 Processor (FLIP)" developed in 2016, can reproduce a 10 m-resolution single look complex image (Level 1.1) from ALOS/PALSAR raw signal data (Level 1.0). The FLIP running at 200 MHz processes data twice as fast as CPU-based computing at 3.7 GHz. The image processed by the FLIP is in no way inferior to the image processed with 32-bit computing in MATLAB.
A SEU-Hard Flip-Flop for Antifuse FPGAs

NASA Technical Reports Server (NTRS)

Katz, R.; Wang, J. J.; McCollum, J.; Cronquist, B.; Chan, R.; Yu, D.; Kleyner, I.; Day, John H. (Technical Monitor)

2001-01-01

A single event upset (SEU)-hardened flip-flop has been designed and developed for antifuse Field Programmable Gate Array (FPGA) application. Design and application issues, testability, test methods, simulation, and results are discussed.
Measuring Input Thresholds on an Existing Board

NASA Technical Reports Server (NTRS)

Kuperman, Igor; Gutrich, Daniel G.; Berkun, Andrew C.

2011-01-01

A critical PECL (positive emitter-coupled logic) to Xilinx interface needed to be changed on an existing flight board. The new Xilinx input interface used a CMOS (complementary metal-oxide semiconductor) type of input, and the driver could meet its thresholds typically, but not in worst-case, according to the data sheet. The previous interface had been based on comparison with an external reference, but the CMOS input is based on comparison with an internal divider from the power supply. A way to measure what the exact input threshold was for this device for 64 inputs on a flight board was needed. The measurement technique allowed an accurate measurement of the voltage required to switch a Xilinx input from high to low for each of the 64 lines, while only probing two of them. Directly driving an external voltage was considered too risky, and tests done on any other unit could not be used to qualify the flight board. The two lines directly probed gave an absolute voltage threshold calibration, while data collected on the remaining 62 lines without probing gave relative measurements that could be used to identify any outliers. The PECL interface was forced to a long-period square wave by driving a saturated square wave into the ADC (analog to digital converter). The active pull-down circuit was turned off, causing each line to rise rapidly and fall slowly according to the input's weak pull-down circuitry. The fall time shows up as a change in the pulse width of the signal read by the Xilinx. This change in pulse width is a function of capacitance, pull-down current, and input threshold. Capacitance was known from the different trace lengths, plus a gate input capacitance, which is the same for all inputs. The pull-down current is the same for all inputs including the two that are probed directly. The data was combined, and the Excel solver tool was used to find input thresholds for the 62 lines. This was repeated over different supply voltages and

Global Educational Ecosystem: Case Study of a Partnership with K-12 Schools, Community Organizations, and Business

ERIC Educational Resources Information Center

Lewis, Donna S.

2010-01-01

The purpose of this study was to describe a collaborative partnership model known as the Global Educational Ecosystem, which involves three K-12 schools in Northern California, community organizations (representing science, technology, health, and arts), and Xilinx, Inc. from the perspectives of the leaders of the involved partner organizations in…
Three Realizations and Comparison of Hardware for Piezoresistive Tactile Sensors

PubMed Central

Vidal-Verdú, Fernando; Oballe-Peinado, Óscar; Sánchez-Durán, José A.; Castellanos-Ramos, Julián; Navas-González, Rafael

2011-01-01

Tactile sensors are basically arrays of force sensors that are intended to emulate the skin in applications such as assistive robotics. Local electronics are usually implemented to reduce errors and interference caused by long wires. Realizations based on standard microcontrollers, Programmable Systems on Chip (PSoCs) and Field Programmable Gate Arrays (FPGAs) have been proposed by the authors for the case of piezoresistive tactile sensors. The solution employing FPGAs is especially relevant since their performance is closer to that of Application Specific Integrated Circuits (ASICs) than that of the other devices. This paper presents an implementation of such an idea for a specific sensor. For the purpose of comparison, the circuitry based on the other devices is also made for the same sensor. This paper discusses the implementation issues, provides details regarding the design of the hardware based on the three devices and compares them. PMID:22163797
Algorithmic synthesis using Python compiler

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej

2015-09-01

This paper presents a Python-to-VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translates it to VHDL. FPGAs combine many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. Using a higher level of abstraction and a High-Level Synthesis compiler, implementation time can be reduced. The compiler has been implemented using the Python language. This article describes the design, implementation and results of the created tools.
Development of ROACH firmware for microwave multiplexed X-ray TES microcalorimeters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Madden, T. J.; Cecil, T. W.; Gades, L. M.

We are developing room temperature electronics based upon the ROACH platform for reading out microwave multiplexed X-ray TES. ROACH is an open-source hardware and software platform featuring a large Xilinx Field Programmable Gate Array (FPGA), Power PC processor, several 10GB Ethernet SFP+ interfaces, and a collection of daughter boards for analog signal generation and acquisition. The combination of a ROACH board, ADC/DAC conversion daughter boards, and hardware for RF mixing allows for the generation and capture of multiple RF tones for reading out microwave multiplexed x-ray TES microcalorimeters. The FPGA is used to generate multiple tones in base band, from 10MHz to 250MHz, which are subsequently mixed to RF in the multiple GHz range and sent through the microwave multiplexer. The tones are generated in the FPGA by storing a large lookup table in Quad Data Rate (QDR) SRAM modules and playing out the waveform to a DAC board. Once the signal has been modulated to RF, passed through the microwave multiplexer, and has been modulated back to base band, the signal is digitized by an ADC board. The tones are modulated to 0Hz by using an FPGA circuit consisting of a polyphase filter bank, several Xilinx FFT blocks, Xilinx CORDIC blocks (for converting to magnitude and phase), and a special phase accumulator circuit for mixing to exactly 0Hz. Upwards of 256 channels can be simultaneously captured and written into a bank of 256 First-In-First-Out (FIFO) memories, with each FIFO corresponding to a channel. Individual channel data can be further processed in the FPGA before being streamed through a 10GB Ethernet fiber-optic interface to a Linux system. The Linux system runs software written in Python and QT C++ for controlling the ROACH system, capturing data, and processing data.
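The lookup-table tone generation and per-tone magnitude/phase readout described above can be sketched in a few lines. This is a simplified illustration, not the ROACH firmware: the function names, the table length of 256 and the chosen bins are assumptions, a single-bin DFT stands in for the polyphase filter bank and FFT blocks, and `abs`/`cmath.phase` stand in for the CORDIC conversion.

```python
import cmath
import math


def tone_table(bins, n):
    """Lookup table holding a sum of tones.

    Each tone completes an integer number of cycles (its bin index)
    in n samples, so the table can be played out in a loop seamlessly.
    """
    return [
        sum(math.cos(2 * math.pi * k * i / n) for k in bins) / len(bins)
        for i in range(n)
    ]


def bin_readout(samples, k):
    """Mix bin k down to 0 Hz and return (magnitude, phase) of that tone."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return abs(acc) * 2 / n, cmath.phase(acc)


table = tone_table([5, 12, 30], 256)     # three hypothetical tones
mag, phase = bin_readout(table, 12)      # a bin that carries a tone
empty_mag, _ = bin_readout(table, 20)    # a bin with no tone
```

Placing every tone on an exact bin avoids spectral leakage between channels, which is one reason a looped table with integer cycle counts is convenient.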
Design Methodology of an Equalizer for Unipolar Non Return to Zero Binary Signals in the Presence of Additive White Gaussian Noise Using a Time Delay Neural Network on a Field Programmable Gate Array

PubMed Central

Pérez Suárez, Santiago T.; Travieso González, Carlos M.; Alonso Hernández, Jesús B.

2013-01-01

This article presents a design methodology for designing an artificial neural network as an equalizer for a binary signal. Firstly, the system is modelled in floating point format using Matlab. Afterward, the design is described for a Field Programmable Gate Array (FPGA) using fixed point format. The FPGA design is based on the System Generator from Xilinx, which is a design tool over Simulink of Matlab. System Generator allows one to design in a fast and flexible way. It uses low level details of the circuits and the functionality of the system can be fully tested. System Generator can be used to check the architecture and to analyse the effect of the number of bits on the system performance. Finally the System Generator design is compiled for the Xilinx Integrated System Environment (ISE) and the system is described using a hardware description language. In ISE the circuits are managed with high level details and physical performances are obtained. In the Conclusions section, some modifications are proposed to improve the methodology and to ensure portability across FPGA manufacturers.
Single-Event Effect (SEE) Survey of Advanced Reconfigurable Field Programmable Gate Arrays: NASA Electronic Parts and Packaging (NEPP) Program Office of Safety and Mission Assurance

NASA Technical Reports Server (NTRS)

Allen, Gregory

2011-01-01

The NEPP Reconfigurable Field-Programmable Gate Array (FPGA) task has been charged to evaluate reconfigurable FPGA technologies for use in space. Under this task, the Xilinx single-event-immune, reconfigurable FPGA (SIRF) XQR5VFX130 device was evaluated for SEE. Additionally, the Altera Stratix-IV and SiliconBlue iCE65 were screened for single-event latchup (SEL).
A Discussion of Using a Reconfigurable Processor to Implement the Discrete Fourier Transform

NASA Technical Reports Server (NTRS)

White, Michael J.

2004-01-01

This paper presents the design and implementation of the Discrete Fourier Transform (DFT) algorithm on a reconfigurable processor system. While highly applicable to many engineering problems, the DFT is an extremely computationally intensive algorithm. Consequently, the eventual goal of this work is to enhance the execution of a floating-point precision DFT algorithm by offloading the algorithm from the computing system. This computing system, within the context of this research, is a typical high performance desktop computer with an array of field programmable gate arrays (FPGAs). FPGAs are hardware devices that are configured by software to execute an algorithm. If it is desired to change the algorithm, the software is changed to reflect the modification, then downloaded to the FPGA, which is then itself modified. This paper will discuss methodology for developing the DFT algorithm to be implemented on the FPGA. We will discuss the algorithm, the FPGA code effort, and the results to date.
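To make the computational cost concrete, the DFT definition X[k] = sum over t of x[t]·exp(-2πi·k·t/N) can be written directly. This is a plain software sketch of the textbook definition, not the FPGA design from the paper; the function name `dft` is a hypothetical choice.

```python
import cmath


def dft(x):
    """Direct DFT: one complex multiply-accumulate per (k, t) pair."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A constant input has all of its energy in bin 0.
spectrum = dft([1.0, 1.0, 1.0, 1.0])
```

The double loop performs N² complex multiply-accumulates, so a 1024-point transform needs 1,048,576 of them. Each output bin is independent of the others, which is what makes the algorithm a natural candidate for parallel hardware.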
Performance evaluation of heart sound cancellation in FPGA hardware implementation for electronic stethoscope.

PubMed

Chao, Chun-Tang; Maneetien, Nopadon; Wang, Chi-Jo; Chiou, Juing-Shian

2014-01-01

This paper presents the design and evaluation of the hardware circuit for electronic stethoscopes with heart sound cancellation capabilities using field programmable gate arrays (FPGAs). The adaptive line enhancer (ALE) was adopted as the filtering methodology to reduce heart sound attributes from the breath sounds obtained via the electronic stethoscope pickup. FPGAs were utilized to implement the ALE functions in hardware to achieve near real-time breath sound processing. We believe that such an implementation is unprecedented and crucial toward a truly useful, standalone medical device in outpatient clinic settings. The implementation evaluation with one Altera Cyclone II-EP2C70F89 shows that the proposed ALE used 45% of the chip's resources. Experiments with the proposed prototype were made using a DE2-70 emulation board with recorded body signals obtained from online medical archives. Clear suppression was observed in our experiments from both the frequency domain and time domain perspectives.
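An adaptive line enhancer can be sketched as a linear predictor fed with a delayed copy of its own input. The abstract does not say which adaptation rule or parameters were used, so the LMS update, the tap count, the delay and the step size below are assumptions, and the sinusoid-plus-noise test signal is synthetic. In the stethoscope setting, the predictable output would presumably correspond to the quasi-periodic heart sound and the residual to the breath sound.

```python
import math
import random


def ale_lms(x, taps=16, delay=1, mu=0.01):
    """Adaptive line enhancer with an LMS weight update (assumed rule).

    Returns (y, e): y is the predictable, narrowband part of x and
    e = x - y is the broadband residual.
    """
    w = [0.0] * taps
    y_out, e_out = [], []
    for n in range(len(x)):
        # Reference vector: delayed past samples of the same input.
        u = [x[n - delay - k] if n - delay - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = x[n] - y
        # LMS: move the weights along the instantaneous error gradient.
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
        y_out.append(y)
        e_out.append(e)
    return y_out, e_out


def mse(a, b, start):
    """Mean squared difference between a and b from index start onward."""
    return sum((p - q) ** 2 for p, q in zip(a[start:], b[start:])) / (len(a) - start)


random.seed(0)
N = 4000
periodic = [math.sin(2 * math.pi * 0.05 * n) for n in range(N)]
broadband = [random.uniform(-0.5, 0.5) for _ in range(N)]
mixed = [p + b for p, b in zip(periodic, broadband)]
y, e = ale_lms(mixed)
```

The delay decorrelates the broadband component between the input and the reference, so the filter can only learn to predict the periodic part. Each tap needs a multiply and an add per sample, which maps onto parallel multipliers in an FPGA.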
Performance Evaluation of Heart Sound Cancellation in FPGA Hardware Implementation for Electronic Stethoscope

PubMed Central

Chao, Chun-Tang

2014-01-01

This paper presents the design and evaluation of the hardware circuit for electronic stethoscopes with heart sound cancellation capabilities using field programmable gate arrays (FPGAs). The adaptive line enhancer (ALE) was adopted as the filtering methodology to reduce heart sound attributes from the breath sounds obtained via the electronic stethoscope pickup. FPGAs were utilized to implement the ALE functions in hardware to achieve near real-time breath sound processing. We believe that such an implementation is unprecedented and crucial toward a truly useful, standalone medical device in outpatient clinic settings. The implementation evaluation with one Altera Cyclone II-EP2C70F89 shows that the proposed ALE used 45% of the chip's resources. Experiments with the proposed prototype were made using a DE2-70 emulation board with recorded body signals obtained from online medical archives. Clear suppression was observed in our experiments from both the frequency domain and time domain perspectives. PMID:24790573
A new FPGA architecture suitable for DSP applications

NASA Astrophysics Data System (ADS)

Liyun, Wang; Jinmei, Lai; Jiarong, Tong; Pushan, Tang; Xing, Chen; Xueyan, Duan; Liguang, Chen; Jian, Wang; Yuan, Wang

2011-05-01

A new FPGA architecture suitable for digital signal processing applications is presented. DSP modules can be inserted into FPGA conveniently with the proposed architecture, which is much faster when used in the field of digital signal processing compared with traditional FPGAs. An advanced 2-level MUX (multiplexer) is also proposed. With the added SLEEP MODE PASS to traditional 2-level MUX, static leakage is reduced. Furthermore, buffers are inserted at early returns of long lines. With this kind of buffer, the delay of the long line is improved by 9.8% while the area increases by 4.37%. The layout of this architecture has been taped out in standard 0.13 μm CMOS technology successfully. The die size is 6.3 × 4.5 mm2 with the QFP208 package. Test results show that performances of presented classical DSP cases are improved by 28.6%-302% compared with traditional FPGAs.
An acceleration framework for synthetic aperture radar algorithms

NASA Astrophysics Data System (ADS)

Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.

2017-04-01

Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR) are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
FASEA: A FPGA Acquisition System and Software Event Analysis for liquid scintillation counting

NASA Astrophysics Data System (ADS)

Steele, T.; Mo, L.; Bignell, L.; Smith, M.; Alexiev, D.

2009-10-01

The FASEA (FPGA based Acquisition and Software Event Analysis) system has been developed to replace the MAC3 for coincidence pulse processing. The system uses a National Instruments Virtex 5 FPGA card (PXI-7842R) for data acquisition and a purpose developed data analysis software for data analysis. Initial comparisons to the MAC3 unit are included based on measurements of 89Sr and 3H, confirming that the system is able to accurately emulate the behaviour of the MAC3 unit.
Custom instruction set NIOS-based OFDM processor for FPGAs

NASA Astrophysics Data System (ADS)

Meyer-Bäse, Uwe; Sunkara, Divya; Castillo, Encarnacion; Garcia, Antonio

2006-05-01

Orthogonal frequency division multiplexing (OFDM) spread spectrum techniques, sometimes also called multi-carrier or discrete multi-tone modulation, are used in bandwidth-efficient communication systems in the presence of channel distortion. The benefits of OFDM are high spectral efficiency, resiliency to RF interference, and lower multi-path distortion. OFDM is the basis for the European digital audio broadcasting (DAB) standard, the global asymmetric digital subscriber line (ADSL) standard, the IEEE 802.11 5.8 GHz band standard, and ongoing development in wireless local area networks. The modulator and demodulator in an OFDM system can be implemented by use of a parallel bank of filters based on the discrete Fourier transform (DFT). When the number of subchannels is large (e.g. K > 25), the OFDM system is efficiently implemented by use of the fast Fourier transform (FFT) to compute the DFT. We have developed a custom FPGA-based Altera NIOS system to improve performance, programmability, and power consumption in mobile wireless systems. The overall gain observed for a 1024-point FFT ranges between a factor of 3 and 16, depending on the multiplier used by the NIOS processor. A careful optimization described in the appendix yields a performance gain of up to 77% when compared with our preliminary results.
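The FFT-based modulator and demodulator pair can be sketched as follows. This is a plain software illustration under stated assumptions, not the NIOS custom-instruction design: it uses a textbook recursive radix-2 FFT, an ideal channel with no cyclic prefix, and hypothetical names (`fft`, `ofdm_modulate`, `ofdm_demodulate`); the QPSK symbol pattern is arbitrary test data.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ofdm_modulate(symbols):
    """Map one symbol per subcarrier to time-domain samples (scaled inverse FFT)."""
    n = len(symbols)
    return [v / n for v in fft(symbols, inverse=True)]


def ofdm_demodulate(samples):
    """Recover the per-subcarrier symbols with a forward FFT."""
    return fft(samples)


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
tx_symbols = [qpsk[(3 * k) % 4] for k in range(1024)]   # 1024 subcarriers
rx_symbols = ofdm_demodulate(ofdm_modulate(tx_symbols))
```

For N = 1024 the radix-2 FFT uses (N/2)·log2(N) = 5120 butterflies instead of the 1,048,576 multiply-accumulates of a direct DFT, which is why the FFT is preferred once the number of subchannels is large.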
JTAG-based remote configuration of FPGAs over optical fibers

DOE PAGES

Deng, B.; Xu, H.; Liu, C.; ...

2015-01-28

In this study, a remote FPGA-configuration method based on JTAG extension over optical fibers is presented. The method takes advantage of commercial components and ready-to-use software such as iMPACT and does not require any hardware or software development. The method combines the advantages of the slow remote JTAG configuration and the fast local flash memory configuration. The method has been verified successfully and used in the Demonstrator of Liquid-Argon Trigger Digitization Board (LTDB) for the ATLAS liquid argon calorimeter Phase-I trigger upgrade. All components on the FPGA side are verified to meet the radiation tolerance requirements.
Initial Approaches for Discovery of Undocumented Functionality in FPGAs

DTIC Science & Technology

2017-03-01

commercial pressures such as IP protection, support cost, and time to market, modern COTS devices contain many functions that are not exposed to the... market pressures have increased, industry increasingly uses the current generation device to do trial runs of next-generation architecture features...the product of industry operating in a highly cost competitive market, and are not inserted with malicious intent, however, this does not preclude
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quinn, Heather; Robinson, William H.; Rech, Paolo

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for the hardware and software benchmarks.
Using benchmarks for radiation testing of microprocessors and FPGAs

DOE PAGES

Quinn, Heather; Robinson, William H.; Rech, Paolo; ...

2015-12-17

Performance benchmarks have been used over the years to compare different systems. These benchmarks can be useful for researchers trying to determine how changes to the technology, architecture, or compiler affect the system's performance. No such standard exists for systems deployed into high radiation environments, making it difficult to assess whether changes in the fabrication process, circuitry, architecture, or software affect reliability or radiation sensitivity. In this paper, we propose a benchmark suite for high-reliability systems that is designed for field-programmable gate arrays and microprocessors. As a result, we describe the development process and report neutron test data for the hardware and software benchmarks.
Clock and carrier recovery in high-speed coherent optical communication systems

NASA Astrophysics Data System (ADS)

Amado, Sofia B.; Ferreira, Ricardo; Costa, Pedro S.; Guiomar, Fernando P.; Ziaie, Somayeh; Teixeira, António L.; Muga, Nelson J.; Pinto, Armando N.

2014-08-01

In this paper, implementations of clock and carrier recovery in the digital domain are analyzed. Hardware implementation details, resource estimates and real-time results are presented. Analog-to-Digital Converters (ADCs) operating at 1.25 GSa/s and a Virtex-6 Field-Programmable Gate Array (FPGA) have been used, allowing the implementation of a real-time Quadrature Phase Shift Keying (QPSK) system operating at 1.25 Gb/s. Real-time operation is successfully demonstrated over 80 km of Standard Single Mode Fiber (SSMF).
An Architecture for Coexistence with Multiple Users in Frequency Hopping Cognitive Radio Networks

DTIC Science & Technology

2013-03-01

the base WARP system, a custom IP core written in VHDL, and the Virtex IV’s embedded PowerPC core with C code to implement the radio and hopset...shown in Appendix C as Figure C.2. All VHDL code necessary to implement this IP core is included in Appendix G. 69 Figure 3.19: FPGA bus structure...subsystem functionality. A total of 1,430 lines of VHDL code were implemented for this research. library ieee; use ieee.std_logic_1164.all; use
Design and implementation of low power clock gated 64-bit ALU on ultra scale FPGA

NASA Astrophysics Data System (ADS)

Gupta, Ashutosh; Murgai, Shruti; Gulati, Anmol; Kumar, Pradeep

2016-03-01

A 64-bit energy-efficient Arithmetic and Logic Unit (ALU) using a negative-latch-based clock gating technique is designed in this paper. The 64-bit ALU is built from multiplexer-based full adder cells. A negative-latch-based circuit generates the gated clock, which is used to control the multiplexer-based 64-bit ALU. The circuit has been synthesized on a Kintex FPGA through Xilinx ISE Design Suite 14.7 using 28 nm technology in Verilog HDL. The circuit has been simulated on Modelsim 10.3c. The design is verified using SystemVerilog on QuestaSim in a UVM environment. We have achieved 74.07%, 92.93% and 95.53% reduction in total clock power, 89.73%, 91.35% and 92.85% reduction in I/O power, 67.14%, 62.84% and 74.34% reduction in dynamic power and 25.47%, 29.05% and 46.13% reduction in total supply power at 20 MHz, 200 MHz and 2 GHz respectively. The power has been calculated using the XPower Analyzer tool of Xilinx ISE Design Suite 14.3.
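The latch-based gating idea in the abstract above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: it models the common arrangement in which an enable signal passes through a latch that is transparent while the clock is low, and the latched enable is ANDed with the clock. The function name `gated_clock` and the sampled-waveform representation are hypothetical and not taken from the paper.

```python
from typing import List


def gated_clock(clk: List[int], enable: List[int]) -> List[int]:
    """Behavioural model of latch-based clock gating (illustrative assumption).

    The latch is transparent while clk == 0 and holds its value while clk == 1,
    so a change of `enable` during the high phase cannot cut a clock pulse short.
    """
    latched = 0
    out = []
    for c, en in zip(clk, enable):
        if c == 0:
            latched = en          # latch follows enable during the low phase
        out.append(c & latched)   # AND gate produces the gated clock
    return out


if __name__ == "__main__":
    clk = [0, 1, 0, 1, 0, 1]
    enable = [1, 1, 0, 0, 1, 0]   # enable drops during the last high phase
    print(gated_clock(clk, enable))
```

In this model the last clock pulse is still delivered in full even though `enable` falls while the clock is high, which is the glitch-free property that usually motivates putting a latch in front of the AND gate.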

FPGA Implementation of Reed-Solomon Decoder for IEEE 802.16 WiMAX Systems using Simulink-Sysgen Design Environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bobrek, Miljko; Albright, Austin P

This paper presents an FPGA implementation of the Reed-Solomon decoder for use in IEEE 802.16 WiMAX systems. The decoder is based on the RS(255,239) code, and is additionally shortened and punctured according to the WiMAX specifications. A Simulink model based on the Sysgen library of Xilinx blocks was used for simulation and hardware implementation. Finally, simulation results and hardware implementation performance are presented.
FPGA Boot Loader and Scrubber

NASA Technical Reports Server (NTRS)

Wade, Randall S.; Jones, Bailey

2009-01-01

A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), reads back and verifies that code, reloads the code if an error is detected, and monitors the performance of the FPGA for errors in the presence of radiation. The program consists mainly of a set of VHDL files (wherein "VHDL" signifies "VHSIC Hardware Description Language" and "VHSIC" signifies "very-high-speed integrated circuit").
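The load, read back, verify and reload cycle described above can be sketched in a few lines. The actual program is a set of VHDL files; the Python below is only a hedged illustration of the control flow. `FakeFpga`, `scrub_once` and the byte-level upset model are hypothetical names invented for this sketch and do not correspond to any Xilinx interface.

```python
from typing import Callable


class FakeFpga:
    """Hypothetical stand-in for a device: configuration memory as a bytearray."""

    def __init__(self) -> None:
        self.config = bytearray()

    def load(self, image: bytes) -> None:
        self.config = bytearray(image)

    def read_back(self) -> bytes:
        return bytes(self.config)

    def upset(self, byte_index: int, bit: int) -> None:
        # Flip one configuration bit to mimic a radiation-induced upset.
        self.config[byte_index] ^= 1 << bit


def scrub_once(golden: bytes,
               read_back: Callable[[], bytes],
               load: Callable[[bytes], None]) -> bool:
    """Compare read-back data with the golden image; reload on mismatch.

    Returns True when an error was detected and the image was reloaded.
    """
    if read_back() != golden:
        load(golden)
        return True
    return False


if __name__ == "__main__":
    golden = bytes(range(16))
    dev = FakeFpga()
    dev.load(golden)
    dev.upset(3, 5)
    print("reloaded:", scrub_once(golden, dev.read_back, dev.load))
```

A real scrubber would typically run this check periodically and work on configuration frames rather than a whole image; both details are outside what the abstract states.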
Design and Implementation of Viterbi Decoder Using VHDL

NASA Astrophysics Data System (ADS)

Thakur, Akash; Chattopadhyay, Manju K.

2018-03-01

A digital design of a Viterbi decoder for a rate-½ convolutional encoder with constraint length k = 3 is presented in this paper. The design is coded in VHDL, then simulated and synthesized using XILINX ISE 14.7. Synthesis results show a maximum operating frequency of 100.725 MHz for the design. The memory requirement is lower than that of the conventional method.
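For readers unfamiliar with the algorithm, a hard-decision Viterbi decoder for a rate-½, constraint-length-3 code fits in a short software sketch. The generator polynomials (7, 5 in octal) are an assumption chosen because they are the textbook pair for this code size; the abstract does not state which polynomials the authors used, and the function names are illustrative.

```python
from typing import List

# Assumed generator polynomials (7 and 5 octal); not stated in the abstract.
GENERATORS = (0b111, 0b101)


def _parity(x: int) -> int:
    return bin(x).count("1") & 1


def encode(bits: List[int]) -> List[int]:
    """Rate-1/2 convolutional encoder with a 2-bit state (constraint length 3)."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state          # [current, previous, one before that]
        out += [_parity(reg & g) for g in GENERATORS]
        state = reg >> 1
    return out


def viterbi_decode(received: List[int], n_bits: int) -> List[int]:
    """Hard-decision Viterbi decoding using Hamming branch metrics."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]        # the encoder starts in state 0
    paths: List[List[int]] = [[], [], [], []]
    for i in range(n_bits):
        pair = received[2 * i:2 * i + 2]
        new_metrics = [inf] * 4
        new_paths: List[List[int]] = [[] for _ in range(4)]
        for state in range(4):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << 2) | state
                expected = [_parity(reg & g) for g in GENERATORS]
                nxt = reg >> 1
                m = metrics[state] + sum(e != r for e, r in zip(expected, pair))
                if m < new_metrics[nxt]:  # keep the survivor path per state
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])
    return paths[best]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
    coded = encode(message)
    coded[3] ^= 1                        # inject a single channel error
    print(viterbi_decode(coded, len(message)) == message)
```

This version stores whole survivor paths for clarity. Hardware designs usually use traceback or register-exchange memories instead, which is where memory savings such as those mentioned in the abstract would come from.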
Data acquisition system issues for large experiments

NASA Astrophysics Data System (ADS)

Siskind, E. J.

2007-09-01

This talk consists of personal observations on two classes of data acquisition ("DAQ") systems for Silicon trackers in large experiments with which the author has been concerned over the last three or more years. The first half is a classic "lessons learned" recital based on experience with the high-level debug and configuration of the DAQ system for the GLAST LAT detector. The second half is concerned with a discussion of the promises and pitfalls of using modern (and future) generations of "system-on-a-chip" ("SOC") or "platform" field-programmable gate arrays ("FPGAs") in future large DAQ systems. The DAQ system pipeline for the 864k channels of Si tracker in the GLAST LAT consists of five tiers of hardware buffers which ultimately feed into the main memory of the (two-active-node) level-3 trigger processor farm. The data formats and buffer volumes of these tiers are briefly described, as well as the flow control employed between successive tiers. Lessons learned regarding data formats, buffer volumes, and flow control/data discard policy are discussed. The continued development of platform FPGAs containing large amounts of configurable logic fabric, embedded PowerPC hard processor cores, digital signal processing components, large volumes of on-chip buffer memory, and multi-gigabit serial I/O capability permits DAQ system designers to vastly increase the amount of data preprocessing that can be performed in parallel within the DAQ pipeline for detector systems in large experiments. The capabilities of some currently available FPGA families are reviewed, along with the prospects for next-generation families of announced, but not yet available, platform FPGAs. Some experience with an actual implementation is presented, and reconciliation between advertised and achievable specifications is attempted. The prospects for applying these components to space-borne Si tracker detectors are briefly discussed.
FPGA-based GEM detector signal acquisition for SXR spectroscopy system

NASA Astrophysics Data System (ADS)

Wojenski, A.; Pozniak, K. T.; Kasprowicz, G.; Kolasinski, P.; Krawczyk, R.; Zabolotny, W.; Chernyshova, M.; Czarski, T.; Malinowski, K.

2016-11-01

The presented work is related to the Gas Electron Multiplier (GEM) detector soft X-ray spectroscopy system for tokamak applications. The GEM detector used has a one-dimensional, 128-channel readout structure. The channels are connected to radiation-hard electronics with a configurable analog stage and fast ADCs, supporting speeds of 125 MSPS for each channel. The digitized data is sent directly to the FPGAs using fast serial links. The preprocessing algorithms are implemented in the FPGAs, with data buffering in the on-board 2 Gb DDR3 memory chips. After the algorithmic stage, the data is sent to an Intel Xeon-based PC for further postprocessing over a PCI-Express Gen 2 link. To connect multiple FPGAs, an 8-to-1 PCI-Express switch was designed. The whole system can support up to 2048 analog channels. The scope of the work is an FPGA-based implementation of a recorder of the raw signal from the GEM detector. Since the system will work in a very challenging environment (neutron radiation, intense electromagnetic fields), the registered signals from the GEM detector can be corrupted. In the case of very intense hot plasma radiation (e.g. laser-generated plasma), the registered signals can overlap. Therefore, it is valuable to register the raw signals from the GEM detector with a high number of events during soft X-ray radiation. The signal analysis will have a direct impact on the implementation of photon energy computation algorithms. As a result, the system will produce energy spectra and the topological distribution of soft X-ray radiation. Advanced software was developed to perform complex system startup and monitoring of hardware units. Using an array of two one-dimensional GEM detectors, it will be possible to perform tomographic reconstruction of plasma impurity radiation in the SXR region.
Status of the photomultiplier-based FlashCam camera for the Cherenkov Telescope Array

NASA Astrophysics Data System (ADS)

Pühlhofer, G.; Bauer, C.; Eisenkolb, F.; Florin, D.; Föhr, C.; Gadola, A.; Garrecht, F.; Hermann, G.; Jung, I.; Kalekin, O.; Kalkuhl, C.; Kasperek, J.; Kihm, T.; Koziol, J.; Lahmann, R.; Manalaysay, A.; Marszalek, A.; Rajda, P. J.; Reimer, O.; Romaszkan, W.; Rupinski, M.; Schanz, T.; Schwab, T.; Steiner, S.; Straumann, U.; Tenzer, C.; Vollhardt, A.; Weitzel, Q.; Winiarski, K.; Zietara, K.

2014-07-01

The FlashCam project is preparing a camera prototype around a fully digital FADC-based readout system, for the medium sized telescopes (MST) of the Cherenkov Telescope Array (CTA). The FlashCam design is the first fully digital readout system for Cherenkov cameras, based on commercial FADCs and FPGAs as key components for digitization and triggering, and a high performance camera server as back end. It provides the option to easily implement different types of trigger algorithms as well as digitization and readout scenarios using identical hardware, by simply changing the firmware on the FPGAs. The readout of the front end modules into the camera server is Ethernet-based using standard Ethernet switches and a custom, raw Ethernet protocol. In the current implementation of the system, data transfer and back end processing rates of 3.8 GB/s and 2.4 GB/s have been achieved, respectively. Together with the dead-time-free front end event buffering on the FPGAs, this permits the cameras to operate at trigger rates of up to several tens of kHz. In the horizontal architecture of FlashCam, the photon detector plane (PDP), consisting of photon detectors, preamplifiers, high voltage, control, and monitoring systems, is a self-contained unit, mechanically detached from the front end modules. It interfaces to the digital readout system via analogue signal transmission. The horizontal integration of FlashCam is expected not only to be more cost efficient; it also allows PDPs with different types of photon detectors to be adapted to the FlashCam readout system. By now, a 144-pixel "mini-camera" setup, fully equipped with photomultipliers, PDP electronics, and digitization/trigger electronics, has been realized and extensively tested. Preparations for full-scale, 1764-pixel camera mechanics and a cooling system are ongoing. The paper describes the status of the project.
FPGA-Based Filterbank Implementation for Parallel Digital Signal Processing

NASA Technical Reports Server (NTRS)

Berner, Stephan; DeLeon, Phillip

1999-01-01

One approach to parallel digital signal processing decomposes a high bandwidth signal into multiple lower bandwidth (rate) signals by an analysis bank. After processing, the subband signals are recombined into a fullband output signal by a synthesis bank. This paper describes an implementation of the analysis and synthesis banks using Field Programmable Gate Arrays (FPGAs).
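The analysis/synthesis structure described above can be shown with the simplest possible case, a two-channel bank. The Haar (average and difference) filters used here are an assumption chosen for brevity; the abstract does not say which filters or how many subbands the authors implemented, and the function names are illustrative.

```python
from typing import List, Tuple


def analysis(x: List[float]) -> Tuple[List[float], List[float]]:
    """Split an even-length signal into two half-rate subbands (Haar filters)."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def synthesis(low: List[float], high: List[float]) -> List[float]:
    """Recombine the two subbands into the full-rate signal."""
    out: List[float] = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out


if __name__ == "__main__":
    signal = [4, 2, 7, 1, 0, 6, 3, 3]
    low, high = analysis(signal)
    # Each subband runs at half the input rate and could be processed in parallel.
    print(low, high, synthesis(low, high))
```

When the subbands are left unmodified, the synthesis bank returns the original samples, which is the perfect-reconstruction property any such bank is normally designed to have.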
Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions

Treesearch

Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners

1995-01-01

In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...
The DCU: the detector control unit for SPICA-SAFARI

NASA Astrophysics Data System (ADS)

Clénet, Antoine; Ravera, Laurent; Bertrand, Bernard; den Hartog, Roland H.; Jackson, Brian D.; van Leeuven, Bert-Joost; van Loon, Dennis; Parot, Yann; Pointecouteau, Etienne; Sournac, Anthony

2014-08-01

IRAP is developing the warm electronics, the so-called "Detector Control Unit" (DCU), in charge of the readout of the SPICA-SAFARI TES-type detectors. The architecture of the electronics used to read out the 3 500 sensors of the 3 focal plane arrays is based on the frequency domain multiplexing (FDM) technique. In each of the 24 detection channels the data of up to 160 pixels are multiplexed in the frequency domain between 1 and 3.3 MHz. The DCU provides the AC signals to voltage-bias the detectors; it demodulates the detector data, which are read out in the cold by a SQUID; and it computes a feedback signal for the SQUID to linearize the detection chain in order to optimize its dynamic range. The feedback is computed with a specific technique, so-called baseband feedback (BBFB), which ensures that the loop is stable even with long propagation and processing delays (i.e. several µs) and with fast signals (i.e. frequency carriers at 3.3 MHz). This digital signal processing is complex and has to be done at the same time for the 3 500 pixels. It thus requires an optimisation of the power consumption. We took advantage of the relatively reduced science signal bandwidth (i.e. 20 - 40 Hz) to decouple the signal sampling frequency (10 MHz) and the data processing rate. Thanks to this method we managed to reduce the total number of operations per second and thus the power consumption of the digital processing circuit by a factor of 10. Moreover we used time multiplexing techniques to share the resources of the circuit (e.g. a single BBFB module processes 32 pixels). The current version of the firmware is under validation in a Xilinx Virtex 5 FPGA; the final version will be developed in a space qualified digital ASIC. Beyond the firmware architecture the optimization of the instrument concerns the characterization routines and the definition of the optimal parameters. Indeed the operation of the detection and readout chains requires to properly define more than 17 500 parameters
High-speed, multi-channel detector readout electronics for fast radiation detectors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hennig, Wolfgang

2012-06-22

In this project, we are developing a high speed digital spectrometer that a) captures detector waveforms at rates up to 500 MSPS b) has upgraded event data acquisition with additional data buffers for zero dead time operation c) moves energy calculations to the FPGA to increase spectrometer throughput in fast scintillator applications d) uses a streamlined architecture and high speed data interface for even faster readout to the host PC. These features are in addition to the standard functions in our existing spectrometers such as digitization, programmable trigger and energy filters, pileup inspection, data acquisition with energy and time stamps, MCA histograms, and run statistics. In Phase I, we upgraded one of our existing spectrometer designs to demonstrate the key principle of fast waveform capture using a 500 MSPS, 12 bit ADC and a Xilinx Virtex-4 FPGA. This upgraded spectrometer, named P500, performed well in initial tests of energy resolution, pulse shape analysis, and timing measurements, thus achieving item (a) above. In Phase II, we are revising the P500 to build a commercial prototype with the improvements listed in items (b)-(d). As described in the previous report, two devices were built to pursue this goal, named the Pixie-500 and the Pixie-500 Express. The Pixie-500 has only minor improvements from the Phase I prototype and is intended as an early commercial product (its production and part of its development were funded outside the SBIR). It also allows testing of the ADC performance in real applications. The Pixie-500 Express (or Pixie-500e) includes all of the improvements (b)-(d). At the end of Phase II of the project, we have tested and debugged the hardware, firmware and software of the Pixie-500 Express prototype boards delivered 12/3/2010. This proved substantially more complex than anticipated. At the time of writing, all hardware bugs have been fixed, the PCI Express interface is working, the SDRAM has been successfully tested and
Case for a field-programmable gate array multicore hybrid machine for an image-processing application

NASA Astrophysics Data System (ADS)

Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos

2011-01-01

General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
Integrating Reconfigurable Hardware-Based Grid for High Performance Computing

PubMed Central

Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

2015-01-01

FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the ease and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerators that need to be addressed: the need for an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. In addition, a set of services designed to facilitate application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for the deployment of high performance distributed applications, simplifying the development process. PMID:25874241
RPython high-level synthesis

NASA Astrophysics Data System (ADS)

Cieszewski, Radoslaw; Linczuk, Maciej

2016-09-01

The development of FPGA technology and the increasing complexity of applications in recent decades have forced compilers to move to higher abstraction levels. Compilers interpret an algorithmic description of a desired behavior written in High-Level Languages (HLLs) and translate it to Hardware Description Languages (HDLs). This paper presents an RPython-based High-Level Synthesis (HLS) compiler. The compiler gets the configuration parameters and maps an RPython program to VHDL. The VHDL code can then be used to program FPGA chips. Compared with other technologies, FPGAs have the potential to achieve far greater performance than software as a result of omitting the fetch-decode-execute operations of General Purpose Processors (GPPs) and introducing more parallel computation. This can be exploited by utilizing many resources at the same time. Creating parallel algorithms computed with FPGAs in pure HDL is difficult and time consuming. Implementation time can be greatly reduced with a High-Level Synthesis compiler. This article describes design methodologies and tools, the implementation, and first results of the VHDL backend created for the RPython compiler.
Automated Design of Board and MCM Level Digital Systems.

DTIC Science & Technology

1997-10-01

Partitioning for Multicomponent Synthesis 159 Appendix K: Resource Constrained RTL Partitioning for Synthesis of Multi-FPGA Designs 169 Appendix L...digital signal processing) architectures. These target architectures, illustrated in Figure 1, can contain application-specific ASICS, FPGAs...synthesis tools for ASIC, FPGA and MCM synthesis (Figure 8). Multicomponent Partitioning Engine The partitioning engine is a hierarchical partitioning
Determining the Best-Fit FPGA for a Space Mission: An Analysis of Cost, SEU Sensitivity, and Reliability

NASA Technical Reports Server (NTRS)

Berg, Melanie; LaBel, Ken

2007-01-01

This viewgraph presentation reviews the selection of the optimum Field Programmable Gate Arrays (FPGA) for space missions. Included in this review is a discussion on differentiating amongst various FPGAs, cost analysis of the various options, the investigation of radiation effects, an expansion of the evaluation criteria, and the application of the evaluation criteria to the selection process.
Commercial Parts Radiation Testing

DTIC Science & Technology

2015-01-13

New Mexico’s COSMIAC Center performed radiation testing on a series of operational amplifiers, microcontrollers and microprocessor. The...commercial microcontroller and microprocessor equipment. The team would develop a list of the most promising commercial parts that might be utilized to...parts will include microprocessors, microcontrollers and memory modules. In addition, Field Programmable Gate Arrays (FPGAs) will also be chosen
Field Programmable Gate Array (FPGA) Radiation Data: All Data is Not Equal

NASA Technical Reports Server (NTRS)

Label, Kenneth A.; Berg, Melanie D.

2016-01-01

Electronic parts (integrated circuits) have grown in complexity such that determining all failure modes and risks based on single particle event radiation testing is impossible. In this presentation, the authors will present why this is so and provide some realism on what this means to FPGAs. It's all about understanding actual risks and not making assumptions.
A Model for Minimizing Numeric Function Generator Complexity and Delay

DTIC Science & Technology

2007-12-01

allow computation of difficult mathematical functions in less time and with less hardware than commonly employed methods. They compute piecewise...Programmable Gate Arrays (FPGAs). The algorithms and estimation techniques apply to various NFG architectures and mathematical functions. This...thesis compares hardware utilization and propagation delay for various NFG architectures, mathematical functions, word widths, and segmentation methods
FPGA-Based, Self-Checking, Fault-Tolerant Computers

NASA Technical Reports Server (NTRS)

Some, Raphael; Rennels, David

2004-01-01

A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and high-speed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant-computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing
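The abstract above names triple modular redundancy (TMR) as the baseline it compares against. As a point of reference only, the core of TMR is a bitwise majority vote over three copies of a value, sketched here. This illustrates the baseline, not the proposed self-checking architecture, and the function name `tmr_vote` is hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies.

    Each output bit takes the value held by at least two of the inputs,
    so a bit flip confined to one copy is masked.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010_1100
    upset = good ^ 0b0000_1000          # one copy suffers a single bit flip
    print(bin(tmr_vote(good, good, upset)))
```

Tripling the logic and adding voters is what makes TMR costly in circuitry and power, which is the overhead the proposed architecture aims to reduce.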
Rad-Hard Structured ASIC Body of Knowledge

NASA Technical Reports Server (NTRS)

Heidecker, Jason

2013-01-01

Structured Application-Specific Integrated Circuit (ASIC) technology is a platform between traditional ASICs and Field-Programmable Gate Arrays (FPGAs). The motivation behind structured ASICs is to combine the low nonrecurring engineering (NRE) costs of FPGAs with the high performance of ASICs. This report provides an overview of the structured ASIC platforms that are radiation-hardened and intended for space application.

Analyzing System on A Chip Single Event Upset Responses using Single Event Upset Data, Classical Reliability Models, and Space Environment Data

NASA Technical Reports Server (NTRS)

Berg, Melanie; LaBel, Kenneth; Campola, Michael; Xapsos, Michael

2017-01-01

We are investigating the application of classical reliability performance metrics combined with standard single event upset (SEU) analysis data. We expect to relate SEU behavior to system performance requirements. Our proposed methodology will provide better prediction of SEU responses in harsh radiation environments with confidence metrics. Keywords: single event upset (SEU), single event effect (SEE), field programmable gate array devices (FPGAs)
Critical Information Protection on FPGAs through Unique Device Specific Keys

DTIC Science & Technology

2011-09-01

63 Appendix B ...64 B .1 Analysis of Circuit DNA Entry Changes Across a Large Temperature Range ..... 64 Appendix C...71 x List of Figures Figure 1. (a) An ideal transistor design. ( b ) SEM image of Transistor
Moving Horizon Estimation on a Chip

DTIC Science & Technology

2014-06-26

description, e.g. VHDL or Verilog, for FPGA implementation. Especially for those whose main expertise is in control system design, writing algorithms in C...ditional Kalman Filter (KF) where recursive solution is available. We developed various MHE designs and implemented them on the Xilinx Zynq ZC702 FPGA...practical deployment of the MHE technology. 2.2 Implementation of MHE on FPGA The next paper demonstrated the feasibility of implementing MHE algo
Advanced Wireless Integrated Navy Network - AWINN

DTIC Science & Technology

2005-09-30

progress report No. 3 on AWINN hardware and software configurations of smart, wideband, multi-function antennas, secure configurable platform, close-in...results to the host PC via a UART soft core. The UART core used is a proprietary Xilinx core which incorporates features described in National...current software uses wheel odometry and visual landmarks to create a map and estimate position on an internal x, y grid. The wheel odometry provides a
Digital Device Architecture and the Safe Use of Flash Devices in Munitions

NASA Technical Reports Server (NTRS)

Katz, Richard B.; Flowers, David; Bergevin, Keith

2017-01-01

Flash technology is being utilized in fuzed munition applications and, based on the development of digital logic devices in the commercial world, usage of flash technology will increase. Digital devices of interest to designers include flash-based microcontrollers and field programmable gate arrays (FPGAs). Almost a decade ago, a study was undertaken to determine if flash-based microcontrollers could be safely used in fuzes and, if so, how should such devices be applied. The results were documented in the Technical Manual for the Use of Logic Devices in Safety Features. This paper will first review the Technical Manual and discuss the rationale behind the suggested architectures for microcontrollers and a brief review of the concern about data retention in flash cells. An architectural feature in the microcontroller under study will be discussed and its use will show how to screen for weak or failed cells during manufacture, storage, or immediately prior to use. As was done for microcontrollers a decade ago, architectures for a flash-based FPGA will be discussed, showing how it can be safely used in fuzes. Additionally, architectures for using non-volatile (including flash-based) storage will be discussed for SRAM-based FPGAs.
Field programmable gate array-assigned complex-valued computation and its limits

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bernard-Schwarz, Maria, E-mail: maria.bernardschwarz@ni.com; Institute of Applied Physics, TU Wien, Wiedner Hauptstrasse 8, 1040 Wien; Zwick, Wolfgang

We discuss how leveraging Field Programmable Gate Array (FPGA) technology as part of a high performance computing platform reduces latency to meet the demanding real time constraints of a quantum optics simulation. Implementations of complex-valued operations using fixed point numeric on a Virtex-5 FPGA compare favorably to more conventional solutions on a central processing unit. Our investigation explores the performance of multiple fixed point options along with a traditional 64 bits floating point version. With this information, the lowest execution times can be estimated. Relative error is examined to ensure simulation accuracy is maintained.
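The trade-off described above, fixed-point word length against relative error, can be tried out in software before committing to hardware. The sketch below multiplies two complex numbers in a signed fixed-point format with a configurable number of fractional bits and reports the error against a floating-point reference. The format, the truncating right shift and all function names are assumptions made for illustration; the abstract does not give the authors' number formats.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize a real value to an integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def fixed_complex_product(a: complex, b: complex, frac_bits: int) -> complex:
    """Complex multiply using integer arithmetic only, as an FPGA datapath would."""
    ar, ai = to_fixed(a.real, frac_bits), to_fixed(a.imag, frac_bits)
    br, bi = to_fixed(b.real, frac_bits), to_fixed(b.imag, frac_bits)
    # Products carry 2 * frac_bits fractional bits; shift back to frac_bits.
    re = (ar * br - ai * bi) >> frac_bits
    im = (ar * bi + ai * br) >> frac_bits
    scale = 1 << frac_bits
    return complex(re / scale, im / scale)


def relative_error(a: complex, b: complex, frac_bits: int) -> float:
    """Relative error of the fixed-point product against the float reference."""
    exact = a * b
    return abs(fixed_complex_product(a, b, frac_bits) - exact) / abs(exact)


if __name__ == "__main__":
    a, b = 0.3 + 0.7j, 0.6 - 0.2j
    for bits in (8, 12, 16, 24):
        print(bits, relative_error(a, b, bits))
```

Sweeping `frac_bits` in this way gives a quick estimate of the narrowest word length that keeps the error within a simulation's tolerance, which is the kind of comparison the abstract reports for its fixed-point options.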
PHANTOM: Practical Oblivious Computation in a Secure Processor

DTIC Science & Technology

2014-05-16

Utilizing Multiple FPGAs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 6 Implementation on the HC-2ex 50 6.1 Integration with a RISC-V...development of Phantom, Mohit also contributed to the code base, in particular with regard to the integration between the ORAM controller and the RISC-V...well. v Tremendous thanks is owed to the team that developed the RISC-V processor Phantom is using: among other contributors, this includes
Implementation of a Configurable Fault Tolerant Processor (CFTP) Using Internal Triple Modular Redundancy (TMR)

DTIC Science & Technology

2005-12-01

Upsets in SRAM FPGAs,” Military and Aerospace Applications of Programmable Logic Devices, September 2002. 8. Wakerly, John F. “Microcomputer...change. The goal of the Configurable Fault Tolerant Processor (CFTP) Project is to explore, develop and demonstrate the applicability of using off-the...develop and demonstrate the applicability of using commercial-off-the-shelf (COTS) Field Programmable Gate Arrays (FPGA) in the design of
FPGA Implementation of Burst-Mode Synchronization for SOQPSK-TG

DTIC Science & Technology

2014-06-01

is normalized to π. The proposed burst-mode architecture is written in VHDL and verified using Modelsim. The VHDL design is implemented on a Xilinx...Document Number: SET 2014-0043 412TW-PA-14298 FPGA Implementation of Burst-Mode Synchronization for SOQPSK-TG June 2014 Final Report Test...To) 9/11 -- 8/14 4. TITLE AND SUBTITLE FPGA Implementation of Burst-Mode Synchronization for SOQPSK-TG 5a. CONTRACT NUMBER: W900KK-11-C-0032 5b
Neuro-Inspired Spike-Based Motion: From Dynamic Vision Sensor to Robot Motor Open-Loop Control through Spike-VITE

PubMed Central

Perez-Peña, Fernando; Morgado-Estevez, Arturo; Linares-Barranco, Alejandro; Jimenez-Fernandez, Angel; Gomez-Rodriguez, Francisco; Jimenez-Moreno, Gabriel; Lopez-Coronado, Juan

2013-01-01

In this paper we present a complete spike-based architecture: from a Dynamic Vision Sensor (retina) to a stereo head robotic platform. The aim of this research is to reproduce intended movements performed by humans taking into account as many features as possible from the biological point of view. This paper fills the gap between current spike silicon sensors and robotic actuators by applying a spike processing strategy to the data flows in real time. The architecture is divided into layers: the retina, visual information processing, the trajectory generator layer, which uses a neuro-inspired algorithm (SVITE) that can be replicated as many times as the robot has DoF; and finally the actuation layer to supply the spikes to the robot (using PFM). All the layers do their tasks in a spike-processing mode, and they communicate with each other through the neuro-inspired AER protocol. The open-loop controller is implemented on an FPGA using AER interfaces developed by RTC Lab. Experimental results reveal the viability of this spike-based controller. Two main advantages are: low hardware resources (2% of a Xilinx Spartan 6) and power requirements (3.4 W) to control a robot with a high number of DoF (up to 100 for a Xilinx Spartan 6). It also evidences the suitable use of AER as a communication protocol between processing and actuation. PMID:24264330
Neuro-inspired spike-based motion: from dynamic vision sensor to robot motor open-loop control through spike-VITE.

PubMed

Perez-Peña, Fernando; Morgado-Estevez, Arturo; Linares-Barranco, Alejandro; Jimenez-Fernandez, Angel; Gomez-Rodriguez, Francisco; Jimenez-Moreno, Gabriel; Lopez-Coronado, Juan

2013-11-20

In this paper we present a complete spike-based architecture: from a Dynamic Vision Sensor (retina) to a stereo head robotic platform. The aim of this research is to reproduce intended movements performed by humans taking into account as many features as possible from the biological point of view. This paper fills the gap between current spike silicon sensors and robotic actuators by applying a spike processing strategy to the data flows in real time. The architecture is divided into layers: the retina, visual information processing, the trajectory generator layer, which uses a neuro-inspired algorithm (SVITE) that can be replicated as many times as the robot has DoF; and finally the actuation layer to supply the spikes to the robot (using PFM). All the layers do their tasks in a spike-processing mode, and they communicate with each other through the neuro-inspired AER protocol. The open-loop controller is implemented on an FPGA using AER interfaces developed by RTC Lab. Experimental results reveal the viability of this spike-based controller. Two main advantages are: low hardware resources (2% of a Xilinx Spartan 6) and power requirements (3.4 W) to control a robot with a high number of DoF (up to 100 for a Xilinx Spartan 6). It also evidences the suitable use of AER as a communication protocol between processing and actuation.
FPGA implementation of image dehazing algorithm for real time applications

NASA Astrophysics Data System (ADS)

Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.

2017-09-01

Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomenon, which makes the problem nontrivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation systems, etc. However, these applications require low latency of the preprocessing block. In this work, the single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although the conventional single image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two stage image dehazing architecture is introduced, wherein the dark channel and airlight are estimated in the first stage, whereas the transmission map and intensity restoration are computed in the next stages. The algorithm is implemented using Xilinx Vivado software and validated using a Xilinx ZC702 development board, which contains an Artix-7-equivalent Field Programmable Gate Array (FPGA) and an ARM Cortex-A9 dual core processor. Additionally, a high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second at an image resolution of 1920x1080, which is suitable for real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
Optimization of a Fast Neutron Scintillator for Real-Time Pulse Shape Discrimination in the Transient Reactor Test Facility (TREAT) Hodoscope

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, James T.; Thompson, Scott J.; Watson, Scott M.

We present a multi-channel, fast neutron/gamma ray detector array system that utilizes ZnS(Ag) scintillator detectors. The system employs field programmable gate arrays (FPGAs) to do real-time all digital neutron/gamma ray discrimination with pulse height and time histograms to allow count rates in excess of 1,000,000 pulses per second per channel. The system detector number is scalable in blocks of 16 channels.
Semantically Aware Foundation Environment (SAFE) for Clean-Slate Design of Resilient, Adaptive Secure Hosts (CRASH)

DTIC Science & Technology

2016-02-01

system consists of a high-fidelity hardware simulation using field programmable gate arrays (FPGAs), with a set of runtime services (ConcreteWare...perimeter protection, patch, and pray” is not aligned with the threat. Programmers will not bail us out of this situation (by writing defect free code...hosted on a Field Programmable Gate Array (FPGA), with a set of runtime services (concreteware) running on the hardware. Secure applications can be
Evaluating De-centralised and Distributional Options for the Distributed Electronic Warfare Situation Awareness and Response Test Bed

DTIC Science & Technology

2013-12-01

effectors (deployed on ground based or aerial platforms) to detect , identify, locate, track or suppress stationary or slow moving surface based RF...ground based or aerial platforms) to detect , identify, locate, track or suppress stationary or slow moving surface based RF emitting targets. In the...Electronic Support EO Electro-Optic FPGAs Field Programmable Gate Arrays IR Infra-red LADAR Laser Detection and Ranging OSX Mac OS X; the apple
Design Considerations for a Computationally-Lightweight Authentication Mechanism for Passive RFID Tags

DTIC Science & Technology

2009-09-01

suffer the power and complexity requirements of a public key system. 28 In [18], a simulation of the SHA-1 algorithm is performed on a Xilinx FPGA ... 256 bits. Thus, the construction of a hash table would need 2512 independent comparisons. It is known that hash collisions of the SHA-1 algorithm... SHA-1 algorithm for small-core FPGA design. Small-core FPGA design is the process by which a circuit is adapted to use the minimal amount of logic
Hardware Design and Implementation of Fixed-Width Standard and Truncated 4×4, 6×6, 8×8 and 12×12-Bit Multipliers Using FPGA

NASA Astrophysics Data System (ADS)

Rais, Muhammad H.

2010-06-01

This paper presents a Field Programmable Gate Array (FPGA) implementation of standard and truncated multipliers using Very High Speed Integrated Circuit Hardware Description Language (VHDL). The truncated multiplier is a good candidate for digital signal processing (DSP) applications such as finite impulse response (FIR) filters and the discrete cosine transform (DCT). A remarkable reduction in FPGA resources, delay, and power can be achieved using truncated multipliers instead of standard parallel multipliers when the full precision of the standard multiplier is not required. The truncated multipliers show significant improvement as compared to standard multipliers. Results show that the anomaly in Spartan-3 AN average connection and maximum pin delay has been efficiently reduced in the Virtex-4 device.
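A minimal sketch of the idea behind a fixed-width truncated multiplier follows. It assumes unsigned operands and the simplest scheme, in which the least significant partial-product columns are omitted with no correction constant; the function names and the `kept_low_columns` parameter are hypothetical, and the paper's VHDL design may differ.

```python
def standard_multiply(a, b, n):
    """Full n x n unsigned multiply, keeping only the top n bits of the 2n-bit product."""
    return (a * b) >> n


def truncated_multiply(a, b, n, kept_low_columns=2):
    """Fixed-width multiply that never forms the lowest partial-product columns.

    Only partial-product bits a_i * b_j with i + j >= n - kept_low_columns are
    summed, which is where the hardware saving comes from.
    """
    first_column = n - kept_low_columns
    acc = 0
    for i in range(n):
        for j in range(n):
            if i + j >= first_column and (a >> i) & 1 and (b >> j) & 1:
                acc += 1 << (i + j)
    return acc >> n


# Worst-case difference from the standard result over every 8-bit operand pair.
N = 8
worst = max(
    standard_multiply(a, b, N) - truncated_multiply(a, b, N)
    for a in range(1 << N)
    for b in range(1 << N)
)
print(worst)
```

Keeping more low-order columns shrinks the error at the cost of more logic. With `kept_low_columns` equal to `n`, the two functions agree exactly.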
Aquarius Digital Processing Unit

NASA Technical Reports Server (NTRS)

Forgione, Joshua; Winkert, George; Dobson, Norman

2009-01-01

Three documents provide information on a digital processing unit (DPU) for the planned Aquarius mission, in which a radiometer aboard a spacecraft orbiting Earth is to measure radiometric temperatures from which data on sea-surface salinity are to be deduced. The DPU is the interface between the radiometer and an instrument-command-and-data system aboard the spacecraft. The DPU cycles the radiometer through a programmable sequence of states, collects and processes all radiometric data, and collects all housekeeping data pertaining to operation of the radiometer. The documents summarize the DPU design, with emphasis on innovative aspects that include mainly the following: a) In the radiometer and the DPU, conversion from analog voltages to digital data is effected by means of asynchronous voltage-to-frequency converters in combination with a frequency-measurement scheme implemented in field-programmable gate arrays (FPGAs). b) A scheme to compensate for aging and changes in the temperature of the DPU in order to provide an overall temperature-measurement accuracy within 0.01 K includes a high-precision, inexpensive DC temperature measurement scheme and a drift-compensation scheme that was used on the Cassini radar system. c) An interface among multiple FPGAs in the DPU guarantees setup and hold times.
OpenPET Hardware, Firmware, Software, and Board Design Files

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abu-Nimeh, Faisal; Choong, Woon-Sengq; Moses, William W.

OpenPET is an open source, flexible, high-performance, and modular data acquisition system for a variety of applications. The OpenPET electronics are capable of reading analog voltage or current signals from a wide variety of sensors. The electronics boards make extensive use of field programmable gate arrays (FPGAs) to provide flexibility and scalability. Firmware and software for the FPGAs and computer are used to control and acquire data from the system. The command and control flow is similar to the data flow; however, the commands are initiated from the computer, as in a tree topology (i.e., from top to bottom). Each node in the tree discovers its parent and children, and all addresses are configured accordingly. A user (or a script) initiates a command from the computer. This command will be translated and encoded to the corresponding child (e.g., SB, MB, DB, etc.). In turn, each node will pass the command to its corresponding child(ren) by looking at the destination address. Finally, once the command reaches its desired destination(s), the corresponding node(s) execute(s) the command and send(s) a reply, if required. All the firmware, software, and the electronics board design files are distributed through the OpenPET website (http://openpet.lbl.gov).
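The top-down command flow described in this abstract can be sketched as a small tree walk. The addressing scheme (a tuple of child indices from the root), the class and method names, and the example topology are all assumptions for illustration and are not taken from the OpenPET firmware.

```python
class Node:
    """One node in a hypothetical command tree (host computer at the root)."""

    def __init__(self, name, address=()):
        self.name = name
        self.address = address  # assumed scheme: child indices along the path from the root
        self.children = []

    def add_child(self, name):
        """Attach a child and assign its address from this node's address."""
        child = Node(name, self.address + (len(self.children),))
        self.children.append(child)
        return child

    def dispatch(self, dest, command, path=None):
        """Forward a command toward dest; the destination executes it and replies."""
        path = (path or []) + [self.name]
        if dest == self.address:
            return {"reply": f"{self.name} executed {command}", "path": path}
        depth = len(self.address)
        # Only forward when the destination lies in this node's subtree.
        if dest[:depth] != self.address or len(dest) <= depth:
            return None
        index = dest[depth]
        if index >= len(self.children):
            return None
        return self.children[index].dispatch(dest, command, path)


# Hypothetical topology: host -> SB -> MB -> two DBs.
host = Node("host")
sb = host.add_child("SB")
mb = sb.add_child("MB")
db0 = mb.add_child("DB0")
db1 = mb.add_child("DB1")
result = host.dispatch(db1.address, "read_status")
print(result)
```

Each node inspects only the next element of the destination address, so no node needs a global view of the tree.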
Rapid prototyping of update algorithm of discrete Fourier transform for real-time signal processing

NASA Astrophysics Data System (ADS)

Kakad, Yogendra P.; Sherlock, Barry G.; Chatapuram, Krishnan V.; Bishop, Stephen

2001-10-01

An algorithm is developed in the companion paper to update the existing DFT to represent the new data series that results when a new signal point is received. Updating the DFT in this way uses less computation than directly evaluating the DFT using the FFT algorithm: it reduces the computational order by a factor of log2 N. The algorithm is able to work in the presence of a data window function, for use with the rectangular, split triangular, Hanning, Hamming, and Blackman windows. In this paper, a hardware implementation of this algorithm, using FPGA technology, is outlined. Unlike traditional fully customized VLSI circuits, FPGAs represent a technical breakthrough in the corresponding industry. The FPGA implements thousands of gates of logic in a single IC chip and it can be programmed by users at their site in a few seconds or less depending on the type of device used. The risk is low and the development time is short. These advantages have made FPGAs very popular for rapid prototyping of algorithms in the areas of digital communication, digital signal processing, and image processing. Our paper addresses the related issues of implementation using a hardware description language in the development of the design and the subsequent downloading onto the programmable hardware chip.
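For the rectangular-window case, the update can be illustrated with the standard sliding-DFT recurrence X'[k] = (X[k] - x_oldest + x_newest) * exp(j*2*pi*k/N), which costs O(N) per new sample instead of O(N log N) for a fresh FFT. The sketch below assumes this textbook recurrence; the companion paper's exact formulation and its treatment of the other windows are not reproduced here, and the function names are illustrative.

```python
import cmath


def dft(x):
    """Direct DFT, used here only as a reference for checking the update."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points) for n in range(n_points))
        for k in range(n_points)
    ]


def update_dft(spectrum, oldest, newest):
    """Slide a rectangular window by one sample.

    Every bin drops the oldest sample, adds the newest one, and is then
    rotated by one sample's worth of phase.
    """
    n_points = len(spectrum)
    return [
        (spectrum[k] - oldest + newest) * cmath.exp(2j * cmath.pi * k / n_points)
        for k in range(n_points)
    ]


signal = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5, 4.0]
N = 8
spectrum = dft(signal[:N])                       # window covers samples 0..7
updated = update_dft(spectrum, signal[0], signal[N])
reference = dft(signal[1:N + 1])                 # window covers samples 1..8
max_diff = max(abs(u - r) for u, r in zip(updated, reference))
print(max_diff)
```

Because each bin is updated independently with one complex multiply, the recurrence maps naturally onto parallel hardware.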

Advanced Wireless Integrated Navy Network (AWINN)

DTIC Science & Technology

2005-12-31

handle high data rates using COTS FPGAs . The effort of the Cross-Layer Optimization group is focused on cross-layer design of UWB for position location...From Transmitter Boar1 To Receiver BoardTransmittedl Receiver i i.. Switch Lowpass -20 dB FPGA -2dB Filter Gain Controlled Gain Variable Attenuator... FPGA Code * April - June 2006 "o Demonstrate Transceiver Operation "o Integrate Transceiver with Other AWINN Activities Personnel: Chris R. Anderson
Porting of an FPGA Based High Data Rate DVB-S2 Modulator

DTIC Science & Technology

2011-06-13

broadcast satellite market. The physical layer is detailed in the ETSI EN 302 307 V 1.1.2 (2006-06) standard. The waveform has seen broad adoption and...independent implementation, and one from Xilinx, which is...at 37-38 is shown in Figure 6. Additionally, the HDR DVB-S2 waveform running on the BDR-I was tested for interoperability at the physical layer
Electronic readout system for the Belle II imaging Time-Of-Propagation detector

NASA Astrophysics Data System (ADS)

Kotchetkov, Dmitri

2017-07-01

The imaging Time-Of-Propagation (iTOP) detector, constructed for the Belle II experiment at the SuperKEKB e+e- collider, is an 8192-channel high precision Cherenkov particle identification detector with timing resolution below 50 ps. To acquire data from the iTOP, a novel front-end electronic readout system was designed, built, and integrated. Switched-capacitor array application-specific integrated circuits are used to sample analog signals. Triggering, digitization, readout, and data transfer are controlled by Xilinx Zynq-7000 system on a chip devices.
Floating-Point Units and Algorithms for field-programmable gate arrays

DOE Office of Scientific and Technical Information (OSTI.GOV)

Underwood, Keith D.; Hemmert, K. Scott

2005-11-01

The software that we are attempting to copyright is a package of floating-point unit descriptions and example algorithm implementations using those units for use in FPGAs. The floating-point units are best-in-class implementations of add, multiply, divide, and square root floating-point operations. The algorithm implementations are sample (not highly flexible) implementations of FFT, matrix multiply, matrix vector multiply, and dot product. Together, one could think of the collection as an implementation of parts of the BLAS library or something similar to the FFTW packages (without the flexibility) for FPGAs. Results from this work have been published multiple times and we are working on a publication to discuss the techniques we use to implement the floating-point units. For some more background, FPGAs are programmable hardware. "Programs" for this hardware are typically created using a hardware description language (examples include Verilog, VHDL, and JHDL). Our floating-point unit descriptions are written in JHDL, which allows them to include placement constraints that make them highly optimized relative to some other implementations of floating-point units. Many vendors (Nallatech from the UK, SRC Computers in the US) have similar implementations, but our implementations seem to be somewhat higher performance. Our algorithm implementations are written in VHDL and models of the floating-point units are provided in VHDL as well. FPGA "programs" make multiple "calls" (hardware instantiations) to libraries of intellectual property (IP), such as the floating-point unit library described here. These programs are then compiled using a tool called a synthesizer (such as a tool from Synplicity, Inc.). The compiled file is a netlist of gates and flip-flops. This netlist is then mapped to a particular type of FPGA by a mapper and then a place-and-route tool. These tools assign the gates in the netlist to specific locations on the specific type of FPGA chip used
SpaceCube 2.0: An Advanced Hybrid Onboard Data Processor

NASA Technical Reports Server (NTRS)

Lin, Michael; Flatley, Thomas; Godfrey, John; Geist, Alessandro; Espinosa, Daniel; Petrick, David

2011-01-01

The SpaceCube 2.0 is a compact, high performance, low-power onboard processing system that takes advantage of cutting-edge hybrid (CPU/FPGA/DSP) processing elements. The SpaceCube 2.0 design concept includes two commercial Virtex-5 field-programmable gate array (FPGA) parts protected by "radiation hardened by software" technology, and possesses exceptional size, weight, and power characteristics [5x5x7 in., 3.5 lb (approximately equal to 12.7 x 12.7 x 17.8 cm, 1.6 kg), 5-25 W, depending on the application's required clock rate]. The two Virtex-5 FPGA parts are implemented in a unique back-to-back configuration to maximize data transfer and computing performance. Draft computing power specifications for the SpaceCube 2.0 unit include four PowerPC 440s (1100 DMIPS each), 500+ DSP48Es (2x580 GMACS), 100+ LVDS high-speed serial I/Os (1.25 Gbps each), and 2x190 GFLOPS single-precision (65 GFLOPS double-precision) floating point performance. The SpaceCube 2.0 includes PROM memory for CPU boot, health and safety, and basic command and telemetry functionality; RAM memory for program execution; and FLASH/EEPROM memory to store algorithms and application code for the CPU, FPGA, and DSP processing elements. Program execution can be reconfigured in real time and algorithms can be updated, modified, and/or replaced at any point during the mission. Gigabit Ethernet, SpaceWire, SATA and high-speed LVDS serial/parallel I/O channels are available for instrument/sensor data ingest, and mission-unique instrument interfaces can be accommodated using a compact PCI (cPCI) expansion card interface. The SpaceCube 2.0 can be utilized in NASA Earth Science, Helio/Astrophysics and Exploration missions, and Department of Defense satellites for onboard data processing. It can also be used in commercial communication and mapping satellites.
Re-Form: FPGA-Powered True Codesign Flow for High-Performance Computing In The Post-Moore Era

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cappello, Franck; Yoshii, Kazutomo; Finkel, Hal

Multicore scaling will end soon because of practical power limits. Dark silicon is becoming a major issue even more than the end of Moore’s law. In the post-Moore era, the energy efficiency of computing will be a major concern. FPGAs could be a key to maximizing the energy efficiency. In this paper we address severe challenges in the adoption of FPGA in HPC and describe “Re-form,” an FPGA-powered codesign flow.
Creating an Assured Joint DOD and Interagency Interoperable Net-Centric Enterprise. Report of the Defense Science Board Task Force on Achieving Interoperability in a Net-Centric Environment

DTIC Science & Technology

2009-03-01

policy, elliptic curve public key cryptography using the 256-bit prime modulus elliptic curve as specified in FIPS-186-2 and SHA-256 are appropriate for...publications/fips/fips186-2/fips186-2-change1.pdf Hashing via the Secure Hash Algorithm (using SHA-256 and...lithography and processing techniques. Field programmable gate arrays (FPGAs) are a chip design of interest. These devices are extensively used in
The IMPACT Common Module - A Low Cost, Reconfigurable Building Block for Next Generation Phased Arrays

DTIC Science & Technology

2016-03-31

The SiGe receiver has two stages of programmable RF filtering and one stage of IF filtering. Each filter can be tuned in center frequency and...distribution unlimited. transmit, with an IF to RF upconversion chain that is split to programmable phase shifters and VGAs at each output port. Figure 2...These are optimized to run on medium grade Field Programmable Gate Arrays (FPGAs), such as the Altera Arria 10, and represent a few of the many
Single Event Test Methodologies and System Error Rate Analysis for Triple Modular Redundant Field Programmable Gate Arrays

NASA Technical Reports Server (NTRS)

Allen, Gregory; Edmonds, Larry D.; Swift, Gary; Carmichael, Carl; Tseng, Chen Wei; Heldt, Kevin; Anderson, Scott Arlo; Coe, Michael

2010-01-01

We present a test methodology for estimating system error rates of Field Programmable Gate Arrays (FPGAs) mitigated with Triple Modular Redundancy (TMR). The test methodology is founded in a mathematical model, which is also presented. Accelerator data from 90 nm Xilinx Military/Aerospace grade FPGAs are shown to fit the model. Fault injection (FI) results are discussed and related to the test data. Design implementation and the corresponding impact of multiple bit upset (MBU) are also discussed.
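As background on why upsets in more than one redundant copy matter for TMR, the sketch below shows a bitwise 2-of-3 majority voter. It is a generic illustration, not the paper's mathematical model or test methodology; the function names and the example word are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word, bit):
    """Model a single upset by inverting one bit of a word."""
    return word ^ (1 << bit)


golden = 0b10110010

# One corrupted copy: the voter masks the error.
single_upset = majority_vote(golden, flip_bit(golden, 3), golden)

# The same bit corrupted in two copies: the voter passes the wrong value.
double_upset = majority_vote(flip_bit(golden, 3), flip_bit(golden, 3), golden)

print(bin(single_upset), bin(double_upset))
```

The second case is the failure mode that makes a TMR system error rate nonzero, which is why events affecting several bits at once need separate attention.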
Asymmetric Core Computing for U.S. Army High-Performance Computing Applications

DTIC Science & Technology

2009-04-01

Playstation 4 (should one be announced). 8 4.2 FPGAs Reconfigurable computing refers to performing computations using Field Programmable Gate Arrays...2008 4 . TITLE AND SUBTITLE Asymmetric Core Computing for U.S. Army High-Performance Computing Applications 5a. CONTRACT NUMBER 5b. GRANT NUMBER...Acknowledgments vi 1. Introduction 1 2. Relevant Technologies 2 3. Technical Approach 5 4 . Research and Development Highlights 7 4.1 Cell
Implementation of a Fault Tolerant Control Unit within an FPGA for Space Applications

DTIC Science & Technology

2006-12-01

Conference 2002, September 2002. [20] M. Alderighi, A. Candelori, F. Casini, S. D’Angelo, M. Mancini, A. Paccagnella, S. Pastore , G.R. Sechi, “Heavy...Luigi Carro and Ricardo Reis , “Designing and Testing Fault-Tolerant Techniques for SRAM-based FPGAs,” in Proc. 1st Conference on Computer Frontiers, pp...susceptibility,” in IEEE Proc. 12th IEEE Intl. Symposium on On-Line Testing, pp. 89-91, 2006. [45] Fernanda Lima, Luigi Carro and Ricardo Reis
Coarse Grain Reconfigurable ASIC through Multiplexer Based Switches

DTIC Science & Technology

2015-09-15

chip area (0.5 mm2), and from simulation their power consumption is negligible (0.002% from simulation, too small to measure in physical system...performing implementation that is also flexible. REFERENCES [1] I. Kuon and J. Rose, “ Measuring the gap between FPGAs and ASICs,” IEEE Trans...A 3GPP- LTE Example," Solid-State Circuits, IEEE Journal of , vol.47, no.3, pp.757,768, March 2012. [5] Agarwal, A.; Hassanieh, H.; Abari, O
A MOdular System for Acquisition, Interface and Control (MOSAIC) of detectors and their related electronics for high energy physics experiment

NASA Astrophysics Data System (ADS)

Robertis, G. De; Fanizzi, G.; Loddo, F.; Manzari, V.; Rizzi, M.

2018-02-01

In this work the MOSAIC ("MOdular System for Acquisition, Interface and Control") board, designed for the readout and testing of the pixel modules for the silicon tracker upgrade of the ALICE (A Large Ion Collider Experiment) experiment at the CERN LHC, is described. It is based on an Artix7 Field Programmable Gate Array device by Xilinx and is compliant with the six unit "Versa Modular Eurocard" standard (6U-VME) for easy housing in a standard VMEbus crate from which it takes only power supplies and cooling.
Electronics for a highly segmented electromagnetic calorimeter prototype

NASA Astrophysics Data System (ADS)

Fehlker, D.; Alme, J.; van den Brink, A.; de Haas, A. P.; Nooren, G.-J.; Reicher, M.; Röhrich, D.; Rossewij, M.; Ullaland, K.; Yang, S.

2013-03-01

A prototype of a highly segmented electromagnetic calorimeter has been developed. The detector tower is made of 24 layers of PHASE2/MIMOSA23 silicon sensors sandwiched between tungsten plates, with 4 sensors per layer, a total of 96 MIMOSA sensors, resulting in 39 MPixels for the complete prototype detector tower. The paper focuses on the electronics of this calorimeter prototype. Two detector readout and control systems are used, each containing two Spartan 6 and one Virtex 6 FPGA, running embedded Linux, each system serving 12 detector layers. In 550 ms a total of 4 Gbytes of data is read from the detector, stored in memory on the electronics and then shipped to the DAQ system via Gigabit ethernet.
Design of FPGA ICA for hyperspectral imaging processing

NASA Astrophysics Data System (ADS)

Nordin, Anis; Hsu, Charles C.; Szu, Harold H.

2001-03-01

The remote sensing problem which uses hyperspectral imaging can be transformed into a blind source separation problem. Using this model, hyperspectral imagery can be de-mixed into sub-pixel spectra which indicate the different materials present in the pixel. This can be further used to deduce areas which contain forest, water or biomass, without even knowing the sources which constitute the image. This form of remote sensing allows previously blurred images to show the specific terrain involved in that region. The blind source separation problem can be implemented using an Independent Component Analysis algorithm. The ICA algorithm has previously been successfully implemented using software packages such as MATLAB, which has a downloadable version of FastICA. The challenge now lies in implementing it in hardware or firmware in order to improve its computational speed. Hardware implementation also solves the insufficient-memory problem encountered by software packages like MATLAB when employing ICA for high resolution images and a large number of channels. Here, a pipelined solution of the firmware, realized using FPGAs, is drawn out and simulated using C. Since C code can be translated into HDLs or be used directly on the FPGAs, it can be used to simulate its actual implementation in hardware. The simulated results of the program are presented here, where seven channels are used to model the 200 different channels involved in hyperspectral imaging.
A binary link tracker for the BaBar level 1 trigger system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berenyi, A.; Chen, H.K.; Dao, K.

1999-08-01

The BaBar detector at PEP-II will operate in a high-luminosity e+e- collider environment near the Υ(4S) resonance with the primary goal of studying CP violation in the B meson system. In this environment, typical physics events of interest involve multiple charged particles. These events are identified by counting these tracks in a fast first level (Level 1) trigger system, by reconstructing the tracks in real time. For this purpose, a Binary Link Tracker Module (BLTM) was designed and fabricated for the BaBar Level 1 Drift Chamber trigger system. The BLTM is responsible for linking track segments, constructed by the Track Segment Finder Modules (TSFM), into complete tracks. A single BLTM module processes a 360 MBytes/s stream of segment hit data, corresponding to information from the entire Drift Chamber, and implements a fast and robust algorithm that tolerates high hit occupancies as well as local inefficiencies of the Drift Chamber. The algorithms and the necessary control logic of the BLTM were implemented in Field Programmable Gate Arrays (FPGAs), using the VHDL hardware description language. The finished 9U x 400 mm Euro-format board contains roughly 75,000 gates of programmable logic or about 10,000 lines of VHDL code synthesized into five FPGAs.
Using SRAM Based FPGAs for Power-Aware High Performance Wireless Sensor Networks

PubMed Central

Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa

2012-01-01

While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today’s applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved either by more powerful microcontrollers, at the cost of higher power consumption, or, in general, by any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA, solutions might appear as a candidate technology: although power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance, high capacity, state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, and a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to prove that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements. PMID:22736971
Using SRAM based FPGAs for power-aware high performance wireless sensor networks.

PubMed

Valverde, Juan; Otero, Andres; Lopez, Miguel; Portilla, Jorge; de la Torre, Eduardo; Riesgo, Teresa

2012-01-01

While for years traditional wireless sensor nodes have been based on ultra-low power microcontrollers with sufficient but limited computing power, the complexity and number of tasks of today's applications are constantly increasing. Increasing the node duty cycle is not feasible in all cases, so in many cases more computing power is required. This extra computing power may be achieved either by more powerful microcontrollers, at the cost of higher power consumption, or, in general, by any solution capable of accelerating task execution. At this point, the use of hardware based, and in particular FPGA, solutions might appear as a candidate technology: although power use is higher compared with lower power devices, execution time is reduced, so energy could be reduced overall. In order to demonstrate this, an innovative WSN node architecture is proposed. This architecture is based on a high performance, high capacity, state-of-the-art FPGA, which combines the advantages of the intrinsic acceleration provided by the parallelism of hardware devices, the use of partial reconfiguration capabilities, and a careful power-aware management system, to show that energy savings for certain higher-end applications can be achieved. Finally, comprehensive tests have been done to validate the platform in terms of performance and power consumption, to prove that better energy efficiency compared to processor based solutions can be achieved, for instance, when encryption is imposed by the application requirements.
FPGA for Power Control of MSL Avionics

NASA Technical Reports Server (NTRS)

Wang, Duo; Burke, Gary R.

2011-01-01

A PLGT FPGA (Field Programmable Gate Array) is included in the LCC (Load Control Card), GID (Guidance Interface & Drivers), TMC (Telemetry Multiplexer Card), and PFC (Pyro Firing Card) boards of the Mars Science Laboratory (MSL) spacecraft. (PLGT stands for PFC, LCC, GID, and TMC.) It provides the interface between the backside bus and the power drivers on these boards. The LCC drives power switches to switch power loads, and also relays. The GID drives the thrusters and latch valves, as well as having the star-tracker and Sun-sensor interface. The PFC drives pyros, and the TMC receives digital and analog telemetry. The FPGA is implemented both in Xilinx (Spartan 3-400) and in Actel (RTSX72SU, ASX72S). The Xilinx Spartan 3 part is used for the breadboard, the Actel ASX part is used for the EM (Engineer Module), and the pin-compatible, radiation-hardened RTSX part is used for final EM and flight. The MSL spacecraft uses an FC (Flight Computer) to control power loads, relays, thrusters, latch valves, Sun-sensor, and star-tracker, and to read telemetry such as temperature. Commands are sent over a 1553 bus to the MREU (Multi-Mission System Architecture Platform Remote Engineering Unit). The MREU resends them over a remote serial command bus (c-bus) to the LCC, GID, TMC, and PFC. The MREU also sends out telemetry addresses via a remote serial telemetry address bus to the LCC, GID, TMC, and PFC, and the status is returned over the remote serial telemetry data bus.
A real-time multi-gases detection and concentration measurements based-on time-division multiplexed-lasers

NASA Astrophysics Data System (ADS)

Yazdandoust, Fatemeh; Tatenguem Fankem, Hervé; Milde, Tobias; Jimenez, Alvaro; Sacher, Joachim

2018-02-01

We report the development of a platform based on a Field-Programmable Gate Array (FPGA) and suitable for Time-Division-Multiplexed DFB lasers. The designed platform is subsequently combined with a spectroscopy setup for detection and quantification of species in a gas mixture. The experimental results show a detection limit of 460 ppm, an uncertainty of 0.1% and a computation time of less than 1000 clock cycles. The proposed system offers a high level of flexibility and is applicable to arbitrary types of gas mixtures.

Multi-DSP and FPGA based Multi-channel Direct IF/RF Digital receiver for atmospheric radar

NASA Astrophysics Data System (ADS)

Yasodha, Polisetti; Jayaraman, Achuthan; Kamaraj, Pandian; Durga rao, Meka; Thriveni, A.

2016-07-01

Modern phased array radars depend highly on digital signal processing (DSP) to extract the echo signal information and to accomplish reliability along with programmability and flexibility. The advent of ASIC technology has allowed various digital signal processing steps to be realized in one DSP chip, which can be programmed as per the application and can handle high data rates, to be used in the radar receiver to process the received signal. Further, field programmable gate array (FPGA) chips, which can be re-programmed, now also present an opportunity to process the radar signal. A multi-channel direct IF/RF digital receiver (MCDRx) is developed at NARL, taking advantage of high speed ADCs and high performance DSP chips/FPGAs, to be used for atmospheric radars working in HF/VHF bands. Multiple channels allow the radar to be operated in multi-receiver modes and also to obtain the wind vector with improved time resolution, without switching the antenna beam. MCDRx has six channels, implemented on a custom built digital board, which is realized using six ADCs for simultaneous processing of the six input signals, a Xilinx Virtex-5 FPGA and a Spartan-6 FPGA, and two ADSPTS201 DSP chips, each of which performs one phase of processing. The MCDRx unit interfaces with the data storage/display computer via two gigabit ethernet (GbE) links. One of the six channels is used for Doppler beam swinging (DBS) mode and the other five channels are dedicated to multi-receiver mode operations. Each channel has (i) an ADC block, to digitize the RF/IF signal, (ii) a DDC block for digital down conversion of the digitized signal, (iii) a decoding block to decode the phase coded signal, and (iv) a coherent integration block for integrating the data while keeping the phase intact. The ADC block consists of Analog Devices AD9467 16-bit ADCs, which digitize the input signal at 80 MSPS. The output of the ADC is centered around (80 MHz - input frequency). The digitized data is fed
Digital Interface Board to Control Phase and Amplitude of Four Channels

NASA Technical Reports Server (NTRS)

Smith, Amy E.; Cook, Brian M.; Khan, Abdur R.; Lux, James P.

2011-01-01

An increasing number of parts are designed with digital control interfaces, including phase shifters and variable attenuators. When designing an antenna array in which each antenna has independent amplitude and phase control, the number of digital control lines that must be set simultaneously can grow very large. Use of a parallel interface would require separate line drivers, more parts, and thus additional failure points. A convenient form of control where single-phase shifters or attenuators could be set or the whole set could be programmed with an update rate of 100 Hz is needed to solve this problem. A digital interface board with a field-programmable gate array (FPGA) can simultaneously control an essentially arbitrary number of digital control lines with a serial command interface requiring only three wires. A small set of short, high-level commands provides a simple programming interface for an external controller. Parity bits are used to validate the control commands. Output timing is controlled within the FPGA to allow for rapid update rates of the phase shifters and attenuators. This technology has been used to set and monitor eight 5-bit control signals via a serial UART (universal asynchronous receiver/transmitter) interface. The digital interface board controls the phase and amplitude of the signals for each element in the array. A host computer running Agilent VEE sends commands via serial UART connection to a Xilinx VirtexII FPGA. The commands are decoded, and either outputs are set or telemetry data is sent back to the host computer describing the status and the current phase and amplitude settings. This technology is an integral part of a closed-loop system in which the angle of arrival of an X-band uplink signal is detected and the appropriate phase shifts are applied to the Ka-band downlink signal to electronically steer the array back in the direction of the uplink signal. It will also be used in the non-beam-steering case to compensate for
Field programmable gate arrays-based number plate binarization and adjustment for automatic number plate recognition systems

NASA Astrophysics Data System (ADS)

Zhai, Xiaojun; Bensaali, Faycal; Sotudeh, Reza

2013-01-01

Number plate (NP) binarization and adjustment are important preprocessing stages in automatic number plate recognition (ANPR) systems and are used to link the number plate localization (NPL) and character segmentation stages. Successfully linking these two stages will improve the performance of the entire ANPR system. We present two optimized low-complexity NP binarization and adjustment algorithms. Efficient area/speed architectures based on the proposed algorithms are also presented and have been successfully implemented and tested using the Mentor Graphics RC240 FPGA development board, which together require only 9% of the available on-chip resources of a Virtex-4 FPGA, run with a maximum frequency of 95.8 MHz and are capable of processing one image in 0.07 to 0.17 ms.
On the design of a radix-10 online floating-point multiplier

NASA Astrophysics Data System (ADS)

McIlhenny, Robert D.; Ercegovac, Milos D.

2009-08-01

This paper describes an approach to design and implement a radix-10 online floating-point multiplier. An online approach is considered because it offers computational flexibility not available with conventional arithmetic. The design was coded in VHDL and compiled, synthesized, and mapped onto a Virtex 5 FPGA to measure cost in terms of LUTs (look-up tables) as well as the cycle time and total latency. The routing delay, which was not optimized, is the major component in the cycle time. For a rough estimate of the cost/latency characteristics, our design was compared to a standard radix-2 floating-point multiplier of equivalent precision. The results demonstrate that even an unoptimized radix-10 online design is an attractive implementation alternative for FPGA floating-point multiplication.
Particle identification algorithms for the PANDA Endcap Disc DIRC

NASA Astrophysics Data System (ADS)

Schmidt, M.; Ali, A.; Belias, A.; Dzhygadlo, R.; Gerhardt, A.; Götzen, K.; Kalicy, G.; Krebs, M.; Lehmann, D.; Nerling, F.; Patsyuk, M.; Peters, K.; Schepers, G.; Schmitt, L.; Schwarz, C.; Schwiening, J.; Traxler, M.; Böhm, M.; Eyrich, W.; Lehmann, A.; Pfaffinger, M.; Uhlig, F.; Düren, M.; Etzelmüller, E.; Föhl, K.; Hayrapetyan, A.; Kreutzfeld, K.; Merle, O.; Rieke, J.; Wasem, T.; Achenbach, P.; Cardinali, M.; Hoek, M.; Lauth, W.; Schlimme, S.; Sfienti, C.; Thiel, M.

2017-12-01

The Endcap Disc DIRC has been developed to provide excellent particle identification for the future PANDA experiment by separating pions and kaons up to a momentum of 4 GeV/c with a separation power of 3 standard deviations in the polar angle region from 5° to 22°. This goal will be achieved using dedicated particle identification algorithms based on likelihood methods, which will be applied in an offline analysis and online event filtering. This paper evaluates the resulting PID performance using Monte-Carlo simulations to study basic single track PID as well as the analysis of complex physics channels. The online reconstruction algorithm has been tested with a Virtex4 FPGA card and optimized regarding the resulting constraints.
Solder Joint Health Monitoring Testbed

NASA Technical Reports Server (NTRS)

Delaney, Michael M.; Flynn, James; Browder, Mark

2009-01-01

A method of monitoring the health of selected solder joints, called SJ-BIST, has been developed by Ridgetop Group Inc. under a Small Business Innovative Research (SBIR) contract. The primary goal of this research program is to test and validate this method in a flight environment using realistically seeded faults in selected solder joints. An additional objective is to gather environmental data for future development of physics-based and data-driven prognostics algorithms. A test board is being designed using a Xilinx FPGA. These boards will be tested both in flight and on the ground using a shaker table and an altitude chamber.
Controller for the Electronically Scanned Thinned Array Radiometer (ESTAR) instrument

NASA Technical Reports Server (NTRS)

Zomberg, Brian G.; Chren, William A., Jr.

1994-01-01

A prototype controller for the ESTAR (electronically scanned thinned array radiometer) instrument has been designed and tested. It manages the operation of the digital data subsystem (DDS) and its communication with the Small Explorer data system (SEDS). Among the data processing tasks that it coordinates are FEM data acquisition, noise removal, phase alignment and correlation. Its control functions include instrument calibration and testing of two critical subsystems, the output data formatter and Walsh function generator. It is implemented in a Xilinx XC3064PC84-100 field programmable gate array (FPGA) and has a maximum clocking frequency of 10 MHz.
Data Acquisition System for Silicon Ultra Fast Cameras for Electron and Gamma Sources in Medical Applications (SUCIMA Imager)

NASA Astrophysics Data System (ADS)

Czermak, A.; Zalewska, A.; Dulny, B.; Sowicki, B.; Jastrząb, M.; Nowak, L.

2004-07-01

The need for real-time monitoring of the hadrontherapy beam intensity and profile, as well as the requirements for fast dosimetry using Monolithic Active Pixel Sensors (MAPS), led the SUCIMA collaboration to design a unique Data Acquisition System (DAQ SUCIMA Imager). The DAQ system has been developed on one of the most advanced Xilinx Field Programmable Gate Array chips, the Virtex-II. The dedicated multifunctional electronic board for capturing the detector's analogue signals, processing them digitally in parallel, and compressing and transmitting the final data through the high speed USB 2.0 port has been prototyped and tested.
Low-power hardware implementation of movement decoding for brain computer interface with reduced-resolution discrete cosine transform.

PubMed

Minho Won; Albalawi, Hassan; Xin Li; Thomas, Donald E

2014-01-01

This paper describes a low-power hardware implementation for movement decoding of brain computer interface. Our proposed hardware design is facilitated by two novel ideas: (i) an efficient feature extraction method based on reduced-resolution discrete cosine transform (DCT), and (ii) a new hardware architecture of dual look-up table to perform discrete cosine transform without explicit multiplication. The proposed hardware implementation has been validated for movement decoding of electrocorticography (ECoG) signal by using a Xilinx FPGA Zynq-7000 board. It achieves more than 56× energy reduction over a reference design using band-pass filters for feature extraction.
G(sup 4)FET Implementations of Some Logic Circuits

NASA Technical Reports Server (NTRS)

Mojarradi, Mohammad; Akarvardar, Kerem; Cristoleveanu, Sorin; Gentil, Paul; Blalock, Benjamin; Chen, Suhan

2009-01-01

Some logic circuits have been built and demonstrated to work substantially as intended, all as part of a continuing effort to exploit the high degrees of design flexibility and functionality of the electronic devices known as G(sup 4)FETs and described below. These logic circuits are intended to serve as prototypes of more complex advanced programmable-logic-device-type integrated circuits, including field-programmable gate arrays (FPGAs). In comparison with prior FPGAs, these advanced FPGAs could be much more efficient because the functionality of G(sup 4)FETs is such that fewer discrete components are needed to perform a given logic function in G(sup 4)FET circuitry than are needed to perform the same logic function in conventional transistor-based circuitry. The underlying concept of using G(sup 4)FETs as building blocks of programmable logic circuitry was also described, from a different perspective, in G(sup 4)FETs as Universal and Programmable Logic Gates (NPO-41698), NASA Tech Briefs, Vol. 31, No. 7 (July 2007), page 44. A G(sup 4)FET can be characterized as an accumulation-mode silicon-on-insulator (SOI) metal oxide/semiconductor field-effect transistor (MOSFET) featuring two junction field-effect transistor (JFET) gates. The structure of a G(sup 4)FET (see Figure 1) is the same as that of a p-channel inversion-mode SOI MOSFET with two body contacts on each side of the channel. The top gate (G1), the substrate emulating a back gate (G2), and the junction gates (JG1 and JG2) can be biased independently of each other and, hence, each can be used to independently control some aspects of the conduction characteristics of the transistor. The independence of the actions of the four gates is what affords the enhanced functionality and design flexibility of G(sup 4)FETs. The present G(sup 4)FET logic circuits include an adjustable-threshold inverter, a real-time-reconfigurable logic gate, and a dynamic random-access memory (DRAM) cell (see Figure 2). The configuration
Fault-Tolerant, Radiation-Hard DSP

NASA Technical Reports Server (NTRS)

Czajkowski, David

2011-01-01

Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-high-speed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, thereby causing them to lose much of their reconfiguration capability and in some cases to suffer a significant speed reduction. The benefits of COTS high
Localized Triple Modular Redundancy vs. Distributed Triple Modular Redundancy on a ProASIC3E Reprogrammable FPGA

NASA Technical Reports Server (NTRS)

McGuffey, Alex; Berg, Melanie; Pellish, Jonathan

2010-01-01

Field programmable gate arrays (FPGA) are used in every space application. Currently, most space flight applications use radiation hardened (RH) FPGAs, which are very expensive. There is a desire to use cheaper, commercial off the shelf reprogrammable FPGAs, which are more susceptible to radiation effects known as single-event effects (SEE). The RH parts have SEE and total ionizing dose (TID) hardened elements pre-integrated into the part. This means that the designer does not need to implement any hardening techniques while configuring the device. The COTS parts on the other hand must be mitigated by design in order to ensure any form of mitigation. The design techniques this project examines concern the use of localized triple modular redundancy (LTMR) and distributed triple modular redundancy (DTMR). LTMR triples every flip flop in the device architecture while DTMR triples everything except for the global routes (clocks, resets, and enables). The testing was performed on a ProASIC3E FPGA at the Texas A&M cyclotron facility. Two design architectures were used: shift registers and counters, both with LTMR and DTMR mitigation techniques. The test results prove that DTMR is more effective at reducing SEE than LTMR. We also determined that there was not a significant difference between the use of shift registers and counters for test purposes. More testing is required to obtain additional linear energy transfer values for each architecture and mitigation technique in order to determine the most cost-effective method of SEE mitigation.
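The triplication described above relies on a majority voter that masks an upset in any one of the three copies. A minimal software sketch of that voting step follows; it models the general TMR idea only, not the authors' LTMR or DTMR netlists, and the function names are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit matches at least two inputs."""
    return (a & b) | (b & c) | (a & c)


def read_tripled_register(value: int, upset_mask: int = 0) -> int:
    """Model three copies of a register, flip bits in one copy, then vote.

    upset_mask marks the bits flipped by a simulated single-event upset
    in the first copy only (an illustrative assumption).
    """
    copies = [value ^ upset_mask, value, value]
    return majority_vote(*copies)


# A single corrupted copy is out-voted by the two intact ones.
print(hex(read_tripled_register(0xA5, upset_mask=0x10)))
```

The voter only masks faults confined to one copy; an upset that affects a shared resource, such as an untripled global route, is not covered by this model.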
Implementation of a cone-beam backprojection algorithm on the cell broadband engine processor

NASA Astrophysics Data System (ADS)

Bockenbach, Olivier; Knaup, Michael; Kachelrieß, Marc

2007-03-01

Tomographic image reconstruction is computationally very demanding. In all cases the backprojection represents the performance bottleneck due to the high operational count and due to the high demand put on the memory subsystem. In the past, solving this problem has led to the implementation of specific architectures, connecting Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) to memory through dedicated high speed busses. More recently, there have also been attempts to use Graphic Processing Units (GPUs) to perform the backprojection step. Originally aimed at the gaming market, IBM, Toshiba and Sony have introduced the Cell Broadband Engine (CBE) processor, often considered as a multicomputer on a chip. Clocked at 3 GHz, the Cell allows for a theoretical performance of 192 GFlops and a peak data transfer rate over the internal bus of 200 GB/s. This performance indeed makes the Cell a very attractive architecture for implementing tomographic image reconstruction algorithms. In this study, we investigate the relative performance of a perspective backprojection algorithm when implemented on a standard PC and on the Cell processor. We compare these results to the performance achievable with FPGA-based boards and high end GPUs. The cone-beam backprojection performance was assessed by backprojecting a full circle scan of 512 projections of 1024x1024 pixels into a volume of size 512x512x512 voxels. It took 3.2 minutes on the PC (single CPU) and is as fast as 13.6 seconds on the Cell.
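The timing figures quoted in this abstract can be turned into a speed-up and an approximate update rate. The sketch below uses only the numbers stated above; the assumption that each projection updates every voxel exactly once is ours and is not stated in the abstract.

```python
# Figures quoted in the abstract.
PROJECTIONS = 512
VOLUME_SIDE = 512
PC_SECONDS = 3.2 * 60   # 3.2 minutes on a single-CPU PC
CELL_SECONDS = 13.6     # best time reported on the Cell


def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Ratio of baseline run time to accelerated run time."""
    return baseline_s / accelerated_s


def voxel_updates(projections: int, side: int) -> int:
    # Assumption: every projection contributes once to every voxel.
    return projections * side ** 3


cell_speedup = speedup(PC_SECONDS, CELL_SECONDS)
updates = voxel_updates(PROJECTIONS, VOLUME_SIDE)
cell_rate = updates / CELL_SECONDS

print(f"speed-up over the PC: {cell_speedup:.1f}x")
print(f"voxel updates: {updates:,}")
print(f"Cell update rate: {cell_rate / 1e9:.2f} G updates/s")
```

Under that assumption the Cell is roughly 14 times faster than the single-CPU PC and sustains on the order of five billion voxel updates per second.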
Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPUs, Graphics Processing Units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with a high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and the Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in detail. The Appendix lists the kernel source code.
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale.

PubMed

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2016-10-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's $16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems, like Apache Spark and Hadoop, to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming efforts to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7× to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
UniBoard: generic hardware for radio astronomy signal processing

NASA Astrophysics Data System (ADS)

Hargreaves, J. E.

2012-09-01

UniBoard is a generic high-performance computing platform for radio astronomy, developed as a Joint Research Activity in the RadioNet FP7 Programme. The hardware comprises eight Altera Stratix IV Field Programmable Gate Arrays (FPGAs) interconnected by a high speed transceiver mesh. Each FPGA is connected to two DDR3 memory modules and three external 10Gbps ports. In addition, a total of 128 low voltage differential input lines permit connection to external ADC cards. The DSP capability of the board exceeds 644E9 complex multiply-accumulate operations per second. The first production run of eight boards was distributed to partners in The Netherlands, France, Italy, UK, China and Korea in May 2011, with further production runs completed in December 2011 and early 2012. The function of the board is determined by the firmware loaded into its FPGAs. Current applications include beamformers, correlators, digital receivers, RFI mitigation for pulsar astronomy, and pulsar gating and search machines. The new UniBoard based correlator for the European VLBI network (EVN) uses an FX architecture with half the resources of the board devoted to station based processing (delay and phase correction and channelization) and half to the correlation function. A single UniBoard can process a 64MHz band from 32 stations, 2 polarizations, sampled at 8 bit. Adding more UniBoards can expand the total bandwidth of the correlator. The design is able to process both prerecorded and real time (eVLBI) data.
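To give a sense of scale for the correlator configuration quoted above (32 stations, 2 polarizations, 64 MHz, 8 bit), the following sketch counts station pairs and estimates the aggregate input data rate. The real-valued Nyquist sampling rate of twice the bandwidth is our assumption; the abstract does not state the sample rate, and the function names are hypothetical.

```python
def cross_baselines(stations: int) -> int:
    """Number of distinct station pairs (cross-correlations only)."""
    return stations * (stations - 1) // 2


def input_rate_bps(bandwidth_hz: int, stations: int,
                   polarizations: int, bits_per_sample: int) -> int:
    # Assumption: real-valued Nyquist sampling at twice the bandwidth.
    sample_rate = 2 * bandwidth_hz
    return sample_rate * bits_per_sample * polarizations * stations


pairs = cross_baselines(32)
rate = input_rate_bps(64_000_000, stations=32, polarizations=2, bits_per_sample=8)

print(f"station pairs: {pairs}")
print(f"aggregate input rate: {rate / 1e9:.3f} Gbit/s")
```

Because the pair count grows quadratically with the number of stations while the input rate grows linearly, the correlation half of an FX design tends to dominate as arrays grow.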
Next-Generation A/D Sampler ADS3000+ for VLBI2010

NASA Technical Reports Server (NTRS)

Takefuji, Kazuhiro; Takeuchi, Hiroshi; Tsutsumi, Masanori; Koyama, Yasuhiro

2010-01-01

A high-speed A/D sampler, called ADS3000+, was developed in 2008. It can sample one analog signal at up to 4 Gbps and deliver it to a versatile Linux PC. After A/D conversion, the ADS3000+ can perform digital signal processing such as real-time DBBC (Digital Base Band Conversion) and FIR filtering, such as simple CW RFI filtering, using the installed FPGAs. A 4 Gsps fringe test with the ADS3000+ has been successfully performed. The ADS3000+ will not exclusively be used for VLBI but will also be employed in other applications.
Net-aware bitstreams that upgrade FPGA hardware remotely over the Internet: creating intelligent bitstreams that know where to go, what to do when they get there, and can report back when they're done

NASA Astrophysics Data System (ADS)

Casselman, Steve; Schewel, John

2002-07-01

Success in the marketplace may well depend upon the ability to upgrade and test hardware designs instantly around the world. An upgrade management strategy requires more than just the bitstream file, email or a JTAG cable. A well-managed methodology, capable of transmitting bitstreams directly into targeted FPGAs over the network or internet, is an essential element for a successful FPGA based product strategy. Virtual Computer Corporation's HOTMan Bitstream Management Environment combines a feature rich cross-platform API with an Object Oriented Bitstream technique for Remote Upgrading of Hardware over the Internet.
Experimenting Galileo on Board the International Space Station

NASA Technical Reports Server (NTRS)

Fantinato, Samuele; Pozzobon, Oscar; Gamba, Giovanni; Chiara, Andrea Dalla; Montagner, Stefano; Giordano, Pietro; Crisci, Massimo; Enderle, Werner; Chelmins, David T.; Sands, Obed S.

2016-01-01

The SCaN Testbed is an advanced integrated communications system and laboratory facility installed on the International Space Station (ISS) in 2012. The testbed incorporates a set of new-generation Software Defined Radio (SDR) technologies intended to allow researchers to develop, test, and demonstrate new communications, networking, and navigation capabilities in the actual environment of space. Qascom, in cooperation with ESA and NASA, is designing a Software Defined Radio GalileoGPS Receiver capable of providing accurate positioning and timing, to be installed on the ISS SCaN Testbed. The GalileoGPS waveform will be operated in the JPL SDR, which consists of several hardware components that can be used for experimentation in L-Band and S-Band. The JPL SDR includes an L-Band Dorne Margolin antenna mounted onto a choke ring. The antenna is connected to a radio front end capable of providing one-bit samples for the three GNSS frequencies (L1, L2 and L5) at 38 MHz, exploiting subharmonic sampling. The baseband processing is then performed by an ATMEL AT697 processor (100 MIPS) and two Virtex 2 FPGAs. The JPL SDR supports the STRS (Space Telecommunications Radio System) standard, which provides common waveform software interfaces, methods of instantiation, operation, and testing among different compliant hardware and software products. The standard foresees the development of applications that are modular, portable, reconfigurable, and reusable. The developed waveform uses the STRS infrastructure-provided application program interfaces (APIs) and services to load, verify, execute, change parameters, terminate, or unload an application. The project is divided into three main phases. 1) Design and Development of the GalileoGPS waveform for the SCaN Testbed starting from Qascom existing GNSS SDR receiver. The baseline design is limited to the implementation of the single frequency Galileo and GPS L1E1 receiver even if as part of the activity it will be to assess the

A hierarchical scheduling and management solution for dynamic reconfiguration in FPGA-based embedded systems

NASA Astrophysics Data System (ADS)

Cervero, T.; Gómez, A.; López, S.; Sarmiento, R.; Dondo, J.; Rincón, F.; López, J. C.

2013-05-01

One of the limiting factors that have prevented a wide dissemination of reconfigurable technology is the absence of an appropriate model, capable of offering reliable control, for certain target applications. Moreover, the lack of flexible and easy-to-use scheduling and management systems is also a relevant drawback to be considered. Under static scenarios, it is relatively easy to schedule and manage the reconfiguration process since all the variations correspond to predetermined and well-known tasks. However, the difficulty increases when the adaptation needs of the overall system change semi-randomly according to environmental fluctuations. In this context, this work proposes a change in the paradigm of dynamically reconfigurable systems, by addressing the dynamically reconfigurable control problem as a whole, in which the scheduling and placement issues are packed together as a hierarchical management structure, interacting as one entity from the system point of view, but performing their tasks with a certain degree of independence from each other. In this sense, the top hierarchical level corresponds to a dynamic scheduler in charge of planning and adjusting all the reconfigurable modules according to the variations of the external stimulus. The lower level interacts with the physical layer of the device by instantiating, relocating, or removing a reconfigurable module following the scheduler's instructions. Regarding the speed of the proposed solution, the total partial reconfiguration time achieved with this proposal has been measured and compared with two other approaches: 1) using traditional Xilinx tools; 2) using an optimized version of the Xilinx drivers. The collected numbers demonstrate that our solution is up to 10 times faster than the other approaches.

A real-time hybrid neuron network for highly parallel cognitive systems.

PubMed

Christiaanse, Gerrit Jan; Zjajo, Amir; Galuzzi, Carlo; van Leuken, Rene

2016-08-01

For a comprehensive understanding of how neurons communicate with each other, new tools need to be developed that can accurately mimic the behaviour of such neurons and neuron networks under 'real-time' constraints. In this paper, we propose an easily customisable, highly pipelined neuron network design, which executes optimally scheduled floating-point operations for a maximal number of biophysically plausible neurons per FPGA family type. To reduce the required amount of resources without adverse effect on the calculation latency, a single exponent instance is used for multiple neuron calculation operations. Experimental results indicate that the proposed network design allows the simulation of up to 1188 neurons on a Virtex7 (XC7VX550T) device in brain real-time, yielding a speed-up of 12.4× compared to the state of the art.
Monitoring system for testing the radiation hardness of a KINTEX-7 FPGA

NASA Astrophysics Data System (ADS)

Cojocariu, L. N.; Placinta, V. M.; Dumitru, L.

2016-03-01

A much more efficient Ring Imaging Cherenkov sub-detector system will be rebuilt in the second long shutdown of the Large Hadron Collider for the LHCb experiment. Radiation-hard electronic components together with Commercial Off-The-Shelf ones will be used in the new Cherenkov photon detection system architecture. An irradiation program was foreseen to determine the radiation tolerance of the new electronic devices, including a Field Programmable Gate Array from the KINTEX-7 family of XILINX. An automated test bench for online monitoring of the XC7K70T KINTEX-7 device operation in radiation conditions was designed and implemented by the LHCb Romanian group.
UW VLSI chip tester

NASA Astrophysics Data System (ADS)

McKenzie, Neil

1989-12-01

We present a design for a low-cost, functional VLSI chip tester. It is based on the Apple Macintosh II personal computer. It tests chips that have up to 128 pins. All pin drivers of the tester are bidirectional; each pin is programmed independently as an input or an output. The tester can test both static and dynamic chips. Rudimentary speed testing is provided. Chips are tested by executing C programs written by the user. A software library is provided for program development. Tests run under both the Mac Operating System and A/UX. The design is implemented using Xilinx Logic Cell Arrays. Price/performance tradeoffs are discussed.
On the use of programmable hardware and reduced numerical precision in earth-system modeling.

PubMed

Düben, Peter D; Russell, Francis P; Niu, Xinyu; Luk, Wayne; Palmer, T N

2015-09-01

Programmable hardware, in particular Field Programmable Gate Arrays (FPGAs), promises a significant increase in computational performance for simulations in geophysical fluid dynamics compared with CPUs of similar power consumption. FPGAs allow adjusting the representation of floating-point numbers to specific application needs. We analyze the performance-precision trade-off on FPGA hardware for the two-scale Lorenz '95 model. We scale the size of this toy model to that of a high-performance computing application in order to make meaningful performance tests. We identify the minimal level of precision at which changes in model results are not significant compared with a maximal precision version of the model and find that this level is very similar for cases where the model is integrated for very short or long intervals. It is therefore a useful approach to investigate model errors due to rounding errors for very short simulations (e.g., 50 time steps) to obtain a range for the level of precision that can be used in expensive long-term simulations. We also show that an approach to reduce precision with increasing forecast time, when model errors are already accumulated, is very promising. We show that a speed-up of 1.9 times is possible in comparison to FPGA simulations in single precision if precision is reduced with no strong change in model error. The single-precision FPGA setup shows a speed-up of 2.8 times in comparison to our model implementation on two 6-core CPUs for large model setups.
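
The precision experiment described above can be imitated in software. The following is only a minimal sketch under stated assumptions: it uses the single-scale Lorenz '95 equations rather than the two-scale model of the abstract, a forward-Euler step, and a hypothetical helper `round_mantissa` that emulates a shorter significand by rounding after each update. None of these choices is taken from the paper.

```python
import numpy as np

def round_mantissa(x, bits):
    """Round x to `bits` significant binary digits (round to nearest)."""
    x = np.asarray(x, dtype=np.float64)
    mant, exp = np.frexp(x)              # x = mant * 2**exp with 0.5 <= |mant| < 1
    scale = 2.0 ** bits
    return np.ldexp(np.round(mant * scale) / scale, exp)

def lorenz95_step(x, forcing, dt, bits):
    """One forward-Euler step of dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F."""
    tend = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing
    tend = round_mantissa(tend, bits)    # emulate reduced-precision arithmetic
    return round_mantissa(x + dt * tend, bits)

def run(bits, steps=50, n=40, forcing=8.0, dt=0.005):
    """Integrate a slightly perturbed rest state for a short run (50 steps by default)."""
    x = forcing * np.ones(n)
    x[0] += 0.01
    for _ in range(steps):
        x = lorenz95_step(x, forcing, dt, bits)
    return x

reference = run(53)                      # 53 bits: ordinary double precision
for bits in (10, 16, 23):
    err = np.max(np.abs(run(bits) - reference))
    print(f"{bits:2d} significand bits: max deviation {err:.3e}")
```

Comparing short runs at several widths against the 53-bit reference mirrors the idea of using very short simulations to bracket a usable precision level before committing to long ones.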
3D imaging and wavefront sensing with a plenoptic objective

NASA Astrophysics Data System (ADS)

Rodríguez-Ramos, J. M.; Lüke, J. P.; López, R.; Marichal-Hernández, J. G.; Montilla, I.; Trujillo-Sevilla, J.; Femenía, B.; Puga, M.; López, M.; Fernández-Valdivia, J. J.; Rosa, F.; Dominguez-Conde, C.; Sanluis, J. C.; Rodríguez-Ramos, L. F.

2011-06-01

Plenoptic cameras have been developed over the last years as a passive method for 3D scanning. Several superresolution algorithms have been proposed to compensate for the resolution decrease associated with lightfield acquisition with a microlens array. A number of multiview stereo algorithms have also been applied to extract depth information from plenoptic frames. Real-time systems have been implemented using specialized hardware such as Graphical Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). In this paper, we present our own implementations related to the aforementioned aspects, as well as two new developments: a portable plenoptic objective that transforms any conventional 2D camera into a 3D CAFADIS plenoptic camera, and the novel use of a plenoptic camera as a wavefront phase sensor for adaptive optics (AO). The terrestrial atmosphere degrades telescope images due to the refractive index changes associated with turbulence. These changes require high-speed processing that justifies the use of GPUs and FPGAs. Na artificial Laser Guide Stars (Na-LGS, 90 km high) must be used to obtain the reference wavefront phase and the Optical Transfer Function of the system, but they are affected by defocus because of their finite distance to the telescope. Using the telescope as a plenoptic camera allows us to correct the defocus and to recover the wavefront phase tomographically. These advances significantly increase the versatility of the plenoptic camera and provide a new contribution relating the wave optics and computer vision fields, as many authors claim.
Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale

PubMed Central

Huang, Muhuan; Wu, Di; Yu, Cody Hao; Fang, Zhenman; Interlandi, Matteo; Condie, Tyson; Cong, Jason

2017-01-01

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. As evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's $16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems, like Apache Spark and Hadoop, to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming effort needed to access FPGA accelerators in systems like Apache Spark and YARN, and improves the system throughput by 1.7× to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster. PMID:28317049
Modified Phasemeter for a Heterodyne Laser Interferometer

NASA Technical Reports Server (NTRS)

Loya, Frank M.

2010-01-01

Modifications have been made in the design of instruments of the type described in "Digital Averaging Phasemeter for Heterodyne Interferometry". A phasemeter of this type measures the difference between the phases of the unknown and reference heterodyne signals in a heterodyne laser interferometer. The phasemeter design lacked immunity to drift of the heterodyne frequency, was bandwidth-limited by computer bus architectures then in use, and was resolution-limited by the nature of field-programmable gate arrays (FPGAs) then available. The modifications have overcome these limitations and have afforded additional improvements in accuracy, speed, and modularity. The modifications are summarized.
76 FR 2148 - Xilinx, Inc. Including On-Site Leased Workers of TEKsystems, Albuquerque, NM; Notice of Revised...

Federal Register 2010, 2011, 2012, 2013, 2014

2011-01-12

..., for integrated circuit test engineers and test equipment engineers for a Product and Test Engineering... engineering services. In the request for reconsideration, workers alleged that the subject firm has shifted abroad the supply of services like and directly competitive with the internal-use engineering services...
75 FR 65526 - Xilinx, Inc., Including On-Site Leased Workers of TEKsystems, Albuquerque, NM; Notice of Revised...

Federal Register 2010, 2011, 2012, 2013, 2014

2010-10-25

..., for integrated circuit test engineers and test equipment engineers for a Product and Test Engineering... engineering services. In the request for reconsideration, workers alleged that the subject firm has shifted abroad the supply of services like and directly competitive with the internal-use engineering services...
An FPGA Implementation of a Polychronous Spiking Neural Network with Delay Adaptation.

PubMed

Wang, Runchun; Cohen, Gregory; Stiefel, Klaus M; Hamilton, Tara Julia; Tapson, Jonathan; van Schaik, André

2013-01-01

We present an FPGA implementation of a re-configurable, polychronous spiking neural network with a large capacity for spatial-temporal patterns. The proposed neural network generates delay paths de novo, so that only connections that actually appear in the training patterns will be created. This allows the proposed network to use all the axons (variables) to store information. Spike Timing Dependent Delay Plasticity is used to fine-tune and add dynamics to the network. We use a time multiplexing approach allowing us to achieve 4096 (4k) neurons and up to 1.15 million programmable delay axons on a Virtex 6 FPGA. Test results show that the proposed neural network is capable of successfully recalling more than 95% of all spikes for 96% of the stored patterns. The tests also show that the neural network is robust to noise from random input spikes.
Synthesis of blind source separation algorithms on reconfigurable FPGA platforms

NASA Astrophysics Data System (ADS)

Du, Hongtao; Qi, Hairong; Szu, Harold H.

2005-03-01

-Specific Integrated Circuit (ASIC) using standard-height cells. ICA is an algorithm that can solve BSS problems by carrying out all-order statistical, decorrelation-based transforms, under the assumption that neighborhood pixels share the same but unknown mixing matrix A. In this paper, we continue our investigation of the design challenges of firmware approaches to smart algorithms. We think two levels of parallelization can be explored: pixel-based parallelization and the parallelization of the restoration algorithm performed at each pixel. This paper focuses on the latter, and we use ICA as an example to explain the design and implementation methods. It is well known that the capacity constraints of a single FPGA have limited the implementation of many complex algorithms, including ICA. Using the reconfigurability of the FPGA, we show in this paper how to manipulate the FPGA-based system to provide extra computing power for the parallelized ICA algorithm with limited FPGA resources. The synthesis targeting the Pilchard reconfigurable FPGA platform is reported. The Pilchard board carries a single Xilinx VIRTEX 1000E FPGA and transfers data directly to the CPU on the 64-bit memory bus at a maximum frequency of 133 MHz. Both the feasibility performance evaluations and the experimental results validate the effectiveness and practicality of this synthesis, which can be extended to spatial-variant jitter restoration for micro-UAV deployment.
Implementation of the 2-D Wavelet Transform into FPGA for Image

NASA Astrophysics Data System (ADS)

León, M.; Barba, L.; Vargas, L.; Torres, C. O.

2011-01-01

This paper presents a hardware implementation of the two-dimensional discrete wavelet transform algorithm for FPGA, using the Daubechies filter family of order 2 (db2). The decomposition algorithm of this transform is designed and simulated in the hardware description language VHDL and implemented in a programmable logic device (FPGA), the Xilinx XC3S1200E of the Spartan-3E family, taking advantage of the parallelism these devices offer and the processing speeds they can reach. The architecture is evaluated using input images of different sizes. This implementation is done with the aim of developing a future image encryption hardware system using the wavelet transform for information security.
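
As a software reference for the decomposition described above, one level of a 2-D db2 transform can be sketched as follows. This is an illustrative sketch, not the VHDL design of the paper: periodic boundary extension, the function names `dwt1d` and `dwt2d`, and the subband naming are all assumptions made here.

```python
import numpy as np

S3 = np.sqrt(3.0)
# Four-tap Daubechies order-2 (db2) low-pass analysis filter
H = np.array([1 + S3, 3 + S3, 3 - S3, 1 - S3]) / (4 * np.sqrt(2.0))
# Matching high-pass filter: g[k] = (-1)**k * h[3 - k]
G = np.array([H[3], -H[2], H[1], -H[0]])

def dwt1d(x, axis):
    """One db2 level along `axis` with periodic extension (even length assumed)."""
    x = np.moveaxis(np.asarray(x, dtype=float), axis, -1)
    n = x.shape[-1]
    # Each output sample sees a 4-sample window that advances by 2 (downsampling)
    idx = (np.arange(0, n, 2)[:, None] + np.arange(4)[None, :]) % n
    windows = x[..., idx]                      # shape (..., n // 2, 4)
    lo, hi = windows @ H, windows @ G
    return np.moveaxis(lo, -1, axis), np.moveaxis(hi, -1, axis)

def dwt2d(image):
    """Separable 2-D level: filter the rows, then the columns."""
    lo, hi = dwt1d(image, axis=1)
    ll, lh = dwt1d(lo, axis=0)                 # approximation, detail along columns
    hl, hh = dwt1d(hi, axis=0)                 # detail along rows, diagonal detail
    return ll, lh, hl, hh

image = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = dwt2d(image)
print(ll.shape)   # each subband has half the rows and half the columns
```

In hardware the two filters map naturally onto parallel multiply-accumulate units, since every output sample depends on only four inputs.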
NULL Convention Floating Point Multiplier

PubMed Central

Ramachandran, Seshasayanan

2015-01-01

Floating point multiplication is a critical part of high dynamic range and computationally intensive digital signal processing applications, which require high precision and low power. This paper presents the design of an IEEE 754 single precision floating point multiplier using the asynchronous NULL convention logic paradigm. Rounding has not been implemented to suit high precision applications. The novelty of the research is that it is the first ever NULL convention logic multiplier designed to perform floating point multiplication. The proposed multiplier offers a substantial decrease in power consumption when compared with its synchronous version. Performance attributes of the NULL convention logic floating point multiplier, obtained from Xilinx simulation and Cadence, are compared with its equivalent synchronous implementation. PMID:25879069
NULL convention floating point multiplier.

PubMed

Albert, Anitha Juliette; Ramachandran, Seshasayanan

2015-01-01

Floating point multiplication is a critical part of high dynamic range and computationally intensive digital signal processing applications, which require high precision and low power. This paper presents the design of an IEEE 754 single precision floating point multiplier using the asynchronous NULL convention logic paradigm. Rounding has not been implemented to suit high precision applications. The novelty of the research is that it is the first ever NULL convention logic multiplier designed to perform floating point multiplication. The proposed multiplier offers a substantial decrease in power consumption when compared with its synchronous version. Performance attributes of the NULL convention logic floating point multiplier, obtained from Xilinx simulation and Cadence, are compared with its equivalent synchronous implementation.
An IO block array in a radiation-hardened SOI SRAM-based FPGA

NASA Astrophysics Data System (ADS)

Yan, Zhao; Lihua, Wu; Xiaowei, Han; Yan, Li; Qianli, Zhang; Liang, Chen; Guoquan, Zhang; Jianzhong, Li; Bo, Yang; Jiantou, Gao; Jian, Wang; Ming, Li; Guizhai, Liu; Feng, Zhang; Xufeng, Guo; Kai, Zhao; Chen, Stanley L.; Fang, Yu; Zhongli, Liu

2012-01-01

We present an input/output block (IOB) array used in the radiation-hardened SRAM-based field-programmable gate array (FPGA) VS1000, which is designed and fabricated with a 0.5 μm partially depleted silicon-on-insulator (SOI) logic process at the CETC 58th Institute. Corresponding with the characteristics of the FPGA, each IOB includes a local routing pool and two IO cells composed of a signal path circuit, configurable input/output buffers and an ESD protection network. A boundary-scan path circuit can be used between the programmable buffers and the input/output circuit or as a transparent circuit when the IOB is applied in different modes. Programmable IO buffers can be used at TTL/CMOS standard levels. The local routing pool enhances the flexibility and routability of the connection between the IOB array and the core logic. Radiation-hardened designs, including A-type and H-type body-tied transistors and special D-type registers, improve the anti-radiation performance. The ESD protection network, which provides a high-impulse discharge path on a pad, prevents the breakdown of the core logic caused by the immense current. These design strategies facilitate the design of FPGAs with different capacities or architectures to form a series of FPGAs. The functionality and performance of the IOB array is proved after a functional test. The radiation test indicates that the proposed VS1000 chip with an IOB array has a total dose tolerance of 100 krad(Si), a dose survivability rate of 1.5 × 10¹¹ rad(Si)/s, and a neutron fluence immunity of 1 × 10¹⁴ n/cm².
A FPGA implementation for linearly unmixing a hyperspectral image using OpenCL

NASA Astrophysics Data System (ADS)

Guerra, Raúl; López, Sebastián; Sarmiento, Roberto

2017-10-01

Hyperspectral imaging systems provide images in which single pixels have information from across the electromagnetic spectrum of the scene under analysis. These systems divide the spectrum into many contiguous channels, which may even lie outside the visible part of the spectrum. The main advantage of hyperspectral imaging technology is that certain objects leave unique fingerprints in the electromagnetic spectrum, known as spectral signatures, which make it possible to distinguish between different materials that may look the same in a traditional RGB image. Accordingly, the most important hyperspectral imaging applications are related to distinguishing or identifying materials in a particular scene. In hyperspectral imaging applications under real-time constraints, the huge amount of information provided by the hyperspectral sensors has to be rapidly processed and analysed. For this purpose, parallel hardware devices such as Field Programmable Gate Arrays (FPGAs) are typically used. However, developing hardware applications typically requires expertise in the specific targeted device, as well as in the tools and methodologies used to implement the desired algorithms on that device. In this scenario, the Open Computing Language (OpenCL) emerges as a very interesting solution in which a single high-level synthesis design language can be used to efficiently develop applications for multiple, different hardware devices. In this work, the Fast Algorithm for Linearly Unmixing Hyperspectral Images (FUN) has been implemented on a Bitware Stratix V Altera FPGA using OpenCL. The obtained results demonstrate the suitability of OpenCL as a viable design methodology for quickly creating efficient FPGA designs for real-time hyperspectral imaging applications.
FPGA-based real time processing of the Plenoptic Wavefront Sensor

NASA Astrophysics Data System (ADS)

Rodríguez-Ramos, L. F.; Marín, Y.; Díaz, J. J.; Piqueras, J.; García-Jiménez, J.; Rodríguez-Ramos, J. M.

The plenoptic wavefront sensor combines measurements at pupil and image planes in order to obtain wavefront information simultaneously from different points of view, which makes it capable of sampling the volume above the telescope to extract the tomographic information of the atmospheric turbulence. The advantages of this sensor are presented elsewhere at this conference (José M. Rodríguez-Ramos et al). This paper will concentrate on the processing required for pupil plane phase recovery, and its computation in real time using FPGAs (Field Programmable Gate Arrays). This technology eases the implementation of massive parallel processing and allows tailoring the system to the requirements, maintaining flexibility, speed and cost figures.
Tunable photonic cavities for in-situ spectroscopic trace gas detection

DOEpatents

Bond, Tiziana; Cole, Garrett; Goddard, Lynford

2012-11-13

Compact tunable optical cavities are provided for in-situ NIR spectroscopy. MEMS-tunable VCSEL platforms represent a solid foundation for a new class of compact, sensitive and fiber-compatible sensors for fieldable, real-time, multiplexed gas detection systems. Detection limits for gases with NIR cross-sections such as O₂, CH₄, COₓ and NOₓ have been predicted to span approximately from 10ths to 10s of parts per million. An exemplary oxygen detection design and a process for 760 nm continuously tunable VCSELs are provided. This technology enables in-situ self-calibrating platforms with adaptive monitoring by exploiting Photonic FPGAs.
SAD5 Stereo Correlation Line-Striping in an FPGA

NASA Technical Reports Server (NTRS)

Villalpando, Carlos Y.; Morfopoulos, Arin C.

2011-01-01

High precision SAD5 stereo computations can be performed in an FPGA (field-programmable gate array) at much higher speeds than possible in a conventional CPU (central processing unit), but this uses large amounts of FPGA resources that scale with image size. Of the two key resources in an FPGA, Slices and BRAM (block RAM), Slices scale linearly in the new algorithm with image size, and BRAM scales quadratically with image size. An approach was developed to trade latency for BRAM by sub-windowing the image vertically into overlapping strips and stitching the outputs together to create a single continuous disparity output. In stereo, the general rule of thumb is that the disparity search range must be 1/10 the image size. In the new algorithm, BRAM usage scales linearly with disparity search range and scales again linearly with line width. So a doubling of image size, say from 640 to 1,280, would in the previous design be an effective 4× of BRAM usage: 2× for line width, 2× again for disparity search range. The minimum strip size is twice the search range, and will produce an output strip width equal to the disparity search range. So assuming a disparity search range of 1/10 image width, 10 sequential runs of the minimum strip size would produce a full output image. This approach allowed the innovators to fit 1280 × 960 wide SAD5 stereo disparity in less than 80 BRAM, 52k Slices on a Virtex 5LX330T, 25% and 24% of resources, respectively. Using a 100-MHz clock, this build would perform stereo at 39 Hz. Of particular interest to JPL is that there is a flight qualified version of the Virtex 5: this could produce stereo results even for very large image sizes at 3 orders of magnitude faster than could be computed on the PowerPC 750 flight computer. The work covered in the report allows the stereo algorithm to run on much larger images than before, and using much less BRAM. This opens up choices for a smaller flight FPGA (which saves power and space), or for other algorithms
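
The scaling argument in the record above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch only: the proportionality model (BRAM demand proportional to line width times disparity search range) and the 1/10 rule come from the abstract, while the function names and the integer rounding are assumptions made here.

```python
def relative_bram(width, ref_width=640, divisor=10):
    """BRAM demand relative to ref_width, assuming BRAM ~ line width * search range."""
    def cost(w):
        return w * (w // divisor)        # search range taken as 1/10 of the width
    return cost(width) / cost(ref_width)

def strip_plan(width, divisor=10):
    """Minimum strip width, usable output per strip, and number of sequential runs."""
    search = width // divisor
    strip = 2 * search                   # minimum strip is twice the search range
    output = search                      # each strip yields one search range of output
    runs = -(-width // output)           # ceiling division
    return strip, output, runs

print(relative_bram(1280))               # doubling 640 -> 1280 quadruples the demand
print(strip_plan(1280))                  # (strip width, output width, runs)
```

Under the same proportionality model, a 256-pixel strip of a 1280-pixel line needs 256/1280 = 1/5 of the line buffering of the full-width design, which is the latency-for-BRAM trade the abstract describes.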
A high data rate universal lattice decoder on FPGA

NASA Astrophysics Data System (ADS)

Ma, Jing; Huang, Xinming; Kura, Swapna

2005-06-01

This paper presents the architecture design of a high data rate universal lattice decoder for MIMO channels on an FPGA platform. A lattice decoding algorithm based on the Pohst strategy is modified in this paper to reduce the complexity of the closest lattice point search. The data dependency of the improved algorithm is examined, and a parallel and pipelined architecture is developed with the iterative decoding function on the FPGA and the division-intensive channel matrix preprocessing on a DSP. Simulation results demonstrate that the improved lattice decoding algorithm provides a better bit error rate and requires fewer iterations than the original algorithm. The system prototype of the decoder shows that it supports data rates up to 7 Mbit/s on a Virtex2-1000 FPGA, which is about 8 times faster than the original algorithm on the FPGA platform and two orders of magnitude better than its implementation on a DSP platform.

VLSI implementation of RSA encryption system using ancient Indian Vedic mathematics

NASA Astrophysics Data System (ADS)

Thapliyal, Himanshu; Srinivas, M. B.

2005-06-01

This paper proposes a hardware implementation of the RSA encryption/decryption algorithm using algorithms of Ancient Indian Vedic Mathematics that have been modified to improve performance. The recently proposed hierarchical overlay multiplier architecture is used in the RSA circuitry for the multiplication operation. The most significant aspect of the paper is the development of a division architecture based on the Straight Division algorithm of Ancient Indian Vedic Mathematics and its embedding in the RSA encryption/decryption circuitry for improved efficiency. The coding is done in Verilog HDL and the FPGA synthesis is done using the Xilinx Spartan library. The results show that RSA circuitry implemented using Vedic division and multiplication is efficient in terms of area/speed compared to its implementation using conventional multiplication and division architectures.
FPGA architecture and implementation of sparse matrix vector multiplication for the finite element method

NASA Astrophysics Data System (ADS)

Elkurdi, Yousef; Fernández, David; Souleimanov, Evgueni; Giannacopoulos, Dennis; Gross, Warren J.

2008-04-01

The Finite Element Method (FEM) is a computationally intensive scientific and engineering analysis tool that has diverse applications ranging from structural engineering to electromagnetic simulation. The trends in floating-point performance are moving in favor of Field-Programmable Gate Arrays (FPGAs); hence, interest in exploiting this technology has grown in the scientific community. We present an architecture and implementation of an FPGA-based sparse matrix-vector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from FEM applications. FEM matrices display specific sparsity patterns that can be exploited to improve the efficiency of hardware designs. Our architecture exploits FEM matrix sparsity structure to achieve a balance between performance and hardware resource requirements by relying on external SDRAM for data storage while utilizing the FPGA's computational resources in a stream-through systolic approach. The architecture is based on a pipelined linear array of processing elements (PEs) coupled with a hardware-oriented matrix striping algorithm and a partitioning scheme which enables it to process arbitrarily big matrices without changing the number of PEs in the architecture. Therefore, this architecture is only limited by the amount of external RAM available to the FPGA. The implemented SMVM-pipeline prototype contains 8 PEs and is clocked at 110 MHz, obtaining a peak performance of 1.76 GFLOPS. For 8 GB/s of memory bandwidth typical of recent FPGA systems, this architecture can achieve 1.5 GFLOPS sustained performance. Using multiple instances of the pipeline, linear scaling of the peak and sustained performance can be achieved. Our stream-through architecture provides the added advantage of enabling an iterative implementation of the SMVM computation required by iterative solution techniques such as the conjugate gradient method, avoiding initialization time due to data loading and setup inside
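
The quoted peak figure is consistent with each PE completing one multiply and one add per clock cycle; that per-PE rate is an assumption inferred from the numbers, not a statement from the abstract. The sketch below checks the arithmetic and adds a plain compressed-sparse-row (CSR) product as a generic software reference. CSR is used here only for illustration and is not the striping scheme of the paper.

```python
def peak_gflops(num_pes, clock_mhz, flops_per_pe_per_cycle=2):
    """Peak rate, assuming one multiply and one add per PE per cycle."""
    return num_pes * clock_mhz * 1e6 * flops_per_pe_per_cycle / 1e9

def csr_matvec(values, col_idx, row_ptr, x):
    """y = A @ x for a CSR matrix; each stored nonzero costs two FLOPs."""
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]   # one multiply, one add
        y.append(acc)
    return y

print(peak_gflops(8, 110))                     # 8 PEs at 110 MHz
# A = [[2, 0], [1, 3]] stored row by row, multiplied by x = [1, 2]
print(csr_matvec([2.0, 1.0, 3.0], [0, 0, 1], [0, 1, 3], [1.0, 2.0]))
```

On these figures the sustained 1.5 GFLOPS corresponds to roughly 85% of peak (1.5 / 1.76).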
Improving Design Efficiency for Large-Scale Heterogeneous Circuits

NASA Astrophysics Data System (ADS)

Gregerson, Anthony

Despite increases in logic density, many Big Data applications must still be partitioned across multiple computing devices in order to meet their strict performance requirements. Among the most demanding of these applications is high-energy physics (HEP), which uses complex computing systems consisting of thousands of FPGAs and ASICs to process the sensor data created by experiments at particle accelerators such as the Large Hadron Collider (LHC). Designing such computing systems is challenging due to the scale of the systems, the exceptionally high-throughput and low-latency performance constraints that necessitate application-specific hardware implementations, the requirement that algorithms are efficiently partitioned across many devices, and the possible need to update the implemented algorithms during the lifetime of the system. In this work, we describe our research to develop flexible architectures for implementing such large-scale circuits on FPGAs. In particular, this work is motivated by (but not limited in scope to) high-energy physics algorithms for the Compact Muon Solenoid (CMS) experiment at the LHC. To make efficient use of logic resources in multi-FPGA systems, we introduce Multi-Personality Partitioning, a novel form of the graph partitioning problem, and present partitioning algorithms that can significantly improve resource utilization on heterogeneous devices while also reducing inter-chip connections. To reduce the high communication costs of Big Data applications, we also introduce Information-Aware Partitioning, a partitioning method that analyzes the data content of application-specific circuits, characterizes their entropy, and selects circuit partitions that enable efficient compression of data between chips. We employ our information-aware partitioning method to improve the performance of the hardware validation platform for evaluating new algorithms for the CMS experiment. Together, these research efforts help to improve the efficiency
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.

PubMed

Zierke, Stephanie; Bakos, Jason D

2010-04-12

Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
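
The kernel described above is, in its textbook form, Felsenstein's pruning recursion: the conditional likelihood of each parent state is the product, over the two children, of the transition-weighted child likelihoods. The sketch below shows that standard recursion in NumPy. It is not the MrBayes code or the co-processor design; the function names are hypothetical, and it uses an exact `np.log` where the paper uses a hardware-friendly approximation.

```python
import numpy as np

def plf_node(left_cl, right_cl, p_left, p_right):
    """Conditional likelihoods at a parent node.

    left_cl, right_cl: arrays of shape (sites, states) for the two children.
    p_left, p_right:   (states, states) matrices, P[i, j] = Pr(child = j | parent = i).
    Every site is independent of the others, which is what makes the loop
    easy to pipeline and parallelize.
    """
    return (left_cl @ p_left.T) * (right_cl @ p_right.T)

def site_log_likelihoods(root_cl, base_freqs):
    """Natural log of each per-site likelihood at the root."""
    return np.log(root_cl @ base_freqs)

# Two-state toy example with one site: left tip observed in state 0, right tip in state 1
p = np.array([[0.9, 0.1],
              [0.1, 0.9]])
left = np.array([[1.0, 0.0]])
right = np.array([[0.0, 1.0]])
root = plf_node(left, right, p, p)
print(root, site_log_likelihoods(root, np.array([0.5, 0.5])))
```

Real nucleotide models use four states and per-branch transition matrices, but the data flow is the same.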
Real-time windowing in imaging radar using FPGA technique

NASA Astrophysics Data System (ADS)

Ponomaryov, Volodymyr I.; Escamilla-Hernandez, Enrique

2005-02-01

Imaging radar uses high-frequency electromagnetic waves reflected from objects to estimate their parameters. Pulse compression is a standard signal processing technique used to minimize the peak transmission power, maximize SNR, and obtain better resolution. Pulse compression is usually achieved with a matched filter. The side-lobe level in imaging radar can be reduced by processing with a special weighting function. Many well-known weighting functions are widely used in signal processing applications: Hamming, Hanning, Blackman, Chebyshev, Blackman-Harris, Kaiser-Bessel, etc. Field Programmable Gate Arrays (FPGAs) offer great benefits such as instantaneous implementation, dynamic reconfiguration, design, and field programmability. This reconfigurability makes FPGAs a better solution than custom-made integrated circuits. This work aims at demonstrating a reasonably flexible implementation of a linear FM signal and pulse compression using Matlab, Simulink, and System Generator. Employing an FPGA and the mentioned software, we propose a pulse compression design on FPGA that uses classical and novel windowing techniques to reduce the side-lobe level. This increases the ability to detect small or closely spaced targets in imaging radar. The FPGA's capacity for parallel real-time processing makes it possible to realize the proposed algorithms. The paper also presents experimental results of the proposed windowing procedure in a marine radar with the following parameters: the signal is linear FM (chirp); the frequency deviation DF is 9.375 MHz; the pulse width T is 3.2 μs; the number of taps in the matched filter is 800; the sampling frequency is 253.125 × 10⁶ Hz. The side-lobe levels were reduced in real time, permitting better resolution of small targets.
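
The effect of weighting the matched filter can be illustrated in software. The following is a rough NumPy sketch, not the Simulink/System Generator design of the paper. It reuses the stated deviation (9.375 MHz) and pulse width (3.2 μs), reads the sampling frequency as 253.125 MHz (an assumption, which gives 810 samples, close to the 800 taps quoted), and picks a Hamming window as one representative of the classical windows listed.

```python
import numpy as np

def linear_fm(bandwidth_hz, pulse_s, fs_hz):
    """Complex baseband linear FM (chirp) pulse centred on zero frequency."""
    n = int(round(pulse_s * fs_hz))
    t = (np.arange(n) - (n - 1) / 2) / fs_hz
    return np.exp(1j * np.pi * (bandwidth_hz / pulse_s) * t ** 2)

def compress(signal, window):
    """Pulse compression: convolve with the windowed, time-reversed conjugate replica."""
    taps = np.conj(signal[::-1]) * window
    return np.convolve(signal, taps)

def peak_sidelobe_db(response):
    """Highest side lobe relative to the main peak, in dB."""
    mag = np.abs(response)
    peak = int(np.argmax(mag))
    i = peak
    while i + 1 < len(mag) and mag[i + 1] < mag[i]:
        i += 1                           # walk down the main lobe to its first minimum
    return 20 * np.log10(mag[i:].max() / mag[peak])

chirp = linear_fm(9.375e6, 3.2e-6, 253.125e6)
psl_rect = peak_sidelobe_db(compress(chirp, np.ones(len(chirp))))
psl_hamming = peak_sidelobe_db(compress(chirp, np.hamming(len(chirp))))
print(f"unweighted: {psl_rect:.1f} dB, Hamming-weighted: {psl_hamming:.1f} dB")
```

The weighted filter trades a wider main lobe and a small SNR loss for lower side lobes, which is why the choice of window matters when small targets sit next to strong ones.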
The Use of Field Programmable Gate Arrays (FPGA) in Small Satellite Communication Systems

NASA Technical Reports Server (NTRS)

Varnavas, Kosta; Sims, William Herbert; Casas, Joseph

2015-01-01

This paper describes the use of digital Field Programmable Gate Arrays (FPGAs) to advance the state of the art in software defined radio (SDR) transponder design for the emerging SmallSat and CubeSat industry and to provide advances for NASA as described in the TA05 Communication and Navigation Roadmap (Ref 4). Software defined radios have been around for a long time. A typical SDR implementation uses a processor and software to implement all the functions of filtering, carrier recovery, error correction, framing, etc. Even with modern high-speed, low-power digital signal processors, high-speed memories, and efficient coding, the compute-intensive nature of digital filters, error correction and other algorithms is too much for modern processors to make efficient use of the available bandwidth to the ground. By using FPGAs, these compute-intensive tasks can be done in a parallel, pipelined fashion that uses every clock cycle more efficiently, significantly increasing throughput while maintaining low power. These methods will implement digital radios with significant data rates in the X and Ka bands. Using these state-of-the-art technologies, unprecedented uplink and downlink capabilities can be achieved in a 1/2 U sized telemetry system. Additionally, modern FPGAs have embedded processing systems, such as ARM cores, integrated inside the FPGA, allowing mundane tasks such as parameter commanding to occur easily and flexibly. Potential partners include other NASA centers, industry and the DOD. These assets are associated with small satellite demonstration flights, LEO and deep space applications. MSFC currently has an SDR transponder test-bed using Hardware-in-the-Loop techniques to evaluate and improve SDR technologies.
Performance and advantages of a soft-core based parallel architecture for energy peak detection in the calorimeter Level 0 trigger for the NA62 experiment at CERN

NASA Astrophysics Data System (ADS)

Ammendola, R.; Barbanera, M.; Bizzarri, M.; Bonaiuto, V.; Ceccucci, A.; Checcucci, B.; De Simone, N.; Fantechi, R.; Federici, L.; Fucci, A.; Lupi, M.; Paoluzzi, G.; Papi, A.; Piccini, M.; Ryjov, V.; Salamon, A.; Salina, G.; Sargeni, F.; Venditti, S.

2017-03-01

The NA62 experiment at the CERN SPS has started its data-taking. Its aim is to measure the branching ratio of the ultra-rare decay K+ → π+ν ν̅ . In this context, rejecting the background is crucial. One of the main backgrounds to the measurement is the K+ → π+π0 decay. In the 1-8.5 mrad decay region this background is rejected by the calorimetric trigger processor (Cal-L0). In this work we present the performance of a soft-core based parallel architecture built on FPGAs for the energy peak reconstruction, as an alternative to an implementation based entirely on VHDL.
Precise delay measurement through combinatorial logic

NASA Technical Reports Server (NTRS)

Burke, Gary R. (Inventor); Chen, Yuan (Inventor); Sheldon, Douglas J. (Inventor)

2010-01-01

A high resolution circuit and method for facilitating precise measurement of on-chip delays for FPGAs for reliability studies. The circuit embeds a pulse generator on an FPGA chip having one or more groups of LUTs (the "LUT delay chain"), also on-chip. The circuit also embeds a pulse width measurement circuit on-chip, and measures the duration of the generated pulse through the delay chain. The pulse width of the output pulse represents the delay through the delay chain without any I/O delay. The pulse width measurement circuit uses an additional asynchronous clock, independent of the main clock, and the FPGA propagation delay can be displayed on a hex display continuously for testing purposes.
The new ATLAS/LUCID detector

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bruschi, Marco

The new ATLAS luminosity monitor has many innovative aspects implemented. Its photomultiplier tubes are used as detector elements by using the Cherenkov light produced by charged particles above threshold crossing the quartz windows. The analog shaping of the readout chain has been improved, in order to cope with the 25 ns bunch spacing of the LHC machine. The main readout card is a quite general processing unit based on a 12-bit, 500 MS/s Flash ADC and on FPGAs, delivering the processed data to 1.3 Gb/s optical links. The article will describe all these aspects and will outline future perspectives of the card for next generation high energy physics experiments. (authors)
A Memory-Based Programmable Logic Device Using Look-Up Table Cascade with Synchronous Static Random Access Memories

NASA Astrophysics Data System (ADS)

Nakamura, Kazuyuki; Sasao, Tsutomu; Matsuura, Munehiro; Tanaka, Katsumasa; Yoshizumi, Kenichi; Nakahara, Hiroki; Iguchi, Yukihiro

2006-04-01

A large-scale memory-technology-based programmable logic device (PLD) using a look-up table (LUT) cascade is developed in the 0.35-μm standard complementary metal oxide semiconductor (CMOS) logic process. Eight 64 K-bit synchronous SRAMs are connected to form an LUT cascade with a few additional circuits. The features of the LUT cascade include: 1) a flexible cascade connection structure, 2) multi-phase pseudo-asynchronous operations with synchronous static random access memory (SRAM) cores, and 3) LUT-bypass redundancy. This chip operates at 33 MHz in 8-LUT cascades at 122 mW. Benchmark results show that it achieves performance comparable to field programmable gate arrays (FPGAs).
Towards an Analogue Neuromorphic VLSI Instrument for the Sensing of Complex Odours

NASA Astrophysics Data System (ADS)

Ab Aziz, Muhammad Fazli; Harun, Fauzan Khairi Che; Covington, James A.; Gardner, Julian W.

2011-09-01

Almost all electronic nose instruments reported today employ pattern recognition algorithms written in software and run on digital processors, e.g. micro-processors, microcontrollers or FPGAs. Conversely, in this paper we describe the analogue VLSI implementation of an electronic nose through the design of a neuromorphic olfactory chip. The modelling, design and fabrication of the chip have already been reported. Here a smart interface has been designed and characterised for this neuromorphic chip. Thus we can demonstrate the functionality of the analogue VLSI neuromorphic chip, producing differing principal neuron firing patterns in response to real sensor response data. Further work is directed towards integrating 9 separate neuromorphic chips to create a large neuronal network to solve more complex olfactory problems.
A novel approach to Hough Transform for implementation in fast triggers

NASA Astrophysics Data System (ADS)

Pozzobon, Nicola; Montecassiano, Fabio; Zotto, Pierluigi

2016-10-01

Telescopes of position sensitive detectors are common layouts in charged particle tracking, and programmable logic devices, such as FPGAs, represent a viable choice for the real-time reconstruction of track segments in such detector arrays. A compact implementation of the Hough Transform for fast triggers in High Energy Physics, exploiting a parameter reduction method, is proposed. It targets a reduction of the storage or computing resources needed in current or near-future state-of-the-art FPGA devices, while retaining high resolution over a wide range of track parameters. The proposed approach is compared to a Standard Hough Transform with particular emphasis on their application to muon detectors. In both cases, an original readout implementation is modeled.
Secure TRNG with random phase stimulation

NASA Astrophysics Data System (ADS)

Wieczorek, Piotr Z.

2017-08-01

In this paper a novel concept is proposed for a TRNG, a vital part of cryptographic systems. The proposed TRNG uses the phase variability of a pair of ring oscillators (ROs) to force multiple metastable events in a flip-flop (FF). In the solution, the ROs are periodically activated to ensure the violation of the FF timing and the resultant state randomness, while the TRNG circuit adapts the structure of the ROs to obtain the maximum entropy and circuit security. The TRNG can be implemented in inexpensive re-programmable devices (CPLDs or FPGAs) without the use of Digital Clock Managers (DCMs). Preliminary test results proved the circuit's immunity to intentional frequency injection attacks.
The development of a specialized processor for a space-based multispectral earth imager

NASA Astrophysics Data System (ADS)

Khedr, Mostafa E.

2008-10-01

This work was done in the Department of Computer Engineering, Lvov Polytechnic National University, Lvov, Ukraine, as a thesis entitled "Space Imager Computer System for Raw Video Data Processing" [1]. This work describes the synthesis and practical implementation of a specialized computer system for raw data control and processing onboard a satellite multispectral earth imager. This computer system is intended for satellites with resolution in the range of one meter with 12-bit precision. The design is based mostly on general off-the-shelf components such as FPGAs, plus custom-designed software for interfacing with a PC and test equipment. The designed system was successfully manufactured and is now fully functioning in orbit.
A Survey of Methods for Analyzing and Improving GPU Energy Efficiency

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mittal, Sparsh; Vetter, Jeffrey S

2014-01-01

Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to a dramatic increase in their power consumption. This paper surveys research works on analyzing and improving energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare energy efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of the state of the art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow.
Dense real-time stereo matching using memory efficient semi-global-matching variant based on FPGAs

NASA Astrophysics Data System (ADS)

Buder, Maximilian

2012-06-01

This paper presents a stereo image matching system that takes advantage of a global image matching method. The system is designed to provide depth information for mobile robotic applications. Typical tasks of the proposed system are to assist in obstacle avoidance, SLAM and path planning. Mobile robots pose strong requirements on the size, energy consumption, reliability and output quality of the image matching subsystem. Currently available systems rely either on active sensors or on local stereo image matching algorithms. The former are only suitable in controlled environments, while the latter suffer from low-quality depth maps. Top-ranking quality results are only achieved by an iterative approach using global image matching and color segmentation techniques, which are computationally demanding and therefore difficult to execute in realtime. Attempts were made to still reach realtime performance with global methods by simplifying the routines, but the resulting depth maps are almost comparable to those of local methods. A semi-global algorithm was proposed earlier that shows both very good image matching results and relatively simple operations. A memory-efficient variant of the Semi-Global-Matching algorithm is reviewed and adopted for an implementation based on reconfigurable hardware. The implementation is suitable for realtime execution in the field of robotics. It will be shown, based on the Middlebury dataset, that the modified version of the efficient Semi-Global-Matching method delivers results equivalent to those of the original algorithm. The system has proven to be capable of processing VGA-sized images with a disparity resolution of 64 pixels at 33 frames per second on low-cost to mid-range hardware. If the focus is shifted to a higher image resolution, 1024×1024-sized stereo frames may be processed with the same hardware at 10 fps. The disparity resolution settings stay unchanged.
A mobile system that covers preprocessing, matching and interfacing operations is also presented.
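For orientation, the cost aggregation at the heart of Semi-Global Matching can be sketched for a single path direction. This is the standard published recurrence, not the memory-efficient variant the abstract refers to (whose details are not given here); the penalties `p1` and `p2` and the function name are illustrative assumptions.

```python
def aggregate_path(costs, p1, p2):
    """Aggregate matching costs along one scanline direction.

    costs[x][d] is the pixel-wise matching cost at position x for disparity d.
    p1 penalises a disparity change of one level, p2 any larger jump.
    """
    prev = list(costs[0])
    out = [prev]
    for cost in costs[1:]:
        best_prev = min(prev)
        cur = []
        for d, c in enumerate(cost):
            candidates = [prev[d], best_prev + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < len(prev):
                candidates.append(prev[d + 1] + p1)
            # Subtracting best_prev keeps the running values bounded.
            cur.append(c + min(candidates) - best_prev)
        out.append(cur)
        prev = cur
    return out

# Toy example: three pixels, three disparity levels.
path = aggregate_path([[0, 5, 5], [5, 0, 5], [5, 5, 0]], p1=1, p2=3)
```

In a full implementation the same recurrence is run along several directions and the results are summed before the per-pixel minimum over disparity is taken; storing those per-path costs is the memory burden that a memory-efficient variant would have to reduce.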
Hardware-software face detection system based on multi-block local binary patterns

NASA Astrophysics Data System (ADS)

Acasandrei, Laurentiu; Barriga, Angel

2015-03-01

Face detection is an important aspect of biometrics, video surveillance and human-computer interaction. Due to the complexity of the detection algorithms, any face detection system requires a huge amount of computational and memory resources. In this communication, an accelerated implementation of the MB-LBP face detection algorithm targeting low-frequency, low-memory and low-power embedded systems is presented. The resulting implementation is time deterministic and uses a customizable AMBA IP hardware accelerator. The IP implements the kernel operations of the MB-LBP algorithm and can be used as a universal accelerator for MB-LBP based applications. The IP employs 8 parallel MB-LBP feature evaluator cores, uses a deterministic bandwidth, has a low area profile, and its power consumption is ~95 mW on a Virtex5 XC5VLX50T. The resulting implementation's acceleration gain is between 5 and 8 times, while the hardware MB-LBP feature evaluation gain is between 69 and 139 times.
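The kernel operation that such an accelerator evaluates can be sketched as follows. This is a minimal software model of a multi-block LBP feature, assuming an integral image for the block sums and a clockwise bit order starting at the top-left block; the bit order and function names are conventions chosen for illustration, not taken from the paper.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def block_sum(ii, x, y, w, h):
    """Sum of the w-by-h block whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

# Neighbour blocks as (column, row) offsets, clockwise from the top-left.
NEIGHBOURS = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

def mb_lbp(ii, x, y, bw, bh):
    """8-bit MB-LBP code of the 3x3 block grid anchored at (x, y)."""
    centre = block_sum(ii, x + bw, y + bh, bw, bh)
    code = 0
    for cx, cy in NEIGHBOURS:
        s = block_sum(ii, x + cx * bw, y + cy * bh, bw, bh)
        code = (code << 1) | (1 if s >= centre else 0)
    return code

# 6x6 test patch: bright left column of blocks, mid-grey centre block.
patch = [[9, 9, 1, 1, 1, 1] for _ in range(6)]
for r in (2, 3):
    patch[r][2] = patch[r][3] = 5
code = mb_lbp(integral_image(patch), 0, 0, 2, 2)
```

Because each block sum costs four table look-ups regardless of block size, the per-feature work is fixed, which is consistent with the deterministic timing the abstract emphasises.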
A new simple technique for improving the random properties of chaos-based cryptosystems

NASA Astrophysics Data System (ADS)

Garcia-Bosque, M.; Pérez-Resa, A.; Sánchez-Azqueta, C.; Celma, S.

2018-03-01

A new technique for improving the security of chaos-based stream ciphers has been proposed and tested experimentally. This technique improves the randomness properties of the generated keystream by preventing the system from falling into short-period cycles due to digitization. In order to test this technique, a stream cipher based on a Skew Tent Map algorithm has been implemented on a Virtex 7 FPGA. The randomness of the keystream generated by this system has been compared to the randomness of the keystream generated by the same system with the proposed randomness-enhancement technique. By subjecting both keystreams to the National Institute of Standards and Technology (NIST) tests, we have shown that our method can considerably improve the randomness of the generated keystreams. Incorporating our randomness-enhancement technique required only 41 extra slices, showing that, apart from being effective, this method is also efficient in terms of area and hardware resources.
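The short-cycle problem mentioned in this abstract can be reproduced with a small fixed-point model. The sketch below only digitizes a plain skew tent map and measures the cycle it falls into; it does not implement the paper's enhancement technique, which is not described here. The word length, the control parameter `gamma` and the seed are arbitrary illustrative values.

```python
def skew_tent_step(x, gamma, bits):
    """One iteration of the skew tent map on integers in [0, 2**bits)."""
    scale = 1 << bits
    if x < gamma:
        nxt = (x * scale) // gamma
    else:
        nxt = ((scale - x) * scale) // (scale - gamma)
    return min(nxt, scale - 1)  # x == gamma would otherwise map to 2**bits

def cycle_structure(seed, gamma, bits):
    """Return (transient length, cycle period) of the digitized orbit."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = skew_tent_step(x, gamma, bits)
        step += 1
    return seen[x], step - seen[x]

# Hypothetical 12-bit state; gamma must satisfy 0 < gamma < 2**bits.
tail, period = cycle_structure(seed=1234, gamma=1500, bits=12)
```

With a finite word length the orbit must eventually repeat, and in practice the period is often far shorter than the 2**bits upper bound, which is the weakness a randomness-enhancement technique has to address.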
Programmable logic controller performance enhancement by field programmable gate array based design.

PubMed

Patel, Dhruv; Bhatt, Jignesh; Trivedi, Sanjay

2015-01-01

The PLC, the core element of modern automation systems, exhibits limitations such as slow speed and poor scan time due to serial execution. An improved PLC design using an FPGA has been proposed, based on a parallel execution mechanism, for enhanced performance and flexibility. Modelsim was used as the simulation platform and VHDL to translate, integrate and implement the logic circuit in the FPGA. Xilinx's Spartan kit was used for implementation and testing, and VB for GUI development. Salient merits of the design include cost-effectiveness, miniaturization, user-friendliness and simplicity, along with lower power consumption, smaller scan time and higher speed. Various functionalities and applications, such as a typical PLC and an industrial alarm annunciator, have been developed and successfully tested. Results of simulation, design and implementation have been reported. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Implementation in an FPGA circuit of Edge detection algorithm based on the Discrete Wavelet Transforms

NASA Astrophysics Data System (ADS)

Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia

2017-07-01

The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many real-time imaging systems. In this paper, a high-throughput edge or contour detection algorithm based on the discrete wavelet transform is proposed. A technique for applying the filters in the three directions (horizontal, vertical and diagonal) of the image is used to reveal as many of the existing contours as possible. The proposed architectures were designed in VHDL and mapped to a Xilinx Spartan-6 FPGA. The synthesis results show that the proposed architecture has a low area cost and can operate at up to 100 MHz, so it can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
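The idea of combining the three directional detail bands into an edge map can be sketched with a one-level transform. The abstract does not name the wavelet, so the Haar wavelet is assumed here purely for simplicity, and the function name is illustrative.

```python
def haar_edge_map(img):
    """One-level 2D Haar DWT; returns (approximation, edge magnitude).

    The edge magnitude is the sum of the absolute horizontal, vertical and
    diagonal detail coefficients of each 2x2 block.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    approx = [[0.0] * cols for _ in range(rows)]
    edges = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            a, b = img[2 * r][2 * c], img[2 * r][2 * c + 1]
            p, q = img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1]
            approx[r][c] = (a + b + p + q) / 4
            horiz = (a - b + p - q) / 4  # difference across columns
            vert = (a + b - p - q) / 4   # difference across rows
            diag = (a - b - p + q) / 4   # diagonal difference
            edges[r][c] = abs(horiz) + abs(vert) + abs(diag)
    return approx, edges

# Vertical step edge between columns 2 and 3 of a 4x8 image.
image = [[0, 0, 0, 10, 10, 10, 10, 10] for _ in range(4)]
approx, edges = haar_edge_map(image)
```

Each output coefficient depends only on one 2x2 block, so the computation needs only additions, subtractions and shifts, which is part of why such transforms map well to FPGA logic. Note that a single decimated level misses steps that fall exactly on a block boundary; an undecimated transform or a further level would catch those.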

A minimal SATA III Host Controller based on FPGA

NASA Astrophysics Data System (ADS)

Liu, Hailiang

2018-03-01

SATA (Serial Advanced Technology Attachment) is an advanced serial bus with outstanding performance in transmitting high-speed real-time data, applied in personal computers, the financial industry, astronautics and aeronautics, etc. In this work, a minimal SATA III Host Controller based on a Xilinx Kintex-7 series FPGA is designed and implemented. Compared to the state of the art, register utilization is reduced by 25.3% and LUT utilization by 65.9%. According to the experimental results, the controller works precisely and steadily, with a read bandwidth of up to 536 MB per second and a write bandwidth of up to 512 MB per second, both of which are close to the maximum bandwidth of the SSD (Solid State Disk) device. The host controller is very suitable for high-speed data transmission and mass data storage.
Irradiation setup at the U-120M cyclotron facility

NASA Astrophysics Data System (ADS)

Křížek, F.; Ferencei, J.; Matlocha, T.; Pospíšil, J.; Príbeli, P.; Raskina, V.; Isakov, A.; Štursa, J.; Vaňát, T.; Vysoká, K.

2018-06-01

This paper describes parameters of the proton beams provided by the U-120M cyclotron and the related irradiation setup at the open access irradiation facility at the Nuclear Physics Institute of the Czech Academy of Sciences. The facility is suitable for testing radiation hardness of various electronic components. The use of the setup is illustrated by a measurement of an error rate for errors caused by Single Event Transients in an SRAM-based Xilinx XC3S200 FPGA. This measurement provides an estimate of a possible occurrence of Single Event Transients. Data suggest that the variation of error rate of the Single Event Effects for different clock phase shifts is not significant enough to use clock phase alignment with the beam as a fault mitigation technique.
Data transmission optical link for RF-GUN project

NASA Astrophysics Data System (ADS)

Olowski, Krzysztof; Zielinski, Jerzy; Jalmuzna, Wojciech; Pozniak, Krzysztof; Romaniuk, Ryszard

2005-09-01

Today, fast optical data transmission is one of the fundamentals of modern distributed control systems. Fibers are widely used as a medium for multi-gigabit data streams. For short-range transmission, multimode fibers are in common use. The data rate for this kind of transmission exceeds 10 Gbps for the 10 Gigabit Ethernet and 10G Fibre Channel protocols. Field Programmable Gate Arrays are one option for managing the optical transmission. This article concerns a synchronous optical transmission system over a multimode fiber. The transmission is controlled by FPGAs from two manufacturers: Xilinx and Altera. This paper contains an overview of the newest technology and the parameters of devices on the market. It also describes a board for the optical transmission, technical details of the transmission and optical transmission results.
An FPGA-based bolometer for the MAST-U Super-X divertor.

PubMed

Lovell, Jack; Naylor, Graham; Field, Anthony; Drewelow, Peter; Sharples, Ray

2016-11-01

A new resistive bolometer system has been developed for MAST-Upgrade. It will measure radiated power in the new Super-X divertor, with millisecond time resolution, along 16 vertical and 16 horizontal lines of sight. The system uses a Xilinx Zynq-7000 series Field-Programmable Gate Array (FPGA) in the D-TACQ ACQ2106 carrier to perform real time data acquisition and signal processing. The FPGA enables AC-synchronous detection using high performance digital filtering to achieve a high signal-to-noise ratio and will be able to output processed data in real time with millisecond latency. The system has been installed on 8 previously unused channels of the JET vertical bolometer system. Initial results suggest good agreement with data from existing vertical channels but with higher bandwidth and signal-to-noise ratio.
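AC-synchronous detection, as named in this abstract, can be illustrated with a short numerical sketch. It assumes the reference frequency is known, mixes the digitised signal with in-phase and quadrature references, and uses a plain average over an integer number of periods as a stand-in for the high-performance digital filter; the sample rate, frequency and function name are illustrative values, not those of the instrument.

```python
import math

def synchronous_detect(samples, fs, f_ref):
    """Recover the amplitude of the f_ref component of `samples`.

    The signal is multiplied by cosine and sine references and averaged,
    which rejects DC offsets and components at other frequencies when the
    record spans an integer number of reference periods.
    """
    i_sum = q_sum = 0.0
    for n, x in enumerate(samples):
        phase = 2 * math.pi * f_ref * n / fs
        i_sum += x * math.cos(phase)
        q_sum += x * math.sin(phase)
    i_avg = i_sum / len(samples)
    q_avg = q_sum / len(samples)
    return 2 * math.hypot(i_avg, q_avg)

# Synthetic bridge signal: 0.3 amplitude at 1 kHz on a 0.5 DC offset.
fs, f_ref = 100_000.0, 1_000.0
signal = [0.5 + 0.3 * math.sin(2 * math.pi * f_ref * n / fs + 0.7)
          for n in range(1000)]
amplitude = synchronous_detect(signal, fs, f_ref)
```

Using both quadrature components makes the result independent of the unknown phase between the excitation and the measured signal.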
A FPGA-based Cluster Finder for CMOS Monolithic Active Pixel Sensors of the MIMOSA-26 Family

NASA Astrophysics Data System (ADS)

Li, Qiyan; Amar-Youcef, S.; Doering, D.; Deveaux, M.; Fröhlich, I.; Koziel, M.; Krebs, E.; Linnik, B.; Michel, J.; Milanovic, B.; Müntz, C.; Stroth, J.; Tischler, T.

2014-06-01

CMOS Monolithic Active Pixel Sensors (MAPS) have demonstrated excellent performance in the field of charged particle tracking. Among their strong points are a single-point resolution of a few μm and a light material budget of 0.05% X0, in combination with good radiation tolerance and high rate capability. Those features make the sensors a valuable technology for vertex detectors of various experiments in heavy ion and particle physics. To reduce the load on the event builders and future mass storage systems, we have developed algorithms suited for preprocessing and reducing the data streams generated by the MAPS. This real-time processing employs remaining free resources of the FPGAs of the readout controllers of the detector and complements the on-chip data reduction circuits of the MAPS.
Optimized FPGA Implementation of the Thyroid Hormone Secretion Mechanism Using CAD Tools.

PubMed

Alghazo, Jaafar M

2017-02-01

The goal of this paper is to implement the secretion mechanism of the Thyroid Hormone (TH), based on bio-mathematical differential equations (DEs), on an FPGA chip. A Hardware Description Language (HDL) is used to develop a behavioral model of the mechanism derived from the DEs. The Thyroid Hormone secretion mechanism is simulated with the interaction of the related stimulating and inhibiting hormones. Synthesis of the simulation is done with the aid of CAD tools, and the result is downloaded onto a Field Programmable Gate Array (FPGA) chip. The chip output shows behavior identical to that of the designed algorithm in simulation. It is concluded that the chip mimics the Thyroid Hormone secretion mechanism. The chip, operating in real time, is a computer-independent, stand-alone system.
Real-time FPGA architectures for computer vision

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar

2000-03-01

This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented in a dedicated VLSI device to reach higher clock frequencies. Complexity issues, FPGA resource utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
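One common way to read each pixel from image memory only once is to keep the most recent rows in a shift register, so that every mask position is available as a tap. The sketch below models that idea in software; it is an assumption about the general technique, not a description of the authors' module, and it applies the mask without flipping it (correlation form), which gives the same result for symmetric masks.

```python
from collections import deque

def stream_convolve(pixels, width, mask):
    """Apply a k-by-k mask to a raster-ordered pixel stream.

    Each pixel is read once and pushed into a shift register holding the
    last (k - 1) * width + k samples, which covers one full mask window.
    Only windows that lie entirely inside the image produce an output.
    """
    k = len(mask)
    taps = deque(maxlen=(k - 1) * width + k)
    out = []
    for idx, p in enumerate(pixels):
        taps.append(p)
        row, col = divmod(idx, width)
        if row >= k - 1 and col >= k - 1:
            acc = 0
            for r in range(k):
                for c in range(k):
                    acc += mask[r][c] * taps[r * width + c]
            out.append(acc)
    return out

# 4x4 ramp image streamed row by row through a 3x3 box mask.
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = stream_convolve(list(range(16)), width=4, mask=box)
```

In hardware the nested loops become k*k multipliers feeding an adder tree, so one output can be produced per clock once the register is full.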
Field Programmable Gate Array Failure Rate Estimation Guidelines for Launch Vehicle Fault Tree Models

NASA Technical Reports Server (NTRS)

Al Hassan, Mohammad; Britton, Paul; Hatfield, Glen Spencer; Novack, Steven D.

2017-01-01

The complex electronic and avionics systems of today's launch vehicles rely heavily on Field Programmable Gate Array (FPGA) integrated circuits (ICs) for their superb speed and reconfiguration capabilities. Consequently, FPGAs are prevalent ICs in communication protocols such as MIL-STD-1553B and in control signal commands such as solenoid valve actuations. This paper will identify reliability concerns and high-level guidelines for estimating FPGA total failure rates in a launch vehicle application. The paper will discuss hardware, hardware description language, and radiation-induced failures. The hardware contribution of the approach accounts for physical failures of the IC. The hardware description language portion will discuss the high-level FPGA programming languages and software/code reliability growth. The radiation portion will discuss FPGA susceptibility to space environment radiation.
Integrated 3-D vision system for autonomous vehicles

NASA Astrophysics Data System (ADS)

Hou, Kun M.; Shawky, Mohamed; Tu, Xiaowei

1992-03-01

Nowadays, autonomous vehicles have become a multidisciplinary field. Its evolution is taking advantage of recent technological progress in computer architectures. As development tools have become more sophisticated, the trend is toward more specialized, or even dedicated, architectures. In this paper, we focus on a parallel vision subsystem integrated in the overall system architecture. The system modules work in parallel, communicating through a hierarchical blackboard, an extension of the 'tuple space' from LINDA concepts, where they may exchange data or synchronization messages. The general-purpose processing elements have different capabilities: they are built around 40 MHz Intel i860 RISC processors for high-level processing and pipelined systolic array processors based on PLAs or FPGAs for low-level processing.
Image processing applications: From particle physics to society

NASA Astrophysics Data System (ADS)

Sotiropoulou, C.-L.; Luciano, P.; Gkaitatzis, S.; Citraro, S.; Giannetti, P.; Dell'Orso, M.

2017-01-01

We present an embedded system for extremely efficient real-time pattern recognition execution, enabling technological advancements with both scientific and social impact. It is a compact, fast, low-consumption processing unit (PU) based on a combination of Field Programmable Gate Arrays (FPGAs) and a full-custom associative memory chip. The PU has been developed for real-time tracking in particle physics experiments, but delivers flexible features for potential application in a wide range of fields. It has been proposed for use in accelerated pattern matching execution for Magnetic Resonance Fingerprinting (biomedical applications), in real-time detection of space debris trails in astronomical images (space applications) and in brain emulation for image processing (cognitive image processing). We illustrate the potential of the PU for the new applications.
Field programmable gate array processing of eye-safe all-fiber coherent wind Doppler lidar return signals

NASA Astrophysics Data System (ADS)

Abdelazim, S.; Santoro, D.; Arend, M.; Moshary, F.; Ahmed, S.

2011-11-01

A field deployable all-fiber eye-safe Coherent Doppler LIDAR is being developed at the Optical Remote Sensing Lab at the City College of New York (CCNY) and is designed to monitor wind fields autonomously and continuously in urban settings. Data acquisition is accomplished by sampling lidar return signals at 400 MHz and performing onboard processing using field programmable gate arrays (FPGAs). The FPGA is programmed to accumulate signal information that is used to calculate the power spectrum of the atmospherically back scattered signal. The advantage of using FPGA is that signal processing will be performed at the hardware level, reducing the load on the host computer and allowing for 100% return signal processing. An experimental setup measured wind speeds at ranges of up to 3 km.
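The accumulation step described in this abstract can be sketched as follows: the power spectrum of each return is computed and summed over many pulses, so that the Doppler peak rises above the noise floor. The sketch uses a direct DFT on short synthetic records with a fixed random seed; the record length, pulse count, noise level and function names are illustrative assumptions, and only the 400 MHz sample rate comes from the abstract.

```python
import cmath
import math
import random

def periodogram(x):
    """Power in the first half of the DFT bins of a real record."""
    n = len(x)
    spec = []
    for k in range(n // 2):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2)
    return spec

def accumulate_spectra(shots):
    """Sum the periodograms of all pulses (incoherent accumulation)."""
    total = None
    for shot in shots:
        p = periodogram(shot)
        total = p if total is None else [a + b for a, b in zip(total, p)]
    return total

def doppler_peak_hz(spectrum, fs, n):
    """Frequency of the strongest non-DC bin."""
    k = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return k * fs / n

# Synthetic returns: a tone in bin 10 with random phase, buried in noise.
rng = random.Random(0)
n, fs = 64, 400e6
shots = []
for _ in range(20):
    phase = rng.uniform(0, 2 * math.pi)
    shots.append([math.cos(2 * math.pi * 10 * t / n + phase) + rng.gauss(0, 0.5)
                  for t in range(n)])
peak = doppler_peak_hz(accumulate_spectra(shots), fs, n)
```

Summing power rather than complex amplitude matters because the phase of the backscattered signal changes from pulse to pulse; the radial wind speed then follows from the peak frequency and the laser wavelength, which is not stated in the abstract.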
A compact, smart Langmuir Probe control module for MAST-Upgrade

NASA Astrophysics Data System (ADS)

Lovell, J.; Stephen, R.; Bray, S.; Naylor, G.; Elmore, S.; Willett, H.; Peterka, M.; Dimitrova, M.; Havranek, A.; Hron, M.; Sharples, R.

2017-11-01

A new control module for the MAST-Upgrade Langmuir Probe system has been developed. It is based on a Xilinx Zynq FPGA, which allows for excellent configurability and ease of retrieving data. The module is capable of arbitrary bias voltage waveform generation, and digitises current and voltage readings from 16 probes. The probes are biased and measured one at a time in a time multiplexed fashion, with the multiplexing sequence completely configurable. In addition, simultaneous digitisation of the floating potential of all unbiased probes is possible. A suite of these modules, each coupled with a high voltage amplifier, enables biasing and digitisation of 640 Langmuir Probes in the MAST-Upgrade Super-X divertor. The system has been successfully tested on the York Linear Plasma Device and on the COMPASS tokamak. It will be installed on MAST-Upgrade ready for operations in 2018.
A Low-Complexity Euclidean Orthogonal LDPC Architecture for Low Power Applications

PubMed Central

Revathy, M.; Saravanan, R.

2015-01-01

Low-density parity-check (LDPC) codes have been implemented in the latest digital video broadcasting, broadband wireless access (WiMax), and fourth-generation wireless standards. In this paper, we propose a highly efficient LDPC decoder architecture for low-power applications. This study also considers the design and analysis of the check node and variable node units and the Euclidean orthogonal generator in the LDPC decoder architecture. The Euclidean orthogonal generator, which can be incorporated between the check and variable node architectures, is used to reduce the error rate of the proposed LDPC architecture. The proposed decoder design is synthesized on the Xilinx 9.2i platform and simulated using Modelsim, targeting 45 nm devices. The synthesis report shows that the proposed architecture greatly reduces power consumption and hardware utilization compared with different conventional architectures. PMID:26065017
FPGA Techniques Based New Hybrid Modulation Strategies for Voltage Source Inverters

PubMed Central

Sudha, L. U.; Baskaran, J.; Elankurisil, S. A.

2015-01-01

This paper corroborates three different hybrid modulation strategies suitable for a single-phase voltage source inverter. The proposed method is formulated using fundamental switching and carrier-based pulse width modulation methods. The main aim of the proposed method is to optimize a specific performance criterion, such as minimization of the total harmonic distortion (THD), lower order harmonics, switching losses, and heat losses. Thus, the harmonic pollution in the power system will be reduced and the power quality will be augmented with a better harmonic profile for a target fundamental output voltage. The proposed modulation strategies are simulated in MATLAB R2010a and implemented in a Xilinx Spartan 3E-500 FG 320 FPGA processor. The feasibility of these modulation strategies is authenticated through simulation and experimental results. PMID:25821852
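As a rough illustration of the carrier-based part only (not the authors' hybrid strategies), the sketch below compares a sinusoidal reference against a triangular carrier to produce a gate signal. The function names, the modulation index of 0.8 and the carrier ratio of 20 are assumptions chosen for illustration.

```python
import math

def triangle(phase):
    """Triangular carrier in [-1, 1] for a phase in [0, 1)."""
    return 4.0 * abs(phase - 0.5) - 1.0

def spwm(samples_per_cycle=1200, carrier_ratio=20, m_index=0.8):
    """Return one fundamental cycle of gate states (1 = upper switch on)."""
    gates = []
    for n in range(samples_per_cycle):
        t = n / samples_per_cycle                      # fraction of the fundamental period
        reference = m_index * math.sin(2.0 * math.pi * t)
        carrier = triangle((t * carrier_ratio) % 1.0)  # carrier_ratio carrier cycles per fundamental
        gates.append(1 if reference > carrier else 0)  # comparator output
    return gates

gates = spwm()
duty_first_half = sum(gates[:600]) / 600    # positive half-cycle of the reference
duty_second_half = sum(gates[600:]) / 600   # negative half-cycle of the reference
```

The average duty cycle follows the sign of the reference, which is the property a comparator-based modulator in an FPGA relies on.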
Adaptive Instrument Module: Space Instrument Controller "Brain" through Programmable Logic Devices

NASA Technical Reports Server (NTRS)

Darrin, Ann Garrison; Conde, Richard; Chern, Bobbie; Luers, Phil; Jurczyk, Steve; Mills, Carl; Day, John H. (Technical Monitor)

2001-01-01

The Adaptive Instrument Module (AIM) will be the first true demonstration of reconfigurable computing with field-programmable gate arrays (FPGAs) in space, enabling the 'brain' of the system to evolve or adapt to changing requirements. In partnership with NASA Goddard Space Flight Center and the Australian Cooperative Research Centre for Satellite Systems (CRC-SS), APL has built the flight version to be flown on the Australian university-class satellite FEDSAT. The AIM provides satellites the flexibility to adapt to changing mission requirements by reconfiguring standardized processing hardware rather than incurring the large costs associated with new builds. This ability to reconfigure the processing in response to changing mission needs leads to true evolveable computing, wherein the instrument 'brain' can learn from new science data in order to perform state-of-the-art data processing. The development of the AIM is significant in its enormous potential to reduce total life-cycle costs for future space exploration missions. The advent of RAM-based FPGAs whose configuration can be changed at any time has enabled the development of the AIM for processing tasks that could not be performed in software. The use of the AIM enables reconfiguration of the FPGA circuitry while the spacecraft is in flight, with many accompanying advantages. The AIM demonstrates the practicalities of using reconfigurable computing hardware devices by conducting a series of designed experiments. These include the demonstration of implementing data compression, data filtering, and communication message processing and inter-experiment data computation. The second generation is the Adaptive Processing Template (ADAPT) which is further described in this paper. The next step forward is to make the hardware itself adaptable and the ADAPT pursues this challenge by developing a reconfigurable module that will be capable of functioning efficiently in various applications. ADAPT will take advantage of
Energy efficiency analysis and implementation of AES on an FPGA

NASA Astrophysics Data System (ADS)

Kenney, David

The Advanced Encryption Standard (AES) was developed by Joan Daemen and Vincent Rijmen and endorsed by the National Institute of Standards and Technology in 2001. It was designed to replace the aging Data Encryption Standard (DES) and be useful for a wide range of applications with varying throughput, area, power dissipation and energy consumption requirements. Field Programmable Gate Arrays (FPGAs) are flexible and reconfigurable integrated circuits that are useful for many different applications including the implementation of AES. Though they are highly flexible, FPGAs are often less efficient than Application Specific Integrated Circuits (ASICs); they tend to operate slower, take up more space and dissipate more power. There have been many FPGA AES implementations that focus on obtaining high throughput or low area usage, but very little research has been done in the area of low power or energy efficient FPGA based AES; in fact, it is rare for estimates on power dissipation to be made at all. This thesis presents a methodology to evaluate the energy efficiency of FPGA based AES designs and proposes a novel FPGA AES implementation which is highly flexible and energy efficient. The proposed methodology is implemented as part of a novel scripting tool, the AES Energy Analyzer, which is able to fully characterize the power dissipation and energy efficiency of FPGA based AES designs. Additionally, this thesis introduces a new FPGA power reduction technique called Opportunistic Combinational Operand Gating (OCOG) which is used in the proposed energy efficient implementation. The AES Energy Analyzer was able to estimate the power dissipation and energy efficiency of the proposed AES design during its most commonly performed operations. It was found that the proposed implementation consumes less energy per operation than any previous FPGA based AES implementations that included power estimations. Finally, the use of Opportunistic Combinational Operand Gating on an AES cipher
An optimization of the FPGA trigger based on the artificial neural network for a detection of neutrino-origin showers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Szadkowski, Zbigniew; Glas, Dariusz; Pytel, Krzysztof

Observations of ultra-high energy neutrinos have become a priority in experimental astro-particle physics. Up to now, the Pierre Auger Observatory has not found any neutrino event candidate. This imposes competitive limits on the diffuse flux of ultra-high energy neutrinos in the EeV range and above. A very low rate of events potentially generated by neutrinos is a significant challenge for a detection technique and requires both sophisticated algorithms and high-resolution hardware. A trigger based on an artificial neural network was implemented in the Cyclone® V E FPGA 5CEFA9F31I7. The prototype Front-End boards for Auger-Beyond-2015 with Cyclone® V E can test the neural network algorithm in real pampas conditions in 2015. Showers for muon and tau neutrino initiating particles at various altitudes, angles and energies were simulated in the CORSIKA and Offline platforms, giving patterns of ADC traces in Auger water Cherenkov detectors. The 3-layer 12-10-1 neural network was trained in MATLAB on simulated ADC traces according to the Levenberg-Marquardt algorithm. Results show that the probability of ADC trace generation is very low due to a small neutrino cross-section. Nevertheless, ADC traces, if they occur, for 1-10 EeV showers are relatively short and can be analyzed by a 16-point input algorithm. For the 100 EeV range traces are much longer, but with significantly higher amplitudes, which can be detected by standard threshold algorithms. We optimized the coefficients from MATLAB to get a maximal range of potentially registered events and for fixed-point FPGA processing to minimize calculation errors. Currently used Front-End boards based on no longer produced ACEX® PLDs and obsolete Cyclone® FPGAs allow an implementation of relatively simple threshold algorithms for triggers. New sophisticated trigger implemented in Cyclone® V E FPGAs with large amount of DSP blocks, embedded memory running with 120 - 160 MHz sampling may support to discover neutrino
Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPUs, graphics processing units (GPUs), digital signal processors (DSPs) and field programmable gate arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes the FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with a high-level language without deep FPGA domain knowledge. Benchmarking of an OpenCL-based framework is an effective way of analyzing the performance of a system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using the Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and a Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark

Manchester Coding Option for SpaceWire: Providing Choices for System Level Design

NASA Technical Reports Server (NTRS)

Rakow, Glenn; Kisin, Alex

2014-01-01

This paper proposes an optional coding scheme for SpaceWire in lieu of the current Data Strobe scheme for three reasons: first, to provide a straightforward method for electrical isolation of the interface; second, to provide the ability to reduce the mass and bend radius of the SpaceWire cable; and third, to provide a means for a common physical layer over which multiple spacecraft onboard data link protocols could operate for a wide range of data rates. The intent is to accomplish these goals without significant change to existing SpaceWire design investments. The ability to optionally use Manchester coding in place of the current Data Strobe coding provides the ability to DC balance the signal transitions, unlike the SpaceWire Data Strobe coding, and therefore the ability to isolate the electrical interface without concern. Additionally, because the Manchester code has the clock and data encoded on the same signal, the number of wires of the existing SpaceWire cable could be optionally reduced by 50%. This reduction could be an important consideration for many users of SpaceWire, as indicated by the effort already underway in the SpaceWire working group to reduce the cable mass and bend radius by elimination of shields. However, reducing the signal count by half would provide even greater gains. It is proposed to restrict the optional Manchester coding to a fixed data rate of 10 Megabits per second (Mbps) in order to keep the necessary changes simple and still able to run in current radiation tolerant Field Programmable Gate Arrays (FPGAs). Even with this constraint, 10 Mbps will meet the needs of many applications where SpaceWire is used. These include command and control applications and many instrument applications which have moderate data rates. For most NASA flight implementations, SpaceWire designs are in rad-tolerant FPGAs, and the desire to preserve the heritage design investment is important for cost and risk considerations. The
Facial emotion recognition system for autistic children: a feasible study based on FPGA implementation.

PubMed

Smitha, K G; Vinod, A P

2015-11-01

Children with autism spectrum disorder have difficulty in understanding emotional and mental states from the facial expressions of the people they interact with. The inability to understand other people's emotions will hinder their interpersonal communication. Though many facial emotion recognition algorithms have been proposed in the literature, they are mainly intended for processing by a personal computer, which limits their usability in on-the-move applications where portability is desired. The portability of the system will ensure ease of use and real-time emotion recognition, which will aid immediate feedback while communicating with caretakers. Principal component analysis (PCA) has been identified as the least complex feature extraction algorithm to be implemented in hardware. In this paper, we present a detailed study of serial and parallel implementations of PCA in order to identify the most feasible method for realization of a portable emotion detector for autistic children. The proposed emotion recognizer architectures are implemented on a Virtex 7 XC7VX330T FFG1761-3 FPGA. We achieved 82.3% detection accuracy for a word length of 8 bits.
Smart Payload Development for High Data Rate Instrument Systems

NASA Technical Reports Server (NTRS)

Pingree, Paula J.; Norton, Charles D.

2007-01-01

This slide presentation reviews the development of smart payload instrument systems with high data rates. On-board computation has become a bottleneck for advanced science instrument and engineering capabilities. In order to improve the computation capability on board, smart payloads have been proposed. A smart payload is a localized instrument that can offload extensive computing cycles from the flight processor, simplify the interfaces, and minimize the dependency of the instrument on the flight system. This has been proposed for the Mars mission, Mars Atmospheric Trace Molecule Spectroscopy (MATMOS). The design of this system is discussed, the features of the Virtex-4 are discussed, and the technical approach is reviewed. The proposed Hybrid Field Programmable Gate Array (FPGA) technology has been shown to deliver breakthrough performance by tightly coupling hardware and software. Smart Payload designs for instruments such as MATMOS can meet science data return requirements with more competitive use of available on-board resources and can provide algorithm acceleration in hardware, leading to implementation of better (more advanced) algorithms in on-board systems for improved science data return.
Phasemeter core for intersatellite laser heterodyne interferometry: modelling, simulations and experiments

NASA Astrophysics Data System (ADS)

Gerberding, Oliver; Sheard, Benjamin; Bykov, Iouri; Kullmann, Joachim; Esteban Delgado, Juan Jose; Danzmann, Karsten; Heinzel, Gerhard

2013-12-01

Intersatellite laser interferometry is a central component of future space-borne gravity instruments like Laser Interferometer Space Antenna (LISA), evolved LISA, NGO and future geodesy missions. The inherently small laser wavelength allows us to measure distance variations with extremely high precision by interfering a reference beam with a measurement beam. The readout of such interferometers is often based on tracking phasemeters, which are able to measure the phase of an incoming beatnote with high precision over a wide range of frequencies. The implementation of such phasemeters is based on all digital phase-locked loops (ADPLL), hosted in FPGAs. Here, we present a precise model of an ADPLL that allows us to design such a readout algorithm and we support our analysis by numerical performance measurements and experiments with analogue signals.
Radiation-hardened optically reconfigurable gate array exploiting holographic memory characteristics

NASA Astrophysics Data System (ADS)

Seto, Daisaku; Watanabe, Minoru

2015-09-01

In this paper, we present a proposal for a radiation-hardened optically reconfigurable gate array (ORGA). The ORGA is a type of field programmable gate array (FPGA). The ORGA configuration can be executed by the exploitation of holographic memory characteristics even if 20% of the configuration data are damaged. Moreover, the optoelectronic technology enables the high-speed reconfiguration of the programmable gate array. Such a high-speed reconfiguration can increase the radiation tolerance of its programmable gate array to 9.3 × 10^4 times higher than that of current FPGAs. Through experimentation, this study clarified the configuration dependability using the impulse-noise emulation and high-speed configuration capabilities of the ORGA with corrupt configuration contexts. Moreover, the radiation tolerance of the programmable gate array was confirmed theoretically through probabilistic calculation.
Real-time field programmable gate array architecture for computer vision

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Torres-Huitzil, Cesar

2001-01-01

This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in an FPGA, but it can be implemented on dedicated very-large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resource utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
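The idea of limiting image-memory accesses can be sketched in software with line buffers: each image row is fetched once and held in a small buffer until every window that needs it has been processed. This is only a behavioural sketch under assumed names (`stream_convolve`, a square mask), not the architecture described in the record.

```python
from collections import deque

def stream_convolve(image, mask):
    """Valid-mode 2-D convolution that fetches each image row only once."""
    k = len(mask)                                # square k x k mask assumed
    flipped = [row[::-1] for row in mask[::-1]]  # convolution flips the mask
    rows = deque(maxlen=k)                       # line buffers for the last k rows
    out = []
    for row in image:                            # single pass over image memory
        rows.append(row)
        if len(rows) < k:
            continue                             # window not yet full
        out_row = []
        for x in range(len(row) - k + 1):
            acc = 0
            for j in range(k):
                for i in range(k):
                    acc += rows[j][x + i] * flipped[j][i]
            out_row.append(acc)
        out.append(out_row)
    return out

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
centre_pixels = stream_convolve(image, identity)
```

In hardware the two inner loops would typically be unrolled into parallel multiply-accumulate units fed from registers, which is where the FPGA register availability mentioned above matters.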
Programmable diagnostic devices made from paper and tape.

PubMed

Martinez, Andres W; Phillips, Scott T; Nie, Zhihong; Cheng, Chao-Min; Carrilho, Emanuel; Wiley, Benjamin J; Whitesides, George M

2010-10-07

This paper describes three-dimensional microfluidic paper-based analytical devices (3-D microPADs) that can be programmed (postfabrication) by the user to generate multiple patterns of flow through them. These devices are programmed by pressing single-use 'on' buttons, using a stylus or a ballpoint pen. Pressing a button closes a small space (gap) between two vertically aligned microfluidic channels, and allows fluids to wick from one channel to the other. These devices are simple to fabricate, and are made entirely out of paper and double-sided adhesive tape. Programmable devices expand the capabilities of microPADs and provide a simple method for controlling the movement of fluids in paper-based channels. They are the conceptual equivalent of field-programmable gate arrays (FPGAs) widely used in electronics.
Field Programmable Gate Array Failure Rate Estimation Guidelines for Launch Vehicle Fault Tree Models

NASA Technical Reports Server (NTRS)

Al Hassan, Mohammad; Novack, Steven D.; Hatfield, Glen S.; Britton, Paul

2017-01-01

Today's launch vehicles' complex electronic and avionic systems heavily utilize the Field Programmable Gate Array (FPGA) integrated circuit (IC). FPGAs are prevalent ICs in communication protocols such as MIL-STD-1553B, and in control signal commands such as solenoid/servo valve actuations. This paper will demonstrate guidelines to estimate FPGA failure rates for a launch vehicle; the guidelines will account for hardware, firmware, and radiation-induced failures. The hardware contribution of the approach accounts for physical failures of the IC, FPGA memory and clock. The firmware portion will provide guidelines on the high-level FPGA programming language and ways to account for software/code reliability growth. The radiation portion will provide guidelines on environment susceptibility as well as guidelines on tailoring other launch vehicle programs' historical data to a specific launch vehicle.
Fault Tolerant State Machines

NASA Technical Reports Server (NTRS)

Burke, Gary R.; Taft, Stephanie

2004-01-01

State machines are commonly used to control sequential logic in FPGAs and ASICs. An errant state machine can cause considerable damage to the device it is controlling. For example, in space applications the FPGA might be controlling pyros, which when fired at the wrong time will cause a mission failure. Even a well designed state machine can be subject to random errors as a result of SEUs from the radiation environment in space. There are various ways to encode the states of a state machine, and the type of encoding makes a large difference in the susceptibility of the state machine to radiation. In this paper we compare 4 methods of state machine encoding and find which method gives the best fault tolerance, as well as determining the resources needed for each method.
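To make the link between encoding and upset susceptibility concrete, the sketch below counts, for two common encodings of a 4-state machine, how many single-bit flips land on another legal state and how many land on an illegal code. The record does not name its four methods, so binary and one-hot are assumptions picked for illustration, as is the helper name `single_bit_upsets`.

```python
def single_bit_upsets(codes, width):
    """Count single-bit flips that hit another valid state vs. an invalid code."""
    valid = set(codes)
    silent = detectable = 0
    for code in codes:
        for bit in range(width):
            flipped = code ^ (1 << bit)   # model one SEU in the state register
            if flipped in valid:
                silent += 1               # jumps to a wrong but legal state
            else:
                detectable += 1           # illegal code that recovery logic could trap
    return silent, detectable

binary = [0b00, 0b01, 0b10, 0b11]            # 4 states in 2 flip-flops
one_hot = [0b0001, 0b0010, 0b0100, 0b1000]   # 4 states in 4 flip-flops

binary_result = single_bit_upsets(binary, 2)    # (8, 0)
one_hot_result = single_bit_upsets(one_hot, 4)  # (0, 16)
```

The one-hot machine uses twice the flip-flops here, which illustrates the trade-off between fault tolerance and resources that the comparison is concerned with.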
A high-speed DAQ framework for future high-level trigger and event building clusters

NASA Astrophysics Data System (ADS)

Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.

2017-03-01

Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through a dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger systems. In this paper the heterogeneous FPGA-GPU architecture is presented and its performance is discussed.
Energy efficient sensor network implementations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Frigo, Janette R; Raby, Eric Y; Brennan, Sean M

In this paper, we discuss a low power embedded sensor node architecture we are developing for distributed sensor network systems deployed in a natural environment. In particular, we examine the sensor node for energy efficient processing-at-the-sensor. We analyze the following modes of operation: event detection, sleep (wake-up), data acquisition, and data processing, using low power, high performance embedded technology such as specialized embedded DSP processors and low power FPGAs at the sensing node. We use compute intensive sensor node applications, an acoustic vehicle classifier (frequency domain analysis) and a video license plate identification application (learning algorithm), as a case study. We report performance and total energy usage for our system implementations and discuss the system architecture design trade-offs.
CoreTSAR: Core Task-Size Adapting Runtime

DOE PAGES

Scogland, Thomas R. W.; Feng, Wu-chun; Rountree, Barry; ...

2014-10-27

Heterogeneity continues to increase at all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other co-processors into everything from desktops to supercomputers. As a consequence, efficiently managing such disparate resources has become increasingly complex. CoreTSAR seeks to reduce this complexity by adaptively worksharing parallel-loop regions across compute resources without requiring any transformation of the code within the loop. Our results show performance improvements of up to three-fold over a current state-of-the-art heterogeneous task scheduler as well as linear performance scaling from a single GPU to four GPUs for many codes. In addition, CoreTSAR demonstrates a robust ability to adapt to both a variety of workloads and underlying system configurations.
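One simple way to picture adaptive worksharing is to re-split a loop's iterations after every pass in proportion to the throughput each device just achieved. The sketch below shows that idea only; it is not CoreTSAR's scheduler, and the function names and the timings are hypothetical.

```python
def split_iterations(total, rates):
    """Divide `total` loop iterations among devices in proportion to their rates."""
    weight = sum(rates)
    shares = [int(total * r / weight) for r in rates]
    shares[rates.index(max(rates))] += total - sum(shares)  # leftovers go to the fastest device
    return shares

def update_rates(shares, seconds):
    """Re-estimate each device's rate (iterations per second) from the last pass."""
    return [s / t for s, t in zip(shares, seconds)]

shares = split_iterations(1000, [1.0, 1.0])   # first pass: assume two equal devices
rates = update_rates(shares, [5.0, 1.0])      # hypothetical timings: CPU 5 s, GPU 1 s
next_shares = split_iterations(1000, rates)   # second pass favours the faster device
```

With these made-up timings the second pass assigns 166 iterations to the slower device and 834 to the faster one, while the loop body itself is left untouched.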
High speed fault tolerant secure communication for muon chamber using FPGA based GBTx emulator

NASA Astrophysics Data System (ADS)

Sau, Suman; Mandal, Swagata; Saini, Jogender; Chakrabarti, Amlan; Chattopadhyay, Subhasis

2015-12-01

The Compressed Baryonic Matter (CBM) experiment is a part of the Facility for Antiproton and Ion Research (FAIR) in Darmstadt at the GSI. The CBM experiment will investigate highly compressed nuclear matter using nucleus-nucleus collisions. This experiment will examine heavy-ion collisions in fixed target geometry and will be able to measure hadrons, electrons and muons. CBM requires precise time synchronization, compact hardware, radiation tolerance, self-triggered front-end electronics, efficient data aggregation schemes and the capability to handle high data rates (up to several TB/s). As a part of the implementation of the read-out chain of the Muon Chamber (MUCH) [1] in India, we have tried to implement an FPGA based emulator of GBTx. GBTx is a radiation tolerant ASIC developed by CERN that can be used to implement multipurpose high speed bidirectional optical links for high-energy physics (HEP) experiments. GBTx will be used in a highly irradiated area and is more prone to be affected by multi-bit errors. To mitigate this effect, instead of a single-bit error correcting RS code we have used a two-bit error correcting (15, 7) BCH code. It will increase the redundancy, which in turn increases the reliability of the coded data, so the coded data will be less prone to be affected by noise due to radiation. The data will go from the detector to the PC through multiple nodes over the communication channel. The computing resources are connected to a network that should be accessible only to authorized persons, yet unauthorized data access might happen if the network security is compromised. Thus data encryption is essential. In order to make the data communication secure, the advanced encryption standard [2] (AES - a symmetric key cryptography) and RSA [3], [4] (asymmetric key cryptography) are used after the channel coding. We have implemented the GBTx emulator on two Xilinx Kintex-7 boards (KC705). One will act as transmitter and other will act as receiver and they are connected
Design of CMOS imaging system based on FPGA

NASA Astrophysics Data System (ADS)

Hu, Bo; Chen, Xiaolai

2017-10-01

In order to meet the needs of engineering applications for high dynamic range CMOS camera under the rolling shutter mode, a complete imaging system is designed based on the CMOS imaging sensor NSC1105. The paper decides CMOS+ADC+FPGA+Camera Link as processing architecture and introduces the design and implementation of the hardware system. As for camera software system, which consists of CMOS timing drive module, image acquisition module and transmission control module, the paper designs in Verilog language and drives it to work properly based on Xilinx FPGA. The ISE 14.6 emulator ISim is used in the simulation of signals. The imaging experimental results show that the system exhibits a 1280*1024 pixel resolution, has a frame frequency of 25 fps and a dynamic range more than 120dB. The imaging quality of the system satisfies the requirement of the index.
VLSI Technology for Cognitive Radio

NASA Astrophysics Data System (ADS)

VIJAYALAKSHMI, B.; SIDDAIAH, P.

2017-08-01

One of the most challenging tasks of cognitive radio is achieving an efficient spectrum sensing scheme to overcome the spectrum scarcity problem. The most popular and widely used spectrum sensing technique is the energy detection scheme, as it is very simple and does not require any prior information about the signal. We propose one such approach, an optimised spectrum sensing scheme with a reduced filter structure. The optimisation is done in terms of area and power performance of the spectrum. The simulations of the VLSI structure of the optimised flexible spectrum are done using Verilog coding in the Xilinx ISE software. Our method achieves a 13% reduction in area and a 66% reduction in power consumption in comparison to the flexible spectrum sensing scheme. All the results are tabulated and comparisons are made. A new scheme for optimised and effective spectrum sensing opens up with our model.
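The energy detection principle mentioned above can be sketched in a few lines: average the squared samples and compare the result with a threshold, with no knowledge of the signal's structure. The tone, the unit-variance noise, the sample count and the threshold of 1.25 are all assumptions for illustration and are not taken from the record.

```python
import math
import random

def energy_detect(samples, threshold):
    """Return (decision, statistic); decision is True when mean energy exceeds threshold."""
    statistic = sum(x * x for x in samples) / len(samples)
    return statistic > threshold, statistic

rng = random.Random(0)                                  # fixed seed for repeatability
n = 2000
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]         # idle channel: noise power about 1.0
tone = [math.sin(2.0 * math.pi * 0.05 * i) for i in range(n)]  # assumed primary-user signal
occupied = [s + w for s, w in zip(tone, noise)]         # busy channel: power about 1.5

threshold = 1.25                                        # assumed, between the two power levels
idle_decision, idle_energy = energy_detect(noise, threshold)
busy_decision, busy_energy = energy_detect(occupied, threshold)
```

In a hardware version the squaring and accumulation map onto a multiplier and an adder, which is why the filter structure ahead of the detector tends to dominate area and power.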
Influence of radiation on metastability-based TRNG

NASA Astrophysics Data System (ADS)

Wieczorek, Piotr Z.; Wieczorek, Zbigniew

2017-08-01

This paper presents a True Random Number Generator (TRNG) based on flip-flops with violated timing constraints. The proposed circuit has been implemented in a Xilinx Spartan 6 device. The TRNG circuit utilizes the metastability phenomenon as a source of randomness. Therefore, the influence of timing constraints on the flip-flop metastability proximity is discussed in the paper. The metastable range of operation enhances the influence of noise on flip-flop behavior. Therefore, the influence of an external stochastic source on the flip-flop operation is also investigated. For this purpose a radioactive source of radiation was used. According to the results shown in the paper, the radiation increases the unpredictability of the metastable process of flip-flops operating as the randomness source in the TRNG. The statistical properties of the TRNG operating under increased radiation conditions were verified with the NIST battery of statistical tests.
A low power, area efficient fpga based beamforming technique for 1-D CMUT arrays.

PubMed

Joseph, Bastin; Joseph, Jose; Vanjari, Siva Rama Krishna

2015-08-01

A low power, area efficient digital beamformer targeting a low frequency (2 MHz) 1-D linear Capacitive Micromachined Ultrasonic Transducer (CMUT) array is developed. While designing the beamforming logic, the symmetry of the CMUT array is exploited to reduce the area and power consumption. The proposed method is verified in Matlab by clocking an Arbitrary Waveform Generator (AWG). The architecture is successfully implemented in a Xilinx Spartan 3E FPGA kit to check its functionality. The beamforming logic is implemented for 8, 16, 32, and 64 element CMUTs targeting an Application Specific Integrated Circuit (ASIC) platform at Vdd 1.62 V for UMC 90 nm technology. It is observed that the proposed architecture consumes significantly less power and area (1.2895 mW power and 47134.4 μm² area for a 64 element digital beamforming circuit) compared to the conventional square root based algorithm.
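The benefit of array symmetry can be illustrated with the conventional square-root delay calculation: for an on-axis focus, element i and its mirror image share the same delay, so only half the values need computing. The sketch assumes an even element count, a hypothetical pitch and focal depth, and a sound speed of 1540 m/s; it is not the authors' architecture.

```python
import math

def focus_delays(n_elements, pitch, depth, c=1540.0):
    """Firing delays (s) for an on-axis focus; only half the array is computed."""
    assert n_elements % 2 == 0, "sketch assumes an even element count"
    centre = (n_elements - 1) / 2.0
    half = []
    for i in range(n_elements // 2):
        x = (i - centre) * pitch                             # lateral element position (m)
        half.append(math.sqrt(x * x + depth * depth) / c)    # time of flight to the focus
    flight = half + half[::-1]                               # mirror for the other half
    longest = max(flight)
    return [longest - t for t in flight]                     # outer elements fire first

delays = focus_delays(8, pitch=0.5e-3, depth=20e-3)          # assumed geometry
```

Mirroring halves the number of square-root evaluations, which hints at why exploiting symmetry can save area and power in a hardware delay generator.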
Analyzing Reliability and Performance Trade-Offs of HLS-Based Designs in SRAM-Based FPGAs Under Soft Errors

NASA Astrophysics Data System (ADS)

Tambara, Lucas Antunes; Tonfat, Jorge; Santos, André; Kastensmidt, Fernanda Lima; Medina, Nilberto H.; Added, Nemitala; Aguiar, Vitor A. P.; Aguirre, Fernando; Silveira, Marcilei A. G.

2017-02-01

The increasing system complexity of FPGA-based hardware designs and shortening of time-to-market have motivated the adoption of new designing methodologies focused on addressing the current need for high-performance circuits. High-Level Synthesis (HLS) tools can generate Register Transfer Level (RTL) designs from high-level software programming languages. These tools have evolved significantly in recent years, providing optimized RTL designs, which can serve the needs of safety-critical applications that require both high performance and high reliability levels. However, a reliability evaluation of HLS-based designs under soft errors has not yet been presented. In this work, the trade-offs of different HLS-based designs in terms of reliability, resource utilization, and performance are investigated by analyzing their behavior under soft errors and comparing them to a standard processor-based implementation in an SRAM-based FPGA. Results obtained from fault injection campaigns and radiation experiments show that it is possible to increase the performance of a processor-based system up to 5,000 times by changing its architecture with a small impact in the cross section (increasing up to 8 times), and still increasing the Mean Workload Between Failures (MWBF) of the system.
A New Pipelined Systolic Array-Based Architecture for Matrix Inversion in FPGAs with Kalman Filter Case Study

NASA Astrophysics Data System (ADS)

Bigdeli, Abbas; Biglari-Abhari, Morteza; Salcic, Zoran; Tin Lai, Yat

2006-12-01

A new pipelined systolic array-based (PSA) architecture for matrix inversion is proposed. The PSA architecture is suitable for FPGA implementations as it efficiently uses the available resources of an FPGA. It is scalable for different matrix sizes and as such allows parameterisation that makes it suitable for customisation for application-specific needs. This new architecture has an advantage of [InlineEquation not available: see fulltext.] processing element complexity, compared to the [InlineEquation not available: see fulltext.] in other systolic array structures, where the size of the input matrix is given by [InlineEquation not available: see fulltext.]. The use of the PSA architecture for a Kalman filter as an implementation example, which requires different structures for different numbers of states, is illustrated. The resulting precision error is analysed and shown to be negligible.
Experimental demonstration of real-time adaptively modulated DDO-OFDM systems with a high spectral efficiency up to 5.76bit/s/Hz transmission over SMF links.

PubMed

Chen, Ming; He, Jing; Tang, Jin; Wu, Xian; Chen, Lin

2014-07-28

In this paper, an FPGA-based real-time adaptively modulated 256/64/16QAM-encoded base-band OFDM transceiver with a high spectral efficiency up to 5.76 bit/s/Hz is successfully developed and experimentally demonstrated in a simple intensity-modulated direct-detection optical communication system. Experimental results show that it is feasible to transmit an adaptively modulated real-time optical OFDM signal with a raw bit rate of 7.19 Gbps over 20 km and 50 km single mode fibers (SMFs). The performance of real-time and off-line digital signal processing is compared, and the results show that there is a negligible power penalty. In addition, to obtain the best transmission performance, the direct-current (DC) bias voltage for the MZM and the launch power into the optical fiber links are explored in the real-time optical OFDM systems.

A Flexible VHDL Floating Point Module for Control Algorithm Implementation in Space Applications

NASA Astrophysics Data System (ADS)

Padierna, A.; Nicoleau, C.; Sanchez, J.; Hidalgo, I.; Elvira, S.

2012-08-01

The implementation of control loops for space applications is an area with great potential. However, the characteristics of this kind of system, such as its wide dynamic range of numeric values, make the use of fixed-point algorithms inadequate. Moreover, the generic chips available for processing floating-point data are, in general, not qualified to operate in space environments, and using an IP module in a space-qualified FPGA/ASIC is not viable because of the low number of logic cells available in this type of device, so a viable alternative is needed. For these reasons, this paper presents a VHDL Floating Point Module. The proposal allows floating-point algorithms to be designed and executed with acceptable occupancy in FPGAs/ASICs qualified for space environments.
Imaging photomultiplier array with integrated amplifiers and high-speed USB interface

NASA Astrophysics Data System (ADS)

Blacksell, M.; Wach, J.; Anderson, D.; Howard, J.; Collis, S. M.; Blackwell, B. D.; Andruczyk, D.; James, B. W.

2008-10-01

Multianode photomultiplier tube (PMT) arrays are finding application as convenient high-speed light sensitive devices for plasma imaging. This paper describes the development of a USB-based "plug-n-play" 16-channel PMT camera with 16-bit simultaneous acquisition of 16 signal channels at rates up to 2 MS/s per channel. The preamplifiers and digital hardware are packaged in a compact housing which incorporates magnetic shielding, on-board generation of the high-voltage PMT bias, an optical filter mount and slits, and an F-mount lens adaptor. Triggering, timing, and acquisition are handled by four field-programmable gate arrays (FPGAs) under instruction from a master FPGA controlled by a computer with a LABVIEW interface. We present technical design details and specifications and illustrate performance with high-speed images obtained on the H-1 heliac at the ANU.
Imaging photomultiplier array with integrated amplifiers and high-speed USB interface.

PubMed

Blacksell, M; Wach, J; Anderson, D; Howard, J; Collis, S M; Blackwell, B D; Andruczyk, D; James, B W

2008-10-01

Multianode photomultiplier tube (PMT) arrays are finding application as convenient high-speed light sensitive devices for plasma imaging. This paper describes the development of a USB-based "plug-n-play" 16-channel PMT camera with 16-bit simultaneous acquisition of 16 signal channels at rates up to 2 MS/s per channel. The preamplifiers and digital hardware are packaged in a compact housing which incorporates magnetic shielding, on-board generation of the high-voltage PMT bias, an optical filter mount and slits, and an F-mount lens adaptor. Triggering, timing, and acquisition are handled by four field-programmable gate arrays (FPGAs) under instruction from a master FPGA controlled by a computer with a LABVIEW interface. We present technical design details and specifications and illustrate performance with high-speed images obtained on the H-1 heliac at the ANU.
Statistical Anomalies of Bitflips in SRAMs to Discriminate SBUs From MCUs

NASA Astrophysics Data System (ADS)

Clemente, Juan Antonio; Franco, Francisco J.; Villa, Francesca; Baylac, Maud; Rey, Solenne; Mecha, Hortensia; Agapito, Juan A.; Puchner, Helmut; Hubert, Guillaume; Velazco, Raoul

2016-08-01

Recently, the occurrence of multiple events in static tests has been investigated by checking the statistical distribution of the difference between the addresses of the words containing bitflips. That method has been successfully applied to Field Programmable Gate Arrays (FPGAs) and the original authors indicate that it is also valid for SRAMs. This paper presents a modified methodology that is based on checking the XORed addresses with bitflips, rather than on the difference. Irradiation tests on CMOS 130 & 90 nm SRAMs with 14-MeV neutrons have been performed to validate this methodology. Results in high-altitude environments are also presented and cross-checked with theoretical predictions. In addition, this methodology has also been used to detect modifications in the organization of said memories. Theoretical predictions have been validated with actual data provided by the manufacturer.
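The XOR-based check described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the fixed `min_count` threshold, and the sample addresses are all hypothetical, and a real analysis would compare each count against the number expected by chance. The underlying idea is that independent single-bit upsets give XOR values that rarely repeat, whereas multiple-cell upsets in neighbouring cells make a few XOR values recur.

```python
from collections import Counter
from itertools import combinations

def xor_histogram(addresses):
    """Count how often each XOR value occurs over all pairs of flipped addresses."""
    return Counter(a ^ b for a, b in combinations(addresses, 2))

def suspicious_xor_values(addresses, min_count=3):
    """Return XOR values seen at least `min_count` times (hypothetical fixed threshold)."""
    hist = xor_histogram(addresses)
    return sorted(v for v, n in hist.items() if n >= min_count)

# Hypothetical data: three neighbouring pairs (XOR = 0x1) plus two isolated flips.
flipped = [0x1000, 0x1001, 0x2340, 0x2341, 0x7F10, 0x7F11, 0x0ABC, 0x5555]
print(suspicious_xor_values(flipped))
```

A recurring XOR value hints at a fixed logical-to-physical address relationship between the affected cells, which is why the same statistic can also reveal how a memory is organized.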
IOTA: the array controller for a gigapixel OTCCD camera for Pan-STARRS

NASA Astrophysics Data System (ADS)

Onaka, Peter; Tonry, John; Luppino, Gerard; Lockhart, Charles; Lee, Aaron; Ching, Gregory; Isani, Sidik; Uyeshiro, Robin

2004-09-01

The PanSTARRS project has undertaken an ambitious effort to develop a completely new array controller architecture that is fundamentally driven by the requirements of the large 1 gigapixel, low noise, high speed OTCCD mosaic as well as the size, power and weight restrictions of the PanSTARRS telescope. The result is a very small form factor, next generation controller scalar building block with 1 Gigabit Ethernet interfaces that will be assembled into a system that will read out 512 outputs at ~1 Megapixel sample rates on each output. The paper will also discuss critical technology and fabrication techniques such as greater than 1 MHz analog to digital converters (ADCs), multiple fast sampling and digital calculation of multiple correlated samples (DMCS), ball grid array (BGA) packaged circuits, and LINUX running on embedded field programmable gate arrays (FPGAs) with hard core microprocessors for the prototype currently being developed.
Sensor network based vehicle classification and license plate identification system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Frigo, Janette Rose; Brennan, Sean M; Rosten, Edward J

Typically, for energy efficiency and scalability purposes, sensor networks have been used in the context of environmental and traffic monitoring applications in which operations at the sensor level are not computationally intensive. But increasingly, sensor network applications require data- and compute-intensive sensors such as video cameras and microphones. In this paper, we describe the design and implementation of two such systems: a vehicle classifier based on acoustic signals and a license plate identification system using a camera. The systems are implemented in an energy-efficient manner to the extent possible using commercially available hardware, the Mica motes and the Stargate platform. Our experience in designing these systems leads us to consider an alternate, more flexible, modular, low-power mote architecture that uses a combination of FPGAs, specialized embedded processing units and sensor data acquisition systems.
Introduction to the Special Issue on Digital Signal Processing in Radio Astronomy

NASA Astrophysics Data System (ADS)

Price, D. C.; Kocz, J.; Bailes, M.; Greenhill, L. J.

2016-03-01

Advances in astronomy are intimately linked to advances in digital signal processing (DSP). This special issue is focused upon advances in DSP within radio astronomy. The trend within that community is to use off-the-shelf digital hardware where possible and leverage advances in high performance computing. In particular, graphics processing units (GPUs) and field programmable gate arrays (FPGAs) are being used in place of application-specific circuits (ASICs); high-speed Ethernet and Infiniband are being used for interconnect in place of custom backplanes. Further, to lower hurdles in digital engineering, communities have designed and released general-purpose FPGA-based DSP systems, such as the CASPER ROACH board, ASTRON Uniboard, and CSIRO Redback board. In this introductory paper, we give a brief historical overview, a summary of recent trends, and provide an outlook on future directions.
Design and implementation in VHDL code of the two-dimensional fast Fourier transform for frequency filtering, convolution and correlation operations

NASA Astrophysics Data System (ADS)

Vilardy, Juan M.; Giacometto, F.; Torres, C. O.; Mattos, L.

2011-01-01

The two-dimensional Fast Fourier Transform (FFT 2D) is an essential tool in the analysis and processing of two-dimensional discrete signals, and it enables a large number of applications. This article presents the description and synthesis in VHDL code of the FFT 2D with fixed-point binary representation using the Simulink HDL Coder programming tool of Matlab. It shows a quick and easy way to handle overflow and underflow, to create registers, adders and multipliers for complex data in VHDL, and to generate test benches for verifying the generated code in the ModelSim tool. The main objective of developing the hardware architecture of the FFT 2D is the subsequent implementation of the following operations applied to images: frequency filtering, convolution and correlation. The description and synthesis of the hardware architecture target the XC3S1200E device of the Spartan 3E FPGA family from Xilinx.
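To make the link between the 2D FFT and convolution concrete, here is a small software sketch in Python. It is only an illustration under stated assumptions: it uses floating-point complex numbers rather than the fixed-point representation of the article, a recursive radix-2 FFT that requires power-of-two sizes, and hypothetical function names (`fft`, `fft2`, `circular_convolve2`). The 2D transform is computed by row-column decomposition, and circular convolution is obtained by multiplying the two spectra element by element before transforming back.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2(img, inverse=False):
    """2D FFT by row-column decomposition; the inverse is scaled by 1/(rows*cols)."""
    rows = [fft(r, inverse) for r in img]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    out = [list(r) for r in zip(*cols)]
    if inverse:
        scale = len(img) * len(img[0])
        out = [[v / scale for v in r] for r in out]
    return out

def circular_convolve2(a, b):
    """Circular 2D convolution via pointwise multiplication in the frequency domain."""
    fa, fb = fft2(a), fft2(b)
    prod = [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(fa, fb)]
    return [[v.real for v in r] for r in fft2(prod, inverse=True)]
```

Frequency filtering follows the same pattern with a mask in place of the second spectrum, and correlation uses the complex conjugate of one spectrum before the multiplication.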
Proposed hardware architectures of particle filter for object tracking

NASA Astrophysics Data System (ADS)

Abd El-Halym, Howida A.; Mahmoud, Imbaby Ismail; Habib, SED

2012-12-01

In this article, efficient hardware architectures for the particle filter (PF) are presented. We propose three different architectures for Sequential Importance Resampling Filter (SIRF) implementation. The first architecture is a two-step sequential PF machine, where particle sampling, weight, and output calculations are carried out in parallel during the first step, followed by sequential resampling in the second step. For the weight computation step, a piecewise linear function is used instead of the classical exponential function. This decreases the complexity of the architecture without degrading the results. The second architecture speeds up the resampling step via a parallel, rather than a serial, architecture. This second architecture targets a balance between hardware resources and the speed of operation. The third architecture implements the SIRF as a distributed PF composed of several processing elements and a central unit. All the proposed architectures are captured in VHDL, synthesized using the Xilinx environment, and verified using the ModelSim simulator. Synthesis results confirmed the resource reduction and speed-up advantages of our architectures.
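The weight computation shortcut mentioned in this abstract, replacing the exponential with a piecewise linear function, can be sketched in Python as follows. The breakpoints, the function names, and the Gaussian-shaped likelihood are assumptions made for illustration; the article does not give the segments it actually uses.

```python
import math
from bisect import bisect_right

# Hypothetical breakpoints; a hardware design would choose them to suit its word length.
BREAKPOINTS = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
VALUES = [math.exp(-x) for x in BREAKPOINTS]  # could live in a small lookup table

def pwl_exp_neg(x):
    """Piecewise linear approximation of exp(-x) for x >= 0, clamped at both ends."""
    if x <= BREAKPOINTS[0]:
        return VALUES[0]
    if x >= BREAKPOINTS[-1]:
        return VALUES[-1]
    i = bisect_right(BREAKPOINTS, x) - 1
    x0, x1 = BREAKPOINTS[i], BREAKPOINTS[i + 1]
    y0, y1 = VALUES[i], VALUES[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def particle_weights(particles, z, sigma):
    """Normalized particle weights for measurement z with a Gaussian-shaped likelihood."""
    raw = [pwl_exp_neg(0.5 * ((z - p) / sigma) ** 2) for p in particles]
    total = sum(raw)
    return [w / total for w in raw]
```

Each evaluation then needs only a segment lookup, one multiplication and one addition, which is the kind of saving that makes the approximation attractive in hardware.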
Design of low noise imaging system

NASA Astrophysics Data System (ADS)

Hu, Bo; Chen, Xiaolai

2017-10-01

In order to meet the needs of engineering applications for a low noise imaging system operating in global shutter mode, a complete imaging system is designed based on the SCMOS (Scientific CMOS) image sensor CIS2521F. The paper introduces the hardware circuit and software system design. Based on an analysis of the key indexes and technologies of the imaging system, the paper selects the chips and adopts SCMOS + FPGA + DDRII + Camera Link as the processing architecture. It then introduces the entire system workflow and the design of the power supply and distribution unit. The software system, which consists of the SCMOS control module, image acquisition module, data cache control module and transmission control module, is designed in Verilog and runs on a Xilinx FPGA. The imaging experimental results show that the imaging system has a resolution of 2560×2160 pixels and a maximum frame rate of 50 fps. The imaging quality of the system satisfies the index requirements.
Lifting Scheme DWT Implementation in a Wireless Vision Sensor Network

NASA Astrophysics Data System (ADS)

Ong, Jia Jan; Ang, L.-M.; Seng, K. P.

This paper presents the practical implementation of a Wireless Visual Sensor Network (WVSN) with DWT processing on the visual nodes. A WVSN consists of visual nodes that capture video and transmit to the base-station without processing. Limited network bandwidth restrains the implementation of real-time video streaming from remote visual nodes through wireless communication. Three layers of DWT filters are implemented to process the image captured by the camera. Once all the wavelet coefficients have been produced, it is possible to transmit only the low-frequency band coefficients and obtain an approximate image at the base-station. This reduces the amount of power required for transmission. When necessary, transmitting all the wavelet coefficients will produce the full detail of the image, which is similar to the image captured at the visual nodes. The visual node combines a CMOS camera, a Xilinx Spartan-3L FPGA and a wireless ZigBee® network that uses the Ember EM250 chip.
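A lifting-scheme DWT with three decomposition levels can be sketched in a few lines of Python. The abstract does not name the wavelet, so this example assumes the integer Haar wavelet, the simplest lifting case, and works on a 1D signal; an image version would apply the same steps along rows and then columns. All function names are hypothetical.

```python
def lift_forward(signal):
    """One level of integer Haar lifting; returns (approximation, detail)."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]            # predict step
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # update step
    return approx, detail

def lift_inverse(approx, detail):
    """Undo one lifting level by running the steps in reverse order."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    out = [0] * (2 * len(approx))
    out[0::2], out[1::2] = even, odd
    return out

def dwt_multilevel(signal, levels=3):
    """Apply `levels` lifting stages, keeping the detail band of each stage."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = lift_forward(approx)
        details.append(d)
    return approx, details

def idwt_multilevel(approx, details):
    """Rebuild the signal from the coarsest approximation and all detail bands."""
    for d in reversed(details):
        approx = lift_inverse(approx, d)
    return approx

signal = [10, 12, 14, 200, 202, 8, 6, 4]
approx, details = dwt_multilevel(signal, levels=3)
```

Sending only `approx` corresponds to transmitting the low-frequency band for an approximate image, while sending `details` as well allows exact reconstruction because every lifting step is reversible in integer arithmetic.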
Hierarchical Address Event Routing for Reconfigurable Large-Scale Neuromorphic Systems.

PubMed

Park, Jongkil; Yu, Theodore; Joshi, Siddharth; Maier, Christoph; Cauwenberghs, Gert

2017-10-01

We present a hierarchical address-event routing (HiAER) architecture for scalable communication of neural and synaptic spike events between neuromorphic processors, implemented with five Xilinx Spartan-6 field-programmable gate arrays and four custom analog neuromorphic integrated circuits serving 262k neurons and 262M synapses. The architecture extends the single-bus address-event representation protocol to a hierarchy of multiple nested buses, routing events across increasing scales of spatial distance. The HiAER protocol provides individually programmable axonal delay in addition to strength for each synapse, lending itself toward biologically plausible neural network architectures, and scales across a range of hierarchies suitable for multichip and multiboard systems in reconfigurable large-scale neuromorphic systems. We show approximately linear scaling of net global synaptic event throughput with number of routing nodes in the network, at 3.6×10^7 synaptic events per second per 16k-neuron node in the hierarchy.
Development of a modular test system for the silicon sensor R&D of the ATLAS Upgrade

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, H.; Benoit, M.; Chen, H.

High Voltage CMOS sensors are a promising technology for tracking detectors in collider experiments. Extensive R&D studies are being carried out by the ATLAS Collaboration for a possible use of HV-CMOS in the High Luminosity LHC upgrade of the Inner Tracker detector. CaRIBOu (Control and Readout Itk BOard) is a modular test system developed to test Silicon based detectors. It currently includes five custom designed boards, a Xilinx ZC706 development board, FELIX (Front-End LInk eXchange) PCIe card and a host computer. A software program has been developed in Python to control the CaRIBOu hardware. CaRIBOu has been used in the testbeam of the HV-CMOS sensor AMS180v4 at CERN. Preliminary results have shown that the test system is very versatile. In conclusion, further development is ongoing to adapt to different sensors, and to make it available to various lab test stands.
Development of a modular test system for the silicon sensor R&D of the ATLAS Upgrade

DOE PAGES

Liu, H.; Benoit, M.; Chen, H.; ...

2017-01-11

High Voltage CMOS sensors are a promising technology for tracking detectors in collider experiments. Extensive R&D studies are being carried out by the ATLAS Collaboration for a possible use of HV-CMOS in the High Luminosity LHC upgrade of the Inner Tracker detector. CaRIBOu (Control and Readout Itk BOard) is a modular test system developed to test Silicon based detectors. It currently includes five custom designed boards, a Xilinx ZC706 development board, FELIX (Front-End LInk eXchange) PCIe card and a host computer. A software program has been developed in Python to control the CaRIBOu hardware. CaRIBOu has been used in the testbeam of the HV-CMOS sensor AMS180v4 at CERN. Preliminary results have shown that the test system is very versatile. In conclusion, further development is ongoing to adapt to different sensors, and to make it available to various lab test stands.
Diversification of Processors Based on Redundancy in Instruction Set

NASA Astrophysics Data System (ADS)

Ichikawa, Shuichi; Sawada, Takashi; Hata, Hisashi

By diversifying processor architecture, computer software is expected to be more resistant to plagiarism, analysis, and attacks. This study presents a new method to diversify instruction set architecture (ISA) by utilizing the redundancy in the instruction set. Our method is particularly suited for embedded systems implemented with FPGA technology, and realizes a genuine instruction set randomization, which has not been provided by the preceding studies. The evaluation results on four typical ISAs indicate that our scheme can provide a far larger degree of freedom than the preceding studies. Diversified processors based on MIPS architecture were actually implemented and evaluated with Xilinx Spartan-3 FPGA. The increase of logic scale was modest: 5.1% in Specialized design and 3.6% in RAM-mapped design. The performance overhead was also modest: 3.4% in Specialized design and 11.6% in RAM-mapped design. From these results, our scheme is regarded as a practical and promising way to secure FPGA-based embedded systems.
An Embedded Reconfigurable Logic Module

NASA Technical Reports Server (NTRS)

Tucker, Jerry H.; Klenke, Robert H.; Shams, Qamar A. (Technical Monitor)

2002-01-01

A Miniature Embedded Reconfigurable Computer and Logic (MERCAL) module has been developed and verified. MERCAL was designed to be a general-purpose, universal module that can provide significant hardware and software resources to meet the requirements of many of today's complex embedded applications. This is accomplished in the MERCAL module by combining a sub-credit-card-size PC in a DIMM form factor with a XILINX Spartan II FPGA. The PC has the ability to download program files to the FPGA to configure it for different hardware functions and to transfer data to and from the FPGA via the PC's ISA bus during run time. The MERCAL module combines, in a compact package, the computational power of a 133 MHz PC with up to 150,000 gate equivalents of digital logic that can be reconfigured by software. The general architecture and functionality of the MERCAL hardware and system software are described.
JPL Space Telecommunications Radio System Operating Environment

NASA Technical Reports Server (NTRS)

Lux, James P.; Lang, Minh; Peters, Kenneth J.; Taylor, Gregory H.; Duncan, Courtney B.; Orozco, David S.; Stern, Ryan A.; Ahten, Earl R.; Girard, Mike

2013-01-01

A flight-qualified implementation of a Software Defined Radio (SDR) Operating Environment for the JPL-SDR built for the CoNNeCT Project has been developed. It is compliant with the NASA Space Telecommunications Radio System (STRS) Architecture Standard, and provides the software infrastructure for STRS compliant waveform applications. This software provides a standards-compliant abstracted view of the JPL-SDR hardware platform. It uses industry standard POSIX interfaces for most functions, as well as exposing the STRS API (Application Programming Interface) required by the standard. This software includes a standardized interface for IP components instantiated within a Xilinx FPGA (Field Programmable Gate Array). The software provides a standardized abstracted interface to platform resources such as data converters, file system, etc., which can be used by STRS standards conformant waveform applications. It provides a generic SDR operating environment with a much smaller resource footprint than similar products such as SCA (Software Communications Architecture) compliant implementations, or the DoD Joint Tactical Radio Systems (JTRS).
A VLSI Implementation of Four-Phase Lift Controller Using Verilog HDL

NASA Astrophysics Data System (ADS)

Kumar, Manish; Singh, Priyanka; Singh, Shesha

2017-08-01

With the advent of a wide range of new technologies for easing mobility and transportation, elevators have become an essential component of all high-rise buildings. An elevator is a type of vertical transportation that moves people between the floors of a high-rise building. This paper presents a four-phase lift controller modeled in Verilog HDL using a Finite State Machine (FSM). Verilog HDL helps in the automated analysis and simulation of the lift controller circuit. The design is based on synchronous inputs and operates at a fixed frequency. The lift motion is controlled by accepting the destination floor level as input and generating a control signal as output. In the proposed design, Verilog RTL code is developed and verified. The Xilinx Project Navigator was used as the code-writing platform, and results were simulated using the Modelsim 5.4a simulator. The paper discusses the overall evolution of the design as well as the simulated results.
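The FSM style of lift control described here can be modelled behaviourally in Python. This is a loose sketch and not the paper's Verilog design: the abstract does not define its four phases, so the four states below (`IDLE`, `MOVING_UP`, `MOVING_DOWN`, `DOOR_OPEN`), the class name, and the one-floor-per-clock timing are all hypothetical. What it keeps from the description is the synchronous behaviour: a destination floor is the input, and each call to `tick` stands for one clock edge that updates the state.

```python
from enum import Enum

class State(Enum):
    IDLE = 0
    MOVING_UP = 1
    MOVING_DOWN = 2
    DOOR_OPEN = 3

class LiftController:
    """Hypothetical lift FSM: one call to tick() models one clock cycle."""

    def __init__(self, floors=4):
        self.floors = floors
        self.floor = 0       # current floor
        self.target = 0      # requested destination floor
        self.state = State.IDLE

    def request(self, floor):
        """Latch a destination floor (the controller's input)."""
        if not 0 <= floor < self.floors:
            raise ValueError("floor out of range")
        self.target = floor

    def tick(self):
        """Advance one clock cycle and return the new state (the control output)."""
        if self.state is State.DOOR_OPEN:
            self.state = State.IDLE
        elif self.floor < self.target:
            self.floor += 1
            self.state = State.MOVING_UP
        elif self.floor > self.target:
            self.floor -= 1
            self.state = State.MOVING_DOWN
        elif self.state in (State.MOVING_UP, State.MOVING_DOWN):
            self.state = State.DOOR_OPEN   # arrived at the target floor
        else:
            self.state = State.IDLE
        return self.state
```

In an HDL version the same transitions would typically sit in a clocked process with the state held in a register, and the returned state would drive the motor and door control signals.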
FPGA-based architecture for motion recovering in real-time

NASA Astrophysics Data System (ADS)

Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar

2002-03-01

A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher level goals in different applications. Motion estimation algorithms pose a significant computational load for sequential processors, limiting their use in practical applications. In this work we propose a hardware architecture for motion estimation in real time based on FPGA technology. The technique used for motion estimation is Optical Flow, due to its accuracy and the density of its velocity estimates; however, other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates near gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility to cover different applications. Some results will be presented and the real-time performance will be discussed and analyzed. The architecture is prototyped in an FPGA board with a Virtex device interfaced to a digital imager.
An FPGA-Based People Detection System

NASA Astrophysics Data System (ADS)

Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.

2005-12-01

This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about [InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at [InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.

Implementation of a Parameterized Interacting Multiple Model Filter on an FPGA for Satellite Communications

NASA Technical Reports Server (NTRS)

Hackett, Timothy M.; Bilen, Sven G.; Ferreira, Paulo Victor R.; Wyglinski, Alexander M.; Reinhart, Richard C.

2016-01-01

In a communications channel, the space environment between a spacecraft and an Earth ground station can potentially cause the loss of a data link or at least degrade its performance due to atmospheric effects, shadowing, multipath, or other impairments. In adaptive and coded modulation, the signal power level at the receiver can be used in order to choose a modulation-coding technique that maximizes throughput while meeting bit error rate (BER) and other performance requirements. It is the goal of this research to implement a generalized interacting multiple model (IMM) filter based on Kalman filters for improved received power estimation on software-defined radio (SDR) technology for satellite communications applications. The IMM filter has been implemented in Verilog as a customizable bank of Kalman filters, which allows a choice between performance and resource utilization. Each Kalman filter can be implemented using either solely a Schur complement module (for high area efficiency) or with Schur complement, matrix multiplication, and matrix addition modules (for high performance). These modules were simulated and synthesized for the Virtex II platform on the JPL Radio Experimenter Development System (EDS) at NASA Glenn Research Center. The results for simulation, synthesis, and hardware testing are presented.
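The structure of an IMM filter built from a bank of Kalman filters can be sketched in Python for the simplest possible case. This is an illustration under stated assumptions, not the Verilog design of the paper: each model is a scalar random walk, the models differ only in their process noise, arithmetic is floating point, and the function name `imm_step` and all parameter values are hypothetical. Each cycle mixes the previous estimates, runs one Kalman update per model, reweights the models by their measurement likelihood, and combines the results.

```python
import math

def imm_step(z, x, p, mu, q, r, trans):
    """One IMM cycle for scalar random-walk models.

    z: measurement; x, p: per-model estimates and variances;
    mu: model probabilities; q: per-model process noise; r: measurement noise;
    trans[i][j]: probability of switching from model i to model j.
    """
    n = len(x)
    # 1. Mixing: blend the previous estimates using the switching probabilities.
    c = [sum(trans[i][j] * mu[i] for i in range(n)) for j in range(n)]
    x_mix, p_mix = [], []
    for j in range(n):
        w = [trans[i][j] * mu[i] / c[j] for i in range(n)]
        xm = sum(w[i] * x[i] for i in range(n))
        pm = sum(w[i] * (p[i] + (x[i] - xm) ** 2) for i in range(n))
        x_mix.append(xm)
        p_mix.append(pm)
    # 2. One Kalman predict/update per model, recording each model's likelihood.
    x_new, p_new, like = [], [], []
    for j in range(n):
        p_pred = p_mix[j] + q[j]      # random walk: the state prediction is unchanged
        s = p_pred + r                # innovation variance
        k = p_pred / s                # Kalman gain
        v = z - x_mix[j]              # innovation
        x_new.append(x_mix[j] + k * v)
        p_new.append((1.0 - k) * p_pred)
        like.append(math.exp(-0.5 * v * v / s) / math.sqrt(2.0 * math.pi * s))
    # 3. Update the model probabilities from the likelihoods.
    total = sum(like[j] * c[j] for j in range(n))
    mu_new = [like[j] * c[j] / total for j in range(n)]
    # 4. Combine the per-model estimates into a single output.
    estimate = sum(mu_new[j] * x_new[j] for j in range(n))
    return x_new, p_new, mu_new, estimate
```

A received-power tracker would call `imm_step` once per power measurement, carrying `x`, `p`, and `mu` from one call to the next; adding models to the bank trades resources for the ability to follow more kinds of channel behaviour.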
DOE Office of Scientific and Technical Information (OSTI.GOV)

Heffner, M.; Riot, V.; Fabris, L.

Medium to large channel count detectors are usually faced with a few unattractive options for data acquisition (DAQ). Small to medium sized TPC experiments, for example, can be too small to justify the high expense and long development time of application specific integrated circuit (ASIC) development. In some cases an experiment can piggy-back on a larger experiment and the associated ASIC development, but this puts the time line of development out of the hands of the smaller experiment. Another option is to run perhaps thousands of cables to rack mounted equipment, which is clearly undesirable. The development of commercial high-speed, high-density FPGAs and ADCs, combined with small discrete components and robotic assembly, opens a new option that scales to tens of thousands of channels and is only slightly larger than ASICs using off-the-shelf components.
State-of-the-art in Heterogeneous Computing

DOE PAGES

Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...

2010-01-01

Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, available software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.
A Compact, Flexible, High Channel Count DAQ Built From Off-the-Shelf Components

DOE PAGES

Heffner, M.; Riot, V.; Fabris, L.

2013-06-01

Medium to large channel count detectors are usually faced with a few unattractive options for data acquisition (DAQ). Small to medium sized TPC experiments, for example, can be too small to justify the high expense and long development time of application specific integrated circuit (ASIC) development. In some cases an experiment can piggy-back on a larger experiment and the associated ASIC development, but this puts the time line of development out of the hands of the smaller experiment. Another option is to run perhaps thousands of cables to rack mounted equipment, which is clearly undesirable. The development of commercial high-speed, high-density FPGAs and ADCs, combined with small discrete components and robotic assembly, opens a new option that scales to tens of thousands of channels and is only slightly larger than ASICs using off-the-shelf components.
Feasibility study for future implantable neural-silicon interface devices.

PubMed

Al-Armaghany, Allann; Yu, Bo; Mak, Terrence; Tong, Kin-Fai; Sun, Yihe

2011-01-01

The emerging neural-silicon interface devices bridge nerve systems with artificial systems and play a key role in neuro-prostheses and neuro-rehabilitation applications. Integrating neural signal collection, processing and transmission on a single device will make clinical applications more practical and feasible. This paper focuses on the wireless antenna part and the real-time neural signal analysis part of implantable brain-machine interface (BMI) devices. We propose to use millimeter-wave links for wireless connections between different areas of a brain. Various antennas, including microstrip patch, monopole, and substrate integrated waveguide antennas, are considered for the intra-cortical proximity communication. A Hebbian eigenfilter based method is proposed for multi-channel neuronal spike sorting. Folding and parallel design techniques are employed to explore various structures and make a trade-off between area and power consumption. Field programmable gate arrays (FPGAs) are used to evaluate various structures.
Development of a front end controller/heap manager for PHENIX

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ericson, M.N.; Allen, M.D.; Musrock, M.S.

1996-12-31

A controller/heap manager has been designed for applicability to all detector subsystem types of PHENIX. The heap manager performs all functions associated with front end electronics control including ADC and analog memory control, data collection, command interpretation and execution, and data packet forming and communication. Interfaces to the unit consist of a timing and control bus, a serial bus, a parallel data bus, and a trigger interface. The topology developed is modular so that many functional blocks are identical for a number of subsystem types. Programmability is maximized through the use of flexible modular functions and implementation using field programmable gate arrays (FPGAs). Details of unit design and functionality will be discussed with particular detail given to subsystems having analog memory-based front end electronics. In addition, mode control, serial functions, and FPGA implementation details will be presented.
Comparing an FPGA to a Cell for an Image Processing Application

NASA Astrophysics Data System (ADS)

Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.

2010-12-01

Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to exploit the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to run complex image processing algorithms at high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, Iris Recognition Systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.
Spectral domain optical coherence tomography of multi-MHz A-scan rates at 1310 nm range and real-time 4D-display up to 41 volumes/second

PubMed Central

Choi, Dong-hak; Hiro-Oka, Hideaki; Shimizu, Kimiya; Ohbayashi, Kohji

2012-01-01

An ultrafast frequency domain optical coherence tomography system was developed at A-scan rates between 2.5 and 10 MHz, a B-scan rate of 4 or 8 kHz, and volume-rates between 12 and 41 volumes/second. In the case of the worst duty ratio of 10%, the averaged A-scan rate was 1 MHz. Two optical demultiplexers at a center wavelength of 1310 nm were used for linear-k spectral dispersion and simultaneous differential signal detection at 320 wavelengths. The depth-range, sensitivity, sensitivity roll-off by 6 dB, and axial resolution were 4 mm, 97 dB, 6 mm, and 23 μm, respectively. Using FPGAs for FFT and a GPU for volume rendering, a real-time 4D display was demonstrated at a rate up to 41 volumes/second for an image size of 256 (axial) × 128 × 128 (lateral) voxels. PMID:23243560
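The rates quoted in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, and pairing the 10 MHz peak A-scan rate with the 10% duty ratio is an assumed reading of the abstract, not a statement from the paper.

```python
def displayed_a_scans_per_second(lateral_x, lateral_y, volumes_per_second):
    """A-scans that must be rendered each second for a given lateral grid."""
    return lateral_x * lateral_y * volumes_per_second


def displayed_voxels_per_second(axial, lateral_x, lateral_y, volumes_per_second):
    """Voxels rendered each second for the full volume size."""
    return axial * displayed_a_scans_per_second(lateral_x, lateral_y, volumes_per_second)


def average_a_scan_rate_hz(peak_rate_hz, duty_ratio):
    """Time-averaged A-scan rate when acquisition is active only part of the time."""
    return peak_rate_hz * duty_ratio


# Values taken from the abstract: 256 x 128 x 128 voxels at 41 volumes/second.
print(displayed_a_scans_per_second(128, 128, 41))      # 671744
print(displayed_voxels_per_second(256, 128, 128, 41))  # 171966464
# Assumed pairing: 10 MHz peak rate with the 10% worst-case duty ratio.
print(average_a_scan_rate_hz(10e6, 0.10))
```

Under these assumptions the display needs roughly 0.67 million A-scans per second, which sits below the 1 MHz averaged A-scan rate reported for the worst duty ratio.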
Graphical Environment Tools for Application to Gamma-Ray Energy Tracking Arrays

DOE Office of Scientific and Technical Information (OSTI.GOV)

Todd, Richard A.; Radford, David C.

2013-12-30

Highly segmented, position-sensitive germanium detector systems are being developed for nuclear physics research where traditional electronic signal processing with mixed analog and digital function blocks would be enormously complex and costly. Future systems will be constructed using pipelined processing of high-speed digitized signals as is done in the telecommunications industry. Techniques which provide rapid algorithm and system development for future systems are desirable. This project has used digital signal processing concepts and existing graphical system design tools to develop a set of re-usable modular functions and libraries targeted for the nuclear physics community. Researchers working with complex nuclear detector arrays such as the Gamma-Ray Energy Tracking Array (GRETA) have been able to construct advanced data processing algorithms for implementation in field programmable gate arrays (FPGAs) through application of these library functions using intuitive graphical interfaces.
Design Tools for Reconfigurable Hardware in Orbit (RHinO)

NASA Technical Reports Server (NTRS)

French, Mathew; Graham, Paul; Wirthlin, Michael; Larchev, Gregory; Bellows, Peter; Schott, Brian

2004-01-01

The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. These tools leverage an established FPGA design environment and focus primarily on space effects mitigation and power optimization. The project is creating software to automatically test and evaluate the single-event-upset (SEU) sensitivities of an FPGA design and insert mitigation techniques. Extensions to the tool suite will also allow evolvable algorithm techniques to reconfigure around single-event-latchup (SEL) events. In the power domain, tools are being created for dynamic power visualization and optimization. Thus, this technology seeks to enable the use of Reconfigurable Hardware in Orbit, via an integrated design tool-suite aiming to reduce risk, cost, and design time of multimission reconfigurable space processors using SRAM-based FPGAs.
A PCIe Gen3 based readout for the LHCb upgrade

NASA Astrophysics Data System (ADS)

Bellato, M.; Collazuol, G.; D'Antone, I.; Durante, P.; Galli, D.; Jost, B.; Lax, I.; Liu, G.; Marconi, U.; Neufeld, N.; Schwemmer, R.; Vagnoni, V.

2014-06-01

The architecture of the data acquisition system foreseen for the LHCb upgrade, to be installed by 2018, is devised to read out events trigger-less, synchronously with the LHC bunch crossing rate at 40 MHz. Within this approach the readout boards act as a bridge between the front-end electronics and the High Level Trigger (HLT) computing farm. The baseline design for the LHCb readout is an ATCA board requiring dedicated crates. A local area standard network protocol is implemented in the on-board FPGAs to read out the data. The alternative solution proposed here consists of building the readout boards as PCIe peripherals of the event-builder servers. The main architectural advantage is that the protocol and link technology of the event-builder can be left open until very late, to profit from the most cost-effective industry technology available at the time of the LHC LS2.
Slow Controls Using the Axiom M5235BCC

NASA Astrophysics Data System (ADS)

Hague, Tyler

2008-10-01

The Forward Vertex Detector group at PHENIX plans to adopt the Axiom M5235 Business Card Controller for slow controls. It is also being evaluated for slow controls on FermiLab e906. This controller features the Freescale MCF5235 microprocessor. It also has three parallel buses: the MCU port, the BUS port, and the enhanced Time Processing Unit (eTPU) port. The BUS port uses a chip select module with three external chip selects to communicate with peripherals. This will be used to communicate with and configure Field Programmable Gate Arrays (FPGAs). The controller also has an Ethernet port which can use several different protocols such as TCP and UDP. This will be used to transfer files with computers on a network. The M5235 Business Card Controller will be placed in a VME crate along with a VME card and a Spartan-3 FPGA.
Optical interconnect technologies for high-bandwidth ICT systems

NASA Astrophysics Data System (ADS)

Chujo, Norio; Takai, Toshiaki; Mizushima, Akiko; Arimoto, Hideo; Matsuoka, Yasunobu; Yamashita, Hiroki; Matsushima, Naoki

2016-03-01

The bandwidth of information and communication technology (ICT) systems is increasing and is predicted to reach more than 10 Tb/s. However, an electrical interconnect cannot achieve such bandwidth because of its density limits. To solve this problem, we propose two types of high-density optical fiber wiring for backplanes and circuit boards such as interface boards and switch boards. One type uses routed ribbon fiber in a circuit board because it has the ability to be formed into complex shapes to avoid interfering with the LSI and electrical components on the board. The backplane is required to exhibit high density and flexibility, so the second type uses loose fiber. We developed a 9.6-Tb/s optical interconnect demonstration system using embedded optical modules, optical backplane, and optical connector in a network apparatus chassis. We achieved 25-Gb/s transmission between FPGAs via the optical backplane.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Brusati, M.; Camplani, A.; Cannon, M.

SRAM-based Field Programmable Gate Array (FPGA) logic devices are very attractive in applications where high data throughput is needed, such as the latest generation of High Energy Physics (HEP) experiments. FPGAs have been rarely used in such experiments because of their sensitivity to radiation. The present paper proposes a mitigation approach applied to commercial FPGA devices to meet the reliability requirements for the front-end electronics of the Liquid Argon (LAr) electromagnetic calorimeter of the ATLAS experiment, located at CERN. Particular attention will be devoted to defining a proper mitigation scheme for the multi-gigabit transceivers embedded in the FPGA, which is a critical part of the LAr data acquisition chain. A demonstrator board is being developed to validate the proposed methodology. Mitigation techniques such as Triple Modular Redundancy (TMR) and scrubbing will be used to increase the robustness of the design and to maximize the fault tolerance from Single-Event Upsets (SEUs).
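As a rough illustration of the idea behind Triple Modular Redundancy, the Python sketch below votes bitwise across three redundant copies of a word so that a single upset copy is outvoted. It is a software toy, not the mitigation scheme of the paper: the function names are hypothetical, and real scrubbing refreshes FPGA configuration memory from a reference bitstream rather than from a voter output.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)


def repair_copies(copies):
    """Overwrite all three copies with the voted value (toy stand-in for a repair step)."""
    voted = majority_vote(*copies)
    return [voted, voted, voted]


golden = 0b10110010
upset = golden ^ 0b00001000        # flip one bit in one copy to mimic an SEU
copies = [golden, upset, golden]

# The upset copy is outvoted by the two intact copies.
print(majority_vote(*copies) == golden)        # True
print(repair_copies(copies) == [golden] * 3)   # True
```

The voter masks any single faulty copy; a repair step matters because a second upset in another copy before repair would defeat the 2-of-3 vote for the affected bits.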
Maximum-Likelihood Estimation With a Contracting-Grid Search Algorithm

PubMed Central

Hesterman, Jacob Y.; Caucci, Luca; Kupinski, Matthew A.; Barrett, Harrison H.; Furenlid, Lars R.

2010-01-01

A fast search algorithm capable of operating in multi-dimensional spaces is introduced. As a sample application, we demonstrate its utility in the 2D and 3D maximum-likelihood position-estimation problem that arises in the processing of PMT signals to derive interaction locations in compact gamma cameras. We demonstrate that the algorithm can be parallelized in pipelines, and thereby efficiently implemented in specialized hardware, such as field-programmable gate arrays (FPGAs). A 2D implementation of the algorithm is achieved in Cell/BE processors, resulting in processing speeds above one million events per second, which is a 20× increase in speed over a conventional desktop machine. Graphics processing units (GPUs) are used for a 3D application of the algorithm, resulting in processing speeds of nearly 250,000 events per second which is a 250× increase in speed over a conventional desktop machine. These implementations indicate the viability of the algorithm for use in real-time imaging applications. PMID:20824155
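The general shape of a contracting-grid search can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the grid size, the shrink factor, the iteration count, and the toy quadratic objective standing in for a log-likelihood are all illustrative choices, not values taken from the paper.

```python
def contracting_grid_search(objective, center, width,
                            grid_points=4, shrink=0.5, iterations=12):
    """Maximize objective(x, y) by evaluating a small grid, recentering on the
    best grid point, and shrinking the grid on every iteration."""
    cx, cy = center
    for _ in range(iterations):
        step = width / (grid_points - 1)
        best = None
        for i in range(grid_points):
            for j in range(grid_points):
                x = cx - width / 2 + i * step
                y = cy - width / 2 + j * step
                value = objective(x, y)
                if best is None or value > best[0]:
                    best = (value, x, y)
        _, cx, cy = best       # recenter on the best grid point
        width *= shrink        # contract the grid
    return cx, cy


def toy_log_likelihood(x, y):
    """Hypothetical smooth objective with its maximum at (1.3, -0.7)."""
    return -((x - 1.3) ** 2 + (y + 0.7) ** 2)


x_hat, y_hat = contracting_grid_search(toy_log_likelihood, center=(0.0, 0.0), width=8.0)
print(round(x_hat, 2), round(y_hat, 2))
```

Every iteration performs the same fixed number of independent objective evaluations, which is the property that makes this kind of search easy to pipeline or parallelize in hardware.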
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin, Zheming; Yoshii, Kazutomo; Finkel, Hal

Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication and kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29W to 42W.
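The bandwidth figures above can be related with simple arithmetic. The Python sketch below assumes the usual accounting for a vector add, two reads and one write of a 4-byte float per element. That accounting and the function names are assumptions for illustration and are not confirmed by the report.

```python
BYTES_PER_FLOAT = 4  # single-precision element size


def vector_add_bandwidth_gbs(n_elements, seconds):
    """Effective bandwidth of c[i] = a[i] + b[i], counting two reads and one write."""
    bytes_moved = 3 * BYTES_PER_FLOAT * n_elements
    return bytes_moved / seconds / 1e9


def implied_peak_gbs(achieved_gbs, fraction_of_peak):
    """Peak bandwidth implied by an achieved figure and its stated fraction of peak."""
    return achieved_gbs / fraction_of_peak


# 25.8 GB/s at roughly 76% of peak implies a peak near 34 GB/s.
print(round(implied_peak_gbs(25.8, 0.76), 1))  # 33.9
```

Because the 76% figure is stated as approximate, the implied peak should be read as approximate as well.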
A design approach for small vision-based autonomous vehicles

NASA Astrophysics Data System (ADS)

Edwards, Barrett B.; Fife, Wade S.; Archibald, James K.; Lee, Dah-Jye; Wilde, Doran K.

2006-10-01

This paper describes the design of a small autonomous vehicle based on the Helios computing platform, a custom FPGA-based board capable of supporting on-board vision. Target applications for the Helios computing platform are those that require lightweight equipment and low power consumption. To demonstrate the capabilities of FPGAs in real-time control of autonomous vehicles, a 16 inch long R/C monster truck was outfitted with a Helios board. The platform provided by such a small vehicle is ideal for testing and development. The proof of concept application for this autonomous vehicle was a timed race through an environment with obstacles. Given the size restrictions of the vehicle and its operating environment, the only feasible on-board sensor is a small CMOS camera. The single video feed is therefore the only source of information from the surrounding environment. The image is then segmented and processed by custom logic in the FPGA that also controls direction and speed of the vehicle based on visual input.
An artificial retina processor for track reconstruction at the LHC crossing rate

DOE PAGES

Bedeschi, F.; Cenci, R.; Marino, P.; ...

2017-11-23

The goal of the INFN-RETINA R&D project is to develop and implement a computational methodology that allows the reconstruction of events with a large number (> 100) of charged-particle tracks in pixel and silicon strip detectors at 40 MHz, thus matching the requirements for processing LHC events at the full bunch-crossing frequency. Our approach relies on a parallel pattern-recognition algorithm, dubbed artificial retina, inspired by the early stages of image processing by the brain. In order to demonstrate that a track-processing system based on this algorithm is feasible, we built a sizable prototype of a tracking processor tuned to 3 000 patterns, based on already existing readout boards equipped with Altera Stratix III FPGAs. The detailed geometry and charged-particle activity of a large tracking detector currently in operation are used to assess its performance. Here, we report on the test results with such a prototype.
Use of FPGA embedded processors for fast cluster reconstruction in the NA62 liquid krypton electromagnetic calorimeter

NASA Astrophysics Data System (ADS)

Badoni, D.; Bizzarri, M.; Bonaiuto, V.; Checcucci, B.; De Simone, N.; Federici, L.; Fucci, A.; Paoluzzi, G.; Papi, A.; Piccini, M.; Salamon, A.; Salina, G.; Santovetti, E.; Sargeni, F.; Venditti, S.

2014-01-01

The goal of the NA62 experiment at the CERN SPS is the measurement of the branching ratio of the very rare kaon decay K+→π+νν̄ with a 10% accuracy by collecting 100 events in two years of data taking. An efficient photon veto system is needed to reject the K+→π+π0 background, and a liquid krypton electromagnetic calorimeter will be used for this purpose in the 1-10 mrad angular region. The L0 trigger system for the calorimeter consists of a peak reconstruction algorithm implemented on FPGA by using a mixed parallel architecture based on soft core Altera NIOS II embedded processors together with custom VHDL modules. This solution allows an efficient and flexible reconstruction of the energy-deposition peak. In total, the system will comprise 36 TEL62 boards, 108 mezzanine cards and 215 high-performance FPGAs. We describe the design, current status and the results of the first performance tests.
TOGA - A GNSS Reflections Instrument for Remote Sensing Using Beamforming

NASA Technical Reports Server (NTRS)

Esterhuizen, S.; Meehan, T. K.; Robison, D.

2009-01-01

Remotely sensing the Earth's surface using GNSS signals as bi-static radar sources is one of the most challenging applications for radiometric instrument design. As part of NASA's Instrument Incubator Program, our group at JPL has built a prototype instrument, TOGA (Time-shifted, Orthometric, GNSS Array), to address a variety of GNSS science needs. Observing GNSS reflections is a major focus of the design/development effort. The TOGA design features a steerable beam antenna array which can form a high-gain antenna pattern in multiple directions simultaneously. Multiple FPGAs provide flexible digital signal processing logic to process both GPS and Galileo reflections. A Linux OS based science processor serves as experiment scheduler and data post-processor. This paper outlines the TOGA design approach as well as preliminary results of reflection data collected from test flights over the Pacific Ocean. These reflection data demonstrate observation of the GPS L1/L2C/L5 signals.
