The Use of a Microcomputer Based Array Processor for Real Time Laser Velocimeter Data Processing
NASA Technical Reports Server (NTRS)
Meyers, James F.
1990-01-01
The application of an array processor to laser velocimeter data processing is presented. The hardware is described along with the method of parallel programming required by the array processor. A portion of the data processing program is described in detail. The increase in computational speed of a microcomputer equipped with an array processor is illustrated by comparative testing with a minicomputer.
NASA Technical Reports Server (NTRS)
Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)
1983-01-01
A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules and from hence the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occur with the processors operating normally quite independently of each other in a multiprocessing fashion. For data dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode however, the processors are not locked-step but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.
Electrically reconfigurable logic array
NASA Technical Reports Server (NTRS)
Agarwal, R. K.
1982-01-01
To compose the complicated systems using algorithmically specialized logic circuits or processors, one solution is to perform relational computations such as union, division and intersection directly on hardware. These relations can be pipelined efficiently on a network of processors having an array configuration. These processors can be designed and implemented with a few simple cells. In order to determine the state-of-the-art in Electrically Reconfigurable Logic Array (ERLA), a survey of the available programmable logic array (PLA) and the logic circuit elements used in such arrays was conducted. Based on this survey some recommendations are made for ERLA devices.
Reconfigurable signal processor designs for advanced digital array radar systems
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan (Rockee); Yu, Xining
2017-05-01
The new challenges originated from Digital Array Radar (DAR) demands a new generation of reconfigurable backend processor in the system. The new FPGA devices can support much higher speed, more bandwidth and processing capabilities for the need of digital Line Replaceable Unit (LRU). This study focuses on using the latest Altera and Xilinx devices in an adaptive beamforming processor. The field reprogrammable RF devices from Analog Devices are used as analog front end transceivers. Different from other existing Software-Defined Radio transceivers on the market, this processor is designed for distributed adaptive beamforming in a networked environment. The following aspects of the novel radar processor will be presented: (1) A new system-on-chip architecture based on Altera's devices and adaptive processing module, especially for the adaptive beamforming and pulse compression, will be introduced, (2) Successful implementation of generation 2 serial RapidIO data links on FPGA, which supports VITA-49 radio packet format for large distributed DAR processing. (3) Demonstration of the feasibility and capabilities of the processor in a Micro-TCA based, SRIO switching backplane to support multichannel beamforming in real-time. (4) Application of this processor in ongoing radar system development projects, including OU's dual-polarized digital array radar, the planned new cylindrical array radars, and future airborne radars.
A novel VLSI processor architecture for supercomputing arrays
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Pattabiraman, S.; Devanathan, R.; Ahmed, Ashaf; Venkataraman, S.; Ganesh, N.
1993-01-01
Design of the processor element for general purpose massively parallel supercomputing arrays is highly complex and cost ineffective. To overcome this, the architecture and organization of the functional units of the processor element should be such as to suit the diverse computational structures and simplify mapping of complex communication structures of different classes of algorithms. This demands that the computation and communication structures of different class of algorithms be unified. While unifying the different communication structures is a difficult process, analysis of a wide class of algorithms reveals that their computation structures can be expressed in terms of basic IP,IP,OP,CM,R,SM, and MAA operations. The execution of these operations is unified on the PAcube macro-cell array. Based on this PAcube macro-cell array, we present a novel processor element called the GIPOP processor, which has dedicated functional units to perform the above operations. The architecture and organization of these functional units are such to satisfy the two important criteria mentioned above. The structure of the macro-cell and the unification process has led to a very regular and simpler design of the GIPOP processor. The production cost of the GIPOP processor is drastically reduced as it is designed on high performance mask programmable PAcube arrays.
A digital retina-like low-level vision processor.
Mertoguno, S; Bourbakis, N G
2003-01-01
This correspondence presents the basic design and the simulation of a low level multilayer vision processor that emulates to some degree the functional behavior of a human retina. This retina-like multilayer processor is the lower part of an autonomous self-organized vision system, called Kydon, that could be used on visually impaired people with a damaged visual cerebral cortex. The Kydon vision system, however, is not presented in this paper. The retina-like processor consists of four major layers, where each of them is an array processor based on hexagonal, autonomous processing elements that perform a certain set of low level vision tasks, such as smoothing and light adaptation, edge detection, segmentation, line recognition and region-graph generation. At each layer, the array processor is a 2D array of k/spl times/m hexagonal identical autonomous cells that simultaneously execute certain low level vision tasks. Thus, the hardware design and the simulation at the transistor level of the processing elements (PEs) of the retina-like processor and its simulated functionality with illustrative examples are provided in this paper.
Chatterjee, Siddhartha [Yorktown Heights, NY; Gunnels, John A [Brewster, NY
2011-11-08
A method and structure of distributing elements of an array of data in a computer memory to a specific processor of a multi-dimensional mesh of parallel processors includes designating a distribution of elements of at least a portion of the array to be executed by specific processors in the multi-dimensional mesh of parallel processors. The pattern of the designating includes a cyclical repetitive pattern of the parallel processor mesh, as modified to have a skew in at least one dimension so that both a row of data in the array and a column of data in the array map to respective contiguous groupings of the processors such that a dimension of the contiguous groupings is greater than one.
NASA Astrophysics Data System (ADS)
Esepkina, N. A.; Lavrov, A. P.; Anan'ev, M. N.; Blagodarnyi, V. S.; Ivanov, S. I.; Mansyrev, M. I.; Molodyakov, S. A.
1995-10-01
Two new types of optoelectronic radio-signal processors were investigated. Charge-coupled device (CCD) photodetectors are used in these processors under continuous scanning conditions, i.e. in a time delay and storage mode. One of these processors is based on a CCD photodetector array with a reference-signal amplitude transparency and the other is an adaptive acousto-optical signal processor with linear frequency modulation. The processor with the transparency performs multichannel discrete—analogue convolution of an input signal with a corresponding kernel of the transformation determined by the transparency. If a light source is an array of light-emitting diodes of special (stripe) geometry, the optical stages of the processor can be made from optical fibre components and the whole processor then becomes a rigid 'sandwich' (a compact hybrid optoelectronic microcircuit). A report is given also of a study of a prototype processor with optical fibre components for the reception of signals from a system with antenna aperture synthesis, which forms a radio image of the Earth.
APRON: A Cellular Processor Array Simulation and Hardware Design Tool
NASA Astrophysics Data System (ADS)
Barr, David R. W.; Dudek, Piotr
2009-12-01
We present a software environment for the efficient simulation of cellular processor arrays (CPAs). This software (APRON) is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.
JPRS Report, Science & Technology, China, High-Performance Computer Systems
1992-10-28
microprocessor array The microprocessor array in the AP85 system is com- posed of 16 completely identical array element micro - processors . Each array element...microprocessors and capable of host machine reading and writing. The memory capacity of the array element micro - processors as a whole can be expanded...transmission functions to carry out data transmission from array element micro - processor to array element microprocessor, from array element
DFT algorithms for bit-serial GaAs array processor architectures
NASA Technical Reports Server (NTRS)
Mcmillan, Gary B.
1988-01-01
Systems and Processes Engineering Corporation (SPEC) has developed an innovative array processor architecture for computing Fourier transforms and other commonly used signal processing algorithms. This architecture is designed to extract the highest possible array performance from state-of-the-art GaAs technology. SPEC's architectural design includes a high performance RISC processor implemented in GaAs, along with a Floating Point Coprocessor and a unique Array Communications Coprocessor, also implemented in GaAs technology. Together, these data processors represent the latest in technology, both from an architectural and implementation viewpoint. SPEC has examined numerous algorithms and parallel processing architectures to determine the optimum array processor architecture. SPEC has developed an array processor architecture with integral communications ability to provide maximum node connectivity. The Array Communications Coprocessor embeds communications operations directly in the core of the processor architecture. A Floating Point Coprocessor architecture has been defined that utilizes Bit-Serial arithmetic units, operating at very high frequency, to perform floating point operations. These Bit-Serial devices reduce the device integration level and complexity to a level compatible with state-of-the-art GaAs device technology.
High-performance ultra-low power VLSI analog processor for data compression
NASA Technical Reports Server (NTRS)
Tawel, Raoul (Inventor)
1996-01-01
An apparatus for data compression employing a parallel analog processor. The apparatus includes an array of processor cells with N columns and M rows wherein the processor cells have an input device, memory device, and processor device. The input device is used for inputting a series of input vectors. Each input vector is simultaneously input into each column of the array of processor cells in a pre-determined sequential order. An input vector is made up of M components, ones of which are input into ones of M processor cells making up a column of the array. The memory device is used for providing ones of M components of a codebook vector to ones of the processor cells making up a column of the array. A different codebook vector is provided to each of the N columns of the array. The processor device is used for simultaneously comparing the components of each input vector to corresponding components of each codebook vector, and for outputting a signal representative of the closeness between the compared vector components. A combination device is used to combine the signal output from each processor cell in each column of the array and to output a combined signal. A closeness determination device is then used for determining which codebook vector is closest to an input vector from the combined signals, and for outputting a codebook vector index indicating which of the N codebook vectors was the closest to each input vector input into the array.
Noncoherent parallel optical processor for discrete two-dimensional linear transformations.
Glaser, I
1980-10-01
We describe a parallel optical processor, based on a lenslet array, that provides general linear two-dimensional transformations using noncoherent light. Such a processor could become useful in image- and signal-processing applications in which the throughput requirements cannot be adequately satisfied by state-of-the-art digital processors. Experimental results that illustrate the feasibility of the processor by demonstrating its use in parallel optical computation of the two-dimensional Walsh-Hadamard transformation are presented.
Array processor architecture connection network
NASA Technical Reports Server (NTRS)
Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)
1982-01-01
A connection network is disclosed for use between a parallel array of processors and a parallel array of memory modules for establishing non-conflicting data communications paths between requested memory modules and requesting processors. The connection network includes a plurality of switching elements interposed between the processor array and the memory modules array in an Omega networking architecture. Each switching element includes a first and a second processor side port, a first and a second memory module side port, and control logic circuitry for providing data connections between the first and second processor ports and the first and second memory module ports. The control logic circuitry includes strobe logic for examining data arriving at the first and the second processor ports to indicate when the data arriving is requesting data from a requesting processor to a requested memory module. Further, connection circuitry is associated with the strobe logic for examining requesting data arriving at the first and the second processor ports for providing a data connection therefrom to the first and the second memory module ports in response thereto when the data connection so provided does not conflict with a pre-established data connection currently in use.
Potential of minicomputer/array-processor system for nonlinear finite-element analysis
NASA Technical Reports Server (NTRS)
Strohkorb, G. A.; Noor, A. K.
1983-01-01
The potential of using a minicomputer/array-processor system for the efficient solution of large-scale, nonlinear, finite-element problems is studied. A Prime 750 is used as the host computer, and a software simulator residing on the Prime is employed to assess the performance of the Floating Point Systems AP-120B array processor. Major hardware characteristics of the system such as virtual memory and parallel and pipeline processing are reviewed, and the interplay between various hardware components is examined. Effective use of the minicomputer/array-processor system for nonlinear analysis requires the following: (1) proper selection of the computational procedure and the capability to vectorize the numerical algorithms; (2) reduction of input-output operations; and (3) overlapping host and array-processor operations. A detailed discussion is given of techniques to accomplish each of these tasks. Two benchmark problems with 1715 and 3230 degrees of freedom, respectively, are selected to measure the anticipated gain in speed obtained by using the proposed algorithms on the array processor.
PCI-based WILDFIRE reconfigurable computing engines
NASA Astrophysics Data System (ADS)
Fross, Bradley K.; Donaldson, Robert L.; Palmer, Douglas J.
1996-10-01
WILDFORCE is the first PCI-based custom reconfigurable computer that is based on the Splash 2 technology transferred from the National Security Agency and the Institute for Defense Analyses, Supercomputing Research Center (SRC). The WILDFORCE architecture has many of the features of the WILDFIRE computer, such as field- programmable gate array (FPGA) based processing elements, linear array and crossbar interconnection, and high- performance memory and I/O subsystems. New features introduced in the PCI-based WILDFIRE systems include memory/processor options that can be added to any processing element. These options include static and dynamic memory, digital signal processors (DSPs), FPGAs, and microprocessors. In addition to memory/processor options, many different application specific connectors can be used to extend the I/O capabilities of the system, including systolic I/O, camera input and video display output. This paper also discusses how this new PCI-based reconfigurable computing engine is used for rapid-prototyping, real-time video processing and other DSP applications.
Systolic Processor Array For Recognition Of Spectra
NASA Technical Reports Server (NTRS)
Chow, Edward T.; Peterson, John C.
1995-01-01
Spectral signatures of materials detected and identified quickly. Spectral Analysis Systolic Processor Array (SPA2) relatively inexpensive and satisfies need to analyze large, complex volume of multispectral data generated by imaging spectrometers to extract desired information: computational performance needed to do this in real time exceeds that of current supercomputers. Locates highly similar segments or contiguous subsegments in two different spectra at time. Compares sampled spectra from instruments with data base of spectral signatures of known materials. Computes and reports scores that express degrees of similarity between sampled and data-base spectra.
A class of parallel algorithms for computation of the manipulator inertia matrix
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1989-01-01
Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos
2011-01-01
General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
Preliminary study on the potential usefulness of array processor techniques for structural synthesis
NASA Technical Reports Server (NTRS)
Feeser, L. J.
1980-01-01
The effects of the use of array processor techniques within the structural analyzer program, SPAR, are simulated in order to evaluate the potential analysis speedups which may result. In particular the connection of a Floating Point System AP120 processor to the PRIME computer is discussed. Measurements of execution, input/output, and data transfer times are given. Using these data estimates are made as to the relative speedups that can be executed in a more complete implementation on an array processor maxi-mini computer system.
Rectangular Array Of Digital Processors For Planning Paths
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.; Fossum, Eric R.; Nixon, Robert H.
1993-01-01
Prototype 24 x 25 rectangular array of asynchronous parallel digital processors rapidly finds best path across two-dimensional field, which could be patch of terrain traversed by robotic or military vehicle. Implemented as single-chip very-large-scale integrated circuit. Excepting processors on edges, each processor communicates with four nearest neighbors along paths representing travel to north, south, east, and west. Each processor contains delay generator in form of 8-bit ripple counter, preset to 1 of 256 possible values. Operation begins with choice of processor representing starting point. Transmits signals to nearest neighbor processors, which retransmits to other neighboring processors, and process repeats until signals propagated across entire field.
Low-Latency Embedded Vision Processor (LLEVS)
2016-03-01
26 3.2.3 Task 3 Projected Performance Analysis of FPGA- based Vision Processor ........... 31 3.2.3.1 Algorithms Latency Analysis ...Programmable Gate Array Custom Hardware for Real- Time Multiresolution Analysis . ............................................... 35...conduct data analysis for performance projections. The data acquired through measurements , simulation and estimation provide the requisite platform for
Reduction of solar vector magnetograph data using a microMSP array processor
NASA Technical Reports Server (NTRS)
Kineke, Jack
1990-01-01
The processing of raw data obtained by the solar vector magnetograph at NASA-Marshall requires extensive arithmetic operations on large arrays of real numbers. The objectives of this summer faculty fellowship study are to: (1) learn the programming language of the MicroMSP Array Processor and adapt some existing data reduction routines to exploit its capabilities; and (2) identify other applications and/or existing programs which lend themselves to array processor utilization which can be developed by undergraduate student programmers under the provisions of project JOVE.
Parallel processing in a host plus multiple array processor system for radar
NASA Technical Reports Server (NTRS)
Barkan, B. Z.
1983-01-01
Host plus multiple array processor architecture is demonstrated to yield a modular, fast, and cost-effective system for radar processing. Software methodology for programming such a system is developed. Parallel processing with pipelined data flow among the host, array processors, and discs is implemented. Theoretical analysis of performance is made and experimentally verified. The broad class of problems to which the architecture and methodology can be applied is indicated.
Contextual classification on a CDC Flexible Processor system. [for photomapped remote sensing data
NASA Technical Reports Server (NTRS)
Smith, B. W.; Siegel, H. J.; Swain, P. H.
1981-01-01
A potential hardware organization for the Flexible Processor Array is presented. An algorithm that implements a contextual classifier for remote sensing data analysis is given, along with uniprocessor classification algorithms. The Flexible Processor algorithm is provided, as are simulated timings for contextual classifiers run on the Flexible Processor Array and another system. The timings are analyzed for context neighborhoods of sizes three and nine.
Electronic neural network for solving traveling salesman and similar global optimization problems
NASA Technical Reports Server (NTRS)
Thakoor, Anilkumar P. (Inventor); Moopenn, Alexander W. (Inventor); Duong, Tuan A. (Inventor); Eberhardt, Silvio P. (Inventor)
1993-01-01
This invention is a novel high-speed neural network based processor for solving the 'traveling salesman' and other global optimization problems. It comprises a novel hybrid architecture employing a binary synaptic array whose embodiment incorporates the fixed rules of the problem, such as the number of cities to be visited. The array is prompted by analog voltages representing variables such as distances. The processor incorporates two interconnected feedback networks, each of which solves part of the problem independently and simultaneously, yet which exchange information dynamically.
Design and implementation of a high performance network security processor
NASA Astrophysics Data System (ADS)
Wang, Haixin; Bai, Guoqiang; Chen, Hongyi
2010-03-01
The last few years have seen many significant progresses in the field of application-specific processors. One example is network security processors (NSPs) that perform various cryptographic operations specified by network security protocols and help to offload the computation intensive burdens from network processors (NPs). This article presents a high performance NSP system architecture implementation intended for both internet protocol security (IPSec) and secure socket layer (SSL) protocol acceleration, which are widely employed in virtual private network (VPN) and e-commerce applications. The efficient dual one-way pipelined data transfer skeleton and optimised integration scheme of the heterogenous parallel crypto engine arrays lead to a Gbps rate NSP, which is programmable with domain specific descriptor-based instructions. The descriptor-based control flow fragments large data packets and distributes them to the crypto engine arrays, which fully utilises the parallel computation resources and improves the overall system data throughput. A prototyping platform for this NSP design is implemented with a Xilinx XC3S5000 based FPGA chip set. Results show that the design gives a peak throughput for the IPSec ESP tunnel mode of 2.85 Gbps with over 2100 full SSL handshakes per second at a clock rate of 95 MHz.
FPGA-based multiprocessor system for injection molding control.
Muñoz-Barron, Benigno; Morales-Velazquez, Luis; Romero-Troncoso, Rene J; Rodriguez-Donate, Carlos; Trejo-Hernandez, Miguel; Benitez-Rangel, Juan P; Osornio-Rios, Roque A
2012-10-18
The plastic industry is a very important manufacturing sector and injection molding is a widely used forming method in that industry. The contribution of this work is the development of a strategy to retrofit control of an injection molding machine based on an embedded system microprocessors sensor network on a field programmable gate array (FPGA) device. Six types of embedded processors are included in the system: a smart-sensor processor, a micro fuzzy logic controller, a programmable logic controller, a system manager, an IO processor and a communication processor. Temperature, pressure and position are controlled by the proposed system and experimentation results show its feasibility and robustness. As validation of the present work, a particular sample was successfully injected.
Glaser, I
1982-04-01
By combining a lenslet array with masks it is possible to obtain a noncoherent optical processor capable of computing in parallel generalized 2-D discrete linear transformations. We present here an analysis of such lenslet array processors (LAP). The effect of several errors, including optical aberrations, diffraction, vignetting, and geometrical and mask errors, are calculated, and guidelines to optical design of LAP are derived. Using these results, both ultimate and practical performances of LAP are compared with those of competing techniques.
A cost-effective methodology for the design of massively-parallel VLSI functional units
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Sriram, G.; Desouza, J.
1993-01-01
In this paper we propose a generalized methodology for the design of cost-effective massively-parallel VLSI Functional Units. This methodology is based on a technique of generating and reducing a massive bit-array on the mask-programmable PAcube VLSI array. This methodology unifies (maintains identical data flow and control) the execution of complex arithmetic functions on PAcube arrays. It is highly regular, expandable and uniform with respect to problem-size and wordlength, thereby reducing the communication complexity. The memory-functional unit interface is regular and expandable. Using this technique functional units of dedicated processors can be mask-programmed on the naked PAcube arrays, reducing the turn-around time. The production cost of such dedicated processors can be drastically reduced since the naked PAcube arrays can be mass-produced. Analysis of the the performance of functional units designed by our method yields promising results.
Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array
NASA Astrophysics Data System (ADS)
Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul
2008-04-01
This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow
NASA Technical Reports Server (NTRS)
Lottati, I.
1983-01-01
A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
A wideband software reconfigurable modem
NASA Astrophysics Data System (ADS)
Turner, J. H., Jr.; Vickers, H.
A wideband modem is described which provides signal processing capability for four Lx-band signals employing QPSK, MSK and PPM waveforms and employs a software reconfigurable architecture for maximum system flexibility and graceful degradation. The current processor uses a 2901 and two 8086 microprocessors per channel and performs acquisition, tracking, and data demodulation for JITDS, GPS, IFF and TACAN systems. The next generation processor will be implemented using a VHSIC chip set employing a programmable complex array vector processor module, a GP computer module, customized gate array modules, and a digital array correlator. This integrated processor has application to a wide number of diverse system waveforms, and will bring the benefits of VHSIC technology insertion into avionic antijam communications systems.
NASA Technical Reports Server (NTRS)
Rickard, D. A.; Bodenheimer, R. E.
1976-01-01
Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.
FPGA-Based Multiprocessor System for Injection Molding Control
Muñoz-Barron, Benigno; Morales-Velazquez, Luis; Romero-Troncoso, Rene J.; Rodriguez-Donate, Carlos; Trejo-Hernandez, Miguel; Benitez-Rangel, Juan P.; Osornio-Rios, Roque A.
2012-01-01
The plastic industry is a very important manufacturing sector and injection molding is a widely used forming method in that industry. The contribution of this work is the development of a strategy to retrofit control of an injection molding machine based on an embedded system microprocessors sensor network on a field programmable gate array (FPGA) device. Six types of embedded processors are included in the system: a smart-sensor processor, a micro fuzzy logic controller, a programmable logic controller, a system manager, an IO processor and a communication processor. Temperature, pressure and position are controlled by the proposed system and experimentation results show its feasibility and robustness. As validation of the present work, a particular sample was successfully injected. PMID:23202036
NASA Astrophysics Data System (ADS)
Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki
At present, when using space-time processing techniques with multiple antennas for mobile radio communication, real-time weight adaptation is necessary. Due to the progress of integrated circuit technology, dedicated processor implementation with ASIC or FPGA can be employed to implement various wireless applications. This paper presents a resource and performance evaluation of the QRD-RLS systolic array processor based on fixed-point CORDIC algorithm with FPGA. In this paper, to save hardware resources, we propose the shared architecture of a complex CORDIC processor. The required precision of internal calculation, the circuit area for the number of antenna elements and wordlength, and the processing speed will be evaluated. The resource estimation provides a possible processor configuration with a current FPGA on the market. Computer simulations assuming a fading channel will show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented and its operation was verified by beamforming evaluation through a radio propagation experiment.
Developing infrared array controller with software real time operating system
NASA Astrophysics Data System (ADS)
Sako, Shigeyuki; Miyata, Takashi; Nakamura, Tomohiko; Motohara, Kentaro; Uchimoto, Yuka Katsuno; Onaka, Takashi; Kataza, Hirokazu
2008-07-01
Real-time capabilities are required for a controller of a large format array to reduce a dead-time attributed by readout and data transfer. The real-time processing has been achieved by dedicated processors including DSP, CPLD, and FPGA devices. However, the dedicated processors have problems with memory resources, inflexibility, and high cost. Meanwhile, a recent PC has sufficient resources of CPUs and memories to control the infrared array and to process a large amount of frame data in real-time. In this study, we have developed an infrared array controller with a software real-time operating system (RTOS) instead of the dedicated processors. A Linux PC equipped with a RTAI extension and a dual-core CPU is used as a main computer, and one of the CPU cores is allocated to the real-time processing. A digital I/O board with DMA functions is used for an I/O interface. The signal-processing cores are integrated in the OS kernel as a real-time driver module, which is composed of two virtual devices of the clock processor and the frame processor tasks. The array controller with the RTOS realizes complicated operations easily, flexibly, and at a low cost.
Ring-array processor distribution topology for optical interconnects
NASA Technical Reports Server (NTRS)
Li, Yao; Ha, Berlin; Wang, Ting; Wang, Sunyu; Katz, A.; Lu, X. J.; Kanterakis, E.
1992-01-01
The existing linear and rectangular processor distribution topologies for optical interconnects, although promising in many respects, cannot solve problems such as clock skews, the lack of supporting elements for efficient optical implementation, etc. The use of a ring-array processor distribution topology, however, can overcome these problems. Here, a study of the ring-array topology is conducted with an aim of implementing various fast clock rate, high-performance, compact optical networks for digital electronic multiprocessor computers. Practical design issues are addressed. Some proof-of-principle experimental results are included.
A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors
Seo, Jiwon; Chen, Yu-Hsuan; De Lorenzo, David S.; Lo, Sherman; Enge, Per; Akos, Dennis; Lee, Jiyun
2011-01-01
Due to their weak received signal power, Global Positioning System (GPS) signals are vulnerable to radio frequency interference. Adaptive beam and null steering of the gain pattern of a GPS antenna array can significantly increase the resistance of GPS sensors to signal interference and jamming. Since adaptive array processing requires intensive computational power, beamsteering GPS receivers were usually implemented using hardware such as field-programmable gate arrays (FPGAs). However, a software implementation using general-purpose processors is much more desirable because of its flexibility and cost effectiveness. This paper presents a GPS software-defined radio (SDR) with adaptive beamsteering capability for anti-jam applications. The GPS SDR design is based on an optimized desktop parallel processing architecture using a quad-core Central Processing Unit (CPU) coupled with a new generation Graphics Processing Unit (GPU) having massively parallel processors. This GPS SDR demonstrates sufficient computational capability to support a four-element antenna array and future GPS L5 signal processing in real time. After providing the details of our design and optimization schemes for future GPU-based GPS SDR developments, the jamming resistance of our GPS SDR under synthetic wideband jamming is presented. Since the GPS SDR uses commercial-off-the-shelf hardware and processors, it can be easily adopted in civil GPS applications requiring anti-jam capabilities. PMID:22164116
A real-time capable software-defined receiver using GPU for adaptive anti-jam GPS sensors.
Seo, Jiwon; Chen, Yu-Hsuan; De Lorenzo, David S; Lo, Sherman; Enge, Per; Akos, Dennis; Lee, Jiyun
2011-01-01
Due to their weak received signal power, Global Positioning System (GPS) signals are vulnerable to radio frequency interference. Adaptive beam and null steering of the gain pattern of a GPS antenna array can significantly increase the resistance of GPS sensors to signal interference and jamming. Since adaptive array processing requires intensive computational power, beamsteering GPS receivers were usually implemented using hardware such as field-programmable gate arrays (FPGAs). However, a software implementation using general-purpose processors is much more desirable because of its flexibility and cost effectiveness. This paper presents a GPS software-defined radio (SDR) with adaptive beamsteering capability for anti-jam applications. The GPS SDR design is based on an optimized desktop parallel processing architecture using a quad-core Central Processing Unit (CPU) coupled with a new generation Graphics Processing Unit (GPU) having massively parallel processors. This GPS SDR demonstrates sufficient computational capability to support a four-element antenna array and future GPS L5 signal processing in real time. After providing the details of our design and optimization schemes for future GPU-based GPS SDR developments, the jamming resistance of our GPS SDR under synthetic wideband jamming is presented. Since the GPS SDR uses commercial-off-the-shelf hardware and processors, it can be easily adopted in civil GPS applications requiring anti-jam capabilities.
Optical microwave filter based on spectral slicing by use of arrayed waveguide gratings.
Pastor, Daniel; Ortega, Beatriz; Capmany, José; Sales, Salvador; Martinez, Alfonso; Muñoz, Pascual
2003-10-01
We have experimentally demonstrated a new optical signal processor based on the use of arrayed waveguide gratings. The structure exploits the concept of spectral slicing combined with the use of an optical dispersive medium. The approach presents increased flexibility from previous slicing-based structures in terms of tunability, reconfiguration, and apodization of the samples or coefficients of the transversal optical filter.
Tomkins, James L [Albuquerque, NM; Camp, William J [Albuquerque, NM
2007-07-17
A multiple processor computing apparatus includes a physical interconnect structure that is flexibly configurable to support selective segregation of classified and unclassified users. The physical interconnect structure includes routers in service or compute processor boards distributed in an array of cabinets connected in series on each board and to respective routers in neighboring row cabinet boards with the routers in series connection coupled to routers in series connection in respective neighboring column cabinet boards. The array can include disconnect cabinets or respective routers in all boards in each cabinet connected in a toroid. The computing apparatus can include an emulator which permits applications from the same job to be launched on processors that use different operating systems.
Soft-core processor study for node-based architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Houten, Jonathan Roger; Jarosz, Jason P.; Welch, Benjamin James
2008-09-01
Node-based architecture (NBA) designs for future satellite projects hold the promise of decreasing system development time and costs, size, weight, and power and positioning the laboratory to address other emerging mission opportunities quickly. Reconfigurable Field Programmable Gate Array (FPGA) based modules will comprise the core of several of the NBA nodes. Microprocessing capabilities will be necessary with varying degrees of mission-specific performance requirements on these nodes. To enable the flexibility of these reconfigurable nodes, it is advantageous to incorporate the microprocessor into the FPGA itself, either as a hardcore processor built into the FPGA or as a soft-core processor builtmore » out of FPGA elements. This document describes the evaluation of three reconfigurable FPGA based processors for use in future NBA systems--two soft cores (MicroBlaze and non-fault-tolerant LEON) and one hard core (PowerPC 405). Two standard performance benchmark applications were developed for each processor. The first, Dhrystone, is a fixed-point operation metric. The second, Whetstone, is a floating-point operation metric. Several trials were run at varying code locations, loop counts, processor speeds, and cache configurations. FPGA resource utilization was recorded for each configuration. Cache configurations impacted the results greatly; for optimal processor efficiency it is necessary to enable caches on the processors. Processor caches carry a penalty; cache error mitigation is necessary when operating in a radiation environment.« less
Parallel computing on Unix workstation arrays
NASA Astrophysics Data System (ADS)
Reale, F.; Bocchino, F.; Sciortino, S.
1994-12-01
We have tested arrays of general-purpose Unix workstations used as MIMD systems for massive parallel computations. In particular we have solved numerically a demanding test problem with a 2D hydrodynamic code, generally developed to study astrophysical flows, by exucuting it on arrays either of DECstations 5000/200 on Ethernet LAN, or of DECstations 3000/400, equipped with powerful Alpha processors, on FDDI LAN. The code is appropriate for data-domain decomposition, and we have used a library for parallelization previously developed in our Institute, and easily extended to work on Unix workstation arrays by using the PVM software toolset. We have compared the parallel efficiencies obtained on arrays of several processors to those obtained on a dedicated MIMD parallel system, namely a Meiko Computing Surface (CS-1), equipped with Intel i860 processors. We discuss the feasibility of using non-dedicated parallel systems and conclude that the convenience depends essentially on the size of the computational domain as compared to the relative processor power and network bandwidth. We point out that for future perspectives a parallel development of processor and network technology is important, and that the software still offers great opportunities of improvement, especially in terms of latency times in the message-passing protocols. In conditions of significant gain in terms of speedup, such workstation arrays represent a cost-effective approach to massive parallel computations.
Processing techniques for software based SAR processors
NASA Technical Reports Server (NTRS)
Leung, K.; Wu, C.
1983-01-01
Software SAR processing techniques defined to treat Shuttle Imaging Radar-B (SIR-B) data are reviewed. The algorithms are devised for the data processing procedure selection, SAR correlation function implementation, multiple array processors utilization, cornerturning, variable reference length azimuth processing, and range migration handling. The Interim Digital Processor (IDP) originally implemented for handling Seasat SAR data has been adapted for the SIR-B, and offers a resolution of 100 km using a processing procedure based on the Fast Fourier Transformation fast correlation approach. Peculiarities of the Seasat SAR data processing requirements are reviewed, along with modifications introduced for the SIR-B. An Advanced Digital SAR Processor (ADSP) is under development for use with the SIR-B in the 1986 time frame as an upgrade for the IDP, which will be in service in 1984-5.
NASA Astrophysics Data System (ADS)
Selker, Ted
1983-05-01
Lens focusing using a hardware model of a retina (Reticon RL256 light sensitive array) with a low cost processor (8085 with 512 bytes of ROM and 512 bytes of RAM) was built. This system was developed and tested on a variety of visual stimuli to demonstrate that: a)an algorithm which moves a lens to maximize the sum of the difference of light level on adjacent light sensors will converge to best focus in all but contrived situations. This is a simpler algorithm than any previously suggested; b) it is feasible to use unmodified video sensor arrays with in-expensive processors to aid video camera use. In the future, software could be developed to extend the processor's usefulness, possibly to track an actor by panning and zooming to give a earners operator increased ease of framing; c) lateral inhibition is an adequate basis for determining best focus. This supports a simple anatomically motivated model of how our brain focuses our eyes.
An acceleration framework for synthetic aperture radar algorithms
NASA Astrophysics Data System (ADS)
Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.
2017-04-01
Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR) are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
Sequence information signal processor
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1999-01-01
An electronic circuit is used to compare two sequences, such as genetic sequences, to determine which alignment of the sequences produces the greatest similarity. The circuit includes a linear array of series-connected processors, each of which stores a single element from one of the sequences and compares that element with each successive element in the other sequence. For each comparison, the processor generates a scoring parameter that indicates which segment ending at those two elements produces the greatest degree of similarity between the sequences. The processor uses the scoring parameter to generate a similar scoring parameter for a comparison between the stored element and the next successive element from the other sequence. The processor also delivers the scoring parameter to the next processor in the array for use in generating a similar scoring parameter for another pair of elements. The electronic circuit determines which processor and alignment of the sequences produce the scoring parameter with the highest value.
Implementation of context independent code on a new array processor: The Super-65
NASA Technical Reports Server (NTRS)
Colbert, R. O.; Bowhill, S. A.
1981-01-01
The feasibility of rewriting standard uniprocessor programs into code which contains no context-dependent branches is explored. Context independent code (CIC) would contain no branches that might require different processing elements to branch different ways. In order to investigate the possibilities and restrictions of CIC, several programs were recoded into CIC and a four-element array processor was built. This processor (the Super-65) consisted of three 6502 microprocessors and the Apple II microcomputer. The results obtained were somewhat dependent upon the specific architecture of the Super-65 but within bounds, the throughput of the array processor was found to increase linearly with the number of processing elements (PEs). The slope of throughput versus PEs is highly dependent on the program and varied from 0.33 to 1.00 for the sample programs.
Optical systolic array processor using residue arithmetic
NASA Technical Reports Server (NTRS)
Jackson, J.; Casasent, D.
1983-01-01
The use of residue arithmetic to increase the accuracy and reduce the dynamic range requirements of optical matrix-vector processors is evaluated. It is determined that matrix-vector operations and iterative algorithms can be performed totally in residue notation. A new parallel residue quantizer circuit is developed which significantly improves the performance of the systolic array feedback processor. Results are presented of a computer simulation of this system used to solve a set of three simultaneous equations.
Design and implementation of highly parallel pipelined VLSI systems
NASA Astrophysics Data System (ADS)
Delange, Alphonsus Anthonius Jozef
A methodology and its realization as a prototype CAD (Computer Aided Design) system for the design and analysis of complex multiprocessor systems is presented. The design is an iterative process in which the behavioral specifications of the system components are refined into structural descriptions consisting of interconnections and lower level components etc. A model for the representation and analysis of multiprocessor systems at several levels of abstraction and an implementation of a CAD system based on this model are described. A high level design language, an object oriented development kit for tool design, a design data management system, and design and analysis tools such as a high level simulator and graphics design interface which are integrated into the prototype system and graphics interface are described. Procedures for the synthesis of semiregular processor arrays, and to compute the switching of input/output signals, memory management and control of processor array, and sequencing and segmentation of input/output data streams due to partitioning and clustering of the processor array during the subsequent synthesis steps, are described. The architecture and control of a parallel system is designed and each component mapped to a module or module generator in a symbolic layout library, compacted for design rules of VLSI (Very Large Scale Integration) technology. An example of the design of a processor that is a useful building block for highly parallel pipelined systems in the signal/image processing domains is given.
Sequence information signal processor for local and global string comparisons
Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.
1997-01-01
A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's compliment operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
Digital Parallel Processor Array for Optimum Path Planning
NASA Technical Reports Server (NTRS)
Kremeny, Sabrina E. (Inventor); Fossum, Eric R. (Inventor); Nixon, Robert H. (Inventor)
1996-01-01
The invention computes the optimum path across a terrain or topology represented by an array of parallel processor cells interconnected between neighboring cells by links extending along different directions to the neighboring cells. Such an array is preferably implemented as a high-speed integrated circuit. The computation of the optimum path is accomplished by, in each cell, receiving stimulus signals from neighboring cells along corresponding directions, determining and storing the identity of a direction along which the first stimulus signal is received, broadcasting a subsequent stimulus signal to the neighboring cells after a predetermined delay time, whereby stimulus signals propagate throughout the array from a starting one of the cells. After propagation of the stimulus signal throughout the array, a master processor traces back from a selected destination cell to the starting cell along an optimum path of the cells in accordance with the identity of the directions stored in each of the cells.
Real time processor for array speckle interferometry
NASA Astrophysics Data System (ADS)
Chin, Gordon; Florez, Jose; Borelli, Renan; Fong, Wai; Miko, Joseph; Trujillo, Carlos
1989-02-01
The authors are constructing a real-time processor to acquire image frames, perform array flat-fielding, execute a 64 x 64 element two-dimensional complex FFT (fast Fourier transform) and average the power spectrum, all within the 25 ms coherence time for speckles at near-IR (infrared) wavelength. The processor will be a compact unit controlled by a PC with real-time display and data storage capability. This will provide the ability to optimize observations and obtain results on the telescope rather than waiting several weeks before the data can be analyzed and viewed with offline methods. The image acquisition and processing, design criteria, and processor architecture are described.
Real time processor for array speckle interferometry
NASA Technical Reports Server (NTRS)
Chin, Gordon; Florez, Jose; Borelli, Renan; Fong, Wai; Miko, Joseph; Trujillo, Carlos
1989-01-01
The authors are constructing a real-time processor to acquire image frames, perform array flat-fielding, execute a 64 x 64 element two-dimensional complex FFT (fast Fourier transform) and average the power spectrum, all within the 25 ms coherence time for speckles at near-IR (infrared) wavelength. The processor will be a compact unit controlled by a PC with real-time display and data storage capability. This will provide the ability to optimize observations and obtain results on the telescope rather than waiting several weeks before the data can be analyzed and viewed with offline methods. The image acquisition and processing, design criteria, and processor architecture are described.
Wake Vortex Avoidance System and Method
NASA Technical Reports Server (NTRS)
Shams, Qamar A. (Inventor); Zuckerwar, Allan J. (Inventor); Knight, Howard K. (Inventor)
2017-01-01
A wake vortex avoidance system includes a microphone array configured to detect low frequency sounds. A signal processor determines a geometric mean coherence based on the detected low frequency sounds. A display displays wake vortices based on the determined geometric mean coherence.
Analog hardware for learning neural networks
NASA Technical Reports Server (NTRS)
Eberhardt, Silvio P. (Inventor)
1991-01-01
This is a recurrent or feedforward analog neural network processor having a multi-level neuron array and a synaptic matrix for storing weighted analog values of synaptic connection strengths which is characterized by temporarily changing one connection strength at a time to determine its effect on system output relative to the desired target. That connection strength is then adjusted based on the effect, whereby the processor is taught the correct response to training examples connection by connection.
Dynamic load balance scheme for the DSMC algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Jin; Geng, Xiangren; Jiang, Dingwu
The direct simulation Monte Carlo (DSMC) algorithm, devised by Bird, has been used over a wide range of various rarified flow problems in the past 40 years. While the DSMC is suitable for the parallel implementation on powerful multi-processor architecture, it also introduces a large load imbalance across the processor array, even for small examples. The load imposed on a processor by a DSMC calculation is determined to a large extent by the total of simulator particles upon it. Since most flows are impulsively started with initial distribution of particles which is surely quite different from the steady state, themore » total of simulator particles will change dramatically. The load balance based upon an initial distribution of particles will break down as the steady state of flow is reached. The load imbalance and huge computational cost of DSMC has limited its application to rarefied or simple transitional flows. In this paper, by taking advantage of METIS, a software for partitioning unstructured graphs, and taking the total of simulator particles in each cell as a weight information, the repartitioning based upon the principle that each processor handles approximately the equal total of simulator particles has been achieved. The computation must pause several times to renew the total of simulator particles in each processor and repartition the whole domain again. Thus the load balance across the processors array holds in the duration of computation. The parallel efficiency can be improved effectively. The benchmark solution of a cylinder submerged in hypersonic flow has been simulated numerically. Besides, hypersonic flow past around a complex wing-body configuration has also been simulated. The results have displayed that, for both of cases, the computational time can be reduced by about 50%.« less
Smart-Pixel Array Processors Based on Optimal Cellular Neural Networks for Space Sensor Applications
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi; Sheu, Bing J.; Venus, Holger; Sandau, Rainer
1997-01-01
A smart-pixel cellular neural network (CNN) with hardware annealing capability, digitally programmable synaptic weights, and multisensor parallel interface has been under development for advanced space sensor applications. The smart-pixel CNN architecture is a programmable multi-dimensional array of optoelectronic neurons which are locally connected with their local neurons and associated active-pixel sensors. Integration of the neuroprocessor in each processor node of a scalable multiprocessor system offers orders-of-magnitude computing performance enhancements for on-board real-time intelligent multisensor processing and control tasks of advanced small satellites. The smart-pixel CNN operation theory, architecture, design and implementation, and system applications are investigated in detail. The VLSI (Very Large Scale Integration) implementation feasibility was illustrated by a prototype smart-pixel 5x5 neuroprocessor array chip of active dimensions 1380 micron x 746 micron in a 2-micron CMOS technology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nash, T.; Atac, R.; Cook, A.
1989-03-06
The ACPMAPS multipocessor is a highly cost effective, local memory parallel computer with a hypercube or compound hypercube architecture. Communication requires the attention of only the two communicating nodes. The design is aimed at floating point intensive, grid like problems, particularly those with extreme computing requirements. The processing nodes of the system are single board array processors, each with a peak power of 20 Mflops, supported by 8 Mbytes of data and 2 Mbytes of instruction memory. The system currently being assembled has a peak power of 5 Gflops. The nodes are based on the Weitek XL Chip set. Themore » system delivers performance at approximately $300/Mflop. 8 refs., 4 figs.« less
Video rate morphological processor based on a redundant number representation
NASA Astrophysics Data System (ADS)
Kuczborski, Wojciech; Attikiouzel, Yianni; Crebbin, Gregory A.
1992-03-01
This paper presents a video rate morphological processor for automated visual inspection of printed circuit boards, integrated circuit masks, and other complex objects. Inspection algorithms are based on gray-scale mathematical morphology. Hardware complexity of the known methods of real-time implementation of gray-scale morphology--the umbra transform and the threshold decomposition--has prompted us to propose a novel technique which applied an arithmetic system without carrying propagation. After considering several arithmetic systems, a redundant number representation has been selected for implementation. Two options are analyzed here. The first is a pure signed digit number representation (SDNR) with the base of 4. The second option is a combination of the base-2 SDNR (to represent gray levels of images) and the conventional twos complement code (to represent gray levels of structuring elements). Operation principle of the morphological processor is based on the concept of the digit level systolic array. Individual processing units and small memory elements create a pipeline. The memory elements store current image windows (kernels). All operation primitives of processing units apply a unified direction of digit processing: most significant digit first (MSDF). The implementation technology is based on the field programmable gate arrays by Xilinx. This paper justified the rationality of a new approach to logic design, which is the decomposition of Boolean functions instead of Boolean minimization.
System and method for cognitive processing for data fusion
NASA Technical Reports Server (NTRS)
Duong, Tuan A. (Inventor); Duong, Vu A. (Inventor)
2012-01-01
A system and method for cognitive processing of sensor data. A processor array receiving analog sensor data and having programmable interconnects, multiplication weights, and filters provides for adaptive learning in real-time. A static random access memory contains the programmable data for the processor array and the stored data is modified to provide for adaptive learning.
Iterative color-multiplexed, electro-optical processor.
Psaltis, D; Casasent, D; Carlotto, M
1979-11-01
A noncoherent optical vector-matrix multiplier using a linear LED source array and a linear P-I-N photodiode detector array has been combined with a 1-D adder in a feedback loop. The resultant iterative optical processor and its use in solving simultaneous linear equations are described. Operation on complex data is provided by a novel color-multiplexing system.
Digital system for structural dynamics simulation
NASA Technical Reports Server (NTRS)
Krauter, A. I.; Lagace, L. J.; Wojnar, M. K.; Glor, C.
1982-01-01
State-of-the-art digital hardware and software for the simulation of complex structural dynamic interactions, such as those which occur in rotating structures (engine systems). System were incorporated in a designed to use an array of processors in which the computation for each physical subelement or functional subsystem would be assigned to a single specific processor in the simulator. These node processors are microprogrammed bit-slice microcomputers which function autonomously and can communicate with each other and a central control minicomputer over parallel digital lines. Inter-processor nearest neighbor communications busses pass the constants which represent physical constraints and boundary conditions. The node processors are connected to the six nearest neighbor node processors to simulate the actual physical interface of real substructures. Computer generated finite element mesh and force models can be developed with the aid of the central control minicomputer. The control computer also oversees the animation of a graphics display system, disk-based mass storage along with the individual processing elements.
Cheung, Kit; Schultz, Simon R; Luk, Wayne
2015-01-01
NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
Cheung, Kit; Schultz, Simon R.; Luk, Wayne
2016-01-01
NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current or conductance based neuronal models such as integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve a real-time performance of 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times the speed of an 8-core processor, or 2.83 times the speed of GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation. PMID:26834542
Design and Evaluation of Fault-Tolerant VLSI/WSI Processor Arrays.
1987-12-31
studies reported in this paper. In Section .3, the reliabuility characteristics of single-level FTPA’s are discusseri. Four different type of FTPA’s are...for processor arrays are proposed and studied . Stu- dies on algorithmic and software aspects relevant to systems are reported in items 4, 5, 8, 12 and...O’Keefe M., and Fortes, J. A. B., "A Comparative Study of Two Systematic Design Methodologies for Systolic Arrays," (Long Version) International Workshop on
DOE Office of Scientific and Technical Information (OSTI.GOV)
McConaghy, C. F.; Gascoyne, P. R.
The purpose ofthis project was to develop a general-purpose analysis system based on a programmable fluid processor (PFP). The PFP is an array of electrodes surrounded by fluid reservoirs and injectors. Injected droplets of various reagents are manjpulated and combined on the array by Dielectrophoretic (DEP) forces. The goal was to create a small handheld device that could accomplish the tasks currently undertaken by much larger, time consuming, manual manipulation in the lab. The entire effo1t was funded by DARPA under the Bio-Flips program. MD Anderson Cancer Center was the PI for the DARPA effort. The Bio-Flips program was amore » 3- year program that ran from September 2000 to September 2003. The CRADA was somewhat behind the Bi-Flips program running from June 2001 to June 2004 with a no cost extension to September 2004.« less
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods.
Zierke, Stephanie; Bakos, Jason D
2010-04-12
Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10x speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
FFT Computation with Systolic Arrays, A New Architecture
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The use of the Cooley-Tukey algorithm for computing the l-d FFT lends itself to a particular matrix factorization which suggests direct implementation by linearly-connected systolic arrays. Here we present a new systolic architecture that embodies this algorithm. This implementation requires a smaller number of processors and a smaller number of memory cells than other recent implementations, as well as having all the advantages of systolic arrays. For the implementation of the decimation-in-frequency case, word-serial data input allows continuous real-time operation without the need of a serial-to-parallel conversion device. No control or data stream switching is necessary. Computer simulation of this architecture was done in the context of a 1024 point DFT with a fixed point processor, and CMOS processor implementation has started.
Radiation Hardened Electronics for Extreme Environments
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Watson, Michael D.
2007-01-01
The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches.
C-MOS array design techniques: SUMC multiprocessor system study
NASA Technical Reports Server (NTRS)
Clapp, W. A.; Helbig, W. A.; Merriam, A. S.
1972-01-01
The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four 1/0 processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through 1/0 processors, and multiport access to multiple and shared memory units.
Lithium niobate guided-wave beam former for steering phased-array antennas.
Armenise, M N; Passaro, V M; Noviello, G
1994-09-10
We present the theoretical investigation, design, and simulation of a novel guided-wave optical processor for L-band-transmission beam forming in a linear array of phased active antennas. The proposed configuration includes two contradirectional surface acoustic-wave transducers, and it is based on a Y-cut, X-propagating Ti:LiNbO(3) planar waveguide supporting the lowest-order modes of both polarizations (TE(0) and TM(0)) at the free-space wavelength λ = 0.85 µm. A detailed comparison between the processor we propose and other optical and electronic architectures reported in the literature is carried out, exhibiting a number of significant advantages in terms of weight, total chip size, and power consumption, when the number of antenna elements is greater than 50.
VLSI 'smart' I/O module development
NASA Astrophysics Data System (ADS)
Kirk, Dan
The developmental history, design, and operation of the MIL-STD-1553A/B discrete and serial module (DSM) for the U.S. Navy AN/AYK-14(V) avionics computer are described and illustrated with diagrams. The ongoing preplanned product improvement for the AN/AYK-14(V) includes five dual-redundant MIL-STD-1553 channels based on DSMs. The DSM is a front-end processor for transferring data to and from a common memory, sharing memory with a host processor to provide improved 'smart' input/output performance. Each DSM comprises three hardware sections: three VLSI-6000 semicustomized CMOS arrays, memory units to support the arrays, and buffers and resynchronization circuits. The DSM hardware module design, VLSI-6000 design tools, controlware and test software, and checkout procedures (using a hardware simulator) are characterized in detail.
An architecture for real-time vision processing
NASA Technical Reports Server (NTRS)
Chien, Chiun-Hong
1994-01-01
To study the feasibility of developing an architecture for real time vision processing, a task queue server and parallel algorithms for two vision operations were designed and implemented on an i860-based Mercury Computing System 860VS array processor. The proposed architecture treats each vision function as a task or set of tasks which may be recursively divided into subtasks and processed by multiple processors coordinated by a task queue server accessible by all processors. Each idle processor subsequently fetches a task and associated data from the task queue server for processing and posts the result to shared memory for later use. Load balancing can be carried out within the processing system without the requirement for a centralized controller. The author concludes that real time vision processing cannot be achieved without both sequential and parallel vision algorithms and a good parallel vision architecture.
A microcomputer based frequency-domain processor for laser Doppler anemometry
NASA Technical Reports Server (NTRS)
Horne, W. Clifton; Adair, Desmond
1988-01-01
A prototype multi-channel laser Doppler anemometry (LDA) processor was assembled using a wideband transient recorder and a microcomputer with an array processor for fast Fourier transform (FFT) computations. The prototype instrument was used to acquire, process, and record signals from a three-component wind tunnel LDA system subject to various conditions of noise and flow turbulence. The recorded data was used to evaluate the effectiveness of burst acceptance criteria, processing algorithms, and selection of processing parameters such as record length. The recorded signals were also used to obtain comparative estimates of signal-to-noise ratio between time-domain and frequency-domain signal detection schemes. These comparisons show that the FFT processing scheme allows accurate processing of signals for which the signal-to-noise ratio is 10 to 15 dB less than is practical using counter processors.
A unified approach to VLSI layout automation and algorithm mapping on processor arrays
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Pattabiraman, S.; Srinivasan, Vinoo N.
1993-01-01
Development of software tools for designing supercomputing systems is highly complex and cost ineffective. To tackle this a special purpose PAcube silicon compiler which integrates different design levels from cell to processor arrays has been proposed. As a part of this, we present in this paper a novel methodology which unifies the problems of Layout Automation and Algorithm Mapping.
Numerical aerodynamic simulation facility preliminary study, volume 2 and appendices
NASA Technical Reports Server (NTRS)
1977-01-01
Data to support results obtained in technology assessment studies are presented. Objectives, starting points, and future study tasks are outlined. Key design issues discussed in appendices include: data allocation, transposition network design, fault tolerance and trustworthiness, logic design, processing element of existing components, number of processors, the host system, alternate data base memory designs, number representation, fast div 521 instruction, architectures, and lockstep array versus synchronizable array machine comparison.
Toshiba TDF-500 High Resolution Viewing And Analysis System
NASA Astrophysics Data System (ADS)
Roberts, Barry; Kakegawa, M.; Nishikawa, M.; Oikawa, D.
1988-06-01
A high resolution, operator interactive, medical viewing and analysis system has been developed by Toshiba and Bio-Imaging Research. This system provides many advanced features including high resolution displays, a very large image memory and advanced image processing capability. In particular, the system provides CRT frame buffers capable of update in one frame period, an array processor capable of image processing at operator interactive speeds, and a memory system capable of updating multiple frame buffers at frame rates whilst supporting multiple array processors. The display system provides 1024 x 1536 display resolution at 40Hz frame and 80Hz field rates. In particular, the ability to provide whole or partial update of the screen at the scanning rate is a key feature. This allows multiple viewports or windows in the display buffer with both fixed and cine capability. To support image processing features such as windowing, pan, zoom, minification, filtering, ROI analysis, multiplanar and 3D reconstruction, a high performance CPU is integrated into the system. This CPU is an array processor capable of up to 400 million instructions per second. To support the multiple viewer and array processors' instantaneous high memory bandwidth requirement, an ultra fast memory system is used. This memory system has a bandwidth capability of 400MB/sec and a total capacity of 256MB. This bandwidth is more than adequate to support several high resolution CRT's and also the fast processing unit. This fully integrated approach allows effective real time image processing. The integrated design of viewing system, memory system and array processor are key to the imaging system. It is the intention to describe the architecture of the image system in this paper.
Frequency-multiplexed and pipelined iterative optical systolic array processors
NASA Technical Reports Server (NTRS)
Casasent, D.; Jackson, J.; Neuman, C.
1983-01-01
Optical matrix processors using acoustooptic transducers are described, with emphasis on new systolic array architectures using frequency multiplexing in addition to space and time multiplexing. A Kalman filtering application is considered in a case study from which the operations required on such a system can be defined. This also serves as a new and powerful application for iterative optical processors. The importance of pipelining the data flow and the ordering of the operations performed in a specific application of such a system are also noted. Several examples of how to effectively achieve this are included. A new technique for handling bipolar data on such architectures is also described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Supinski, B.; Caliga, D.
2017-09-28
The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array- ("FPGA") based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures that will be a costeffective solution for large-scale scientific computing.
Computations on the massively parallel processor at the Goddard Space Flight Center
NASA Technical Reports Server (NTRS)
Strong, James P.
1991-01-01
Described are four significant algorithms implemented on the massively parallel processor (MPP) at the Goddard Space Flight Center. Two are in the area of image analysis. Of the other two, one is a mathematical simulation experiment and the other deals with the efficient transfer of data between distantly separated processors in the MPP array. The first algorithm presented is the automatic determination of elevations from stereo pairs. The second algorithm solves mathematical logistic equations capable of producing both ordered and chaotic (or random) solutions. This work can potentially lead to the simulation of artificial life processes. The third algorithm is the automatic segmentation of images into reasonable regions based on some similarity criterion, while the fourth is an implementation of a bitonic sort of data which significantly overcomes the nearest neighbor interconnection constraints on the MPP for transferring data between distant processors.
Multichannel signal enhancement
Lewis, Paul S.
1990-01-01
A mixed adaptive filter is formulated for the signal processing problem where desired a priori signal information is not available. The formulation generates a least squares problem which enables the filter output to be calculated directly from an input data matrix. In one embodiment, a folded processor array enables bidirectional data flow to solve the recursive problem by back substitution without global communications. In another embodiment, a balanced processor array solves the recursive problem by forward elimination through the array. In a particular application to magnetoencephalography, the mixed adaptive filter enables an evoked response to an auditory stimulus to be identified from only a single trial.
NASA Astrophysics Data System (ADS)
Haron, Adib; Mahdzair, Fazren; Luqman, Anas; Osman, Nazmie; Junid, Syed Abdul Mutalib Al
2018-03-01
One of the most significant constraints of Von Neumann architecture is the limited bandwidth between memory and processor. The cost to move data back and forth between memory and processor is considerably higher than the computation in the processor itself. This architecture significantly impacts the Big Data and data-intensive application such as DNA analysis comparison which spend most of the processing time to move data. Recently, the in-memory processing concept was proposed, which is based on the capability to perform the logic operation on the physical memory structure using a crossbar topology and non-volatile resistive-switching memristor technology. This paper proposes a scheme to map digital equality comparator circuit on memristive memory crossbar array. The 2-bit, 4-bit, 8-bit, 16-bit, 32-bit, and 64-bit of equality comparator circuit are mapped on memristive memory crossbar array by using material implication logic in a sequential and parallel method. The simulation results show that, for the 64-bit word size, the parallel mapping exhibits 2.8× better performance in total execution time than sequential mapping but has a trade-off in terms of energy consumption and area utilization. Meanwhile, the total crossbar area can be reduced by 1.2× for sequential mapping and 1.5× for parallel mapping both by using the overlapping technique.
Prototype Focal-Plane-Array Optoelectronic Image Processor
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi; Shaw, Timothy; Yu, Jeffrey
1995-01-01
Prototype very-large-scale integrated (VLSI) planar array of optoelectronic processing elements combines speed of optical input and output with flexibility of reconfiguration (programmability) of electronic processing medium. Basic concept of processor described in "Optical-Input, Optical-Output Morphological Processor" (NPO-18174). Performs binary operations on binary (black and white) images. Each processing element corresponds to one picture element of image and located at that picture element. Includes input-plane photodetector in form of parasitic phototransistor part of processing circuit. Output of each processing circuit used to modulate one picture element in output-plane liquid-crystal display device. Intended to implement morphological processing algorithms that transform image into set of features suitable for high-level processing; e.g., recognition.
Technology Developments in Radiation-Hardened Electronics for Space Environments
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Howell, Joe T.
2008-01-01
The Radiation Hardened Electronics for Space Environments (RHESE) project consists of a series of tasks designed to develop and mature a broad spectrum of radiation hardened and low temperature electronics technologies. Three approaches are being taken to address radiation hardening: improved material hardness, design techniques to improve radiation tolerance, and software methods to improve radiation tolerance. Within these approaches various technology products are being addressed including Field Programmable Gate Arrays (FPGA), Field Programmable Analog Arrays (FPAA), MEMS, Serial Processors, Reconfigurable Processors, and Parallel Processors. In addition to radiation hardening, low temperature extremes are addressed with a focus on material and design approaches. System level applications for the RHESE technology products are discussed.
Analysis and simulation tools for solar array power systems
NASA Astrophysics Data System (ADS)
Pongratananukul, Nattorn
This dissertation presents simulation tools developed specifically for the design of solar array power systems. Contributions are made in several aspects of the system design phases, including solar source modeling, system simulation, and controller verification. A tool to automate the study of solar array configurations using general purpose circuit simulators has been developed based on the modeling of individual solar cells. Hierarchical structure of solar cell elements, including semiconductor properties, allows simulation of electrical properties as well as the evaluation of the impact of environmental conditions. A second developed tool provides a co-simulation platform with the capability to verify the performance of an actual digital controller implemented in programmable hardware such as a DSP processor, while the entire solar array including the DC-DC power converter is modeled in software algorithms running on a computer. This "virtual plant" allows developing and debugging code for the digital controller, and also to improve the control algorithm. One important task in solar arrays is to track the maximum power point on the array in order to maximize the power that can be delivered. Digital controllers implemented with programmable processors are particularly attractive for this task because sophisticated tracking algorithms can be implemented and revised when needed to optimize their performance. The proposed co-simulation tools are thus very valuable in developing and optimizing the control algorithm, before the system is built. Examples that demonstrate the effectiveness of the proposed methodologies are presented. The proposed simulation tools are also valuable in the design of multi-channel arrays. In the specific system that we have designed and tested, the control algorithm is implemented on a single digital signal processor. In each of the channels the maximum power point is tracked individually. In the prototype we built, off-the-shelf commercial DC-DC converters were utilized. At the end, the overall performance of the entire system was evaluated using solar array simulators capable of simulating various I-V characteristics, and also by using an electronic load. Experimental results are presented.
The MasPar MP-1 As a Computer Arithmetic Laboratory
Anuta, Michael A.; Lozier, Daniel W.; Turner, Peter R.
1996-01-01
This paper is a blueprint for the use of a massively parallel SIMD computer architecture for the simulation of various forms of computer arithmetic. The particular system used is a DEC/MasPar MP-1 with 4096 processors in a square array. This architecture has many advantages for such simulations due largely to the simplicity of the individual processors. Arithmetic operations can be spread across the processor array to simulate a hardware chip. Alternatively they may be performed on individual processors to allow simulation of a massively parallel implementation of the arithmetic. Compromises between these extremes permit speed-area tradeoffs to be examined. The paper includes a description of the architecture and its features. It then summarizes some of the arithmetic systems which have been, or are to be, implemented. The implementation of the level-index and symmetric level-index, LI and SLI, systems is described in some detail. An extensive bibliography is included. PMID:27805123
Microlens array processor with programmable weight mask and direct optical input
NASA Astrophysics Data System (ADS)
Schmid, Volker R.; Lueder, Ernst H.; Bader, Gerhard; Maier, Gert; Siegordner, Jochen
1999-03-01
We present an optical feature extraction system with a microlens array processor. The system is suitable for online implementation of a variety of transforms such as the Walsh transform and DCT. Operating with incoherent light, our processor accepts direct optical input. Employing a sandwich- like architecture, we obtain a very compact design of the optical system. The key elements of the microlens array processor are a square array of 15 X 15 spherical microlenses on acrylic substrate and a spatial light modulator as transmissive mask. The light distribution behind the mask is imaged onto the pixels of a customized a-Si image sensor with adjustable gain. We obtain one output sample for each microlens image and its corresponding weight mask area as summation of the transmitted intensity within one sensor pixel. The resulting architecture is very compact and robust like a conventional camera lens while incorporating a high degree of parallelism. We successfully demonstrate a Walsh transform into the spatial frequency domain as well as the implementation of a discrete cosine transform with digitized gray values. We provide results showing the transformation performance for both synthetic image patterns and images of natural texture samples. The extracted frequency features are suitable for neural classification of the input image. Other transforms and correlations can be implemented in real-time allowing adaptive optical signal processing.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-01-01
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor.
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-12-15
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation.
TOGA - A GNSS Reflections Instrument for Remote Sensing Using Beamforming
NASA Technical Reports Server (NTRS)
Esterhuizen, S.; Meehan, T. K.; Robison, D.
2009-01-01
Remotely sensing the Earth's surface using GNSS signals as bi-static radar sources is one of the most challenging applications for radiometric instrument design. As part of NASA's Instrument Incubator Program, our group at JPL has built a prototype instrument, TOGA (Time-shifted, Orthometric, GNSS Array), to address a variety of GNSS science needs. Observing GNSS reflections is major focus of the design/development effort. The TOGA design features a steerable beam antenna array which can form a high-gain antenna pattern in multiple directions simultaneously. Multiple FPGAs provide flexible digital signal processing logic to process both GPS and Galileo reflections. A Linux OS based science processor serves as experiment scheduler and data post-processor. This paper outlines the TOGA design approach as well as preliminary results of reflection data collected from test flights over the Pacific ocean. This reflections data demonstrates observation of the GPS L1/L2C/L5 signals.
Methodology for fast detection of false sharing in threaded scientific codes
Chung, I-Hsin; Cong, Guojing; Murata, Hiroki; Negishi, Yasushi; Wen, Hui-Fang
2014-11-25
A profiling tool identifies a code region with a false sharing potential. A static analysis tool classifies variables and arrays in the identified code region. A mapping detection library correlates memory access instructions in the identified code region with variables and arrays in the identified code region while a processor is running the identified code region. The mapping detection library identifies one or more instructions at risk, in the identified code region, which are subject to an analysis by a false sharing detection library. A false sharing detection library performs a run-time analysis of the one or more instructions at risk while the processor is re-running the identified code region. The false sharing detection library determines, based on the performed run-time analysis, whether two different portions of the cache memory line are accessed by the generated binary code.
Microcalorimeters with Germanium Thermistors for High Resolution Soft and Hard X-ray Astronomy
NASA Technical Reports Server (NTRS)
Silver, E.
2003-01-01
This is a progress report for the first year of a three year Space Research and Technology (SR&T) grant to continue the advancement of neutron transmutation doped (NTD-based) microcalorimeters. We have re-prioritized certain aspects of the statement of work and chose to emphasize issues of array development in the first year rather than wait until year two. Consequently, some of the projects scheduled for the first year were delayed to the second year. Here we report on our progress to: a) Build and test a 1 x 4 element array and to investigate electrical and thermal cross-talk; b) Build a multiplexed 4 channel analog pulse processor; c) Build a digital pulse processor that can accommodate 4 channels with independent triggers; d) Develop a proportional thermal baseline restoration system compatible with the constant voltage mode of microcalorimeter operation.
A High-Throughput Processor for Flight Control Research Using Small UAVs
NASA Technical Reports Server (NTRS)
Klenke, Robert H.; Sleeman, W. C., IV; Motter, Mark A.
2006-01-01
There are numerous autopilot systems that are commercially available for small (<100 lbs) UAVs. However, they all share several key disadvantages for conducting aerodynamic research, chief amongst which is the fact that most utilize older, slower, 8- or 16-bit microcontroller technologies. This paper describes the development and testing of a flight control system (FCS) for small UAV s based on a modern, high throughput, embedded processor. In addition, this FCS platform contains user-configurable hardware resources in the form of a Field Programmable Gate Array (FPGA) that can be used to implement custom, application-specific hardware. This hardware can be used to off-load routine tasks such as sensor data collection, from the FCS processor thereby further increasing the computational throughput of the system.
Experience in highly parallel processing using DAP
NASA Technical Reports Server (NTRS)
Parkinson, D.
1987-01-01
Distributed Array Processors (DAP) have been in day to day use for ten years and a large amount of user experience has been gained. The profile of user applications is similar to that of the Massively Parallel Processor (MPP) working group. Experience has shown that contrary to expectations, highly parallel systems provide excellent performance on so-called dirty problems such as the physics part of meteorological codes. The reasons for this observation are discussed. The arguments against replacing bit processors with floating point processors are also discussed.
Fault-Tolerant, Radiation-Hard DSP
NASA Technical Reports Server (NTRS)
Czajkowski, David
2011-01-01
Commercial digital signal processors (DSPs) for use in high-speed satellite computers are challenged by the damaging effects of space radiation, mainly single event upsets (SEUs) and single event functional interrupts (SEFIs). Innovations have been developed for mitigating the effects of SEUs and SEFIs, enabling the use of very-highspeed commercial DSPs with improved SEU tolerances. Time-triple modular redundancy (TTMR) is a method of applying traditional triple modular redundancy on a single processor, exploiting the VLIW (very long instruction word) class of parallel processors. TTMR improves SEU rates substantially. SEFIs are solved by a SEFI-hardened core circuit, external to the microprocessor. It monitors the health of the processor, and if a SEFI occurs, forces the processor to return to performance through a series of escalating events. TTMR and hardened-core solutions were developed for both DSPs and reconfigurable field-programmable gate arrays (FPGAs). This includes advancement of TTMR algorithms for DSPs and reconfigurable FPGAs, plus a rad-hard, hardened-core integrated circuit that services both the DSP and FPGA. Additionally, a combined DSP and FPGA board architecture was fully developed into a rad-hard engineering product. This technology enables use of commercial off-the-shelf (COTS) DSPs in computers for satellite and other space applications, allowing rapid deployment at a much lower cost. Traditional rad-hard space computers are very expensive and typically have long lead times. These computers are either based on traditional rad-hard processors, which have extremely low computational performance, or triple modular redundant (TMR) FPGA arrays, which suffer from power and complexity issues. Even more frustrating is that the TMR arrays of FPGAs require a fixed, external rad-hard voting element, thereby causing them to lose much of their reconfiguration capability and in some cases significant speed reduction. The benefits of COTS high-performance signal processing include significant increase in onboard science data processing, enabling orders of magnitude reduction in required communication bandwidth for science data return, orders of magnitude improvement in onboard mission planning and critical decision making, and the ability to rapidly respond to changing mission environments, thus enabling opportunistic science and orders of magnitude reduction in the cost of mission operations through reduction of required staff. Additional benefits of COTS-based, high-performance signal processing include the ability to leverage considerable commercial and academic investments in advanced computing tools, techniques, and infra structure, and the familiarity of the science and IT community with these computing environments.
Microlaser-based compact optical neuro-processors (Invited Paper)
NASA Astrophysics Data System (ADS)
Paek, Eung Gi; Chan, Winston K.; Zah, Chung-En; Cheung, Kwok-wai; Curtis, L.; Chang-Hasnain, Constance J.
1992-10-01
This paper reviews the recent progress in the development of holographic neural networks using surface-emitting laser diode arrays (SELDAs). Since the previous work on ultrafast holographic memory readout system and a robust incoherent correlator, progress has been made in several areas: the use of an array of monolithic `neurons' to reconstruct holographic memories; two-dimensional (2-D) wavelength-division multiplexing (WDM) for image transmission through a single-mode fiber; and finally, an associative memory using time- division multiplexing (TDM). Experimental demonstrations on these are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Learn, Mark Walter
Sandia National Laboratories is currently developing new processing and data communication architectures for use in future satellite payloads. These architectures will leverage the flexibility and performance of state-of-the-art static-random-access-memory-based Field Programmable Gate Arrays (FPGAs). One such FPGA is the radiation-hardened version of the Virtex-5 being developed by Xilinx. However, not all features of this FPGA are being radiation-hardened by design and could still be susceptible to on-orbit upsets. One such feature is the embedded hard-core PPC440 processor. Since this processor is implemented in the FPGA as a hard-core, traditional mitigation approaches such as Triple Modular Redundancy (TMR) are not availablemore » to improve the processor's on-orbit reliability. The goal of this work is to investigate techniques that can help mitigate the embedded hard-core PPC440 processor within the Virtex-5 FPGA other than TMR. Implementing various mitigation schemes reliably within the PPC440 offers a powerful reconfigurable computing resource to these node-based processing architectures. This document summarizes the work done on the cache mitigation scheme for the embedded hard-core PPC440 processor within the Virtex-5 FPGAs, and describes in detail the design of the cache mitigation scheme and the testing conducted at the radiation effects facility on the Texas A&M campus.« less
The Microcode for the Control Processor of the ARO (Array Oriented Processor) Array Processor.
1983-08-01
oiNi .TADDR=DBASE+MODE" 4CONT ŕWAfT’ FOR MEM, MORE", MOV) ,DRO BSX "S IGN EXT, MORE" SADD D FLDSEI,(6,3),IMN TADT)R=5+ 1 JMP I NDE-’XEI) "JU> IP ’ T1...JDTV1: YIP DIVI; TDIV2: Y,’ IP DIV2; JASHII: JMP ASHI; 4 JASH2: JMP AS112; JXOR1: YIP XDRI; JXOR2: YIP XOR2; JSOB: JMP SOB; JBPL: JMP BPL; JBMI: YIP BMI;0...JBHI: JMP BHill JBLOS: J! IP BLOS; JBVC: YIP BVC; JBWS: JMP BVS; JBCC: JMP BCC; JBCS: YIP BCS; JEMT: YIP EMT; JTRAP: YIP TRAPQ; JCLR6: YIP CLR6; JCOII
Digital Beamforming Scatterometer
NASA Technical Reports Server (NTRS)
Rincon, Rafael F.; Vega, Manuel; Kman, Luko; Buenfil, Manuel; Geist, Alessandro; Hillard, Larry; Racette, Paul
2009-01-01
This paper discusses scatterometer measurements collected with multi-mode Digital Beamforming Synthetic Aperture Radar (DBSAR) during the SMAP-VEX 2008 campaign. The 2008 SMAP Validation Experiment was conducted to address a number of specific questions related to the soil moisture retrieval algorithms. SMAP-VEX 2008 consisted on a series of aircraft-based.flights conducted on the Eastern Shore of Maryland and Delaware in the fall of 2008. Several other instruments participated in the campaign including the Passive Active L-Band System (PALS), the Marshall Airborne Polarimetric Imaging Radiometer (MAPIR), and the Global Positioning System Reflectometer (GPSR). This campaign was the first SMAP Validation Experiment. DBSAR is a multimode radar system developed at NASA/Goddard Space Flight Center that combines state-of-the-art radar technologies, on-board processing, and advances in signal processing techniques in order to enable new remote sensing capabilities applicable to Earth science and planetary applications [l]. The instrument can be configured to operate in scatterometer, Synthetic Aperture Radar (SAR), or altimeter mode. The system builds upon the L-band Imaging Scatterometer (LIS) developed as part of the RadSTAR program. The radar is a phased array system designed to fly on the NASA P3 aircraft. The instrument consists of a programmable waveform generator, eight transmit/receive (T/R) channels, a microstrip antenna, and a reconfigurable data acquisition and processor system. Each transmit channel incorporates a digital attenuator, and digital phase shifter that enables amplitude and phase modulation on transmit. The attenuators, phase shifters, and calibration switches are digitally controlled by the radar control card (RCC) on a pulse by pulse basis. The antenna is a corporate fed microstrip patch-array centered at 1.26 GHz with a 20 MHz bandwidth. Although only one feed is used with the present configuration, a provision was made for separate corporate feeds for vertical and horizontal polarization. System upgrades to dual polarization are currently under way. The DBSAR processor is a reconfigurable data acquisition and processor system capable of real-time, high-speed data processing. DBSAR uses an FPGA-based architecture to implement digitally down-conversion, in-phase and quadrature (I/Q) demodulation, and subsequent radar specific algorithms. The core of the processor board consists of an analog-to-digital (AID) section, three Altera Stratix field programmable gate arrays (FPGAs), an ARM microcontroller, several memory devices, and an Ethernet interface. The processor also interfaces with a navigation board consisting of a GPS and a MEMS gyro. The processor has been configured to operate in scatterometer, Synthetic Aperture Radar (SAR), and altimeter modes. All the modes are based on digital beamforming which is a digital process that generates the far-field beam patterns at various scan angles from voltages sampled in the antenna array. This technique allows steering the received beam and controlling its beam-width and side-lobe. Several beamforming techniques can be implemented each characterized by unique strengths and weaknesses, and each applicable to different measurement scenarios. In Scatterometer mode, the radar is capable to.generate a wide beam or scan a narrow beam on transmit, and to steer the received beam on processing while controlling its beamwidth and side-lobe level. Table I lists some important radar characteristics
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1994-01-01
In a computer having a large number of single-instruction multiple data (SIMD) processors, each of the SIMD processors has two sets of three individual processor elements controlled by a master control unit and interconnected among a plurality of register file units where data is stored. The register files input and output data in synchronism with a minor cycle clock under control of two slave control units controlling the register file units connected to respective ones of the two sets of processor elements. Depending upon which ones of the register file units are enabled to store or transmit data during a particular minor clock cycle, the processor elements within an SIMD processor are connected in rings or in pipeline arrays, and may exchange data with the internal bus or with neighboring SIMD processors through interface units controlled by respective ones of the two slave control units.
2010-09-01
53 Figure 26. Image of the phased array antenna...................................................................54...69 Figure 38. Computation of correction angle from array factor and sum/difference beams...71 Figure 39. Front panel of the tracking algorithm
Solution for the nonuniformity correction of infrared focal plane arrays.
Zhou, Huixin; Liu, Shangqian; Lai, Rui; Wang, Dabao; Cheng, Yubao
2005-05-20
Based on the S-curve model of the detector response of infrared focal plan arrays (IRFPAs), an improved two-point correction algorithm is presented. The algorithm first transforms the nonlinear image data into linear data and then uses the normal two-point algorithm to correct the linear data. The algorithm can effectively overcome the influence of nonlinearity of the detector's response, and it enlarges the correction precision and the dynamic range of the response. A real-time imaging-signal-processing system for IRFPAs that is based on a digital signal processor and field-programmable gate arrays is also presented. The nonuniformity correction capability of the presented solution is validated by experimental imaging procedures of a 128 x 128 pixel IRFPA camera prototype.
Optoelectronic switch matrix as a look-up table for residue arithmetic.
Macdonald, R I
1987-10-01
The use of optoelectronic matrix switches to perform look-up table functions in residue arithmetic processors is proposed. In this application, switchable detector arrays give the advantage of a greatly reduced requirement for optical sources by comparison with previous optoelectronic residue processors.
Microcomputer array processor system. [design for electronic warfare
NASA Technical Reports Server (NTRS)
Slezak, K. D.
1980-01-01
The microcomputer array system is discussed with specific attention given to its electronic warware applications. Several aspects of the system architecture are described as well as some of its distinctive characteristics.
Zhang, Zhen; Ma, Cheng; Zhu, Rong
2017-08-23
Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas.
Zhang, Zhen; Zhu, Rong
2017-01-01
Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and achieved amazing success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementation of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have drawn a number of applications from scientists. In this paper, we propose a FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input, multi-output, (MIMO) real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and can be applied in many areas. PMID:28832522
2007-12-11
Implemented both carrier and code phase tracking loop for performance evaluation of a minimum power beam forming algorithm and null steering algorithm...4 Antennal Antenna2 Antenna K RF RF RF ct, Ct~2 ChKx1 X2 ....... Xk A W ~ ~ =Z, x W ,=1 Fig. 5. Schematics of a K-element antenna array spatial...adaptive processor Antennal Antenna K A N-i V/ ( Vil= .i= VK Fig. 6. Schematics of a K-element antenna array space-time adaptive processor Two additional
A Versatile Multichannel Digital Signal Processing Module for Microcalorimeter Arrays
NASA Astrophysics Data System (ADS)
Tan, H.; Collins, J. W.; Walby, M.; Hennig, W.; Warburton, W. K.; Grudberg, P.
2012-06-01
Different techniques have been developed for reading out microcalorimeter sensor arrays: individual outputs for small arrays, and time-division or frequency-division or code-division multiplexing for large arrays. Typically, raw waveform data are first read out from the arrays using one of these techniques and then stored on computer hard drives for offline optimum filtering, leading not only to requirements for large storage space but also limitations on achievable count rate. Thus, a read-out module that is capable of processing microcalorimeter signals in real time will be highly desirable. We have developed multichannel digital signal processing electronics that are capable of on-board, real time processing of microcalorimeter sensor signals from multiplexed or individual pixel arrays. It is a 3U PXI module consisting of a standardized core processor board and a set of daughter boards. Each daughter board is designed to interface a specific type of microcalorimeter array to the core processor. The combination of the standardized core plus this set of easily designed and modified daughter boards results in a versatile data acquisition module that not only can easily expand to future detector systems, but is also low cost. In this paper, we first present the core processor/daughter board architecture, and then report the performance of an 8-channel daughter board, which digitizes individual pixel outputs at 1 MSPS with 16-bit precision. We will also introduce a time-division multiplexing type daughter board, which takes in time-division multiplexing signals through fiber-optic cables and then processes the digital signals to generate energy spectra in real time.
Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming
2011-11-01
Laser speckle imaging (LSI) is a noninvasive and full-field optical imaging technique which produces two-dimensional blood flow maps of tissues from the raw laser speckle images captured by a CCD camera without scanning. We present a hardware-friendly algorithm for the real-time processing of laser speckle imaging. The algorithm is developed and optimized specifically for LSI processing in the field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in FPGA. The pipeline processing scheme and parallel computing architecture are introduced into the design of this LSI hardware processor. When the LSI hardware processor is implemented in the FPGA running at the maximum frequency of 130 MHz, up to 85 raw images with the resolution of 640×480 pixels can be processed per second. Meanwhile, we also present a system on chip (SOC) solution for LSI processing by integrating the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution also can be used to produce an application specific integrated circuit for LSI processing.
Detection and imaging of moving objects with SAR by a joint space-time-frequency processing
NASA Astrophysics Data System (ADS)
Barbarossa, Sergio; Farina, Alfonso
This paper proposes a joint spacetime-frequency processing scheme for the detection and imaging of moving targets by Synthetic Aperture Radars (SAR). The method is based on the availability of an array antenna. The signals received by the array elements are combined, in a spacetime processor, to cancel the clutter. Then, they are analyzed in the time-frequency domain, by computing their Wigner-Ville Distribution (WVD), in order to estimate the instantaneous frequency, to be used for the successive phase compensation, necessary to produce a high resolution image.
System and method for high power diode based additive manufacturing
El-Dasher, Bassem S.; Bayramian, Andrew; Demuth, James A.; Farmer, Joseph C.; Torres, Sharon G.
2018-01-02
A system is disclosed for performing an Additive Manufacturing (AM) fabrication process on a powdered material forming a substrate. The system may make use of a diode array for generating an optical signal sufficient to melt a powdered material of the substrate. A mask may be used for preventing a first predetermined portion of the optical signal from reaching the substrate, while allowing a second predetermined portion to reach the substrate. At least one processor may be used for controlling an output of the diode array.
System and method for high power diode based additive manufacturing
El-Dasher, Bassem S.; Bayramian, Andrew; Demuth, James A.; Farmer, Joseph C.; Torres, Sharon G.
2016-04-12
A system is disclosed for performing an Additive Manufacturing (AM) fabrication process on a powdered material forming a substrate. The system may make use of a diode array for generating an optical signal sufficient to melt a powdered material of the substrate. A mask may be used for preventing a first predetermined portion of the optical signal from reaching the substrate, while allowing a second predetermined portion to reach the substrate. At least one processor may be used for controlling an output of the diode array.
FPGA wavelet processor design using language for instruction-set architectures (LISA)
NASA Astrophysics Data System (ADS)
Meyer-Bäse, Uwe; Vera, Alonzo; Rao, Suhasini; Lenk, Karl; Pattichis, Marios
2007-04-01
The design of an microprocessor is a long, tedious, and error-prone task consisting of typically three design phases: architecture exploration, software design (assembler, linker, loader, profiler), architecture implementation (RTL generation for FPGA or cell-based ASIC) and verification. The Language for instruction-set architectures (LISA) allows to model a microprocessor not only from instruction-set but also from architecture description including pipelining behavior that allows a design and development tool consistency over all levels of the design. To explore the capability of the LISA processor design platform a.k.a. CoWare Processor Designer we present in this paper three microprocessor designs that implement a 8/8 wavelet transform processor that is typically used in today's FBI fingerprint compression scheme. We have designed a 3 stage pipelined 16 bit RISC processor (NanoBlaze). Although RISC μPs are usually considered "fast" processors due to design concept like constant instruction word size, deep pipelines and many general purpose registers, it turns out that DSP operations consume essential processing time in a RISC processor. In a second step we have used design principles from programmable digital signal processor (PDSP) to improve the throughput of the DWT processor. A multiply-accumulate operation along with indirect addressing operation were the key to achieve higher throughput. A further improvement is possible with today's FPGA technology. Today's FPGAs offer a large number of embedded array multipliers and it is now feasible to design a "true" vector processor (TVP). A multiplication of two vectors can be done in just one clock cycle with our TVP, a complete scalar product in two clock cycles. Code profiling and Xilinx FPGA ISE synthesis results are provided that demonstrate the essential improvement that a TVP has compared with traditional RISC or PDSP designs.
Juswardy, Budi; Xiao, Feng; Alameh, Kamal
2009-03-16
This paper proposes a novel Opto-VLSI-based tunable true-time delay generation unit for adaptively steering the nulls of microwave phased array antennas. Arbitrary single or multiple true-time delays can simultaneously be synthesized for each antenna element by slicing an RF-modulated broadband optical source and routing specific sliced wavebands through an Opto-VLSI processor to a high-dispersion fiber. Experimental results are presented, which demonstrate the principle of the true-time delay unit through the generation of 5 arbitrary true-time delays of up to 2.5 ns each. (c) 2009 Optical Society of America
Multi-element germanium detectors for synchrotron applications
NASA Astrophysics Data System (ADS)
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; Vernon, E.; Pinelli, D.; Dooryhee, E.; Ghose, S.; Caswell, T.; Siddons, D. P.; Miceli, A.; Baldwin, J.; Almer, J.; Okasinski, J.; Quaranta, O.; Woods, R.; Krings, T.; Stock, S.
2018-04-01
We have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. We will discuss the technical details of the systems, and present some of the results from them.
Optical backplane interconnect switch for data processors and computers
NASA Technical Reports Server (NTRS)
Hendricks, Herbert D.; Benz, Harry F.; Hammer, Jacob M.
1989-01-01
An optoelectronic integrated device design is reported which can be used to implement an all-optical backplane interconnect switch. The switch is sized to accommodate an array of processors and memories suitable for direct replacement into the basic avionic multiprocessor backplane. The optical backplane interconnect switch is also suitable for direct replacement of the PI bus traffic switch and at the same time, suitable for supporting pipelining of the processor and memory. The 32 bidirectional switchable interconnects are configured with broadcast capability for controls, reconfiguration, and messages. The approach described here can handle a serial interconnection of data processors or a line-to-link interconnection of data processors. An optical fiber demonstration of this approach is presented.
Design of a MIMD neural network processor
NASA Astrophysics Data System (ADS)
Saeks, Richard E.; Priddy, Kevin L.; Pap, Robert M.; Stowell, S.
1994-03-01
The Accurate Automation Corporation (AAC) neural network processor (NNP) module is a fully programmable multiple instruction multiple data (MIMD) parallel processor optimized for the implementation of neural networks. The AAC NNP design fully exploits the intrinsic sparseness of neural network topologies. Moreover, by using a MIMD parallel processing architecture one can update multiple neurons in parallel with efficiency approaching 100 percent as the size of the network increases. Each AAC NNP module has 8 K neurons and 32 K interconnections and is capable of 140,000,000 connections per second with an eight processor array capable of over one billion connections per second.
Fuzzy logic particle tracking velocimetry
NASA Technical Reports Server (NTRS)
Wernet, Mark P.
1993-01-01
Fuzzy logic has proven to be a simple and robust method for process control. Instead of requiring a complex model of the system, a user defined rule base is used to control the process. In this paper the principles of fuzzy logic control are applied to Particle Tracking Velocimetry (PTV). Two frames of digitally recorded, single exposure particle imagery are used as input. The fuzzy processor uses the local particle displacement information to determine the correct particle tracks. Fuzzy PTV is an improvement over traditional PTV techniques which typically require a sequence (greater than 2) of image frames for accurately tracking particles. The fuzzy processor executes in software on a PC without the use of specialized array or fuzzy logic processors. A pair of sample input images with roughly 300 particle images each, results in more than 200 velocity vectors in under 8 seconds of processing time.
Comparing an FPGA to a Cell for an Image Processing Application
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.
2010-12-01
Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to discover the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, Iris Recognition Systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.
NASA Technical Reports Server (NTRS)
Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas
2008-01-01
A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor- memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system is comprised of a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.
A generic FPGA-based detector readout and real-time image processing board
NASA Astrophysics Data System (ADS)
Sarpotdar, Mayuresh; Mathew, Joice; Safonova, Margarita; Murthy, Jayant
2016-07-01
For space-based astronomical observations, it is important to have a mechanism to capture the digital output from the standard detector for further on-board analysis and storage. We have developed a generic (application- wise) field-programmable gate array (FPGA) board to interface with an image sensor, a method to generate the clocks required to read the image data from the sensor, and a real-time image processor system (on-chip) which can be used for various image processing tasks. The FPGA board is applied as the image processor board in the Lunar Ultraviolet Cosmic Imager (LUCI) and a star sensor (StarSense) - instruments developed by our group. In this paper, we discuss the various design considerations for this board and its applications in the future balloon and possible space flights.
Parallel asynchronous systems and image processing algorithms
NASA Technical Reports Server (NTRS)
Coon, D. D.; Perera, A. G. U.
1989-01-01
A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to evey pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.
Fast, Massively Parallel Data Processors
NASA Technical Reports Server (NTRS)
Heaton, Robert A.; Blevins, Donald W.; Davis, ED
1994-01-01
Proposed fast, massively parallel data processor contains 8x16 array of processing elements with efficient interconnection scheme and options for flexible local control. Processing elements communicate with each other on "X" interconnection grid with external memory via high-capacity input/output bus. This approach to conditional operation nearly doubles speed of various arithmetic operations.
Fast particles identification in programmable form at level-0 trigger by means of the 3D-Flow system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crosetto, Dario B.
1998-10-30
The 3D-Flow Processor system is a new, technology-independent concept in very fast, real-time system architectures. Based on either an FPGA or an ASIC implementation, it can address, in a fully programmable manner, applications where commercially available processors would fail because of throughput requirements. Possible applications include filtering-algorithms (pattern recognition) from the input of multiple sensors, as well as moving any input validated by these filtering-algorithms to a single output channel. Both operations can easily be implemented on a 3D-Flow system to achieve a real-time processing system with a very short lag time. This system can be built either with off-the-shelfmore » FPGAs or, for higher data rates, with CMOS chips containing 4 to 16 processors each. The basic building block of the system, a 3D-Flow processor, has been successfully designed in VHDL code written in ''Generic HDL'' (mostly made of reusable blocks that are synthesizable in different technologies, or FPGAs), to produce a netlist for a four-processor ASIC featuring 0.35 micron CBA (Ceil Base Array) technology at 3.3 Volts, 884 mW power dissipation at 60 MHz and 63.75 mm sq. die size. The same VHDL code has been targeted to three FPGA manufacturers (Altera EPF10K250A, ORCA-Lucent Technologies 0R3T165 and Xilinx XCV1000). A complete set of software tools, the 3D-Flow System Manager, equally applicable to ASIC or FPGA implementations, has been produced to provide full system simulation, application development, real-time monitoring, and run-time fault recovery. Today's technology can accommodate 16 processors per chip in a medium size die, at a cost per processor of less than $5 based on the current silicon die/size technology cost.« less
An optical/digital processor - Hardware and applications
NASA Technical Reports Server (NTRS)
Casasent, D.; Sterling, W. M.
1975-01-01
A real-time two-dimensional hybrid processor consisting of a coherent optical system, an optical/digital interface, and a PDP-11/15 control minicomputer is described. The input electrical-to-optical transducer is an electron-beam addressed potassium dideuterium phosphate (KD2PO4) light valve. The requirements and hardware for the output optical-to-digital interface, which is constructed from modular computer building blocks, are presented. Initial experimental results demonstrating the operation of this hybrid processor in phased-array radar data processing, synthetic-aperture image correlation, and text correlation are included. The applications chosen emphasize the role of the interface in the analysis of data from an optical processor and possible extensions to the digital feedback control of an optical processor.
Periodic Application of Concurrent Error Detection in Processor Array Architectures. PhD. Thesis -
NASA Technical Reports Server (NTRS)
Chen, Paul Peichuan
1993-01-01
Processor arrays can provide an attractive architecture for some applications. Featuring modularity, regular interconnection and high parallelism, such arrays are well-suited for VLSI/WSI implementations, and applications with high computational requirements, such as real-time signal processing. Preserving the integrity of results can be of paramount importance for certain applications. In these cases, fault tolerance should be used to ensure reliable delivery of a system's service. One aspect of fault tolerance is the detection of errors caused by faults. Concurrent error detection (CED) techniques offer the advantage that transient and intermittent faults may be detected with greater probability than with off-line diagnostic tests. Applying time-redundant CED techniques can reduce hardware redundancy costs. However, most time-redundant CED techniques degrade a system's performance.
Optical signal processing of spatially distributed sensor data in smart structures
NASA Technical Reports Server (NTRS)
Bennett, K. D.; Claus, R. O.; Murphy, K. A.; Goette, A. M.
1989-01-01
Smart structures which contain dense two- or three-dimensional arrays of attached or embedded sensor elements inherently require signal multiplexing and processing capabilities to permit good spatial data resolution as well as the adequately short calculation times demanded by real time active feedback actuator drive circuitry. This paper reports the implementation of an in-line optical signal processor and its application in a structural sensing system which incorporates multiple discrete optical fiber sensor elements. The signal processor consists of an array of optical fiber couplers having tailored s-parameters and arranged to allow gray code amplitude scaling of sensor inputs. The use of this signal processor in systems designed to indicate the location of distributed strain and damage in composite materials, as well as to quantitatively characterize that damage, is described. Extension of similar signal processing methods to more complicated smart materials and structures applications are discussed.
Design of infrasound-detection system via adaptive LMSTDE algorithm
NASA Technical Reports Server (NTRS)
Khalaf, C. S.; Stoughton, J. W.
1984-01-01
A proposed solution to an aviation safety problem is based on passive detection of turbulent weather phenomena through their infrasonic emission. This thesis describes a system design that is adequate for detection and bearing evaluation of infrasounds. An array of four sensors, with the appropriate hardware, is used for the detection part. Bearing evaluation is based on estimates of time delays between sensor outputs. The generalized cross correlation (GCC), as the conventional time-delay estimation (TDE) method, is first reviewed. An adaptive TDE approach, using the least mean square (LMS) algorithm, is then discussed. A comparison between the two techniques is made and the advantages of the adaptive approach are listed. The behavior of the GCC, as a Roth processor, is examined for the anticipated signals. It is shown that the Roth processor has the desired effect of sharpening the peak of the correlation function. It is also shown that the LMSTDE technique is an equivalent implementation of the Roth processor in the time domain. A LMSTDE lead-lag model, with a variable stability coefficient and a convergence criterion, is designed.
NbN A/D Conversion of IR Focal Plane Sensor Signal at 10 K
NASA Technical Reports Server (NTRS)
Eaton, L.; Durand, D.; Sandell, R.; Spargo, J.; Krabach, T.
1994-01-01
We are implementing a 12 bit SFQ counting ADC with parallel-to-serial readout using our established 10 K NbN capability. This circuit provides a key element of the analog signal processor (ASP) used in large infrared focal plane arrays. The circuit processes the signal data stream from a Si:As BIB detector array. A 10 mega samples per second (MSPS) pixel data stream flows from the chip at a 120 megabit bit rate in a format that is compatible with other superconductive time dependent processor (TDP) circuits being developed. We will discuss our planned ASP demonstration, the circuit design, and test results.
20-GFLOPS QR processor on a Xilinx Virtex-E FPGA
NASA Astrophysics Data System (ADS)
Walke, Richard L.; Smith, Robert W. M.; Lightbody, Gaye
2000-11-01
Adaptive beamforming can play an important role in sensor array systems in countering directional interference. In high-sample rate systems, such as radar and comms, the calculation of adaptive weights is a very computational task that requires highly parallel solutions. For systems where low power consumption and volume are important the only viable implementation is as an Application Specific Integrated Circuit (ASIC). However, the rapid advancement of Field Programmable Gate Array (FPGA) technology is enabling highly credible re-programmable solutions. In this paper we present the implementation of a scalable linear array processor for weight calculation using QR decomposition. We employ floating-point arithmetic with mantissa size optimized to the target application to minimize component size, and implement them as relationally placed macros (RPMs) on Xilinx Virtex FPGAs to achieve predictable dense layout and high-speed operation. We present results that show that 20GFLOPS of sustained computation on a single XCV3200E-8 Virtex-E FPGA is possible. We also describe the parameterized implementation of the floating-point operators and QR-processor, and the design methodology that enables us to rapidly generate complex FPGA implementations using the industry standard hardware description language VHDL.
Multiple Embedded Processors for Fault-Tolerant Computing
NASA Technical Reports Server (NTRS)
Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy
2005-01-01
A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
Acoustooptic linear algebra processors - Architectures, algorithms, and applications
NASA Technical Reports Server (NTRS)
Casasent, D.
1984-01-01
Architectures, algorithms, and applications for systolic processors are described with attention to the realization of parallel algorithms on various optical systolic array processors. Systolic processors for matrices with special structure and matrices of general structure, and the realization of matrix-vector, matrix-matrix, and triple-matrix products and such architectures are described. Parallel algorithms for direct and indirect solutions to systems of linear algebraic equations and their implementation on optical systolic processors are detailed with attention to the pipelining and flow of data and operations. Parallel algorithms and their optical realization for LU and QR matrix decomposition are specifically detailed. These represent the fundamental operations necessary in the implementation of least squares, eigenvalue, and SVD solutions. Specific applications (e.g., the solution of partial differential equations, adaptive noise cancellation, and optimal control) are described to typify the use of matrix processors in modern advanced signal processing.
Fault detection and bypass in a sequence information signal processor
NASA Technical Reports Server (NTRS)
Peterson, John C. (Inventor); Chow, Edward T. (Inventor)
1992-01-01
The invention comprises a plurality of scan registers, each such register respectively associated with a processor element; an on-chip comparator, encoder and fault bypass register. Each scan register generates a unitary signal the logic state of which depends on the correctness of the input from the previous processor in the systolic array. These unitary signals are input to a common comparator which generates an output indicating whether or not an error has occurred. These unitary signals are also input to an encoder which identifies the location of any fault detected so that an appropriate multiplexer can be switched to bypass the faulty processor element. Input scan data can be readily programmed to fully exercise all of the processor elements so that no fault can remain undetected.
Benchmarking GNU Radio Kernels and Multi-Processor Scheduling
2013-01-14
AMD E350 APU , comparable to Atom • ARM Cortex A8 running on a Gumstix Overo on an Ettus USRP E110 The general testing procedure consists of • Build...Intel Atom, and the AMD E350 APU . 3.2 Multi-Processor Scheduling Figure 1: GFLOPs per second through an FFT array on an Intel i7. Example output from
Techniques for the rapid display and manipulation of 3-D biomedical data.
Goldwasser, S M; Reynolds, R A; Talton, D A; Walsh, E S
1988-01-01
The use of fully interactive 3-D workstations with true real-time performance will become increasingly common as technology matures and economical commercial systems become available. This paper provides a comprehensive introduction to high speed approaches to the display and manipulation of 3-D medical objects obtained from tomographic data acquisition systems such as CT, MR, and PET. A variety of techniques are outlined including the use of software on conventional minicomputers, hardware assist devices such as array processors and programmable frame buffers, and special purpose computer architecture for dedicated high performance systems. While both algorithms and architectures are addressed, the major theme centers around the utilization of hardware-based approaches including parallel processors for the implementation of true real-time systems.
Integrated 3-D vision system for autonomous vehicles
NASA Astrophysics Data System (ADS)
Hou, Kun M.; Shawky, Mohamed; Tu, Xiaowei
1992-03-01
Nowadays, autonomous vehicles have become a multidiscipline field. Its evolution is taking advantage of the recent technological progress in computer architectures. As the development tools became more sophisticated, the trend is being more specialized, or even dedicated architectures. In this paper, we will focus our interest on a parallel vision subsystem integrated in the overall system architecture. The system modules work in parallel, communicating through a hierarchical blackboard, an extension of the 'tuple space' from LINDA concepts, where they may exchange data or synchronization messages. The general purpose processing elements are of different skills, built around 40 MHz i860 Intel RISC processors for high level processing and pipelined systolic array processors based on PLAs or FPGAs for low-level processing.
Processor and method for developing a set of admissible fixture designs for a workpiece
Brost, R.C.; Goldberg, K.Y.; Wallack, A.S.; Canny, J.
1996-08-13
A fixture process and method is provided for developing a complete set of all admissible fixture designs for a workpiece which prevents the workpiece from translating or rotating. The fixture processor generates the set of all admissible designs based on geometric access constraints and expected applied forces on the workpiece. For instance, the fixture processor may generate a set of admissible fixture designs for first, second and third locators placed in an array of holes on a fixture plate and a translating clamp attached to the fixture plate for contacting the workpiece. In another instance, a fixture vice is used in which first, second, third and fourth locators are used and first and second fixture jaws are tightened to secure the workpiece. The fixture process also ranks the set of admissible fixture designs according to a predetermined quality metric so that the optimal fixture design for the desired purpose may be identified from the set of all admissible fixture designs. 27 figs.
Processor and method for developing a set of admissible fixture designs for a workpiece
Brost, Randolph C.; Goldberg, Kenneth Y.; Canny, John; Wallack, Aaron S.
1999-01-01
Methods and apparatus are provided for developing a complete set of all admissible Type I and Type II fixture designs for a workpiece. The fixture processor generates the set of all admissible designs based on geometric access constraints and expected applied forces on the workpiece. For instance, the fixture processor may generate a set of admissible fixture designs for first, second and third locators placed in an array of holes on a fixture plate and a translating clamp attached to the fixture plate for contacting the workpiece. In another instance, a fixture vise is used in which first, second, third and fourth locators are used and first and second fixture jaws are tightened to secure the workpiece. The fixture process also ranks the set of admissible fixture designs according to a predetermined quality metric so that the optimal fixture design for the desired purpose may be identified from the set of all admissible fixture designs.
Processor and method for developing a set of admissible fixture designs for a workpiece
Brost, Randolph C.; Goldberg, Kenneth Y.; Wallack, Aaron S.; Canny, John
1996-01-01
A fixture process and method is provided for developing a complete set of all admissible fixture designs for a workpiece which prevents the workpiece from translating or rotating. The fixture processor generates the set of all admissible designs based on geometric access constraints and expected applied forces on the workpiece. For instance, the fixture processor may generate a set of admissible fixture designs for first, second and third locators placed in an array of holes on a fixture plate and a translating clamp attached to the fixture plate for contacting the workpiece. In another instance, a fixture vice is used in which first, second, third and fourth locators are used and first and second fixture jaws are tightened to secure the workpiece. The fixture process also ranks the set of admissible fixture designs according to a predetermined quality metric so that the optimal fixture design for the desired purpose may be identified from the set of all admissible fixture designs.
Processor and method for developing a set of admissible fixture designs for a workpiece
Brost, R.C.; Goldberg, K.Y.; Canny, J.; Wallack, A.S.
1999-01-05
Methods and apparatus are provided for developing a complete set of all admissible Type 1 and Type 2 fixture designs for a workpiece. The fixture processor generates the set of all admissible designs based on geometric access constraints and expected applied forces on the workpiece. For instance, the fixture processor may generate a set of admissible fixture designs for first, second and third locators placed in an array of holes on a fixture plate and a translating clamp attached to the fixture plate for contacting the workpiece. In another instance, a fixture vise is used in which first, second, third and fourth locators are used and first and second fixture jaws are tightened to secure the workpiece. The fixture process also ranks the set of admissible fixture designs according to a predetermined quality metric so that the optimal fixture design for the desired purpose may be identified from the set of all admissible fixture designs. 44 figs.
Simulation of an array-based neural net model
NASA Technical Reports Server (NTRS)
Barnden, John A.
1987-01-01
Research in cognitive science suggests that much of cognition involves the rapid manipulation of complex data structures. However, it is very unclear how this could be realized in neural networks or connectionist systems. A core question is: how could the interconnectivity of items in an abstract-level data structure be neurally encoded? The answer appeals mainly to positional relationships between activity patterns within neural arrays, rather than directly to neural connections in the traditional way. The new method was initially devised to account for abstract symbolic data structures, but it also supports cognitively useful spatial analogue, image-like representations. As the neural model is based on massive, uniform, parallel computations over 2D arrays, the massively parallel processor is a convenient tool for simulation work, although there are complications in using the machine to the fullest advantage. An MPP Pascal simulation program for a small pilot version of the model is running.
NASA Astrophysics Data System (ADS)
Weber, Walter H.; Mair, H. Douglas; Jansen, Dion
2003-03-01
A suite of basic signal processors has been developed. These basic building blocks can be cascaded together to form more complex processors without the need for programming. The data structures between each of the processors are handled automatically. This allows a processor built for one purpose to be applied to any type of data such as images, waveform arrays and single values. The processors are part of Winspect Data Acquisition software. The new processors are fast enough to work on A-scan signals live while scanning. Their primary use is to extract features, reduce noise or to calculate material properties. The cascaded processors work equally well on live A-scan displays, live gated data or as a post-processing engine on saved data. Researchers are able to call their own MATLAB or C-code from anywhere within the processor structure. A built-in formula node processor that uses a simple algebraic editor may make external user programs unnecessary. This paper also discusses the problems associated with ad hoc software development and how graphical programming languages can tie up researchers writing software rather than designing experiments.
Multi-element germanium detectors for synchrotron applications
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.; ...
2018-04-27
In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems,more » and present some of the results from them.« less
Multi-element germanium detectors for synchrotron applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rumaiz, A. K.; Kuczewski, A. J.; Mead, J.
In this paper, we have developed a series of monolithic multi-element germanium detectors, based on sensor arrays produced by the Forschungzentrum Julich, and on Application-specific integrated circuits (ASICs) developed at Brookhaven. Devices have been made with element counts ranging from 64 to 384. These detectors are being used at NSLS-II and APS for a range of diffraction experiments, both monochromatic and energy-dispersive. Compact and powerful readout systems have been developed, based on the new generation of FPGA system-on-chip devices, which provide closely coupled multi-core processors embedded in large gate arrays. Finally, we will discuss the technical details of the systems,more » and present some of the results from them.« less
NASA Astrophysics Data System (ADS)
Cerwin, Steve; Barnes, Julie; Kell, Scott; Walters, Mark
2003-09-01
This paper describes development and application of a novel method to accomplish real-time solid angle acoustic direction finding using two 8-element orthogonal microphone arrays. The developed prototype system was intended for localization and signature recognition of ground-based sounds from a small UAV. Recent advances in computer speeds have enabled the implementation of microphone arrays in many audio applications. Still, the real-time presentation of a two-dimensional sound field for the purpose of audio target localization is computationally challenging. In order to overcome this challenge, a crosspower spectrum phase1 (CSP) technique was applied to each 8-element arm of a 16-element cross array to provide audio target localization. In this paper, we describe the technique and compare it with two other commonly used techniques; Cross-Spectral Matrix2 and MUSIC3. The results show that the CSP technique applied to two 8-element orthogonal arrays provides a computationally efficient solution with reasonable accuracy and tolerable artifacts, sufficient for real-time applications. Additional topics include development of a synchronized 16-channel transmitter and receiver to relay the airborne data to the ground-based processor and presentation of test data demonstrating both ground-mounted operation and airborne localization of ground-based gunshots and loud engine sounds.
FPGA-based distributed computing microarchitecture for complex physical dynamics investigation.
Borgese, Gianluca; Pace, Calogero; Pantano, Pietro; Bilotta, Eleonora
2013-09-01
In this paper, we present a distributed computing system, called DCMARK, aimed at solving partial differential equations at the basis of many investigation fields, such as solid state physics, nuclear physics, and plasma physics. This distributed architecture is based on the cellular neural network paradigm, which allows us to divide the differential equation system solving into many parallel integration operations to be executed by a custom multiprocessor system. We push the number of processors to the limit of one processor for each equation. In order to test the present idea, we choose to implement DCMARK on a single FPGA, designing the single processor in order to minimize its hardware requirements and to obtain a large number of easily interconnected processors. This approach is particularly suited to study the properties of 1-, 2- and 3-D locally interconnected dynamical systems. In order to test the computing platform, we implement a 200 cells, Korteweg-de Vries (KdV) equation solver and perform a comparison between simulations conducted on a high performance PC and on our system. Since our distributed architecture takes a constant computing time to solve the equation system, independently of the number of dynamical elements (cells) of the CNN array, it allows us to reduce the elaboration time more than other similar systems in the literature. To ensure a high level of reconfigurability, we design a compact system on programmable chip managed by a softcore processor, which controls the fast data/control communication between our system and a PC Host. An intuitively graphical user interface allows us to change the calculation parameters and plot the results.
NASA Technical Reports Server (NTRS)
Gooder, S. T.
1977-01-01
System tests were performed in which Integrally Regulated Solar Arrays (IRSA's) were used to directly power the beam and accelerator loads of a 30-cm-diameter, electron bombardment, mercury ion thruster. The remaining thruster loads were supplied from conventional power-processing circuits. This combination of IRSA's and conventional circuits formed a hybrid power processor. Thruster performance was evaluated at 3/4- and 1-A beam currents with both the IRSA-hybrid and conventional power processors and was found to be identical for both systems. Power processing is significantly more efficient with the hybrid system. System dynamics and IRSA response to thruster arcs are also examined.
Runtime support and compilation methods for user-specified data distributions
NASA Technical Reports Server (NTRS)
Ponnusamy, Ravi; Saltz, Joel; Choudhury, Alok; Hwang, Yuan-Shin; Fox, Geoffrey
1993-01-01
This paper describes two new ideas by which an HPF compiler can deal with irregular computations effectively. The first mechanism invokes a user specified mapping procedure via a set of compiler directives. The directives allow use of program arrays to describe graph connectivity, spatial location of array elements, and computational load. The second mechanism is a simple conservative method that in many cases enables a compiler to recognize that it is possible to reuse previously computed information from inspectors (e.g. communication schedules, loop iteration partitions, information that associates off-processor data copies with on-processor buffer locations). We present performance results for these mechanisms from a Fortran 90D compiler implementation.
Arranging computer architectures to create higher-performance controllers
NASA Technical Reports Server (NTRS)
Jacklin, Stephen A.
1988-01-01
Techniques for integrating microprocessors, array processors, and other intelligent devices in control systems are reviewed, with an emphasis on the (re)arrangement of components to form distributed or parallel processing systems. Consideration is given to the selection of the host microprocessor, increasing the power and/or memory capacity of the host, multitasking software for the host, array processors to reduce computation time, the allocation of real-time and non-real-time events to different computer subsystems, intelligent devices to share the computational burden for real-time events, and intelligent interfaces to increase communication speeds. The case of a helicopter vibration-suppression and stabilization controller is analyzed as an example, and significant improvements in computation and throughput rates are demonstrated.
General linear codes for fault-tolerant matrix operations on processor arrays
NASA Technical Reports Server (NTRS)
Nair, V. S. S.; Abraham, J. A.
1988-01-01
Various checksum codes have been suggested for fault-tolerant matrix computations on processor arrays. Use of these codes is limited due to potential roundoff and overflow errors. Numerical errors may also be misconstrued as errors due to physical faults in the system. In this a set of linear codes is identified which can be used for fault-tolerant matrix operations such as matrix addition, multiplication, transposition, and LU-decomposition, with minimum numerical error. Encoding schemes are given for some of the example codes which fall under the general set of codes. With the help of experiments, a rule of thumb for the selection of a particular code for a given application is derived.
NASA Astrophysics Data System (ADS)
Männer, R.
1989-12-01
This paper describes a systolic array processor for a ring image Cherenkov counter which is capable of identifying pairs of electron circles with a known radius and a certain minimum distance within 15 μs. The processor is a very flexible and fast device. It consists of 128 x 128 processing elements (PEs), where one PE is assigned to each pixel of the image. All PEs run synchronously at 40 MHz. The identification of electron circles is done by correlating the detector image with the proper circle circumference. Circle centers are found by peak detection in the correlation result. A second correlation with a circle disc allows circles of closed electron pairs to be rejected. The trigger decision is generated if a pseudo adder detects at least two remaining circles. The device is controlled by a freely programmable sequencer. A VLSI chip containing 8 x 8 PEs is being developed using a VENUS design system and will be produced in 2μ CMOS technology.
Unstructured Adaptive Grid Computations on an Array of SMPs
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Pramanick, Ira; Sohn, Andrew; Simon, Horst D.
1996-01-01
Dynamic load balancing is necessary for parallel adaptive methods to solve unsteady CFD problems on unstructured grids. We have presented such a dynamic load balancing framework called JOVE, in this paper. Results on a four-POWERnode POWER CHALLENGEarray demonstrated that load balancing gives significant performance improvements over no load balancing for such adaptive computations. The parallel speedup of JOVE, implemented using MPI on the POWER CHALLENCEarray, was significant, being as high as 31 for 32 processors. An implementation of JOVE that exploits 'an array of SMPS' architecture was also studied; this hybrid JOVE outperformed flat JOVE by up to 28% on the meshes and adaption models tested. With large, realistic meshes and actual flow-solver and adaption phases incorporated into JOVE, hybrid JOVE can be expected to yield significant advantage over flat JOVE, especially as the number of processors is increased, thus demonstrating the scalability of an array of SMPs architecture.
Radiation Tolerant, FPGA-Based SmallSat Computer System
NASA Technical Reports Server (NTRS)
LaMeres, Brock J.; Crum, Gary A.; Martinez, Andres; Petro, Andrew
2015-01-01
The Radiation Tolerant, FPGA-based SmallSat Computer System (RadSat) computing platform exploits a commercial off-the-shelf (COTS) Field Programmable Gate Array (FPGA) with real-time partial reconfiguration to provide increased performance, power efficiency and radiation tolerance at a fraction of the cost of existing radiation hardened computing solutions. This technology is ideal for small spacecraft that require state-of-the-art on-board processing in harsh radiation environments but where using radiation hardened processors is cost prohibitive.
Design of video processing and testing system based on DSP and FPGA
NASA Astrophysics Data System (ADS)
Xu, Hong; Lv, Jun; Chen, Xi'ai; Gong, Xuexia; Yang, Chen'na
2007-12-01
Based on high speed Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA), a video capture, processing and display system is presented, which is of miniaturization and low power. In this system, a triple buffering scheme was used for the capture and display, so that the application can always get a new buffer without waiting; The Digital Signal Processor has an image process ability and it can be used to test the boundary of workpiece's image. A video graduation technology is used to aim at the position which is about to be tested, also, it can enhance the system's flexibility. The character superposition technology realized by DSP is used to display the test result on the screen in character format. This system can process image information in real time, ensure test precision, and help to enhance product quality and quality management.
Hardware Architecture Study for NASA's Space Software Defined Radios
NASA Technical Reports Server (NTRS)
Reinhart, Richard C.; Scardelletti, Maximilian C.; Mortensen, Dale J.; Kacpura, Thomas J.; Andro, Monty; Smith, Carl; Liebetreu, John
2008-01-01
This study defines a hardware architecture approach for software defined radios to enable commonality among NASA space missions. The architecture accommodates a range of reconfigurable processing technologies including general purpose processors, digital signal processors, field programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs) in addition to flexible and tunable radio frequency (RF) front-ends to satisfy varying mission requirements. The hardware architecture consists of modules, radio functions, and and interfaces. The modules are a logical division of common radio functions that comprise a typical communication radio. This paper describes the architecture details, module definitions, and the typical functions on each module as well as the module interfaces. Trade-offs between component-based, custom architecture and a functional-based, open architecture are described. The architecture does not specify the internal physical implementation within each module, nor does the architecture mandate the standards or ratings of the hardware used to construct the radios.
NASA Technical Reports Server (NTRS)
Reinhart, Richard C.; Kacpura, Thomas J.; Smith, Carl R.; Liebetreu, John; Hill, Gary; Mortensen, Dale J.; Andro, Monty; Scardelletti, Maximilian C.; Farrington, Allen
2008-01-01
This report defines a hardware architecture approach for software-defined radios to enable commonality among NASA space missions. The architecture accommodates a range of reconfigurable processing technologies including general-purpose processors, digital signal processors, field programmable gate arrays, and application-specific integrated circuits (ASICs) in addition to flexible and tunable radiofrequency front ends to satisfy varying mission requirements. The hardware architecture consists of modules, radio functions, and interfaces. The modules are a logical division of common radio functions that compose a typical communication radio. This report describes the architecture details, the module definitions, the typical functions on each module, and the module interfaces. Tradeoffs between component-based, custom architecture and a functional-based, open architecture are described. The architecture does not specify a physical implementation internally on each module, nor does the architecture mandate the standards or ratings of the hardware used to construct the radios.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Muller, U.A.; Baumle, B.; Kohler, P.
1992-10-01
Music, a DSP-based system with a parallel distributed-memory architecture, provides enormous computing power yet retains the flexibility of a general-purpose computer. Reaching a peak performance of 2.7 Gflops at a significantly lower cost, power consumption, and space requirement than conventional supercomputers, Music is well suited to computationally intensive applications such as neural network simulation. 12 refs., 9 figs., 2 tabs.
Baranwal, Mayank; Gorugantu, Ram S; Salapaka, Srinivasa M
2015-08-01
This paper aims at control design and its implementation for robust high-bandwidth precision (nanoscale) positioning systems. Even though modern model-based control theoretic designs for robust broadband high-resolution positioning have enabled orders of magnitude improvement in performance over existing model independent designs, their scope is severely limited by the inefficacies of digital implementation of the control designs. High-order control laws that result from model-based designs typically have to be approximated with reduced-order systems to facilitate digital implementation. Digital systems, even those that have very high sampling frequencies, provide low effective control bandwidth when implementing high-order systems. In this context, field programmable analog arrays (FPAAs) provide a good alternative to the use of digital-logic based processors since they enable very high implementation speeds, moreover with cheaper resources. The superior flexibility of digital systems in terms of the implementable mathematical and logical functions does not give significant edge over FPAAs when implementing linear dynamic control laws. In this paper, we pose the control design objectives for positioning systems in different configurations as optimal control problems and demonstrate significant improvements in performance when the resulting control laws are applied using FPAAs as opposed to their digital counterparts. An improvement of over 200% in positioning bandwidth is achieved over an earlier digital signal processor (DSP) based implementation for the same system and same control design, even when for the DSP-based system, the sampling frequency is about 100 times the desired positioning bandwidth.
Microdot - A Four-Bit Microcontroller Designed for Distributed Low-End Computing in Satellites
NASA Astrophysics Data System (ADS)
2002-03-01
Many satellites are an integrated collection of sensors and actuators that require dedicated real-time control. For single processor systems, additional sensors require an increase in computing power and speed to provide the multi-tasking capability needed to service each sensor. Faster processors cost more and consume more power, which taxes a satellite's power resources and may lead to shorter satellite lifetimes. An alternative design approach is a distributed network of small and low power microcontrollers designed for space that handle the computing requirements of each individual sensor and actuator. The design of microdot, a four-bit microcontroller for distributed low-end computing, is presented. The design is based on previous research completed at the Space Electronics Branch, Air Force Research Laboratory (AFRL/VSSE) at Kirtland AFB, NM, and the Air Force Institute of Technology at Wright-Patterson AFB, OH. The Microdot has 29 instructions and a 1K x 4 instruction memory. The distributed computing architecture is based on the Philips Semiconductor I2C Serial Bus Protocol. A prototype was implemented and tested using an Altera Field Programmable Gate Array (FPGA). The prototype was operable to 9.1 MHz. The design was targeted for fabrication in a radiation-hardened-by-design gate-array cell library for the TSMC 0.35 micrometer CMOS process.
Chrestenson transform FPGA embedded factorizations.
Corinthios, Michael J
2016-01-01
Chrestenson generalized Walsh transform factorizations for parallel processing imbedded implementations on field programmable gate arrays are presented. This general base transform, sometimes referred to as the Discrete Chrestenson transform, has received special attention in recent years. In fact, the Discrete Fourier transform and Walsh-Hadamard transform are but special cases of the Chrestenson generalized Walsh transform. Rotations of a base-p hypercube, where p is an arbitrary integer, are shown to produce dynamic contention-free memory allocation, in processor architecture. The approach is illustrated by factorizations involving the processing of matrices of the transform which are function of four variables. Parallel operations are implemented matrix multiplications. Each matrix, of dimension N × N, where N = p (n) , n integer, has a structure that depends on a variable parameter k that denotes the iteration number in the factorization process. The level of parallelism, in the form of M = p (m) processors can be chosen arbitrarily by varying m between zero to its maximum value of n - 1. The result is an equation describing the generalised parallelism factorization as a function of the four variables n, p, k and m. Applications of the approach are shown in relation to configuring field programmable gate arrays for digital signal processing applications.
Study of a programmable high speed processor for use on-board satellites
NASA Astrophysics Data System (ADS)
Degavre, J. Cl.; Okkes, R.; Gaillat, G.
The availability of VLSI programmable devices will significantly enhance satellite on-board data processing capabilities. A case study is presented which indicates that computation-intensive processing applications requiring the execution of 100 megainstructions/sec are within the CD power constraints of satellites. It is noted that the current progress in semicustom design technique development and in achievable gate array densities, together with the recent announcement of improved monochip processors, are encouraging the development of an on-board programmable processor architecture able to associate the devices that will appear in communication and military markets.
Implementation and Assessment of Advanced Analog Vector-Matrix Processor
NASA Technical Reports Server (NTRS)
Gary, Charles K.; Bualat, Maria G.; Lum, Henry, Jr. (Technical Monitor)
1994-01-01
This paper discusses the design and implementation of an analog optical vecto-rmatrix coprocessor with a throughput of 128 Mops for a personal computer. Vector matrix calculations are inherently parallel, providing a promising domain for the use of optical calculators. However, to date, digital optical systems have proven too cumbersome to replace electronics, and analog processors have not demonstrated sufficient accuracy in large scale systems. The goal of the work described in this paper is to demonstrate a viable optical coprocessor for linear operations. The analog optical processor presented has been integrated with a personal computer to provide full functionality and is the first demonstration of an optical linear algebra processor with a throughput greater than 100 Mops. The optical vector matrix processor consists of a laser diode source, an acoustooptical modulator array to input the vector information, a liquid crystal spatial light modulator to input the matrix information, an avalanche photodiode array to read out the result vector of the vector matrix multiplication, as well as transport optics and the electronics necessary to drive the optical modulators and interface to the computer. The intent of this research is to provide a low cost, highly energy efficient coprocessor for linear operations. Measurements of the analog accuracy of the processor performing 128 Mops are presented along with an assessment of the implications for future systems. A range of noise sources, including cross-talk, source amplitude fluctuations, shot noise at the detector, and non-linearities of the optoelectronic components are measured and compared to determine the most significant source of error. The possibilities for reducing these sources of error are discussed. Also, the total error is compared with that expected from a statistical analysis of the individual components and their relation to the vector-matrix operation. The sufficiency of the measured accuracy of the processor is compared with that required for a range of typical problems. Calculations resolving alloy concentrations from spectral plume data of rocket engines are implemented on the optical processor, demonstrating its sufficiency for this problem. We also show how this technology can be easily extended to a 100 x 100 10 MHz (200 Cops) processor.
NASA Astrophysics Data System (ADS)
Ishii, Akira; Tai, Haruka; Mitsudo, Jun
2007-10-01
This paper describes a real-time system for measuring the three-dimensional shape of solder bumps arrayed on an LSI chip-size-package (CSP) board presented for inspection based on the shape-from-focus technique. It uses a copper-alloy mirror deformed by a piezoelectric actuator as a varifocal mirror enabling a simple, fast, precise focusing mechanism without moving parts to be built. A practical measuring speed of 1.69 s/package for a small CSP board (4 x 4 mm2) was achieved by incorporating an exclusive field programmable gate array processor to calculate focus measure and by constructing a domed array of LEDs as a high-intensity, uniform illumination system so that a fast (150 fps) and high-resolution (1024 x 1024 pixels/frame) CMOS image sensor could be used. Accurate measurements of bump height were also achieved with errors of 10 μm (2σ) meeting the requirements for testing the coplanarity of a bump array.
Limit characteristics of digital optoelectronic processor
NASA Astrophysics Data System (ADS)
Kolobrodov, V. G.; Tymchik, G. S.; Kolobrodov, M. S.
2018-01-01
In this article, the limiting characteristics of a digital optoelectronic processor are explored. The limits are defined by diffraction effects and a matrix structure of the devices for input and output of optical signals. The purpose of a present research is to optimize the parameters of the processor's components. The developed physical and mathematical model of DOEP allowed to establish the limit characteristics of the processor, restricted by diffraction effects and an array structure of the equipment for input and output of optical signals, as well as to optimize the parameters of the processor's components. The diameter of the entrance pupil of the Fourier lens is determined by the size of SLM and the pixel size of the modulator. To determine the spectral resolution, it is offered to use a concept of an optimum phase when the resolved diffraction maxima coincide with the pixel centers of the radiation detector.
2005-12-01
Upsets in SRAM FPGAs,” Military and Aerospace Applications of Programmable Logic Devices, September 2002. 8. Wakerly , John F,. “Microcomputer...change. The goal of the Configurable Fault Tolerant Processor (CFTP) Project is to explore, develop and demonstrate the applicability of using off-the...develop and demonstrate the applicability of using commercial-of-the-shelf (COTS) Field Programmable Gate Arrays (FPGA) in the design of
Spacecube: A Family of Reconfigurable Hybrid On-Board Science Data Processors
NASA Technical Reports Server (NTRS)
Flatley, Thomas P.
2015-01-01
SpaceCube is a family of Field Programmable Gate Array (FPGA) based on-board science data processing systems developed at the NASA Goddard Space Flight Center (GSFC). The goal of the SpaceCube program is to provide 10x to 100x improvements in on-board computing power while lowering relative power consumption and cost. SpaceCube is based on the Xilinx Virtex family of FPGAs, which include processor, FPGA logic and digital signal processing (DSP) resources. These processing elements are leveraged to produce a hybrid science data processing platform that accelerates the execution of algorithms by distributing computational functions to the most suitable elements. This approach enables the implementation of complex on-board functions that were previously limited to ground based systems, such as on-board product generation, data reduction, calibration, classification, eventfeature detection, data mining and real-time autonomous operations. The system is fully reconfigurable in flight, including data parameters, software and FPGA logic, through either ground commanding or autonomously in response to detected eventsfeatures in the instrument data stream.
Onboard Experiment Data Support Facility
NASA Technical Reports Server (NTRS)
1976-01-01
An onboard array structure has been devised for end to end processing of data from multiple spaceborne sensors. The array constitutes sets of programmable pipeline processors whose elements perform each assigned function in 0.25 microseconds. This space shuttle computer system can handle data rates from a few bits to over 100 megabits per second.
2008-09-01
of magnetic UXO. The prototype STAR Sensor comprises: a) A cubic array of eight fluxgate magnetometers . b) A 24-channel data acquisition/signal...array (shaded boxes) of eight low noise Triaxial Fluxgate Magnetometers (TFM) develops 24 channels of vector B- field data. Processor hardware
Optimal expression evaluation for data parallel architectures
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Schreiber, Robert
1990-01-01
A data parallel machine represents an array or other composite data structure by allocating one processor (at least conceptually) per data item. A pointwise operation can be performed between two such arrays in unit time, provided their corresponding elements are allocated in the same processors. If the arrays are not aligned in this fashion, the cost of moving one or both of them is part of the cost of the operation. The choice of where to perform the operation then affects this cost. If an expression with several operands is to be evaluated, there may be many choices of where to perform the intermediate operations. An efficient algorithm is given to find the minimum-cost way to evaluate an expression, for several different data parallel architectures. This algorithm applies to any architecture in which the metric describing the cost of moving an array is robust. This encompasses most of the common data parallel communication architectures, including meshes of arbitrary dimension and hypercubes. Remarks are made on several variations of the problem, some of which are solved and some of which remain open.
Controllable 3D Display System Based on Frontal Projection Lenticular Screen
NASA Astrophysics Data System (ADS)
Feng, Q.; Sang, X.; Yu, X.; Gao, X.; Wang, P.; Li, C.; Zhao, T.
2014-08-01
A novel auto-stereoscopic three-dimensional (3D) projection display system based on the frontal projection lenticular screen is demonstrated. It can provide high real 3D experiences and the freedom of interaction. In the demonstrated system, the content can be changed and the dense of viewing points can be freely adjusted according to the viewers' demand. The high dense viewing points can provide smooth motion parallax and larger image depth without blurry. The basic principle of stereoscopic display is described firstly. Then, design architectures including hardware and software are demonstrated. The system consists of a frontal projection lenticular screen, an optimally designed projector-array and a set of multi-channel image processors. The parameters of the frontal projection lenticular screen are based on the demand of viewing such as the viewing distance and the width of view zones. Each projector is arranged on an adjustable platform. The set of multi-channel image processors are made up of six PCs. One of them is used as the main controller, the other five client PCs can process 30 channel signals and transmit them to the projector-array. Then a natural 3D scene will be perceived based on the frontal projection lenticular screen with more than 1.5 m image depth in real time. The control section is presented in detail, including parallax adjustment, system synchronization, distortion correction, etc. Experimental results demonstrate the effectiveness of this novel controllable 3D display system.
Two-dimensional acousto-optic processor using circular antenna array with a Butler matrix
NASA Astrophysics Data System (ADS)
Lee, Jim P.
1992-09-01
A two-dimensional acousto-optic signal processor is shown to be useful for providing simultaneous spectrum analysis and direction finding of radar signals over an instantaneous field of view of 360 deg. A system analysis with emphasis on the direction-finding aspect of this new architecture is presented. The peak location of the optical pattern provides a direct measure of bearing, independent of signal frequency. In addition, the sidelobe levels of the pattern can be effectively reduced using amplitude weighting. Performance parameters, such as mainlobe beamwidth, peak-sidelobe level, and pointing error, are analyzed as a function of the Gaussian laser illumination profile and the number of channels. Finally, a comparison with a linear antenna array architecture is also discussed.
NASA Technical Reports Server (NTRS)
Olariu, S.; Schwing, J.; Zhang, J.
1991-01-01
A bus system that can change dynamically to suit computational needs is referred to as reconfigurable. We present a fast adaptive convex hull algorithm on a two-dimensional processor array with a reconfigurable bus system (2-D PARBS, for short). Specifically, we show that computing the convex hull of a planar set of n points taken O(log n/log m) time on a 2-D PARBS of size mn x n with 3 less than or equal to m less than or equal to n. Our result implies that the convex hull of n points in the plane can be computed in O(1) time in a 2-D PARBS of size n(exp 1.5) x n.
Knowledge-Based Transformational Synthesis of Efficient Structures for Concurrent Computation.
1985-09-30
this wire network to a smaller wire network , creation of subnetworks to replace an overly-broad fanout network , virtualization which is the creation of...dependencies among the values they contain, reduction of this wire network to a smaller wire network , " creation of subnetworks to replace an overly-broad...fanout network , "rtualization which is the creation of additional array elements and processors to reflect the internal enumera- -4 tions that
A performance analysis of advanced I/O architectures for PC-based network file servers
NASA Astrophysics Data System (ADS)
Huynh, K. D.; Khoshgoftaar, T. M.
1994-12-01
In the personal computing and workstation environments, more and more I/O adapters are becoming complete functional subsystems that are intelligent enough to handle I/O operations on their own without much intervention from the host processor. The IBM Subsystem Control Block (SCB) architecture has been defined to enhance the potential of these intelligent adapters by defining services and conventions that deliver command information and data to and from the adapters. In recent years, a new storage architecture, the Redundant Array of Independent Disks (RAID), has been quickly gaining acceptance in the world of computing. In this paper, we would like to discuss critical system design issues that are important to the performance of a network file server. We then present a performance analysis of the SCB architecture and disk array technology in typical network file server environments based on personal computers (PCs). One of the key issues investigated in this paper is whether a disk array can outperform a group of disks (of same type, same data capacity, and same cost) operating independently, not in parallel as in a disk array.
MULTI-CORE AND OPTICAL PROCESSOR RELATED APPLICATIONS RESEARCH AT OAK RIDGE NATIONAL LABORATORY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barhen, Jacob; Kerekes, Ryan A; ST Charles, Jesse Lee
2008-01-01
High-speed parallelization of common tasks holds great promise as a low-risk approach to achieving the significant increases in signal processing and computational performance required for next generation innovations in reconfigurable radio systems. Researchers at the Oak Ridge National Laboratory have been working on exploiting the parallelization offered by this emerging technology and applying it to a variety of problems. This paper will highlight recent experience with four different parallel processors applied to signal processing tasks that are directly relevant to signal processing required for SDR/CR waveforms. The first is the EnLight Optical Core Processor applied to matched filter (MF) correlationmore » processing via fast Fourier transform (FFT) of broadband Dopplersensitive waveforms (DSW) using active sonar arrays for target tracking. The second is the IBM CELL Broadband Engine applied to 2-D discrete Fourier transform (DFT) kernel for image processing and frequency domain processing. And the third is the NVIDIA graphical processor applied to document feature clustering. EnLight Optical Core Processor. Optical processing is inherently capable of high-parallelism that can be translated to very high performance, low power dissipation computing. The EnLight 256 is a small form factor signal processing chip (5x5 cm2) with a digital optical core that is being developed by an Israeli startup company. As part of its evaluation of foreign technology, ORNL's Center for Engineering Science Advanced Research (CESAR) had access to a precursor EnLight 64 Alpha hardware for a preliminary assessment of capabilities in terms of large Fourier transforms for matched filter banks and on applications related to Doppler-sensitive waveforms. This processor is optimized for array operations, which it performs in fixed-point arithmetic at the rate of 16 TeraOPS at 8-bit precision. This is approximately 1000 times faster than the fastest DSP available today. The optical core performs the matrix-vector multiplications, where the nominal matrix size is 256x256. The system clock is 125MHz. At each clock cycle, 128K multiply-and-add operations per second (OPS) are carried out, which yields a peak performance of 16 TeraOPS. IBM Cell Broadband Engine. The Cell processor is the extraordinary resulting product of 5 years of sustained, intensive R&D collaboration (involving over $400M investment) between IBM, Sony, and Toshiba. Its architecture comprises one multithreaded 64-bit PowerPC processor element (PPE) with VMX capabilities and two levels of globally coherent cache, and 8 synergistic processor elements (SPEs). Each SPE consists of a processor (SPU) designed for streaming workloads, local memory, and a globally coherent direct memory access (DMA) engine. Computations are performed in 128-bit wide single instruction multiple data streams (SIMD). An integrated high-bandwidth element interconnect bus (EIB) connects the nine processors and their ports to external memory and to system I/O. The Applied Software Engineering Research (ASER) Group at the ORNL is applying the Cell to a variety of text and image analysis applications. Research on Cell-equipped PlayStation3 (PS3) consoles has led to the development of a correlation-based image recognition engine that enables a single PS3 to process images at more than 10X the speed of state-of-the-art single-core processors. NVIDIA Graphics Processing Units. The ASER group is also employing the latest NVIDIA graphical processing units (GPUs) to accelerate clustering of thousands of text documents using recently developed clustering algorithms such as document flocking and affinity propagation.« less
Block Copolymers as Templates for Arrays of Carbon Nanotubes
NASA Technical Reports Server (NTRS)
Bronikowski, Michael; Hunt, Brian
2003-01-01
A method of manufacturing regular arrays of precisely sized, shaped, positioned, and oriented carbon nanotubes has been proposed. Arrays of carbon nanotubes could prove useful in such diverse applications as communications (especially for filtering of signals), biotechnology (for sequencing of DNA and separation of chemicals), and micro- and nanoelectronics (as field emitters and as signal transducers and processors). The method is expected to be suitable for implementation in standard semiconductor-device fabrication facilities.
Advanced Avionics and Processor Systems for a Flexible Space Exploration Architecture
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Adams, James H.; Smith, Leigh M.; Johnson, Michael A.; Cressler, John D.
2010-01-01
The Advanced Avionics and Processor Systems (AAPS) project, formerly known as the Radiation Hardened Electronics for Space Environments (RHESE) project, endeavors to develop advanced avionic and processor technologies anticipated to be used by NASA s currently evolving space exploration architectures. The AAPS project is a part of the Exploration Technology Development Program, which funds an entire suite of technologies that are aimed at enabling NASA s ability to explore beyond low earth orbit. NASA s Marshall Space Flight Center (MSFC) manages the AAPS project. AAPS uses a broad-scoped approach to developing avionic and processor systems. Investment areas include advanced electronic designs and technologies capable of providing environmental hardness, reconfigurable computing techniques, software tools for radiation effects assessment, and radiation environment modeling tools. Near-term emphasis within the multiple AAPS tasks focuses on developing prototype components using semiconductor processes and materials (such as Silicon-Germanium (SiGe)) to enhance a device s tolerance to radiation events and low temperature environments. As the SiGe technology will culminate in a delivered prototype this fiscal year, the project emphasis shifts its focus to developing low-power, high efficiency total processor hardening techniques. In addition to processor development, the project endeavors to demonstrate techniques applicable to reconfigurable computing and partially reconfigurable Field Programmable Gate Arrays (FPGAs). This capability enables avionic architectures the ability to develop FPGA-based, radiation tolerant processor boards that can serve in multiple physical locations throughout the spacecraft and perform multiple functions during the course of the mission. The individual tasks that comprise AAPS are diverse, yet united in the common endeavor to develop electronics capable of operating within the harsh environment of space. Specifically, the AAPS tasks for the Federal fiscal year of 2010 are: Silicon-Germanium (SiGe) Integrated Electronics for Extreme Environments, Modeling of Radiation Effects on Electronics, Radiation Hardened High Performance Processors (HPP), and and Reconfigurable Computing.
Optimized smith waterman processor design for breast cancer early diagnosis
NASA Astrophysics Data System (ADS)
Nurdin, D. S.; Isa, M. N.; Ismail, R. C.; Ahmad, M. I.
2017-09-01
This paper presents an optimized design of Processing Element (PE) of Systolic Array (SA) which implements affine gap penalty Smith Waterman (SW) algorithm on the Xilinx Virtex-6 XC6VLX75T Field Programmable Gate Array (FPGA) for Deoxyribonucleic Acid (DNA) sequence alignment. The PE optimization aims to reduce PE logic resources to increase number of PEs in FPGA for higher degree of parallelism during alignment matrix computations. This is useful for aligning long DNA-based disease sequence such as Breast Cancer (BC) for early diagnosis. The optimized PE architecture has the smallest PE area with 15 slices in a PE and 776 PEs implemented in the Virtex - 6 FPGA.
A Survey of Plasmas and Their Applications
NASA Technical Reports Server (NTRS)
Eastman, Timothy E.; Grabbe, C. (Editor)
2006-01-01
Plasmas are everywhere and relevant to everyone. We bath in a sea of photons, quanta of electromagnetic radiation, whose sources (natural and artificial) are dominantly plasma-based (stars, fluorescent lights, arc lamps.. .). Plasma surface modification and materials processing contribute increasingly to a wide array of modern artifacts; e.g., tiny plasma discharge elements constitute the pixel arrays of plasma televisions and plasma processing provides roughly one-third of the steps to produce semiconductors, essential elements of our networking and computing infrastructure. Finally, plasmas are central to many cutting edge technologies with high potential (compact high-energy particle accelerators; plasma-enhanced waste processors; high tolerance surface preparation and multifuel preprocessors for transportation systems; fusion for energy production).
Fault-Tolerant Software-Defined Radio on Manycore
NASA Technical Reports Server (NTRS)
Ricketts, Scott
2015-01-01
Software-defined radio (SDR) platforms generally rely on field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), but such architectures require significant software development. In addition, application demands for radiation mitigation and fault tolerance exacerbate programming challenges. MaXentric Technologies, LLC, has developed a manycore-based SDR technology that provides 100 times the throughput of conventional radiationhardened general purpose processors. Manycore systems (30-100 cores and beyond) have the potential to provide high processing performance at error rates that are equivalent to current space-deployed uniprocessor systems. MaXentric's innovation is a highly flexible radio, providing over-the-air reconfiguration; adaptability; and uninterrupted, real-time, multimode operation. The technology is also compliant with NASA's Space Telecommunications Radio System (STRS) architecture. In addition to its many uses within NASA communications, the SDR can also serve as a highly programmable research-stage prototyping device for new waveforms and other communications technologies. It can also support noncommunication codes on its multicore processor, collocated with the communications workload-reducing the size, weight, and power of the overall system by aggregating processing jobs to a single board computer.
CoNNeCT Baseband Processor Module
NASA Technical Reports Server (NTRS)
Yamamoto, Clifford K; Jedrey, Thomas C.; Gutrich, Daniel G.; Goodpasture, Richard L.
2011-01-01
A document describes the CoNNeCT Baseband Processor Module (BPM) based on an updated processor, memory technology, and field-programmable gate arrays (FPGAs). The BPM was developed from a requirement to provide sufficient computing power and memory storage to conduct experiments for a Software Defined Radio (SDR) to be implemented. The flight SDR uses the AT697 SPARC processor with on-chip data and instruction cache. The non-volatile memory has been increased from a 20-Mbit EEPROM (electrically erasable programmable read only memory) to a 4-Gbit Flash, managed by the RTAX2000 Housekeeper, allowing more programs and FPGA bit-files to be stored. The volatile memory has been increased from a 20-Mbit SRAM (static random access memory) to a 1.25-Gbit SDRAM (synchronous dynamic random access memory), providing additional memory space for more complex operating systems and programs to be executed on the SPARC. All memory is EDAC (error detection and correction) protected, while the SPARC processor implements fault protection via TMR (triple modular redundancy) architecture. Further capability over prior BPM designs includes the addition of a second FPGA to implement features beyond the resources of a single FPGA. Both FPGAs are implemented with Xilinx Virtex-II and are interconnected by a 96-bit bus to facilitate data exchange. Dedicated 1.25- Gbit SDRAMs are wired to each Xilinx FPGA to accommodate high rate data buffering for SDR applications as well as independent SpaceWire interfaces. The RTAX2000 manages scrub and configuration of each Xilinx.
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
Massively parallel processor computer
NASA Technical Reports Server (NTRS)
Fung, L. W. (Inventor)
1983-01-01
An apparatus for processing multidimensional data with strong spatial characteristics, such as raw image data, characterized by a large number of parallel data streams in an ordered array is described. It comprises a large number (e.g., 16,384 in a 128 x 128 array) of parallel processing elements operating simultaneously and independently on single bit slices of a corresponding array of incoming data streams under control of a single set of instructions. Each of the processing elements comprises a bidirectional data bus in communication with a register for storing single bit slices together with a random access memory unit and associated circuitry, including a binary counter/shift register device, for performing logical and arithmetical computations on the bit slices, and an I/O unit for interfacing the bidirectional data bus with the data stream source. The massively parallel processor architecture enables very high speed processing of large amounts of ordered parallel data, including spatial translation by shifting or sliding of bits vertically or horizontally to neighboring processing elements.
Asynchronous parallel status comparator
Arnold, Jeffrey W.; Hart, Mark M.
1992-01-01
Apparatus for matching asynchronously received signals and determining whether two or more out of a total number of possible signals match. The apparatus comprises, in one embodiment, an array of sensors positioned in discrete locations and in communication with one or more processors. The processors will receive signals if the sensors detect a change in the variable sensed from a nominal to a special condition and will transmit location information in the form of a digital data set to two or more receivers. The receivers collect, read, latch and acknowledge the data sets and forward them to decoders that produce an output signal for each data set received. The receivers also periodically reset the system following each scan of the sensor array. A comparator then determines if any two or more, as specified by the user, of the output signals corresponds to the same location. A sufficient number of matches produces a system output signal that activates a system to restore the array to its nominal condition.
Asynchronous parallel status comparator
Arnold, J.W.; Hart, M.M.
1992-12-15
Disclosed is an apparatus for matching asynchronously received signals and determining whether two or more out of a total number of possible signals match. The apparatus comprises, in one embodiment, an array of sensors positioned in discrete locations and in communication with one or more processors. The processors will receive signals if the sensors detect a change in the variable sensed from a nominal to a special condition and will transmit location information in the form of a digital data set to two or more receivers. The receivers collect, read, latch and acknowledge the data sets and forward them to decoders that produce an output signal for each data set received. The receivers also periodically reset the system following each scan of the sensor array. A comparator then determines if any two or more, as specified by the user, of the output signals corresponds to the same location. A sufficient number of matches produces a system output signal that activates a system to restore the array to its nominal condition. 4 figs.
Bio-inspired optical rotation sensor
NASA Astrophysics Data System (ADS)
O'Carroll, David C.; Shoemaker, Patrick A.; Brinkworth, Russell S. A.
2007-01-01
Traditional approaches to calculating self-motion from visual information in artificial devices have generally relied on object identification and/or correlation of image sections between successive frames. Such calculations are computationally expensive and real-time digital implementation requires powerful processors. In contrast flies arrive at essentially the same outcome, the estimation of self-motion, in a much smaller package using vastly less power. Despite the potential advantages and a few notable successes, few neuromorphic analog VLSI devices based on biological vision have been employed in practical applications to date. This paper describes a hardware implementation in aVLSI of our recently developed adaptive model for motion detection. The chip integrates motion over a linear array of local motion processors to give a single voltage output. Although the device lacks on-chip photodetectors, it includes bias circuits to use currents from external photodiodes, and we have integrated it with a ring-array of 40 photodiodes to form a visual rotation sensor. The ring configuration reduces pattern noise and combined with the pixel-wise adaptive characteristic of the underlying circuitry, permits a robust output that is proportional to image rotational velocity over a large range of speeds, and is largely independent of either mean luminance or the spatial structure of the image viewed. In principle, such devices could be used as an element of a velocity-based servo to replace or augment inertial guidance systems in applications such as mUAVs.
NASA Astrophysics Data System (ADS)
Jo, Hyunho; Sim, Donggyu
2014-06-01
We present a bitstream decoding processor for entropy decoding of variable length coding-based multiformat videos. Since most of the computational complexity of entropy decoders comes from bitstream accesses and table look-up process, the developed bitstream processing unit (BsPU) has several designated instructions to access bitstreams and to minimize branch operations in the table look-up process. In addition, the instruction for bitstream access has the capability to remove emulation prevention bytes (EPBs) of H.264/AVC without initial delay, repeated memory accesses, and additional buffer. Experimental results show that the proposed method for EPB removal achieves a speed-up of 1.23 times compared to the conventional EPB removal method. In addition, the BsPU achieves speed-ups of 5.6 and 3.5 times in entropy decoding of H.264/AVC and MPEG-4 Visual bitstreams, respectively, compared to an existing processor without designated instructions and a new table mapping algorithm. The BsPU is implemented on a Xilinx Virtex5 LX330 field-programmable gate array. The MPEG-4 Visual (ASP, Level 5) and H.264/AVC (Main Profile, Level 4) are processed using the developed BsPU with a core clock speed of under 250 MHz in real time.
Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers
NASA Technical Reports Server (NTRS)
Su, Ernesto; Lain, Antonio; Ramaswamy, Shankar; Palermo, Daniel J.; Hodges, Eugene W., IV; Banerjee, Prithviraj
1995-01-01
The PARADIGM compiler project provides an automated means to parallelize programs, written in a serial programming model, for efficient execution on distributed-memory multicomputers. .A previous implementation of the compiler based on the PTD representation allowed symbolic array sizes, affine loop bounds and array subscripts, and variable number of processors, provided that arrays were single or multi-dimensionally block distributed. The techniques presented here extend the compiler to also accept multidimensional cyclic and block-cyclic distributions within a uniform symbolic framework. These extensions demand more sophisticated symbolic manipulation capabilities. A novel aspect of our approach is to meet this demand by interfacing PARADIGM with a powerful off-the-shelf symbolic package, Mathematica. This paper describes some of the Mathematica routines that performs various transformations, shows how they are invoked and used by the compiler to overcome the new challenges, and presents experimental results for code involving cyclic and block-cyclic arrays as evidence of the feasibility of the approach.
Wide-field microscopy using microcamera arrays
NASA Astrophysics Data System (ADS)
Marks, Daniel L.; Youn, Seo Ho; Son, Hui S.; Kim, Jungsang; Brady, David J.
2013-02-01
A microcamera is a relay lens paired with image sensors. Microcameras are grouped into arrays to relay overlapping views of a single large surface to the sensors to form a continuous synthetic image. The imaged surface may be curved or irregular as each camera may independently be dynamically focused to a different depth. Microcamera arrays are akin to microprocessors in supercomputers in that both join individual processors by an optoelectronic routing fabric to increase capacity and performance. A microcamera may image ten or more megapixels and grouped into an array of several hundred, as has already been demonstrated by the DARPA AWARE Wide-Field program with multiscale gigapixel photography. We adapt gigapixel microcamera array architectures to wide-field microscopy of irregularly shaped surfaces to greatly increase area imaging over 1000 square millimeters at resolutions of 3 microns or better in a single snapshot. The system includes a novel relay design, a sensor electronics package, and a FPGA-based networking fabric. Biomedical applications of this include screening for skin lesions, wide-field and resolution-agile microsurgical imaging, and microscopic cytometry of millions of cells performed in situ.
NASA Astrophysics Data System (ADS)
Tao, R.; Ma, Y.; Si, L.; Dong, X.; Zhou, P.; Liu, Z.
2011-11-01
We present a theoretical and experimental study of a target-in-the-loop (TIL) high-power adaptive phase-locked fiber laser array. The system configuration of the TIL adaptive phase-locked fiber laser array is introduced, and the fundamental theory for TIL based on the single-dithering technique is deduced for the first time. Two 10-W-level high-power fiber amplifiers are set up and adaptive phase locking of the two fiber amplifiers is accomplished successfully by implementing a single-dithering algorithm on a signal processor. The experimental results demonstrate that the optical phase noise for each beam channel can be effectively compensated by the TIL adaptive optics system under high-power applications and the fringe contrast on a remotely located extended target is advanced from 12% to 74% for the two 10-W-level fiber amplifiers.
Implementing Access to Data Distributed on Many Processors
NASA Technical Reports Server (NTRS)
James, Mark
2006-01-01
A reference architecture is defined for an object-oriented implementation of domains, arrays, and distributions written in the programming language Chapel. This technology primarily addresses domains that contain arrays that have regular index sets with the low-level implementation details being beyond the scope of this discussion. What is defined is a complete set of object-oriented operators that allows one to perform data distributions for domain arrays involving regular arithmetic index sets. What is unique is that these operators allow for the arbitrary regions of the arrays to be fragmented and distributed across multiple processors with a single point of access giving the programmer the illusion that all the elements are collocated on a single processor. Today's massively parallel High Productivity Computing Systems (HPCS) are characterized by a modular structure, with a large number of processing and memory units connected by a high-speed network. Locality of access as well as load balancing are primary concerns in these systems that are typically used for high-performance scientific computation. Data distributions address these issues by providing a range of methods for spreading large data sets across the components of a system. Over the past two decades, many languages, systems, tools, and libraries have been developed for the support of distributions. Since the performance of data parallel applications is directly influenced by the distribution strategy, users often resort to low-level programming models that allow fine-tuning of the distribution aspects affecting performance, but, at the same time, are tedious and error-prone. This technology presents a reusable design of a data-distribution framework for data parallel high-performance applications. Distributions are a means to express locality in systems composed of large numbers of processor and memory components connected by a network. Since distributions have a great effect on the performance of applications, it is important that the distribution strategy is flexible, so its behavior can change depending on the needs of the application. At the same time, high productivity concerns require that the user be shielded from error-prone, tedious details such as communication and synchronization.
Digital micromirror devices: principles and applications in imaging.
Bansal, Vivek; Saggau, Peter
2013-05-01
A digital micromirror device (DMD) is an array of individually switchable mirrors that can be used in many advanced optical systems as a rapid spatial light modulator. With a DMD, several implementations of confocal microscopy, hyperspectral imaging, and fluorescence lifetime imaging can be realized. The DMD can also be used as a real-time optical processor for applications such as the programmable array microscope and compressive sensing. Advantages and disadvantages of the DMD for these applications as well as methods to overcome some of the limitations will be discussed in this article. Practical considerations when designing with the DMD and sample optical layouts of a completely DMD-based imaging system and one in which acousto-optic deflectors (AODs) are used in the illumination pathway are also provided.
NASA Technical Reports Server (NTRS)
Swift, Gary M.; Allen, Gregory S.; Farmanesh, Farhad; George, Jeffrey; Petrick, David J.; Chayab, Fayez
2006-01-01
Shown in this presentation are recent results for the upset susceptibility of the various types of memory elements in the embedded PowerPC405 in the Xilinx V2P40 FPGA. For critical flight designs where configuration upsets are mitigated effectively through appropriate design triplication and configuration scrubbing, these upsets of processor elements can dominate the system error rate. Data from irradiations with both protons and heavy ions are given and compared using available models.
Integrated High-Speed Torque Control System for a Robotic Joint
NASA Technical Reports Server (NTRS)
Davis, Donald R. (Inventor); Radford, Nicolaus A. (Inventor); Permenter, Frank Noble (Inventor); Valvo, Michael C. (Inventor); Askew, R. Scott (Inventor)
2013-01-01
A control system for achieving high-speed torque for a joint of a robot includes a printed circuit board assembly (PCBA) having a collocated joint processor and high-speed communication bus. The PCBA may also include a power inverter module (PIM) and local sensor conditioning electronics (SCE) for processing sensor data from one or more motor position sensors. Torque control of a motor of the joint is provided via the PCBA as a high-speed torque loop. Each joint processor may be embedded within or collocated with the robotic joint being controlled. Collocation of the joint processor, PIM, and high-speed bus may increase noise immunity of the control system, and the localized processing of sensor data from the joint motor at the joint level may minimize bus cabling to and from each control node. The joint processor may include a field programmable gate array (FPGA).
CPU architecture for a fast and energy-saving calculation of convolution neural networks
NASA Astrophysics Data System (ADS)
Knoll, Florian J.; Grelcke, Michael; Czymmek, Vitali; Holtorf, Tim; Hussmann, Stephan
2017-06-01
One of the most difficult problem in the use of artificial neural networks is the computational capacity. Although large search engine companies own specially developed hardware to provide the necessary computing power, for the conventional user only remains the state of the art method, which is the use of a graphic processing unit (GPU) as a computational basis. Although these processors are well suited for large matrix computations, they need massive energy. Therefore a new processor on the basis of a field programmable gate array (FPGA) has been developed and is optimized for the application of deep learning. This processor is presented in this paper. The processor can be adapted for a particular application (in this paper to an organic farming application). The power consumption is only a fraction of a GPU application and should therefore be well suited for energy-saving applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuhl, D.E.
1987-09-01
A brief progress report is presented describing the preparation of /sup 11/C-scopolamine, /sup 17/F-fluoromethane and /sup 18/F-tetraalkylammonium fluoride. The application of /sup 11/C-scopolamine to map cholinergic receptors in normal human brain. Additional studies entitled ''The Automated Arterial Blood Sampling Systems for PET'' and ''Investigations of Array Processor Based High-Speed Parameter Estimation for Tracer Kinetic Modeling'' are also described. (DT)
ASPOD modifications of 1993-1994
NASA Technical Reports Server (NTRS)
Jackson, Jennifer J. (Editor); Fogarty, Paul W.; Muller, Matthew; Martucci, Thomas A., III; Williams, Daniel; Rowney, David A.
1994-01-01
ASPOD, Autonomous Space Processors for Orbital Debris, provides a unique way of collecting the space debris that has built up over the past 37 years. For the past several years, ASPOD has gone through several different modifications. This year's concentrations were on the solar cutting array, the solar tracker, the earth based main frame/tilt table, the controls for the two robotic arms, and accurate autocad drawings of ASPOD. This final report contains the reports written by the students who worked on the ASPOD project this year.
NASA Astrophysics Data System (ADS)
Krasilenko, Vladimir G.; Lazarev, Alexander A.; Nikitovich, Diana V.
2017-10-01
The paper considers results of design and modeling of continuously logical base cells (CL BC) based on current mirrors (CM) with functions of preliminary analogue and subsequent analogue-digital processing for creating sensor multichannel analog-to-digital converters (SMC ADCs) and image processors (IP). For such with vector or matrix parallel inputs-outputs IP and SMC ADCs it is needed active basic photosensitive cells with an extended electronic circuit, which are considered in paper. Such basic cells and ADCs based on them have a number of advantages: high speed and reliability, simplicity, small power consumption, high integration level for linear and matrix structures. We show design of the CL BC and ADC of photocurrents and their various possible implementations and its simulations. We consider CL BC for methods of selection and rank preprocessing and linear array of ADCs with conversion to binary codes and Gray codes. In contrast to our previous works here we will dwell more on analogue preprocessing schemes for signals of neighboring cells. Let us show how the introduction of simple nodes based on current mirrors extends the range of functions performed by the image processor. Each channel of the structure consists of several digital-analog cells (DC) on 15-35 CMOS. The amount of DC does not exceed the number of digits of the formed code, and for an iteration type, only one cell of DC, complemented by the device of selection and holding (SHD), is required. One channel of ADC with iteration is based on one DC-(G) and SHD, and it has only 35 CMOS transistors. In such ADCs easily parallel code can be realized and also serial-parallel output code. The circuits and simulation results of their design with OrCAD are shown. The supply voltage of the DC is 1.8÷3.3V, the range of an input photocurrent is 0.1÷24μA, the transformation time is 20÷30nS at 6-8 bit binary or Gray codes. The general power consumption of the ADC with iteration is only 50÷100μW, if the maximum input current is 4μA. Such simple structure of linear array of ADCs with low power consumption and supply voltage 3.3V, and at the same time with good dynamic characteristics (frequency of digitization even for 1.5μm CMOS-technologies is 40÷50 MHz, and can be increased up to 10 times) and accuracy characteristics are show. The SMC ADCs based on CL BC and CM opens new prospects for realization of linear and matrix IP and photo-electronic structures with matrix operands, which are necessary for neural networks, digital optoelectronic processors, neural-fuzzy controllers.
Radiation-Hardened Wafer Scale Integration
1989-10-25
unlimited. LEXINGTON MASSACHUSETTS EXECUTIVE SUMMARY A focal plane processor (FPP) for a large array of LWIR photodetectors on a space platform must...It seems certain that large. scanning LWIR arrays will once again be of interest in the future, though their specifications will differ from those... nonuniformity and defects in the ZMR material, but films of good quality produced by this technique are now available commercially from Kopin Corporation. Such
Two-dimensional optoelectronic interconnect-processor and its operational bit error rate
NASA Astrophysics Data System (ADS)
Liu, J. Jiang; Gollsneider, Brian; Chang, Wayne H.; Carhart, Gary W.; Vorontsov, Mikhail A.; Simonis, George J.; Shoop, Barry L.
2004-10-01
Two-dimensional (2-D) multi-channel 8x8 optical interconnect and processor system were designed and developed using complementary metal-oxide-semiconductor (CMOS) driven 850-nm vertical-cavity surface-emitting laser (VCSEL) arrays and the photodetector (PD) arrays with corresponding wavelengths. We performed operation and bit-error-rate (BER) analysis on this free-space integrated 8x8 VCSEL optical interconnects driven by silicon-on-sapphire (SOS) circuits. Pseudo-random bit stream (PRBS) data sequence was used in operation of the interconnects. Eye diagrams were measured from individual channels and analyzed using a digital oscilloscope at data rates from 155 Mb/s to 1.5 Gb/s. Using a statistical model of Gaussian distribution for the random noise in the transmission, we developed a method to compute the BER instantaneously with the digital eye-diagrams. Direct measurements on this interconnects were also taken on a standard BER tester for verification. We found that the results of two methods were in the same order and within 50% accuracy. The integrated interconnects were investigated in an optoelectronic processing architecture of digital halftoning image processor. Error diffusion networks implemented by the inherently parallel nature of photonics promise to provide high quality digital halftoned images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davis, E.L.
A novel method for performing real-time acquisition and processing Landsat/EROS data covers all aspects including radiometric and geometric corrections of multispectral scanner or return-beam vidicon inputs, image enhancement, statistical analysis, feature extraction, and classification. Radiometric transformations include bias/gain adjustment, noise suppression, calibration, scan angle compensation, and illumination compensation, including topography and atmospheric effects. Correction or compensation for geometric distortion includes sensor-related distortions, such as centering, skew, size, scan nonlinearity, radial symmetry, and tangential symmetry. Also included are object image-related distortions such as aspect angle (altitude), scale distortion (altitude), terrain relief, and earth curvature. Ephemeral corrections are also applied to compensatemore » for satellite forward movement, earth rotation, altitude variations, satellite vibration, and mirror scan velocity. Image enhancement includes high-pass, low-pass, and Laplacian mask filtering and data restoration for intermittent losses. Resource classification is provided by statistical analysis including histograms, correlational analysis, matrix manipulations, and determination of spectral responses. Feature extraction includes spatial frequency analysis, which is used in parallel discriminant functions in each array processor for rapid determination. The technique uses integrated parallel array processors that decimate the tasks concurrently under supervision of a control processor. The operator-machine interface is optimized for programming ease and graphics image windowing.« less
Mobile and replicated alignment of arrays in data-parallel programs
NASA Technical Reports Server (NTRS)
Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert
1993-01-01
When a data-parallel language like FORTRAN 90 is compiled for a distributed-memory machine, aggregate data objects (such as arrays) are distributed across the processor memories. The mapping determines the amount of residual communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: first, an alignment that maps all the objects to an abstract template, and then a distribution that maps the template to the processors. We solve two facets of the problem of finding alignments that reduce residual communication: we determine alignments that vary in loops, and objects that should have replicated alignments. We show that loop-dependent mobile alignment is sometimes necessary for optimum performance, and we provide algorithms with which a compiler can determine good mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself (via spread operations) or can be used to improve performance. We propose an algorithm based on network flow that determines which objects to replicate so as to minimize the total amount of broadcast communication in replication. This work on mobile and replicated alignment extends our earlier work on determining static alignment.
MEMS-based system and image processing strategy for epiretinal prosthesis.
Xia, Peng; Hu, Jie; Qi, Jin; Gu, Chaochen; Peng, Yinghong
2015-01-01
Retinal prostheses have the potential to restore some level of visual function to the patients suffering from retinal degeneration. In this paper, an epiretinal approach with active stimulation devices is presented. The MEMS-based processing system consists of an external micro-camera, an information processor, an implanted electrical stimulator and a microelectrode array. The image processing strategy combining image clustering and enhancement techniques was proposed and evaluated by psychophysical experiments. The results indicated that the image processing strategy improved the visual performance compared with direct merging pixels to low resolution. The image processing methods assist epiretinal prosthesis for vision restoration.
Acoustically based fetal heart rate monitor
NASA Technical Reports Server (NTRS)
Baker, Donald A.; Zuckerwar, Allan J.
1991-01-01
The acoustically based fetal heart rate monitor permits an expectant mother to perform the fetal Non-Stress Test in her home. The potential market would include the one million U.S. pregnancies per year requiring this type of prenatal surveillance. The monitor uses polyvinylidene fluoride (PVF2) piezoelectric polymer film for the acoustic sensors, which are mounted in a seven-element array on a cummerbund. Evaluation of the sensor ouput signals utilizes a digital signal processor, which performs a linear prediction routine in real time. Clinical tests reveal that the acoustically based monitor provides Non-Stress Test records which are comparable to those obtained with a commercial ultrasonic transducer.
Using a Cray Y-MP as an array processor for a RISC Workstation
NASA Technical Reports Server (NTRS)
Lamaster, Hugh; Rogallo, Sarah J.
1992-01-01
As microprocessors increase in power, the economics of centralized computing has changed dramatically. At the beginning of the 1980's, mainframes and super computers were often considered to be cost-effective machines for scalar computing. Today, microprocessor-based RISC (reduced-instruction-set computer) systems have displaced many uses of mainframes and supercomputers. Supercomputers are still cost competitive when processing jobs that require both large memory size and high memory bandwidth. One such application is array processing. Certain numerical operations are appropriate to use in a Remote Procedure Call (RPC)-based environment. Matrix multiplication is an example of an operation that can have a sufficient number of arithmetic operations to amortize the cost of an RPC call. An experiment which demonstrates that matrix multiplication can be executed remotely on a large system to speed the execution over that experienced on a workstation is described.
NASA Technical Reports Server (NTRS)
Ramaswamy, Shankar; Banerjee, Prithviraj
1994-01-01
Appropriate data distribution has been found to be critical for obtaining good performance on Distributed Memory Multicomputers like the CM-5, Intel Paragon and IBM SP-1. It has also been found that some programs need to change their distributions during execution for better performance (redistribution). This work focuses on automatically generating efficient routines for redistribution. We present a new mathematical representation for regular distributions called PITFALLS and then discuss algorithms for redistribution based on this representation. One of the significant contributions of this work is being able to handle arbitrary source and target processor sets while performing redistribution. Another important contribution is the ability to handle an arbitrary number of dimensions for the array involved in the redistribution in a scalable manner. Our implementation of these techniques is based on an MPI-like communication library. The results presented show the low overheads for our redistribution algorithm as compared to naive runtime methods.
2010-07-01
imagery, persistent sensor array I. Introduction New device fabrication technologies and heterogeneous embedded processors have led to the emergence of a...geometric occlusions between target and sensor , motion blur, urban scene complexity, and high data volumes. In practical terms the targets are small...distributed airborne narrow-field-of-view video sensor networks. Airborne camera arrays combined with com- putational photography techniques enable the
NASA Technical Reports Server (NTRS)
Pang, Jackson; Pingree, Paula J.; Torgerson, J. Leigh
2006-01-01
We present the Telecommunications protocol processing subsystem using Reconfigurable Interoperable Gate Arrays (TRIGA), a novel approach that unifies fault tolerance, error correction coding and interplanetary communication protocol off-loading to implement CCSDS File Delivery Protocol and Datalink layers. The new reconfigurable architecture offers more than one order of magnitude throughput increase while reducing footprint requirements in memory, command and data handling processor utilization, communication system interconnects and power consumption.
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2015-04-01
PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks < number of processors) or tasks are divided out among the available processors (number of tasks > number of processors). Nested parallel statements may further subdivide the processor set owned by a given task. Tasks or processors are distributed evenly by default, but uneven distributions are possible under programmer control. It is also possible to explicitly enable child tasks to migrate within the processor set owned by their parent task, reducing load unbalancing at the potential cost of increased inter-processor message traffic. PM incorporates some programming structures from the earlier MIST language presented at a previous EGU General Assembly, while adopting a significantly different underlying parallelisation model and type system. PM code is available at www.pm-lang.org under an unrestrictive MIT license. Reference Ruymán Reyes, Antonio J. Dorta, Francisco Almeida, Francisco de Sande, 2009. Automatic Hybrid MPI+OpenMP Code Generation with llc, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science Volume 5759, 185-195
Face classification using electronic synapses
NASA Astrophysics Data System (ADS)
Yao, Peng; Wu, Huaqiang; Gao, Bin; Eryilmaz, Sukru Burc; Huang, Xueyao; Zhang, Wenqiang; Zhang, Qingtian; Deng, Ning; Shi, Luping; Wong, H.-S. Philip; Qian, He
2017-05-01
Conventional hardware platforms consume huge amount of energy for cognitive learning due to the data movement between the processor and the off-chip memory. Brain-inspired device technologies using analogue weight storage allow to complete cognitive tasks more efficiently. Here we present an analogue non-volatile resistive memory (an electronic synapse) with foundry friendly materials. The device shows bidirectional continuous weight modulation behaviour. Grey-scale face classification is experimentally demonstrated using an integrated 1024-cell array with parallel online training. The energy consumption within the analogue synapses for each iteration is 1,000 × (20 ×) lower compared to an implementation using Intel Xeon Phi processor with off-chip memory (with hypothetical on-chip digital resistive random access memory). The accuracy on test sets is close to the result using a central processing unit. These experimental results consolidate the feasibility of analogue synaptic array and pave the way toward building an energy efficient and large-scale neuromorphic system.
Face classification using electronic synapses.
Yao, Peng; Wu, Huaqiang; Gao, Bin; Eryilmaz, Sukru Burc; Huang, Xueyao; Zhang, Wenqiang; Zhang, Qingtian; Deng, Ning; Shi, Luping; Wong, H-S Philip; Qian, He
2017-05-12
Conventional hardware platforms consume huge amount of energy for cognitive learning due to the data movement between the processor and the off-chip memory. Brain-inspired device technologies using analogue weight storage allow to complete cognitive tasks more efficiently. Here we present an analogue non-volatile resistive memory (an electronic synapse) with foundry friendly materials. The device shows bidirectional continuous weight modulation behaviour. Grey-scale face classification is experimentally demonstrated using an integrated 1024-cell array with parallel online training. The energy consumption within the analogue synapses for each iteration is 1,000 × (20 ×) lower compared to an implementation using Intel Xeon Phi processor with off-chip memory (with hypothetical on-chip digital resistive random access memory). The accuracy on test sets is close to the result using a central processing unit. These experimental results consolidate the feasibility of analogue synaptic array and pave the way toward building an energy efficient and large-scale neuromorphic system.
NASA Technical Reports Server (NTRS)
Jacklin, S. A.; Leyland, J. A.; Warmbrodt, W.
1985-01-01
Modern control systems must typically perform real-time identification and control, as well as coordinate a host of other activities related to user interaction, online graphics, and file management. This paper discusses five global design considerations which are useful to integrate array processor, multimicroprocessor, and host computer system architectures into versatile, high-speed controllers. Such controllers are capable of very high control throughput, and can maintain constant interaction with the nonreal-time or user environment. As an application example, the architecture of a high-speed, closed-loop controller used to actively control helicopter vibration is briefly discussed. Although this system has been designed for use as the controller for real-time rotorcraft dynamics and control studies in a wind tunnel environment, the controller architecture can generally be applied to a wide range of automatic control applications.
Space Debris Detection on the HPDP, a Coarse-Grained Reconfigurable Array Architecture for Space
NASA Astrophysics Data System (ADS)
Suarez, Diego Andres; Bretz, Daniel; Helfers, Tim; Weidendorfer, Josef; Utzmann, Jens
2016-08-01
Stream processing, widely used in communications and digital signal processing applications, requires high- throughput data processing that is achieved in most cases using Application-Specific Integrated Circuit (ASIC) designs. Lack of programmability is an issue especially in space applications, which use on-board components with long life-cycles requiring applications updates. To this end, the High Performance Data Processor (HPDP) architecture integrates an array of coarse-grained reconfigurable elements to provide both flexible and efficient computational power suitable for stream-based data processing applications in space. In this work the capabilities of the HPDP architecture are demonstrated with the implementation of a real-time image processing algorithm for space debris detection in a space-based space surveillance system. The implementation challenges and alternatives are described making trade-offs to improve performance at the expense of negligible degradation of detection accuracy. The proposed implementation uses over 99% of the available computational resources. Performance estimations based on simulations show that the HPDP can amply match the application requirements.
Phase space simulation of collisionless stellar systems on the massively parallel processor
NASA Technical Reports Server (NTRS)
White, Richard L.
1987-01-01
A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two dimensional phase space grid (with one space and one velocity dimension). Some results from calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray-XMP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem.
Optical systolic solutions of linear algebraic equations
NASA Technical Reports Server (NTRS)
Neuman, C. P.; Casasent, D.
1984-01-01
The philosophy and data encoding possible in systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix-decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
Redundant disk arrays: Reliable, parallel secondary storage. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Gibson, Garth Alan
1990-01-01
During the past decade, advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of current disk subsystems, by leveraging parallelism, many times their performance. Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant arrays of inexpensive disks (RAID) use simple redundancy schemes to provide high data reliability. The data encoding, performance, and reliability of redundant disk arrays are investigated. Organizing redundant data into a disk array is treated as a coding problem. Among alternatives examined, codes as simple as parity are shown to effectively correct single, self-identifying disk failures.
Associative architecture for image processing
NASA Astrophysics Data System (ADS)
Adar, Rutie; Akerib, Avidan
1997-09-01
This article presents a new generation in parallel processing architecture for real-time image processing. The approach is implemented in a real time image processor chip, called the XiumTM-2, based on combining a fully associative array which provides the parallel engine with a serial RISC core on the same die. The architecture is fully programmable and can be programmed to implement a wide range of color image processing, computer vision and media processing functions in real time. The associative part of the chip is based on patented pending methodology of Associative Computing Ltd. (ACL), which condenses 2048 associative processors, each of 128 'intelligent' bits. Each bit can be a processing bit or a memory bit. At only 33 MHz and 0.6 micron manufacturing technology process, the chip has a computational power of 3 billion ALU operations per second and 66 billion string search operations per second. The fully programmable nature of the XiumTM-2 chip enables developers to use ACL tools to write their own proprietary algorithms combined with existing image processing and analysis functions from ACL's extended set of libraries.
ATCA digital controller hardware for vertical stabilization of plasmas in tokamaks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Batista, A. J. N.; Sousa, J.; Varandas, C. A. F.
2006-10-15
The efficient vertical stabilization (VS) of plasmas in tokamaks requires a fast reaction of the VS controller, for example, after detection of edge localized modes (ELM). For controlling the effects of very large ELMs a new digital control hardware, based on the Advanced Telecommunications Computing Architecture trade mark sign (ATCA), is being developed aiming to reduce the VS digital control loop cycle (down to an optimal value of 10 {mu}s) and improve the algorithm performance. The system has 1 ATCA trade mark sign processor module and up to 12 ATCA trade mark sign control modules, each one with 32 analogmore » input channels (12 bit resolution), 4 analog output channels (12 bit resolution), and 8 digital input/output channels. The Aurora trade mark sign and PCI Express trade mark sign communication protocols will be used for data transport, between modules, with expected latencies below 2 {mu}s. Control algorithms are implemented on a ix86 based processor with 6 Gflops and on field programmable gate arrays with 80 GMACS, interconnected by serial gigabit links in a full mesh topology.« less
The research and application of multi-biometric acquisition embedded system
NASA Astrophysics Data System (ADS)
Deng, Shichao; Liu, Tiegen; Guo, Jingjing; Li, Xiuyan
2009-11-01
The identification technology based on multi-biometric can greatly improve the applicability, reliability and antifalsification. This paper presents a multi-biometric system bases on embedded system, which includes: three capture daughter boards are applied to obtain different biometric: one each for fingerprint, iris and vein of the back of hand; FPGA (Field Programmable Gate Array) is designed as coprocessor, which uses to configure three daughter boards on request and provides data path between DSP (digital signal processor) and daughter boards; DSP is the master processor and its functions include: control the biometric information acquisition, extracts feature as required and responsible for compare the results with the local database or data server through network communication. The advantages of this system were it can acquire three different biometric in real time, extracts complexity feature flexibly in different biometrics' raw data according to different purposes and arithmetic and network interface on the core-board will be the solution of big data scale. Because this embedded system has high stability, reliability, flexibility and fit for different data scale, it can satisfy the demand of multi-biometric recognition.
O'Sullivan, G.A.; O'Sullivan, J.A.
1999-07-27
In one embodiment, a power processor which operates in three modes: an inverter mode wherein power is delivered from a battery to an AC power grid or load; a battery charger mode wherein the battery is charged by a generator; and a parallel mode wherein the generator supplies power to the AC power grid or load in parallel with the battery. In the parallel mode, the system adapts to arbitrary non-linear loads. The power processor may operate on a per-phase basis wherein the load may be synthetically transferred from one phase to another by way of a bumpless transfer which causes no interruption of power to the load when transferring energy sources. Voltage transients and frequency transients delivered to the load when switching between the generator and battery sources are minimized, thereby providing an uninterruptible power supply. The power processor may be used as part of a hybrid electrical power source system which may contain, in one embodiment, a photovoltaic array, diesel engine, and battery power sources. 31 figs.
O'Sullivan, George A.; O'Sullivan, Joseph A.
1999-01-01
In one embodiment, a power processor which operates in three modes: an inverter mode wherein power is delivered from a battery to an AC power grid or load; a battery charger mode wherein the battery is charged by a generator; and a parallel mode wherein the generator supplies power to the AC power grid or load in parallel with the battery. In the parallel mode, the system adapts to arbitrary non-linear loads. The power processor may operate on a per-phase basis wherein the load may be synthetically transferred from one phase to another by way of a bumpless transfer which causes no interruption of power to the load when transferring energy sources. Voltage transients and frequency transients delivered to the load when switching between the generator and battery sources are minimized, thereby providing an uninterruptible power supply. The power processor may be used as part of a hybrid electrical power source system which may contain, in one embodiment, a photovoltaic array, diesel engine, and battery power sources.
Environmentally adaptive processing for shallow ocean applications: A sequential Bayesian approach.
Candy, J V
2015-09-01
The shallow ocean is a changing environment primarily due to temperature variations in its upper layers directly affecting sound propagation throughout. The need to develop processors capable of tracking these changes implies a stochastic as well as an environmentally adaptive design. Bayesian techniques have evolved to enable a class of processors capable of performing in such an uncertain, nonstationary (varying statistics), non-Gaussian, variable shallow ocean environment. A solution to this problem is addressed by developing a sequential Bayesian processor capable of providing a joint solution to the modal function tracking and environmental adaptivity problem. Here, the focus is on the development of both a particle filter and an unscented Kalman filter capable of providing reasonable performance for this problem. These processors are applied to hydrophone measurements obtained from a vertical array. The adaptivity problem is attacked by allowing the modal coefficients and/or wavenumbers to be jointly estimated from the noisy measurement data along with tracking of the modal functions while simultaneously enhancing the noisy pressure-field measurements.
Software-Reconfigurable Processors for Spacecraft
NASA Technical Reports Server (NTRS)
Farrington, Allen; Gray, Andrew; Bell, Bryan; Stanton, Valerie; Chong, Yong; Peters, Kenneth; Lee, Clement; Srinivasan, Jeffrey
2005-01-01
A report presents an overview of an architecture for a software-reconfigurable network data processor for a spacecraft engaged in scientific exploration. When executed on suitable electronic hardware, the software performs the functions of a physical layer (in effect, acts as a software radio in that it performs modulation, demodulation, pulse-shaping, error correction, coding, and decoding), a data-link layer, a network layer, a transport layer, and application-layer processing of scientific data. The software-reconfigurable network processor is undergoing development to enable rapid prototyping and rapid implementation of communication, navigation, and scientific signal-processing functions; to provide a long-lived communication infrastructure; and to provide greatly improved scientific-instrumentation and scientific-data-processing functions by enabling science-driven in-flight reconfiguration of computing resources devoted to these functions. This development is an extension of terrestrial radio and network developments (e.g., in the cellular-telephone industry) implemented in software running on such hardware as field-programmable gate arrays, digital signal processors, traditional digital circuits, and mixed-signal application-specific integrated circuits (ASICs).
Dynamic data distributions in Vienna Fortran
NASA Technical Reports Server (NTRS)
Chapman, Barbara; Mehrotra, Piyush; Moritsch, Hans; Zima, Hans
1993-01-01
Vienna Fortran is a machine-independent language extension of Fortran, which is based upon the Single-Program-Multiple-Data (SPMD) paradigm and allows the user to write programs for distributed-memory systems using global addresses. The language features focus mainly on the issue of distributing data across virtual processor structures. Those features of Vienna Fortran that allow the data distributions of arrays to change dynamically, depending on runtime conditions are discussed. The relevant language features are discussed, their implementation is outlined, and how they may be used in applications is described.
Scalable Engineering of Quantum Optical Information Processing Architectures (SEQUOIA)
2016-12-13
arrays. Figure 4: An 8-channel fiber-coupled SNSPD array. 1.4 Post -fabrication-tunable linear optic fabrication We have analyzed the...performance of the programmable nanophotonic processor (PNP) that is dynamically tunable via post -fabrication active phase tuning to predict the scaling of...various device losses. PACS numbers: 42.50. Ex , 03.67.Dd, 03.67.Lx, 42.50.Dv I. INTRODUCTION Quantum key distribution (QKD) enables two distant authenticated
A new measuring machine in Paris
NASA Technical Reports Server (NTRS)
Guibert, J.; Charvin, P.
1984-01-01
A new photographic measuring machine is under construction at the Paris Observatory. The amount of transmitted light is measured by a linear array of 1024 photodiodes. Carriage control, data acquisition and on line processing are performed by microprocessors, a S.E.L. 32/27 computer, and an AP 120-B Array Processor. It is expected that a Schmidt telescope plate of size 360 mm square will be scanned in one hour with pixel size of ten microns.
Testability Design Rating System: Testability Handbook. Volume 1
1992-02-01
4-10 4.7.5 Summary of False BIT Alarms (FBA) ............................. 4-10 4.7.6 Smart BIT Technique...Circuit Board PGA Pin Grid Array PLA Programmable Logic Array PLD Programmable Logic Device PN Pseudo-Random Number PREDICT Probabilistic Estimation of...11 4.7.6 Smart BIT ( reference: RADC-TR-85-198). " Smart " BIT is a term given to BIT circuitry in a system LRU which includes dedicated processor/memory
NASA Astrophysics Data System (ADS)
Yen, J. L.; Kremer, P.; Amin, N.; Fung, J.
1989-05-01
The Department of National Defence (Canada) has been conducting studies into multi-beam adaptive arrays for extremely high frequency (EHF) frequency hopped signals. A three-beam 43 GHz adaptive antenna and a beam control processor is under development. An interactive software package for the operation of the array, capable of applying different control algorithms is being written. A maximum signal to jammer plus noise ratio (SJNR) was found to provide superior performance in preventing degradation of user signals in the presence of nearby jammers. A new fast algorithm using a modified conjugate gradient approach was found to be a very efficient way to implement anti-jamming arrays based on maximum SJNR criterion. The present study was intended to refine and simplify this algorithm and to implement the algorithm on an experimental array for real-time evaluation of anti-jamming performance. A three-beam adaptive array was used. A simulation package was used in the evaluation of multi-beam systems using more than three beams and different user-jammer scenarios. An attempt to further reduce the computation burden through continued analysis of maximum SJNR met with limited success. A method to acquire and track an incoming laser beam is proposed.
NASA Astrophysics Data System (ADS)
Yen, J. L.; Kremer, P.; Fung, J.
1990-05-01
The Department of National Defence (Canada) has been conducting studies into multi-beam adaptive arrays for extremely high frequency (EHF) frequency hopped signals. A three-beam 43 GHz adaptive antenna and a beam control processor is under development. An interactive software package for the operation of the array, capable of applying different control algorithms is being written. A maximum signal to jammer plus noise ratio (SJNR) has been found to provide superior performance in preventing degradation of user signals in the presence of nearby jammers. A new fast algorithm using a modified conjugate gradient approach has been found to be a very efficient way to implement anti-jamming arrays based on maximum SJNR criterion. The present study was intended to refine and simplify this algorithm and to implement the algorithm on an experimental array for real-time evaluation of anti-jamming performance. A three-beam adaptive array was used. A simulation package was used in the evaluation of multi-beam systems using more than three beams and different user-jammer scenarios. An attempt to further reduce the computation burden through further analysis of maximum SJNR met with limited success. The investigation of a new angle detector for spatial tracking in heterodyne laser space communications was completed.
NASA Technical Reports Server (NTRS)
2008-01-01
Topics covered include: WRATS Integrated Data Acquisition System; Breadboard Signal Processor for Arraying DSN Antennas; Digital Receiver Phase Meter; Split-Block Waveguide Polarization Twist for 220 to 325 GHz; Nano-Multiplication-Region Avalanche Photodiodes and Arrays; Tailored Asymmetry for Enhanced Coupling to WGM Resonators; Disabling CNT Electronic Devices by Use of Electron Beams; Conical Bearingless Motor/Generators; Integrated Force Method for Indeterminate Structures; Carbon-Nanotube-Based Electrodes for Biomedical Applications; Compact Directional Microwave Antenna for Localized Heating; Using Hyperspectral Imagery to Identify Turfgrass Stresses; Shaping Diffraction-Grating Grooves to Optimize Efficiency; Low-Light-Shift Cesium Fountain without Mechanical Shutters; Magnetic Compensation for Second-Order Doppler Shift in LITS; Nanostructures Exploit Hybrid-Polariton Resonances; Microfluidics, Chromatography, and Atomic-Force Microscopy; Model of Image Artifacts from Dust Particles; Pattern-Recognition System for Approaching a Known Target; Orchestrator Telemetry Processing Pipeline; Scheme for Quantum Computing Immune to Decoherence; Spin-Stabilized Microsatellites with Solar Concentrators; Phase Calibration of Antenna Arrays Aimed at Spacecraft; Ring Bus Architecture for a Solid-State Recorder; and Image Compression Algorithm Altered to Improve Stereo Ranging.
Efficacy of Code Optimization on Cache-based Processors
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Chancellor, Marisa K. (Technical Monitor)
1997-01-01
The current common wisdom in the U.S. is that the powerful, cost-effective supercomputers of tomorrow will be based on commodity (RISC) micro-processors with cache memories. Already, most distributed systems in the world use such hardware as building blocks. This shift away from vector supercomputers and towards cache-based systems has brought about a change in programming paradigm, even when ignoring issues of parallelism. Vector machines require inner-loop independence and regular, non-pathological memory strides (usually this means: non-power-of-two strides) to allow efficient vectorization of array operations. Cache-based systems require spatial and temporal locality of data, so that data once read from main memory and stored in high-speed cache memory is used optimally before being written back to main memory. This means that the most cache-friendly array operations are those that feature zero or unit stride, so that each unit of data read from main memory (a cache line) contains information for the next iteration in the loop. Moreover, loops ought to be 'fat', meaning that as many operations as possible are performed on cache data-provided instruction caches do not overflow and enough registers are available. If unit stride is not possible, for example because of some data dependency, then care must be taken to avoid pathological strides, just ads on vector computers. For cache-based systems the issues are more complex, due to the effects of associativity and of non-unit block (cache line) size. But there is more to the story. Most modern micro-processors are superscalar, which means that they can issue several (arithmetic) instructions per clock cycle, provided that there are enough independent instructions in the loop body. This is another argument for providing fat loop bodies. With these restrictions, it appears fairly straightforward to produce code that will run efficiently on any cache-based system. It can be argued that although some of the important computational algorithms employed at NASA Ames require different programming styles on vector machines and cache-based machines, respectively, neither architecture class appeared to be favored by particular algorithms in principle. Practice tells us that the situation is more complicated. This report presents observations and some analysis of performance tuning for cache-based systems. We point out several counterintuitive results that serve as a cautionary reminder that memory accesses are not the only factors that determine performance, and that within the class of cache-based systems, significant differences exist.
High-frequency ultrasound annular array imaging. Part II: digital beamformer design and imaging.
Hu, Chang-Hong; Snook, Kevin A; Cao, Pei-Jie; Shung, K Kirk
2006-02-01
This is the second part of a two-paper series reporting a recent effort in the development of a high-frequency annular array ultrasound imaging system. In this paper an imaging system composed of a six-element, 43 MHz annular array transducer, a six-channel analog front-end, a field programmable gate array (FPGA)-based beamformer, and a digital signal processor (DSP) microprocessor-based scan converter will be described. A computer is used as the interface for image display. The beamformer that applies delays to the echoes for each channel is implemented with the strategy of combining the coarse and fine delays. The coarse delays that are integer multiples of the clock periods are achieved by using a first-in-first-out (FIFO) structure, and the fine delays are obtained with a fractional delay (FD) filter. Using this principle, dynamic receiving focusing is achieved. The image from a wire phantom obtained with the imaging system was compared to that from a prototype ultrasonic backscatter microscope with a 45 MHz single-element transducer. The improved lateral resolution and depth of field from the wire phantom image were observed. Images from an excised rabbit eye sample also were obtained, and fine anatomical structures were discerned.
A special purpose silicon compiler for designing supercomputing VLSI systems
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.
1991-01-01
Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex. Hence a novel design methodology has to be developed. For designing such complex systems a special purpose silicon compiler is needed in which: the computational and communicational structures of different numeric algorithms should be taken into account to simplify the silicon compiler design, the approach is macrocell based, and the software tools at different levels (algorithm down to the VLSI circuit layout) should get integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate get reduced over the silicon compilers based on PLA's, SLA's, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLA's, SLA's, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.
Replication of Space-Shuttle Computers in FPGAs and ASICs
NASA Technical Reports Server (NTRS)
Ferguson, Roscoe C.
2008-01-01
A document discusses the replication of the functionality of the onboard space-shuttle general-purpose computers (GPCs) in field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). The purpose of the replication effort is to enable utilization of proven space-shuttle flight software and software-development facilities to the extent possible during development of software for flight computers for a new generation of launch vehicles derived from the space shuttles. The replication involves specifying the instruction set of the central processing unit and the input/output processor (IOP) of the space-shuttle GPC in a hardware description language (HDL). The HDL is synthesized to form a "core" processor in an FPGA or, less preferably, in an ASIC. The core processor can be used to create a flight-control card to be inserted into a new avionics computer. The IOP of the GPC as implemented in the core processor could be designed to support data-bus protocols other than that of a multiplexer interface adapter (MIA) used in the space shuttle. Hence, a computer containing the core processor could be tailored to communicate via the space-shuttle GPC bus and/or one or more other buses.
Dynamically programmable cache
NASA Astrophysics Data System (ADS)
Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas
1998-10-01
Reconfigurable machines have recently been used as co- processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machines. As a result DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implemented multi-context switching (Virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations resulting in faster execution time. In this paper, the speedup improvement for DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
Multiple-access phased array antenna simulator for a digital beam-forming system investigation
NASA Technical Reports Server (NTRS)
Kerczewski, Robert J.; Yu, John; Walton, Joanne C.; Perl, Thomas D.; Andro, Monty; Alexovich, Robert E.
1992-01-01
Future versions of data relay satellite systems are currently being planned by NASA. Being given consideration for implementation are on-board digital beamforming techniques which will allow multiple users to simultaneously access a single S-band phased array antenna system. To investigate the potential performance of such a system, a laboratory simulator has been developed at NASA's Lewis Research Center. This paper describes the system simulator, and in particular, the requirements, design and performance of a key subsystem, the phased array antenna simulator, which provides realistic inputs to the digital processor including multiple signals, noise, and nonlinearities.
Multiple-access phased array antenna simulator for a digital beam forming system investigation
NASA Technical Reports Server (NTRS)
Kerczewski, Robert J.; Yu, John; Walton, Joanne C.; Perl, Thomas D.; Andro, Monty; Alexovich, Robert E.
1992-01-01
Future versions of data relay satellite systems are currently being planned by NASA. Being given consideration for implementation are on-board digital beamforming techniques which will allow multiple users to simultaneously access a single S-band phased array antenna system. To investigate the potential performance of such a system, a laboratory simulator has been developed at NASA's Lewis Research Center. This paper describes the system simulator, and in particular, the requirements, design, and performance of a key subsystem, the phased array antenna simulator, which provides realistic inputs to the digital processor including multiple signals, noise, and nonlinearities.
NASA Technical Reports Server (NTRS)
2001-01-01
Traditional spacecraft power systems incorporate a solar array energy source, an energy storage element (battery), and battery charge control and bus voltage regulation electronics to provide continuous electrical power for spacecraft systems and instruments. Dedicated power conditioning components provide limited fault isolation between systems and instruments, while a centralized power-switching unit provides spacecraft load control. Battery undervoltage conditions are detected by the spacecraft processor, which removes fault conditions and non-critical loads before permanent battery damage can occur. Cost effective operation of a micro-sat constellation requires a fault tolerant spacecraft architecture that minimizes on-orbit operational costs by permitting autonomous reconfiguration in response to unexpected fault conditions. A new micro-sat power system architecture that enhances spacecraft fault tolerance and improves power system survivability by continuously managing the battery charge and discharge processes on a cell-by-cell basis has been developed. This architecture is based on the Integrated Power Source (US patent 5644207), which integrates dual junction solar cells, Lithium Ion battery cells, and processor based charge control electronics into a structural panel that can be deployed or used to form a portion of the outer shell of a micro-spacecraft. The first generation Integrated Power Source is configured as a one inch thick panel in which prismatic Lithium Ion battery cells are arranged in a 3x7 matrix (26VDC) and a 3x1 matrix (3.7VDC) to provide the required output voltages and load currents. A multi-layer structure holds the battery cells, as well as the thermal insulators that are necessary to protect the Lithium Ion battery cells from the extreme temperatures of the solar cell layer. Independent thermal radiators, located on the back of the panel, are dedicated to the solar cell array, the electronics, and the battery cell array. In deployed panel applications, these radiators maintain the battery cells in an appropriate operational temperature range.
A vector scanning processing technique for pulsed laser velocimetry
NASA Technical Reports Server (NTRS)
Wernet, Mark P.; Edwards, Robert V.
1989-01-01
Pulsed-laser-sheet velocimetry yields two-dimensional velocity vectors across an extended planar region of a flow. Current processing techniques offer high-precision (1-percent) velocity estimates, but can require hours of processing time on specialized array processors. Sometimes, however, a less accurate (about 5 percent) data-reduction technique which also gives unambiguous velocity vector information is acceptable. Here, a direct space-domain processing technique is described and shown to be far superior to previous methods in achieving these objectives. It uses a novel data coding and reduction technique and has no 180-deg directional ambiguity. A complex convection vortex flow was recorded and completely processed in under 2 min on an 80386-based PC, producing a two-dimensional velocity-vector map of the flowfield. Pulsed-laser velocimetry data can thus be reduced quickly and reasonably accurately, without specialized array processing hardware.
Embedded System Implementation on FPGA System With μCLinux OS
NASA Astrophysics Data System (ADS)
Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna
2011-02-01
Embedded systems are taking on more complicated tasks as the processors involved become more powerful. The embedded systems have been widely used in many areas such as in industries, automotives, medical imaging, communications, speech recognition and computer vision. The complexity requirements in hardware and software nowadays need a flexibility system for further enhancement in any design without adding new hardware. Therefore, any changes in the design system will affect the processor that need to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using Field Programmable Gate Array (FPGA). A softcore processor, NIOS II 32-bit RISC, which is the microprocessor core was utilized in FPGA system together with the embedded operating system(OS), μClinux. In this paper, an example of web server is explained and demonstrated
Evaluation and application of a fast module in a PLC based interlock and control system
NASA Astrophysics Data System (ADS)
Zaera-Sanz, M.
2009-08-01
The LHC Beam Interlock system requires a controller performing a simple matrix function to collect the different beam dump requests. To satisfy the expected safety level of the Interlock, the system should be robust and reliable. The PLC is a promising candidate to fulfil both aspects but too slow to meet the expected response time which is of the order of μseconds. Siemens has introduced a ``so called'' fast module (FM352-5 Boolean Processor). It provides independent and extremely fast control of a process within a larger control system using an onboard processor, a Field Programmable Gate Array (FPGA), to execute code in parallel which results in extremely fast scan times. It is interesting to investigate its features and to evaluate it as a possible candidate for the beam interlock system. This paper publishes the results of this study. As well, this paper could be useful for other applications requiring fast processing using a PLC.
Concurrent and Accurate Short Read Mapping on Multicore Processors.
Martínez, Héctor; Tárraga, Joaquín; Medina, Ignacio; Barrachina, Sergio; Castillo, Maribel; Dopazo, Joaquín; Quintana-Ortí, Enrique S
2015-01-01
We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA (HPG Aligner SA is an open-source application. The software is available at http://www.opencb.org, exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), as well as leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of HPG Aligner SA, on RNA reads of 100-400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.
The Fermilab lattice supercomputer project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fischler, M.; Atac, R.; Cook, A.
1989-02-01
The ACPMAPS system is a highly cost effective, local memory MIMD computer targeted at algorithm development and production running for gauge theory on the lattice. The machine consists of a compound hypercube of crates, each of which is a full crossbar switch containing several processors. The processing nodes are single board array processors based on the Weitek XL chip set, each with a peak power of 20 MFLOPS and supported by 8 MBytes of data memory. The system currently being assembled has a peak power of 5 GFLOPS, delivering performance at approximately $250/MFLOP. The system is programmable in C andmore » Fortran. An underpinning of software routines (CANOPY) provides an easy and natural way of coding lattice problems, such that the details of parallelism, and communication and system architecture are transparent to the user. CANOPY can easily be ported to any single CPU or MIMD system which supports C, and allows the coding of typical applications with very little effort. 3 refs., 1 fig.« less
Status report of the end-to-end ASKAP software system: towards early science operations
NASA Astrophysics Data System (ADS)
Guzman, Juan Carlos; Chapman, Jessica; Marquarding, Malte; Whiting, Matthew
2016-08-01
The Australian SKA Pathfinder (ASKAP) is a novel centimetre radio synthesis telescope currently in the commissioning phase and located in the midwest region of Western Australia. It comprises of 36 x 12 m diameter reflector antennas each equipped with state-of-the-art and award winning Phased Array Feeds (PAF) technology. The PAFs provide a wide, 30 square degree field-of-view by forming up to 36 separate dual-polarisation beams at once. This results in a high data rate: 70 TB of correlated visibilities in an 8-hour observation, requiring custom-written, high-performance software running in dedicated High Performance Computing (HPC) facilities. The first six antennas equipped with first-generation PAF technology (Mark I), named the Boolardy Engineering Test Array (BETA) have been in use since 2014 as a platform to test PAF calibration and imaging techniques, and along the way it has been producing some great science results. Commissioning of the ASKAP Array Release 1, that is the first six antennas with second-generation PAFs (Mark II) is currently under way. An integral part of the instrument is the Central Processor platform hosted at the Pawsey Supercomputing Centre in Perth, which executes custom-written software pipelines, designed specifically to meet the ASKAP imaging requirements of wide field of view and high dynamic range. There are three key hardware components of the Central Processor: The ingest nodes (16 x node cluster), the fast temporary storage (1 PB Lustre file system) and the processing supercomputer (200 TFlop system). This High-Performance Computing (HPC) platform is managed and supported by the Pawsey support team. Due to the limited amount of data generated by BETA and the first ASKAP Array Release, the Central Processor platform has been running in a more "traditional" or user-interactive mode. But this is about to change: integration and verification of the online ingest pipeline starts in early 2016, which is required to support the full 300 MHz bandwidth for Array Release 1; followed by the deployment of the real-time data processing components. In addition to the Central Processor, the first production release of the CSIRO ASKAP Science Data Archive (CASDA) has also been deployed in one of the Pawsey Supercomputing Centre facilities and it is integrated to the end-to-end ASKAP data flow system. This paper describes the current status of the "end-to-end" data flow software system from preparing observations to data acquisition, processing and archiving; and the challenges of integrating an HPC facility as a key part of the instrument. It also shares some lessons learned since the start of integration activities and the challenges ahead in preparation for the start of the Early Science program.
High-Performance, Radiation-Hardened Electronics for Space Environments
NASA Technical Reports Server (NTRS)
Keys, Andrew S.; Watson, Michael D.; Frazier, Donald O.; Adams, James H.; Johnson, Michael A.; Kolawa, Elizabeth A.
2007-01-01
The Radiation Hardened Electronics for Space Environments (RHESE) project endeavors to advance the current state-of-the-art in high-performance, radiation-hardened electronics and processors, ensuring successful performance of space systems required to operate within extreme radiation and temperature environments. Because RHESE is a project within the Exploration Technology Development Program (ETDP), RHESE's primary customers will be the human and robotic missions being developed by NASA's Exploration Systems Mission Directorate (ESMD) in partial fulfillment of the Vision for Space Exploration. Benefits are also anticipated for NASA's science missions to planetary and deep-space destinations. As a technology development effort, RHESE provides a broad-scoped, full spectrum of approaches to environmentally harden space electronics, including new materials, advanced design processes, reconfigurable hardware techniques, and software modeling of the radiation environment. The RHESE sub-project tasks are: SelfReconfigurable Electronics for Extreme Environments, Radiation Effects Predictive Modeling, Radiation Hardened Memory, Single Event Effects (SEE) Immune Reconfigurable Field Programmable Gate Array (FPGA) (SIRF), Radiation Hardening by Software, Radiation Hardened High Performance Processors (HPP), Reconfigurable Computing, Low Temperature Tolerant MEMS by Design, and Silicon-Germanium (SiGe) Integrated Electronics for Extreme Environments. These nine sub-project tasks are managed by technical leads as located across five different NASA field centers, including Ames Research Center, Goddard Space Flight Center, the Jet Propulsion Laboratory, Langley Research Center, and Marshall Space Flight Center. The overall RHESE integrated project management responsibility resides with NASA's Marshall Space Flight Center (MSFC). Initial technology development emphasis within RHESE focuses on the hardening of Field Programmable Gate Arrays (FPGA)s and Field Programmable Analog Arrays (FPAA)s for use in reconfigurable architectures. As these component/chip level technologies mature, the RHESE project emphasis shifts to focus on efforts encompassing total processor hardening techniques and board-level electronic reconfiguration techniques featuring spare and interface modularity. This phased approach to distributing emphasis between technology developments provides hardened FPGA/FPAAs for early mission infusion, then migrates to hardened, board-level, high speed processors with associated memory elements and high density storage for the longer duration missions encountered for Lunar Outpost and Mars Exploration occurring later in the Constellation schedule.
Compact optical processor for Hough and frequency domain features
NASA Astrophysics Data System (ADS)
Ott, Peter
1996-11-01
Shape recognition is necessary in a broad band of applications such as traffic sign or work piece recognition. It requires not only neighborhood processing of the input image pixels but global interconnection of them. The Hough transform (HT) performs such a global operation and it is well suited in the preprocessing stage of a shape recognition system. Translation invariant features can be easily calculated form the Hough domain. We have implemented on the computer a neural network shape recognition system which contains a HT, a feature extraction, and a classification layer. The advantage of this approach is that the total system can be optimized with well-known learning techniques and that it can explore the parallelism of the algorithms. However, the HT is a time consuming operation. Parallel, optical processing is therefore advantageous. Several systems have been proposed, based on space multiplexing with arrays of holograms and CGH's or time multiplexing with acousto-optic processors or by image rotation with incoherent and coherent astigmatic optical processors. We took up the last mentioned approach because 2D array detectors are read out line by line, so a 2D detector can achieve the same speed and is easier to implement. Coherent processing can allow the implementation of tilers in the frequency domain. Features based on wedge/ring, Gabor, or wavelet filters have been proven to show good discrimination capabilities for texture and shape recognition. The astigmatic lens system which is derived form the mathematical formulation of the HT is long and contains a non-standard, astigmatic element. By methods of lens transformation s for coherent applications we map the original design to a shorter lens with a smaller number of well separated standard elements and with the same coherent system response. The final lens design still contains the frequency plane for filtering and ray-tracing shows diffraction limited performance. Image rotation can be done optically by a rotating prism. We realize it on a fast FLC- SLM of our lab as input device. The filters can be implemented on the same type of SLM with 128 by 128 square pixels of size, resulting in a total length of the lens of less than 50cm.
A Parallel Vector Machine for the PM Programming Language
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2016-04-01
PM is a new programming language which aims to make the writing of computational geoscience models on parallel hardware accessible to scientists who are not themselves expert parallel programmers. It is based around the concept of communicating operators: language constructs that enable variables local to a single invocation of a parallelised loop to be viewed as if they were arrays spanning the entire loop domain. This mechanism enables different loop invocations (which may or may not be executing on different processors) to exchange information in a manner that extends the successful Communicating Sequential Processes idiom from single messages to collective communication. Communicating operators avoid the additional synchronisation mechanisms, such as atomic variables, required when programming using the Partitioned Global Address Space (PGAS) paradigm. Using a single loop invocation as the fundamental unit of concurrency enables PM to uniformly represent different levels of parallelism from vector operations through shared memory systems to distributed grids. This paper describes an implementation of PM based on a vectorised virtual machine. On a single processor node, concurrent operations are implemented using masked vector operations. Virtual machine instructions operate on vectors of values and may be unmasked, masked using a Boolean field, or masked using an array of active vector cell locations. Conditional structures (such as if-then-else or while statement implementations) calculate and apply masks to the operations they control. A shift in mask representation from Boolean to location-list occurs when active locations become sufficiently sparse. Parallel loops unfold data structures (or vectors of data structures for nested loops) into vectors of values that may additionally be distributed over multiple computational nodes and then split into micro-threads compatible with the size of the local cache. Inter-node communication is accomplished using standard OpenMP and MPI. Performance analyses of the PM vector machine, demonstrating its scaling properties with respect to domain size and the number of processor nodes will be presented for a range of hardware configurations. The PM software and language definition are being made available under unrestrictive MIT and Creative Commons Attribution licenses respectively: www.pm-lang.org.
NASA Astrophysics Data System (ADS)
Urfianto, Mohammad Zalfany; Isshiki, Tsuyoshi; Khan, Arif Ullah; Li, Dongju; Kunieda, Hiroaki
This paper presentss a Multiprocessor System-on-Chips (MPSoC) architecture used as an execution platform for the new C-language based MPSoC design framework we are currently developing. The MPSoC architecture is based on an existing SoC platform with a commercial RISC core acting as the host CPU. We extend the existing SoC with a multiprocessor-array block that is used as the main engine to run parallel applications modeled in our design framework. Utilizing several optimizations provided by our compiler, an efficient inter-communication between processing elements with minimum overhead is implemented. A host-interface is designed to integrate the existing RISC core to the multiprocessor-array. The experimental results show that an efficacious integration is achieved, proving that the designed communication module can be used to efficiently incorporate off-the-shelf processors as a processing element for MPSoC architectures designed using our framework.
Goavec-Mérou, G; Chrétien, N; Friedt, J-M; Sandoz, P; Martin, G; Lenczner, M; Ballandras, S
2014-01-01
Vibrating mechanical structure characterization is demonstrated using contactless techniques best suited for mobile and rotating equipments. Fast measurement rates are achieved using Field Programmable Gate Array (FPGA) devices as real-time digital signal processors. Two kinds of algorithms are implemented on FPGA and experimentally validated in the case of the vibrating tuning fork. A first application concerns in-plane displacement detection by vision with sampling rates above 10 kHz, thus reaching frequency ranges above the audio range. A second demonstration concerns pulsed-RADAR cooperative target phase detection and is applied to radiofrequency acoustic transducers used as passive wireless strain gauges. In this case, the 250 ksamples/s refresh rate achieved is only limited by the acoustic sensor design but not by the detection bandwidth. These realizations illustrate the efficiency, interest, and potentialities of FPGA-based real-time digital signal processing for the contactless interrogation of passive embedded probes with high refresh rates.
FPGA-Based, Self-Checking, Fault-Tolerant Computers
NASA Technical Reports Server (NTRS)
Some, Raphael; Rennels, David
2004-01-01
A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and highspeed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant- computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing identical programs in lock step, with comparison of their outputs to detect errors. It would also contain various cache local memory circuits, communication circuits, and configurable special-purpose processors that would use self-checking checkers. (The basic principle of the self-checking checker method is to utilize logic circuitry that generates error signals whenever there is an error in either the checker or the circuit being checked.) The memory system would comprise a main memory and a hardware-controlled check-pointing system (CPS) based on a buffer memory denoted the recovery cache. The main memory would contain random-access memory (RAM) chips and FPGAs that would, in addition to everything else, implement double-error-detecting and single-error-correcting memory functions to enable recovery from single-bit errors.
Atmospheric Modeling And Sensor Simulation (AMASS) study
NASA Technical Reports Server (NTRS)
Parker, K. G.
1985-01-01
A 4800 band synchronous communications link was established between the Perkin-Elmer (P-E) 3250 Atmospheric Modeling and Sensor Simulation (AMASS) system and the Cyber 205 located at the Goddard Space Flight Center. An extension study of off-the-shelf array processors offering standard interface to the Perkin-Elmer was conducted to determine which would meet computational requirements of the division. A Floating Point Systems AP-120B was borrowed from another Marshall Space Flight Center laboratory for evaluation. It was determined that available array processors did not offer significantly more capabilities than the borrowed unit, although at least three other vendors indicated that standard Perkin-Elmer interfaces would be marketed in the future. Therefore, the recommendation was made to continue to utilize the 120B ad to keep monitoring the AP market. Hardware necessary to support requirements of the ASD as well as to enhance system performance was specified and procured. Filters were implemented on the Harris/McIDAS system including two-dimensional lowpass, gradient, Laplacian, and bicubic interpolation routines.
A new multifunction acousto-optic signal processor
NASA Technical Reports Server (NTRS)
Berg, N. J.; Casseday, M. W.; Filipov, A. N.; Pellegrino, J. M.
1984-01-01
An acousto-optic architecture for simultaneously obtaining time integration correlation and high-speed power spectrum analysis was constructed using commercially available TeO2 modulators and photodiode detector-arrays. The correlator section of the processor uses coherent interferometry to attain maximum bandwidth and dynamic range while achieving a time-bandwidth product of 1 million. Two correllator outputs are achieved in this system configuration. One is optically filtered and magnified 2 : 1 to decrease the spatial frequency to a level where a 25-MHz bandwidth may be sampled by a 62-mm array with elements on 25-micro centers. The other output is magnified by a factor of 10 such that the center 4 microseconds of information is available for estimation of time-difference-of-arrival to within 10 ns. The Bragg cell spectrum-analyzer section, which also has two outputs, resolves a 25-MHz instantaneous bandwidth to 25 kHz and can determine discrete-frequency reception time to within 15 microseconds. A microprocessor combines spectrum analysis information with that obtained from the correlator.
NASA Astrophysics Data System (ADS)
Zou, Liang; Fu, Zhuang; Zhao, YanZheng; Yang, JunYan
2010-07-01
This paper proposes a kind of pipelined electric circuit architecture implemented in FPGA, a very large scale integrated circuit (VLSI), which efficiently deals with the real time non-uniformity correction (NUC) algorithm for infrared focal plane arrays (IRFPA). Dual Nios II soft-core processors and a DSP with a 64+ core together constitute this image system. Each processor undertakes own systematic task, coordinating its work with each other's. The system on programmable chip (SOPC) in FPGA works steadily under the global clock frequency of 96Mhz. Adequate time allowance makes FPGA perform NUC image pre-processing algorithm with ease, which has offered favorable guarantee for the work of post image processing in DSP. And at the meantime, this paper presents a hardware (HW) and software (SW) co-design in FPGA. Thus, this systematic architecture yields an image processing system with multiprocessor, and a smart solution to the satisfaction with the performance of the system.
Development Of A Three-Dimensional Circuit Integration Technology And Computer Architecture
NASA Astrophysics Data System (ADS)
Etchells, R. D.; Grinberg, J.; Nudd, G. R.
1981-12-01
This paper is the first of a series 1,2,3 describing a range of efforts at Hughes Research Laboratories, which are collectively referred to as "Three-Dimensional Microelectronics." The technology being developed is a combination of a unique circuit fabrication/packaging technology and a novel processing architecture. The packaging technology greatly reduces the parasitic impedances associated with signal-routing in complex VLSI structures, while simultaneously allowing circuit densities orders of magnitude higher than the current state-of-the-art. When combined with the 3-D processor architecture, the resulting machine exhibits a one- to two-order of magnitude simultaneous improvement over current state-of-the-art machines in the three areas of processing speed, power consumption, and physical volume. The 3-D architecture is essentially that commonly referred to as a "cellular array", with the ultimate implementation having as many as 512 x 512 processors working in parallel. The three-dimensional nature of the assembled machine arises from the fact that the chips containing the active circuitry of the processor are stacked on top of each other. In this structure, electrical signals are passed vertically through the chips via thermomigrated aluminum feedthroughs. Signals are passed between adjacent chips by micro-interconnects. This discussion presents a broad view of the total effort, as well as a more detailed treatment of the fabrication and packaging technologies themselves. The results of performance simulations of the completed 3-D processor executing a variety of algorithms are also presented. Of particular pertinence to the interests of the focal-plane array community is the simulation of the UNICORNS nonuniformity correction algorithms as executed by the 3-D architecture.
An FPGA computing demo core for space charge simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Jinyuan; Huang, Yifei; /Fermilab
2009-01-01
In accelerator physics, space charge simulation requires large amount of computing power. In a particle system, each calculation requires time/resource consuming operations such as multiplications, divisions, and square roots. Because of the flexibility of field programmable gate arrays (FPGAs), we implemented this task with efficient use of the available computing resources and completely eliminated non-calculating operations that are indispensable in regular micro-processors (e.g. instruction fetch, instruction decoding, etc.). We designed and tested a 16-bit demo core for computing Coulomb's force in an Altera Cyclone II FPGA device. To save resources, the inverse square-root cube operation in our design is computedmore » using a memory look-up table addressed with nine to ten most significant non-zero bits. At 200 MHz internal clock, our demo core reaches a throughput of 200 M pairs/s/core, faster than a typical 2 GHz micro-processor by about a factor of 10. Temperature and power consumption of FPGAs were also lower than those of micro-processors. Fast and convenient, FPGAs can serve as alternatives to time-consuming micro-processors for space charge simulation.« less
Sound-field reproduction systems using fixed-directivity loudspeakers.
Poletti, M; Fazi, F M; Nelson, P A
2010-06-01
Sound reproduction systems using open arrays of loudspeakers in rooms suffer from degradations due to room reflections. These reflections can be reduced using pre-compensation of the loudspeaker signals, but this requires calibration of the array in the room, and is processor-intensive. This paper examines 3D sound reproduction systems using spherical arrays of fixed-directivity loudspeakers which reduce the sound field radiated outside the array. A generalized form of the simple source formulation and a mode-matching solution are derived for the required loudspeaker weights. The exterior field is derived and expressions for the exterior power and direct to reverberant ratio are derived. The theoretical results and simulations confirm that minimum interference occurs for loudspeakers which have hyper-cardioid polar responses.
Nanosatellite Power System Considerations
NASA Technical Reports Server (NTRS)
Robyn, M.; Thaller, L.; Scott, D.
1995-01-01
The capability to build complex electronic functions into compact packages is opening the path to miniature satellites on the order of 1 kg mass, 10 cm across, packed with the computing processors, motion controllers, measurement sensors, and communications hardware necessary for operation. Power generation will be from short strings of silicon or gallium arsenide-based solar photovoltaic cells with the array power maximized by a peak power tracker (PPT). Energy storage will utilize a low voltage battery with nickel cadmium or lithium ion cells as the most likely selections for rechargeables and lithium (MnO2-Li) primary batteries for one shot short missions.
ALMA Correlator Real-Time Data Processor
NASA Astrophysics Data System (ADS)
Pisano, J.; Amestica, R.; Perez, J.
2005-10-01
The design of a real-time Linux application utilizing Real-Time Application Interface (RTAI) to process real-time data from the radio astronomy correlator for the Atacama Large Millimeter Array (ALMA) is described. The correlator is a custom-built digital signal processor which computes the cross-correlation function of two digitized signal streams. ALMA will have 64 antennas with 2080 signal streams each with a sample rate of 4 giga-samples per second. The correlator's aggregate data output will be 1 gigabyte per second. The software is defined by hard deadlines with high input and processing data rates, while requiring interfaces to non real-time external computers. The designed computer system - the Correlator Data Processor or CDP, consists of a cluster of 17 SMP computers, 16 of which are compute nodes plus a master controller node all running real-time Linux kernels. Each compute node uses an RTAI kernel module to interface to a 32-bit parallel interface which accepts raw data at 64 megabytes per second in 1 megabyte chunks every 16 milliseconds. These data are transferred to tasks running on multiple CPUs in hard real-time using RTAI's LXRT facility to perform quantization corrections, data windowing, FFTs, and phase corrections for a processing rate of approximately 1 GFLOPS. Highly accurate timing signals are distributed to all seventeen computer nodes in order to synchronize them to other time-dependent devices in the observatory array. RTAI kernel tasks interface to the timing signals providing sub-millisecond timing resolution. The CDP interfaces, via the master node, to other computer systems on an external intra-net for command and control, data storage, and further data (image) processing. The master node accesses these external systems utilizing ALMA Common Software (ACS), a CORBA-based client-server software infrastructure providing logging, monitoring, data delivery, and intra-computer function invocation. The software is being developed in tandem with the correlator hardware which presents software engineering challenges as the hardware evolves. The current status of this project and future goals are also presented.
A programmable computational image sensor for high-speed vision
NASA Astrophysics Data System (ADS)
Yang, Jie; Shi, Cong; Long, Xitian; Wu, Nanjian
2013-08-01
In this paper we present a programmable computational image sensor for high-speed vision. This computational image sensor contains four main blocks: an image pixel array, a massively parallel processing element (PE) array, a row processor (RP) array and a RISC core. The pixel-parallel PE is responsible for transferring, storing and processing image raw data in a SIMD fashion with its own programming language. The RPs are one dimensional array of simplified RISC cores, it can carry out complex arithmetic and logic operations. The PE array and RP array can finish great amount of computation with few instruction cycles and therefore satisfy the low- and middle-level high-speed image processing requirement. The RISC core controls the whole system operation and finishes some high-level image processing algorithms. We utilize a simplified AHB bus as the system bus to connect our major components. Programming language and corresponding tool chain for this computational image sensor are also developed.
Processing of thermionic power on an electrically propelled spacecraft
NASA Technical Reports Server (NTRS)
Macie, T. W.
1973-01-01
A study to define the power processing equipment required between a thermionic reactor and an array of mercury-ion thrusters for a nuclear electric propulsion system is reported. Observations and recommendations that resulted from this study were: (1) the preferred thermionic-fuel-element source voltages are 23 V or higher; (2) transistor characteristics exert a strong effect on power processor mass; (3) the power processor mass could be considerably reduced should the magnetic materials that exhibit low losses at high frequencies, that have a high Curie point, and that can operate at 15 to 20 kG become avaliable; (4) electrical component packaging on the radiator could reduce the area that is sensitive to meteoroid penetration, thereby reducing the meteoroid shielding mass requirement; (5) an experimental model of the power processor design should be built and tested to verify the efficiencies, masses, and all the automatic operational aspects of the design.
NASA Technical Reports Server (NTRS)
Rincon, Rafael F.
2008-01-01
The reconfigurable L-Band radar is an ongoing development at NASA/GSFC that exploits the capability inherently in phased array radar systems with a state-of-the-art data acquisition and real-time processor in order to enable multi-mode measurement techniques in a single radar architecture. The development leverages on the L-Band Imaging Scatterometer, a radar system designed for the development and testing of new radar techniques; and the custom-built DBSAR processor, a highly reconfigurable, high speed data acquisition and processing system. The radar modes currently implemented include scatterometer, synthetic aperture radar, and altimetry; and plans to add new modes such as radiometry and bi-static GNSS signals are being formulated. This development is aimed at enhancing the radar remote sensing capabilities for airborne and spaceborne applications in support of Earth Science and planetary exploration This paper describes the design of the radar and processor systems, explains the operational modes, and discusses preliminary measurements and future plans.
An optical processor for object recognition and tracking
NASA Technical Reports Server (NTRS)
Sloan, J.; Udomkesmalee, S.
1987-01-01
The design and development of a miniaturized optical processor that performs real time image correlation are described. The optical correlator utilizes the Vander Lugt matched spatial filter technique. The correlation output, a focused beam of light, is imaged onto a CMOS photodetector array. In addition to performing target recognition, the device also tracks the target. The hardware, composed of optical and electro-optical components, occupies only 590 cu cm of volume. A complete correlator system would also include an input imaging lens. This optical processing system is compact, rugged, requires only 3.5 watts of operating power, and weighs less than 3 kg. It represents a major achievement in miniaturizing optical processors. When considered as a special-purpose processing unit, it is an attractive alternative to conventional digital image recognition processing. It is conceivable that the combined technology of both optical and ditital processing could result in a very advanced robot vision system.
NASA Astrophysics Data System (ADS)
Szplet, R.; Kalisz, J.; Jachna, Z.
2009-02-01
We present a time digitizer having 45 ps resolution, integrated in a field programmable gate array (FPGA) device. The time interval measurement is based on the two-stage interpolation method. A dual-edge two-phase interpolator is driven by the on-chip synthesized 250 MHz clock with precise phase adjustment. An improved dual-edge double synchronizer was developed to control the main counter. The nonlinearity of the digitizer's transfer characteristic is identified and utilized by the dedicated hardware code processor for the on-the-fly correction of the output data. Application of presented ideas has resulted in the measurement uncertainty of the digitizer below 70 ps RMS over the time interval ranging from 0 to 1 s. The use of the two-stage interpolation and a fast FIFO memory has allowed us to obtain the maximum measurement rate of five million measurements per second.
Iterative current mode per pixel ADC for 3D SoftChip implementation in CMOS
NASA Astrophysics Data System (ADS)
Lachowicz, Stefan W.; Rassau, Alexander; Lee, Seung-Minh; Eshraghian, Kamran; Lee, Mike M.
2003-04-01
Mobile multimedia communication has rapidly become a significant area of research and development constantly challenging boundaries on a variety of technological fronts. The processing requirements for the capture, conversion, compression, decompression, enhancement, display, etc. of increasingly higher quality multimedia content places heavy demands even on current ULSI (ultra large scale integration) systems, particularly for mobile applications where area and power are primary considerations. The ADC presented in this paper is designed for a vertically integrated (3D) system comprising two distinct layers bonded together using Indium bump technology. The top layer is a CMOS imaging array containing analogue-to-digital converters, and a buffer memory. The bottom layer takes the form of a configurable array processor (CAP), a highly parallel array of soft programmable processors capable of carrying out complex processing tasks directly on data stored in the top plane. This paper presents a ADC scheme for the image capture plane. The analogue photocurrent or sampled voltage is transferred to the ADC via a column or a column/row bus. In the proposed system, an array of analogue-to-digital converters is distributed, so that a one-bit cell is associated with one sensor. The analogue-to-digital converters are algorithmic current-mode converters. Eight such cells are cascaded to form an 8-bit converter. Additionally, each photo-sensor is equipped with a current memory cell, and multiple conversions are performed with scaled values of the photocurrent for colour processing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Szadkowski, Zbigniew
We present the new approach to a filtering of radio frequency interferences (RFI) in the Auger Engineering Radio Array (AERA) which study the electromagnetic part of the Extensive Air Showers. The radio stations can observe radio signals caused by coherent emissions due to geomagnetic radiation and charge excess processes. AERA observes frequency band from 30 to 80 MHz. This range is highly contaminated by human-made RFI. In order to improve the signal to noise ratio RFI filters are used in AERA to suppress this contamination. The first kind of filter used by AERA was the Median one, based on themore » Fast Fourier Transform (FFT) technique. The second one, which is currently in use, is the infinite impulse response (IIR) notch filter. The proposed new filter is a finite impulse response (FIR) filter based on a linear prediction (LP). A periodic contamination hidden in a registered signal (digitized in the ADC) can be extracted and next subtracted to make signal cleaner. The FIR filter requires a calculation of n=32, 64 or even 128 coefficients (dependent on a required speed or accuracy) by solving of n linear equations with coefficients built from the covariance Toeplitz matrix. This matrix can be solved by the Levinson recursion, which is much faster than the Gauss procedure. The filter has been already tested in the real AERA radio stations on Argentinean pampas with a very successful results. The linear equations were solved either in the virtual soft-core NIOSR processor (implemented in the FPGA chip as a net of logic elements) or in the external Voipac PXA270M ARM processor. The NIOS processor is relatively slow (50 MHz internal clock), calculations performed in an external processor consume a significant amount of time for data exchange between the FPGA and the processor. Test showed a very good efficiency of the RFI suppression for stationary (long-term) contaminations. However, we observed a short-time contaminations, which could not be suppressed either by the IIR-notch filter or by the FIR filter based on the linear predictions. For the LP FIR filter the refreshment time of the filter coefficients was to long and filter did not keep up with the changes of a contamination structure, mainly due to a long calculation time in a slow processors. We propose to use the Cyclone V SE chip with embedded micro-controller operating with 925 MHz internal clock to significantly reduce a refreshment time of the FIR coefficients. The lab results are promising. (authors)« less
Geostationary payload concepts for personal satellite communications
NASA Technical Reports Server (NTRS)
Benedicto, J.; Rinous, P.; Roberts, I.; Roederer, A.; Stojkovic, I.
1993-01-01
This paper reviews candidate satellite payload architectures for systems providing world-wide communication services to mobile users equipped with hand-held terminals based on large geostationary satellites. There are a number of problems related to the payload architecture, on-board routing and beamforming, and the design of the S-band Tx and L-band Rx antenna and front ends. A number of solutions are outlined, based on trade-offs with respect to the most significant performance parameters such as capacity, G/T, flexibility of routing traffic to beams and re-configuration of the spot-beam coverage, and payload mass and power. Candidate antenna and front-end configurations were studied, in particular direct radiating arrays, arrays magnified by a reflector and active focused reflectors with overlapping feed clusters for both transmit (multimax) and receive (beam synthesis). Regarding the on-board routing and beamforming sub-systems, analog techniques based on banks of SAW filters, FET or CMOS switches and cross-bar fixed and variable beamforming are compared with a hybrid analog/digital approach based on Chirp Fourier Transform (CFT) demultiplexer combined with digital beamforming or a fully digital processor implementation, also based on CFT demultiplexing.
Minati, Ludovico; Cercignani, Mara; Chan, Dennis
2013-10-01
Graph theory-based analyses of brain network topology can be used to model the spatiotemporal correlations in neural activity detected through fMRI, and such approaches have wide-ranging potential, from detection of alterations in preclinical Alzheimer's disease through to command identification in brain-machine interfaces. However, due to prohibitive computational costs, graph-based analyses to date have principally focused on measuring connection density rather than mapping the topological architecture in full by exhaustive shortest-path determination. This paper outlines a solution to this problem through parallel implementation of Dijkstra's algorithm in programmable logic. The processor design is optimized for large, sparse graphs and provided in full as synthesizable VHDL code. An acceleration factor between 15 and 18 is obtained on a representative resting-state fMRI dataset, and maps of Euclidean path length reveal the anticipated heterogeneous cortical involvement in long-range integrative processing. These results enable high-resolution geodesic connectivity mapping for resting-state fMRI in patient populations and real-time geodesic mapping to support identification of imagined actions for fMRI-based brain-machine interfaces. Copyright © 2013 IPEM. Published by Elsevier Ltd. All rights reserved.
Temperature-Adaptive Circuits on Reconfigurable Analog Arrays
NASA Technical Reports Server (NTRS)
Stoica, Adrian; Zebulum, Ricardo S.; Keymeulen, Didier; Ramesham, Rajeshuni; Neff, Joseph; Katkoori, Srinivas
2006-01-01
Demonstration of a self-reconfigurable Integrated Circuit (IC) that would operate under extreme temperature (-180 C and 120 C) and radiation (300krad), without the protection of thermal controls and radiation shields. Self-Reconfigurable Electronics platform: a) Evolutionary Processor (EP) to run reconfiguration mechanism; b) Reconfigurable chip (FPGA, FPAA, etc).
Smart medical systems with application to nutrition and fitness in space
NASA Technical Reports Server (NTRS)
Soller, Babs R.; Cabrera, Marco; Smith, Scott M.; Sutton, Jeffrey P.
2002-01-01
Smart medical systems are being developed to allow medical treatments to address alterations in chemical and physiologic status in real time. In a smart medical system, sensor arrays assess subject status, which is interpreted by computer processors that analyze multiple inputs and recommend treatment interventions. The response of the subject to the treatment is again assessed by the sensor arrays, thus closing the loop. An early form of "smart medicine" has been practiced in space to assess nutrition. Nutrient levels are assessed with food frequency questionnaires, which are interpreted by flight surgeons to recommend inflight alterations in diet. In the future, sensor arrays will directly probe body chemistry. Near-infrared spectroscopy can be used to non-invasively measure several blood and tissue parameters that are important in the assessment of nutrition and fitness. In particular, this technology can be used to measure blood hematocrit and interstitial fluid pH. The non-invasive measurement of interstitial pH is discussed as a surrogate for blood lactate measurement for the development and real-time assessment of exercise protocols in space. Earth-based application of these sensors is also described.
Smart Medical Systems with Application to Nutrition and Fitness in Space
NASA Technical Reports Server (NTRS)
Soller, Babs R.; Cabrera, Marco; Smith, Scott M.; Sutton, Jeffrey P.
2002-01-01
Smart medical systems are being developed to allow medical treatments to address alterations in chemical and physiological status in real time. In a smart medical system sensor arrays assess subject status, which are interpreted by computer processors which analyze multiple inputs and recommend treatment interventions. The response of the subject to the treatment is again assessed by the sensor arrays, closing the loop. An early form of "smart medicine" has been practiced in space to assess nutrition. Nutrient levels are assessed with food frequency questionnaires, which are interpreted by flight surgeons to recommend in-flight alterations in diet. In the future, sensor arrays will directly probe body chemistry. Near infrared spectroscopy can be used to noninvasively measure several blood and tissue parameters which are important in the assessment of nutrition and fitness. In particular, this technology can be used to measure blood hematocrit and interstitial fluid pH. The noninvasive measurement of interstitial pH is discussed as a surrogate for blood lactate measurement for the development and real-time assessment of exercise protocols in space. Earth-based application of these sensors are also described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dodge, D. A.; Harris, D. B.
Correlation detectors are of considerable interest to the seismic monitoring communities because they offer reduced detection thresholds and combine detection, location and identification functions into a single operation. They appear to be ideal for applications requiring screening of frequent repeating events. However, questions remain about how broadly empirical correlation methods are applicable. We describe the effectiveness of banks of correlation detectors in a system that combines traditional power detectors with correlation detectors in terms of efficiency, which we define to be the fraction of events detected by the correlators. This paper elaborates and extends the concept of a dynamic correlationmore » detection framework – a system which autonomously creates correlation detectors from event waveforms detected by power detectors; and reports observed performance on a network of arrays in terms of efficiency. We performed a large scale test of dynamic correlation processors on an 11 terabyte global dataset using 25 arrays in the single frequency band 1-3 Hz. The system found over 3.2 million unique signals and produced 459,747 screened detections. A very satisfying result is that, on average, efficiency grows with time and, after nearly 16 years of operation, exceeds 47% for events observed over all distance ranges and approaches 70% for near regional and 90% for local events. This observation suggests that future pipeline architectures should make extensive use of correlation detectors, principally for decluttering observations of local and near-regional events. Our results also suggest that future operations based on correlation detection will require commodity large-scale computing infrastructure, since the numbers of correlators in an autonomous system can grow into the hundreds of thousands.« less
Dodge, D. A.; Harris, D. B.
2016-03-15
Correlation detectors are of considerable interest to the seismic monitoring communities because they offer reduced detection thresholds and combine detection, location and identification functions into a single operation. They appear to be ideal for applications requiring screening of frequent repeating events. However, questions remain about how broadly empirical correlation methods are applicable. We describe the effectiveness of banks of correlation detectors in a system that combines traditional power detectors with correlation detectors in terms of efficiency, which we define to be the fraction of events detected by the correlators. This paper elaborates and extends the concept of a dynamic correlationmore » detection framework – a system which autonomously creates correlation detectors from event waveforms detected by power detectors; and reports observed performance on a network of arrays in terms of efficiency. We performed a large scale test of dynamic correlation processors on an 11 terabyte global dataset using 25 arrays in the single frequency band 1-3 Hz. The system found over 3.2 million unique signals and produced 459,747 screened detections. A very satisfying result is that, on average, efficiency grows with time and, after nearly 16 years of operation, exceeds 47% for events observed over all distance ranges and approaches 70% for near regional and 90% for local events. This observation suggests that future pipeline architectures should make extensive use of correlation detectors, principally for decluttering observations of local and near-regional events. Our results also suggest that future operations based on correlation detection will require commodity large-scale computing infrastructure, since the numbers of correlators in an autonomous system can grow into the hundreds of thousands.« less
FPGA-Based Reconfigurable Processor for Ultrafast Interlaced Ultrasound and Photoacoustic Imaging
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2016-01-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models. PMID:22828830
FPGA-based reconfigurable processor for ultrafast interlaced ultrasound and photoacoustic imaging.
Alqasemi, Umar; Li, Hai; Aguirre, Andrés; Zhu, Quing
2012-07-01
In this paper, we report, to the best of our knowledge, a unique field-programmable gate array (FPGA)-based reconfigurable processor for real-time interlaced co-registered ultrasound and photoacoustic imaging and its application in imaging tumor dynamic response. The FPGA is used to control, acquire, store, delay-and-sum, and transfer the data for real-time co-registered imaging. The FPGA controls the ultrasound transmission and ultrasound and photoacoustic data acquisition process of a customized 16-channel module that contains all of the necessary analog and digital circuits. The 16-channel module is one of multiple modules plugged into a motherboard; their beamformed outputs are made available for a digital signal processor (DSP) to access using an external memory interface (EMIF). The FPGA performs a key role through ultrafast reconfiguration and adaptation of its structure to allow real-time switching between the two imaging modes, including transmission control, laser synchronization, internal memory structure, beamforming, and EMIF structure and memory size. It performs another role by parallel accessing of internal memories and multi-thread processing to reduce the transfer of data and the processing load on the DSP. Furthermore, because the laser will be pulsing even during ultrasound pulse-echo acquisition, the FPGA ensures that the laser pulses are far enough from the pulse-echo acquisitions by appropriate time-division multiplexing (TDM). A co-registered ultrasound and photoacoustic imaging system consisting of four FPGA modules (64-channels) is constructed, and its performance is demonstrated using phantom targets and in vivo mouse tumor models.
OAO-3 end of mission power subsystem evaluation
NASA Technical Reports Server (NTRS)
Tasevoli, M.
1982-01-01
End of mission tests were performed on the OAO-3 power subsystem in three component areas: solar array, nickel-cadmium batteries and the On-Board Processor (OBP) power boost operation. Solar array evaluation consisted of analyzing array performance characteristics and comparing them to earlier flight data. Measured solar array degradation of 14.1 to 17.7% after 8 1/3 years is in good agreement with theortical radiation damage losses. Battery discharge characteristics were compared to results of laboratory life cycle tests performed on similar cells. Comparison of cell voltage profils reveals close correlation and confirms the validity of real time life cycle simulation. The successful operation of the system in the OBP/power boost regulation mode demonstrates the excellent life, reliability and greater system utilization of power subsystems using maximum power trackers.
The 7.5 kW solar array simulator
NASA Technical Reports Server (NTRS)
Robson, R. R.
1975-01-01
A high power solar array simulator capable of providing the input power to simultaneously operate two 30 cm diameter ion thruster power processors was designed, fabricated, and tested. The maximum power point is set to between 150 and 7500 watts representing an open circuit voltage from 50 to 300 volts and a short circuit current from 4 to 36 amps. Illuminated solar cells are used as the control element to provide a true solar cell characteristic and permit the option of simulating changes in this characteristic due to variations in solar intensity and/or temperature of the solar array. This is accomplished by changing the illumination and/or temperature of the control cells. The response of the output to a step change in load closely approximates that of an actual solar array.
SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
NASA Astrophysics Data System (ADS)
Homann, Holger; Laenen, Francois
2018-03-01
The numerical study of physical problems often require integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they are carrying such as an identity, a position other. There are generally speaking two different possibilities for handling particles in high performance computing (HPC) codes. The concept of an Array of Structures (AoS) is in the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (realization of the structure) represents one particle and a set of many particles is stored in an array. In contrast, using the concept of a Structure of Arrays (SoA), a single structure holds several arrays each representing one property (such as the identity) of the whole set of particles. The AoS approach is often implemented in HPC codes due to its handiness and flexibility. For a class of problems, however, it is known that the performance of SoA is much better than that of AoS. We confirm this observation for our particle problem. Using a benchmark we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel's MIC co-processors the performance gap even attains a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs. Combining performance and handiness, we present the library SoAx that has optimal performance (on CPUs, MICs, and GPUs) while providing the same handiness as AoS. For this, SoAx uses modern C++ design techniques such template meta programming that allows to automatically generate code for user defined heterogeneous data structures.
Facile fabrication of nanofluidic diode membranes using anodic aluminium oxide
NASA Astrophysics Data System (ADS)
Wu, Songmei; Wildhaber, Fabien; Vazquez-Mena, Oscar; Bertsch, Arnaud; Brugger, Juergen; Renaud, Philippe
2012-08-01
Active control of ion transport plays important roles in chemical and biological analytical processes. Nanofluidic systems hold the promise for such control through electrostatic interaction between ions and channel surfaces. Most existing experiments rely on planar geometry where the nanochannels are generally very long and shallow with large aspect ratios. Based on this configuration the concepts of nanofluidic gating and rectification have been successfully demonstrated. However, device minimization and throughput scaling remain significant challenges. We report here an innovative and facile realization of hetero-structured Al2O3/SiO2 (Si) nanopore array membranes by using pattern transfer of self-organized nanopore structures of anodic aluminum oxide (AAO). Thanks to the opposite surface charge states of Al2O3 (positive) and SiO2 (negative), the membrane exhibits clear rectification of ion current in electrolyte solutions with very low aspect ratios compared to previous approaches. Our hetero-structured nanopore arrays provide a valuable platform for high throughput applications such as molecular separation, chemical processors and energy conversion.Active control of ion transport plays important roles in chemical and biological analytical processes. Nanofluidic systems hold the promise for such control through electrostatic interaction between ions and channel surfaces. Most existing experiments rely on planar geometry where the nanochannels are generally very long and shallow with large aspect ratios. Based on this configuration the concepts of nanofluidic gating and rectification have been successfully demonstrated. However, device minimization and throughput scaling remain significant challenges. We report here an innovative and facile realization of hetero-structured Al2O3/SiO2 (Si) nanopore array membranes by using pattern transfer of self-organized nanopore structures of anodic aluminum oxide (AAO). Thanks to the opposite surface charge states of Al2O3 (positive) and SiO2 (negative), the membrane exhibits clear rectification of ion current in electrolyte solutions with very low aspect ratios compared to previous approaches. Our hetero-structured nanopore arrays provide a valuable platform for high throughput applications such as molecular separation, chemical processors and energy conversion. Electronic supplementary information (ESI) available: Pattern transfer of local AAO mask into Si layers of different thickness; characterization of the Ag/AgCl electrodes and the cell constant; control experiments of mono-charged nanopore membranes; and simulation of ionic transport in nanofluidic diodes. See DOI: 10.1039/c2nr31243c
NASA Astrophysics Data System (ADS)
Tohara, Takashi; Liang, Haichao; Tanaka, Hirofumi; Igarashi, Makoto; Samukawa, Seiji; Endo, Kazuhiko; Takahashi, Yasuo; Morie, Takashi
2016-03-01
A nanodisk array connected with a fin field-effect transistor is fabricated and analyzed for spiking neural network applications. This nanodevice performs weighted sums in the time domain using rising slopes of responses triggered by input spike pulses. The nanodisk arrays, which act as a resistance of several giga-ohms, are fabricated using a self-assembly bio-nano-template technique. Weighted sums are achieved with an energy dissipation on the order of 1 fJ, where the number of inputs can be more than one hundred. This amount of energy is several orders of magnitude lower than that of conventional digital processors.
Advanced satellite communication system
NASA Technical Reports Server (NTRS)
Staples, Edward J.; Lie, Sen
1992-01-01
The objective of this research program was to develop an innovative advanced satellite receiver/demodulator utilizing surface acoustic wave (SAW) chirp transform processor and coherent BPSK demodulation. The algorithm of this SAW chirp Fourier transformer is of the Convolve - Multiply - Convolve (CMC) type, utilizing off-the-shelf reflective array compressor (RAC) chirp filters. This satellite receiver, if fully developed, was intended to be used as an on-board multichannel communications repeater. The Advanced Communications Receiver consists of four units: (1) CMC processor, (2) single sideband modulator, (3) demodulator, and (4) chirp waveform generator and individual channel processors. The input signal is composed of multiple user transmission frequencies operating independently from remotely located ground terminals. This signal is Fourier transformed by the CMC Processor into a unique time slot for each user frequency. The CMC processor is driven by a waveform generator through a single sideband (SSB) modulator. The output of the coherent demodulator is composed of positive and negative pulses, which are the envelopes of the chirp transform processor output. These pulses correspond to the data symbols. Following the demodulator, a logic circuit reconstructs the pulses into data, which are subsequently differentially decoded to form the transmitted data. The coherent demodulation and detection of BPSK signals derived from a CMC chirp transform processor were experimentally demonstrated and bit error rate (BER) testing was performed. To assess the feasibility of such advanced receiver, the results were compared with the theoretical analysis and plotted for an average BER as a function of signal-to-noise ratio. Another goal of this SBIR program was the development of a commercial product. The commercial product developed was an arbitrary waveform generator. The successful sales have begun with the delivery of the first arbitrary waveform generator.
NASA Astrophysics Data System (ADS)
Dombrowski, M. P.; LaBelle, J.; McGaw, D. G.; Broughton, M. C.
2016-07-01
The programmable combined receiver/digital signal processor platform presented in this article is designed for digital downsampling and processing of general waveform inputs with a 66 MHz initial sampling rate and multi-input synchronized sampling. Systems based on this platform are capable of fully autonomous low-power operation, can be programmed to preprocess and filter the data for preselection and reduction, and may output to a diverse array of transmission or telemetry media. We describe three versions of this system, one for deployment on sounding rockets and two for ground-based applications. The rocket system was flown on the Correlation of High-Frequency and Auroral Roar Measurements (CHARM)-II mission launched from Poker Flat Research Range, Alaska, in 2010. It measured auroral "roar" signals at 2.60 MHz. The ground-based systems have been deployed at Sondrestrom, Greenland, and South Pole Station, Antarctica. The Greenland system synchronously samples signals from three spaced antennas providing direction finding of 0-5 MHz waves. It has successfully measured auroral signals and man-made broadcast signals. The South Pole system synchronously samples signals from two crossed antennas, providing polarization information. It has successfully measured the polarization of auroral kilometric radiation-like signals as well as auroral hiss. Further systems are in development for future rocket missions and for installation in Antarctic Automatic Geophysical Observatories.
Ultrabroadband phased-array radio frequency (RF) receivers based on optical techniques
NASA Astrophysics Data System (ADS)
Overmiller, Brock M.; Schuetz, Christopher A.; Schneider, Garrett; Murakowski, Janusz; Prather, Dennis W.
2014-03-01
Military operations require the ability to locate and identify electronic emissions in the battlefield environment. However, recent developments in radio detection and ranging (RADAR) and communications technology are making it harder to effectively identify such emissions. Phased array systems aid in discriminating emitters in the scene by virtue of their relatively high-gain beam steering and nulling capabilities. For the purpose of locating emitters, we present an approach realize a broadband receiver based on optical processing techniques applied to the response of detectors in conformal antenna arrays. This approach utilizes photonic techniques that enable us to capture, route, and process the incoming signals. Optical modulators convert the incoming signals up to and exceeding 110 GHz with appreciable conversion efficiency and route these signals via fiber optics to a central processing location. This central processor consists of a closed loop phase control system which compensates for phase fluctuations induced on the fibers due to thermal or acoustic vibrations as well as an optical heterodyne approach for signal conversion down to baseband. Our optical heterodyne approach uses injection-locked paired optical sources to perform heterodyne downconversion/frequency identification of the detected emission. Preliminary geolocation and frequency identification testing of electronic emissions has been performed demonstrating the capabilities of our RF receiver.
FPGA-accelerated algorithm for the regular expression matching system
NASA Astrophysics Data System (ADS)
Russek, P.; Wiatr, K.
2015-01-01
This article describes an algorithm to support a regular expressions matching system. The goal was to achieve an attractive performance system with low energy consumption. The basic idea of the algorithm comes from a concept of the Bloom filter. It starts from the extraction of static sub-strings for strings of regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and the central processing unit (CPU). The pipelined custom processor architecture is proposed and a software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in field programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example of target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
Image processing using Gallium Arsenide (GaAs) technology
NASA Technical Reports Server (NTRS)
Miller, Warner H.
1989-01-01
The need to increase the information return from space-borne imaging systems has increased in the past decade. The use of multi-spectral data has resulted in the need for finer spatial resolution and greater spectral coverage. Onboard signal processing will be necessary in order to utilize the available Tracking and Data Relay Satellite System (TDRSS) communication channel at high efficiency. A generally recognized approach to the increased efficiency of channel usage is through data compression techniques. The compression technique implemented is a differential pulse code modulation (DPCM) scheme with a non-uniform quantizer. The need to advance the state-of-the-art of onboard processing was recognized and a GaAs integrated circuit technology was chosen. An Adaptive Programmable Processor (APP) chip set was developed which is based on an 8-bit slice general processor. The reason for choosing the compression technique for the Multi-spectral Linear Array (MLA) instrument is described. Also a description is given of the GaAs integrated circuit chip set which will demonstrate that data compression can be performed onboard in real time at data rate in the order of 500 Mb/s.
Shift-, rotation-, and scale-invariant shape recognition system using an optical Hough transform
NASA Astrophysics Data System (ADS)
Schmid, Volker R.; Bader, Gerhard; Lueder, Ernst H.
1998-02-01
We present a hybrid shape recognition system with an optical Hough transform processor. The features of the Hough space offer a separate cancellation of distortions caused by translations and rotations. Scale invariance is also provided by suitable normalization. The proposed system extends the capabilities of Hough transform based detection from only straight lines to areas bounded by edges. A very compact optical design is achieved by a microlens array processor accepting incoherent light as direct optical input and realizing the computationally expensive connections massively parallel. Our newly developed algorithm extracts rotation and translation invariant normalized patterns of bright spots on a 2D grid. A neural network classifier maps the 2D features via a nonlinear hidden layer onto the classification output vector. We propose initialization of the connection weights according to regions of activity specifically assigned to each neuron in the hidden layer using a competitive network. The presented system is designed for industry inspection applications. Presently we have demonstrated detection of six different machined parts in real-time. Our method yields very promising detection results of more than 96% correctly classified parts.
Optical Interconnections for VLSI Computational Systems Using Computer-Generated Holography.
NASA Astrophysics Data System (ADS)
Feldman, Michael Robert
Optical interconnects for VLSI computational systems using computer generated holograms are evaluated in theory and experiment. It is shown that by replacing particular electronic connections with free-space optical communication paths, connection of devices on a single chip or wafer and between chips or modules can be improved. Optical and electrical interconnects are compared in terms of power dissipation, communication bandwidth, and connection density. Conditions are determined for which optical interconnects are advantageous. Based on this analysis, it is shown that by applying computer generated holographic optical interconnects to wafer scale fine grain parallel processing systems, dramatic increases in system performance can be expected. Some new interconnection networks, designed to take full advantage of optical interconnect technology, have been developed. Experimental Computer Generated Holograms (CGH's) have been designed, fabricated and subsequently tested in prototype optical interconnected computational systems. Several new CGH encoding methods have been developed to provide efficient high performance CGH's. One CGH was used to decrease the access time of a 1 kilobit CMOS RAM chip. Another was produced to implement the inter-processor communication paths in a shared memory SIMD parallel processor array.
Particle Based Simulations of Complex Systems with MP2C : Hydrodynamics and Electrostatics
NASA Astrophysics Data System (ADS)
Sutmann, Godehard; Westphal, Lidia; Bolten, Matthias
2010-09-01
Particle based simulation methods are well established paths to explore system behavior on microscopic to mesoscopic time and length scales. With the development of new computer architectures it becomes more and more important to concentrate on local algorithms which do not need global data transfer or reorganisation of large arrays of data across processors. This requirement strongly addresses long-range interactions in particle systems, i.e. mainly hydrodynamic and electrostatic contributions. In this article, emphasis is given to the implementation and parallelization of the Multi-Particle Collision Dynamics method for hydrodynamic contributions and a splitting scheme based on Multigrid for electrostatic contributions. Implementations are done for massively parallel architectures and are demonstrated for the IBM Blue Gene/P architecture Jugene in Jülich.
Autonomous Telemetry Collection for Single-Processor Small Satellites
NASA Technical Reports Server (NTRS)
Speer, Dave
2003-01-01
For the Space Technology 5 mission, which is being developed under NASA's New Millennium Program, a single spacecraft processor will be required to do on-board real-time computations and operations associated with attitude control, up-link and down-link communications, science data processing, solid-state recorder management, power switching and battery charge management, experiment data collection, health and status data collection, etc. Much of the health and status information is in analog form, and each of the analog signals must be routed to the input of an analog-to-digital converter, converted to digital form, and then stored in memory. If the micro-operations of the analog data collection process are implemented in software, the processor may use up a lot of time either waiting for the analog signal to settle, waiting for the analog-to-digital conversion to complete, or servicing a large number of high frequency interrupts. In order to off-load a very busy processor, the collection and digitization of all analog spacecraft health and status data will be done autonomously by a field-programmable gate array that can configure the analog signal chain, control the analog-to-digital converter, and store the converted data in memory.
Molecular Mechanics with an Array Processor.
1982-06-01
34 to be submritted. 40 B. W. Kernrihan and D M Ritchie, The CPm guniw.g Language, Prentice- Hall. Eaglewood Cliffs, New Jersey, 1978. 60 D. J. Adams , in...1400 Washington Avenue Ban e o Albany, New York 12203 La J la, California 92093 Dr. Rank Loos Professor C. A. Ansell Latuna Research Laboratory
Star sensing for an earth imaging sensor
NASA Technical Reports Server (NTRS)
Ellis, Kenneth K. (Inventor); Griffith, Paul C. (Inventor)
2012-01-01
A star sensor includes (a) a scan mirror for scanning at least one star; (b) a detector array, coupled to the scan mirror, for detecting the one star; and (c) a processor, coupled to the detector array. The processor includes a first filter configured to reduce noise spikes in the detected one star, and provide a detection mask of filtered data. Also included is a second filter configured to reduce non-contiguous samples in the detection mask. A centroid calculator is included to determine a location of the one star, after the first and second filtering. The first filter includes a median filter, followed by an averaging filter, both configured to filter the one star in an along-scan direction of the scan mirror. The first filter includes another median filter, which is configured to filter the detected one star in the cross-scan direction of the scan mirror. An adder is included to subtract (a) output data from the other median filter from (b) output data from the averaging filter and provide filtered star data to the second filter.
High-performance computing for airborne applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather M; Manuzzato, Andrea; Fairbanks, Tom
2010-06-28
Recently, there has been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even thoughmore » the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in a accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.« less
NASA Technical Reports Server (NTRS)
Liu, Hua-Kuang (Inventor); Awwal, Abdul A. S. (Inventor); Karim, Mohammad A. (Inventor)
1993-01-01
An inner-product array processor is provided with thresholding of the inner product during each iteration to make more significant the inner product employed in estimating a vector to be used as the input vector for the next iteration. While stored vectors and estimated vectors are represented in bipolar binary (1,-1), only those elements of an initial partial input vector that are believed to be common with those of a stored vector are represented in bipolar binary; the remaining elements of a partial input vector are set to 0. This mode of representation, in which the known elements of a partial input vector are in bipolar binary form and the remaining elements are set equal to 0, is referred to as trinary representation. The initial inner products corresponding to the partial input vector will then be equal to the number of known elements. Inner-product thresholding is applied to accelerate convergence and to avoid convergence to a negative input product.
Towards implementation of cellular automata in Microbial Fuel Cells.
Tsompanas, Michail-Antisthenis I; Adamatzky, Andrew; Sirakoulis, Georgios Ch; Greenman, John; Ieropoulos, Ioannis
2017-01-01
The Microbial Fuel Cell (MFC) is a bio-electrochemical transducer converting waste products into electricity using microbial communities. Cellular Automaton (CA) is a uniform array of finite-state machines that update their states in discrete time depending on states of their closest neighbors by the same rule. Arrays of MFCs could, in principle, act as massive-parallel computing devices with local connectivity between elementary processors. We provide a theoretical design of such a parallel processor by implementing CA in MFCs. We have chosen Conway's Game of Life as the 'benchmark' CA because this is the most popular CA which also exhibits an enormously rich spectrum of patterns. Each cell of the Game of Life CA is realized using two MFCs. The MFCs are linked electrically and hydraulically. The model is verified via simulation of an electrical circuit demonstrating equivalent behaviours. The design is a first step towards future implementations of fully autonomous biological computing devices with massive parallelism. The energy independence of such devices counteracts their somewhat slow transitions-compared to silicon circuitry-between the different states during computation.
Towards implementation of cellular automata in Microbial Fuel Cells
Adamatzky, Andrew; Sirakoulis, Georgios Ch.; Greenman, John; Ieropoulos, Ioannis
2017-01-01
The Microbial Fuel Cell (MFC) is a bio-electrochemical transducer converting waste products into electricity using microbial communities. Cellular Automaton (CA) is a uniform array of finite-state machines that update their states in discrete time depending on states of their closest neighbors by the same rule. Arrays of MFCs could, in principle, act as massive-parallel computing devices with local connectivity between elementary processors. We provide a theoretical design of such a parallel processor by implementing CA in MFCs. We have chosen Conway’s Game of Life as the ‘benchmark’ CA because this is the most popular CA which also exhibits an enormously rich spectrum of patterns. Each cell of the Game of Life CA is realized using two MFCs. The MFCs are linked electrically and hydraulically. The model is verified via simulation of an electrical circuit demonstrating equivalent behaviours. The design is a first step towards future implementations of fully autonomous biological computing devices with massive parallelism. The energy independence of such devices counteracts their somewhat slow transitions—compared to silicon circuitry—between the different states during computation. PMID:28498871
FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling.
Kim, Chang-Min; Park, Hyung-Min; Kim, Taesu; Choi, Yoon-Kyung; Lee, Soo-Young
2003-01-01
An field programmable gate array (FPGA) implementation of independent component analysis (ICA) algorithm is reported for blind signal separation (BSS) and adaptive noise canceling (ANC) in real time. In order to provide enormous computing power for ICA-based algorithms with multipath reverberation, a special digital processor is designed and implemented in FPGA. The chip design fully utilizes modular concept and several chips may be put together for complex applications with a large number of noise sources. Experimental results with a fabricated test board are reported for ANC only, BSS only, and simultaneous ANC/BSS, which demonstrates successful speech enhancement in real environments in real time.
Development of a multikilowatt ion thruster power processor
NASA Technical Reports Server (NTRS)
Schoenfeld, A. D.; Goldin, D. S.; Biess, J. J.
1972-01-01
A feasibility study was made of the application of silicon-controlled, rectifier series, resonant inverter, power conditioning technology to electric propulsion power processing operating from a 200 to 400 Vdc solar array bus. A power system block diagram was generated to meet the electrical requirements of a 20 CM hollow cathode, mercury bombardment, ion engine. The SCR series resonant inverter was developed as a primary means of power switching and conversion, and the analog signal-to-discrete-time-interval converter control system was applied to achieve good regulation. A complete breadboard was designed, fabricated, and tested with a resistive load bank, and critical power processor areas relating to efficiency, weight, and part count were identified.
Photographic film image enhancement
NASA Technical Reports Server (NTRS)
Horner, J. L.
1975-01-01
A series of experiments were undertaken to assess the feasibility of defogging color film by the techniques of optical spatial filtering. A coherent optical processor was built using red, blue, and green laser light input and specially designed Fourier transformation lenses. An array of spatial filters was fabricated on black and white emulsion slides using the coherent optical processor. The technique was first applied to laboratory white light fogged film, and the results were successful. However, when the same technique was applied to some original Apollo X radiation fogged color negatives, the results showed no similar restoration. Examples of each experiment are presented and possible reasons for the lack of restoration in the Apollo films are discussed.
Compression of CCD raw images for digital still cameras
NASA Astrophysics Data System (ADS)
Sriram, Parthasarathy; Sudharsanan, Subramania
2005-03-01
Lossless compression of raw CCD images captured using color filter arrays has several benefits. The benefits include improved storage capacity, reduced memory bandwidth, and lower power consumption for digital still camera processors. The paper discusses the benefits in detail and proposes the use of a computationally efficient block adaptive scheme for lossless compression. Experimental results are provided that indicate that the scheme performs well for CCD raw images attaining compression factors of more than two. The block adaptive method also compares favorably with JPEG-LS. A discussion is provided indicating how the proposed lossless coding scheme can be incorporated into digital still camera processors enabling lower memory bandwidth and storage requirements.
Analog Processor To Solve Optimization Problems
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Eberhardt, Silvio P.; Thakoor, Anil P.
1993-01-01
Proposed analog processor solves "traveling-salesman" problem, considered paradigm of global-optimization problems involving routing or allocation of resources. Includes electronic neural network and auxiliary circuitry based partly on concepts described in "Neural-Network Processor Would Allocate Resources" (NPO-17781) and "Neural Network Solves 'Traveling-Salesman' Problem" (NPO-17807). Processor based on highly parallel computing solves problem in significantly less time.
Lee, W R; Kim, H S; Park, M K; Lee, J H; Kim, K H
2012-09-01
The Thomson scattering diagnostic system is successfully installed in the Korea Superconducting Tokamak Advanced Research (KSTAR) facility. We got the electron temperature and electron density data for the first time in 2011, 4th campaign using a field programmable gate array (FPGA) based signal control board. It operates as a signal generator, a detector, a controller, and a time measuring device. This board produces two configurable trigger pulses to operate Nd:YAG laser system and receives a laser beam detection signal from a photodiode detector. It allows a trigger pulse to be delivered to a time delay module to make a scattered signal measurement, measuring an asynchronous time value between the KSTAR timing board and the laser system injection signal. All functions are controlled by the embedded processor running on operating system within a single FPGA. It provides Ethernet communication interface and is configured with standard middleware to integrate with KSTAR. This controller has operated for two experimental campaigns including commissioning and performed the reconfiguration of logic designs to accommodate varying experimental situation without hardware rebuilding.
Design and Evaluation of a Scalable and Reconfigurable Multi-Platform System for Acoustic Imaging
Izquierdo, Alberto; Villacorta, Juan José; del Val Puente, Lara; Suárez, Luis
2016-01-01
This paper proposes a scalable and multi-platform framework for signal acquisition and processing, which allows for the generation of acoustic images using planar arrays of MEMS (Micro-Electro-Mechanical Systems) microphones with low development and deployment costs. Acoustic characterization of MEMS sensors was performed, and the beam pattern of a module, based on an 8 × 8 planar array and of several clusters of modules, was obtained. A flexible framework, formed by an FPGA, an embedded processor, a computer desktop, and a graphic processing unit, was defined. The processing times of the algorithms used to obtain the acoustic images, including signal processing and wideband beamforming via FFT, were evaluated in each subsystem of the framework. Based on this analysis, three frameworks are proposed, defined by the specific subsystems used and the algorithms shared. Finally, a set of acoustic images obtained from sound reflected from a person are presented as a case study in the field of biometric identification. These results reveal the feasibility of the proposed system. PMID:27727174
NASA Technical Reports Server (NTRS)
Seale, R. H.
1979-01-01
The prediction of the SRB and ET impact areas requires six separate processors. The SRB impact prediction processor computes the impact areas and related trajectory data for each SRB element. Output from this processor is stored on a secure file accessible by the SRB impact plot processor which generates the required plots. Similarly the ET RTLS impact prediction processor and the ET RTLS impact plot processor generates the ET impact footprints for return-to-launch-site (RTLS) profiles. The ET nominal/AOA/ATO impact prediction processor and the ET nominal/AOA/ATO impact plot processor generate the ET impact footprints for non-RTLS profiles. The SRB and ET impact processors compute the size and shape of the impact footprints by tabular lookup in a stored footprint dispersion data base. The location of each footprint is determined by simulating a reference trajectory and computing the reference impact point location. To insure consistency among all flight design system (FDS) users, much input required by these processors will be obtained from the FDS master data base.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barhen, Jacob; Imam, Neena
2007-01-01
Revolutionary computing technologies are defined in terms of technological breakthroughs, which leapfrog over near-term projected advances in conventional hardware and software to produce paradigm shifts in computational science. For underwater threat source localization using information provided by a dynamical sensor network, one of the most promising computational advances builds upon the emergence of digital optical-core devices. In this article, we present initial results of sensor network calculations that focus on the concept of signal wavefront time-difference-of-arrival (TDOA). The corresponding algorithms are implemented on the EnLight processing platform recently introduced by Lenslet Laboratories. This tera-scale digital optical core processor is optimizedmore » for array operations, which it performs in a fixed-point-arithmetic architecture. Our results (i) illustrate the ability to reach the required accuracy in the TDOA computation, and (ii) demonstrate that a considerable speed-up can be achieved when using the EnLight 64a prototype processor as compared to a dual Intel XeonTM processor.« less
Universal sensor interface module (USIM)
NASA Astrophysics Data System (ADS)
King, Don; Torres, A.; Wynn, John
1999-01-01
A universal sensor interface model (USIM) is being developed by the Raytheon-TI Systems Company for use with fields of unattended distributed sensors. In its production configuration, the USIM will be a multichip module consisting of a set of common modules. The common module USIM set consists of (1) a sensor adapter interface (SAI) module, (2) digital signal processor (DSP) and associated memory module, and (3) a RF transceiver model. The multispectral sensor interface is designed around a low-power A/D converted, whose input/output interface consists of: -8 buffered, sampled inputs from various devices including environmental, acoustic seismic and magnetic sensors. The eight sensor inputs are each high-impedance, low- capacitance, differential amplifiers. The inputs are ideally suited for interface with discrete or MEMS sensors, since the differential input will allow direct connection with high-impedance bridge sensors and capacitance voltage sources. Each amplifier is connected to a 22-bit (Delta) (Sigma) A/D converter to enable simultaneous samples. The low power (Delta) (Sigma) converter provides 22-bit resolution at sample frequencies up to 142 hertz (used for magnetic sensors) and 16-bit resolution at frequencies up to 1168 hertz (used for acoustic and seismic sensors). The video interface module is based around the TMS320C5410 DSP. It can provide sensor array addressing, video data input, data calibration and correction. The processor module is based upon a MPC555. It will be used for mode control, synchronization of complex sensors, sensor signal processing, array processing, target classification and tracking. Many functions of the A/D, DSP and transceiver can be powered down by using variable clock speeds under software command or chip power switches. They can be returned to intermediate or full operation by DSP command. Power management may be based on the USIM's internal timer, command from the USIM transceiver, or by sleep mode processing management. The low power detection mode is implemented by monitoring any of the sensor analog outputs at lower sample rates for detection over a software controllable threshold.
ICE: A Scalable, Low-Cost FPGA-Based Telescope Signal Processing and Networking System
NASA Astrophysics Data System (ADS)
Bandura, K.; Bender, A. N.; Cliche, J. F.; de Haan, T.; Dobbs, M. A.; Gilbert, A. J.; Griffin, S.; Hsyu, G.; Ittah, D.; Parra, J. Mena; Montgomery, J.; Pinsonneault-Marotte, T.; Siegel, S.; Smecher, G.; Tang, Q. Y.; Vanderlinde, K.; Whitehorn, N.
2016-03-01
We present an overview of the ‘ICE’ hardware and software framework that implements large arrays of interconnected field-programmable gate array (FPGA)-based data acquisition, signal processing and networking nodes economically. The system was conceived for application to radio, millimeter and sub-millimeter telescope readout systems that have requirements beyond typical off-the-shelf processing systems, such as careful control of interference signals produced by the digital electronics, and clocking of all elements in the system from a single precise observatory-derived oscillator. A new generation of telescopes operating at these frequency bands and designed with a vastly increased emphasis on digital signal processing to support their detector multiplexing technology or high-bandwidth correlators — data rates exceeding a terabyte per second — are becoming common. The ICE system is built around a custom FPGA motherboard that makes use of an Xilinx Kintex-7 FPGA and ARM-based co-processor. The system is specialized for specific applications through software, firmware and custom mezzanine daughter boards that interface to the FPGA through the industry-standard FPGA mezzanine card (FMC) specifications. For high density applications, the motherboards are packaged in 16-slot crates with ICE backplanes that implement a low-cost passive full-mesh network between the motherboards in a crate, allow high bandwidth interconnection between crates and enable data offload to a computer cluster. A Python-based control software library automatically detects and operates the hardware in the array. Examples of specific telescope applications of the ICE framework are presented, namely the frequency-multiplexed bolometer readout systems used for the South Pole Telescope (SPT) and Simons Array and the digitizer, F-engine, and networking engine for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) radio interferometers.
The Data Acquisition System of the Stockholm Educational Air Shower Array
NASA Astrophysics Data System (ADS)
Hofverberg, P.; Johansson, H.; Pearce, M.; Rydstrom, S.; Wikstrom, C.
2005-12-01
The Stockholm Educational Air Shower Array (SEASA) project is deploying an array of plastic scintillator detector stations on school roofs in the Stockholm area. Signals from GPS satellites are used to time synchronise signals from the widely separated detector stations, allowing cosmic ray air showers to be identified and studied. A low-cost and highly scalable data acquisition system has been produced using embedded Linux processors which communicate station data to a central server running a MySQL database. Air shower data can be visualised in real-time using a Java-applet client. It is also possible to query the database and manage detector stations from the client. In this paper, the design and performance of the system are described
Bioinspired architecture approach for a one-billion transistor smart CMOS camera chip
NASA Astrophysics Data System (ADS)
Fey, Dietmar; Komann, Marcus
2007-05-01
In the paper we present a massively parallel VLSI architecture for future smart CMOS camera chips with up to one billion transistors. To exploit efficiently the potential offered by future micro- or nanoelectronic devices traditional on central structures oriented parallel architectures based on MIMD or SIMD approaches will fail. They require too long and too many global interconnects for the distribution of code or the access to common memory. On the other hand nature developed self-organising and emergent principles to manage successfully complex structures based on lots of interacting simple elements. Therefore we developed a new as Marching Pixels denoted emergent computing paradigm based on a mixture of bio-inspired computing models like cellular automaton and artificial ants. In the paper we present different Marching Pixels algorithms and the corresponding VLSI array architecture. A detailed synthesis result for a 0.18 μm CMOS process shows that a 256×256 pixel image is processed in less than 10 ms assuming a moderate 100 MHz clock rate for the processor array. Future higher integration densities and a 3D chip stacking technology will allow the integration and processing of Mega pixels within the same time since our architecture is fully scalable.
Powerful conveyer belt real-time online detection system based on x-ray
NASA Astrophysics Data System (ADS)
Rong, Feng; Miao, Chang-yun; Meng, Wei
2009-07-01
The powerful conveyer belt is widely used in the mine, dock, and so on. After used for a long time, internal steel rope of the conveyor belt may fracture, rust, joints moving, and so on .This would bring potential safety problems. A kind of detection system based on x-ray is designed in this paper. Linear array detector (LDA) is used. LDA cost is low, response fast; technology mature .Output charge of LDA is transformed into differential voltage signal by amplifier. This kind of signal have great ability of anti-noise, is suitable for long-distance transmission. The processor is FPGA. A IP core control 4-channel A/D convertor, achieve parallel output data collection. Soft-core processor MicroBlaze which process tcp/ip protocol is embedded in FPGA. Sampling data are transferred to a computer via Ethernet. In order to improve the image quality, algorithm of getting rid of noise from the measurement result and taking gain normalization for pixel value is studied and designed. Experiments show that this system work well, can real-time online detect conveyor belt of width of 2.0m and speed of 5 m/s, does not affect the production. Image is clear, visual and can easily judge the situation of conveyor belt.
Real-time road detection in infrared imagery
NASA Astrophysics Data System (ADS)
Andre, Haritini E.; McCoy, Keith
1990-09-01
Automatic road detection is an important part in many scene recognition applications. The extraction of roads provides a means of navigation and position update for remotely piloted vehicles or autonomous vehicles. Roads supply strong contextual information which can be used to improve the performance of automatic target recognition (ATh) systems by directing the search for targets and adjusting target classification confidences. This paper will describe algorithmic techniques for labeling roads in high-resolution infrared imagery. In addition, realtime implementation of this structural approach using a processor array based on the Martin Marietta Geometric Arithmetic Parallel Processor (GAPPTh) chip will be addressed. The algorithm described is based on the hypothesis that a road consists of pairs of line segments separated by a distance "d" with opposite gradient directions (antiparallel). The general nature of the algorithm, in addition to its parallel implementation in a single instruction, multiple data (SIMD) machine, are improvements to existing work. The algorithm seeks to identify line segments meeting the road hypothesis in a manner that performs well, even when the side of the road is fragmented due to occlusion or intersections. The use of geometrical relationships between line segments is a powerful yet flexible method of road classification which is independent of orientation. In addition, this approach can be used to nominate other types of objects with minor parametric changes.
Algorithms for Automatic Alignment of Arrays
NASA Technical Reports Server (NTRS)
Chatterjee, Siddhartha; Gilbert, John R.; Oliker, Leonid; Schreiber, Robert; Sheffler, Thomas J.
1996-01-01
Aggregate data objects (such as arrays) are distributed across the processor memories when compiling a data-parallel language for a distributed-memory machine. The mapping determines the amount of communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: an alignment that maps all the objects to an abstract template, followed by a distribution that maps the template to the processors. This paper describes algorithms for solving the various facets of the alignment problem: axis and stride alignment, static and mobile offset alignment, and replication labeling. We show that optimal axis and stride alignment is NP-complete for general program graphs, and give a heuristic method that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. We also show how local graph contractions can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. We show how to model the static offset alignment problem using linear programming, and we show that loop-dependent mobile offset alignment is sometimes necessary for optimum performance. We describe an algorithm with for determining mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself or can be used to improve performance. We describe an algorithm based on network flow that replicates objects so as to minimize the total amount of broadcast communication in replication.
Implementation of a cone-beam backprojection algorithm on the cell broadband engine processor
NASA Astrophysics Data System (ADS)
Bockenbach, Olivier; Knaup, Michael; Kachelrieß, Marc
2007-03-01
Tomographic image reconstruction is computationally very demanding. In all cases the backprojection represents the performance bottleneck due to the high operational count and due to the high demand put on the memory subsystem. In the past, solving this problem has lead to the implementation of specific architectures, connecting Application Specific Integrated Circuits (ASICs) or Field Programmable Gate Arrays (FPGAs) to memory through dedicated high speed busses. More recently, there have also been attempt to use Graphic Processing Units (GPUs) to perform the backprojection step. Originally aimed at the gaming market, IBM, Toshiba and Sony have introduced the Cell Broadband Engine (CBE) processor, often considered as a multicomputer on a chip. Clocked at 3 GHz, the Cell allows for a theoretical performance of 192 GFlops and a peak data transfer rate over the internal bus of 200 GB/s. This performance indeed makes the Cell a very attractive architecture for implementing tomographic image reconstruction algorithms. In this study, we investigate the relative performance of a perspective backprojection algorithm when implemented on a standard PC and on the Cell processor. We compare these results to the performance achievable with FPGAs based boards and high end GPUs. The cone-beam backprojection performance was assessed by backprojecting a full circle scan of 512 projections of 1024x1024 pixels into a volume of size 512x512x512 voxels. It took 3.2 minutes on the PC (single CPU) and is as fast as 13.6 seconds on the Cell.
Examination of roundwood utilization rates in West Virginia
Shawn T. Grushecky; Jan Wiedenbeck; Curt C. Hassler
2013-01-01
Forest harvesting is an integral part of the West Virginia forest economy. This component of the supply chain supports a diverse array of primary and secondary processors. A key metric used to describe the efficiency of the roundwood extraction process is the logging utilization factor (LUF). The LUF is one way managers can discern the overall use of harvested...
Progress in video immersion using Panospheric imaging
NASA Astrophysics Data System (ADS)
Bogner, Stephen L.; Southwell, David T.; Penzes, Steven G.; Brosinsky, Chris A.; Anderson, Ron; Hanna, Doug M.
1998-09-01
Having demonstrated significant technical and marketplace advantages over other modalities for video immersion, PanosphericTM Imaging (PI) continues to evolve rapidly. This paper reports on progress achieved since AeroSense 97. The first practical field deployment of the technology occurred in June-August 1997 during the NASA-CMU 'Atacama Desert Trek' activity, where the Nomad mobile robot was teleoperated via immersive PanosphericTM imagery from a distance of several thousand kilometers. Research using teleoperated vehicles at DRES has also verified the exceptional utility of the PI technology for achieving high levels of situational awareness, operator confidence, and mission effectiveness. Important performance enhancements have been achieved with the completion of the 4th Generation PI DSP-based array processor system. The system is now able to provide dynamic full video-rate generation of spatial and computational transformations, resulting in a programmable and fully interactive immersive video telepresence. A new multi- CCD camera architecture has been created to exploit the bandwidth of this processor, yielding a well-matched PI system with greatly improved resolution. While the initial commercial application for this technology is expected to be video tele- conferencing, it also appears to have excellent potential for application in the 'Immersive Cockpit' concept. Additional progress is reported in the areas of Long Wave Infrared PI Imaging, Stereo PI concepts, PI based Video-Servoing concepts, PI based Video Navigation concepts, and Foveation concepts (to merge localized high-resolution views with immersive views).
Precision orbit raising trajectories. [solar electric propulsion orbital transfer program
NASA Technical Reports Server (NTRS)
Flanagan, P. F.; Horsewood, J. L.; Pines, S.
1975-01-01
A precision trajectory program has been developed to serve as a test bed for geocentric orbit raising steering laws. The steering laws to be evaluated have been developed using optimization methods employing averaging techniques. This program provides the capability of testing the steering laws in a precision simulation. The principal system models incorporated in the program are described, including the radiation environment, the solar array model, the thrusters and power processors, the geopotential, and the solar system. Steering and array orientation constraints are discussed, and the impact of these constraints on program design is considered.
The SKA1 LOW telescope: system architecture and design performance
NASA Astrophysics Data System (ADS)
Waterson, Mark F.; Labate, Maria Grazia; Schnetler, Hermine; Wagg, Jeff; Turner, Wallace; Dewdney, Peter
2016-07-01
The SKA1-LOW radio telescope will be a low-frequency (50-350 MHz) aperture array located in Western Australia. Its scientific objectives will prioritize studies of the Epoch of Reionization and pulsar physics. Development of the telescope has been allocated to consortia responsible for the aperture array front end, timing distribution, signal and data transport, correlation and beamforming signal processors, infrastructure, monitor and control systems, and science data processing. This paper will describe the system architectural design and key performance parameters of the telescope and summarize the high-level sub-system designs of the consortia.
Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo
2011-01-01
This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH4/H2) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes—a slave node and a master node. The former comprises a Fe2O3 gas sensing array for the combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing the sensor array data and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected with a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least square support vector regression (LS-SVR)estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process. PMID:22346587
Advanced On-Board Processor (AOP). [for future spacecraft applications
NASA Technical Reports Server (NTRS)
1973-01-01
Advanced On-board Processor the (AOP) uses large scale integration throughout and is the most advanced space qualified computer of its class in existence today. It was designed to satisfy most spacecraft requirements which are anticipated over the next several years. The AOP design utilizes custom metallized multigate arrays (CMMA) which have been designed specifically for this computer. This approach provides the most efficient use of circuits, reduces volume, weight, assembly costs and provides for a significant increase in reliability by the significant reduction in conventional circuit interconnections. The required 69 CMMA packages are assembled on a single multilayer printed circuit board which together with associated connectors constitutes the complete AOP. This approach also reduces conventional interconnections thus further reducing weight, volume and assembly costs.
Spacecraft on-board SAR image generation for EOS-type missions
NASA Technical Reports Server (NTRS)
Liu, K. Y.; Arens, W. E.; Assal, H. M.; Vesecky, J. F.
1987-01-01
Spacecraft on-board synthetic aperture radar (SAR) image generation is an extremely difficult problem because of the requirements for high computational rates (usually on the order of Giga-operations per second), high reliability (some missions last up to 10 years), and low power dissipation and mass (typically less than 500 watts and 100 Kilograms). Recently, a JPL study was performed to assess the feasibility of on-board SAR image generation for EOS-type missions. This paper summarizes the results of that study. Specifically, it proposes a processor architecture using a VLSI time-domain parallel array for azimuth correlation. Using available space qualifiable technology to implement the proposed architecture, an on-board SAR processor having acceptable power and mass characteristics appears feasible for EOS-type applications.
Facile fabrication of nanofluidic diode membranes using anodic aluminium oxide.
Wu, Songmei; Wildhaber, Fabien; Vazquez-Mena, Oscar; Bertsch, Arnaud; Brugger, Juergen; Renaud, Philippe
2012-09-21
Active control of ion transport plays important roles in chemical and biological analytical processes. Nanofluidic systems hold the promise for such control through electrostatic interaction between ions and channel surfaces. Most existing experiments rely on planar geometry where the nanochannels are generally very long and shallow with large aspect ratios. Based on this configuration the concepts of nanofluidic gating and rectification have been successfully demonstrated. However, device minimization and throughput scaling remain significant challenges. We report here an innovative and facile realization of hetero-structured Al(2)O(3)/SiO(2) (Si) nanopore array membranes by using pattern transfer of self-organized nanopore structures of anodic aluminum oxide (AAO). Thanks to the opposite surface charge states of Al(2)O(3) (positive) and SiO(2) (negative), the membrane exhibits clear rectification of ion current in electrolyte solutions with very low aspect ratios compared to previous approaches. Our hetero-structured nanopore arrays provide a valuable platform for high throughput applications such as molecular separation, chemical processors and energy conversion.
NASA Astrophysics Data System (ADS)
Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.
2017-11-01
Current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphic processors have become as commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, big data created huge demand for data processing activities and such kind of throughput intensive applications inherently contains data level parallelism which is more suited for SIMD architecture based GPU. This paper reviews the architectural aspects of multi/many core processors and graphics processors. Different case studies are taken to compare performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.
Use of Field Programmable Gate Array Technology in Future Space Avionics
NASA Technical Reports Server (NTRS)
Ferguson, Roscoe C.; Tate, Robert
2005-01-01
Fulfilling NASA's new vision for space exploration requires the development of sustainable, flexible and fault tolerant spacecraft control systems. The traditional development paradigm consists of the purchase or fabrication of hardware boards with fixed processor and/or Digital Signal Processing (DSP) components interconnected via a standardized bus system. This is followed by the purchase and/or development of software. This paradigm has several disadvantages for the development of systems to support NASA's new vision. Building a system to be fault tolerant increases the complexity and decreases the performance of included software. Standard bus design and conventional implementation produces natural bottlenecks. Configuring hardware components in systems containing common processors and DSPs is difficult initially and expensive or impossible to change later. The existence of Hardware Description Languages (HDLs), the recent increase in performance, density and radiation tolerance of Field Programmable Gate Arrays (FPGAs), and Intellectual Property (IP) Cores provides the technology for reprogrammable Systems on a Chip (SOC). This technology supports a paradigm better suited for NASA's vision. Hardware and software production are melded for more effective development; they can both evolve together over time. Designers incorporating this technology into future avionics can benefit from its flexibility. Systems can be designed with improved fault isolation and tolerance using hardware instead of software. Also, these designs can be protected from obsolescence problems where maintenance is compromised via component and vendor availability.To investigate the flexibility of this technology, the core of the Central Processing Unit and Input/Output Processor of the Space Shuttle AP101S Computer were prototyped in Verilog HDL and synthesized into an Altera Stratix FPGA.
NASA Technical Reports Server (NTRS)
Batcher, K. E.; Eddey, E. E.; Faiss, R. O.; Gilmore, P. A.
1981-01-01
The processing of synthetic aperture radar (SAR) signals using the massively parallel processor (MPP) is discussed. The fast Fourier transform convolution procedures employed in the algorithms are described. The MPP architecture comprises an array unit (ARU) which processes arrays of data; an array control unit which controls the operation of the ARU and performs scalar arithmetic; a program and data management unit which controls the flow of data; and a unique staging memory (SM) which buffers and permutes data. The ARU contains a 128 by 128 array of bit-serial processing elements (PE). Two-by-four surarrays of PE's are packaged in a custom VLSI HCMOS chip. The staging memory is a large multidimensional-access memory which buffers and permutes data flowing with the system. Efficient SAR processing is achieved via ARU communication paths and SM data manipulation. Real time processing capability can be realized via a multiple ARU, multiple SM configuration.
Microsystem enabled photovoltaic modules and systems
Nielson, Gregory N; Sweatt, William C; Okandan, Murat
2015-05-12
A microsystem enabled photovoltaic (MEPV) module including: an absorber layer; a fixed optic layer coupled to the absorber layer; a translatable optic layer; a translation stage coupled between the fixed and translatable optic layers; and a motion processor electrically coupled to the translation stage to controls motion of the translatable optic layer relative to the fixed optic layer. The absorber layer includes an array of photovoltaic (PV) elements. The fixed optic layer includes an array of quasi-collimating (QC) micro-optical elements designed and arranged to couple incident radiation from an intermediate image formed by the translatable optic layer into one of the PV elements such that it is quasi-collimated. The translatable optic layer includes an array of focusing micro-optical elements corresponding to the QC micro-optical element array. Each focusing micro-optical element is designed to produce a quasi-telecentric intermediate image from substantially collimated radiation incident within a predetermined field of view.
The computational structural mechanics testbed generic structural-element processor manual
NASA Technical Reports Server (NTRS)
Stanley, Gary M.; Nour-Omid, Shahram
1990-01-01
The usage and development of structural finite element processors based on the CSM Testbed's Generic Element Processor (GEP) template is documented. By convention, such processors have names of the form ESi, where i is an integer. This manual is therefore intended for both Testbed users who wish to invoke ES processors during the course of a structural analysis, and Testbed developers who wish to construct new element processors (or modify existing ones).
Optical information processing at NASA Ames Research Center
NASA Technical Reports Server (NTRS)
Reid, Max B.; Bualat, Maria G.; Cho, Young C.; Downie, John D.; Gary, Charles K.; Ma, Paul W.; Ozcan, Meric; Pryor, Anna H.; Spirkovska, Lilly
1993-01-01
The combination of analog optical processors with digital electronic systems offers the potential of tera-OPS computational performance, while often requiring less power and weight relative to all-digital systems. NASA is working to develop and demonstrate optical processing techniques for on-board, real time science and mission applications. Current research areas and applications under investigation include optical matrix processing for space structure vibration control and the analysis of Space Shuttle Main Engine plume spectra, optical correlation-based autonomous vision for robotic vehicles, analog computation for robotic path planning, free-space optical interconnections for information transfer within digital electronic computers, and multiplexed arrays of fiber optic interferometric sensors for acoustic and vibration measurements.
Algorithmic commonalities in the parallel environment
NASA Technical Reports Server (NTRS)
Mcanulty, Michael A.; Wainer, Michael S.
1987-01-01
The ultimate aim of this project was to analyze procedures from substantially different application areas to discover what is either common or peculiar in the process of conversion to the Massively Parallel Processor (MPP). Three areas were identified: molecular dynamic simulation, production systems (rule systems), and various graphics and vision algorithms. To date, only selected graphics procedures have been investigated. They are the most readily available, and produce the most visible results. These include simple polygon patch rendering, raycasting against a constructive solid geometric model, and stochastic or fractal based textured surface algorithms. Only the simplest of conversion strategies, mapping a major loop to the array, has been investigated so far. It is not entirely satisfactory.
Ultra-Compact Transputer-Based Controller for High-Level, Multi-Axis Coordination
NASA Technical Reports Server (NTRS)
Zenowich, Brian; Crowell, Adam; Townsend, William T.
2013-01-01
The design of machines that rely on arrays of servomotors such as robotic arms, orbital platforms, and combinations of both, imposes a heavy computational burden to coordinate their actions to perform coherent tasks. For example, the robotic equivalent of a person tracing a straight line in space requires enormously complex kinematics calculations, and complexity increases with the number of servo nodes. A new high-level architecture for coordinated servo-machine control enables a practical, distributed transputer alternative to conventional central processor electronics. The solution is inherently scalable, dramatically reduces bulkiness and number of conductor runs throughout the machine, requires only a fraction of the power, and is designed for cooling in a vacuum.
Biomimetic machine vision system.
Harman, William M; Barrett, Steven F; Wright, Cameron H G; Wilcox, Michael
2005-01-01
Real-time application of digital imaging for use in machine vision systems has proven to be prohibitive when used within control systems that employ low-power single processors without compromising the scope of vision or resolution of captured images. Development of a real-time machine analog vision system is the focus of research taking place at the University of Wyoming. This new vision system is based upon the biological vision system of the common house fly. Development of a single sensor is accomplished, representing a single facet of the fly's eye. This new sensor is then incorporated into an array of sensors capable of detecting objects and tracking motion in 2-D space. This system "preprocesses" incoming image data resulting in minimal data processing to determine the location of a target object. Due to the nature of the sensors in the array, hyperacuity is achieved thereby eliminating resolutions issues found in digital vision systems. In this paper, we will discuss the biological traits of the fly eye and the specific traits that led to the development of this machine vision system. We will also discuss the process of developing an analog based sensor that mimics the characteristics of interest in the biological vision system. This paper will conclude with a discussion of how an array of these sensors can be applied toward solving real-world machine vision issues.
Sub-nanosecond clock synchronization and trigger management in the nuclear physics experiment AGATA
NASA Astrophysics Data System (ADS)
Bellato, M.; Bortolato, D.; Chavas, J.; Isocrate, R.; Rampazzo, G.; Triossi, A.; Bazzacco, D.; Mengoni, D.; Recchia, F.
2013-07-01
The new-generation spectrometer AGATA, the Advanced GAmma Tracking Array, requires sub-nanosecond clock synchronization among readout and front-end electronics modules that may lie hundred meters apart. We call GTS (Global Trigger and Synchronization System) the infrastructure responsible for precise clock synchronization and for the trigger management of AGATA. It is made of a central trigger processor and nodes, connected in a tree structure by means of optical fibers operated at 2Gb/s. The GTS tree handles the synchronization and the trigger data flow, whereas the trigger processor analyses and eventually validates the trigger primitives centrally. Sub-nanosecond synchronization is achieved by measuring two different types of round-trip times and by automatically correcting for phase-shift differences. For a tree of depth two, the peak-to-peak clock jitter at each leaf is 70 ps; the mean phase difference is 180 ps, while the standard deviation over such phase difference, namely the phase equalization repeatability, is 20 ps. The GTS system has run flawlessly for the two-year long AGATA campaign, held at the INFN Legnaro National Laboratories, Italy, where five triple clusters of the AGATA sub-array were coupled with a variety of ancillary detectors.
Research in the design of high-performance reconfigurable systems
NASA Technical Reports Server (NTRS)
Slotnick, D. L.; Mcewan, S. D.; Spry, A. J.
1984-01-01
An initial design for the Bit Processor (BP) referred to in prior reports as the Processing Element or PE has been completed. Eight BP's, together with their supporting random-access memory, a 64 k x 9 ROM to perform addition, routing logic, and some additional logic, constitute the components of a single stage. An initial stage design is given. Stages may be combined to perform high-speed fixed or floating point arithmetic. Stages can be configured into a range of arithmetic modules that includes bit-serial one or two-dimensional arrays; one or two dimensional arrays fixed or floating point processors; and specialized uniprocessors, such as long-word arithmetic units. One to eight BP's represent a likely initial chip level. The Stage would then correspond to a first-level pluggable module. As both this project and VLSI CAD/CAM progress, however, it is expected that the chip level would migrate upward to the stage and, perhaps, ultimately the box level. The BP RAM, consisting of two banks, holds only operands and indices. Programs are at the box (high-level function) and system level. At the system level initial effort has been concentrated on specifying the tools needed to evaluate design alternatives.
Stereo and IMU-Assisted Visual Odometry for Small Robots
NASA Technical Reports Server (NTRS)
2012-01-01
This software performs two functions: (1) taking stereo image pairs as input, it computes stereo disparity maps from them by cross-correlation to achieve 3D (three-dimensional) perception; (2) taking a sequence of stereo image pairs as input, it tracks features in the image sequence to estimate the motion of the cameras between successive image pairs. A real-time stereo vision system with IMU (inertial measurement unit)-assisted visual odometry was implemented on a single 750 MHz/520 MHz OMAP3530 SoC (system on chip) from TI (Texas Instruments). Frame rates of 46 fps (frames per second) were achieved at QVGA (Quarter Video Graphics Array i.e. 320 240), or 8 fps at VGA (Video Graphics Array 640 480) resolutions, while simultaneously tracking up to 200 features, taking full advantage of the OMAP3530's integer DSP (digital signal processor) and floating point ARM processors. This is a substantial advancement over previous work as the stereo implementation produces 146 Mde/s (millions of disparities evaluated per second) in 2.5W, yielding a stereo energy efficiency of 58.8 Mde/J, which is 3.75 better than prior DSP stereo while providing more functionality.
NASA Astrophysics Data System (ADS)
Wang, Zhe; Wang, Wen-Qin; Shao, Huaizong
2016-12-01
Different from the phased-array using the same carrier frequency for each transmit element, the frequency diverse array (FDA) uses a small frequency offset across the array elements to produce range-angle-dependent transmit beampattern. FDA radar provides new application capabilities and potentials due to its range-dependent transmit array beampattern, but the FDA using linearly increasing frequency offsets will produce a range and angle coupled transmit beampattern. In order to decouple the range-azimuth beampattern for FDA radar, this paper proposes a uniform linear array (ULA) FDA using Costas-sequence modulated frequency offsets to produce random-like energy distribution in the transmit beampattern and thumbtack transmit-receive beampattern. In doing so, the range and angle of targets can be unambiguously estimated through matched filtering and subspace decomposition algorithms in the receiver signal processor. Moreover, random-like energy distributed beampattern can also be utilized for low probability of intercept (LPI) radar applications. Numerical results show that the proposed scheme outperforms the standard FDA in focusing the transmit energy, especially in the range dimension.
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.
1994-01-01
Electronic and optoelectronic hardware implementations of highly parallel computing architectures address several ill-defined and/or computation-intensive problems not easily solved by conventional computing techniques. The concurrent processing architectures developed are derived from a variety of advanced computing paradigms including neural network models, fuzzy logic, and cellular automata. Hardware implementation technologies range from state-of-the-art digital/analog custom-VLSI to advanced optoelectronic devices such as computer-generated holograms and e-beam fabricated Dammann gratings. JPL's concurrent processing devices group has developed a broad technology base in hardware implementable parallel algorithms, low-power and high-speed VLSI designs and building block VLSI chips, leading to application-specific high-performance embeddable processors. Application areas include high throughput map-data classification using feedforward neural networks, terrain based tactical movement planner using cellular automata, resource optimization (weapon-target assignment) using a multidimensional feedback network with lateral inhibition, and classification of rocks using an inner-product scheme on thematic mapper data. In addition to addressing specific functional needs of DOD and NASA, the JPL-developed concurrent processing device technology is also being customized for a variety of commercial applications (in collaboration with industrial partners), and is being transferred to U.S. industries. This viewgraph p resentation focuses on two application-specific processors which solve the computation intensive tasks of resource allocation (weapon-target assignment) and terrain based tactical movement planning using two extremely different topologies. Resource allocation is implemented as an asynchronous analog competitive assignment architecture inspired by the Hopfield network. Hardware realization leads to a two to four order of magnitude speed-up over conventional techniques and enables multiple assignments, (many to many), not achievable with standard statistical approaches. Tactical movement planning (finding the best path from A to B) is accomplished with a digital two-dimensional concurrent processor array. By exploiting the natural parallel decomposition of the problem in silicon, a four order of magnitude speed-up over optimized software approaches has been demonstrated.
Extending Automatic Parallelization to Optimize High-Level Abstractions for Multicore
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liao, C; Quinlan, D J; Willcock, J J
2008-12-12
Automatic introduction of OpenMP for sequential applications has attracted significant attention recently because of the proliferation of multicore processors and the simplicity of using OpenMP to express parallelism for shared-memory systems. However, most previous research has only focused on C and Fortran applications operating on primitive data types. C++ applications using high-level abstractions, such as STL containers and complex user-defined types, are largely ignored due to the lack of research compilers that are readily able to recognize high-level object-oriented abstractions and leverage their associated semantics. In this paper, we automatically parallelize C++ applications using ROSE, a multiple-language source-to-source compiler infrastructuremore » which preserves the high-level abstractions and gives us access to their semantics. Several representative parallelization candidate kernels are used to explore semantic-aware parallelization strategies for high-level abstractions, combined with extended compiler analyses. Those kernels include an array-base computation loop, a loop with task-level parallelism, and a domain-specific tree traversal. Our work extends the applicability of automatic parallelization to modern applications using high-level abstractions and exposes more opportunities to take advantage of multicore processors.« less
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery
Qi, Baogui; Zhuang, Yin; Chen, He; Chen, Liang
2018-01-01
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited. PMID:29693585
Smart Sensors: Why and when the origin was and why and where the future will be
NASA Astrophysics Data System (ADS)
Corsi, C.
2013-12-01
Smart Sensors is a technique developed in the 70's when the processing capabilities, based on readout integrated with signal processing, was still far from the complexity needed in advanced IR surveillance and warning systems, because of the enormous amount of noise/unwanted signals emitted by operating scenario especially in military applications. The Smart Sensors technology was kept restricted within a close military environment exploding in applications and performances in the 90's years thanks to the impressive improvements in the integrated signal read-out and processing achieved by CCD-CMOS technologies in FPA. In fact the rapid advances of "very large scale integration" (VLSI) processor technology and mosaic EO detector array technology allowed to develop new generations of Smart Sensors with much improved signal processing by integrating microcomputers and other VLSI signal processors. inside the sensor structure achieving some basic functions of living eyes (dynamic stare, non-uniformity compensation, spatial and temporal filtering). New and future technologies (Nanotechnology, Bio-Organic Electronics, Bio-Computing) are lightning a new generation of Smart Sensors extending the Smartness from the Space-Time Domain to Spectroscopic Functional Multi-Domain Signal Processing. History and future forecasting of Smart Sensors will be reported.
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery.
Qi, Baogui; Shi, Hao; Zhuang, Yin; Chen, He; Chen, Liang
2018-04-25
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited.
75 FR 52507 - Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-26
... standards designed to ensure that all catch delivered to the processor is accurately weighed and accounted... NMFS for catcher/processors and motherships is based on the vessel meeting a series of design criteria. Because of the wide variations in factory layout for inshore processors, NMFS requires a performance-based...
Design for a Manufacturing Method for Memristor-Based Neuromorphic Computing Processors
2013-03-01
DESIGN FOR A MANUFACTURING METHOD FOR MEMRISTOR- BASED NEUROMORPHIC COMPUTING PROCESSORS UNIVERSITY OF PITTSBURGH MARCH 2013...BASED NEUROMORPHIC COMPUTING PROCESSORS 5a. CONTRACT NUMBER FA8750-11-1-0271 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 62788F 6. AUTHOR(S...synapses and implemented a neuromorphic computing system based on our proposed synapse designs. The robustness of our system is also evaluated by
Performances of multiprocessor multidisk architectures for continuous media storage
NASA Astrophysics Data System (ADS)
Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.
1996-03-01
Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400Mbytes/s) and that an architecture with addressable local memories located closely to their respective processors could partially remove this bottleneck. The point- to-point architecture is scalable and able to sustain high throughputs for simultaneous compute- bound and data-bound operations.
Ion propulsion cost effectivity
NASA Technical Reports Server (NTRS)
Zafran, S.; Biess, J. J.
1978-01-01
Ion propulsion modules employing 8-cm thrusters and 30-cm thrusters were studied for Multimission Modular Spacecraft (MMS) applications. Recurring and nonrecurring cost elements were generated for these modules. As a result, ion propulsion cost drivers were identified to be Shuttle charges, solar array, power processing, and thruster costs. Cost effective design approaches included short length module configurations, array power sharing, operation at reduced thruster input power, simplified power processing units, and power processor output switching. The MMS mission model employed indicated that nonrecurring costs have to be shared with other programs unless the mission model grows. Extended performance missions exhibited the greatest benefits when compared with monopropellant hydrazine propulsion.
Cluster Computing For Real Time Seismic Array Analysis.
NASA Astrophysics Data System (ADS)
Martini, M.; Giudicepietro, F.
A seismic array is an instrument composed by a dense distribution of seismic sen- sors that allow to measure the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over the last years arrays have been widely used in different fields of seismological researches. In particular they are applied in the investigation of seismic sources on volcanoes where they can be suc- cessfully used for studying the volcanic microtremor and long period events which are critical for getting information on the volcanic systems evolution. For this reason arrays could be usefully employed for the volcanoes monitoring, however the huge amount of data produced by this type of instruments and the processing techniques which are quite time consuming limited their potentiality for this application. In order to favor a direct application of arrays techniques to continuous volcano monitoring we designed and built a small PC cluster able to near real time computing the kinematics properties of the wavefield (slowness or wavenumber vector) produced by local seis- mic source. The cluster is composed of 8 Intel Pentium-III bi-processors PC working at 550 MHz, and has 4 Gigabytes of RAM memory. It runs under Linux operating system. The developed analysis software package is based on the Multiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source imple- mentation of the Message Passing Interface (MPI). The developed software system includes modules devote to receiving date by internet and graphical applications for the continuous displaying of the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999 when two dense seismic arrays have been deployed on the northeast and the southeast flanks of this volcano. A real time continuous acquisition system has been simulated by a pro- gram which reads data from disk files and send them to a remote host by using the Internet protocols.
Space Tug Avionics Definition Study. Volume 5: Cost and Programmatics
NASA Technical Reports Server (NTRS)
1975-01-01
The baseline avionics system features a central digital computer that integrates the functions of all the space tug subsystems by means of a redundant digital data bus. The central computer consists of dual central processor units, dual input/output processors, and a fault tolerant memory, utilizing internal redundancy and error checking. Three electronically steerable phased arrays provide downlink transmission from any tug attitude directly to ground or via TDRS. Six laser gyros and six accelerometers in a dodecahedron configuration make up the inertial measurement unit. Both a scanning laser radar and a TV system, employing strobe lamps, are required as acquisition and docking sensors. Primary dc power at a nominal 28 volts is supplied from dual lightweight, thermally integrated fuel cells which operate from propellant grade reactants out of the main tanks.
A data base processor semantics specification package
NASA Technical Reports Server (NTRS)
Fishwick, P. A.
1983-01-01
A Semantics Specification Package (DBPSSP) for the Intel Data Base Processor (DBP) is defined. DBPSSP serves as a collection of cross assembly tools that allow the analyst to assemble request blocks on the host computer for passage to the DBP. The assembly tools discussed in this report may be effectively used in conjunction with a DBP compatible data communications protocol to form a query processor, precompiler, or file management system for the database processor. The source modules representing the components of DBPSSP are fully commented and included.
NASA Technical Reports Server (NTRS)
2008-01-01
As Global Positioning Satellite (GPS) applications become more prevalent for land- and air-based vehicles, GPS applications for space vehicles will also increase. The Applied Technology Directorate of Kennedy Space Center (KSC) has developed a lightweight, low-cost GPS Metric Tracking Unit (GMTU), the first of two steps in developing a lightweight, low-cost Space-Based Tracking and Command Subsystem (STACS) designed to meet Range Safety's link margin and latency requirements for vehicle command and telemetry data. The goals of STACS are to improve Range Safety operations and expand tracking capabilities for space vehicles. STACS will track the vehicle, receive commands, and send telemetry data through the space-based asset, which will dramatically reduce dependence on ground-based assets. The other step was the Low-Cost Tracking and Data Relay Satellite System (TDRSS) Transceiver (LCT2), developed by the Wallops Flight Facility (WFF), which allows the vehicle to communicate with a geosynchronous relay satellite. Although the GMTU and LCT2 were independently implemented and tested, the design collaboration of KSC and WFF engineers allowed GMTU and LCT2 to be integrated into one enclosure, leading to the final STACS. In operation, GMTU needs only a radio frequency (RF) input from a GPS antenna and outputs position and velocity data to the vehicle through a serial or pulse code modulation (PCM) interface. GMTU includes one commercial GPS receiver board and a custom board, the Command and Telemetry Processor (CTP) developed by KSC. The CTP design is based on a field-programmable gate array (FPGA) with embedded processors to support GPS functions.
Yang, Chen; Li, Bingyi; Chen, Liang; Wei, Chunpeng; Xie, Yizhuang; Chen, He; Yu, Wenyue
2017-06-24
With the development of satellite load technology and very large scale integrated (VLSI) circuit technology, onboard real-time synthetic aperture radar (SAR) imaging systems have become a solution for allowing rapid response to disasters. A key goal of the onboard SAR imaging system design is to achieve high real-time processing performance with severe size, weight, and power consumption constraints. In this paper, we analyse the computational burden of the commonly used chirp scaling (CS) SAR imaging algorithm. To reduce the system hardware cost, we propose a partial fixed-point processing scheme. The fast Fourier transform (FFT), which is the most computation-sensitive operation in the CS algorithm, is processed with fixed-point, while other operations are processed with single precision floating-point. With the proposed fixed-point processing error propagation model, the fixed-point processing word length is determined. The fidelity and accuracy relative to conventional ground-based software processors is verified by evaluating both the point target imaging quality and the actual scene imaging quality. As a proof of concept, a field- programmable gate array-application-specific integrated circuit (FPGA-ASIC) hybrid heterogeneous parallel accelerating architecture is designed and realized. The customized fixed-point FFT is implemented using the 130 nm complementary metal oxide semiconductor (CMOS) technology as a co-processor of the Xilinx xc6vlx760t FPGA. A single processing board requires 12 s and consumes 21 W to focus a 50-km swath width, 5-m resolution stripmap SAR raw data with a granularity of 16,384 × 16,384.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feo, J.T.
1993-10-01
This report contain papers on: Programmability and performance issues; The case of an iterative partial differential equation solver; Implementing the kernal of the Australian Region Weather Prediction Model in Sisal; Even and quarter-even prime length symmetric FFTs and their Sisal Implementations; Top-down thread generation for Sisal; Overlapping communications and computations on NUMA architechtures; Compiling technique based on dataflow analysis for funtional programming language Valid; Copy elimination for true multidimensional arrays in Sisal 2.0; Increasing parallelism for an optimization that reduces copying in IF2 graphs; Caching in on Sisal; Cache performance of Sisal Vs. FORTRAN; FFT algorithms on a shared-memory multiprocessor;more » A parallel implementation of nonnumeric search problems in Sisal; Computer vision algorithms in Sisal; Compilation of Sisal for a high-performance data driven vector processor; Sisal on distributed memory machines; A virtual shared addressing system for distributed memory Sisal; Developing a high-performance FFT algorithm in Sisal for a vector supercomputer; Implementation issues for IF2 on a static data-flow architechture; and Systematic control of parallelism in array-based data-flow computation. Selected papers have been indexed separately for inclusion in the Energy Science and Technology Database.« less
MAP3D: a media processor approach for high-end 3D graphics
NASA Astrophysics Data System (ADS)
Darsa, Lucia; Stadnicki, Steven; Basoglu, Chris
1999-12-01
Equator Technologies, Inc. has used a software-first approach to produce several programmable and advanced VLIW processor architectures that have the flexibility to run both traditional systems tasks and an array of media-rich applications. For example, Equator's MAP1000A is the world's fastest single-chip programmable signal and image processor targeted for digital consumer and office automation markets. The Equator MAP3D is a proposal for the architecture of the next generation of the Equator MAP family. The MAP3D is designed to achieve high-end 3D performance and a variety of customizable special effects by combining special graphics features with high performance floating-point and media processor architecture. As a programmable media processor, it offers the advantages of a completely configurable 3D pipeline--allowing developers to experiment with different algorithms and to tailor their pipeline to achieve the highest performance for a particular application. With the support of Equator's advanced C compiler and toolkit, MAP3D programs can be written in a high-level language. This allows the compiler to successfully find and exploit any parallelism in a programmer's code, thus decreasing the time to market of a given applications. The ability to run an operating system makes it possible to run concurrent applications in the MAP3D chip, such as video decoding while executing the 3D pipelines, so that integration of applications is easily achieved--using real-time decoded imagery for texturing 3D objects, for instance. This novel architecture enables an affordable, integrated solution for high performance 3D graphics.
High-Speed Current dq PI Controller for Vector Controlled PMSM Drive
Reaz, Mamun Bin Ibne; Rahman, Labonnah Farzana; Chang, Tae Gyu
2014-01-01
High-speed current controller for vector controlled permanent magnet synchronous motor (PMSM) is presented. The controller is developed based on modular design for faster calculation and uses fixed-point proportional-integral (PI) method for improved accuracy. Current dq controller is usually implemented in digital signal processor (DSP) based computer. However, DSP based solutions are reaching their physical limits, which are few microseconds. Besides, digital solutions suffer from high implementation cost. In this research, the overall controller is realizing in field programmable gate array (FPGA). FPGA implementation of the overall controlling algorithm will certainly trim down the execution time significantly to guarantee the steadiness of the motor. Agilent 16821A Logic Analyzer is employed to validate the result of the implemented design in FPGA. Experimental results indicate that the proposed current dq PI controller needs only 50 ns of execution time in 40 MHz clock, which is the lowest computational cycle for the era. PMID:24574913
Adaptive DFT-based Interferometer Fringe Tracking
NASA Technical Reports Server (NTRS)
Wilson, Edward; Pedretti, Ettore; Bregman, Jesse; Mah, Robert W.; Traub, Wesley A.
2004-01-01
An automatic interferometer fringe tracking system has been developed, implemented, and tested at the Infrared Optical Telescope Array (IOTA) observatory at Mt. Hopkins, Arizona. The system can minimize the optical path differences (OPDs) for all three baselines of the Michelson stellar interferometer at IOTA. Based on sliding window discrete Fourier transform (DFT) calculations that were optimized for computational efficiency and robustness to atmospheric disturbances, the algorithm has also been tested extensively on off-line data. Implemented in ANSI C on the 266 MHz PowerPC processor running the VxWorks real-time operating system, the algorithm runs in approximately 2.0 milliseconds per scan (including all three interferograms), using the science camera and piezo scanners to measure and correct the OPDs. The adaptive DFT-based tracking algorithm should be applicable to other systems where there is a need to detect or track a signal with an approximately constant-frequency carrier pulse.
Software Defined GPS Receiver for International Space Station
NASA Technical Reports Server (NTRS)
Duncan, Courtney B.; Robison, David E.; Koelewyn, Cynthia Lee
2011-01-01
JPL is providing a software defined radio (SDR) that will fly on the International Space Station (ISS) as part of the CoNNeCT project under NASA's SCaN program. The SDR consists of several modules including a Baseband Processor Module (BPM) and a GPS Module (GPSM). The BPM executes applications (waveforms) consisting of software components for the embedded SPARC processor and logic for two Virtex II Field Programmable Gate Arrays (FPGAs) that operate on data received from the GPSM. GPS waveforms on the SDR are enabled by an L-Band antenna, low noise amplifier (LNA), and the GPSM that performs quadrature downconversion at L1, L2, and L5. The GPS waveform for the JPL SDR will acquire and track L1 C/A, L2C, and L5 GPS signals from a CoNNeCT platform on ISS, providing the best GPS-based positioning of ISS achieved to date, the first use of multiple frequency GPS on ISS, and potentially the first L5 signal tracking from space. The system will also enable various radiometric investigations on ISS such as local multipath or ISS dynamic behavior characterization. In following the software-defined model, this work will create a highly portable GPS software and firmware package that can be adapted to another platform with the necessary processor and FPGA capability. This paper also describes ISS applications for the JPL CoNNeCT SDR GPS waveform, possibilities for future global navigation satellite system (GNSS) tracking development, and the applicability of the waveform components to other space navigation applications.
1991-05-01
contact between averaging of the strong nuclear dipolar interaction the components will result at the interfacial region in this sample. In contrast, tho...and a sea marker to help save survivors $1.5 million for the institution in 1916, but of disasters at sea. A thermal diffusion process wartime delays...memory for large simulations on parallel intervening medium. Accomplishing this research array processors and immediate displays of results requires
NASA Astrophysics Data System (ADS)
1995-04-01
Bell Laboratories has developed the world's first optical information processor. Its core device is a self-excited electrooptical effect apparatus array of symmetric operation. After being developed in the United States, this high-technology device was successfully developed by China's scientists,thus making the fact that China's optoelectronic technology is among the most advanced in the world.
Effects Of Local Oscillator Errors On Digital Beamforming
2016-03-01
processor EF element factor EW electronic warfare FFM flicker frequency modulation FOV field-of-view FPGA field-programmable gate array FPM flicker...frequencies and also more difficult to measure [15]. 2. Flicker frequency modulation The source for flicker frequency modulation ( FFM ) is attributed to...a physical resonance mechanism of an oscillator or issues controlling electronic components. Some oscillators might not show FFM noise, which might
Adaptive Optoelectronic Eyes: Hybrid Sensor/Processor Architectures
2006-11-13
corresponding calculated data. The width of the mirror stopband is proportional to the refractive index difference between the high and low index materials ...Silicon VLSI Neuron Unit Arrays 56 Development of a Single-Sided Flip-Chip Bonding Process 65 Development of High Refractive Index Diffractive Optical ...Elements (DOEs) 68 Development of High-Performance Antireflection Coatings for High Refractive Index DOEs 69 Design and Fabrication of Low Threshold
A MIMO-Inspired Rapidly Switchable Photonic Interconnect Architecture (Postprint)
2009-07-01
capabilities of future systems. Highspeed optical processing has been looked to as a means for eliminating this interconnect bottleneck. Presented...here are the results of a study for a novel optical (integrated photonic) processor which would allow for a high-speed, secure means for arbitrarily...regarded as a Multiple Input Multiple Output (MIMO) architecture. 15. SUBJECT TERMS Free-space optical interconnects, Optical Phased Arrays, High-Speed
Compact VLSI neural computer integrated with active pixel sensor for real-time ATR applications
NASA Astrophysics Data System (ADS)
Fang, Wai-Chi; Udomkesmalee, Gabriel; Alkalai, Leon
1997-04-01
A compact VLSI neural computer integrated with an active pixel sensor has been under development to mimic what is inherent in biological vision systems. This electronic eye- brain computer is targeted for real-time machine vision applications which require both high-bandwidth communication and high-performance computing for data sensing, synergy of multiple types of sensory information, feature extraction, target detection, target recognition, and control functions. The neural computer is based on a composite structure which combines Annealing Cellular Neural Network (ACNN) and Hierarchical Self-Organization Neural Network (HSONN). The ACNN architecture is a programmable and scalable multi- dimensional array of annealing neurons which are locally connected with their local neurons. Meanwhile, the HSONN adopts a hierarchical structure with nonlinear basis functions. The ACNN+HSONN neural computer is effectively designed to perform programmable functions for machine vision processing in all levels with its embedded host processor. It provides a two order-of-magnitude increase in computation power over the state-of-the-art microcomputer and DSP microelectronics. A compact current-mode VLSI design feasibility of the ACNN+HSONN neural computer is demonstrated by a 3D 16X8X9-cube neural processor chip design in a 2-micrometers CMOS technology. Integration of this neural computer as one slice of a 4'X4' multichip module into the 3D MCM based avionics architecture for NASA's New Millennium Program is also described.
Pursley, Randall H.; Salem, Ghadi; Devasahayam, Nallathamby; Subramanian, Sankaran; Koscielniak, Janusz; Krishna, Murali C.; Pohida, Thomas J.
2006-01-01
The integration of modern data acquisition and digital signal processing (DSP) technologies with Fourier transform electron paramagnetic resonance (FT-EPR) imaging at radiofrequencies (RF) is described. The FT-EPR system operates at a Larmor frequency (Lf) of 300 MHz to facilitate in vivo studies. This relatively low frequency Lf, in conjunction with our ~10 MHz signal bandwidth, enables the use of direct free induction decay time-locked subsampling (TLSS). This particular technique provides advantages by eliminating the traditional analog intermediate frequency downconversion stage along with the corresponding noise sources. TLSS also results in manageable sample rates that facilitate the design of DSP-based data acquisition and image processing platforms. More specifically, we utilize a high-speed field programmable gate array (FPGA) and a DSP processor to perform advanced real-time signal and image processing. The migration to a DSP-based configuration offers the benefits of improved EPR system performance, as well as increased adaptability to various EPR system configurations (i.e., software configurable systems instead of hardware reconfigurations). The required modifications to the FT-EPR system design are described, with focus on the addition of DSP technologies including the application-specific hardware, software, and firmware developed for the FPGA and DSP processor. The first results of using real-time DSP technologies in conjunction with direct detection bandpass sampling to implement EPR imaging at RF frequencies are presented. PMID:16243552
The Use of Field Programmable Gate Arrays (FPGA) in Small Satellite Communication Systems
NASA Technical Reports Server (NTRS)
Varnavas, Kosta; Sims, William Herbert; Casas, Joseph
2015-01-01
This paper will describe the use of digital Field Programmable Gate Arrays (FPGA) to contribute to advancing the state-of-the-art in software defined radio (SDR) transponder design for the emerging SmallSat and CubeSat industry and to provide advances for NASA as described in the TAO5 Communication and Navigation Roadmap (Ref 4). The use of software defined radios (SDR) has been around for a long time. A typical implementation of the SDR is to use a processor and write software to implement all the functions of filtering, carrier recovery, error correction, framing etc. Even with modern high speed and low power digital signal processors, high speed memories, and efficient coding, the compute intensive nature of digital filters, error correcting and other algorithms is too much for modern processors to get efficient use of the available bandwidth to the ground. By using FPGAs, these compute intensive tasks can be done in parallel, pipelined fashion and more efficiently use every clock cycle to significantly increase throughput while maintaining low power. These methods will implement digital radios with significant data rates in the X and Ka bands. Using these state-of-the-art technologies, unprecedented uplink and downlink capabilities can be achieved in a 1/2 U sized telemetry system. Additionally, modern FPGAs have embedded processing systems, such as ARM cores, integrated inside the FPGA allowing mundane tasks such as parameter commanding to occur easily and flexibly. Potential partners include other NASA centers, industry and the DOD. These assets are associated with small satellite demonstration flights, LEO and deep space applications. MSFC currently has an SDR transponder test-bed using Hardware-in-the-Loop techniques to evaluate and improve SDR technologies.
A Real-Time System for Lane Detection Based on FPGA and DSP
NASA Astrophysics Data System (ADS)
Xiao, Jing; Li, Shutao; Sun, Bin
2016-12-01
This paper presents a real-time lane detection system including edge detection and improved Hough Transform based lane detection algorithm and its hardware implementation with field programmable gate array (FPGA) and digital signal processor (DSP). Firstly, gradient amplitude and direction information are combined to extract lane edge information. Then, the information is used to determine the region of interest. Finally, the lanes are extracted by using improved Hough Transform. The image processing module of the system consists of FPGA and DSP. Particularly, the algorithms implemented in FPGA are working in pipeline and processing in parallel so that the system can run in real-time. In addition, DSP realizes lane line extraction and display function with an improved Hough Transform. The experimental results show that the proposed system is able to detect lanes under different road situations efficiently and effectively.
Current, K. Wayne; Yuk, Kelvin; McConaghy, Charles; Gascoyne, Peter R. C.; Schwartz, Jon A.; Vykoukal, Jody V.; Andrews, Craig
2010-01-01
A high-voltage (HV) integrated circuit has been demonstrated to transport droplets on programmable paths across its coated surface. This chip is the engine for a dielectrophoresis (DEP)-based micro-fluidic lab-on-a-chip system. This chip creates DEP forces that move and help inject droplets. Electrode excitation voltage and frequency are variable. With the electrodes driven with a 100V peak-to-peak periodic waveform, the maximum high-voltage electrode waveform frequency is about 200Hz. Data communication rate is variable up to 250kHz. This demonstration chip has a 32×32 array of nominally 100V electrode drivers. It is fabricated in a 130V SOI CMOS fabrication technology, dissipates a maximum of 1.87W, and is about 10.4 mm × 8.2 mm. PMID:23989241
Singular value decomposition utilizing parallel algorithms on graphical processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kotas, Charlotte W; Barhen, Jacob
2011-01-01
One of the current challenges in underwater acoustic array signal processing is the detection of quiet targets in the presence of noise. In order to enable robust detection, one of the key processing steps requires data and replica whitening. This, in turn, involves the eigen-decomposition of the sample spectral matrix, Cx = 1/K xKX(k)XH(k) where X(k) denotes a single frequency snapshot with an element for each element of the array. By employing the singular value decomposition (SVD) method, the eigenvectors and eigenvalues can be determined directly from the data without computing the sample covariance matrix, reducing the computational requirements formore » a given level of accuracy (van Trees, Optimum Array Processing). (Recall that the SVD of a complex matrix A involves determining V, , and U such that A = U VH where U and V are orthonormal and is a positive, real, diagonal matrix containing the singular values of A. U and V are the eigenvectors of AAH and AHA, respectively, while the singular values are the square roots of the eigenvalues of AAH.) Because it is desirable to be able to compute these quantities in real time, an efficient technique for computing the SVD is vital. In addition, emerging multicore processors like graphical processing units (GPUs) are bringing parallel processing capabilities to an ever increasing number of users. Since the computational tasks involved in array signal processing are well suited for parallelization, it is expected that these computations will be implemented using GPUs as soon as users have the necessary computational tools available to them. Thus, it is important to have an SVD algorithm that is suitable for these processors. This work explores the effectiveness of two different parallel SVD implementations on an NVIDIA Tesla C2050 GPU (14 multiprocessors, 32 cores per multiprocessor, 1.15 GHz clock - peed). The first algorithm is based on a two-step algorithm which bidiagonalizes the matrix using Householder transformations, and then diagonalizes the intermediate bidiagonal matrix through implicit QR shifts. This is similar to that implemented for real matrices by Lahabar and Narayanan ("Singular Value Decomposition on GPU using CUDA", IEEE International Parallel Distributed Processing Symposium 2009). The implementation is done in a hybrid manner, with the bidiagonalization stage done using the GPU while the diagonalization stage is done using the CPU, with the GPU used to update the U and V matrices. The second algorithm is based on a one-sided Jacobi scheme utilizing a sequence of pair-wise column orthogonalizations such that A is replaced by AV until the resulting matrix is sufficiently orthogonal (that is, equal to U ). V is obtained from the sequence of orthogonalizations, while can be found from the square root of the diagonal elements of AH A and, once is known, U can be found from column scaling the resulting matrix. These implementations utilize CUDA Fortran and NVIDIA's CUB LAS library. The primary goal of this study is to quantify the comparative performance of these two techniques against themselves and other standard implementations (for example, MATLAB). Considering that there is significant overhead associated with transferring data to the GPU and with synchronization between the GPU and the host CPU, it is also important to understand when it is worthwhile to use the GPU in terms of the matrix size and number of concurrent SVDs to be calculated.« less
Single-Scale Retinex Using Digital Signal Processors
NASA Technical Reports Server (NTRS)
Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn
2005-01-01
The Retinex is an image enhancement algorithm that improves the brightness, contrast and sharpness of an image. It performs a non-linear spatial/spectral transform that provides simultaneous dynamic range compression and color constancy. It has been used for a wide variety of applications ranging from aviation safety to general purpose photography. Many potential applications require the use of Retinex processing at video frame rates. This is difficult to achieve with general purpose processors because the algorithm contains a large number of complex computations and data transfers. In addition, many of these applications also constrain the potential architectures to embedded processors to save power, weight and cost. Thus we have focused on digital signal processors (DSPs) and field programmable gate arrays (FPGAs) as potential solutions for real-time Retinex processing. In previous efforts we attained a 21 (full) frame per second (fps) processing rate for the single-scale monochromatic Retinex with a TMS320C6711 DSP operating at 150 MHz. This was achieved after several significant code improvements and optimizations. Since then we have migrated our design to the slightly more powerful TMS320C6713 DSP and the fixed point TMS320DM642 DSP. In this paper we briefly discuss the Retinex algorithm, the performance of the algorithm executing on the TMS320C6713 and the TMS320DM642, and compare the results with the TMS320C6711.
Automatic film processors' quality control test in Greek military hospitals.
Lymberis, C; Efstathopoulos, E P; Manetou, A; Poudridis, G
1993-04-01
The two major military radiology installations (Athens, Greece) using a total of 15 automatic film processors were assessed using the 21-step-wedge method. The results of quality control in all these processors are presented. The parameters measured under actual working conditions were base and fog, contrast and speed. Base and fog as well as speed displayed large variations with average values generally higher than acceptable, whilst contrast displayed greater stability. Developer temperature was measured daily during the test and was found to be outside the film manufacturers' recommended limits in nine of the 15 processors. In only one processor did film passing time vary on an every day basis and this was due to maloperation. Developer pH test was not part of the daily monitoring service being performed every 5 days for each film processor and found to be in the range 9-12; 10 of the 15 processors presented pH values outside the limits specified by the film manufacturers.
Supercomputing on massively parallel bit-serial architectures
NASA Technical Reports Server (NTRS)
Iobst, Ken
1985-01-01
Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.
NASA Astrophysics Data System (ADS)
Radulescu, A.; Arend, N.; Drochner, M.; Ioffe, A.; Kemmerling, G.; Ossovyi, V.; Staringer, S.; Vehres, G.; McKinny, K.; Olechnowicz, B.; Yen, D.
2016-09-01
A new detection system based on an array of 3He tubes and innovative fast detection electronics was designed and produced by GE Reuter Stokes for the high-intensity small-angle neutron scattering diffractometer KWS-2, operated by the Jülich Centre for Neutron Science (JCNS) at the Heinz Meier-Leibnitz Zentrum (MLZ). The new detector consists of a panel array of 144 3He tubes and a new fast read-out electronics. The electronics is mounted in a closed case in the backside of the 3He tubes panel array and will operate at ambient atmosphere under cooling air stream. The new detection system is composed of eighteen 8-pack modules of 3He-tubes that work independently of one another (each unit has its own processor and electronics). Knowing beforehand the performance of one detector unit and of one single tube detector is prerequisite for tuning and maximizing the performance of the complete detection system. In this paper we present the results of the tests of the prototyped 8-pack of 3He-tubes and corresponding electronics, which have been carried out at the JCNS instruments KWS-2 (in high flux conditions) and TREFF.
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre- processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and easily expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kirkham, R.; Siddons, D.; Dunn, P.A.
2010-06-23
The Maia detector system is engineered for energy dispersive x-ray fluorescence spectroscopy and elemental imaging at photon rates exceeding 10{sup 7}/s, integrated scanning of samples for pixel transit times as small as 50 {micro}s and high definition images of 10{sup 8} pixels and real-time processing of detected events for spectral deconvolution and online display of pure elemental images. The system developed by CSIRO and BNL combines a planar silicon 384 detector array, application-specific integrated circuits for pulse shaping and peak detection and sampling and optical data transmission to an FPGA-based pipelined, parallel processor. This paper describes the system and themore » underpinning engineering solutions.« less
Hot Chips and Hot Interconnects for High End Computing Systems
NASA Technical Reports Server (NTRS)
Saini, Subhash
2005-01-01
I will discuss several processors: 1. The Cray proprietary processor used in the Cray X1; 2. The IBM Power 3 and Power 4 used in an IBM SP 3 and IBM SP 4 systems; 3. The Intel Itanium and Xeon, used in the SGI Altix systems and clusters respectively; 4. IBM System-on-a-Chip used in IBM BlueGene/L; 5. HP Alpha EV68 processor used in DOE ASCI Q cluster; 6. SPARC64 V processor, which is used in the Fujitsu PRIMEPOWER HPC2500; 7. An NEC proprietary processor, which is used in NEC SX-6/7; 8. Power 4+ processor, which is used in Hitachi SR11000; 9. NEC proprietary processor, which is used in Earth Simulator. The IBM POWER5 and Red Storm Computing Systems will also be discussed. The architectures of these processors will first be presented, followed by interconnection networks and a description of high-end computer systems based on these processors and networks. The performance of various hardware/programming model combinations will then be compared, based on latest NAS Parallel Benchmark results (MPI, OpenMP/HPF and hybrid (MPI + OpenMP). The tutorial will conclude with a discussion of general trends in the field of high performance computing, (quantum computing, DNA computing, cellular engineering, and neural networks).
Adaptive and mobile ground sensor array.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzrichter, Michael Warren; O'Rourke, William T.; Zenner, Jennifer
The goal of this LDRD was to demonstrate the use of robotic vehicles for deploying and autonomously reconfiguring seismic and acoustic sensor arrays with high (centimeter) accuracy to obtain enhancement of our capability to locate and characterize remote targets. The capability to accurately place sensors and then retrieve and reconfigure them allows sensors to be placed in phased arrays in an initial monitoring configuration and then to be reconfigured in an array tuned to the specific frequencies and directions of the selected target. This report reviews the findings and accomplishments achieved during this three-year project. This project successfully demonstrated autonomousmore » deployment and retrieval of a payload package with an accuracy of a few centimeters using differential global positioning system (GPS) signals. It developed an autonomous, multisensor, temporally aligned, radio-frequency communication and signal processing capability, and an array optimization algorithm, which was implemented on a digital signal processor (DSP). Additionally, the project converted the existing single-threaded, monolithic robotic vehicle control code into a multi-threaded, modular control architecture that enhances the reuse of control code in future projects.« less
Multi-terabyte EIDE disk arrays running Linux RAID5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanders, D.A.; Cremaldi, L.M.; Eschenburg, V.
2004-11-01
High-energy physics experiments are currently recording large amounts of data and in a few years will be recording prodigious quantities of data. New methods must be developed to handle this data and make analysis at universities possible. Grid Computing is one method; however, the data must be cached at the various Grid nodes. We examine some storage techniques that exploit recent developments in commodity hardware. Disk arrays using RAID level 5 (RAID-5) include both parity and striping. The striping improves access speed. The parity protects data in the event of a single disk failure, but not in the case ofmore » multiple disk failures. We report on tests of dual-processor Linux Software RAID-5 arrays and Hardware RAID-5 arrays using a 12-disk 3ware controller, in conjunction with 250 and 300 GB disks, for use in offline high-energy physics data analysis. The price of IDE disks is now less than $1/GB. These RAID-5 disk arrays can be scaled to sizes affordable to small institutions and used when fast random access at low cost is important.« less
Set processing in a network environment. [data bases and magnetic disks and tapes
NASA Technical Reports Server (NTRS)
Hardgrave, W. T.
1975-01-01
A combination of a local network, a mass storage system, and an autonomous set processor serving as a data/storage management machine is described. Its characteristics include: content-accessible data bases usable from all connected devices; efficient storage/access of large data bases; simple and direct programming with data manipulation and storage management handled by the set processor; simple data base design and entry from source representation to set processor representation with no predefinition necessary; capability available for user sort/order specification; significant reduction in tape/disk pack storage and mounts; flexible environment that allows upgrading hardware/software configuration without causing major interruptions in service; minimal traffic on data communications network; and improved central memory usage on large processors.
Systolic array IC for genetic computation
NASA Technical Reports Server (NTRS)
Anderson, D.
1991-01-01
Measuring similarities between large sequences of genetic information is a formidable task requiring enormous amounts of computer time. Geneticists claim that nearly two months of CRAY-2 time are required to run a single comparison of the known database against the new bases that will be found this year, and more than a CRAY-2 year for next year's genetic discoveries, and so on. The DNA IC, designed at HP-ICBD in cooperation with the California Institute of Technology and the Jet Propulsion Laboratory, is being implemented in order to move the task of genetic comparison onto workstations and personal computers, while vastly improving performance. The chip is a systolic (pumped) array comprised of 16 processors, control logic, and global RAM, totaling 400,000 FETS. At 12 MHz, each chip performs 2.7 billion 16 bit operations per second. Using 35 of these chips in series on one PC board (performing nearly 100 billion operations per second), a sequence of 560 bases can be compared against the eventual total genome of 3 billion bases, in minutes--on a personal computer. While the designed purpose of the DNA chip is for genetic research, other disciplines requiring similarity measurements between strings of 7 bit encoded data could make use of this chip as well. Cryptography and speech recognition are two examples. A mix of full custom design and standard cells, in CMOS34, were used to achieve these goals. Innovative test methods were developed to enhance controllability and observability in the array. This paper describes these techniques as well as the chip's functionality. This chip was designed in the 1989-90 timeframe.
Architectures for reasoning in parallel
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.
1989-01-01
The research conducted has dealt with rule-based expert systems. The algorithms that may lead to effective parallelization of them were investigated. Both the forward and backward chained control paradigms were investigated in the course of this work. The best computer architecture for the developed and investigated algorithms has been researched. Two experimental vehicles were developed to facilitate this research. They are Backpac, a parallel backward chained rule-based reasoning system and Datapac, a parallel forward chained rule-based reasoning system. Both systems have been written in Multilisp, a version of Lisp which contains the parallel construct, future. Applying the future function to a function causes the function to become a task parallel to the spawning task. Additionally, Backpac and Datapac have been run on several disparate parallel processors. The machines are an Encore Multimax with 10 processors, the Concert Multiprocessor with 64 processors, and a 32 processor BBN GP1000. Both the Concert and the GP1000 are switch-based machines. The Multimax has all its processors hung off a common bus. All are shared memory machines, but have different schemes for sharing the memory and different locales for the shared memory. The main results of the investigations come from experiments on the 10 processor Encore and the Concert with partitions of 32 or less processors. Additionally, experiments have been run with a stripped down version of EMYCIN.
FPGA-based Upgrade to RITS-6 Control System, Designed with EMP Considerations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harold D. Anderson, John T. Williams
2009-07-01
The existing control system for the RITS-6, a 20-MA 3-MV pulsed-power accelerator located at Sandia National Laboratories, was built as a system of analog switches because the operators needed to be close enough to the machine to hear pulsed-power breakdowns, yet the electromagnetic pulse (EMP) emitted would disable any processor-based solutions. The resulting system requires operators to activate and deactivate a series of 110-V relays manually in a complex order. The machine is sensitive to both the order of operation and the time taken between steps. A mistake in either case would cause a misfire and possible machine damage. Basedmore » on these constraints, a field-programmable gate array (FPGA) was chosen as the core of a proposed upgrade to the control system. An FPGA is a series of logic elements connected during programming. Based on their connections, the elements can mimic primitive logic elements, a process called synthesis. The circuit is static; all paths exist simultaneously and do not depend on a processor. This should make it less sensitive to EMP. By shielding it and using good electromagnetic interference-reduction practices, it should continue to operate well in the electrically noisy environment. The FPGA has two advantages over the existing system. In manual operation mode, the synthesized logic gates keep the operators in sequence. In addition, a clock signal and synthesized countdown circuit provides an automated sequence, with adjustable delays, for quickly executing the time-critical portions of charging and firing. The FPGA is modeled as a set of states, each state being a unique set of values for the output signals. The state is determined by the input signals, and in the automated segment by the value of the synthesized countdown timer, with the default mode placing the system in a safe configuration. Unlike a processor-based system, any system stimulus that results in an abort situation immediately executes a shutdown, with only a tens-of-nanoseconds delay to propagate across the FPGA. This paper discusses the design, installation, and testing of the proposed system upgrade, including failure statistics and modifications to the original design.« less
CORDIC-based digital signal processing (DSP) element for adaptive signal processing
NASA Astrophysics Data System (ADS)
Bolstad, Gregory D.; Neeld, Kenneth B.
1995-04-01
The High Performance Adaptive Weight Computation (HAWC) processing element is a CORDIC based application specific DSP element that, when connected in a linear array, can perform extremely high throughput (100s of GFLOPS) matrix arithmetic operations on linear systems of equations in real time. In particular, it very efficiently performs the numerically intense computation of optimal least squares solutions for large, over-determined linear systems. Most techniques for computing solutions to these types of problems have used either a hard-wired, non-programmable systolic array approach, or more commonly, programmable DSP or microprocessor approaches. The custom logic methods can be efficient, but are generally inflexible. Approaches using multiple programmable generic DSP devices are very flexible, but suffer from poor efficiency and high computation latencies, primarily due to the large number of DSP devices that must be utilized to achieve the necessary arithmetic throughput. The HAWC processor is implemented as a highly optimized systolic array, yet retains some of the flexibility of a programmable data-flow system, allowing efficient implementation of algorithm variations. This provides flexible matrix processing capabilities that are one to three orders of magnitude less expensive and more dense than the current state of the art, and more importantly, allows a realizable solution to matrix processing problems that were previously considered impractical to physically implement. HAWC has direct applications in RADAR, SONAR, communications, and image processing, as well as in many other types of systems.
SWARM: A 32 GHz Correlator and VLBI Beamformer for the Submillimeter Array
NASA Astrophysics Data System (ADS)
Primiani, Rurik A.; Young, Kenneth H.; Young, André; Patel, Nimesh; Wilson, Robert W.; Vertatschitsch, Laura; Chitwood, Billie B.; Srinivasan, Ranjani; MacMahon, David; Weintroub, Jonathan
2016-03-01
A 32GHz bandwidth VLBI capable correlator and phased array has been designed and deployeda at the Smithsonian Astrophysical Observatory’s Submillimeter Array (SMA). The SMA Wideband Astronomical ROACH2 Machine (SWARM) integrates two instruments: a correlator with 140kHz spectral resolution across its full 32GHz band, used for connected interferometric observations, and a phased array summer used when the SMA participates as a station in the Event Horizon Telescope (EHT) very long baseline interferometry (VLBI) array. For each SWARM quadrant, Reconfigurable Open Architecture Computing Hardware (ROACH2) units shared under open-source from the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER) are equipped with a pair of ultra-fast analog-to-digital converters (ADCs), a field programmable gate array (FPGA) processor, and eight 10 Gigabit Ethernet (GbE) ports. A VLBI data recorder interface designated the SWARM digital back end, or SDBE, is implemented with a ninth ROACH2 per quadrant, feeding four Mark6 VLBI recorders with an aggregate recording rate of 64 Gbps. This paper describes the design and implementation of SWARM, as well as its deployment at SMA with reference to verification and science data.
Owen, Kevin; Fuller, Michael I.; Hossack, John A.
2015-01-01
Two-dimensional arrays present significant beamforming computational challenges because of their high channel count and data rate. These challenges are even more stringent when incorporating a 2-D transducer array into a battery-powered hand-held device, placing significant demands on power efficiency. Previous work in sonar and ultrasound indicates that 2-D array beamforming can be decomposed into two separable line-array beamforming operations. This has been used in conjunction with frequency-domain phase-based focusing to achieve fast volume imaging. In this paper, we analyze the imaging and computational performance of approximate near-field separable beamforming for high-quality delay-and-sum (DAS) beamforming and for a low-cost, phaserotation-only beamforming method known as direct-sampled in-phase quadrature (DSIQ) beamforming. We show that when high-quality time-delay interpolation is used, separable DAS focusing introduces no noticeable imaging degradation under practical conditions. Similar results for DSIQ focusing are observed. In addition, a slight modification to the DSIQ focusing method greatly increases imaging contrast, making it comparable to that of DAS, despite having a wider main lobe and higher side lobes resulting from the limitations of phase-only time-delay interpolation. Compared with non-separable 2-D imaging, up to a 20-fold increase in frame rate is possible with the separable method. When implemented on a smart-phone-oriented processor to focus data from a 60 × 60 channel array using a 40 × 40 aperture, the frame rate per C-mode volume slice increases from 16 to 255 Hz for DAS, and from 11 to 193 Hz for DSIQ. Energy usage per frame is similarly reduced from 75 to 4.8 mJ/ frame for DAS, and from 107 to 6.3 mJ/frame for DSIQ. We also show that the separable method outperforms 2-D FFT-based focusing by a factor of 1.64 at these data sizes. This data indicates that with the optimal design choices, separable 2-D beamforming can significantly improve frame rate and battery life for hand-held devices with 2-D arrays. PMID:22828829
Hiding the Disk and Network Latency of Out-of-Core Visualization
NASA Technical Reports Server (NTRS)
Ellsworth, David
2001-01-01
This paper describes an algorithm that improves the performance of application-controlled demand paging for out-of-core visualization by hiding the latency of reading data from both local disks or disks on remote servers. The performance improvements come from better overlapping the computation with the page reading process, and by performing multiple page reads in parallel. The paper includes measurements that show that the new multithreaded paging algorithm decreases the time needed to compute visualizations by one third when using one processor and reading data from local disk. The time needed when using one processor and reading data from remote disk decreased by two thirds. Visualization runs using data from remote disk actually ran faster than ones using data from local disk because the remote runs were able to make use of the remote server's high performance disk array.
Compute Element and Interface Box for the Hazard Detection System
NASA Technical Reports Server (NTRS)
Villalpando, Carlos Y.; Khanoyan, Garen; Stern, Ryan A.; Some, Raphael R.; Bailey, Erik S.; Carson, John M.; Vaughan, Geoffrey M.; Werner, Robert A.; Salomon, Phil M.; Martin, Keith E.;
2013-01-01
The Autonomous Landing and Hazard Avoidance Technology (ALHAT) program is building a sensor that enables a spacecraft to evaluate autonomously a potential landing area to generate a list of hazardous and safe landing sites. It will also provide navigation inputs relative to those safe sites. The Hazard Detection System Compute Element (HDS-CE) box combines a field-programmable gate array (FPGA) board for sensor integration and timing, with a multicore computer board for processing. The FPGA does system-level timing and data aggregation, and acts as a go-between, removing the real-time requirements from the processor and labeling events with a high resolution time. The processor manages the behavior of the system, controls the instruments connected to the HDS-CE, and services the "heavy lifting" computational requirements for analyzing the potential landing spots.
MOSAIC - A space-multiplexing technique for optical processing of large images
NASA Technical Reports Server (NTRS)
Athale, Ravindra A.; Astor, Michael E.; Yu, Jeffrey
1993-01-01
A technique for Fourier processing of images larger than the space-bandwidth products of conventional or smart spatial light modulators and two-dimensional detector arrays is described. The technique involves a spatial combination of subimages displayed on individual spatial light modulators to form a phase-coherent image, which is subsequently processed with Fourier optical techniques. Because of the technique's similarity with the mosaic technique used in art, the processor used is termed an optical MOSAIC processor. The phase accuracy requirements of this system were studied by computer simulation. It was found that phase errors of less than lambda/8 did not degrade the performance of the system and that the system was relatively insensitive to amplitude nonuniformities. Several schemes for implementing the subimage combination are described. Initial experimental results demonstrating the validity of the mosaic concept are also presented.
General-purpose interface bus for multiuser, multitasking computer system
NASA Technical Reports Server (NTRS)
Generazio, Edward R.; Roth, Don J.; Stang, David B.
1990-01-01
The architecture of a multiuser, multitasking, virtual-memory computer system intended for the use by a medium-size research group is described. There are three central processing units (CPU) in the configuration, each with 16 MB memory, and two 474 MB hard disks attached. CPU 1 is designed for data analysis and contains an array processor for fast-Fourier transformations. In addition, CPU 1 shares display images viewed with the image processor. CPU 2 is designed for image analysis and display. CPU 3 is designed for data acquisition and contains 8 GPIB channels and an analog-to-digital conversion input/output interface with 16 channels. Up to 9 users can access the third CPU simultaneously for data acquisition. Focus is placed on the optimization of hardware interfaces and software, facilitating instrument control, data acquisition, and processing.
A novel parallel architecture for local histogram equalization
NASA Astrophysics Data System (ADS)
Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan
2005-07-01
Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
Fast neural net simulation with a DSP processor array.
Muller, U A; Gunzinger, A; Guggenbuhl, W
1995-01-01
This paper describes the implementation of a fast neural net simulator on a novel parallel distributed-memory computer. A 60-processor system, named MUSIC (multiprocessor system with intelligent communication), is operational and runs the backpropagation algorithm at a speed of 330 million connection updates per second (continuous weight update) using 32-b floating-point precision. This is equal to 1.4 Gflops sustained performance. The complete system with 3.8 Gflops peak performance consumes less than 800 W of electrical power and fits into a 19-in rack. While reaching the speed of modern supercomputers, MUSIC still can be used as a personal desktop computer at a researcher's own disposal. In neural net simulation, this gives a computing performance to a single user which was unthinkable before. The system's real-time interfaces make it especially useful for embedded applications.
The Solution of Linear Complementarity Problems on an Array Processor.
1981-01-01
INITIALIZE T04E 4IASK COMON /1SCA/M1AA ITERAIIJ)NSp NIUvld ITEWAILUNSPNUJ4d RUMboaJNI Co6 C3MAON /ISCA/I GRIL )POINTSo Y LiRIUPOINTS CDMAION /SUaLMjAT...GRI1D# WIDTH GRIfl LOGICAL MASWI MASK MASK INTEGE" X GRIL )POINTSo Y GRIOPUINTS 14JTEGEM MAX ITERATIONS# NUMB ITERArIONS9 NIJMO ROPIS, NUMB COLS C LOCAL
RANS Simulations using OpenFOAM Software
2016-01-01
Averaged Navier- Stokes (RANS) simulations is described and illustrated by applying the simpleFoam solver to two case studies; two dimensional flow...to run in parallel over large processor arrays. The purpose of this report is to illustrate and test the use of the steady-state Reynolds Averaged ...Group in the Maritime Platforms Division he has been simulating fluid flow around ships and submarines using finite element codes, Lagrangian vortex
Jensen, Erik C.; Stockton, Amanda M.; Chiesl, Thomas N.; Kim, Jungkyu; Bera, Abhisek; Mathies, Richard A.
2013-01-01
A digitally programmable microfluidic Automaton consisting of a 2-dimensional array of pneumatically actuated microvalves is programmed to perform new multiscale mixing and sample processing operations. Large (µL-scale) volume processing operations are enabled by precise metering of multiple reagents within individual nL-scale valves followed by serial repetitive transfer to programmed locations in the array. A novel process exploiting new combining valve concepts is developed for continuous rapid and complete mixing of reagents in less than 800 ms. Mixing, transfer, storage, and rinsing operations are implemented combinatorially to achieve complex assay automation protocols. The practical utility of this technology is demonstrated by performing automated serial dilution for quantitative analysis as well as the first demonstration of on-chip fluorescent derivatization of biomarker targets (carboxylic acids) for microchip capillary electrophoresis on the Mars Organic Analyzer. A language is developed to describe how unit operations are combined to form a microfluidic program. Finally, this technology is used to develop a novel microfluidic 6-sample processor for combinatorial mixing of large sets (>26 unique combinations) of reagents. The digitally programmable microfluidic Automaton is a versatile programmable sample processor for a wide range of process volumes, for multiple samples, and for different types of analyses. PMID:23172232
A vector scanning processing technique for pulsed laser velocimetry
NASA Technical Reports Server (NTRS)
Wernet, Mark P.; Edwards, Robert V.
1989-01-01
Pulsed laser sheet velocimetry yields nonintrusive measurements of two-dimensional velocity vectors across an extended planar region of a flow. Current processing techniques offer high precision (1 pct) velocity estimates, but can require several hours of processing time on specialized array processors. Under some circumstances, a simple, fast, less accurate (approx. 5 pct), data reduction technique which also gives unambiguous velocity vector information is acceptable. A direct space domain processing technique was examined. The direct space domain processing technique was found to be far superior to any other techniques known, in achieving the objectives listed above. It employs a new data coding and reduction technique, where the particle time history information is used directly. Further, it has no 180 deg directional ambiguity. A complex convection vortex flow was recorded and completely processed in under 2 minutes on an 80386 based PC, producing a 2-D velocity vector map of the flow field. Hence, using this new space domain vector scanning (VS) technique, pulsed laser velocimetry data can be reduced quickly and reasonably accurately, without specialized array processing hardware.
Optoelectronic-cache memory system architecture.
Chiarulli, D M; Levitan, S P
1996-05-10
We present an investigation of the architecture of an optoelectronic cache that can integrate terabit optical memories with the electronic caches associated with high-performance uniprocessors and multiprocessors. The use of optoelectronic-cache memories enables these terabit technologies to provide transparently low-latency secondary memory with frame sizes comparable with disk pages but with latencies that approach those of electronic secondary-cache memories. This enables the implementation of terabit memories with effective access times comparable with the cycle times of current microprocessors. The cache design is based on the use of a smart-pixel array and combines parallel free-space optical input-output to-and-from optical memory with conventional electronic communication to the processor caches. This cache and the optical memory system to which it will interface provide a large random-access memory space that has a lower overall latency than that of magnetic disks and disk arrays. In addition, as a consequence of the high-bandwidth parallel input-output capabilities of optical memories, fault service times for the optoelectronic cache are substantially less than those currently achievable with any rotational media.
DBSAR's First Multimode Flight Campaign
NASA Technical Reports Server (NTRS)
Rincon, Rafael F.; Vega, Manuel; Buenfil, Manuel; Geist, Alessandro; Hilliard, Lawrence; Racette, Paul
2010-01-01
The Digital Beamforming SAR (DBSAR) is an airborne imaging radar system that combines phased array technology, reconfigurable on-board processing and waveform generation, and advances in signal processing to enable techniques not possible with conventional SARs. The system exploits the versatility inherently in phased-array technology with a state-of-the-art data acquisition and real-time processor in order to implement multi-mode measurement techniques in a single radar system. Operational modes include scatterometry over multiple antenna beams, Synthetic Aperture Radar (SAR) over several antenna beams, or Altimetry. The radar was flight tested in October 2008 on board of the NASA P3 aircraft over the Delmarva Peninsula, MD. The results from the DBSAR system performance is presented.
Implementation of kernels on the Maestro processor
NASA Astrophysics Data System (ADS)
Suh, Jinwoo; Kang, D. I. D.; Crago, S. P.
Currently, most microprocessors use multiple cores to increase performance while limiting power usage. Some processors use not just a few cores, but tens of cores or even 100 cores. One such many-core microprocessor is the Maestro processor, which is based on Tilera's TILE64 processor. The Maestro chip is a 49-core, general-purpose, radiation-hardened processor designed for space applications. The Maestro processor, unlike the TILE64, has a floating point unit (FPU) in each core for improved floating point performance. The Maestro processor runs at 342 MHz clock frequency. On the Maestro processor, we implemented several widely used kernels: matrix multiplication, vector add, FIR filter, and FFT. We measured and analyzed the performance of these kernels. The achieved performance was up to 5.7 GFLOPS, and the speedup compared to single tile was up to 49 using 49 tiles.
A Low-Power Wearable Stand-Alone Tongue Drive System for People With Severe Disabilities.
Jafari, Ali; Buswell, Nathanael; Ghovanloo, Maysam; Mohsenin, Tinoosh
2018-02-01
This paper presents a low-power stand-alone tongue drive system (sTDS) used for individuals with severe disabilities to potentially control their environment such as computer, smartphone, and wheelchair using their voluntary tongue movements. A low-power local processor is proposed, which can perform signal processing to convert raw magnetic sensor signals to user-defined commands, on the sTDS wearable headset, rather than sending all raw data out to a PC or smartphone. The proposed sTDS significantly reduces the transmitter power consumption and subsequently increases the battery life. Assuming the sTDS user issues one command every 20 ms, the proposed local processor reduces the data volume that needs to be wirelessly transmitted by a factor of 64, from 9.6 to 0.15 kb/s. The proposed processor consists of three main blocks: serial peripheral interface bus for receiving raw data from magnetic sensors, external magnetic interference attenuation to attenuate external magnetic field from the raw magnetic signal, and a machine learning classifier for command detection. A proof-of-concept prototype sTDS has been implemented with a low-power IGLOO-nano field programmable gate array (FPGA), bluetooth low energy, battery and magnetic sensors on a headset, and tested. At clock frequency of 20 MHz, the processor takes 6.6 s and consumes 27 nJ for detecting a command with a detection accuracy of 96.9%. To further reduce power consumption, an application-specified integrated circuit processor for the sTDS is implemented at the postlayout level in 65-nm CMOS technology with 1-V power supply, and it consumes 0.43 mW, which is 10 lower than FPGA power consumption and occupies an area of only 0.016 mm.
Leung, Vitus J [Albuquerque, NM; Phillips, Cynthia A [Albuquerque, NM; Bender, Michael A [East Northport, NY; Bunde, David P [Urbana, IL
2009-07-21
In a multiple processor computing apparatus, directional routing restrictions and a logical channel construct permit fault tolerant, deadlock-free routing. Processor allocation can be performed by creating a linear ordering of the processors based on routing rules used for routing communications between the processors. The linear ordering can assume a loop configuration, and bin-packing is applied to this loop configuration. The interconnection of the processors can be conceptualized as a generally rectangular 3-dimensional grid, and the MC allocation algorithm is applied with respect to the 3-dimensional grid.
An Efficient Functional Test Generation Method For Processors Using Genetic Algorithms
NASA Astrophysics Data System (ADS)
Hudec, Ján; Gramatová, Elena
2015-07-01
The paper presents a new functional test generation method for processors testing based on genetic algorithms and evolutionary strategies. The tests are generated over an instruction set architecture and a processor description. Such functional tests belong to the software-oriented testing. Quality of the tests is evaluated by code coverage of the processor description using simulation. The presented test generation method uses VHDL models of processors and the professional simulator ModelSim. The rules, parameters and fitness functions were defined for various genetic algorithms used in automatic test generation. Functionality and effectiveness were evaluated using the RISC type processor DP32.
Implementation of High Speed Distributed Data Acquisition System
NASA Astrophysics Data System (ADS)
Raju, Anju P.; Sekhar, Ambika
2012-09-01
This paper introduces a high speed distributed data acquisition system based on a field programmable gate array (FPGA). The aim is to develop a "distributed" data acquisition interface. The development of instruments such as personal computers and engineering workstations based on "standard" platforms is the motivation behind this effort. Using standard platforms as the controlling unit allows independence in hardware from a particular vendor and hardware platform. The distributed approach also has advantages from a functional point of view: acquisition resources become available to multiple instruments; the acquisition front-end can be physically remote from the rest of the instrument. High speed data acquisition system transmits data faster to a remote computer system through Ethernet interface. The data is acquired through 16 analog input channels. The input data commands are multiplexed and digitized and then the data is stored in 1K buffer for each input channel. The main control unit in this design is the 16 bit processor implemented in the FPGA. This 16 bit processor is used to set up and initialize the data source and the Ethernet controller, as well as control the flow of data from the memory element to the NIC. Using this processor we can initialize and control the different configuration registers in the Ethernet controller in a easy manner. Then these data packets are sending to the remote PC through the Ethernet interface. The main advantages of the using FPGA as standard platform are its flexibility, low power consumption, short design duration, fast time to market, programmability and high density. The main advantages of using Ethernet controller AX88796 over others are its non PCI interface, the presence of embedded SRAM where transmit and reception buffers are located and high-performance SRAM-like interface. The paper introduces the implementation of the distributed data acquisition using FPGA by VHDL. The main advantages of this system are high accuracy, high speed, real time monitoring.
Utilizing the ISS Mission as a Testbed to Develop Cognitive Communications Systems
NASA Technical Reports Server (NTRS)
Jackson, Dan
2016-01-01
The ISS provides an excellent opportunity for pioneering artificial intelligence software to meet the challenges of real-time communications (comm) link management. This opportunity empowers the ISS Program to forge a testbed for developing cognitive communications systems for the benefit of the ISS mission, manned Low Earth Orbit (LEO) science programs and future planetary exploration programs. In November, 1998, the Flight Operations Directorate (FOD) started the ISS Antenna Manager (IAM) project to develop a single processor supporting multiple comm satellite tracking for two different antenna systems. Further, the processor was developed to be highly adaptable as it supported the ISS mission through all assembly stages. The ISS mission mandated communications specialists with complete knowledge of when the ISS was about to lose or gain comm link service. The current specialty mandated cognizance of large sun-tracking solar arrays and thermal management panels in addition to the highly-dynamic satellite service schedules and rise/set tables. This mission requirement makes the ISS the ideal communications management analogue for future LEO space station and long-duration planetary exploration missions. Future missions, with their precision-pointed, dynamic, laser-based comm links, require complete autonomy for managing high-data rate communications systems. Development of cognitive communications management systems that permit any crew member or payload science specialist, regardless of experience level, to control communications is one of the greater benefits the ISS can offer new space exploration programs. The IAM project met a new mission requirement never previously levied against US space-born communications systems management: process and display the orientation of large solar arrays and thermal control panels based on real-time joint angle telemetry. However, IAM leaves the actual communications availability assessment to human judgment, which introduces unwanted variability because each specialist has a different core of experience with comm link performance. Because the ISS utilizes two different frequency bands, dynamic structure can be occasionally translucent at one frequency while it can completely interdict service at the other frequency. The impact of articulating structure on the comm link can depend on its orientation at the time it impinges on the link. It can become easy for a human specialist to cross-associate experience at one frequency with experience at the other frequency. Additionally, the specialist's experience is incremental, occurring one nine-hour shift at a time. Only the IAM processor experiences the complete 24x7x365 communications link performance for both communications links but, it has no "learning capability." If the IAM processor could be endowed with a cognitive ability to remember past structure-induced comm link outages, based on its knowledge of the ISS position, attitude, communications gear, array joint angles and tracking accuracy, it could convey such experience to the human operator. It could also use its learned communications link behaviors to accurately convey the availability of future communications sessions. Further, the tool could remember how accurately or inaccurately it predicted availability and correct future predictions based on past performance. The IAM tool could learn frequency-specific impacts due to spacecraft structures and pass that information along as "experience." Such development would provide a single artificial intelligence processor that could provide two different experience bases. If it also "knew" the satellite service schedule, it could distinguish structure blockage from schedule or planet blockage and then quickly switch to another satellite. Alternatively, just as a human operator could judge, a cognizant comm system based on the IAM model could "know" that the blockage is not going to last very long and continue tracking a comm satellite, waiting for it to track away from structure. Ultimately, once this capability was fully developed and tested in the Mission Control Center, it could be transferred on-orbit to support development of operations concepts that include more advanced cognitive communications systems. Future applications of this capability are easily foreseen because even more dynamic satellite constellations with more nodes and greater capability are coming. Currently, the ISS fully employs a 300 million bit-per-second (Mbps) return link for harvesting payload science. In the coming eighteen months, it will step up to 600 Mbps. Already there is talk of a 1.2 billion bit-per-second (Gbps) upgrade for the ISS and laser comm links have already been tested from the ISS. Every data rate upgrade mandates more complicated and sensitive communications equipment which implies greater expertise invested in the human operator. Future on-orbit cognizant comm systems will be needed to meet greater performance demands aboard larger, far more complicated spacecraft. In the LEO environment, the old-style one-satellite-per-spacecraft operations concept will give way to a new concept of a single customer spacecraft simultaneously using multiple comm satellites. Much more highly-dynamic manned LEO missions with decades of crew members potentially increase the demand for communications link performance. A cognizant on-board communications system will meet advanced communications demands from future LEO missions and future planetary missions. The ISS has fledgling components of future exploration programs, both LEO and planetary. Further, the Flight Operations Directorate, through the IAM project, has already begun to develop a communications management system that attempts to solve advanced problems ideally represented by dynamic structure impacting scheduled satellite service. With an earnest project to integrate artificial intelligence into the IAM processor, the ISS Program could develop a cognizant communications system that could be adapted and transferred to future on-orbit avionics designs.
Utilizing the ISS Mission as a Testbed to Develop Cognitive Communications Systems
NASA Technical Reports Server (NTRS)
Jackson, Dan
2016-01-01
The ISS provides an excellent opportunity for pioneering artificial intelligence software to meet the challenges of real-time communications (comm) link management. This opportunity empowers the ISS Program to forge a testbed for developing cognitive communications systems for the benefit of the ISS mission, manned Low Earth Orbit (LEO) science programs and future planetary exploration programs. In November, 1998, the Flight Operations Directorate (FOD) started the ISS Antenna Manager (IAM) project to develop a single processor supporting multiple comm satellite tracking for two different antenna systems. Further, the processor was developed to be highly adaptable as it supported the ISS mission through all assembly stages. The ISS mission mandated communications specialists with complete knowledge of when the ISS was about to lose or gain comm link service. The current specialty mandated cognizance of large sun-tracking solar arrays and thermal management panels in addition to the highly-dynamic satellite service schedules and rise/set tables. This mission requirement makes the ISS the ideal communications management analogue for future LEO space station and long-duration planetary exploration missions. Future missions, with their precision-pointed, dynamic, laser-based comm links, require complete autonomy for managing high-data rate communications systems. Development of cognitive communications management systems that permit any crew member or payload science specialist, regardless of experience level, to control communications is one of the greater benefits the ISS can offer new space exploration programs. The IAM project met a new mission requirement never previously levied against US space-born communications systems management: process and display the orientation of large solar arrays and thermal control panels based on real-time joint angle telemetry. However, IAM leaves the actual communications availability assessment to human judgement, which introduces unwanted variability because each specialist has a different core of experience with comm link performance. Because the ISS utilizes two different frequency bands, dynamic structure can be occasionally translucent at one frequency while it can completely interdict service at the other frequency. The impact of articulating structure on the comm link can depend on its orientation at the time it impinges on the link. It can become easy for a human specialist to cross-associate experience at one frequency with experience at the other frequency. Additionally, the specialist's experience is incremental, occurring one nine-hour shift at a time. Only the IAM processor experiences the complete 24x7x365 communications link performance for both communications links but, it has no "learning capability." If the IAM processor could be endowed with a cognitive ability to remember past structure-induced comm link outages, based on its knowledge of the ISS position, attitude, communications gear, array joint angles and tracking accuracy, it could convey such experience to the human operator. It could also use its learned communications link behaviors to accurately convey the availability of future communications sessions. Further, the tool could remember how accurately or inaccurately it predicted availability and correct future predictions based on past performance. The IAM tool could learn frequency-specific impacts due to spacecraft structures and pass that information along as "experience." Such development would provide a single artificial intelligence processor that could provide two different experience bases. If it also "knew" the satellite service schedule, it could distinguish structure blockage from schedule or planet blockage and then quickly switch to another satellite. Alternatively, just as a human operator could judge, a cognizant comm system based on the IAM model could "know" that the blockage is not going to last very long and continue tracking a comm satellite, waiting for it to track away from structure. Ultimately, once this capability was fully developed and tested in the Mission Control Center, it could be transferred on-orbit to support development of operations concepts that include more advanced cognitive communications systems. Future applications of this capability are easily foreseen because even more dynamic satellite constellations with more nodes and greater capability are coming. Currently, the ISS fully employs its high-data-rate return link for harvesting payload science. In the coming months, it will double that data rate and is forecast to fully utilize that capability. Already there is talk of an upgrade that quadruples the current data rate allocated to ISS payload science before the end of its mission and laser comm links have already been tested from the ISS. Every data rate upgrade mandates more complicated and sensitive communications equipment which implies greater expertise invested in the human operator. Future on-orbit cognizant comm systems will be needed to meet greater performance demands aboard larger, far more complicated spacecraft. In the LEO environment, the old-style one-satellite-per-spacecraft operations concept will give way to a new concept of a single customer spacecraft simultaneously using multiple comm satellites. Much more highly-dynamic manned LEO missions with decades of crew members potentially increase the demand for communications link performance. A cognizant on-board communications system will meet advanced communications demands from future LEO missions and future planetary missions. The ISS has fledgling components of future exploration programs, both LEO and planetary. Further, the Flight Operations Directorate, through the IAM project, has already begun to develop a communications management system that attempts to solve advanced problems ideally represented by dynamic structure impacting scheduled satellite service. With an earnest project to integrate artificial intelligence into the IAM processor, the ISS Program could develop a cognizant communications system that could be adapted and transferred to future on-orbit avionics designs.
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors.
Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres
2016-05-28
Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms.
Systems and methods for process and user driven dynamic voltage and frequency scaling
Mallik, Arindam [Evanston, IL; Lin, Bin [Hillsboro, OR; Memik, Gokhan [Evanston, IL; Dinda, Peter [Evanston, IL; Dick, Robert [Evanston, IL
2011-03-22
Certain embodiments of the present invention provide a method for power management including determining at least one of an operating frequency and an operating voltage for a processor and configuring the processor based on the determined at least one of the operating frequency and the operating voltage. The operating frequency is determined based at least in part on direct user input. The operating voltage is determined based at least in part on an individual profile for processor.
Radiation effects and mitigation strategies for modern FPGAs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stettler, M. W.; Caffrey, M. P.; Graham, P. S.
2004-01-01
Field Programmable Gate Array devices have become the technology of choice in small volume modern instrumentation and control systems. These devices have always offered significant advantages in flexibility, and recent advances in fabrication have greatly increased logic capacity, substantially increasing the number of applications for this technology. Unfortunately, the increased density (and corresponding shrinkage of process geometry), has made these devices more susceptible to failure due to external radiation. This has been an issue for space based systems for some time, but is now becoming an issue for terrestrial systems in elevated radiation environments and commercial avionics as well. Characterizingmore » the failure modes of Xilinx FPGAs, and developing mitigation strategies is the subject of ongoing research by a consortium of academic, industrial, and governmental laboratories. This paper presents background information of radiation effects and failure modes, as well as current and future mitigation techniques. In particular, the availability of very large FPGA devices, complete with generous amounts of RAM and embedded processor(s), has led to the implementation of complete digital systems on a single device, bringing issues of system reliability and redundancy management to the chip level. Radiation effects on a single FPGA are increasingly likely to have system level consequences, and will need to be addressed in current and future designs.« less
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel having a total latency of 13ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through Fast Simplex Link (FSL). The latency for computing distance and angle of camera from the reference points was measured to be 2ms on the MicroBlaze, running at 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
Finite elements and the method of conjugate gradients on a concurrent processor
NASA Technical Reports Server (NTRS)
Lyzenga, G. A.; Raefsky, A.; Hager, G. H.
1985-01-01
An algorithm for the iterative solution of finite element problems on a concurrent processor is presented. The method of conjugate gradients is used to solve the system of matrix equations, which is distributed among the processors of a MIMD computer according to an element-based spatial decomposition. This algorithm is implemented in a two-dimensional elastostatics program on the Caltech Hypercube concurrent processor. The results of tests on up to 32 processors show nearly linear concurrent speedup, with efficiencies over 90 percent for sufficiently large problems.
Finite elements and the method of conjugate gradients on a concurrent processor
NASA Technical Reports Server (NTRS)
Lyzenga, G. A.; Raefsky, A.; Hager, B. H.
1984-01-01
An algorithm for the iterative solution of finite element problems on a concurrent processor is presented. The method of conjugate gradients is used to solve the system of matrix equations, which is distributed among the processors of a MIMD computer according to an element-based spatial decomposition. This algorithm is implemented in a two-dimensional elastostatics program on the Caltech Hypercube concurrent processor. The results of tests on up to 32 processors show nearly linear concurrent speedup, with efficiencies over 90% for sufficiently large problems.
Automatic Carbon Dioxide-Methane Gas Sensor Based on the Solubility of Gases in Water
Cadena-Pereda, Raúl O.; Rivera-Muñoz, Eric M.; Herrera-Ruiz, Gilberto; Gomez-Melendez, Domingo J.; Anaya-Rivera, Ely K.
2012-01-01
Biogas methane content is a relevant variable in anaerobic digestion processing where knowledge of process kinetics or an early indicator of digester failure is needed. The contribution of this work is the development of a novel, simple and low cost automatic carbon dioxide-methane gas sensor based on the solubility of gases in water as the precursor of a sensor for biogas quality monitoring. The device described in this work was used for determining the composition of binary mixtures, such as carbon dioxide-methane, in the range of 0–100%. The design and implementation of a digital signal processor and control system into a low-cost Field Programmable Gate Array (FPGA) platform has permitted the successful application of data acquisition, data distribution and digital data processing, making the construction of a standalone carbon dioxide-methane gas sensor possible. PMID:23112626
VLSI implementation of a new LMS-based algorithm for noise removal in ECG signal
NASA Astrophysics Data System (ADS)
Satheeskumaran, S.; Sabrigiriraj, M.
2016-06-01
Least mean square (LMS)-based adaptive filters are widely deployed for removing artefacts in electrocardiogram (ECG) due to less number of computations. But they posses high mean square error (MSE) under noisy environment. The transform domain variable step-size LMS algorithm reduces the MSE at the cost of computational complexity. In this paper, a variable step-size delayed LMS adaptive filter is used to remove the artefacts from the ECG signal for improved feature extraction. The dedicated digital Signal processors provide fast processing, but they are not flexible. By using field programmable gate arrays, the pipelined architectures can be used to enhance the system performance. The pipelined architecture can enhance the operation efficiency of the adaptive filter and save the power consumption. This technique provides high signal-to-noise ratio and low MSE with reduced computational complexity; hence, it is a useful method for monitoring patients with heart-related problem.
Automatic carbon dioxide-methane gas sensor based on the solubility of gases in water.
Cadena-Pereda, Raúl O; Rivera-Muñoz, Eric M; Herrera-Ruiz, Gilberto; Gomez-Melendez, Domingo J; Anaya-Rivera, Ely K
2012-01-01
Biogas methane content is a relevant variable in anaerobic digestion processing where knowledge of process kinetics or an early indicator of digester failure is needed. The contribution of this work is the development of a novel, simple and low cost automatic carbon dioxide-methane gas sensor based on the solubility of gases in water as the precursor of a sensor for biogas quality monitoring. The device described in this work was used for determining the composition of binary mixtures, such as carbon dioxide-methane, in the range of 0-100%. The design and implementation of a digital signal processor and control system into a low-cost Field Programmable Gate Array (FPGA) platform has permitted the successful application of data acquisition, data distribution and digital data processing, making the construction of a standalone carbon dioxide-methane gas sensor possible.
Design Tools for Reconfigurable Hardware in Orbit (RHinO)
NASA Technical Reports Server (NTRS)
French, Mathew; Graham, Paul; Wirthlin, Michael; Larchev, Gregory; Bellows, Peter; Schott, Brian
2004-01-01
The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. These tools leverage an established FPGA design environment and focus primarily on space effects mitigation and power optimization. The project is creating software to automatically test and evaluate the single-event-upsets (SEUs) sensitivities of an FPGA design and insert mitigation techniques. Extensions into the tool suite will also allow evolvable algorithm techniques to reconfigure around single-event-latchup (SEL) events. In the power domain, tools are being created for dynamic power visualiization and optimization. Thus, this technology seeks to enable the use of Reconfigurable Hardware in Orbit, via an integrated design tool-suite aiming to reduce risk, cost, and design time of multimission reconfigurable space processors using SRAM-based FPGAs.
Adaptive DIT-Based Fringe Tracking and Prediction at IOTA
NASA Technical Reports Server (NTRS)
Wilson, Edward; Pedretti, Ettore; Bregman, Jesse; Mah, Robert W.; Traub, Wesley A.
2004-01-01
An automatic fringe tracking system has been developed and implemented at the Infrared Optical Telescope Array (IOTA). In testing during May 2002, the system successfully minimized the optical path differences (OPDs) for all three baselines at IOTA. Based on sliding window discrete Fourier transform (DFT) calculations that were optimized for computational efficiency and robustness to atmospheric disturbances, the algorithm has also been tested extensively on off-line data. Implemented in ANSI C on the 266 MHZ PowerPC processor running the VxWorks real-time operating system, the algorithm runs in approximately 2.0 milliseconds per scan (including all three interferograms), using the science camera and piezo scanners to measure and correct the OPDs. Preliminary analysis on an extension of this algorithm indicates a potential for predictive tracking, although at present, real-time implementation of this extension would require significantly more computational capacity.
Database for LDV Signal Processor Performance Analysis
NASA Technical Reports Server (NTRS)
Baker, Glenn D.; Murphy, R. Jay; Meyers, James F.
1989-01-01
A comparative and quantitative analysis of various laser velocimeter signal processors is difficult because standards for characterizing signal bursts have not been established. This leaves the researcher to select a signal processor based only on manufacturers' claims without the benefit of direct comparison. The present paper proposes the use of a database of digitized signal bursts obtained from a laser velocimeter under various configurations as a method for directly comparing signal processors.
Integrated circuit for SAW and MEMS sensors
NASA Astrophysics Data System (ADS)
Fischer, Wolf-Joachim; Koenig, Peter; Ploetner, Matthias; Hermann, Rudiger; Stab, Helmut
2001-11-01
The sensor processor circuit has been developed for hand-held devices used in industrial and environmental applications, such as on-line process monitoring. Thereby devices with SAW sensors or MEMS resonators will benefit from this processor especially. Up to 8 sensors can be connected to the circuit as multisensors or sensor arrays. Two sensor processors SP1 and SP2 for different applications are presented in this paper. The SP-1 chip has a PCMCIA interface which can be used for the program and data transfer. SAW sensors which are working in the frequency range from 80 MHz to 160 MHz can be connected to the processor directly. It is possible to use the new SP-2 chip fabricated in a 0.5(mu) CMOS process for SAW devices with a maximum frequency of 600 MHz. An on-chip analog-digital-converter (ADC) and 6 PWM modules support the development of high-miniaturized intelligent sensor systems We have developed a multi-SAW sensor system with this ASIC that manages the requirements on control as well as signal generation and storage and provides an interface to the PC and electronic devices on the board. Its low power consumption and its PCMCIA plug fulfil the requirements of small size and mobility. For this application sensors have been developed to detect hazardous gases in ambient air. Sensors with differently modified copper-phthalocyanine films are capable of detecting NO2 and O3, whereas those with a hyperbranched polyester film respond to NH3.
Novel processor architecture for onboard infrared sensors
NASA Astrophysics Data System (ADS)
Hihara, Hiroki; Iwasaki, Akira; Tamagawa, Nobuo; Kuribayashi, Mitsunobu; Hashimoto, Masanori; Mitsuyama, Yukio; Ochi, Hiroyuki; Onodera, Hidetoshi; Kanbara, Hiroyuki; Wakabayashi, Kazutoshi; Tada, Munehiro
2016-09-01
Infrared sensor system is a major concern for inter-planetary missions that investigate the nature and the formation processes of planets and asteroids. The infrared sensor system requires signal preprocessing functions that compensate for the intensity of infrared image sensors to get high quality data and high compression ratio through the limited capacity of transmission channels towards ground stations. For those implementations, combinations of Field Programmable Gate Arrays (FPGAs) and microprocessors are employed by AKATSUKI, the Venus Climate Orbiter, and HAYABUSA2, the asteroid probe. On the other hand, much smaller size and lower power consumption are demanded for future missions to accommodate more sensors. To fulfill this future demand, we developed a novel processor architecture which consists of reconfigurable cluster cores and programmable-logic cells with complementary atom switches. The complementary atom switches enable hardware programming without configuration memories, and thus soft-error on logic circuit connection is completely eliminated. This is a noteworthy advantage for space applications which cannot be found in conventional re-writable FPGAs. Almost one-tenth of lower power consumption is expected compared to conventional re-writable FPGAs because of the elimination of configuration memories. The proposed processor architecture can be reconfigured by behavioral synthesis with higher level language specification. Consequently, compensation functions are implemented in a single chip without accommodating program memories, which is accompanied with conventional microprocessors, while maintaining the comparable performance. This enables us to embed a processor element on each infrared signal detector output channel.
A hybrid optic-fiber sensor network with the function of self-diagnosis and self-healing
NASA Astrophysics Data System (ADS)
Xu, Shibo; Liu, Tiegen; Ge, Chunfeng; Chen, Cheng; Zhang, Hongxia
2014-11-01
We develop a hybrid wavelength division multiplexing optical fiber network with distributed fiber-optic sensors and quasi-distributed FBG sensor arrays which detect vibrations, temperatures and strains at the same time. The network has the ability to locate the failure sites automatically designated as self-diagnosis and make protective switching to reestablish sensing service designated as self-healing by cooperative work of software and hardware. The processes above are accomplished by master-slave processors with the help of optical and wireless telemetry signals. All the sensing and optical telemetry signals transmit in the same fiber either working fiber or backup fiber. We take wavelength 1450nm as downstream signal and wavelength 1350nm as upstream signal to control the network in normal circumstances, both signals are sent by a light emitting node of the corresponding processor. There is also a continuous laser wavelength 1310nm sent by each node and received by next node on both working and backup fibers to monitor their healthy states, but it does not carry any message like telemetry signals do. When fibers of two sensor units are completely damaged, the master processor will lose the communication with the node between the damaged ones.However we install RF module in each node to solve the possible problem. Finally, the whole network state is transmitted to host computer by master processor. Operator could know and control the network by human-machine interface if needed.
Implementation of 4-way Superscalar Hash MIPS Processor Using FPGA
NASA Astrophysics Data System (ADS)
Sahib Omran, Safaa; Fouad Jumma, Laith
2018-05-01
Due to the quick advancements in the personal communications systems and wireless communications, giving data security has turned into a more essential subject. This security idea turns into a more confounded subject when next-generation system requirements and constant calculation speed are considered in real-time. Hash functions are among the most essential cryptographic primitives and utilized as a part of the many fields of signature authentication and communication integrity. These functions are utilized to acquire a settled size unique fingerprint or hash value of an arbitrary length of message. In this paper, Secure Hash Algorithms (SHA) of types SHA-1, SHA-2 (SHA-224, SHA-256) and SHA-3 (BLAKE) are implemented on Field-Programmable Gate Array (FPGA) in a processor structure. The design is described and implemented using a hardware description language, namely VHSIC “Very High Speed Integrated Circuit” Hardware Description Language (VHDL). Since the logical operation of the hash types of (SHA-1, SHA-224, SHA-256 and SHA-3) are 32-bits, so a Superscalar Hash Microprocessor without Interlocked Pipelines (MIPS) processor are designed with only few instructions that were required in invoking the desired Hash algorithms, when the four types of hash algorithms executed sequentially using the designed processor, the total time required equal to approximately 342 us, with a throughput of 4.8 Mbps while the required to execute the same four hash algorithms using the designed four-way superscalar is reduced to 237 us with improved the throughput to 5.1 Mbps.
Parallel processor-based raster graphics system architecture
Littlefield, Richard J.
1990-01-01
An apparatus for generating raster graphics images from the graphics command stream includes a plurality of graphics processors connected in parallel, each adapted to receive any part of the graphics command stream for processing the command stream part into pixel data. The apparatus also includes a frame buffer for mapping the pixel data to pixel locations and an interconnection network for interconnecting the graphics processors to the frame buffer. Through the interconnection network, each graphics processor may access any part of the frame buffer concurrently with another graphics processor accessing any other part of the frame buffer. The plurality of graphics processors can thereby transmit concurrently pixel data to pixel locations in the frame buffer.
2014-09-01
band signal samples by taking the ratio of (166) and (165) as 2 2 /2 /2 sin sin coscos g g g g gg cQ cI eE n E n e...processors,” EEE Trans. Acoust. Speech Signal Process., vol. 31, no. 6, pp. 1378–1393, Dec. 1983. [10] J. Li, P. Stoica and Z. Wang, “On robust
NASA Technical Reports Server (NTRS)
Thakoor, Anil
1990-01-01
Viewgraphs on electronic neural networks for space station are presented. Topics covered include: electronic neural networks; electronic implementations; VLSI/thin film hybrid hardware for neurocomputing; computations with analog parallel processing; features of neuroprocessors; applications of neuroprocessors; neural network hardware for terrain trafficability determination; a dedicated processor for path planning; neural network system interface; neural network for robotic control; error backpropagation algorithm for learning; resource allocation matrix; global optimization neuroprocessor; and electrically programmable read only thin-film synaptic array.
Reconfiguration Schemes for Fault-Tolerant Processor Arrays
1992-10-15
partially notion of linear schedule are easily related to similar ordered subset of a multidimensional integer lattice models and concepts used in [11-[131...and several other (called indec set). The points of this lattice correspond works. to (i.e.. are the indices of) computations, and the partial There are...These data dependencies are represented as vectors that of all computations of the algorithm is to be minimized. connect points of the lattice . If a
Low-Cost Space Hardware and Software
NASA Technical Reports Server (NTRS)
Shea, Bradley Franklin
2013-01-01
The goal of this project is to demonstrate and support the overall vision of NASA's Rocket University (RocketU) through the design of an electrical power system (EPS) monitor for implementation on RUBICS (Rocket University Broad Initiatives CubeSat), through the support for the CHREC (Center for High-Performance Reconfigurable Computing) Space Processor, and through FPGA (Field Programmable Gate Array) design. RocketU will continue to provide low-cost innovations even with continuous cuts to the budget.
Feasibility of a special-purpose computer to solve the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Gritton, E. C.; King, W. S.; Sutherland, I.; Gaines, R. S.; Gazley, C., Jr.; Grosch, C.; Juncosa, M.; Petersen, H.
1978-01-01
Orders-of-magnitude improvements in computer performance can be realized with a parallel array of thousands of fast microprocessors. In this architecture, wiring congestion is minimized by limiting processor communication to nearest neighbors. When certain standard algorithms are applied to a viscous flow problem and existing LSI technology is used, performance estimates of this conceptual design show a dramatic decrease in computational time when compared to the CDC 7600.
Spacecube V2.0 Micro Single Board Computer
NASA Technical Reports Server (NTRS)
Petrick, David J. (Inventor); Geist, Alessandro (Inventor); Lin, Michael R. (Inventor); Crum, Gary R. (Inventor)
2017-01-01
A single board computer system radiation hardened for space flight includes a printed circuit board having a top side and bottom side; a reconfigurable field programmable gate array (FPGA) processor device disposed on the top side; a connector disposed on the top side; a plurality of peripheral components mounted on the bottom side; and wherein a size of the single board computer system is not greater than approximately 7 cm.times.7 cm.
Finite element computation on nearest neighbor connected machines
NASA Technical Reports Server (NTRS)
Mcaulay, A. D.
1984-01-01
Research aimed at faster, more cost effective parallel machines and algorithms for improving designer productivity with finite element computations is discussed. A set of 8 boards, containing 4 nearest neighbor connected arrays of commercially available floating point chips and substantial memory, are inserted into a commercially available machine. One-tenth Mflop (64 bit operation) processors provide an 89% efficiency when solving the equations arising in a finite element problem for a single variable regular grid of size 40 by 40 by 40. This is approximately 15 to 20 times faster than a much more expensive machine such as a VAX 11/780 used in double precision. The efficiency falls off as faster or more processors are envisaged because communication times become dominant. A novel successive overrelaxation algorithm which uses cyclic reduction in order to permit data transfer and computation to overlap in time is proposed.
The economics of data acquisition computers for ST and MST radars
NASA Technical Reports Server (NTRS)
Watkins, B. J.
1983-01-01
Some low cost options for data acquisition computers for ST (stratosphere, troposphere) and MST (mesosphere, stratosphere, troposphere) are presented. The particular equipment discussed reflects choices made by the University of Alaska group but of course many other options exist. The low cost microprocessor and array processor approach presented here has several advantages because of its modularity. An inexpensive system may be configured for a minimum performance ST radar, whereas a multiprocessor and/or a multiarray processor system may be used for a higher performance MST radar. This modularity is important for a network of radars because the initial cost is minimized while future upgrades will still be possible at minimal expense. This modularity also aids in lowering the cost of software development because system expansions should rquire little software changes. The functions of the radar computer will be to obtain Doppler spectra in near real time with some minor analysis such as vector wind determination.
Prototyping the HPDP Chip on STM 65 NM Process
NASA Astrophysics Data System (ADS)
Papadas, C.; Dramitinos, G.; Syed, M.; Helfers, T.; Dedes, G.; Schoellkopf, J.-P.; Dugoujon, L.
2011-08-01
Currently Astrium GmbH is involved in the of the High Performance Data Processor (HPDP) development programme for telecommunication applications under a DLR contract. The HPDP project targets the implementation of the commercially available reconfigurable array processor IP (XPP from the company PACT XPP Technologies) in a radiation hardened technology.In the current complementary development phase funded under the Greek Industry Incentive scheme, it is planned to prototype the HPDP chip in commercial STM 65 nm technology. In addition it is also planned to utilise the preliminary radiation hardened components of this library wherever possible.This abstract gives an overview of the HPDP chip architecture, the basic details of the STM 65 nm process and the design flow foreseen for the prototyping. The paper will discuss the development and integration issues involved in using the STM 65 nm process (also including the available preliminary radiation hardened components) for designs targeted to be used in space applications.
Automation of Data Traffic Control on DSM Architecture
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry
2001-01-01
The design of distributed shared memory (DSM) computers liberates users from the duty to distribute data across processors and allows for the incremental development of parallel programs using, for example, OpenMP or Java threads. DSM architecture greatly simplifies the development of parallel programs having good performance on a few processors. However, to achieve a good program scalability on DSM computers requires that the user understand data flow in the application and use various techniques to avoid data traffic congestions. In this paper we discuss a number of such techniques, including data blocking, data placement, data transposition and page size control and evaluate their efficiency on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks. We also present a tool which automates the detection of constructs causing data congestions in Fortran array oriented codes and advises the user on code transformations for improving data traffic in the application.
Design of Small MEMS Microphone Array Systems for Direction Finding of Outdoors Moving Vehicles
Zhang, Xin; Huang, Jingchang; Song, Enliang; Liu, Huawei; Li, Baoqing; Yuan, Xiaobing
2014-01-01
In this paper, a MEMS microphone array system scheme is proposed which implements real-time direction of arrival (DOA) estimation for moving vehicles. Wind noise is the primary source of unwanted noise on microphones outdoors. A multiple signal classification (MUSIC) algorithm is used in this paper for direction finding associated with spatial coherence to discriminate between the wind noise and the acoustic signals of a vehicle. The method is implemented in a SHARC DSP processor and the real-time estimated DOA is uploaded through Bluetooth or a UART module. Experimental results in different places show the validity of the system and the deviation is no bigger than 6° in the presence of wind noise. PMID:24603636
Design of small MEMS microphone array systems for direction finding of outdoors moving vehicles.
Zhang, Xin; Huang, Jingchang; Song, Enliang; Liu, Huawei; Li, Baoqing; Yuan, Xiaobing
2014-03-05
In this paper, a MEMS microphone array system scheme is proposed which implements real-time direction of arrival (DOA) estimation for moving vehicles. Wind noise is the primary source of unwanted noise on microphones outdoors. A multiple signal classification (MUSIC) algorithm is used in this paper for direction finding associated with spatial coherence to discriminate between the wind noise and the acoustic signals of a vehicle. The method is implemented in a SHARC DSP processor and the real-time estimated DOA is uploaded through Bluetooth or a UART module. Experimental results in different places show the validity of the system and the deviation is no bigger than 6° in the presence of wind noise.
Time-delayed directional beam phased array antenna
Fund, Douglas Eugene; Cable, John William; Cecil, Tony Myron
2004-10-19
An antenna comprising a phased array of quadrifilar helix or other multifilar antenna elements and a time-delaying feed network adapted to feed the elements. The feed network can employ a plurality of coaxial cables that physically bridge a microstrip feed circuitry to feed power signals to the elements. The cables provide an incremental time delay which is related to their physical lengths, such that replacing cables having a first set of lengths with cables having a second set of lengths functions to change the time delay and shift or steer the antenna's main beam. Alternatively, the coaxial cables may be replaced with a programmable signal processor unit adapted to introduce the time delay using signal processing techniques applied to the power signals.
Development of a ground signal processor for digital synthetic array radar data
NASA Technical Reports Server (NTRS)
Griffin, C. R.; Estes, J. M.
1981-01-01
A modified APQ-102 sidelooking array radar (SLAR) in a B-57 aircraft test bed is used, with other optical and infrared sensors, in remote sensing of Earth surface features for various users at NASA Johnson Space Center. The video from the radar is normally recorded on photographic film and subsequently processed photographically into high resolution radar images. Using a high speed sampling (digitizing) system, the two receiver channels of cross-and co-polarized video are recorded on wideband magnetic tape along with radar and platform parameters. These data are subsequently reformatted and processed into digital synthetic aperture radar images with the image data available on magnetic tape for subsequent analysis by investigators. The system design and results obtained are described.
Dual Active Bridge based DC Transformer LabVIEW FPGA Control Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
In the area of power electronics control, Field Programmable Gate Arrays (FPGAs) have the capability to outperform their Digital Signal Processor (DSP) counterparts due to the FPGA’s ability to implement true parallel processing and therefore facilitate higher switching frequencies, higher control bandwidth, and/or enhanced functionality. National Instruments (NI) has developed two platforms, Compact RIO (cRIO) and Single Board RIO (sbRIO), which combine a real-time processor with an FPGA. The FPGA can be programmed with a subset of the well-known LabVIEW graphical programming language. The candidate software implements complete control algorithms in LabVIEW FPGA for a DC Transformer (DCX) based onmore » a dual active bridge (DAB). A DCX is an isolated bi-directional DC-DC converter designed to operate at unity conversion ratio, M, defined by where Vin is the primary-side DC bus voltage, Vout is the secondary-side DC bus voltage, and n is the turns ratio of the embedded high frequency transformer (HFX). The DCX based on a DAB incorporates two H-bridges, a resonant inductor, and an HFX to provide this functionality. The candidate software employs phase-shift modulation of the two H-bridges and a feedback loop to regulate the conversion ratio at unity. The software also includes alarm-handling capabilities as well as debugging and tuning tools. The software fits on the Xilinx Virtex V LX110 FPGA embedded in the NI cRIO-9118 FPGA chassis, and with a 40 MHz base clock, supports a modulation update rate of 40 MHz, and user-settable switching frequencies and synchronized control loop update rates of tens of kHz.« less
A fully reconfigurable photonic integrated signal processor
NASA Astrophysics Data System (ADS)
Liu, Weilin; Li, Ming; Guzzon, Robert S.; Norberg, Erik J.; Parker, John S.; Lu, Mingzhi; Coldren, Larry A.; Yao, Jianping
2016-03-01
Photonic signal processing has been considered a solution to overcome the inherent electronic speed limitations. Over the past few years, an impressive range of photonic integrated signal processors have been proposed, but they usually offer limited reconfigurability, a feature highly needed for the implementation of large-scale general-purpose photonic signal processors. Here, we report and experimentally demonstrate a fully reconfigurable photonic integrated signal processor based on an InP-InGaAsP material system. The proposed photonic signal processor is capable of performing reconfigurable signal processing functions including temporal integration, temporal differentiation and Hilbert transformation. The reconfigurability is achieved by controlling the injection currents to the active components of the signal processor. Our demonstration suggests great potential for chip-scale fully programmable all-optical signal processing.
Neurovision processor for designing intelligent sensors
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1992-03-01
A programmable multi-task neuro-vision processor, called the Positive-Negative (PN) neural processor, is proposed as a plausible hardware mechanism for constructing robust multi-task vision sensors. The computational operations performed by the PN neural processor are loosely based on the neural activity fields exhibited by certain nervous tissue layers situated in the brain. The neuro-vision processor can be programmed to generate diverse dynamic behavior that may be used for spatio-temporal stabilization (STS), short-term visual memory (STVM), spatio-temporal filtering (STF) and pulse frequency modulation (PFM). A multi- functional vision sensor that performs a variety of information processing operations on time- varying two-dimensional sensory images can be constructed from a parallel and hierarchical structure of numerous individually programmed PN neural processors.
A cochlear implant phantom for evaluating CT acquisition parameters
NASA Astrophysics Data System (ADS)
Chakravorti, Srijata; Bussey, Brian J.; Zhao, Yiyuan; Dawant, Benoit M.; Labadie, Robert F.; Noble, Jack H.
2017-03-01
Cochlear Implants (CIs) are surgically implantable neural prosthetic devices used to treat profound hearing loss. Recent literature indicates that there is a correlation between the positioning of the electrode array within the cochlea and the ultimate hearing outcome of the patient, indicating that further studies aimed at better understanding the relationship between electrode position and outcomes could have significant implications for future surgical techniques, array design, and processor programming methods. Post-implantation high resolution CT imaging is the best modality for localizing electrodes and provides the resolution necessary to visually identify electrode position, albeit with an unknown degree of accuracy depending on image acquisition parameters, like the HU range of reconstruction, radiation dose, and resolution of the image. In this paper, we report on the development of a phantom that will both permit studying which CT acquisition parameters are best for accurately identifying electrode position and serve as a ground truth for evaluating how different electrode localization methods perform when using different CT scanners and acquisition parameters. We conclude based on our tests that image resolution and HU range of reconstruction strongly affect how accurately the true position of the electrode array can be found by both experts and automatic analysis techniques. The results presented in this paper demonstrate that our phantom is a versatile tool for assessing how CT acquisition parameters affect the localization of CIs.
Jang, Jongmoon; Lee, JangWoo; Woo, Seongyong; Sly, David J; Campbell, Luke J; Cho, Jin-Ho; O'Leary, Stephen J; Park, Min-Hyun; Han, Sungmin; Choi, Ji-Wong; Jang, Jeong Hun; Choi, Hongsoo
2015-07-31
We proposed a piezoelectric artificial basilar membrane (ABM) composed of a microelectromechanical system cantilever array. The ABM mimics the tonotopy of the cochlea: frequency selectivity and mechanoelectric transduction. The fabricated ABM exhibits a clear tonotopy in an audible frequency range (2.92-12.6 kHz). Also, an animal model was used to verify the characteristics of the ABM as a front end for potential cochlear implant applications. For this, a signal processor was used to convert the piezoelectric output from the ABM to an electrical stimulus for auditory neurons. The electrical stimulus for auditory neurons was delivered through an implanted intra-cochlear electrode array. The amplitude of the electrical stimulus was modulated in the range of 0.15 to 3.5 V with incoming sound pressure levels (SPL) of 70.1 to 94.8 dB SPL. The electrical stimulus was used to elicit an electrically evoked auditory brainstem response (EABR) from deafened guinea pigs. EABRs were successfully measured and their magnitude increased upon application of acoustic stimuli from 75 to 95 dB SPL. The frequency selectivity of the ABM was estimated by measuring the magnitude of EABRs while applying sound pressure at the resonance and off-resonance frequencies of the corresponding cantilever of the selected channel. In this study, we demonstrated a novel piezoelectric ABM and verified its characteristics by measuring EABRs.
NASA Astrophysics Data System (ADS)
Blok, A. S.; Bukhenskii, A. F.; Krupitskii, É. I.; Morozov, S. V.; Pelevin, V. Yu; Sergeenko, T. N.; Yakovlev, V. I.
1995-10-01
An investigation is reported of acousto-optical and fibre-optic Fourier processors of electric signals, based on semiconductor lasers. A description is given of practical acousto-optical processors with an analysis band 120 MHz wide, a resolution of 200 kHz, and 7 cm × 8 cm × 18 cm dimensions. Fibre-optic Fourier processors are considered: they represent a new class of devices which are promising for the processing of gigahertz signals.
A single FPGA-based portable ultrasound imaging system for point-of-care applications.
Kim, Gi-Duck; Yoon, Changhan; Kye, Sang-Bum; Lee, Youngbae; Kang, Jeeun; Yoo, Yangmo; Song, Tai-kyong
2012-07-01
We present a cost-effective portable ultrasound system based on a single field-programmable gate array (FPGA) for point-of-care applications. In the portable ultrasound system developed, all the ultrasound signal and image processing modules, including an effective 32-channel receive beamformer with pseudo-dynamic focusing, are embedded in an FPGA chip. For overall system control, a mobile processor running Linux at 667 MHz is used. The scan-converted ultrasound image data from the FPGA are directly transferred to the system controller via external direct memory access without a video processing unit. The potable ultrasound system developed can provide real-time B-mode imaging with a maximum frame rate of 30, and it has a battery life of approximately 1.5 h. These results indicate that the single FPGA-based portable ultrasound system developed is able to meet the processing requirements in medical ultrasound imaging while providing improved flexibility for adapting to emerging POC applications.
Eigensolution of finite element problems in a completely connected parallel architecture
NASA Technical Reports Server (NTRS)
Akl, F.; Morel, M.
1989-01-01
A parallel algorithm is presented for the solution of the generalized eigenproblem in linear elastic finite element analysis. The algorithm is based on a completely connected parallel architecture in which each processor is allowed to communicate with all other processors. The algorithm is successfully implemented on a tightly coupled MIMD parallel processor. A finite element model is divided into m domains each of which is assumed to process n elements. Each domain is then assigned to a processor or to a logical processor (task) if the number of domains exceeds the number of physical processors. The effect of the number of domains, the number of degrees-of-freedom located along the global fronts, and the dimension of the subspace on the performance of the algorithm is investigated. For a 64-element rectangular plate, speed-ups of 1.86, 3.13, 3.18, and 3.61 are achieved on two, four, six, and eight processors, respectively.
Adapting a Navier-Stokes code to the ICL-DAP
NASA Technical Reports Server (NTRS)
Grosch, C. E.
1985-01-01
The results of an experiment are reported, i.c., to adapt a Navier-Stokes code, originally developed on a serial computer, to concurrent processing on the CL Distributed Array Processor (DAP). The algorithm used in solving the Navier-Stokes equations is briefly described. The architecture of the DAP and DAP FORTRAN are also described. The modifications of the algorithm so as to fit the DAP are given and discussed. Finally, performance results are given and conclusions are drawn.
Readout and DAQ for Pixel Detectors
NASA Astrophysics Data System (ADS)
Platkevic, Michal
2010-01-01
Data readout and acquisition control of pixel detectors demand the transfer of significantly a large amounts of bits between the detector and the computer. For this purpose dedicated interfaces are used which are designed with focus on features like speed, small dimensions or flexibility of use such as digital signal processors, field-programmable gate arrays (FPGA) and USB communication ports. This work summarizes the readout and DAQ system built for state-of-the-art pixel detectors of the Medipix family.
Workshop on Future Directions for Optical Information Processing.
1981-03-01
h . The i reference point source simultaneously illuminates the i member of a family of n phase-encoding Aiffusers (e.g. shower glass , ground glass ...diffuser (ground glass ) section illuminated with a plane wave [35.37). The n(n-1) - 4(3) - 12 crosstalk terms have been distributed into the noise...for 2x2 input Fig. 6. Outnut of processor analogous to that array, l.Sx magnifier, ground glass diffuser of Fig. 5, but using spherical wavefront and
Radiation Hardened Low Power Digital Signal Processor
2005-04-15
Image Figure 53.0 Point Spread Function PSF Figure 54.0 Restored Image and Restored PSF Figure 55.0 Newly Created Array Figure 56.0 Deblurred Image and... noise and interference rejection. WOA’s of 32-taps and greater are easily managed by the TCSP. An architecture that could efficiently perform filter...to quickly calculate a Remez filter impulse response to be used in place of the window function. Using the Remez exchange algorithm to calculate the
ELIPS: Toward a Sensor Fusion Processor on a Chip
NASA Technical Reports Server (NTRS)
Daud, Taher; Stoica, Adrian; Tyson, Thomas; Li, Wei-te; Fabunmi, James
1998-01-01
The paper presents the concept and initial tests from the hardware implementation of a low-power, high-speed reconfigurable sensor fusion processor. The Extended Logic Intelligent Processing System (ELIPS) processor is developed to seamlessly combine rule-based systems, fuzzy logic, and neural networks to achieve parallel fusion of sensor in compact low power VLSI. The first demonstration of the ELIPS concept targets interceptor functionality; other applications, mainly in robotics and autonomous systems are considered for the future. The main assumption behind ELIPS is that fuzzy, rule-based and neural forms of computation can serve as the main primitives of an "intelligent" processor. Thus, in the same way classic processors are designed to optimize the hardware implementation of a set of fundamental operations, ELIPS is developed as an efficient implementation of computational intelligence primitives, and relies on a set of fuzzy set, fuzzy inference and neural modules, built in programmable analog hardware. The hardware programmability allows the processor to reconfigure into different machines, taking the most efficient hardware implementation during each phase of information processing. Following software demonstrations on several interceptor data, three important ELIPS building blocks (a fuzzy set preprocessor, a rule-based fuzzy system and a neural network) have been fabricated in analog VLSI hardware and demonstrated microsecond-processing times.
Improving land vehicle situational awareness using a distributed aperture system
NASA Astrophysics Data System (ADS)
Fortin, Jean; Bias, Jason; Wells, Ashley; Riddle, Larry; van der Wal, Gooitzen; Piacentino, Mike; Mandelbaum, Robert
2005-05-01
U.S. Army Research, Development, and Engineering Command (RDECOM) Communications Electronics Research, Development and Engineering Center (CERDEC) Night Vision and Electronic Sensors Directorate (NVESD) has performed early work to develop a Distributed Aperture System (DAS). The DAS aims at improving the situational awareness of armored fighting vehicle crews under closed-hatch conditions. The concept is based on a plurality of sensors configured to create a day and night dome of surveillance coupled with heads up displays slaved to the operator's head to give a "glass turret" feel. State-of-the-art image processing is used to produce multiple seamless hemispherical views simultaneously available to the vehicle commander, crew members and dismounting infantry. On-the-move automatic cueing of multiple moving/pop-up low silhouette threats is also done with the possibility to save/revisit/share past events. As a first step in this development program, a contract was awarded to United Defense to further develop the Eagle VisionTM system. The second-generation prototype features two camera heads, each comprising four high-resolution (2048x1536) color sensors, and each covering a field of view of 270°hx150°v. High-bandwidth digital links interface the camera heads with a field programmable gate array (FPGA) based custom processor developed by Sarnoff Corporation. The processor computes the hemispherical stitch and warp functions required for real-time, low latency, immersive viewing (360°hx120°v, 30° down) and generates up to six simultaneous extended graphics array (XGA) video outputs for independent display either on a helmet-mounted display (with associated head tracking device) or a flat panel display (and joystick). The prototype is currently in its last stage of development and will be integrated on a vehicle for user evaluation and testing. Near-term improvements include the replacement of the color camera heads with a pixel-level fused combination of uncooled long wave infrared (LWIR) and low light level intensified imagery. It is believed that the DAS will significantly increase situational awareness by providing the users with a day and night, wide area coverage, immersive visualization capability.
Adaptive Instrument Module: Space Instrument Controller "Brain" through Programmable Logic Devices
NASA Technical Reports Server (NTRS)
Darrin, Ann Garrison; Conde, Richard; Chern, Bobbie; Luers, Phil; Jurczyk, Steve; Mills, Carl; Day, John H. (Technical Monitor)
2001-01-01
The Adaptive Instrument Module (AIM) will be the first true demonstration of reconfigurable computing with field-programmable gate arrays (FPGAs) in space, enabling the 'brain' of the system to evolve or adapt to changing requirements. In partnership with NASA Goddard Space Flight Center and the Australian Cooperative Research Centre for Satellite Systems (CRC-SS), APL has built the flight version to be flown on the Australian university-class satellite FEDSAT. The AIM provides satellites the flexibility to adapt to changing mission requirements by reconfiguring standardized processing hardware rather than incurring the large costs associated with new builds. This ability to reconfigure the processing in response to changing mission needs leads to true evolveable computing, wherein the instrument 'brain' can learn from new science data in order to perform state-of-the-art data processing. The development of the AIM is significant in its enormous potential to reduce total life-cycle costs for future space exploration missions. The advent of RAM-based FPGAs whose configuration can be changed at any time has enabled the development of the AIM for processing tasks that could not be performed in software. The use of the AIM enables reconfiguration of the FPGA circuitry while the spacecraft is in flight, with many accompanying advantages. The AIM demonstrates the practicalities of using reconfigurable computing hardware devices by conducting a series of designed experiments. These include the demonstration of implementing data compression, data filtering, and communication message processing and inter-experiment data computation. The second generation is the Adaptive Processing Template (ADAPT) which is further described in this paper. The next step forward is to make the hardware itself adaptable and the ADAPT pursues this challenge by developing a reconfigurable module that will be capable of functioning efficiently in various applications. ADAPT will take advantage of radiation tolerant RAM-based field programmable gate array (FPGA) technology to develop a reconfigurable processor that combines the flexibility of a general purpose processor running software with the performance of application specific processing hardware for a variety of high performance computing applications.
Design of a system based on DSP and FPGA for video recording and replaying
NASA Astrophysics Data System (ADS)
Kang, Yan; Wang, Heng
2013-08-01
This paper brings forward a video recording and replaying system with the architecture of Digital Signal Processor (DSP) and Field Programmable Gate Array (FPGA). The system achieved encoding, recording, decoding and replaying of Video Graphics Array (VGA) signals which are displayed on a monitor during airplanes and ships' navigating. In the architecture, the DSP is a main processor which is used for a large amount of complicated calculation during digital signal processing. The FPGA is a coprocessor for preprocessing video signals and implementing logic control in the system. In the hardware design of the system, Peripheral Device Transfer (PDT) function of the External Memory Interface (EMIF) is utilized to implement seamless interface among the DSP, the synchronous dynamic RAM (SDRAM) and the First-In-First-Out (FIFO) in the system. This transfer mode can avoid the bottle-neck of the data transfer and simplify the circuit between the DSP and its peripheral chips. The DSP's EMIF and two level matching chips are used to implement Advanced Technology Attachment (ATA) protocol on physical layer of the interface of an Integrated Drive Electronics (IDE) Hard Disk (HD), which has a high speed in data access and does not rely on a computer. Main functions of the logic on the FPGA are described and the screenshots of the behavioral simulation are provided in this paper. In the design of program on the DSP, Enhanced Direct Memory Access (EDMA) channels are used to transfer data between the FIFO and the SDRAM to exert the CPU's high performance on computing without intervention by the CPU and save its time spending. JPEG2000 is implemented to obtain high fidelity in video recording and replaying. Ways and means of acquiring high performance for code are briefly present. The ability of data processing of the system is desirable. And smoothness of the replayed video is acceptable. By right of its design flexibility and reliable operation, the system based on DSP and FPGA for video recording and replaying has a considerable perspective in analysis after the event, simulated exercitation and so forth.
Sochol, Ryan D; Lu, Albert; Lei, Jonathan; Iwai, Kosuke; Lee, Luke P; Lin, Liwei
2014-05-07
Self-regulating fluidic components are critical to the advancement of microfluidic processors for chemical and biological applications, such as sample preparation on chip, point-of-care molecular diagnostics, and implantable drug delivery devices. Although researchers have developed a wide range of components to enable flow rectification in fluidic systems, engineering microfluidic diodes that function at the low Reynolds number (Re) flows and smaller scales of emerging micro/nanofluidic platforms has remained a considerable challenge. Recently, researchers have demonstrated microfluidic diodes that utilize high numbers of suspended microbeads as dynamic resistive elements; however, using spherical particles to block fluid flow through rectangular microchannels is inherently limited. To overcome this issue, here we present a single-layer microfluidic bead-based diode (18 μm in height) that uses a targeted circular-shaped microchannel for the docking of a single microbead (15 μm in diameter) to rectify fluid flow under low Re conditions. Three-dimensional simulations and experimental results revealed that adjusting the docking channel geometry and size to better match the suspended microbead greatly increased the diodicity (Di) performance. Arraying multiple bead-based diodes in parallel was found to adversely affect system efficacy, while arraying multiple diodes in series was observed to enhance device performance. In particular, systems consisting of four microfluidic bead-based diodes with targeted circular-shaped docking channels in series revealed average Di's ranging from 2.72 ± 0.41 to 10.21 ± 1.53 corresponding to Re varying from 0.1 to 0.6.
A high performance linear equation solver on the VPP500 parallel supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi
1994-12-31
This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of VPP500--(1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LIN-PACK Highly Parallel Computing benchmark.
PixonVision real-time video processor
NASA Astrophysics Data System (ADS)
Puetter, R. C.; Hier, R. G.
2007-09-01
PixonImaging LLC and DigiVision, Inc. have developed a real-time video processor, the PixonVision PV-200, based on the patented Pixon method for image deblurring and denoising, and DigiVision's spatially adaptive contrast enhancement processor, the DV1000. The PV-200 can process NTSC and PAL video in real time with a latency of 1 field (1/60 th of a second), remove the effects of aerosol scattering from haze, mist, smoke, and dust, improve spatial resolution by up to 2x, decrease noise by up to 6x, and increase local contrast by up to 8x. A newer version of the processor, the PV-300, is now in prototype form and can handle high definition video. Both the PV-200 and PV-300 are FPGA-based processors, which could be spun into ASICs if desired. Obvious applications of these processors include applications in the DOD (tanks, aircraft, and ships), homeland security, intelligence, surveillance, and law enforcement. If developed into an ASIC, these processors will be suitable for a variety of portable applications, including gun sights, night vision goggles, binoculars, and guided munitions. This paper presents a variety of examples of PV-200 processing, including examples appropriate to border security, battlefield applications, port security, and surveillance from unmanned aerial vehicles.
SpaceCube v2.0 Space Flight Hybrid Reconfigurable Data Processing System
NASA Technical Reports Server (NTRS)
Petrick, Dave
2014-01-01
This paper details the design architecture, design methodology, and the advantages of the SpaceCube v2.0 high performance data processing system for space applications. The purpose in building the SpaceCube v2.0 system is to create a superior high performance, reconfigurable, hybrid data processing system that can be used in a multitude of applications including those that require a radiation hardened and reliable solution. The SpaceCube v2.0 system leverages seven years of board design, avionics systems design, and space flight application experiences. This paper shows how SpaceCube v2.0 solves the increasing computing demands of space data processing applications that cannot be attained with a standalone processor approach.The main objective during the design stage is to find a good system balance between power, size, reliability, cost, and data processing capability. These design variables directly impact each other, and it is important to understand how to achieve a suitable balance. This paper will detail how these critical design factors were managed including the construction of an Engineering Model for an experiment on the International Space Station to test out design concepts. We will describe the designs for the processor card, power card, backplane, and a mission unique interface card. The mechanical design for the box will also be detailed since it is critical in meeting the stringent thermal and structural requirements imposed by the processing system. In addition, the mechanical design uses advanced thermal conduction techniques to solve the internal thermal challenges.The SpaceCube v2.0 processing system is based on an extended version of the 3U cPCI standard form factor where each card is 190mm x 100mm in size The typical power draw of the processor card is 8 to 10W and scales with application complexity. The SpaceCube v2.0 data processing card features two Xilinx Virtex-5 QV Field Programmable Gate Arrays (FPGA), eight memory modules, a monitor FPGA with analog monitoring, Ethernet, configurable interconnect to the Xilinx FPGAs including gigabit transceivers, and the necessary voltage regulation. The processor board uses a back-to-back design methodology for common parts that maximizes the board real estate available. This paper will show how to meet the IPC 6012B Class 3A standard with a 22-layer board that has two column grid array devices with 1.0mm pitch. All layout trades such as stack-up options, via selection, and FPGA signal breakout will be discussed with feature size results. The overall board design process will be discussed including parts selection, circuit design, proper signal termination, layout placement and route planning, signal integrity design and verification, and power integrity results. The radiation mitigation techniques will also be detailed including configuration scrubbing options, Xilinx circuit mitigation and FPGA functional monitoring, and memory protection.Finally, this paper will describe how this system is being used to solve the extreme challenges of a robotic satellite servicing mission where typical space-rated processors are not sufficient enough to meet the intensive data processing requirements. The SpaceCube v2.0 is the main payload control computer and is required to control critical subsystems such as autonomous rendezvous and docking using a suite of vision sensors and object avoidance when controlling two robotic arms.
List-mode PET image reconstruction for motion correction using the Intel XEON PHI co-processor
NASA Astrophysics Data System (ADS)
Ryder, W. J.; Angelis, G. I.; Bashar, R.; Gillam, J. E.; Fulton, R.; Meikle, S.
2014-03-01
List-mode image reconstruction with motion correction is computationally expensive, as it requires projection of hundreds of millions of rays through a 3D array. To decrease reconstruction time it is possible to use symmetric multiprocessing computers or graphics processing units. The former can have high financial costs, while the latter can require refactoring of algorithms. The Xeon Phi is a new co-processor card with a Many Integrated Core architecture that can run 4 multiple-instruction, multiple data threads per core with each thread having a 512-bit single instruction, multiple data vector register. Thus, it is possible to run in the region of 220 threads simultaneously. The aim of this study was to investigate whether the Xeon Phi co-processor card is a viable alternative to an x86 Linux server for accelerating List-mode PET image reconstruction for motion correction. An existing list-mode image reconstruction algorithm with motion correction was ported to run on the Xeon Phi coprocessor with the multi-threading implemented using pthreads. There were no differences between images reconstructed using the Phi co-processor card and images reconstructed using the same algorithm run on a Linux server. However, it was found that the reconstruction runtimes were 3 times greater for the Phi than the server. A new version of the image reconstruction algorithm was developed in C++ using OpenMP for mutli-threading and the Phi runtimes decreased to 1.67 times that of the host Linux server. Data transfer from the host to co-processor card was found to be a rate-limiting step; this needs to be carefully considered in order to maximize runtime speeds. When considering the purchase price of a Linux workstation with Xeon Phi co-processor card and top of the range Linux server, the former is a cost-effective computation resource for list-mode image reconstruction. A multi-Phi workstation could be a viable alternative to cluster computers at a lower cost for medical imaging applications.
A customizable commercial miniaturized 320×256 indium gallium arsenide shortwave infrared camera
NASA Astrophysics Data System (ADS)
Huang, Shih-Che; O'Grady, Matthew; Groppe, Joseph V.; Ettenberg, Martin H.; Brubaker, Robert M.
2004-10-01
The design and performance of a commercial short-wave-infrared (SWIR) InGaAs microcamera engine is presented. The 0.9-to-1.7 micron SWIR imaging system consists of a room-temperature-TEC-stabilized, 320x256 (25 μm pitch) InGaAs focal plane array (FPA) and a high-performance, highly customizable image-processing set of electronics. The detectivity, D*, of the system is greater than 1013 cm-√Hz/W at 1.55 μm, and this sensitivity may be adjusted in real-time over 100 dB. It features snapshot-mode integration with a minimum exposure time of 130 μs. The digital video processor provides real time pixel-to-pixel, 2-point dark-current subtraction and non-uniformity compensation along with defective-pixel substitution. Other features include automatic gain control (AGC), gamma correction, 7 preset configurations, adjustable exposure time, external triggering, and windowing. The windowing feature is highly flexible; the region of interest (ROI) may be placed anywhere on the imager and can be varied at will. Windowing allows for high-speed readout enabling such applications as target acquisition and tracking; for example, a 32x32 ROI window may be read out at over 3500 frames per second (fps). Output video is provided as EIA170-compatible analog, or as 12-bit CameraLink-compatible digital. All the above features are accomplished in a small volume < 28 cm3, weight < 70 g, and with low power consumption < 1.3 W at room temperature using this new microcamera engine. Video processing is based on a field-programmable gate array (FPGA) platform with a soft-embedded processor that allows for ease of integration/addition of customer-specific algorithms, processes, or design requirements. The camera was developed with the high-performance, space-restricted, power-conscious application in mind, such as robotic or UAV deployment.
Parallel processor for real-time structural control
NASA Astrophysics Data System (ADS)
Tise, Bert L.
1993-07-01
A parallel processor that is optimized for real-time linear control has been developed. This modular system consists of A/D modules, D/A modules, and floating-point processor modules. The scalable processor uses up to 1,000 Motorola DSP96002 floating-point processors for a peak computational rate of 60 GFLOPS. Sampling rates up to 625 kHz are supported by this analog-in to analog-out controller. The high processing rate and parallel architecture make this processor suitable for computing state-space equations and other multiply/accumulate-intensive digital filters. Processor features include 14-bit conversion devices, low input-to-output latency, 240 Mbyte/s synchronous backplane bus, low-skew clock distribution circuit, VME connection to host computer, parallelizing code generator, and look- up-tables for actuator linearization. This processor was designed primarily for experiments in structural control. The A/D modules sample sensors mounted on the structure and the floating- point processor modules compute the outputs using the programmed control equations. The outputs are sent through the D/A module to the power amps used to drive the structure's actuators. The host computer is a Sun workstation. An OpenWindows-based control panel is provided to facilitate data transfer to and from the processor, as well as to control the operating mode of the processor. A diagnostic mode is provided to allow stimulation of the structure and acquisition of the structural response via sensor inputs.
Adaptive near-field beamforming techniques for sound source imaging.
Cho, Yong Thung; Roan, Michael J
2009-02-01
Phased array signal processing techniques such as beamforming have a long history in applications such as sonar for detection and localization of far-field sound sources. Two sometimes competing challenges arise in any type of spatial processing; these are to minimize contributions from directions other than the look direction and minimize the width of the main lobe. To tackle this problem a large body of work has been devoted to the development of adaptive procedures that attempt to minimize side lobe contributions to the spatial processor output. In this paper, two adaptive beamforming procedures-minimum variance distorsionless response and weight optimization to minimize maximum side lobes--are modified for use in source visualization applications to estimate beamforming pressure and intensity using near-field pressure measurements. These adaptive techniques are compared to a fixed near-field focusing technique (both techniques use near-field beamforming weightings focusing at source locations estimated based on spherical wave array manifold vectors with spatial windows). Sound source resolution accuracies of near-field imaging procedures with different weighting strategies are compared using numerical simulations both in anechoic and reverberant environments with random measurement noise. Also, experimental results are given for near-field sound pressure measurements of an enclosed loudspeaker.
Human factors considerations in the evaluation of processor-based signal and train control systems
DOT National Transportation Integrated Search
2007-06-01
In August 2001, the Federal Railroad Administration issued the notice of proposed rulemaking: Standards for Development and : Use of Processor-Based Signal and Train Control Systems (49 Code of Federal Regulations Part 236). This proposed rule addres...
Enabling Future Robotic Missions with Multicore Processors
NASA Technical Reports Server (NTRS)
Powell, Wesley A.; Johnson, Michael A.; Wilmot, Jonathan; Some, Raphael; Gostelow, Kim P.; Reeves, Glenn; Doyle, Richard J.
2011-01-01
Recent commercial developments in multicore processors (e.g. Tilera, Clearspeed, HyperX) have provided an option for high performance embedded computing that rivals the performance attainable with FPGA-based reconfigurable computing architectures. Furthermore, these processors offer more straightforward and streamlined application development by allowing the use of conventional programming languages and software tools in lieu of hardware design languages such as VHDL and Verilog. With these advantages, multicore processors can significantly enhance the capabilities of future robotic space missions. This paper will discuss these benefits, along with onboard processing applications where multicore processing can offer advantages over existing or competing approaches. This paper will also discuss the key artchitecural features of current commercial multicore processors. In comparison to the current art, the features and advancements necessary for spaceflight multicore processors will be identified. These include power reduction, radiation hardening, inherent fault tolerance, and support for common spacecraft bus interfaces. Lastly, this paper will explore how multicore processors might evolve with advances in electronics technology and how avionics architectures might evolve once multicore processors are inserted into NASA robotic spacecraft.
Systems and Methods for Automated Vessel Navigation Using Sea State Prediction
NASA Technical Reports Server (NTRS)
Huntsberger, Terrance L. (Inventor); Howard, Andrew B. (Inventor); Reinhart, Rene Felix (Inventor); Aghazarian, Hrand (Inventor); Rankin, Arturo (Inventor)
2017-01-01
Systems and methods for sea state prediction and autonomous navigation in accordance with embodiments of the invention are disclosed. One embodiment of the invention includes a method of predicting a future sea state including generating a sequence of at least two 3D images of a sea surface using at least two image sensors, detecting peaks and troughs in the 3D images using a processor, identifying at least one wavefront in each 3D image based upon the detected peaks and troughs using the processor, characterizing at least one propagating wave based upon the propagation of wavefronts detected in the sequence of 3D images using the processor, and predicting a future sea state using at least one propagating wave characterizing the propagation of wavefronts in the sequence of 3D images using the processor. Another embodiment includes a method of autonomous vessel navigation based upon a predicted sea state and target location.
Systems and Methods for Automated Vessel Navigation Using Sea State Prediction
NASA Technical Reports Server (NTRS)
Aghazarian, Hrand (Inventor); Reinhart, Rene Felix (Inventor); Huntsberger, Terrance L. (Inventor); Rankin, Arturo (Inventor); Howard, Andrew B. (Inventor)
2015-01-01
Systems and methods for sea state prediction and autonomous navigation in accordance with embodiments of the invention are disclosed. One embodiment of the invention includes a method of predicting a future sea state including generating a sequence of at least two 3D images of a sea surface using at least two image sensors, detecting peaks and troughs in the 3D images using a processor, identifying at least one wavefront in each 3D image based upon the detected peaks and troughs using the processor, characterizing at least one propagating wave based upon the propagation of wavefronts detected in the sequence of 3D images using the processor, and predicting a future sea state using at least one propagating wave characterizing the propagation of wavefronts in the sequence of 3D images using the processor. Another embodiment includes a method of autonomous vessel navigation based upon a predicted sea state and target location.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
NASA Astrophysics Data System (ADS)
Meledin, V.; Anikin, Yu.; Bakakin, G.; Glavniy, V.; Dvoinishnikov, S.; Kulikov, D.; Naumov, I.; Okulov, V.; Pavlov, V.; Rakhmanov, V.; Sadbakov, O.; Mostovskiy, N.; Ilyin, S.
2006-05-01
For hydrodynamic examinations of the turbid three-phase streams with air bubbles and with a depth more than 500 mm for the first time it is developed 2D Laser Doppler Semiconductor Anemometer LADO5-LMZ. Anemometer signal processor base on <
VLITE-Fast: A Real-time, 350 MHz Commensal VLA Survey for Fast Transients
NASA Astrophysics Data System (ADS)
Kerr, Matthew; Ray, Paul S.; Kassim, Namir E.; Clarke, Tracy; Deneva, Julia; Polisensky, Emil
2018-01-01
The VLITE (VLA Low Band Ionosphere and Transient Experiment; http://vlite.nrao.edu) program operates commensally during all Very Large Array observations, collecting data from 320 to 384 MHz. Recently expanded to include 16 antennas, the large field of view and huge time on sky offer good coverage of the transient, low-frequency sky. We describe the VLITE-Fast system, a GPU-based signal processor capable of detecting short (<1s) transients in real time and triggering recording of baseband voltage for offline imaging. In the case of Fast Radio Bursts, this offers the opportunity for discovering host galaxies of non-repeating FRBs, and in the case of single pulses, the identification of pulsar positions for dedicated follow-up. We describe the observing system, techniques for mitigating interference, and initial results from searches for FRBs.
Scalable, efficient ASICS for the square kilometre array: From A/D conversion to central correlation
NASA Astrophysics Data System (ADS)
Schmatz, M. L.; Jongerius, R.; Dittmann, G.; Anghel, A.; Engbersen, T.; van Lunteren, J.; Buchmann, P.
2014-05-01
The Square Kilometre Array (SKA) is a future radio telescope, currently being designed by the worldwide radio-astronomy community. During the first of two construction phases, more than 250,000 antennas will be deployed, clustered in aperture-array stations. The antennas will generate 2.5 Pb/s of data, which needs to be processed in real time. For the processing stages from A/D conversion to central correlation, we propose an ASIC solution using only three chip architectures. The architecture is scalable - additional chips support additional antennas or beams - and versatile - it can relocate its receiver band within a range of a few MHz up to 4GHz. This flexibility makes it applicable to both SKA phases 1 and 2. The proposed chips implement an antenna and station processor for 289 antennas with a power consumption on the order of 600W and a correlator, including corner turn, for 911 stations on the order of 90 kW.
Precise and Efficient Static Array Bound Checking for Large Embedded C Programs
NASA Technical Reports Server (NTRS)
Venet, Arnaud
2004-01-01
In this paper we describe the design and implementation of a static array-bound checker for a family of embedded programs: the flight control software of recent Mars missions. These codes are large (up to 250 KLOC), pointer intensive, heavily multithreaded and written in an object-oriented style, which makes their analysis very challenging. We designed a tool called C Global Surveyor (CGS) that can analyze the largest code in a couple of hours with a precision of 80%. The scalability and precision of the analyzer are achieved by using an incremental framework in which a pointer analysis and a numerical analysis of array indices mutually refine each other. CGS has been designed so that it can distribute the analysis over several processors in a cluster of machines. To the best of our knowledge this is the first distributed implementation of static analysis algorithms. Throughout the paper we will discuss the scalability setbacks that we encountered during the construction of the tool and their impact on the initial design decisions.
The Acceleration of Structural Microarchitectural Simulation via Scheduling
2006-11-01
193 viii List of Tables 1.1 Size of Intel R ©Processors...Table 1.1 shows the total and estimated non-cache transistor counts in succeeding generations of Intel R ©microprocessors. (Cache array transistors are...Intel486TM 1989 1,200,000 800,000 Intel R ©Pentium R © 1993 3,100,000 2,300,000 Intel R ©Pentium R ©II 1997 7,500,000 5,500,000 Intel R ©Pentium R ©III 1999
Function algorithms for MPP scientific subroutines, volume 1
NASA Technical Reports Server (NTRS)
Gouch, J. G.
1984-01-01
Design documentation and user documentation for function algorithms for the Massively Parallel Processor (MPP) are presented. The contract specifies development of MPP assembler instructions to perform the following functions: natural logarithm; exponential (e to the x power); square root; sine; cosine; and arctangent. To fulfill the requirements of the contract, parallel array and solar implementations for these functions were developed on the PDP11/34 Program Development and Management Unit (PDMU) that is resident at the MPP testbed installation located at the NASA Goddard facility.
NASA Technical Reports Server (NTRS)
Perez, Christopher E.; Berg, Melanie D.; Friendlich, Mark R.
2011-01-01
Motivation for this work is: (1) Accurately characterize digital signal processor (DSP) core single-event effect (SEE) behavior (2) Test DSP cores across a large frequency range and across various input conditions (3) Isolate SEE analysis to DSP cores alone (4) Interpret SEE analysis in terms of single-event upsets (SEUs) and single-event transients (SETs) (5) Provide flight missions with accurate estimate of DSP core error rates and error signatures.
Integrated Optical Synthetic Aperture Radar Processor.
1987-09-01
acoustooptic cell was employed to input each radar return into a time-and-space integrating optical architecture comprised of several lenses, a CCD area array...acoustooptic cell and parallel rib waveguide structure. During the course of the literature survey, we became aware of an elegant and poten- tially profound...wave.) scatterer at (f , A(t) is the far-field pattern of the antenna. From the geometry of Si. 1. R can be written as [I-2R,/c - nT1 r(t) = A(nT) rectj
NASA Technical Reports Server (NTRS)
Mugler, D. H.; Ross, M. D.
1990-01-01
The inner ear contains sensory organs which signal changes in head movement. The vestibular sacs, in particular, are sensitive to linear accelerations. Electron microscopic images have revealed the structure of tiny sensory hair bundles, whose mechanical deformation results in the initiation of neuronal activity and the transmission of electrical signals to the brain. The structure of the hair bundles is shown in this paper to be that of the most efficient two-dimensional phased-array signal processors.
Realtime photoacoustic microscopy in vivo with a 30-MHz ultrasound array transducer.
Zemp, Roger J; Song, Liang; Bitton, Rachel; Shung, K Kirk; Wang, Lihong V
2008-05-26
We present a novel high-frequency photoacoustic microscopy system capable of imaging the microvasculature of living subjects in realtime to depths of a few mm. The system consists of a high-repetition-rate Q-switched pump laser, a tunable dye laser, a 30-MHz linear ultrasound array transducer, a multichannel high-frequency data acquisition system, and a shared-RAM multi-core-processor computer. Data acquisition, beamforming, scan conversion, and display are implemented in realtime at 50 frames per second. Clearly resolvable images of 6-microm-diameter carbon fibers are experimentally demonstrated at 80 microm separation distances. Realtime imaging performance is demonstrated on phantoms and in vivo with absorbing structures identified to depths of 2.5-3 mm. This work represents the first high-frequency realtime photoacoustic imaging system to our knowledge.
NASA Technical Reports Server (NTRS)
Gentzsch, W.
1982-01-01
Problems which can arise with vector and parallel computers are discussed in a user oriented context. Emphasis is placed on the algorithms used and the programming techniques adopted. Three recently developed supercomputers are examined and typical application examples are given in CRAY FORTRAN, CYBER 205 FORTRAN and DAP (distributed array processor) FORTRAN. The systems performance is compared. The addition of parts of two N x N arrays is considered. The influence of the architecture on the algorithms and programming language is demonstrated. Numerical analysis of magnetohydrodynamic differential equations by an explicit difference method is illustrated, showing very good results for all three systems. The prognosis for supercomputer development is assessed.
On-board computational efficiency in real time UAV embedded terrain reconstruction
NASA Astrophysics Data System (ADS)
Partsinevelos, Panagiotis; Agadakos, Ioannis; Athanasiou, Vasilis; Papaefstathiou, Ioannis; Mertikas, Stylianos; Kyritsis, Sarantis; Tripolitsiotis, Achilles; Zervos, Panagiotis
2014-05-01
In the last few years, there is a surge of applications for object recognition, interpretation and mapping using unmanned aerial vehicles (UAV). Specifications in constructing those UAVs are highly diverse with contradictory characteristics including cost-efficiency, carrying weight, flight time, mapping precision, real time processing capabilities, etc. In this work, a hexacopter UAV is employed for near real time terrain mapping. The main challenge addressed is to retain a low cost flying platform with real time processing capabilities. The UAV weight limitation affecting the overall flight time, makes the selection of the on-board processing components particularly critical. On the other hand, surface reconstruction, as a computational demanding task, calls for a highly demanding processing unit on board. To merge these two contradicting aspects along with customized development, a System on a Chip (SoC) integrated circuit is proposed as a low-power, low-cost processor, which natively supports camera sensors and positioning and navigation systems. Modern SoCs, such as Omap3530 or Zynq, are classified as heterogeneous devices and provide a versatile platform, allowing access to both general purpose processors, such as the ARM11, as well as specialized processors, such as a digital signal processor and floating field-programmable gate array. A UAV equipped with the proposed embedded processors, allows on-board terrain reconstruction using stereo vision in near real time. Furthermore, according to the frame rate required, additional image processing may concurrently take place, such as image rectification andobject detection. Lastly, the onboard positioning and navigation (e.g., GNSS) chip may further improve the quality of the generated map. The resulting terrain maps are compared to ground truth geodetic measurements in order to access the accuracy limitations of the overall process. It is shown that with our proposed novel system,there is much potential in computational efficiency on board and in optimized time constraints.
Smart Power Supply for Battery-Powered Systems
NASA Technical Reports Server (NTRS)
Krasowski, Michael J.; Greer, Lawrence; Prokop, Norman F.; Flatico, Joseph M.
2010-01-01
A power supply for battery-powered systems has been designed with an embedded controller that is capable of monitoring and maintaining batteries, charging hardware, while maintaining output power. The power supply is primarily designed for rovers and other remote science and engineering vehicles, but it can be used in any battery alone, or battery and charging source applications. The supply can function autonomously, or can be connected to a host processor through a serial communications link. It can be programmed a priori or on the fly to return current and voltage readings to a host. It has two output power busses: a constant 24-V direct current nominal bus, and a programmable bus for output from approximately 24 up to approximately 50 V. The programmable bus voltage level, and its output power limit, can be changed on the fly as well. The power supply also offers options to reduce the programmable bus to 24 V when the set power limit is reached, limiting output power in the case of a system fault detected in the system. The smart power supply is based on an embedded 8051-type single-chip microcontroller. This choice was made in that a credible progression to flight (radiation hard, high reliability) can be assumed as many 8051 processors or gate arrays capable of accepting 8051-type core presently exist and will continue to do so for some time. To solve the problem of centralized control, this innovation moves an embedded microcontroller to the power supply and assigns it the task of overseeing the operation and charging of the power supply assets. This embedded processor is connected to the application central processor via a serial data link such that the central processor can request updates of various parameters within the supply, such as battery current, bus voltage, remaining power in battery estimations, etc. This supply has a direct connection to the battery bus for common (quiescent) power application. Because components from multiple vendors may have differing power needs, this supply also has a secondary power bus, which can be programmed a priori or on-the-fly to boost the primary battery voltage level from 24 to 50 V to accommodate various loads as they are brought on line. Through voltage and current monitoring, the device can also shield the charging source from overloads, keep it within safe operating modes, and can meter available power to the application and maintain safe operations.
50 CFR 679.50 - Groundfish Observer Program.
Code of Federal Regulations, 2010 CFR
2010-10-01
... following: (A) Identification of the management, organizational structure, and ownership structure of the.../processors. A catcher/processor will be assigned to a fishery category based on the retained groundfish catch... in Federal waters will be assigned to a fishery category based on the retained groundfish catch...
NASA Technical Reports Server (NTRS)
Humphreys, William M., Jr.; Lockard, David P.; Khorrami, Mehdi R.; Culliton, William G.; McSwain, Robert G.; Ravetta, Patricio A.; Johns, Zachary
2016-01-01
A new aeroacoustic measurement capability has been developed consisting of a large channelcount, field-deployable microphone phased array suitable for airframe noise flyover measurements for a range of aircraft types and scales. The array incorporates up to 185 hardened, weather-resistant sensors suitable for outdoor use. A custom 4-mA current loop receiver circuit with temperature compensation was developed to power the sensors over extended cable lengths with minimal degradation of the signal to noise ratio and frequency response. Extensive laboratory calibrations and environmental testing of the sensors were conducted to verify the design's performance specifications. A compact data system combining sensor power, signal conditioning, and digitization was assembled for use with the array. Complementing the data system is a robust analysis system capable of near real-time presentation of beamformed and deconvolved contour plots and integrated spectra obtained from array data acquired during flyover passes. Additional instrumentation systems needed to process the array data were also assembled. These include a commercial weather station and a video monitoring / recording system. A detailed mock-up of the instrumentation suite (phased array, weather station, and data processor) was performed in the NASA Langley Acoustic Development Laboratory to vet the system performance. The first deployment of the system occurred at Finnegan Airfield at Fort A.P. Hill where the array was utilized to measure the vehicle noise from a number of sUAS (small Unmanned Aerial System) aircraft. A unique in-situ calibration method for the array microphones using a hovering aerial sound source was attempted for the first time during the deployment.
Yes! An object-oriented compiler compiler (YOOCC)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Avotins, J.; Mingins, C.; Schmidt, H.
1995-12-31
Grammar-based processor generation is one of the most widely studied areas in language processor construction. However, there have been very few approaches to date that reconcile object-oriented principles, processor generation, and an object-oriented language. Pertinent here also. is that currently to develop a processor using the Eiffel Parse libraries requires far too much time to be expended on tasks that can be automated. For these reasons, we have developed YOOCC (Yes! an Object-Oriented Compiler Compiler), which produces a processor framework from a grammar using an enhanced version of the Eiffel Parse libraries, incorporating the ideas hypothesized by Meyer, and Grapemore » and Walden, as well as many others. Various essential changes have been made to the Eiffel Parse libraries. Examples are presented to illustrate the development of a processor using YOOCC, and it is concluded that the Eiffel Parse libraries are now not only an intelligent, but also a productive option for processor construction.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang Meizhen; Shi Longzhao; Wang Yuxing
2006-08-15
An inherently nonlinear relation between the output current of the tetralateral position sensitive detector (PSD) and the position of the incident light spot has been found theoretically. Based on single-chip microcomputer and the theoretical relation between output current and position, a new signal processor capable of correcting nonlinearity and reducing position measurement deviation of tetralateral PSD was developed. A tetralateral PSD (S1200, 13x13 mm{sup 2}, Hamamatsu Photonics K.K.) was measured with the new signal processor, a linear relation between the output position of the PSD, and the incident position of the light spot was obtained. In the 60% range ofmore » a 13x13 mm{sup 2} active area, the position nonlinearity (rms) was 0.15% and the position measurement deviation (rms) was {+-}20 {mu}m. Compared with traditional analog signal processor, the new signal processor is of better compatibility, lower cost, higher precision, and easier to be interfaced.« less
NASA Astrophysics Data System (ADS)
Huang, Mei-Zhen; Shi, Long-Zhao; Wang, Yu-Xing; Ni, Yi; Li, Zhen-Qing; Ding, Hai-Feng
2006-08-01
An inherently nonlinear relation between the output current of the tetralateral position sensitive detector (PSD) and the position of the incident light spot has been found theoretically. Based on single-chip microcomputer and the theoretical relation between output current and position, a new signal processor capable of correcting nonlinearity and reducing position measurement deviation of tetralateral PSD was developed. A tetralateral PSD (S1200, 13×13mm2, Hamamatsu Photonics K.K.) was measured with the new signal processor, a linear relation between the output position of the PSD, and the incident position of the light spot was obtained. In the 60% range of a 13×13mm2 active area, the position nonlinearity (rms) was 0.15% and the position measurement deviation (rms) was ±20μm. Compared with traditional analog signal processor, the new signal processor is of better compatibility, lower cost, higher precision, and easier to be interfaced.
Real-time phase correlation based integrated system for seizure detection
NASA Astrophysics Data System (ADS)
Romaine, James B.; Delgado-Restituto, Manuel; Leñero-Bardallo, Juan A.; Rodríguez-Vázquez, Ángel
2017-05-01
This paper reports a low area, low power, integer-based digital processor for the calculation of phase synchronization between two neural signals. The processor calculates the phase-frequency content of a signal by identifying the specific time periods associated with two consecutive minima. The simplicity of this phase-frequency content identifier allows for the digital processor to utilize only basic digital blocks, such as registers, counters, adders and subtractors, without incorporating any complex multiplication and or division algorithms. In fact, the processor, fabricated in a 0.18μm CMOS process, only occupies an area of 0.0625μm2 and consumes 12.5nW from a 1.2V supply voltage when operated at 128kHz. These low-area, low-power features make the proposed processor a valuable computing element in closed loop neural prosthesis for the treatment of neural diseases, such as epilepsy, or for extracting functional connectivity maps between different recording sites in the brain.
Huang, Kuan-Ju; Shih, Wei-Yeh; Chang, Jui Chung; Feng, Chih Wei; Fang, Wai-Chi
2013-01-01
This paper presents a pipeline VLSI design of fast singular value decomposition (SVD) processor for real-time electroencephalography (EEG) system based on on-line recursive independent component analysis (ORICA). Since SVD is used frequently in computations of the real-time EEG system, a low-latency and high-accuracy SVD processor is essential. During the EEG system process, the proposed SVD processor aims to solve the diagonal, inverse and inverse square root matrices of the target matrices in real time. Generally, SVD requires a huge amount of computation in hardware implementation. Therefore, this work proposes a novel design concept for data flow updating to assist the pipeline VLSI implementation. The SVD processor can greatly improve the feasibility of real-time EEG system applications such as brain computer interfaces (BCIs). The proposed architecture is implemented using TSMC 90 nm CMOS technology. The sample rate of EEG raw data adopts 128 Hz. The core size of the SVD processor is 580×580 um(2), and the speed of operation frequency is 20MHz. It consumes 0.774mW of power during the 8-channel EEG system per execution time.
Miniature Fuel Processors for Portable Fuel Cell Power Supplies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holladay, Jamie D.; Jones, Evan O.; Palo, Daniel R.
2003-06-02
Miniature and micro-scale fuel processors are discussed. The enabling technologies for these devices are the novel catalysts and the micro-technology-based designs. The novel catalyst allows for methanol reforming at high gas hourly space velocities of 50,000 hr-1 or higher, while maintaining a carbon monoxide levels at 1% or less. The micro-technology-based designs enable the devices to be extremely compact and lightweight. The miniature fuel processors can nominally provide between 25-50 watts equivalent of hydrogen which is ample for soldier or personal portable power supplies. The integrated processors have a volume less than 50 cm3, a mass less than 150 grams,more » and thermal efficiencies of up to 83%. With reasonable assumptions on fuel cell efficiencies, anode gas and water management, parasitic power loss, etc., the energy density was estimated at 1700 Whr/kg. The miniature processors have been demonstrated with a carbon monoxide clean-up method and a fuel cell stack. The micro-scale fuel processors have been designed to provide up to 0.3 watt equivalent of power with efficiencies over 20%. They have a volume of less than 0.25 cm3 and a mass of less than 1 gram.« less
NASA Astrophysics Data System (ADS)
Arestova, M. L.; Bykovskii, A. Yu
1995-10-01
An architecture is proposed for a specialised optoelectronic multivalued logic processor based on the Allen—Givone algebra. The processor is intended for multiparametric processing of data arriving from a large number of sensors or for tackling spectral analysis tasks. The processor architecture makes it possible to obtain an approximate general estimate of the state of an object being diagnosed on a p-level scale. Optoelectronic systems are proposed for MAXIMUM, MINIMUM, and LITERAL logic gates, based on optical-frequency encoding of logic levels. Corresponding logic gates form a complete set of logic functions in the Allen—Givone algebra.
The SPAR thermal analyzer: Present and future
NASA Astrophysics Data System (ADS)
Marlowe, M. B.; Whetstone, W. D.; Robinson, J. C.
The SPAR thermal analyzer, a system of finite-element processors for performing steady-state and transient thermal analyses, is described. The processors communicate with each other through the SPAR random access data base. As each processor is executed, all pertinent source data is extracted from the data base and results are stored in the data base. Steady state temperature distributions are determined by a direct solution method for linear problems and a modified Newton-Raphson method for nonlinear problems. An explicit and several implicit methods are available for the solution of transient heat transfer problems. Finite element plotting capability is available for model checkout and verification.
The SPAR thermal analyzer: Present and future
NASA Technical Reports Server (NTRS)
Marlowe, M. B.; Whetstone, W. D.; Robinson, J. C.
1982-01-01
The SPAR thermal analyzer, a system of finite-element processors for performing steady-state and transient thermal analyses, is described. The processors communicate with each other through the SPAR random access data base. As each processor is executed, all pertinent source data is extracted from the data base and results are stored in the data base. Steady state temperature distributions are determined by a direct solution method for linear problems and a modified Newton-Raphson method for nonlinear problems. An explicit and several implicit methods are available for the solution of transient heat transfer problems. Finite element plotting capability is available for model checkout and verification.
Parallel processor for real-time structural control
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tise, B.L.
1992-01-01
A parallel processor that is optimized for real-time linear control has been developed. This modular system consists of A/D modules, D/A modules, and floating-point processor modules. The scalable processor uses up to 1,000 Motorola DSP96002 floating-point processors for a peak computational rate of 60 GFLOPS. Sampling rates up to 625 kHz are supported by this analog-in to analog-out controller. The high processing rate and parallel architecture make this processor suitable for computing state-space equations and other multiply/accumulate-intensive digital filters. Processor features include 14-bit conversion devices, low input-output latency, 240 Mbyte/s synchronous backplane bus, low-skew clock distribution circuit, VME connection tomore » host computer, parallelizing code generator, and look-up-tables for actuator linearization. This processor was designed primarily for experiments in structural control. The A/D modules sample sensors mounted on the structure and the floating-point processor modules compute the outputs using the programmed control equations. The outputs are sent through the D/A module to the power amps used to drive the structure's actuators. The host computer is a Sun workstation. An Open Windows-based control panel is provided to facilitate data transfer to and from the processor, as well as to control the operating mode of the processor. A diagnostic mode is provided to allow stimulation of the structure and acquisition of the structural response via sensor inputs.« less
A Linked List-Based Algorithm for Blob Detection on Embedded Vision-Based Sensors
Acevedo-Avila, Ricardo; Gonzalez-Mendoza, Miguel; Garcia-Garcia, Andres
2016-01-01
Blob detection is a common task in vision-based applications. Most existing algorithms are aimed at execution on general purpose computers; while very few can be adapted to the computing restrictions present in embedded platforms. This paper focuses on the design of an algorithm capable of real-time blob detection that minimizes system memory consumption. The proposed algorithm detects objects in one image scan; it is based on a linked-list data structure tree used to label blobs depending on their shape and node information. An example application showing the results of a blob detection co-processor has been built on a low-powered field programmable gate array hardware as a step towards developing a smart video surveillance system. The detection method is intended for general purpose application. As such, several test cases focused on character recognition are also examined. The results obtained present a fair trade-off between accuracy and memory requirements; and prove the validity of the proposed approach for real-time implementation on resource-constrained computing platforms. PMID:27240382
Parallel processing approach to transform-based image coding
NASA Astrophysics Data System (ADS)
Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.
1991-06-01
This paper describes a flexible parallel processing architecture designed for use in real time video processing. The system consists of floating point DSP processors connected to each other via fast serial links, each processor has access to a globally shared memory. A multiple bus architecture in combination with a dual ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed, results ar presented with image statistics and data rates. Finally techniques for accelerating the system throughput are analyzed and results from the application of one such modification described.
General optical discrete z transform: design and application.
Ngo, Nam Quoc
2016-12-20
This paper presents a generalization of the discrete z transform algorithm. It is shown that the GOD-ZT algorithm is a generalization of several important conventional discrete transforms. Based on the GOD-ZT algorithm, a tunable general optical discrete z transform (GOD-ZT) processor is synthesized using the silica-based finite impulse response transversal filter. To demonstrate the effectiveness of the method, the design and simulation of a tunable optical discrete Fourier transform (ODFT) processor as a special case of the synthesized GOD-ZT processor is presented. It is also shown that the ODFT processor can function as a real-time optical spectrum analyzer. The tunable ODFT has an important potential application as a tunable optical demultiplexer at the receiver end of an optical orthogonal frequency-division multiplexing transmission system.
System and method for controlling power consumption in a computer system based on user satisfaction
Yang, Lei; Dick, Robert P; Chen, Xi; Memik, Gokhan; Dinda, Peter A; Shy, Alex; Ozisikyilmaz, Berkin; Mallik, Arindam; Choudhary, Alok
2014-04-22
Systems and methods for controlling power consumption in a computer system. For each of a plurality of interactive applications, the method changes a frequency at which a processor of the computer system runs, receives an indication of user satisfaction, determines a relationship between the changed frequency and the user satisfaction of the interactive application, and stores the determined relationship information. The determined relationship can distinguish between different users and different interactive applications. A frequency may be selected from the discrete frequencies at which the processor of the computer system runs based on the determined relationship information for a particular user and a particular interactive application running on the processor of the computer system. The processor may be adapted to run at the selected frequency.
Teymouri, Jessica; Hullar, Timothy E; Holden, Timothy A; Chole, Richard A
2011-08-01
To determine the efficacy of clinical computed tomographic (CT) imaging to verify postoperative electrode array placement in cochlear implant (CI) patients. Nine fresh cadaver heads underwent clinical CT scanning, followed by bilateral CI insertion and postoperative clinical CT scanning. Temporal bones were removed, trimmed, and scanned using micro-CT. Specimens were then dehydrated, embedded in either methyl methacrylate or LR White resin, and sectioned with a diamond wafering saw. Histology sections were examined by 3 blinded observers to determine the position of individual electrodes relative to soft tissue structures within the cochlea. Electrodes were judged to be within the scala tympani, scala vestibuli, or in an intermediate position between scalae. The position of the array could be estimated accurately from clinical CT scans in all specimens using micro-CT and histology as a criterion standard. Verification using micro-CT yielded 97% agreement, and histologic analysis revealed 95% agreement with clinical CT results. A composite, 3-dimensional image derived from a patient's preoperative and postoperative CT images using a clinical scanner accurately estimates the position of the electrode array as determined by micro-CT imaging and histologic analyses. Information obtained using the CT method provides valuable insight into numerous variables of interest to patient performance such as surgical technique, array design, and processor programming and troubleshooting.
Multisensory architectures for action-oriented perception
NASA Astrophysics Data System (ADS)
Alba, L.; Arena, P.; De Fiore, S.; Listán, J.; Patané, L.; Salem, A.; Scordino, G.; Webb, B.
2007-05-01
In order to solve the navigation problem of a mobile robot in an unstructured environment a versatile sensory system and efficient locomotion control algorithms are necessary. In this paper an innovative sensory system for action-oriented perception applied to a legged robot is presented. An important problem we address is how to utilize a large variety and number of sensors, while having systems that can operate in real time. Our solution is to use sensory systems that incorporate analog and parallel processing, inspired by biological systems, to reduce the required data exchange with the motor control layer. In particular, as concerns the visual system, we use the Eye-RIS v1.1 board made by Anafocus, which is based on a fully parallel mixed-signal array sensor-processor chip. The hearing sensor is inspired by the cricket hearing system and allows efficient localization of a specific sound source with a very simple analog circuit. Our robot utilizes additional sensors for touch, posture, load, distance, and heading, and thus requires customized and parallel processing for concurrent acquisition. Therefore a Field Programmable Gate Array (FPGA) based hardware was used to manage the multi-sensory acquisition and processing. This choice was made because FPGAs permit the implementation of customized digital logic blocks that can operate in parallel allowing the sensors to be driven simultaneously. With this approach the multi-sensory architecture proposed can achieve real time capabilities.
Design and Implementation of Sound Searching Robots in Wireless Sensor Networks
Han, Lianfu; Shen, Zhengguang; Fu, Changfeng; Liu, Chao
2016-01-01
A sound target-searching robot system which includes a 4-channel microphone array for sound collection, magneto-resistive sensor for declination measurement, and a wireless sensor networks (WSN) for exchanging information is described. It has an embedded sound signal enhancement, recognition and location method, and a sound searching strategy based on a digital signal processor (DSP). As the wireless network nodes, three robots comprise the WSN a personal computer (PC) in order to search the three different sound targets in task-oriented collaboration. The improved spectral subtraction method is used for noise reduction. As the feature of audio signal, Mel-frequency cepstral coefficient (MFCC) is extracted. Based on the K-nearest neighbor classification method, we match the trained feature template to recognize sound signal type. This paper utilizes the improved generalized cross correlation method to estimate time delay of arrival (TDOA), and then employs spherical-interpolation for sound location according to the TDOA and the geometrical position of the microphone array. A new mapping has been proposed to direct the motor to search sound targets flexibly. As the sink node, the PC receives and displays the result processed in the WSN, and it also has the ultimate power to make decision on the received results in order to improve their accuracy. The experiment results show that the designed three-robot system implements sound target searching function without collisions and performs well. PMID:27657088
Design and Implementation of Sound Searching Robots in Wireless Sensor Networks.
Han, Lianfu; Shen, Zhengguang; Fu, Changfeng; Liu, Chao
2016-09-21
A sound target-searching robot system which includes a 4-channel microphone array for sound collection, magneto-resistive sensor for declination measurement, and a wireless sensor networks (WSN) for exchanging information is described. It has an embedded sound signal enhancement, recognition and location method, and a sound searching strategy based on a digital signal processor (DSP). As the wireless network nodes, three robots comprise the WSN a personal computer (PC) in order to search the three different sound targets in task-oriented collaboration. The improved spectral subtraction method is used for noise reduction. As the feature of audio signal, Mel-frequency cepstral coefficient (MFCC) is extracted. Based on the K-nearest neighbor classification method, we match the trained feature template to recognize sound signal type. This paper utilizes the improved generalized cross correlation method to estimate time delay of arrival (TDOA), and then employs spherical-interpolation for sound location according to the TDOA and the geometrical position of the microphone array. A new mapping has been proposed to direct the motor to search sound targets flexibly. As the sink node, the PC receives and displays the result processed in the WSN, and it also has the ultimate power to make decision on the received results in order to improve their accuracy. The experiment results show that the designed three-robot system implements sound target searching function without collisions and performs well.
A word processor optimized for preparing journal articles and student papers.
Wolach, A H; McHale, M A
2001-11-01
A new Windows-based word processor for preparing journal articles and student papers is described. In addition to standard features found in word processors, the present word processor provides specific help in preparing manuscripts. Clicking on "Reference Help (APA Form)" in the "File" menu provides a detailed help system for entering the references in a journal article. Clicking on "Examples and Explanations of APA Form" provides a help system with examples of the various sections of a review article, journal article that has one experiment, or journal article that has two or more experiments. The word processor can automatically place the manuscript page header and page number at the top of each page using the form required by APA and Psychonomic Society journals. The "APA Form" submenu of the "Help" menu provides detailed information about how the word processor is optimized for preparing articles and papers.
Parallel processing architecture for H.264 deblocking filter on multi-core platforms
NASA Astrophysics Data System (ADS)
Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao
2012-03-01
Massively parallel computing (multi-core) chips offer outstanding new solutions that satisfy the increasing demand for high resolution and high quality video compression technologies such as H.264. Such solutions not only provide exceptional quality but also efficiency, low power, and low latency, previously unattainable in software based designs. While custom hardware and Application Specific Integrated Circuit (ASIC) technologies may achieve lowlatency, low power, and real-time performance in some consumer devices, many applications require a flexible and scalable software-defined solution. The deblocking filter in H.264 encoder/decoder poses difficult implementation challenges because of heavy data dependencies and the conditional nature of the computations. Deblocking filter implementations tend to be fixed and difficult to reconfigure for different needs. The ability to scale up for higher quality requirements such as 10-bit pixel depth or a 4:2:2 chroma format often reduces the throughput of a parallel architecture designed for lower feature set. A scalable architecture for deblocking filtering, created with a massively parallel processor based solution, means that the same encoder or decoder will be deployed in a variety of applications, at different video resolutions, for different power requirements, and at higher bit-depths and better color sub sampling patterns like YUV, 4:2:2, or 4:4:4 formats. Low power, software-defined encoders/decoders may be implemented using a massively parallel processor array, like that found in HyperX technology, with 100 or more cores and distributed memory. The large number of processor elements allows the silicon device to operate more efficiently than conventional DSP or CPU technology. This software programing model for massively parallel processors offers a flexible implementation and a power efficiency close to that of ASIC solutions. This work describes a scalable parallel architecture for an H.264 compliant deblocking filter for multi core platforms such as HyperX technology. Parallel techniques such as parallel processing of independent macroblocks, sub blocks, and pixel row level are examined in this work. The deblocking architecture consists of a basic cell called deblocking filter unit (DFU) and dependent data buffer manager (DFM). The DFU can be used in several instances, catering to different performance needs the DFM serves the data required for the different number of DFUs, and also manages all the neighboring data required for future data processing of DFUs. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.
System support software for the Space Ultrareliable Modular Computer (SUMC)
NASA Technical Reports Server (NTRS)
Hill, T. E.; Hintze, G. C.; Hodges, B. C.; Austin, F. A.; Buckles, B. P.; Curran, R. T.; Lackey, J. D.; Payne, R. E.
1974-01-01
The highly transportable programming system designed and implemented to support the development of software for the Space Ultrareliable Modular Computer (SUMC) is described. The SUMC system support software consists of program modules called processors. The initial set of processors consists of the supervisor, the general purpose assembler for SUMC instruction and microcode input, linkage editors, an instruction level simulator, a microcode grid print processor, and user oriented utility programs. A FORTRAN 4 compiler is undergoing development. The design facilitates the addition of new processors with a minimum effort and provides the user quasi host independence on the ground based operational software development computer. Additional capability is provided to accommodate variations in the SUMC architecture without consequent major modifications in the initial processors.
Electrical Prototype Power Processor for the 30-cm Mercury electric propulsion engine
NASA Technical Reports Server (NTRS)
Biess, J. J.; Frye, R. J.
1978-01-01
An Electrical Prototpye Power Processor has been designed to the latest electrical and performance requirements for a flight-type 30-cm ion engine and includes all the necessary power, command, telemetry and control interfaces for a typical electric propulsion subsystem. The power processor was configured into seven separate mechanical modules that would allow subassembly fabrication, test and integration into a complete power processor unit assembly. The conceptual mechanical packaging of the electrical prototype power processor unit demonstrated the relative location of power, high voltage and control electronic components to minimize electrical interactions and to provide adequate thermal control in a vacuum environment. Thermal control was accomplished with a heat pipe simulator attached to the base of the modules.
49 CFR Appendix B to Part 236 - Risk Assessment Criteria
Code of Federal Regulations, 2011 CFR
2011-10-01
..., exposure scenarios, and consequences that are related as described in this part. For the full risk... subsystem or component in the risk assessment. (f) How are processor-based subsystems/components assessed? (1) An MTTHE value must be calculated for each processor-based subsystem or component, or both...
SSC 254 Screen-Based Word Processors: Production Tests. The Lanier Word Processor.
ERIC Educational Resources Information Center
Moyer, Ruth A.
Designed for use in Trident Technical College's Secretarial Lab, this series of 12 production tests focuses on the use of the Lanier Word Processor for a variety of tasks. In tests 1 and 2, students are required to type and print out letters. Tests 3 through 8 require students to reformat a text; make corrections on a letter; divide and combine…
An Architecture for Measuring Joint Angles Using a Long Period Fiber Grating-Based Sensor
Perez-Ramirez, Carlos A.; Almanza-Ojeda, Dora L.; Guerrero-Tavares, Jesus N.; Mendoza-Galindo, Francisco J.; Estudillo-Ayala, Julian M.; Ibarra-Manzano, Mario A.
2014-01-01
The implementation of signal filters in a real-time form requires a tradeoff between computation resources and the system performance. Therefore, taking advantage of low lag response and the reduced consumption of resources, in this article, the Recursive Least Square (RLS) algorithm is used to filter a signal acquired from a fiber-optics-based sensor. In particular, a Long-Period Fiber Grating (LPFG) sensor is used to measure the bending movement of a finger. After that, the Gaussian Mixture Model (GMM) technique allows us to classify the corresponding finger position along the motion range. For these measures to help in the development of an autonomous robotic hand, the proposed technique can be straightforwardly implemented on real time platforms such as Field Programmable Gate Array (FPGA) or Digital Signal Processors (DSP). Different angle measurements of the finger's motion are carried out by the prototype and a detailed analysis of the system performance is presented. PMID:25536002
Multipurpose silicon photonics signal processor core.
Pérez, Daniel; Gasulla, Ivana; Crudgington, Lee; Thomson, David J; Khokhar, Ali Z; Li, Ke; Cao, Wei; Mashanovich, Goran Z; Capmany, José
2017-09-21
Integrated photonics changes the scaling laws of information and communication systems offering architectural choices that combine photonics with electronics to optimize performance, power, footprint, and cost. Application-specific photonic integrated circuits, where particular circuits/chips are designed to optimally perform particular functionalities, require a considerable number of design and fabrication iterations leading to long development times. A different approach inspired by electronic Field Programmable Gate Arrays is the programmable photonic processor, where a common hardware implemented by a two-dimensional photonic waveguide mesh realizes different functionalities through programming. Here, we report the demonstration of such reconfigurable waveguide mesh in silicon. We demonstrate over 20 different functionalities with a simple seven hexagonal cell structure, which can be applied to different fields including communications, chemical and biomedical sensing, signal processing, multiprocessor networks, and quantum information systems. Our work is an important step toward this paradigm.Integrated optical circuits today are typically designed for a few special functionalities and require complex design and development procedures. Here, the authors demonstrate a reconfigurable but simple silicon waveguide mesh with different functionalities.