Jiang, Chao; Zhang, Hongyan; Wang, Jia; Wang, Yaru; He, Heng; Liu, Rui; Zhou, Fangyuan; Deng, Jialiang; Li, Pengcheng; Luo, Qingming
2011-11-01
Laser speckle imaging (LSI) is a noninvasive, full-field optical imaging technique that produces two-dimensional blood flow maps of tissues from raw laser speckle images captured by a CCD camera, without scanning. We present a hardware-friendly algorithm for real-time LSI processing, developed and optimized specifically for implementation in a field programmable gate array (FPGA). Based on this algorithm, we designed a dedicated hardware processor for real-time LSI in an FPGA, introducing a pipeline processing scheme and a parallel computing architecture. When the LSI hardware processor is implemented in the FPGA running at its maximum frequency of 130 MHz, up to 85 raw images with a resolution of 640×480 pixels can be processed per second. We also present a system on chip (SOC) solution for LSI processing that integrates the CCD controller, memory controller, LSI hardware processor, and LCD display controller into a single FPGA chip. This SOC solution can also be used to produce an application specific integrated circuit for LSI processing.
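The per-pixel computation at the heart of such LSI processors is the spatial speckle contrast K = σ/⟨I⟩ over a small sliding window, from which relative flow maps are derived. A minimal software sketch of that computation (assuming a NumPy/SciPy environment; this is not the paper's FPGA pipeline, just the underlying arithmetic):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def speckle_contrast(raw, window=7):
    """Spatial speckle contrast K = sigma / mean over a sliding window.

    raw    : 2-D array of raw speckle intensities (e.g. a 640x480 frame)
    window : side length of the square sliding window
    """
    img = raw.astype(np.float64)
    mean = uniform_filter(img, size=window)           # local mean <I>
    mean_sq = uniform_filter(img * img, size=window)  # local mean <I^2>
    var = np.maximum(mean_sq - mean * mean, 0.0)      # local variance
    return np.sqrt(var) / (mean + 1e-12)              # K; guard against /0

# Relative flow maps are commonly derived from 1/K^2, which rises
# with flow speed under the usual LSI assumptions.
```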
Digital ultrasonics signal processing: Flaw data post processing use and description
NASA Technical Reports Server (NTRS)
Buel, V. E.
1981-01-01
A modular system composed of two sets of tasks which interprets the flaw data and allows compensation of the data due to transducer characteristics is described. The hardware configuration consists of two main units. A DEC LSI-11 processor, running under the RT-11 single-job, version 2C-02 operating system, controls the scanner hardware and the ultrasonic unit. A DEC PDP-11/45 processor, also running under the RT-11, version 2C-02, operating system, stores, processes, and displays the flaw data. The software developed, the Ultrasonics Evaluation System, is divided into two categories: transducer characterization and flaw classification. Each category is divided further into two functional tasks: a data acquisition task and a postprocessor task. The flaw characterization task collects data, compresses it, and writes it to a disk file. The data is then processed by the flaw classification postprocessing task. The use and operation of the flaw data postprocessor are described.
Custom large scale integrated circuits for spaceborne SAR processors
NASA Technical Reports Server (NTRS)
Tyree, V. C.
1978-01-01
The application of modern LSI technology to the development of a time-domain azimuth correlator for SAR processing is discussed. General design requirements for azimuth correlators for missions such as SEASAT-A, Venus orbital imaging radar (VOIR), and shuttle imaging radar (SIR) are summarized. Several azimuth correlator architectures that are suitable for implementation using custom LSI devices are described. Technical factors pertaining to selection of appropriate LSI technologies are discussed, and the maturity of alternative technologies for spacecraft applications is reported in the context of expected space mission launch dates. The preliminary design of a custom LSI time-domain azimuth correlator device (ACD) being developed for use in future SAR processors is detailed.
Next-generation digital camera integration and software development issues
NASA Astrophysics Data System (ADS)
Venkataraman, Shyam; Peters, Ken; Hecht, Richard
1998-04-01
This paper investigates the complexities associated with the development of next-generation digital cameras due to requirements in connectivity and interoperability. Each successive generation of digital camera improves drastically in cost, performance, resolution, image quality, and interoperability features. This is being accomplished by advancements in a number of areas: research, silicon, standards, etc. As the capabilities of these cameras increase, so do the requirements for both hardware and software. Today, there are two single-chip camera solutions on the market: the Motorola MPC 823 and the LSI DCAM-101. Real-time constraints for a digital camera may be defined by the maximum time allowable between capture of images. Constraints in the design of an embedded digital camera include processor architecture, memory, processing speed, and the real-time operating system. This paper will present the LSI DCAM-101, a single-chip digital camera solution. It will present an overview of the architecture and the challenges in hardware and software for supporting streaming video in such a complex device. Issues presented include the development of the data-flow software architecture, and testing and integration on this complex silicon device. The strategy for optimizing performance on the architecture will also be presented.
C-MOS array design techniques: SUMC multiprocessor system study
NASA Technical Reports Server (NTRS)
Clapp, W. A.; Helbig, W. A.; Merriam, A. S.
1972-01-01
The current capabilities of LSI techniques for speed and reliability, plus the possibilities of assembling large configurations of LSI logic and storage elements, have demanded the study of multiprocessors and multiprocessing techniques, problems, and potentialities. Evaluated are three previous systems studies for a space ultrareliable modular computer multiprocessing system, and a new multiprocessing system is proposed that is flexibly configured with up to four central processors, four I/O processors, and 16 main memory units, plus auxiliary memory and peripheral devices. This multiprocessor system features a multilevel interrupt, qualified S/360 compatibility for ground-based generation of programs, virtual memory management of a storage hierarchy through I/O processors, and multiport access to multiple and shared memory units.
MIL-STD-1553B Marconi LSI chip set in a remote terminal application
NASA Astrophysics Data System (ADS)
Dimarino, A.
1982-11-01
Marconi Avionics is utilizing the MIL-STD-1553B LSI Chip Set in the SCADC Air Data Computer application to perform all of the required remote terminal MIL-STD-1553B protocol functions. Basic components of the RTU are the dual redundant chip set, CT3231 transceivers, a 256 x 16 RAM, and a Z8002 microprocessor. Basic transfers are to/from the RAM on command of the bus controller or the Z8002 processor. During transfers from the processor to the RAM, the chip set busy bit is set for a period not exceeding 250 microseconds. When the transfer is complete, the busy bit is released and transfers to the data bus occur on command. The LSI Chip Set word count lines are used to locate each data word in the local memory, and four mode codes are used in the application: reset remote terminal, transmit status word, transmitter shutdown, and override transmitter shutdown.
Hardware synthesis from DDL. [Digital Design Language for computer aided design and test of LSI]
NASA Technical Reports Server (NTRS)
Shah, A. M.; Shiva, S. G.
1981-01-01
The details of the digital systems can be conveniently input into the design automation system by means of Hardware Description Languages (HDL). The Computer Aided Design and Test (CADAT) system at NASA MSFC is used for the LSI design. The Digital Design Language (DDL) has been selected as HDL for the CADAT System. DDL translator output can be used for the hardware implementation of the digital design. This paper addresses problems of selecting the standard cells from the CADAT standard cell library to realize the logic implied by the DDL description of the system.
Rapid Damage Assessment. Volume II. Development and Testing of Rapid Damage Assessment System.
1981-02-01
Excerpts: pixel rate … pixels/s; camera line rate 732.4 lines/s; pixels per line: 1728 video, 314 blank, 4 line number (binary), 2 run number (BCD), 2048 total; pixel resolution 8 bits. The image processor system consists of an LSI-11 microprocessor, a VDI-200 video display processor, an FD-2 dual floppy diskette subsystem, and an FT-1 function key-trackball module.
Laser Speckle Imaging of Cerebral Blood Flow
NASA Astrophysics Data System (ADS)
Luo, Qingming; Jiang, Chao; Li, Pengcheng; Cheng, Haiying; Wang, Zhen; Wang, Zheng; Tuchin, Valery V.
Monitoring the spatio-temporal characteristics of cerebral blood flow (CBF) is crucial for studying the normal and pathophysiologic conditions of brain metabolism. By illuminating the cortex with laser light and imaging the resulting speckle pattern, relative CBF images with tens of microns spatial and millisecond temporal resolution can be obtained. In this chapter, a laser speckle imaging (LSI) method for monitoring dynamic, high-resolution CBF is introduced. To improve the spatial resolution of current LSI, a modified LSI method is proposed. To accelerate the speed of data processing, three LSI data processing frameworks based on graphics processing unit (GPU), digital signal processor (DSP), and field-programmable gate array (FPGA) are also presented. Applications for detecting the changes in local CBF induced by sensory stimulation and thermal stimulation, the influence of a chemical agent on CBF, and the influence of acute hyperglycemia following cortical spreading depression on CBF are given.
NASA Technical Reports Server (NTRS)
1981-01-01
The set of computer programs described allows for data definition, data input, and data transfer between the LSI-11 microcomputers and the VAX-11/780 minicomputer. Program VAXCOM allows for a simple method of textual file transfer from the LSI to the VAX. Program LSICOM allows for easy file transfer from the VAX to the LSI. Program TTY changes the LSI-11 operator's console to the LSI's printing device. Program DICTIN provides a means for defining a data set for input to either computer. Program DATAIN is a simple-to-operate data entry program capable of building data files on either machine. Program LEDITV is an extremely powerful, easy-to-use, line-oriented text editor. Program COPYSBF is designed to print out textual files on the line printer without character loss from FORTRAN carriage control or wide record transfer.
NASA Technical Reports Server (NTRS)
Pitts, E. R.
1976-01-01
The DJANAL (DisJunct ANALyzer) Program provides a means for the LSI designer to format output from the Mask Analysis Program (MAP) for input to the FETLOG (FETSIM/LOGSIM) processor. This document presents a brief description of the operation of DJANAL and provides comprehensive instructions for its use.
Baseband processor development/test performance for 30/20 GHz SS-TDMA communication system
NASA Technical Reports Server (NTRS)
Brown, L.; Sabourin, D.; Attwood, S.
1984-01-01
The baseband processor (BBP) development for the 30/20 GHz Satellite Communication System is described. The SS-TDMA concept for future satellite communications is reviewed, describing the overall system, the satellite payload, and the frequency plan. A brief general description of the BBP is given, and the proof-of-concept model of the BBP is summarized. Key technologies and custom LSI developed for the BBP are listed. Finally, key technology developments and test data are reported for the BBP.
NASA Technical Reports Server (NTRS)
1980-01-01
Detailed software and hardware documentation for the Cardiopulmonary Data Acquisition System is presented. General wiring and timing diagrams are given including those for the LSI-11 computer control panel and interface cables. Flowcharts and complete listings of system programs are provided along with the format of the floppy disk file.
Mechanically verified hardware implementing an 8-bit parallel IO Byzantine agreement processor
NASA Technical Reports Server (NTRS)
Moore, J. Strother
1992-01-01
Consider a network of four processors that use the Oral Messages (Byzantine Generals) Algorithm of Pease, Shostak, and Lamport to achieve agreement in the presence of faults. Bevier and Young have published a functional description of a single processor that, when interconnected appropriately with three identical others, implements this network under the assumption that the four processors step in synchrony. By formalizing the original Pease et al. work, Bevier and Young mechanically proved that such a network achieves fault tolerance. We develop, formalize, and discuss a hardware design that has been mechanically proven to implement their processor. In particular, we formally define mapping functions from the abstract state space of the Bevier-Young processor to a concrete state space of a hardware module and state a theorem that expresses the claim that the hardware correctly implements the processor. We briefly discuss the Brock-Hunt Formal Hardware Description Language, which permits designs both to be proved correct with the Boyer-Moore theorem prover and to be expressed in a commercially supported hardware description language for additional electrical analysis and layout. We briefly describe our implementation.
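For concreteness, a toy sketch of the Oral Messages algorithm for this four-processor configuration, OM(1): the transmitter sends its value, each lieutenant relays what it received, and each lieutenant majority-votes. The processor names and fault-injection map are hypothetical, and no timing or authentication model is included:

```python
from collections import Counter

def om1(transmitter_value, lies):
    """OM(1) with a transmitter ('cmd') and three lieutenants.

    lies maps (sender, receiver) -> value actually delivered, letting a
    test inject a faulty processor; honest links deliver the true value.
    """
    lieutenants = ["p1", "p2", "p3"]
    # Round 1: the transmitter sends its value to every lieutenant.
    received = {p: lies.get(("cmd", p), transmitter_value) for p in lieutenants}
    # Round 2: every lieutenant relays what it received to the others,
    # and each decides by majority vote over all values it has seen.
    decisions = {}
    for p in lieutenants:
        votes = [received[p]]
        votes += [lies.get((q, p), received[q]) for q in lieutenants if q != p]
        decisions[p] = Counter(votes).most_common(1)[0][0]
    return decisions

# With one lying lieutenant (p3), the correct ones still agree on 1:
print(om1(1, {("p3", "p1"): 0, ("p3", "p2"): 0}))
```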
System on chip module configured for event-driven architecture
Robbins, Kevin; Brady, Charles E.; Ashlock, Tad A.
2017-10-17
A system on chip (SoC) module is described herein, wherein the SoC module comprises a processor subsystem and a hardware logic subsystem. The processor subsystem and hardware logic subsystem are in communication with one another and transmit event messages between one another. The processor subsystem executes software actors, while the hardware logic subsystem includes hardware actors; the software actors and hardware actors conform to an event-driven architecture, such that both the software actors and the hardware actors receive and generate event messages.
Support for Diagnosis of Custom Computer Hardware
NASA Technical Reports Server (NTRS)
Molock, Dwaine S.
2008-01-01
The Coldfire SDN Diagnostics software is a flexible means of exercising, testing, and debugging custom computer hardware. The software is a set of routines that, collectively, serve as a common software interface through which one can gain access to various parts of the hardware under test and/or cause the hardware to perform various functions. The routines can be used to construct tests to exercise, and verify the operation of, various processors and hardware interfaces. More specifically, the software can be used to gain access to memory, to execute timer delays, to configure interrupts, and to configure processor cache, floating-point, and direct-memory-access units. The software is designed to be used on diverse NASA projects and can be customized for use with different processors and interfaces; the routines are supported regardless of the architecture of the processor being diagnosed. The present version of the software is configured for Coldfire processors on the Subsystem Data Node processor boards of the Solar Dynamics Observatory. The software also supports Mongoose V, RAD750, and PPC405 processors or their equivalents.
ELIPS: Toward a Sensor Fusion Processor on a Chip
NASA Technical Reports Server (NTRS)
Daud, Taher; Stoica, Adrian; Tyson, Thomas; Li, Wei-te; Fabunmi, James
1998-01-01
The paper presents the concept and initial tests from the hardware implementation of a low-power, high-speed reconfigurable sensor fusion processor. The Extended Logic Intelligent Processing System (ELIPS) processor is developed to seamlessly combine rule-based systems, fuzzy logic, and neural networks to achieve parallel fusion of sensor data in compact, low-power VLSI. The first demonstration of the ELIPS concept targets interceptor functionality; other applications, mainly in robotics and autonomous systems, are considered for the future. The main assumption behind ELIPS is that fuzzy, rule-based, and neural forms of computation can serve as the main primitives of an "intelligent" processor. Thus, in the same way classic processors are designed to optimize the hardware implementation of a set of fundamental operations, ELIPS is developed as an efficient implementation of computational intelligence primitives, and relies on a set of fuzzy set, fuzzy inference, and neural modules built in programmable analog hardware. The hardware programmability allows the processor to reconfigure into different machines, taking the most efficient hardware implementation during each phase of information processing. Following software demonstrations on several interceptor data sets, three important ELIPS building blocks (a fuzzy set preprocessor, a rule-based fuzzy system, and a neural network) have been fabricated in analog VLSI hardware and have demonstrated microsecond processing times.
Extended Logic Intelligent Processing System for a Sensor Fusion Processor Hardware
NASA Technical Reports Server (NTRS)
Stoica, Adrian; Thomas, Tyson; Li, Wei-Te; Daud, Taher; Fabunmi, James
2000-01-01
The paper presents the hardware implementation and initial tests of a low-power, high-speed reconfigurable sensor fusion processor. The Extended Logic Intelligent Processing System (ELIPS) is described, which combines rule-based systems, fuzzy logic, and neural networks to achieve parallel fusion of sensor signals in compact, low-power VLSI. The ELIPS concept is being developed to demonstrate interceptor functionality, which particularly underlines the high-speed and low-power requirements. The hardware programmability allows the processor to reconfigure into different machines, taking the most efficient hardware implementation during each phase of information processing. Processing speeds of microseconds have been demonstrated using our test hardware.
APRON: A Cellular Processor Array Simulation and Hardware Design Tool
NASA Astrophysics Data System (ADS)
Barr, David R. W.; Dudek, Piotr
2009-12-01
We present a software environment for the efficient simulation of cellular processor arrays (CPAs). This software (APRON) is used to explore algorithms that are designed for massively parallel fine-grained processor arrays, topographic multilayer neural networks, vision chips with SIMD processor arrays, and related architectures. The software uses a highly optimised core combined with a flexible compiler to provide the user with tools for the design of new processor array hardware architectures and the emulation of existing devices. We present performance benchmarks for the software processor array implemented on standard commodity microprocessors. APRON can be configured to use additional processing hardware if necessary and can be used as a complete graphical user interface and development environment for new or existing CPA systems, allowing more users to develop algorithms for CPA systems.
Fault tolerance in a supercomputer through dynamic repartitioning
Chen, Dong; Coteus, Paul W.; Gara, Alan G.; Takken, Todd E.
2007-02-27
A multiprocessor, parallel computer is made tolerant to hardware failures by providing extra groups of redundant standby processors and by designing the system so that these extra groups of processors can be swapped with any group which experiences a hardware failure. This swapping can be under software control, thereby permitting the entire computer to sustain a hardware failure but, after swapping in the standby processors, to still appear to software as a pristine, fully functioning system.
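The swap itself reduces to remapping logical processor groups onto physical ones so that software above the mapping never sees the failure. A minimal bookkeeping sketch (structure and names hypothetical, not the patent's actual mechanism):

```python
def swap_in_standby(logical_to_physical, failed_group, standby_pool):
    """Remap every logical group bound to a failed physical group onto a
    standby group, leaving the software-visible topology unchanged."""
    if not standby_pool:
        raise RuntimeError("no standby processor groups remaining")
    spare = standby_pool.pop()
    for logical, physical in logical_to_physical.items():
        if physical == failed_group:
            logical_to_physical[logical] = spare
    return logical_to_physical

mapping = {0: "rackA", 1: "rackB", 2: "rackC"}
print(swap_in_standby(mapping, "rackB", ["spare0"]))  # rackB -> spare0
```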
NASA Technical Reports Server (NTRS)
Shiva, S. G.; Shah, A. M.
1980-01-01
The details of digital systems can be conveniently input into the design automation system by means of hardware description language (HDL). The computer aided design and test (CADAT) system at NASA MSFC is used for the LSI design. The digital design language (DDL) was selected as HDL for the CADAT System. DDL translator output can be used for the hardware implementation of the digital design. Problems of selecting the standard cells from the CADAT standard cell library to realize the logic implied by the DDL description of the system are addressed.
NASA Astrophysics Data System (ADS)
Zhang, Yuli; Han, Jun; Weng, Xinqian; He, Zhongzhu; Zeng, Xiaoyang
This paper presents an Application Specific Instruction-set Processor (ASIP) for the SHA-3 BLAKE algorithm family, built by instruction set extensions (ISE) of a RISC (reduced instruction set computer) processor. Through a design space exploration of this ASIP to increase performance and reduce area cost, we accomplish an efficient hardware and software implementation of the BLAKE algorithm. The special instructions and their well-matched hardware function unit speed up the calculation of the key section of the algorithm, namely the G-functions. Also, relaxing the time constraint on the special function unit decreases its hardware cost while keeping the high data throughput of the processor. Evaluation results reveal that the ASIP achieves 335 Mbps and 176 Mbps for BLAKE-256 and BLAKE-512, respectively. The extra area cost is only 8.06k equivalent gates. The proposed ASIP outperforms several software approaches on various platforms in cycles per byte. In fact, both the high throughput and the low hardware cost achieved by this programmable processor are comparable to those of ASIC implementations.
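The G-function that the special instructions target is the BLAKE core mixing step. A plain-software reference of the BLAKE-256 G-function (32-bit words; rotation constants 16, 12, 8, 7), the kind of baseline an ISE implementation would be measured against:

```python
MASK32 = 0xFFFFFFFF

def rotr32(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def g(a, b, c, d, m0, m1):
    """One BLAKE-256 G-function; m0 and m1 are the (message XOR constant)
    words already selected by the round's sigma permutation."""
    a = (a + b + m0) & MASK32
    d = rotr32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 12)
    a = (a + b + m1) & MASK32
    d = rotr32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 7)
    return a, b, c, d
```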
Development of a Communications Front End Processor (FEP) for the VAX-11/780 Using an LSI-11/23.
1983-12-01
Excerpts (contents and text fragments): Approach; Software Development Life Cycle; Requirements Analysis … proven to be useful during the Software Development Life Cycle of a project. Development tools and documentation aids used throughout this effort include "Structure Charts" (ref Appendix B), a "Data Dictionary" (ref Appendix C), and a Program Design Language (PDL). 1.5.1 Software Development Life …
Video image processor on the Spacelab 2 Solar Optical Universal Polarimeter /SL2 SOUP/
NASA Technical Reports Server (NTRS)
Lindgren, R. W.; Tarbell, T. D.
1981-01-01
The SOUP instrument is designed to obtain diffraction-limited digital images of the sun with high photometric accuracy. The Video Processor originated from the requirement to provide onboard real-time image processing, both to reduce the telemetry rate and to provide meaningful video displays of scientific data to the payload crew. This original concept has evolved into a versatile digital processing system with a multitude of other uses in the SOUP program. The central element in the Video Processor design is a 16-bit central processing unit based on 2900 family bipolar bit-slice devices. All arithmetic, logical and I/O operations are under control of microprograms, stored in programmable read-only memory and initiated by commands from the LSI-11. Several functions of the Video Processor are described, including interface to the High Rate Multiplexer downlink, cosmetic and scientific data processing, scan conversion for crew displays, focus and exposure testing, and use as ground support equipment.
Chip level modeling of LSI devices
NASA Technical Reports Server (NTRS)
Armstrong, J. R.
1984-01-01
The advent of Very Large Scale Integration (VLSI) technology has rendered the gate level model impractical for many simulation activities critical to the design automation process. As an alternative, an approach to the modeling of VLSI devices at the chip level is described, including the specification of modeling language constructs important to the modeling process. A model structure is presented in which models of the LSI devices are constructed as single entities. The modeling structure is two layered. The functional layer in this structure is used to model the input/output response of the LSI chip. A second layer, the fault mapping layer, is added, if fault simulations are required, in order to map the effects of hardware faults onto the functional layer. Modeling examples for each layer are presented. Fault modeling at the chip level is described. Approaches to realistic functional fault selection and defining fault coverage for functional faults are given. Application of the modeling techniques to single chip and bit slice microprocessors is discussed.
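A minimal sketch of the two-layer structure described: a functional layer models only the chip's input/output response, and a fault mapping layer, enabled only for fault simulation, projects hardware faults onto that response. The toy ALU behaviour and fault names are illustrative, not from the paper:

```python
class ChipLevelModel:
    """Two-layer chip model: functional layer plus fault mapping layer."""

    def __init__(self):
        self.faults = set()  # active functional faults, e.g. "bit3_stuck_1"

    def functional(self, inputs):
        # Functional layer: input/output behaviour only; an 8-bit add
        # stands in for the chip's real function.
        return (inputs["a"] + inputs["b"]) & 0xFF

    def fault_map(self, value):
        # Fault mapping layer: distort the functional response according
        # to the injected hardware faults.
        if "bit3_stuck_1" in self.faults:
            value |= 0x08
        return value

    def simulate(self, inputs):
        return self.fault_map(self.functional(inputs))

m = ChipLevelModel()
m.faults.add("bit3_stuck_1")
print(m.simulate({"a": 1, "b": 2}))  # 3 -> 11 with bit 3 stuck at 1
```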
Feasibility of a special-purpose computer to solve the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Gritton, E. C.; King, W. S.; Sutherland, I.; Gaines, R. S.; Gazley, C., Jr.; Grosch, C.; Juncosa, M.; Petersen, H.
1978-01-01
Orders-of-magnitude improvements in computer performance can be realized with a parallel array of thousands of fast microprocessors. In this architecture, wiring congestion is minimized by limiting processor communication to nearest neighbors. When certain standard algorithms are applied to a viscous flow problem and existing LSI technology is used, performance estimates of this conceptual design show a dramatic decrease in computational time when compared to the CDC 7600.
Green Secure Processors: Towards Power-Efficient Secure Processor Design
NASA Astrophysics Data System (ADS)
Chhabra, Siddhartha; Solihin, Yan
With the increasing wealth of digital information stored on computer systems today, security issues have become increasingly important. In addition to attacks targeting the software stack of a system, hardware attacks have become equally likely. Researchers have proposed Secure Processor Architectures which utilize hardware mechanisms for memory encryption and integrity verification to protect the confidentiality and integrity of data and computation, even from sophisticated hardware attacks. While there have been many works addressing performance and other system level issues in secure processor design, power issues have largely been ignored. In this paper, we first analyze the sources of power (energy) increase in different secure processor architectures. We then present a power analysis of various secure processor architectures in terms of their increase in power consumption over a base system with no protection and then provide recommendations for designs that offer the best balance between performance and power without compromising security. We extend our study to the embedded domain as well. We also outline the design of a novel hybrid cryptographic engine that can be used to minimize the power consumption for a secure processor. We believe that if secure processors are to be adopted in future systems (general purpose or embedded), it is critically important that power issues are considered in addition to performance and other system level issues. To the best of our knowledge, this is the first work to examine the power implications of providing hardware mechanisms for security.
Hardware based redundant multi-threading inside a GPU for improved reliability
Sridharan, Vilas; Gurumurthi, Sudhanva
2015-05-05
A system and method for verifying computation output using computer hardware are provided. Instances of computation are generated and processed on hardware-based processors. As instances of computation are processed, each instance of computation receives a load accessible to other instances of computation. Instances of output are generated by processing the instances of computation. The instances of output are verified against each other in a hardware based processor to ensure accuracy of the output.
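The verification flow reduces to duplicate-and-compare: run redundant instances over the same load and cross-check the outputs. A software stand-in for that flow (the patent's mechanism is hardware inside a GPU; this sketch only illustrates the logic):

```python
def verified_compute(kernel, load, copies=2):
    """Run redundant instances of a computation on the same load and
    verify the outputs against each other; a mismatch indicates a
    hardware (or other transient) error."""
    outputs = [kernel(load) for _ in range(copies)]
    reference = outputs[0]
    for i, out in enumerate(outputs[1:], start=1):
        if out != reference:
            raise RuntimeError(f"instance {i} disagrees: {out!r} != {reference!r}")
    return reference

print(verified_compute(lambda xs: sum(x * x for x in xs), [1, 2, 3]))  # 14
```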
Mongoose: Creation of a Rad-Hard MIPS R3000
NASA Technical Reports Server (NTRS)
Lincoln, Dan; Smith, Brian
1993-01-01
This paper describes the development of a 32-bit, full MIPS R3000 code-compatible Rad-Hard CPU, code named Mongoose. Mongoose progressed from contract award, through the design cycle, to operational silicon in 12 months to meet a space mission for NASA. The goal was the creation of a fully static device capable of operation to the maximum Mil-883 derated speed, worst-case post-rad exposure, with full operational integrity. This included consideration of features for functional enhancements relating to mission compatibility and removal of commercial practices not supported by Rad-Hard technology. Mongoose developed from an evolution of LSI Logic's MIPS-I embedded processor, the LR33000, code named Cobra, to its Rad-Hard 'equivalent', Mongoose. The term 'equivalent' is used to indicate that the core of the processor is functionally identical, allowing the same use and optimizations of the MIPS-I instruction set software tool suite for compilation, software program trace, etc. This activity was started in September of 1991 under a contract from NASA-Goddard Space Flight Center (GSFC)-Flight Data Systems. The approach effected a teaming of NASA-GSFC for program development, LSI Logic for system and ASIC design coupled with the Rad-Hard process technology, and Harris (GASD) for Rad-Hard microprocessor design expertise. The program culminated with the generation of Rad-Hard Mongoose prototypes one year later.
Concept of a programmable maintenance processor applicable to multiprocessing systems
NASA Technical Reports Server (NTRS)
Glover, Richard D.
1988-01-01
A programmable maintenance processor concept applicable to multiprocessing systems has been developed at the NASA Ames Research Center's Dryden Flight Research Facility. This stand-alone processor is intended to provide support for system and application software testing as well as hardware diagnostics. An initial mechanization has been incorporated into the extended aircraft interrogation and display system (XAIDS), which is multiprocessing general-purpose ground support equipment. The XAIDS maintenance processor has independent terminal and printer interfaces and a dedicated magnetic bubble memory that stores system test sequences entered from the terminal. This report describes the hardware and software embodied in this processor and shows a typical application in the checkout of a new XAIDS.
NASA Technical Reports Server (NTRS)
Srivas, Mandayam; Bickford, Mark
1991-01-01
The design and formal verification of a hardware system for a task that is an important component of a fault tolerant computer architecture for flight control systems is presented. The hardware system implements an algorithm for obtaining interactive consistency (Byzantine agreement) among four microprocessors as a special instruction on the processors. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold. An assumption is made that the processors execute synchronously. For verification, the authors used a computer-aided hardware design verification tool, Spectool, and the theorem prover, Clio. A major contribution of the work is the demonstration of a significant fault tolerant hardware design that is mechanically verified by a theorem prover.
An optical/digital processor - Hardware and applications
NASA Technical Reports Server (NTRS)
Casasent, D.; Sterling, W. M.
1975-01-01
A real-time two-dimensional hybrid processor consisting of a coherent optical system, an optical/digital interface, and a PDP-11/15 control minicomputer is described. The input electrical-to-optical transducer is an electron-beam addressed potassium dideuterium phosphate (KD2PO4) light valve. The requirements and hardware for the output optical-to-digital interface, which is constructed from modular computer building blocks, are presented. Initial experimental results demonstrating the operation of this hybrid processor in phased-array radar data processing, synthetic-aperture image correlation, and text correlation are included. The applications chosen emphasize the role of the interface in the analysis of data from an optical processor and possible extensions to the digital feedback control of an optical processor.
Managing coherence via put/get windows
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2011-01-11
A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
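A sketch of how window open/close can bracket the coherence work, deferring invalidation to the window boundaries instead of per-write snooping. The API is hypothetical; in the patent, the instantly responding address-decode extension assists what is shown here as explicit software steps:

```python
class PutGetWindow:
    """Coherence coordinated by the opening and closing of a put/get window."""

    def __init__(self, memory):
        self.memory = memory  # node's main memory (dict as a stand-in)
        self.cache = {}       # local processor's cache

    def open(self):
        # Window opens: remote puts may now target our memory, so drop
        # cached lines once rather than tracking every remote write.
        self.cache.clear()

    def put(self, addr, value):
        # Remote write lands directly in main memory during the window.
        self.memory[addr] = value

    def close(self):
        # All puts complete; loads after this point may be cached freely.
        pass

    def load(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]
```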
Managing coherence via put/get windows
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2012-02-21
A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
Hardware accuracy counters for application precision and quality feedback
DOE Office of Scientific and Technical Information (OSTI.GOV)
de Paula Rosa Piga, Leonardo; Majumdar, Abhinandan; Paul, Indrani
Methods, devices, and systems for capturing an accuracy of an instruction executing on a processor. An instruction may be executed on the processor, and the accuracy of the instruction may be captured using a hardware counter circuit. The accuracy of the instruction may be captured by analyzing bits of at least one value of the instruction to determine a minimum or maximum precision datatype for representing the field, and determining whether to adjust a value of the hardware counter circuit accordingly. The representation may be output to a debugger or logfile for use by a developer, or may be output to a runtime or virtual machine to automatically adjust instruction precision or gating of portions of the processor datapath.
Olson, Eric J.
2013-06-11
An apparatus, program product, and method that run an algorithm on a hardware based processor, generate a hardware error as a result of running the algorithm, generate an algorithm output for the algorithm, compare the algorithm output to another output for the algorithm, and detect the hardware error from the comparison. The algorithm is designed to cause the hardware based processor to heat to a degree that increases the likelihood of hardware errors to manifest, and the hardware error is observable in the algorithm output. As such, electronic components may be sufficiently heated and/or sufficiently stressed to create better conditions for generating hardware errors, and the output of the algorithm may be compared at the end of the run to detect a hardware error that occurred anywhere during the run that may otherwise not be detected by traditional methodologies (e.g., due to cooling, insufficient heat and/or stress, etc.).
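The idea is a stress kernel whose running checksum is perturbed by any transient fault, compared afterwards against a known-good reference. A toy sketch (the kernel below is a hypothetical stand-in; real stress code is chosen to maximize heat and circuit activity):

```python
def stress_kernel(iterations):
    """Arithmetic-heavy loop intended to heat and stress the processor;
    its final checksum reflects every intermediate result, so an error
    anywhere during the run is observable in the output."""
    acc = 0x12345678
    for i in range(iterations):
        acc = (acc * 1103515245 + 12345 + i) & 0xFFFFFFFF
    return acc

def hardware_error_detected(iterations, reference):
    # Compare against a reference computed once on known-good hardware.
    return stress_kernel(iterations) != reference
```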
Global synchronization of parallel processors using clock pulse width modulation
Chen, Dong; Ellavsky, Matthew R.; Franke, Ross L.; Gara, Alan; Gooding, Thomas M.; Haring, Rudolf A.; Jeanson, Mark J.; Kopcsay, Gerard V.; Liebsch, Thomas A.; Littrell, Daniel; Ohmacht, Martin; Reed, Don D.; Schenck, Brandon E.; Swetz, Richard A.
2013-04-02
A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.
Electronic processing and control system with programmable hardware
NASA Technical Reports Server (NTRS)
Alkalaj, Leon (Inventor); Fang, Wai-Chi (Inventor); Newell, Michael A. (Inventor)
1998-01-01
A computer system with reprogrammable hardware allowing dynamic allocation of hardware resources for different functions and adaptability to different processors and different operating platforms. All hardware resources are physically partitioned into system-user hardware and application-user hardware depending on the specific operation requirements. A reprogrammable interface preferably interconnects the system-user hardware and application-user hardware.
Low latency messages on distributed memory multiprocessors
NASA Technical Reports Server (NTRS)
Rosing, Matthew; Saltz, Joel
1993-01-01
Many of the issues in developing an efficient interface for communication on distributed memory machines are described and a portable interface is proposed. Although the hardware component of message latency is less than one microsecond on many distributed memory machines, the software latency associated with sending and receiving typed messages is on the order of 50 microseconds. The reason for this imbalance is that the software interface does not match the hardware. By changing the interface to match the hardware more closely, applications with fine grained communication can be put on these machines. Based on several tests that were run on the iPSC/860, an interface that will better match current distributed memory machines is proposed. The model used in the proposed interface consists of a computation processor and a communication processor on each node. Communication between these processors and other nodes in the system is done through a buffered network. Information that is transmitted is either data or procedures to be executed on the remote processor. The dual processor system is better suited for efficiently handling asynchronous communications compared to a single processor system. The ability to send data or procedures is very flexible for minimizing message latency, based on the type of communication being performed. The tests performed and the proposed interface are described.
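A sketch of the proposed node model: a computation processor posts messages, and a communication processor drains the buffered network, where each message carries either data to deposit or a procedure to execute on the remote node. A queue and a thread stand in for the hardware; names are illustrative:

```python
import queue
import threading

network = queue.Queue()  # stand-in for the buffered interconnect

def communication_processor(local_store):
    """Drains incoming messages: 'data' deposits a value, 'proc' runs a
    procedure on the receiving node on behalf of the sender."""
    while True:
        kind, payload = network.get()
        if kind == "data":
            addr, value = payload
            local_store[addr] = value
        elif kind == "proc":
            payload(local_store)
        else:  # "stop"
            break

store = {}
t = threading.Thread(target=communication_processor, args=(store,))
t.start()
network.put(("data", ("x", 42)))
network.put(("proc", lambda s: s.__setitem__("y", s["x"] + 1)))
network.put(("stop", None))
t.join()
print(store)  # {'x': 42, 'y': 43}
```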
Study of a unified hardware and software fault-tolerant architecture
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan; Alger, Linda; Friend, Steven; Greeley, Gregory; Sacco, Stephen; Adams, Stuart
1989-01-01
A unified architectural concept, called the Fault Tolerant Processor Attached Processor (FTP-AP), that can tolerate hardware as well as software faults is proposed for applications requiring ultrareliable computation capability. An emulation of the FTP-AP architecture, consisting of a breadboard Motorola 68010-based quadruply redundant Fault Tolerant Processor, four VAX 750s as attached processors, and four versions of a transport aircraft yaw damper control law, is used as a testbed in the AIRLAB to examine a number of critical issues. Solutions of several basic problems associated with N-Version software are proposed and implemented on the testbed. This includes a confidence voter to resolve coincident errors in N-Version software. A reliability model of N-Version software that is based upon the recent understanding of software failure mechanisms is also developed. The basic FTP-AP architectural concept appears suitable for hosting N-Version application software while at the same time tolerating hardware failures. Architectural enhancements for greater efficiency, software reliability modeling, and N-Version issues that merit further research are identified.
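A confidence-voter sketch for N-Version outputs: exact-match voting splits on benign floating-point differences, so values are clustered within a tolerance and the largest cluster wins. The tolerance and clustering rule are illustrative, not the FTP-AP's actual voter:

```python
def confidence_vote(outputs, tol=1e-6):
    """Vote over N-Version outputs, clustering values that agree to
    within tol so small coincident differences cannot split a majority."""
    clusters = []  # list of (representative, members)
    for v in outputs:
        for rep, members in clusters:
            if abs(v - rep) <= tol:
                members.append(v)
                break
        else:
            clusters.append((v, [v]))
    rep, members = max(clusters, key=lambda c: len(c[1]))
    if len(members) <= len(outputs) // 2:
        raise RuntimeError("no majority among versions")
    return sum(members) / len(members)

print(confidence_vote([0.5000001, 0.5, 0.4999999, 0.9]))  # ~0.5
```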
A High-Throughput Processor for Flight Control Research Using Small UAVs
NASA Technical Reports Server (NTRS)
Klenke, Robert H.; Sleeman, W. C., IV; Motter, Mark A.
2006-01-01
There are numerous autopilot systems commercially available for small (<100 lbs) UAVs. However, they all share several key disadvantages for conducting aerodynamic research, chief amongst which is the fact that most utilize older, slower, 8- or 16-bit microcontroller technologies. This paper describes the development and testing of a flight control system (FCS) for small UAVs based on a modern, high-throughput, embedded processor. In addition, this FCS platform contains user-configurable hardware resources in the form of a Field Programmable Gate Array (FPGA) that can be used to implement custom, application-specific hardware. This hardware can be used to off-load routine tasks, such as sensor data collection, from the FCS processor, thereby further increasing the computational throughput of the system.
Companion Chip: Building a Segregated Hardware Architecture
NASA Astrophysics Data System (ADS)
Pareaud, Thomas; Houelle, Alain; Vaucher, Nicolas; Albinet, Mathieu; Honvault, Christophe
2011-08-01
Partitioning is an increasingly mature concept in the space industry. It aims at assuring that some error propagation modes are not possible. This paper gives an overview of an analysis conducted in the frame of a research and technology study performed in 2010/2011. The "Java Companion Chip" study addresses an interesting approach to partitioning using hardware concepts: a SoC architecture integrates a master processor, a companion chip, and additional hardware functions aiming at enforcing the time and space segregation between the master processor and the slave one. This paper discusses the benefits and the main challenges of the proposed approach. In addition, it presents an application of these concepts to a case study: a Leon/Java processor architecture able to concurrently execute native and Java applications.
Simplifying and speeding the management of intra-node cache coherence
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton on Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Phillip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Yorktown Heights, NY
2012-04-17
A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
Managing coherence via put/get windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A; Chen, Dong; Coteus, Paul W
A method and apparatus for managing coherence between two processors of a two-processor node of a multi-processor computer system. Generally the present invention relates to a software algorithm that simplifies and significantly speeds the management of cache coherence in a message passing parallel computer, and to hardware apparatus that assists this cache coherence algorithm. The software algorithm uses the opening and closing of put/get windows to coordinate the activity required to achieve cache coherence. The hardware apparatus may be an extension to the hardware address decode, that creates, in the physical memory address space of the node, an area of virtual memory that (a) does not actually exist, and (b) is therefore able to respond instantly to read and write requests from the processing elements.
Simulink/PARS Integration Support
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vacaliuc, B.; Nakhaee, N.
2013-12-18
The state of the art for signal processor hardware has far out-paced the development tools for placing applications on that hardware. In addition, signal processors are available in a variety of architectures, each uniquely capable of handling specific types of signal processing efficiently. With these processors becoming smaller and demanding less power, it has become possible to group multiple processors, a heterogeneous set of processors, into single systems. Different portions of the desired problem set can be assigned to different processor types as appropriate. As software development tools do not keep pace with these processors, especially when multiple processors of different types are used, a method is needed to enable software code portability among multiple processors and multiple types of processors along with their respective software environments. Sundance DSP, Inc. has developed a software toolkit called "PARS", whose objective is to provide a framework that uses suites of tools provided by different vendors, along with modeling tools and a real time operating system, to build an application that spans different processor types. The software language used to express the behavior of the system is a very high level modeling language, "Simulink", a MathWorks product. ORNL has used this toolkit to effectively implement several deliverables. This CRADA describes this collaboration between ORNL and Sundance DSP, Inc.
Software Coherence in Multiprocessor Memory Systems. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Bolosky, William Joseph
1993-01-01
Processors are becoming faster and multiprocessor memory interconnection systems are not keeping up. Therefore, it is necessary to have threads and the memory they access as near one another as possible. Typically, this involves putting memory or caches with the processors, which gives rise to the problem of coherence: if one processor writes an address, any other processor reading that address must see the new value. This coherence can be maintained by the hardware or with software intervention. Systems of both types have been built in the past; the hardware-based systems tended to outperform the software ones. However, the ratio of processor to interconnect speed is now so high that the extra overhead of the software systems may no longer be significant. This issue is explored both by implementing a software maintained system and by introducing and using the technique of offline optimal analysis of memory reference traces. The study finds that in properly built systems, software maintained coherence can perform comparably to or even better than hardware maintained coherence. The architectural features necessary for efficient software coherence to be profitable include a small page size, a fast trap mechanism, and the ability to execute instructions while remote memory references are outstanding.
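The offline analysis amounts to replaying a reference trace and counting the coherence work a given policy would incur. A simplified sketch under a single-writer, page-grained software scheme (the thesis's optimal analysis is more sophisticated; this only illustrates the replay idea):

```python
def count_coherence_actions(trace, page_size=4096):
    """trace: iterable of (processor, 'r' or 'w', address).

    Counts the flush/invalidate actions a simple single-writer,
    page-grained software coherence policy would perform."""
    owner = {}   # page -> processor currently holding it writable
    actions = 0
    for proc, op, addr in trace:
        page = addr // page_size
        holder = owner.get(page)
        if op == "w":
            if holder not in (None, proc):
                actions += 1         # steal page: flush at the old owner
            owner[page] = proc
        elif holder not in (None, proc):
            actions += 1             # downgrade: old owner flushes page
            owner[page] = None
    return actions

trace = [(0, "w", 0x1000), (1, "r", 0x1004), (0, "w", 0x1008), (1, "w", 0x100C)]
print(count_coherence_actions(trace))  # 2: one downgrade, one steal
```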
Potential of minicomputer/array-processor system for nonlinear finite-element analysis
NASA Technical Reports Server (NTRS)
Strohkorb, G. A.; Noor, A. K.
1983-01-01
The potential of using a minicomputer/array-processor system for the efficient solution of large-scale, nonlinear, finite-element problems is studied. A Prime 750 is used as the host computer, and a software simulator residing on the Prime is employed to assess the performance of the Floating Point Systems AP-120B array processor. Major hardware characteristics of the system such as virtual memory and parallel and pipeline processing are reviewed, and the interplay between various hardware components is examined. Effective use of the minicomputer/array-processor system for nonlinear analysis requires the following: (1) proper selection of the computational procedure and the capability to vectorize the numerical algorithms; (2) reduction of input-output operations; and (3) overlapping host and array-processor operations. A detailed discussion is given of techniques to accomplish each of these tasks. Two benchmark problems with 1715 and 3230 degrees of freedom, respectively, are selected to measure the anticipated gain in speed obtained by using the proposed algorithms on the array processor.
Design of a dataway processor for a parallel image signal processing system
NASA Astrophysics Data System (ADS)
Nomura, Mitsuru; Fujii, Tetsuro; Ono, Sadayasu
1995-04-01
Recently, demands for high-speed signal processing have been increasing, especially in the fields of image data compression, computer graphics, and medical imaging. To achieve sufficient power for real-time image processing, we have been developing parallel signal-processing systems. This paper describes a communication processor called the 'dataway processor', designed for a new scalable parallel signal-processing system. The processor has six high-speed communication links (Dataways), a data-packet routing controller, a RISC CORE, and a DMA controller. Each communication link operates in 8-bit parallel, full-duplex mode at 50 MHz. Moreover, data routing, DMA, and CORE operations are processed in parallel. Therefore, sufficient throughput is available for high-speed digital video signals. The processor is designed in a top-down fashion using a CAD system called 'PARTHENON'. The hardware is fabricated using 0.5-micrometer CMOS technology and comprises about 200 K gates.
Bhanot, Gyan [Princeton, NJ; Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2009-09-08
Class network routing is implemented in a network such as a computer network comprising a plurality of parallel compute processors at nodes thereof. Class network routing allows a compute processor to broadcast a message to a range (one or more) of other compute processors in the computer network, such as processors in a column or a row. Normally this type of operation requires a separate message to be sent to each processor. With class network routing pursuant to the invention, a single message is sufficient, which generally reduces the total number of messages in the network as well as the latency to do a broadcast. Class network routing is also applied to dense matrix inversion algorithms on distributed memory parallel supercomputers with hardware class function (multicast) capability. This is achieved by exploiting the fact that the communication patterns of dense matrix inversion can be served by hardware class functions, which results in faster execution times.
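The saving is easiest to see in injected-message counts: reaching every node in a row otherwise takes one message per destination, while a class-routed message is deposited and forwarded at each node it passes. A toy cost model (illustrative, not the patent's router):

```python
def row_broadcast_messages(row_length, class_routing=False):
    """Messages a source must inject to reach the rest of its row.

    Without class routing: one unicast per destination.
    With class routing: a single message that each node along the row
    both stores (deposits) and forwards to its neighbor.
    """
    return 1 if class_routing else row_length - 1

print(row_broadcast_messages(8))                      # 7 unicasts
print(row_broadcast_messages(8, class_routing=True))  # 1 class-routed message
```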
1982-07-01
Excerpts: … blocks. DISCUS has no form of hardware synchronisation between the processors; the only synchronisation is at an operating system level. Each processor … operations in global store so that semaphoring on global objects can be done correctly. Write Protect is used by the operating system for read-only … the appropriate operating system program. String handling primitives: the Z8000 has a rich set of string primitives …
Using Modern Design Tools for Digital Avionics Development
NASA Technical Reports Server (NTRS)
Hyde, David W.; Lakin, David R., II; Asquith, Thomas E.
2000-01-01
Shrinking development time and the increased complexity of new avionics force the designer to use modern tools and methods during hardware development. Engineers at the Marshall Space Flight Center have successfully upgraded their design flow and used it to develop a Mongoose V based radiation-tolerant processor board for the International Space Station's Water Recovery System. The design flow, based on hardware description languages, simulation, synthesis, hardware models, and full functional software model libraries, allowed designers to fully simulate the processor board from reset through initialization before any boards were built. The fidelity of a digital simulation is limited by the accuracy of the models used and how realistically the designer drives the circuit's inputs during simulation. By using the actual silicon during simulation, device modeling errors are reduced. Numerous design flaws were discovered early in the design phase, when they could be easily fixed. The use of hardware models and actual MIPS software loaded into full functional memory models also provided checkout of the software development environment. This paper will describe the design flow used to develop the processor board and give examples of errors that were found using the tools. An overview of the processor board firmware will also be covered.
Distributed digital signal processors for multi-body structures
NASA Technical Reports Server (NTRS)
Lee, Gordon K.
1990-01-01
Several digital filter designs were investigated which may be used to process sensor data from large space structures and to design digital hardware to implement the distributed signal processing architecture. Several experimental test articles are available at NASA Langley Research Center to evaluate these designs. A summary of some of the digital filter designs is presented, an evaluation of their characteristics relative to control design is discussed, and candidate hardware microcontroller/microcomputer components are given. Future activities include software evaluation of the digital filter designs and actual hardware implementation of some of the signal processor algorithms on an experimental testbed at NASA Langley.
A low power biomedical signal processor ASIC based on hardware software codesign.
Nie, Z D; Wang, L; Chen, W G; Zhang, T; Zhang, Y T
2009-01-01
A low power biomedical digital signal processor ASIC based on a hardware and software codesign methodology is presented in this paper. The codesign methodology was used to achieve higher system performance and design flexibility. The hardware implementation included a low power 32-bit RISC CPU (ARM7TDMI), a low power AHB-compatible bus, and a scalable digital co-processor optimized for low power Fast Fourier Transform (FFT) calculations. The co-processor can be scaled for 8-point, 16-point, and 32-point FFTs, taking approximately 50, 100, and 150 clock cycles, respectively. The complete design was intensively simulated using the ARM DSM model and was emulated on the ARM Versatile platform before being committed to silicon. The multi-million-gate ASIC was fabricated using SMIC 0.18 micrometer mixed-signal CMOS 1P6M technology. The die area measures 5,000 micrometers x 2,350 micrometers. The power consumption was approximately 3.6 mW at a 1.8 V power supply and a 1 MHz clock rate. The power consumption for FFT calculations was less than 1.5% of that of the conventional embedded software-based solution.
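The computation the scalable co-processor implements is the radix-2 FFT at N = 8, 16, or 32. A reference software version for comparison (NumPy is used only to check the result; cycle counts obviously do not carry over from this sketch):

```python
import cmath
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power
    of two (8, 16, or 32 for the co-processor's configurations)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

x = np.random.rand(16)
assert np.allclose(fft_radix2(list(x)), np.fft.fft(x))
```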
AFOSR BRI: Co-Design of Hardware/Software for Predicting MAV Aerodynamics
2016-09-27
Excerpts: While Moore's Law theoretically doubles processor performance every 24 months, much of the realizable performance remains … past efforts to develop such CFD codes on accelerated processors showed limited success; our hardware/software co-design approach created malleable …
Hierarchical algorithms for modeling the ocean on hierarchical architectures
NASA Astrophysics Data System (ADS)
Hill, C. N.
2012-12-01
This presentation will describe an approach to using accelerator/co-processor technology that maps hierarchical, multi-scale modeling techniques to an underlying hierarchical hardware architecture. The focus of this work is on making effective use of both the CPU and accelerator/co-processor parts of a system for large-scale ocean modeling. In this work, a lower-resolution basin-scale ocean model is locally coupled to multiple, "embedded", limited-area, higher-resolution sub-models. The higher-resolution models execute on co-processor/accelerator hardware and do not interact directly with other sub-models. The lower-resolution basin-scale model executes on the system CPU(s). The result is a multi-scale algorithm that aligns with hardware designs in the co-processor/accelerator space. We demonstrate this approach being used to substitute explicit process models for standard parameterizations. Code for our sub-models is implemented through a generic abstraction layer, so that we can target multiple accelerator architectures with different programming environments. We will present two application and implementation examples. One uses the CUDA programming environment and targets GPU hardware; this example employs a simple non-hydrostatic two-dimensional sub-model to represent vertical motion more accurately. The second example uses a highly threaded three-dimensional model at high resolution; this targets a MIC/Xeon Phi-like environment and uses sub-models as a way to explicitly compute sub-mesoscale terms. In both cases the accelerator/co-processor capability provides extra compute cycles that allow improved model fidelity for little or no extra wall-clock time cost.
Digital Hardware Architecture Implementation
1993-02-15
[Report front matter and table-of-contents residue; recoverable content: a table comparing Motorola and Intel 50 MHz, 64-bit microprocessors tracked during quarterly reviews and monthly reports, and contents entries for Signal Processor (SP) analysis, MasPar software statements, and Data Processor analysis.]
Pierce, Paul E.
1986-01-01
A hardware processor is disclosed which in the described embodiment is a memory mapped multiplier processor that can operate in parallel with a 16 bit microcomputer. The multiplier processor decodes the address bus to receive specific instructions so that in one access it can write and automatically perform single or double precision multiplication involving a number written to it, with or without addition or subtraction with a previously stored number. It can also, on a single read command, automatically round and scale a previously stored number. The multiplier processor includes two concatenated 16 bit multiplier registers, two concatenated 16 bit multipliers, and four 16 bit product registers connected to an internal 16 bit data bus. A high level address decoder determines when the multiplier processor is being addressed, and first and second low level address decoders generate control signals. In addition, certain low order address lines are used to carry uncoded control signals. First and second control circuits coupled to the decoders generate further control signals and generate a plurality of clocking pulse trains in response to the decoded and address control signals.
NASA Astrophysics Data System (ADS)
Erez, Mattan; Dally, William J.
Stream processors, like other multi-core architectures, partition their functional units and storage into multiple processing elements. In contrast to typical architectures, which contain symmetric general-purpose cores and a cache hierarchy, stream processors have a significantly leaner design. Stream processors are specifically designed for the stream execution model, in which applications have large amounts of explicit parallel computation, structured and predictable control, and memory accesses that can be performed at a coarse granularity. Applications in the streaming model are expressed in a gather-compute-scatter form, yielding programs with explicit control over transferring data to and from on-chip memory. Relying on these characteristics, which are common to many media processing and scientific computing applications, stream architectures redefine the boundary between software and hardware responsibilities, with software bearing much of the complexity required to manage concurrency, locality, and latency tolerance. Thus, stream processors have minimal control hardware, consisting of fetching medium- and coarse-grained instructions and executing them directly on the many ALUs. Moreover, the on-chip storage hierarchy of stream processors is under explicit software control, as is all communication, eliminating the need for complex reactive hardware mechanisms.
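The gather-compute-scatter form described above can be made concrete with a small sketch. The C code below is our own illustration (the names and the toy kernel are assumptions, not from the source): data movement in and out of a local buffer standing in for on-chip memory is fully explicit, leaving the compute loop purely data-parallel.

```c
#include <stdio.h>

#define N 8

/* Gather-compute-scatter: gather into a local buffer (standing in
   for on-chip storage), compute on it, scatter the results back.
   All data transfer is under explicit program control. */
static void stream_kernel(const float *mem, float *out,
                          const int *gather_idx, const int *scatter_idx,
                          int n)
{
    float local[N];                 /* "on-chip" working set */

    for (int i = 0; i < n; i++)     /* gather: explicit loads */
        local[i] = mem[gather_idx[i]];

    for (int i = 0; i < n; i++)     /* compute: pure, parallelizable */
        local[i] = 2.0f * local[i] + 1.0f;

    for (int i = 0; i < n; i++)     /* scatter: explicit stores */
        out[scatter_idx[i]] = local[i];
}

int main(void)
{
    float mem[N] = {0, 1, 2, 3, 4, 5, 6, 7}, out[N] = {0};
    int idx[N]   = {7, 6, 5, 4, 3, 2, 1, 0};

    stream_kernel(mem, out, idx, idx, N);
    for (int i = 0; i < N; i++)
        printf("%g ", out[i]);
    printf("\n");
    return 0;
}
```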
Modular hardware synthesis using an HDL. [Hardware Description Language
NASA Technical Reports Server (NTRS)
Covington, J. A.; Shiva, S. G.
1981-01-01
Although hardware description languages (HDLs) are becoming more and more necessary to automated design systems, their application is complicated by the difficulty of translating an HDL description into an implementable format, the nonfamiliarity of hardware designers with high-level language programming, nonuniform design methodologies, and the time and cost involved in transferring HDL design software. Digital design language (DDL) suffers from all of the above problems and, in addition, can only be synthesized as a complete system and not on its subparts, making it unsuitable for synthesis using standard modules or prefabricated chips such as those required in LSI or VLSI circuits. The present paper presents a method by which the DDL translator can be made to generate modular equations that allow the system to be synthesized as an interconnection of lower-level modules. The method involves the introduction of a new language construct, called a Module, which provides for the separate translation of all equations bounded by it.
The Use of a Microcomputer Based Array Processor for Real Time Laser Velocimeter Data Processing
NASA Technical Reports Server (NTRS)
Meyers, James F.
1990-01-01
The application of an array processor to laser velocimeter data processing is presented. The hardware is described along with the method of parallel programming required by the array processor. A portion of the data processing program is described in detail. The increase in computational speed of a microcomputer equipped with an array processor is illustrated by comparative testing with a minicomputer.
Contextual classification on a CDC Flexible Processor system. [for photomapped remote sensing data
NASA Technical Reports Server (NTRS)
Smith, B. W.; Siegel, H. J.; Swain, P. H.
1981-01-01
A potential hardware organization for the Flexible Processor Array is presented. An algorithm that implements a contextual classifier for remote sensing data analysis is given, along with uniprocessor classification algorithms. The Flexible Processor algorithm is provided, as are simulated timings for contextual classifiers run on the Flexible Processor Array and another system. The timings are analyzed for context neighborhoods of sizes three and nine.
NASA Technical Reports Server (NTRS)
Reinhart, Richard C.; Kacpura, Thomas J.; Smith, Carl R.; Liebetreu, John; Hill, Gary; Mortensen, Dale J.; Andro, Monty; Scardelletti, Maximilian C.; Farrington, Allen
2008-01-01
This report defines a hardware architecture approach for software-defined radios to enable commonality among NASA space missions. The architecture accommodates a range of reconfigurable processing technologies including general-purpose processors, digital signal processors, field programmable gate arrays, and application-specific integrated circuits (ASICs) in addition to flexible and tunable radiofrequency front ends to satisfy varying mission requirements. The hardware architecture consists of modules, radio functions, and interfaces. The modules are a logical division of common radio functions that compose a typical communication radio. This report describes the architecture details, the module definitions, the typical functions on each module, and the module interfaces. Tradeoffs between component-based, custom architecture and a functional-based, open architecture are described. The architecture does not specify a physical implementation internally on each module, nor does the architecture mandate the standards or ratings of the hardware used to construct the radios.
HEP - A semaphore-synchronized multiprocessor with central control. [Heterogeneous Element Processor
NASA Technical Reports Server (NTRS)
Gilliland, M. C.; Smith, B. J.; Calvert, W.
1976-01-01
The paper describes the design concept of the Heterogeneous Element Processor (HEP), a system tailored to the special needs of scientific simulation. In order to achieve high-speed computation required by simulation, HEP features a hierarchy of processes executing in parallel on a number of processors, with synchronization being largely accomplished by hardware. A full-empty-reserve scheme of synchronization is realized by zero-one-valued hardware semaphores. A typical system has, besides the control computer and the scheduler, an algebraic module, a memory module, a first-in first-out (FIFO) module, an integrator module, and an I/O module. The architecture of the scheduler and the algebraic module is examined in detail.
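The full-empty synchronization described above can be emulated in software. Below is a hedged pthreads sketch of a HEP-style cell that may be written only when empty and read only when full; the hardware "reserve" state is omitted, and all names are ours rather than HEP's.

```c
#include <pthread.h>
#include <stdio.h>

/* Software emulation of a HEP-style full/empty cell: a zero-one
   semaphore guards the value, so reads block until "full" and
   writes block until "empty". HEP did this in hardware. */
typedef struct {
    int value;
    int full;                        /* the 0/1 semaphore */
    pthread_mutex_t m;
    pthread_cond_t  cv;
} fe_cell;

static void fe_write(fe_cell *c, int v)   /* wait-empty, then fill */
{
    pthread_mutex_lock(&c->m);
    while (c->full)
        pthread_cond_wait(&c->cv, &c->m);
    c->value = v;
    c->full = 1;
    pthread_cond_broadcast(&c->cv);
    pthread_mutex_unlock(&c->m);
}

static int fe_read(fe_cell *c)            /* wait-full, then empty */
{
    pthread_mutex_lock(&c->m);
    while (!c->full)
        pthread_cond_wait(&c->cv, &c->m);
    int v = c->value;
    c->full = 0;
    pthread_cond_broadcast(&c->cv);
    pthread_mutex_unlock(&c->m);
    return v;
}

static fe_cell cell = {0, 0, PTHREAD_MUTEX_INITIALIZER,
                       PTHREAD_COND_INITIALIZER};

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= 3; i++)
        fe_write(&cell, i * 10);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);
    for (int i = 0; i < 3; i++)
        printf("got %d\n", fe_read(&cell));
    pthread_join(t, NULL);
    return 0;
}
```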
Store-operate-coherence-on-value
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Dong; Heidelberger, Philip; Kumar, Sameer
A system, method and computer program product for performing various store-operate instructions in a parallel computing environment that includes a plurality of processors and at least one cache memory device. A queue in the system receives, from a processor, a store-operate instruction that specifies under which condition a cache coherence operation is to be invoked. A hardware unit in the system runs the received store-operate instruction. The hardware unit evaluates whether a result of running the received store-operate instruction satisfies the condition. The hardware unit invokes a cache coherence operation on a cache memory address associated with the received store-operate instruction if the result satisfies the condition. Otherwise, the hardware unit does not invoke the cache coherence operation on the cache memory device.
Water Processor and Oxygen Generation Assembly
NASA Technical Reports Server (NTRS)
Bedard, John
1997-01-01
This report documents the results of the tasks which initiated efforts on design issues relating to the Water Processor (WP) and the Oxygen Generation Assembly (OGA) Flight Hardware for the International Space Station. This report fulfills the Statement of Work deliverables requirement for contract H-29387D. The following lists the tasks required by contract H-29387D: (1) HSSSI shall coordinate a detailed review of WP/OGA Flight Hardware program requirements with personnel from MSFC to identify requirements that can be eliminated without affecting the technical integrity of the WP/OGA Hardware; (2) HSSSI shall conduct the technical interchanges with personnel from MSFC to resolve design issues related to WP/OGA Flight Hardware; (3) HSSSI will initiate discussions with Zellweger Analytics, Inc. to address design issues related to WP and PCWQM interfaces.
Trigger and Readout System for the Ashra-1 Detector
NASA Astrophysics Data System (ADS)
Aita, Y.; Aoki, T.; Asaoka, Y.; Morimoto, Y.; Motz, H. M.; Sasaki, M.; Abiko, C.; Kanokohata, C.; Ogawa, S.; Shibuya, H.; Takada, T.; Kimura, T.; Learned, J. G.; Matsuno, S.; Kuze, S.; Binder, P. M.; Goldman, J.; Sugiyama, N.; Watanabe, Y.
A highly sophisticated trigger and readout system has been developed for the All-sky Survey High Resolution Air-shower (Ashra) detector. The Ashra-1 detector has a 42-degree-diameter field of view. Detecting Cherenkov and fluorescence light against the large background in such a wide field of view requires a finely segmented, high speed trigger and readout system. The system is composed of an optical fiber image transmission system, a 64 × 64 channel trigger sensor, and an FPGA-based trigger logic processor. The system typically processes the image within 10 to 30 ns and opens the shutter on the fine CMOS sensor. The 64 × 64 coarse split image is transferred via a 64 × 64 precisely aligned optical fiber bundle to a photon sensor. Current signals from the photon sensor are discriminated by custom-made trigger amplifiers. The FPGA-based processor processes the 64 × 64 hit pattern, and the corresponding partial area of the fine image is acquired. A commissioning earth-skimming tau neutrino observational search was carried out with this trigger system. In addition to the geometrical advantage of the Ashra observational site, the excellent tau shower axis measurement based on the fine imaging and the night-sky background rejection based on the fine and fast imaging allow a zero-background tau shower search. Adoption of the optical fiber bundle and trigger LSI realized the 4k-channel trigger system cheaply. Detectability of tau showers is also confirmed by simultaneously observed Cherenkov air showers. Reduction of the trigger threshold appears to enhance the effective area, especially in the PeV tau neutrino energy region. A new two-dimensional trigger LSI was introduced and the trigger threshold was lowered. A new calibration system for the trigger system was recently developed and introduced to the Ashra detector.
Control structures for high speed processors
NASA Technical Reports Server (NTRS)
Maki, G. K.; Mankin, R.; Owsley, P. A.; Kim, G. M.
1982-01-01
A special processor was designed to function as a Reed-Solomon decoder with a throughput data rate in the MHz range. This data rate is significantly greater than is possible with conventional digital architectures. To achieve this rate, the processor design includes sequential, pipelined, distributed, and parallel processing. The processor was designed using a high-level register transfer language (RTL). The RTL can be used to describe how the different processes are implemented by the hardware. One problem of special interest was the development of dependent processes, which are analogous to software subroutines. For greater flexibility, the RTL control structure was implemented in ROM. The special purpose hardware required approximately 1000 SSI and MSI components. The data rate throughput is 2.5 megabits/second, achieved through the use of pipelined and distributed processing. This can be compared with 800 kilobits/second in a recently proposed very large scale integration design of a Reed-Solomon encoder.
Matrix-vector multiplication using digital partitioning for more accurate optical computing
NASA Technical Reports Server (NTRS)
Gary, C. K.
1992-01-01
Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable with or greater than those of electronic computers, as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and speed required for an equivalent throughput, as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient algorithms when coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
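To make the digital-partitioning idea concrete: each operand is split into low-precision digits, the digit products (which an analog optical stage could compute with modest accuracy) are formed, and the full-precision result is reassembled digitally with shifts. The C sketch below is our own illustration under those assumptions, not the paper's optical implementation.

```c
#include <stdio.h>

#define DIGITS 4   /* 16-bit operands split into 4 hex digits */

/* Digital partitioning: a full-precision product is assembled from
   low-precision digit products, each of which could be formed by an
   analog optical multiplier of modest accuracy. Illustrative only. */
static unsigned long partitioned_mul(unsigned a, unsigned b)
{
    unsigned ad[DIGITS], bd[DIGITS];
    for (int i = 0; i < DIGITS; i++) {           /* partition operands */
        ad[i] = (a >> (4 * i)) & 0xFu;
        bd[i] = (b >> (4 * i)) & 0xFu;
    }
    unsigned long acc = 0;
    for (int i = 0; i < DIGITS; i++)
        for (int j = 0; j < DIGITS; j++) {
            /* each digit product needs only ~8 bits of analog accuracy */
            unsigned long p = (unsigned long)ad[i] * bd[j];
            acc += p << (4 * (i + j));           /* digital recombination */
        }
    return acc;
}

int main(void)
{
    unsigned a = 0xBEEF, b = 0x1234;
    printf("partitioned: %lu   direct: %lu\n",
           partitioned_mul(a, b), (unsigned long)a * b);
    return 0;
}
```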
Performance Qualification Test of the ISS Water Processor Assembly (WPA) Expendables
NASA Technical Reports Server (NTRS)
Carter, Layne; Tabb, David; Tatara, James D.; Mason, Richard K.
2005-01-01
The Water Processor Assembly (WPA) for use on the International Space Station (ISS) includes various technologies for the treatment of waste water. These technologies include filtration, ion exchange, adsorption, catalytic oxidation, and iodination. The WPA hardware implementing portions of these technologies, including the Particulate Filter, Multifiltration Bed, Ion Exchange Bed, and Microbial Check Valve, was recently qualified for chemical performance at the Marshall Space Flight Center. Waste water representing the quality of that produced on the ISS was generated by test subjects and processed by the WPA. Water quality analysis and instrumentation data was acquired throughout the test to monitor hardware performance. This paper documents operation of the test and the assessment of the hardware performance.
Method for prefetching non-contiguous data structures
Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Hoenicke, Dirk [Ossining, NY; Ohmacht, Martin [Brewster, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Takken, Todd E [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY
2009-05-05
A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching scheme for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
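A software analogue of the pointer-augmented memory line may help clarify the prefetching scheme. The C sketch below is illustrative only: each line carries a pointer naming the next line to visit, and the GCC/Clang __builtin_prefetch intrinsic stands in for the patent's hardware prefetch engine; all names here are ours.

```c
#include <stdio.h>

/* Sketch of the pointer-augmented memory line described above: each
   line carries, besides its normal data, a pointer naming the line
   to fetch next, so repetitive but non-contiguous access patterns
   can be prefetched without a predictive algorithm. */
struct line {
    double data[7];        /* the "normal physical memory data" */
    struct line *next;     /* embedded prefetch pointer */
};

static double walk(const struct line *p)
{
    double sum = 0.0;
    while (p) {
        if (p->next)
            __builtin_prefetch(p->next, 0, 1);  /* hint: next line soon */
        sum += p->data[0];
        p = p->next;
    }
    return sum;
}

int main(void)
{
    static struct line a[4];
    /* a non-contiguous but repetitive order: 0 -> 2 -> 1 -> 3 */
    a[0].next = &a[2]; a[2].next = &a[1]; a[1].next = &a[3];
    for (int i = 0; i < 4; i++)
        a[i].data[0] = i + 1.0;
    printf("sum = %g\n", walk(&a[0]));
    return 0;
}
```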
WATERLOOP V2/64: A highly parallel machine for numerical computation
NASA Astrophysics Data System (ADS)
Ostlund, Neil S.
1985-07-01
Current technological trends suggest that the high performance scientific machines of the future are very likely to consist of a large number (greater than 1024) of processors connected and communicating with each other in some as yet undetermined manner. Such an assembly of processors should behave as a single machine in obtaining numerical solutions to scientific problems. However, the appropriate way of organizing both the hardware and software of such an assembly of processors is an unsolved and active area of research. It is particularly important to minimize the organizational overhead of interprocessor communication, global synchronization, and contention for shared resources if the performance of a large number (n) of processors is to be anything like the desirable n times the performance of a single processor. In many situations, adding a processor actually decreases the performance of the overall system, since the extra organizational overhead is larger than the extra processing power added. The systolic loop architecture is a new multiple processor architecture which attempts a solution to the problem of how to organize a large number of asynchronous processors into an effective computational system while minimizing the organizational overhead. This paper gives a brief overview of the basic systolic loop architecture, systolic loop algorithms for numerical computation, and a 64-processor implementation of the architecture, WATERLOOP V2/64, that is being used as a testbed for exploring the hardware, software, and algorithmic aspects of the architecture.
A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor
NASA Technical Reports Server (NTRS)
Rao, Hariprasad Nannapaneni
1989-01-01
The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
Design of an integrated fuel processor for residential PEMFCs applications
NASA Astrophysics Data System (ADS)
Seo, Yu Taek; Seo, Dong Joo; Jeong, Jin Hyeok; Yoon, Wang Lai
KIER has been developing a novel fuel processing system to provide hydrogen-rich gas to a residential PEMFC system. For an effective design of a compact hydrogen production system, each unit process for steam reforming and water gas shift has a steam generator and internal heat exchangers that are thermally and physically integrated into a single packaged hardware system. The newly designed fuel processor (prototype II) showed a thermal efficiency of 78% on an HHV basis with a methane conversion of 89%. The preferential oxidation unit, with two staged cascade reactors, reduces the CO concentration to below 10 ppm, the prerequisite CO limit for the PEMFC stack, without complicated temperature control hardware. After achieving the initial performance of the fuel processor, partial load operation was carried out to test the performance and reliability of the fuel processor at various loads. The stability of the fuel processor was also demonstrated over three successive days with a stable product gas composition and thermal efficiency. The CO concentration remained below 10 ppm throughout the test period, confirming the stable performance of the two-stage PrOx reactors.
Dynamically allocating sets of fine-grained processors to running computations
NASA Technical Reports Server (NTRS)
Middleton, David
1988-01-01
Researchers explore an approach to using general purpose parallel computers which involves mapping hardware resources onto computations instead of mapping computations onto hardware. Problems such as processor allocation, task scheduling and load balancing, which have traditionally proven to be challenging, change significantly under this approach and may become amenable to new attacks. Researchers describe the implementation of this approach used by the FFP Machine whose computation and communication resources are repeatedly partitioned into disjoint groups that match the needs of available tasks from moment to moment. Several consequences of this system are examined.
Teaching Robotics Software with the Open Hardware Mobile Manipulator
ERIC Educational Resources Information Center
Vona, M.; Shekar, N. H.
2013-01-01
The "open hardware mobile manipulator" (OHMM) is a new open platform with a unique combination of features for teaching robotics software and algorithms. On-board low- and high-level processors support real-time embedded programming and motor control, as well as higher-level coding with contemporary libraries. Full hardware designs and…
Analysis of a hardware and software fault tolerant processor for critical applications
NASA Technical Reports Server (NTRS)
Dugan, Joanne B.
1993-01-01
Computer systems for critical applications must be designed to tolerate software faults as well as hardware faults. A unified approach to tolerating hardware and software faults is characterized by classifying faults in terms of duration (transient or permanent) rather than source (hardware or software). Errors arising from transient faults can be handled through masking or voting, but errors arising from permanent faults require system reconfiguration to bypass the failed component. Most errors which are caused by software faults can be considered transient, in that they are input-dependent. Software faults are triggered by a particular set of inputs. Quantitative dependability analysis of systems which exhibit a unified approach to fault tolerance can be performed by a hierarchical combination of fault tree and Markov models. A methodology for analyzing hardware and software fault tolerant systems is applied to the analysis of a hypothetical system, loosely based on the Fault Tolerant Parallel Processor. The models consider both transient and permanent faults, hardware and software faults, independent and related software faults, automatic recovery, and reconfiguration.
NASA Technical Reports Server (NTRS)
Perkinson, J. A.
1974-01-01
The application of associative memory processor equipment to conventional host-processor systems is discussed. Efforts were made to demonstrate how such application relieves the task burden of conventional systems and enhances system speed and efficiency. Data cover comparative theoretical performance analysis, demonstration of expanded growth capabilities, and demonstrations of actual hardware in a simulated environment.
Hardware Architecture Study for NASA's Space Software Defined Radios
NASA Technical Reports Server (NTRS)
Reinhart, Richard C.; Scardelletti, Maximilian C.; Mortensen, Dale J.; Kacpura, Thomas J.; Andro, Monty; Smith, Carl; Liebetreu, John
2008-01-01
This study defines a hardware architecture approach for software defined radios to enable commonality among NASA space missions. The architecture accommodates a range of reconfigurable processing technologies including general purpose processors, digital signal processors, field programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs) in addition to flexible and tunable radio frequency (RF) front-ends to satisfy varying mission requirements. The hardware architecture consists of modules, radio functions, and interfaces. The modules are a logical division of common radio functions that comprise a typical communication radio. This paper describes the architecture details, module definitions, and the typical functions on each module as well as the module interfaces. Trade-offs between component-based, custom architecture and a functional-based, open architecture are described. The architecture does not specify the internal physical implementation within each module, nor does the architecture mandate the standards or ratings of the hardware used to construct the radios.
Simulation analysis of a microcomputer-based, low-cost Omega navigation system
NASA Technical Reports Server (NTRS)
Lilley, R. W.; Salter, R. J., Jr.
1976-01-01
The current status of research on a proposed microcomputer-based, low-cost Omega Navigation System (ONS) is described. The design approach emphasizes minimum hardware, maximum software, and the use of a low-cost, commercially available microcomputer. Currently under investigation is the implementation of a low-cost navigation processor and its interface with an Omega sensor to complete the hardware-based ONS. Sensor processor functions are simulated to determine how many of them can be handled by innovative software. An input database of live Omega ground and flight test data was created. The Omega sensor and microcomputer interface modules used to collect the data are functionally described. Automatic synchronization to the Omega transmission pattern is described as an example of the algorithms developed using this database.
Integrated Hardware and Software for No-Loss Computing
NASA Technical Reports Server (NTRS)
James, Mark
2007-01-01
When an algorithm is distributed across multiple threads executing on many distinct processors, the loss of one of those threads or processors can potentially result in the total loss of all the incremental results up to that point. When the implementation is massively distributed across hardware, the probability of a hardware failure during the course of a long execution is potentially high. Traditionally, this problem has been addressed by establishing checkpoints where the current state of all or part of the execution is saved. Then, in the event of a failure, this state information can be used to recompute that point in the execution and resume the computation from there. A serious problem that arises when one distributes a problem across multiple threads and physical processors is that one increases the likelihood of the algorithm failing through no fault of the scientist, as a result of hardware faults coupled with operating system problems. With good reason, scientists expect their computing tools to serve them and not the other way around. What is novel here is a unique combination of hardware and software that reformulates an application into a monolithic structure that can be monitored in real time and dynamically reconfigured in the event of a failure. This unique reformulation of hardware and software will provide advanced aeronautical technologies to meet the challenges of next-generation systems in aviation, for civilian and scientific purposes, in our atmosphere and in the atmospheres of other worlds. In particular, with respect to NASA's manned flight to Mars, this technology addresses the critical requirements for improving safety and increasing reliability of manned spacecraft.
Hardware and software reliability estimation using simulations
NASA Technical Reports Server (NTRS)
Swern, Frederic L.
1994-01-01
The simulation technique is used to explore the validation of both hardware and software. It was concluded that simulation is a viable means for validating both hardware and software and associating a reliability number with each. This is useful in determining the overall probability of system failure of an embedded processor unit, and improving both the code and the hardware where necessary to meet reliability requirements. The methodologies were proved using some simple programs, and simple hardware models.
Identification of Maize Silicon Influx Transporters
Mitani, Namiki; Yamaji, Naoki; Ma, Jian Feng
2009-01-01
Maize (Zea mays L.) shows a high accumulation of silicon (Si), but transporters involved in the uptake and distribution have not been identified. In the present study, we isolated two genes (ZmLsi1 and ZmLsi6), which are homologous to rice influx Si transporter OsLsi1. Heterologous expression in Xenopus laevis oocytes showed that both ZmLsi1 and ZmLsi6 are permeable to silicic acid. ZmLsi1 was mainly expressed in the roots. By contrast, ZmLsi6 was expressed more in the leaf sheaths and blades. Different from OsLsi1, the expression level of both ZmLsi1 and ZmLsi6 was unaffected by Si supply. Immunostaining showed that ZmLsi1 was localized on the plasma membrane of the distal side of root epidermal and hypodermal cells in the seminal and crown roots, and also in cortex cells in lateral roots. In the shoots, ZmLsi6 was found in the xylem parenchyma cells that are adjacent to the vessels in both leaf sheaths and leaf blades. ZmLsi6 in the leaf sheaths and blades also exhibited polar localization on the side facing towards the vessel. Taken together, it can be concluded that ZmLsi1 is an influx transporter of Si, which is responsible for the transport of Si from the external solution to the root cells and that ZmLsi6 mainly functions as a Si transporter for xylem unloading. PMID:18676379
Software-Reconfigurable Processors for Spacecraft
NASA Technical Reports Server (NTRS)
Farrington, Allen; Gray, Andrew; Bell, Bryan; Stanton, Valerie; Chong, Yong; Peters, Kenneth; Lee, Clement; Srinivasan, Jeffrey
2005-01-01
A report presents an overview of an architecture for a software-reconfigurable network data processor for a spacecraft engaged in scientific exploration. When executed on suitable electronic hardware, the software performs the functions of a physical layer (in effect, acts as a software radio in that it performs modulation, demodulation, pulse-shaping, error correction, coding, and decoding), a data-link layer, a network layer, a transport layer, and application-layer processing of scientific data. The software-reconfigurable network processor is undergoing development to enable rapid prototyping and rapid implementation of communication, navigation, and scientific signal-processing functions; to provide a long-lived communication infrastructure; and to provide greatly improved scientific-instrumentation and scientific-data-processing functions by enabling science-driven in-flight reconfiguration of computing resources devoted to these functions. This development is an extension of terrestrial radio and network developments (e.g., in the cellular-telephone industry) implemented in software running on such hardware as field-programmable gate arrays, digital signal processors, traditional digital circuits, and mixed-signal application-specific integrated circuits (ASICs).
The design of an adaptive predictive coder using a single-chip digital signal processor
NASA Astrophysics Data System (ADS)
Randolph, M. A.
1985-01-01
A speech coding processor architecture design study has been performed in which the Texas Instruments TMS32010 was selected from among three commercially available digital signal processing integrated circuits and evaluated in an implementation study of real-time Adaptive Predictive Coding (APC). The TMS32010 was compared with the AT&T Bell Laboratories DSP I and the Nippon Electric Co. µPD7720 and was found to be the most suitable for a single-chip implementation of APC. A preliminary system design based on the TMS32010 has been performed, and several of the hardware and software design issues are discussed. Particular attention was paid to the design of an external memory controller which permits rapid sequential access of external RAM. As a result, it has been determined that a compact hardware implementation of the APC algorithm is feasible based on the TMS32010. Originator-supplied keywords include: vocoders, speech compression, adaptive predictive coding, digital signal processing microcomputers, speech processor architectures, and special purpose processor.
NASA Technical Reports Server (NTRS)
Mejzak, R. S.
1980-01-01
The distributed processing concept is defined in terms of control primitives, variables, and structures and their use in performing a decomposed discrete Fourier transform (DFT) application function. The design assumes interprocessor communications to be anonymous. In this scheme, all processors can access an entire common database by employing control primitives. Access to selected areas within the common database is random, enforced by a hardware lock, and determined by task and subtask pointers. This enables the number of processors in the configuration to be varied without any modifications to the control structure. Decompositional elements of the DFT application function in terms of tasks and subtasks are also described. The experimental hardware configuration consists of IMSAI 8080 chassis, which are independent 8 bit microcomputer units. These chassis are linked together to form a multiple processing system by means of a shared memory facility. This facility consists of hardware which provides a bus structure to enable up to six microcomputers to be interconnected. It provides polling and arbitration logic so that only one processor has access to shared memory at any one time.
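The decomposition of the DFT into independently claimable subtasks can be sketched briefly. The C code below is our own serial illustration: the output bins are divided into subtasks that, in the described system, separate microcomputers would claim through the shared-memory task and subtask pointers.

```c
#include <math.h>
#include <stdio.h>

#define N     16
#define TASKS 4     /* output bins split among processors */

/* One subtask of a decomposed DFT: compute output bins
   [first_bin, last_bin). In the system described above, each
   microcomputer would claim such a subtask via the shared-memory
   task and subtask pointers; here the "claims" are a simple loop. */
static void dft_subtask(const double *in, double *re, double *im,
                        int first_bin, int last_bin)
{
    for (int k = first_bin; k < last_bin; k++) {
        re[k] = im[k] = 0.0;
        for (int n = 0; n < N; n++) {
            double ang = -2.0 * M_PI * k * n / N;
            re[k] += in[n] * cos(ang);
            im[k] += in[n] * sin(ang);
        }
    }
}

int main(void)
{
    double x[N], re[N], im[N];
    for (int n = 0; n < N; n++)
        x[n] = sin(2.0 * M_PI * 3 * n / N);   /* a pure bin-3 tone */

    int per = N / TASKS;
    for (int t = 0; t < TASKS; t++)           /* one claim per processor */
        dft_subtask(x, re, im, t * per, (t + 1) * per);

    for (int k = 0; k < N; k++)
        printf("k = %2d   |X| = %6.2f\n", k, hypot(re[k], im[k]));
    return 0;
}
```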
A proposed microcomputer implementation of an Omega navigation processor
NASA Technical Reports Server (NTRS)
Abel, J. D.
1976-01-01
A microprocessor-based navigation system using the Omega process is discussed. Several methods for correcting incoming sky waves are presented along with the hardware design, which depends on a microcomputer. The control program is discussed, and block diagrams of the Omega processor and interface systems are presented.
NASA Astrophysics Data System (ADS)
Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.
2017-11-01
The current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created huge demand for data processing activities, and such throughput-intensive applications inherently contain data-level parallelism, which is better suited to SIMD-architecture-based GPUs. This paper reviews the architectural aspects of multi-/many-core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared-memory programming in OpenMP and CUDA-API-based programming.
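As a minimal illustration of the data-level parallelism the survey discusses, the following C/OpenMP kernel parallelizes an independent per-element loop across cores; the same loop is what a CUDA kernel would distribute across GPU threads. The kernel choice (saxpy) and all names are ours, not the paper's case studies.

```c
#include <omp.h>
#include <stdio.h>

#define N (1 << 20)

static float x[N], y[N];

/* A throughput-style kernel (saxpy) expressed with OpenMP shared-
   memory parallelism; each iteration is independent, which is the
   data-level parallelism the survey highlights. Illustrative only. */
int main(void)
{
    const float a = 2.0f;
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    double t0 = omp_get_wtime();
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];          /* no loop-carried dependence */
    double t1 = omp_get_wtime();

    printf("y[0] = %g; %d elements in %.3f ms on %d threads\n",
           y[0], N, 1e3 * (t1 - t0), omp_get_max_threads());
    return 0;
}
```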
Systems Suitable for Information Professionals.
ERIC Educational Resources Information Center
Blair, John C., Jr.
1983-01-01
Describes computer operating systems applicable to microcomputers, noting hardware components, advantages and disadvantages of each system, local area networks, distributed processing, and a fully configured system. Lists of hardware components (disk drives, solid state disk emulators, input/output and memory components, and processors) and…
VLSI 'smart' I/O module development
NASA Astrophysics Data System (ADS)
Kirk, Dan
The developmental history, design, and operation of the MIL-STD-1553A/B discrete and serial module (DSM) for the U.S. Navy AN/AYK-14(V) avionics computer are described and illustrated with diagrams. The ongoing preplanned product improvement for the AN/AYK-14(V) includes five dual-redundant MIL-STD-1553 channels based on DSMs. The DSM is a front-end processor for transferring data to and from a common memory, sharing memory with a host processor to provide improved 'smart' input/output performance. Each DSM comprises three hardware sections: three VLSI-6000 semicustomized CMOS arrays, memory units to support the arrays, and buffers and resynchronization circuits. The DSM hardware module design, VLSI-6000 design tools, controlware and test software, and checkout procedures (using a hardware simulator) are characterized in detail.
PANDA: A distributed multiprocessor operating system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chubb, P.
1989-01-01
PANDA is a design for a distributed multiprocessor and an operating system. PANDA is designed to allow easy expansion of both hardware and software. As such, the PANDA kernel provides only message passing and memory and process management. The other features needed for the system (device drivers, secondary storage management, etc.) are provided as replaceable user tasks. The thesis presents PANDA's design and implementation, both hardware and software. PANDA uses multiple 68010 processors sharing memory on a VME bus, each such node potentially connected to others via a high speed network. The machine is completely homogeneous: there are no differences between processors that are detectable by programs running on the machine. A single two-processor node has been constructed. Each processor contains memory management circuits designed to allow processors to share page tables safely. PANDA presents a programmers' model similar to the hardware model: a job is divided into multiple tasks, each having its own address space. Within each task, multiple processes share code and data. Tasks can send messages to each other, and set up virtual circuits between themselves. Peripheral devices such as disc drives are represented within PANDA by tasks. PANDA divides secondary storage into volumes, each volume being accessed by a volume access task, or VAT. All knowledge about the way that data is stored on a disc is kept in its volume's VAT. The design is such that PANDA should provide a useful testbed for file systems and device drivers, as these can be installed without recompiling PANDA itself, and without rebooting the machine.
Low latency memory access and synchronization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A.; Chen, Dong; Coteus, Paul W.
A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching scheme for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.
Embedded System Implementation on FPGA System With μCLinux OS
NASA Astrophysics Data System (ADS)
Fairuz Muhd Amin, Ahmad; Aris, Ishak; Syamsul Azmir Raja Abdullah, Raja; Kalos Zakiah Sahbudin, Ratna
2011-02-01
Embedded systems are taking on more complicated tasks as the processors involved become more powerful. Embedded systems have been widely used in many areas such as industry, automotive, medical imaging, communications, speech recognition, and computer vision. The hardware and software complexity required nowadays calls for a flexible system that allows further enhancement of any design without adding new hardware; otherwise, any change in the design may require the processor itself to be changed. To overcome this problem, a System On Programmable Chip (SOPC) has been designed and developed using a Field Programmable Gate Array (FPGA). A soft-core processor, the NIOS II 32-bit RISC, was utilized as the microprocessor core in the FPGA system, together with the embedded operating system μClinux. In this paper, an example of a web server is explained and demonstrated.
LSI/VLSI design for testability analysis and general approach
NASA Technical Reports Server (NTRS)
Lam, A. Y.
1982-01-01
The incorporation of testability characteristics into large scale digital designs is not only necessary for effective device testing but also pertinent to enhancing device reliability. There are at least three major DFT techniques, namely self checking, LSSD, and partitioning, each of which can be incorporated into a logic design to achieve a specific set of testability and reliability requirements. Detailed analyses of the design theory, implementation, fault coverage, hardware requirements, application limitations, etc., of each of these techniques are also presented.
A natural command language for C/3/I applications
NASA Astrophysics Data System (ADS)
Mergler, J. P.
1980-03-01
The article discusses the development of a natural command language and a control and analysis console designed to simplify the operator's task in the field of Command, Control, Communications, and Intelligence. The console is based on a DEC LSI-11 microcomputer supported by 16K words of memory and a serial interface component. Discussion covers the language, which uses English with a natural syntax, and how it is integrated with the hardware. It is concluded that the results have demonstrated the effectiveness of this natural command language.
Low-Latency Embedded Vision Processor (LLEVS)
2016-03-01
[Table-of-contents residue; recoverable content: projected performance analysis of the FPGA-based vision processor, algorithm latency analysis, and FPGA custom hardware for real-time multiresolution analysis. A surviving fragment notes that data acquired through measurement, simulation, and estimation provide the platform for performance projections.]
The software system development for the TAMU real-time fan beam scatterometer data processors
NASA Technical Reports Server (NTRS)
Clark, B. V.; Jean, B. R.
1980-01-01
A software package was designed and written to process, in real time, any one quadrature channel pair of radar scatterometer signals from the NASA L- or C-Band radar scatterometer systems. The software was successfully tested in the C-Band processor breadboard hardware using recorded radar and NERDAS (NASA Earth Resources Data Annotation System) signals as the input data sources. The processor development program and the overall processor theory of operation and design are described. The real-time processor software system is documented, and the results of the laboratory software tests and recommendations for the efficient application of the data processing capabilities are presented.
ROSE::FTTransform - A Source-to-Source Translation Framework for Exascale Fault-Tolerance Research
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lidman, J; Quinlan, D; Liao, C
2012-03-26
Exascale computing systems will require sufficient resilience to tolerate numerous types of hardware faults while still assuring correct program execution. Such extreme-scale machines are expected to be dominated by processors driven at lower voltages (near the minimum 0.5 volts for current transistors). At these voltage levels, the rate of transient errors increases dramatically due to the sensitivity to transient and geographically localized voltage drops on parts of the processor chip. To achieve power efficiency, these processors are likely to be streamlined and minimal, and thus they cannot be expected to handle transient errors entirely in hardware. Here we present an open, compiler-based framework to automate the armoring of High Performance Computing (HPC) software to protect it from these types of transient processor errors. We develop an open infrastructure to support research work in this area, and we define tools that, in the future, may provide more complete automated and/or semi-automated solutions to support software resiliency on future exascale architectures. Results demonstrate that our approach is feasible, pragmatic in how it can be separated from the software development process, and reasonably efficient (0% to 30% overhead for the Jacobi iteration on common hardware; and 20%, 40%, 26%, and 2% overhead for a randomly selected subset of benchmarks from the Livermore Loops [1]).
High Resolution Thermography In Medicine
NASA Astrophysics Data System (ADS)
Clark, R. P.; Goff, M. R.; Culley, J. E.
1988-10-01
A high resolution medical thermal imaging system using an 8-element SPRITE detector is described. Image processing is by an Intellect 100 processor controlled by a DEC LSI 11/23 minicomputer. Image storage is on a 170 Mbyte Winchester disc, together with archival storage on 12 inch diameter optical discs having a capacity of 1 Gbyte per side. The system is currently being evaluated for use in physiology and medicine. Applications outlined include the potential of thermographic screening to identify genetic carriers in X-linked hypohidrotic ectodermal dysplasia (XED), detailed vascular perfusion studies in health and disease, and the relationship between cutaneous blood flow, peripheral neurological function and skin surface temperature.
Atac, R.; Fischler, M.S.; Husby, D.E.
1991-01-15
A bus switching apparatus and method for multiple processor computer systems comprises a plurality of bus switches interconnected by branch buses. Each processor or other module of the system is connected to a spigot of a bus switch. Each bus switch also serves as part of a backplane of a modular crate hardware package. A processor initiates communication with another processor by identifying that other processor. The bus switch to which the initiating processor is connected identifies and secures, if possible, a path to that other processor, either directly or via one or more other bus switches which operate similarly. If a particular desired path through a given bus switch is not available to be used, an alternate path is considered, identified and secured. 11 figures.
Laser speckle imaging of rat retinal blood flow with hybrid temporal and spatial analysis method
NASA Astrophysics Data System (ADS)
Cheng, Haiying; Yan, Yumei; Duong, Timothy Q.
2009-02-01
Noninvasive monitoring of blood flow in the retinal circulation can reveal the progression and treatment response of ocular disorders such as diabetic retinopathy, age-related macular degeneration, and glaucoma. A non-invasive, direct blood flow (BF) measurement technique with high spatio-temporal resolution is needed for retinal imaging. Laser speckle imaging (LSI) is such a method. Currently, there are two analysis methods for LSI: spatial statistics LSI (SS-LSI) and temporal statistics LSI (TS-LSI). Comparing these two analysis methods, SS-LSI has a higher signal to noise ratio (SNR) and TS-LSI is less susceptible to artifacts from stationary speckle. We proposed a hybrid temporal and spatial analysis method (HTS-LSI) to measure retinal blood flow. A gas challenge experiment was performed and images were analyzed by HTS-LSI. Results showed that HTS-LSI can not only remove the stationary speckle but also increase the SNR. Under 100% O2, retinal BF decreased by 20-30%, consistent with results observed with the laser Doppler technique. As retinal blood flow is a critical physiological parameter and its perturbation has been implicated in the early stages of many retinal diseases, HTS-LSI will be an efficient method for early detection of retinal diseases.
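The statistic underlying all three variants is the speckle contrast K = σ/⟨I⟩. The C sketch below, our own illustration rather than the authors' pipeline, computes K over a spatio-temporal neighborhood: with T = 1 it reduces to spatial statistics (SS-LSI), with W = 1 to temporal statistics (TS-LSI), and with both greater than 1 it pools samples the way a hybrid method does.

```c
#include <math.h>
#include <stdio.h>

#define T  3     /* temporal window (frames); T = 1 gives SS-LSI */
#define W  5     /* spatial window (W x W);   W = 1 gives TS-LSI */
#define NX 16
#define NY 16

/* Speckle contrast K = sigma / mean over a spatio-temporal
   neighborhood centered at (cy, cx). A sketch of the statistic
   only, not the authors' exact processing pipeline. */
static double contrast(const double img[T][NY][NX], int cy, int cx)
{
    double sum = 0.0, sum2 = 0.0;
    int n = 0;
    for (int t = 0; t < T; t++)
        for (int dy = -(W / 2); dy <= W / 2; dy++)
            for (int dx = -(W / 2); dx <= W / 2; dx++) {
                double v = img[t][cy + dy][cx + dx];
                sum  += v;
                sum2 += v * v;
                n++;
            }
    double mean = sum / n;
    double var  = sum2 / n - mean * mean;
    return sqrt(var > 0.0 ? var : 0.0) / mean;  /* lower K = faster flow */
}

int main(void)
{
    static double img[T][NY][NX];
    for (int t = 0; t < T; t++)          /* toy data, not real speckle */
        for (int y = 0; y < NY; y++)
            for (int x = 0; x < NX; x++)
                img[t][y][x] = 100.0 + 10.0 * ((x * 31 + y * 17 + t * 7) % 5);

    printf("K(8,8) = %.4f\n", contrast(img, 8, 8));
    return 0;
}
```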
NASA Astrophysics Data System (ADS)
Bellerby, Tim
2015-04-01
PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks < number of processors) or tasks are divided out among the available processors (number of tasks > number of processors). Nested parallel statements may further subdivide the processor set owned by a given task. Tasks or processors are distributed evenly by default, but uneven distributions are possible under programmer control. It is also possible to explicitly enable child tasks to migrate within the processor set owned by their parent task, reducing load imbalance at the potential cost of increased inter-processor message traffic. PM incorporates some programming structures from the earlier MIST language presented at a previous EGU General Assembly, while adopting a significantly different underlying parallelisation model and type system. PM code is available at www.pm-lang.org under an unrestrictive MIT license.
Reference: Ruymán Reyes, Antonio J. Dorta, Francisco Almeida and Francisco de Sande, 2009. Automatic Hybrid MPI+OpenMP Code Generation with llc. Recent Advances in Parallel Virtual Machine and Message Passing Interface, Lecture Notes in Computer Science, Volume 5759, 185-195.
Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel
2008-01-01
Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
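The scoring step at the heart of a PMF search is simple to state: count the observed peptide masses that fall within a tolerance of a protein's theoretical masses, and report the best-scoring protein. The serial C sketch below illustrates only that step (all masses and names are invented); the FPGA engine's contribution is evaluating it in parallel across the whole database.

```c
#include <math.h>
#include <stdio.h>

/* Count observed peptide masses matching a protein's theoretical
   masses within a tolerance (in Da). The FPGA engine evaluates this
   scoring in parallel across the database; this serial version and
   all the data below are illustrative inventions. */
static int match_count(const double *obs, int n_obs,
                       const double *theo, int n_theo, double tol)
{
    int hits = 0;
    for (int i = 0; i < n_obs; i++)
        for (int j = 0; j < n_theo; j++)
            if (fabs(obs[i] - theo[j]) <= tol) { hits++; break; }
    return hits;
}

int main(void)
{
    double spectrum[] = {842.51, 1045.56, 1179.60, 2211.10};
    double protA[]    = {842.50, 1179.61, 1500.00};
    double protB[]    = {900.00, 1045.57, 2211.11, 2500.00};

    int a = match_count(spectrum, 4, protA, 3, 0.05);
    int b = match_count(spectrum, 4, protB, 4, 0.05);
    printf("protein A: %d hits, protein B: %d hits -> best match: %s\n",
           a, b, a >= b ? "A" : "B");
    return 0;
}
```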
Neurovision processor for designing intelligent sensors
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1992-03-01
A programmable multi-task neuro-vision processor, called the Positive-Negative (PN) neural processor, is proposed as a plausible hardware mechanism for constructing robust multi-task vision sensors. The computational operations performed by the PN neural processor are loosely based on the neural activity fields exhibited by certain nervous tissue layers situated in the brain. The neuro-vision processor can be programmed to generate diverse dynamic behavior that may be used for spatio-temporal stabilization (STS), short-term visual memory (STVM), spatio-temporal filtering (STF) and pulse frequency modulation (PFM). A multi-functional vision sensor that performs a variety of information processing operations on time-varying two-dimensional sensory images can be constructed from a parallel and hierarchical structure of numerous individually programmed PN neural processors.
Programs for Testing Processor-in-Memory Computing Systems
NASA Technical Reports Server (NTRS)
Katz, Daniel S.
2006-01-01
The Multithreaded Microbenchmarks for Processor-In-Memory (PIM) Compilers, Simulators, and Hardware are computer programs arranged in a series for use in testing the performances of PIM computing systems, including compilers, simulators, and hardware. The programs at the beginning of the series test basic functionality; the programs at subsequent positions in the series test increasingly complex functionality. The programs are intended to be used while designing a PIM system, and can be used to verify that compilers, simulators, and hardware work correctly. The programs can also be used to enable designers of these system components to examine tradeoffs in implementation. Finally, these programs can be run on non-PIM hardware (either single-threaded or multithreaded) using the POSIX pthreads standard to verify that the benchmarks themselves operate correctly. [POSIX (Portable Operating System Interface for UNIX) is a set of standards that define how programs and operating systems interact with each other. pthreads is a library of pre-emptive thread routines that comply with one of the POSIX standards.]
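In the same spirit as the benchmark series described above, here is a minimal sketch using Python's threading module in place of POSIX pthreads: one test of basic spawn/join functionality, then a more complex shared-counter test. Structure and names are illustrative, not the actual NASA benchmark code.

import threading

def test_basic_spawn_join(n=8):
    # Simplest functionality: spawn n threads, join them, check all ran.
    done = []
    threads = [threading.Thread(target=done.append, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert sorted(done) == list(range(n)), "basic spawn/join failed"

def test_shared_counter(n=8, iters=10_000):
    # More complex functionality: contended updates under a lock.
    lock = threading.Lock()
    count = 0
    def work():
        nonlocal count
        for _ in range(iters):
            with lock:
                count += 1
    threads = [threading.Thread(target=work) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    assert count == n * iters, "locked increment failed"

if __name__ == "__main__":
    test_basic_spawn_join()
    test_shared_counter()
    print("all microbenchmarks passed")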
Media processors using a new microsystem architecture designed for the Internet era
NASA Astrophysics Data System (ADS)
Wyland, David C.
1999-12-01
The demands of digital image processing, communications and multimedia applications are growing more rapidly than traditional design methods can fulfill them. Previously, only custom hardware designs could provide the performance required to meet the demands of these applications. However, hardware design has reached a crisis point. Hardware design can no longer deliver a product with the required performance and cost in a reasonable time for a reasonable risk. Software based designs running on conventional processors can deliver working designs in a reasonable time and with low risk but cannot meet the performance requirements. What is needed is a media processing approach that combines very high performance, a simple programming model, complete programmability, short time to market and scalability. The Universal Micro System (UMS) is a solution to these problems. The UMS is a completely programmable (including I/O) system on a chip that combines hardware performance with the fast time to market, low cost and low risk of software designs.
NASA Astrophysics Data System (ADS)
Laracuente, Nicholas; Grossman, Carl
2013-03-01
We developed an algorithm and software to calculate autocorrelation functions from real-time photon-counting data using the fast, parallel capabilities of graphical processor units (GPUs). Recent developments in hardware and software have allowed for general purpose computing with inexpensive GPU hardware. These devices are better suited to emulating hardware autocorrelators than traditional CPU-based software applications because they emphasize parallel throughput over sequential speed. Incoming data are binned in a standard multi-tau scheme with configurable points-per-bin size and are mapped into a GPU memory pattern to reduce time-expensive memory access. Applications include dynamic light scattering (DLS) and fluorescence correlation spectroscopy (FCS) experiments. We ran the software on a 64-core graphics PCI card in a 3.2 GHz Intel i5-based computer running Linux. FCS measurements were made on Alexa-546 and Texas Red dyes in a standard buffer (PBS). Software correlations were compared to hardware correlator measurements on the same signals. Supported by HHMI and Swarthmore College.
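A minimal CPU-side sketch of the multi-tau idea described above, in NumPy rather than on a GPU: lag resolution is coarsened by doubling the bin width at each level, so long lags stay cheap. The points-per-level and level count are illustrative parameters, not the authors' configuration.

import numpy as np

def multitau_autocorr(counts, points_per_level=8, levels=5):
    """Return (lags, normalized autocorrelation) over multi-tau levels."""
    x = np.asarray(counts, dtype=float)
    lags, g = [], []
    dt = 1
    for _ in range(levels):
        for k in range(1, points_per_level + 1):
            if k >= len(x):
                break
            num = np.mean(x[:-k] * x[k:])
            den = np.mean(x[:-k]) * np.mean(x[k:])
            lags.append(k * dt)
            g.append(num / den if den > 0 else np.nan)
        # Coarsen: average adjacent bins, doubling the effective bin width.
        n = len(x) // 2 * 2
        x = x[:n].reshape(-1, 2).mean(axis=1)
        dt *= 2
    return np.array(lags), np.array(g)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    photon_counts = rng.poisson(5, size=4096)   # synthetic count trace
    lags, g2 = multitau_autocorr(photon_counts)
    print(lags[:10], g2[:10])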
A pluggable framework for parallel pairwise sequence search.
Archuleta, Jeremy; Feng, Wu-chun; Tilevich, Eli
2007-01-01
The current and near future of the computing industry is one of multi-core and multi-processor technology. Most existing sequence-search tools have been designed with a focus on single-core, single-processor systems. This discrepancy between software design and hardware architecture substantially hinders sequence-search performance by not allowing full utilization of the hardware. This paper presents a novel framework that will aid the conversion of serial sequence-search tools into a parallel version that can take full advantage of the available hardware. The framework, which is based on a software architecture called mixin layers with refined roles, enables modules to be plugged into the framework with minimal effort. The inherent modular design improves maintenance and extensibility, thus opening up a plethora of opportunities for advanced algorithmic features to be developed and incorporated while routine maintenance of the codebase persists.
Isolation and functional characterization of CsLsi1, a silicon transporter gene in Cucumis sativus.
Sun, Hao; Guo, Jia; Duan, Yaoke; Zhang, Tiantian; Huo, Heqiang; Gong, Haijun
2017-02-01
Cucumber (Cucumis sativus) is a widely grown cucurbitaceous vegetable that exhibits a relatively high capacity for silicon (Si) accumulation, but the molecular mechanism for silicon uptake remains to be clarified. Here we isolated and characterized CsLsi1, a gene encoding a silicon transporter in cucumber (cv. Mch-4). CsLsi1 shares 55.70% and 90.63% homology with the Lsi1s of a monocot and a dicot, rice (Oryza sativa) and pumpkin (Cucurbita moschata), respectively. CsLsi1 was predominantly expressed in the roots, and application of exogenous silicon suppressed its expression. Transient expression in cucumber protoplasts showed that CsLsi1 was localized in the plasma membrane. Heterologous expression in Xenopus laevis oocytes showed that CsLsi1 exhibited influx transport activity for silicon but not urea or glycerol. Expression of cucumber CsLsi1-mGFP under its own promoter showed that CsLsi1 was localized at the distal side of the endodermis and the cortical cells in the root tips as well as in the root hairs near the root tips. Heterologous expression of CsLsi1 in a rice mutant defective in silicon uptake and the over-expression of this gene in cucumber further confirmed the role of CsLsi1 in silicon uptake. Our results suggest that CsLsi1 is a silicon influx transporter in cucumber. The cellular localization of CsLsi1 in cucumber roots is different from that in other plants, implying the possible effect of transporter localization on silicon uptake capability. © 2016 Scandinavian Plant Physiology Society.
Comparing an FPGA to a Cell for an Image Processing Application
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ngo, Hau; Broussard, Randy P.; Ives, Robert W.
2010-12-01
Modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to exploit the parallel nature of modern image processing algorithms. On the other hand, PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms with high performance. In this research project, our aim is to study the differences in performance of a modern image processing algorithm on these two hardware platforms. In particular, iris recognition systems have recently become an attractive identification method because of their extremely high accuracy. Iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system and a Cell processor. We demonstrate a 2.5 times speedup of the parallelized algorithm on the FPGA system when compared to a Cell processor-based version.
An acceleration framework for synthetic aperture radar algorithms
NASA Astrophysics Data System (ADS)
Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.
2017-04-01
Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR), are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup from adding reasonably small processing elements in the FPGA, as opposed to using a software implementation running on a typical general purpose processor.
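A minimal sketch of homomorphic filtering, the kernel the authors offload to a hardware logarithm unit: transform to the log domain, filter there, and exponentiate back. The 3x3 mean filter and the blending factor are placeholder choices, not the paper's filter.

import numpy as np

def homomorphic_filter(image, alpha=0.6):
    """Suppress multiplicative (speckle-like) noise via log-domain filtering."""
    log_img = np.log1p(image)                 # the hardware unit computes ln(x)
    # Simple 3x3 mean filter as the log-domain operation (placeholder).
    padded = np.pad(log_img, 1, mode="edge")
    smooth = sum(
        padded[i:i + log_img.shape[0], j:j + log_img.shape[1]]
        for i in range(3) for j in range(3)
    ) / 9.0
    # Blend the low-frequency estimate back, then return to linear domain.
    return np.expm1(alpha * smooth + (1 - alpha) * log_img)

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    img = rng.random((8, 8)) * 100.0          # toy intensity image
    print(homomorphic_filter(img).shape)      # (8, 8)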
Use of Field Programmable Gate Array Technology in Future Space Avionics
NASA Technical Reports Server (NTRS)
Ferguson, Roscoe C.; Tate, Robert
2005-01-01
Fulfilling NASA's new vision for space exploration requires the development of sustainable, flexible and fault tolerant spacecraft control systems. The traditional development paradigm consists of the purchase or fabrication of hardware boards with fixed processor and/or Digital Signal Processing (DSP) components interconnected via a standardized bus system. This is followed by the purchase and/or development of software. This paradigm has several disadvantages for the development of systems to support NASA's new vision. Building a system to be fault tolerant increases the complexity and decreases the performance of included software. Standard bus design and conventional implementation produces natural bottlenecks. Configuring hardware components in systems containing common processors and DSPs is difficult initially and expensive or impossible to change later. The existence of Hardware Description Languages (HDLs), the recent increase in performance, density and radiation tolerance of Field Programmable Gate Arrays (FPGAs), and Intellectual Property (IP) Cores provide the technology for reprogrammable Systems on a Chip (SOC). This technology supports a paradigm better suited for NASA's vision. Hardware and software production are melded for more effective development; they can both evolve together over time. Designers incorporating this technology into future avionics can benefit from its flexibility. Systems can be designed with improved fault isolation and tolerance using hardware instead of software. Also, these designs can be protected from obsolescence problems where maintenance is compromised by component and vendor availability. To investigate the flexibility of this technology, the core of the Central Processing Unit and Input/Output Processor of the Space Shuttle AP101S Computer were prototyped in Verilog HDL and synthesized into an Altera Stratix FPGA.
A microprocessor based high speed packet switch for satellite communications
NASA Technical Reports Server (NTRS)
Arozullah, M.; Crist, S. C.
1980-01-01
The architectures of a single processor, a three processor, and a multiple processor system are described. The hardware circuits and software routines required for implementing the three- and multiple-processor designs are presented. A bit-slice microprocessor was designed and microprogrammed. Maximum throughput was calculated for all three designs. Queueing-theoretic models for these three designs were developed and utilized to obtain analytical expressions for the average waiting times, overall average response times, and average queue sizes. From these expressions, graphs were obtained showing the effect of a number of design parameters on system performance.
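The analytical expressions referenced above are not given in the abstract; as a purely illustrative example, a single-queue M/M/1 model (arrival rate lambda, service rate mu) yields the standard closed forms below, which the paper's multi-processor models would generalise.

% Illustrative M/M/1 closed forms (not taken from the paper):
\[
  \rho = \frac{\lambda}{\mu}, \qquad
  W_q = \frac{\rho}{\mu - \lambda}, \qquad
  T = \frac{1}{\mu - \lambda}, \qquad
  L_q = \frac{\rho^2}{1 - \rho}.
\]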
Scheduler for multiprocessor system switch with selective pairing
Gara, Alan; Gschwind, Michael Karl; Salapura, Valentina
2015-01-06
System, method and computer program product for scheduling threads in a multiprocessing system with selective pairing of processor cores for increased processing reliability. A selective pairing facility is provided that selectively connects, i.e., pairs, multiple microprocessor or processor cores to provide one highly reliable thread (or thread group). The method configures the selective pairing facility to use checking to provide one highly reliable thread for high-reliability workloads, allocating such threads to paired processor cores that indicate a need for hardware checking. The method also configures the selective pairing facility to provide multiple independent cores, allocating threads to corresponding processor cores when they indicate inherent resilience.
Special-purpose computing for dense stellar systems
NASA Astrophysics Data System (ADS)
Makino, Junichiro
2007-08-01
I'll describe the current status of the GRAPE-DR project. The GRAPE-DR is the next-generation hardware for N-body simulation. Unlike the previous GRAPE hardware, it is a programmable SIMD machine with a large number of simple processors integrated into a single chip. The GRAPE-DR chip consists of 512 simple processors and operates at a clock speed of 500 MHz, delivering a theoretical peak speed of 512/256 Gflops (single/double precision). As of August 2006, the first prototype board with the sample chip successfully passed the tests we prepared. The full GRAPE-DR system will consist of 4096 chips, reaching a theoretical peak speed of 2 Pflops.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow
NASA Technical Reports Server (NTRS)
Lottati, I.
1983-01-01
Rapid computation of a sequence of transonic flow solutions must be performed in many areas of aerodynamic technology. The use of low-cost vector array processors makes such calculations economically feasible. However, for full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The objective of the present investigation is to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
NASA Technical Reports Server (NTRS)
Harper, Richard E.; Butler, Bryan P.
1990-01-01
The Draper fault-tolerant processor with fault-tolerant shared memory (FTP/FTSM), which is designed to allow application tasks to continue execution during the memory alignment process, is described. Processor performance is not affected by memory alignment. In addition, the FTP/FTSM incorporates a hardware scrubber device to perform the memory alignment quickly during unused memory access cycles. The FTP/FTSM architecture is described, followed by an estimate of the time required for channel reintegration.
A VME-based software trigger system using UNIX processors
NASA Astrophysics Data System (ADS)
Atmur, Robert; Connor, David F.; Molzon, William
1997-02-01
We have constructed a distributed computing platform with eight processors to assemble and filter data from digitization crates. The filtered data were transported to a tape-writing UNIX computer via ethernet. Each processor ran a UNIX operating system and was installed in its own VME crate. Each VME crate contained dual-port memories which interfaced with the digitizers. Using standard hardware and software (VME and UNIX) allows us to select from a wide variety of non-proprietary products and makes upgrades simpler, if they are necessary.
Extended performance electric propulsion power processor design study. Volume 1: Executive summary
NASA Technical Reports Server (NTRS)
Biess, J. J.; Inouye, L. Y.; Schoenfeld, A. D.
1977-01-01
Several power processor design concepts were evaluated and compared. Emphasis was placed on a 30-cm ion thruster power processor with a beam supply rating of 2.2 kW to 10 kW. Extensions in power processor performance were defined and were designed in sufficient detail to determine efficiency, component weight, part count, reliability and thermal control. Preliminary electrical design, mechanical design, and thermal analysis were performed on a 6-kW power transformer for the beam supply. Bi-Mod mechanical, structural, and thermal control configurations were evaluated for the power processor, and preliminary estimates of mechanical weight were determined. A program development plan was formulated that outlines the work breakdown structure for the development, qualification and fabrication of the power processor flight hardware.
Framework for Development and Distribution of Hardware Acceleration
NASA Astrophysics Data System (ADS)
Thomas, David B.; Luk, Wayne W.
2002-07-01
This paper describes IGOL, a framework for developing reconfigurable data processing applications. While IGOL was originally designed to target imaging and graphics systems, its structure is sufficiently general to support a broad range of applications. IGOL adopts a four-layer architecture: application layer, operation layer, appliance layer and configuration layer. This architecture is intended to separate and co-ordinate both the development and execution of hardware and software components. Hardware developers can use IGOL as an instance testbed for verification and benchmarking, as well as for distribution. Software application developers can use IGOL to discover hardware accelerated data processors, and to access them in a transparent, non-hardware specific manner. IGOL provides extensive support for the RC1000-PP board via the Handel-C language, and a wide selection of image processing filters have been developed. IGOL also supplies plug-ins to enable such filters to be incorporated in popular applications such as Premiere, Winamp, VirtualDub and DirectShow. Moreover, IGOL allows the automatic use of multiple cards to accelerate an application, demonstrated using DirectShow. To enable transparent acceleration without sacrificing performance, a three-tiered COM (Component Object Model) API has been designed and implemented. This API provides a well-defined and extensible interface which facilitates the development of hardware data processors that can accelerate multiple applications.
Tailoring Software for Multiple Processor Systems
1982-10-01
resource management decisions. Despite the lack of programming support, the use of multiple processor systems has grown substantially. Software has... making resource management decisions. Specifically, programmers need not allocate specific hardware resources to individual program components... Instead, such allocation decisions are automatically made based on high-level resource directives stated by application programmers, where each directive
A SOPC-BASED Evaluation of AES for 2.4 GHz Wireless Network
NASA Astrophysics Data System (ADS)
Ken, Cai; Xiaoying, Liang
In modern systems, data security is needed more than ever before, and many cryptographic algorithms are utilized for security services. Wireless Sensor Networks (WSN) are an example of such technologies. In this paper an innovative SOPC-based approach for the evaluation of security services in WSN is proposed that addresses the issues of scalability, flexible performance, and silicon efficiency for the hardware acceleration of an encryption system. The design includes a Nios II processor together with custom designed modules for the Advanced Encryption Standard (AES), which has become the default choice for various security services in numerous applications. The objective of this mechanism is to present an efficient hardware realization of AES using a very high speed integrated circuit hardware description language (Verilog HDL) and to expand its usability for various applications. Compared to a traditional custom processor design, the mechanism provides a very broad range of cost/performance points.
The implementation and use of Ada on distributed systems with reliability requirements
NASA Technical Reports Server (NTRS)
Reynolds, P. F.; Knight, J. C.; Urquhart, J. I. A.
1983-01-01
The issues involved in the use of the programming language Ada on distributed systems are discussed. The effects of hardware failures, such as the loss of a processor, on Ada programs are emphasized. It is shown that many Ada language elements are not well suited to this environment. Processor failure can easily lead to difficulties on those processors which remain. As an example, the calling task in a rendezvous may be suspended forever if the processor executing the serving task fails. A mechanism for detecting failure is proposed, and changes to the Ada run time support system are suggested which avoid most of the difficulties. Ada program structures are defined which allow programs to reconfigure and continue to provide service following processor failure.
Laser speckle imaging based on photothermally driven convection.
Regan, Caitlin; Choi, Bernard
2016-02-01
Laser speckle imaging (LSI) is an interferometric technique that provides information about the relative speed of moving scatterers in a sample. Photothermal LSI overcomes limitations in depth resolution faced by conventional LSI by incorporating an excitation pulse to target absorption by hemoglobin within the vascular network. Here we present results from experiments designed to determine the mechanism by which photothermal LSI decreases speckle contrast. We measured the impact of mechanical properties on speckle contrast, as well as the spatiotemporal temperature dynamics and bulk convective motion occurring during photothermal LSI. Our collective data strongly support the hypothesis that photothermal LSI achieves a transient reduction in speckle contrast due to bulk motion associated with thermally driven convection. The ability of photothermal LSI to image structures below a scattering medium may have important preclinical and clinical applications.
Software handlers for process interfaces
NASA Technical Reports Server (NTRS)
Bercaw, R. W.
1976-01-01
The principles involved in the development of software handlers for custom interfacing problems are discussed. Handlers for the CAMAC standard are examined in detail. The types of transactions that must be supported have been established by standards groups, eliminating conflicting requirements arising out of different design philosophies and applications. Implementation of the standard handlers has been facilitated by standardization of hardware. The necessary local processors can be placed in the handler when it is written or at run time by means of input/output directives, or they can be built into a high-performance input/output processor. The full benefits of these process interfaces will only be realized when software requirements are incorporated uniformly into the hardware.
Theorem Proving in Intel Hardware Design
NASA Technical Reports Server (NTRS)
O'Leary, John
2009-01-01
For the past decade, a framework combining model checking (symbolic trajectory evaluation) and higher-order logic theorem proving has been in production use at Intel. Our tools and methodology have been used to formally verify execution cluster functionality (including floating-point operations) for a number of Intel products, including the Pentium® 4 and Core™ i7 processors. Hardware verification in 2009 is much more challenging than it was in 1999 - today's CPU chip designs contain many processor cores and significant firmware content. This talk will attempt to distill the lessons learned over the past ten years, discuss how they apply to today's problems, and outline some future directions.
Hardware for Accelerating N-Modular Redundant Systems for High-Reliability Computing
NASA Technical Reports Server (NTRS)
Dobbs, Carl, Sr.
2012-01-01
A hardware unit has been designed that reduces the cost, in terms of performance and power consumption, of implementing N-modular redundancy (NMR) in a multiprocessor device. The innovation monitors transactions to memory and calculates a form of sumcheck on the fly, thereby relieving the processors of calculating the sumcheck in software.
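As a software analogy for the mechanism just described, the sketch below folds observed memory-write transactions into a running checksum so that redundant replicas can be compared cheaply; the class name and the particular fold are hypothetical stand-ins for the device's actual sumcheck.

class WriteMonitor:
    """Observes memory-write transactions and maintains a running checksum."""

    def __init__(self):
        self.checksum = 0

    def observe(self, address: int, value: int) -> None:
        # Fold each transaction into the checksum on the fly (illustrative fold).
        self.checksum = (self.checksum + address * 31 + value) & 0xFFFFFFFF

def run_replica(monitor, writes):
    for addr, val in writes:
        monitor.observe(addr, val)
    return monitor.checksum

if __name__ == "__main__":
    writes = [(0x1000, 42), (0x1004, 7), (0x1008, 99)]
    sums = [run_replica(WriteMonitor(), writes) for _ in range(3)]
    # NMR-style check: all redundant replicas should agree.
    print("agree" if len(set(sums)) == 1 else "fault detected")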
Interface Circuits for Self-Checking Microprocessors
NASA Technical Reports Server (NTRS)
Rennels, D. A.; Chandramouli, R.
1986-01-01
A fault-tolerant microcomputer concept is based on enhancing a "simple" computer with redundancy and self-checking logic circuits that detect hardware faults. Interface and checking logic (ICL) and redundant processors give a 16-bit microcomputer the ability to check itself for hardware faults. The checking circuitry also checks itself. The concept of self-checking complementary pairs (SCCPs) is employed throughout the ICL unit.
Evaluating local indirect addressing in SIMD processors
NASA Technical Reports Server (NTRS)
Middleton, David; Tomboulian, Sherryl
1989-01-01
In the design of parallel computers, there exists a tradeoff between the number and power of individual processors. The single instruction stream, multiple data stream (SIMD) model of parallel computers lies at one extreme of the resulting spectrum. The available hardware resources are devoted to creating the largest possible number of processors, and consequently each individual processor must use the fewest possible resources. Disagreement exists as to whether SIMD processors should be able to generate addresses individually into their local data memory, or all processors should access the same address. The tradeoff is examined between the increased capability and the reduced number of processors that occurs in this single instruction stream, multiple, locally addressed, data (SIMLAD) model. Factors are assembled that affect this design choice, and the SIMLAD model is compared with the bare SIMD and the MIMD models.
The Impact of 3D Stacking and Technology Scaling on the Power and Area of Stereo Matching Processors
Ok, Seung-Ho; Lee, Yong-Hwan; Shim, Jae Hoon; Lim, Sung Kyu; Moon, Byungin
2017-01-01
Recently, stereo matching processors have been adopted in real-time embedded systems such as intelligent robots and autonomous vehicles, which require minimal hardware resources and low power consumption. Meanwhile, thanks to the through-silicon via (TSV), three-dimensional (3D) stacking technology has emerged as a practical solution to achieving the desired requirements of a high-performance circuit. In this paper, we present the benefits of 3D stacking and process technology scaling on stereo matching processors. We implemented 2-tier 3D-stacked stereo matching processors with GlobalFoundries 130-nm and Nangate 45-nm process design kits and compare them with their two-dimensional (2D) counterparts to identify comprehensive design benefits. In addition, we examine the findings from various analyses to identify the power benefits of 3D-stacked integrated circuit (IC) and device technology advancements. From experiments, we observe that the proposed 3D-stacked ICs, compared to their 2D IC counterparts, obtain 43% area, 13% power, and 14% wire length reductions. In addition, we present a logic partitioning method suitable for a pipeline-based hardware architecture that minimizes the use of TSVs. PMID:28241437
Image Matrix Processor for Volumetric Computations Final Report CRADA No. TSB-1148-95
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberson, G. Patrick; Browne, Jolyon
The development of an Image Matrix Processor (IMP) was proposed that would provide an economical means to perform rapid ray-tracing processes on volume "Giga Voxel" data sets. This was a multi-phased project. The objective of the first phase of the IMP project was to evaluate the practicality of implementing a workstation-based Image Matrix Processor for use in volumetric reconstruction and rendering using hardware simulation techniques. Additionally, ARACOR and LLNL worked together to identify and pursue further funding sources to complete a second phase of this project.
A Real-Time Marker-Based Visual Sensor Based on a FPGA and a Soft Core Processor
Tayara, Hilal; Ham, Woonchul; Chong, Kil To
2016-01-01
This paper introduces a real-time marker-based visual sensor architecture for mobile robot localization and navigation. A hardware acceleration architecture for post video processing system was implemented on a field-programmable gate array (FPGA). The pose calculation algorithm was implemented in a System on Chip (SoC) with an Altera Nios II soft-core processor. For every frame, single pass image segmentation and Feature Accelerated Segment Test (FAST) corner detection were used for extracting the predefined markers with known geometries in FPGA. Coplanar PosIT algorithm was implemented on the Nios II soft-core processor supplied with floating point hardware for accelerating floating point operations. Trigonometric functions have been approximated using Taylor series and cubic approximation using Lagrange polynomials. Inverse square root method has been implemented for approximating square root computations. Real time results have been achieved and pixel streams have been processed on the fly without any need to buffer the input frame for further implementation. PMID:27983714
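Two of the numerical tricks named above are easy to illustrate in software; the sketch below shows a truncated Taylor series for sine and a Newton-style inverse square root iteration, in Python for clarity. Term counts and the seeding rule are illustrative; the paper's fixed-point Nios II versions would differ.

import math

def sin_taylor(x, terms=6):
    """Truncated Taylor series for sin(x); reasonable for |x| up to ~pi."""
    s, term = 0.0, x
    for n in range(terms):
        s += term
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return s

def inv_sqrt(x, iters=5):
    """Newton iteration y <- y*(1.5 - 0.5*x*y*y) converging to 1/sqrt(x)."""
    y = 1.0 / x if x > 1.0 else 1.0   # crude seed; hardware would use a table
    for _ in range(iters):
        y = y * (1.5 - 0.5 * x * y * y)
    return y

if __name__ == "__main__":
    print(sin_taylor(0.5), math.sin(0.5))
    print(inv_sqrt(2.0), 1.0 / math.sqrt(2.0))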
The Alaska SAR processor - Operations and control
NASA Technical Reports Server (NTRS)
Carande, Richard E.
1989-01-01
The Alaska SAR (synthetic-aperture radar) Facility (ASF) will be capable of receiving, processing, archiving, and producing a variety of SAR image products from three satellite-borne SARs: E-ERS-1 (ESA), J-ERS-1 (NASDA) and Radarsat (Canada). Crucial to the success of the ASF is the Alaska SAR processor (ASP), which will be capable of processing over 200 100-km x 100-km (Seasat-like) frames per day from the raw SAR data, at a ground resolution of about 30 m x 30 m. The processed imagery is of high geometric and radiometric accuracy, and is geolocated to within 500 m. Special-purpose hardware has been designed to execute a SAR processing algorithm to achieve this performance. This hardware is currently undergoing acceptance testing for delivery to the University of Alaska. Particular attention has been devoted to making the operations semi-automated and to providing a friendly operator interface via a computer workstation. The operations and control of the Alaska SAR processor are described.
The ATLAS Level-1 Calorimeter Trigger: PreProcessor implementation and performance
NASA Astrophysics Data System (ADS)
Åsman, B.; Achenbach, R.; Allbrooke, B. M. M.; Anders, G.; Andrei, V.; Büscher, V.; Bansil, H. S.; Barnett, B. M.; Bauss, B.; Bendtz, K.; Bohm, C.; Bracinik, J.; Brawn, I. P.; Brock, R.; Buttinger, W.; Caputo, R.; Caughron, S.; Cerrito, L.; Charlton, D. G.; Childers, J. T.; Curtis, C. J.; Daniells, A. C.; Davis, A. O.; Davygora, Y.; Dorn, M.; Eckweiler, S.; Edmunds, D.; Edwards, J. P.; Eisenhandler, E.; Ellis, K.; Ermoline, Y.; Föhlisch, F.; Faulkner, P. J. W.; Fedorko, W.; Fleckner, J.; French, S. T.; Gee, C. N. P.; Gillman, A. R.; Goeringer, C.; Hülsing, T.; Hadley, D. R.; Hanke, P.; Hauser, R.; Heim, S.; Hellman, S.; Hickling, R. S.; Hidvégi, A.; Hillier, S. J.; Hofmann, J. I.; Hristova, I.; Ji, W.; Johansen, M.; Keller, M.; Khomich, A.; Kluge, E.-E.; Koll, J.; Laier, H.; Landon, M. P. J.; Lang, V. S.; Laurens, P.; Lepold, F.; Lilley, J. N.; Linnemann, J. T.; Müller, F.; Müller, T.; Mahboubi, K.; Martin, T. A.; Mass, A.; Meier, K.; Meyer, C.; Middleton, R. P.; Moa, T.; Moritz, S.; Morris, J. D.; Mudd, R. D.; Narayan, R.; zur Nedden, M.; Neusiedl, A.; Newman, P. R.; Nikiforov, A.; Ohm, C. C.; Perera, V. J. O.; Pfeiffer, U.; Plucinski, P.; Poddar, S.; Prieur, D. P. F.; Qian, W.; Rieck, P.; Rizvi, E.; Sankey, D. P. C.; Schäfer, U.; Scharf, V.; Schmitt, K.; Schröder, C.; Schultz-Coulon, H.-C.; Schumacher, C.; Schwienhorst, R.; Silverstein, S. B.; Simioni, E.; Snidero, G.; Staley, R. J.; Stamen, R.; Stock, P.; Stockton, M. C.; Tan, C. L. A.; Tapprogge, S.; Thomas, J. P.; Thompson, P. D.; Thomson, M.; True, P.; Watkins, P. M.; Watson, A. T.; Watson, M. F.; Weber, P.; Wessels, M.; Wiglesworth, C.; Williams, S. L.
2012-12-01
The PreProcessor system of the ATLAS Level-1 Calorimeter Trigger (L1Calo) receives about 7200 analogue signals from the electromagnetic and hadronic components of the calorimetric detector system. Lateral division results in cells which are pre-summed to so-called Trigger Towers of size 0.1 × 0.1 along azimuth (φ) and pseudorapidity (η). The received calorimeter signals represent deposits of transverse energy. The system consists of 124 individual PreProcessor modules that digitise the input signals for each LHC collision, and provide energy and timing information to the digital processors of the L1Calo system, which identify physics objects forming much of the basis for the full ATLAS first level trigger decision. This paper describes the architecture of the PreProcessor, its hardware realisation, functionality, and performance.
Life sciences flight experiments microcomputer
NASA Technical Reports Server (NTRS)
Bartram, Peter N.
1987-01-01
A promising microcomputer configuration for Spacelab life sciences laboratory equipment consists of multiple processors. One processor's use is reserved, with additional processors dedicated to real-time input and output operations. A simple form of such a configuration, with a processor board for analog-to-digital conversion and another processor board for digital-to-analog conversion, was studied. The system used digital parallel data lines between the boards, operating independently of the system bus. Good performance of individual components was demonstrated: the analog-to-digital converter operated at over 10,000 samples per second. The combination of the data transfer between boards with the input or output functions on each board slowed performance, with a maximum throughput of 2800 to 2900 analog samples per second. Any of several techniques, such as use of the system bus for data transfer or the addition of direct memory access hardware to the processor boards, should give significantly improved performance.
Crosetto, D.B.
1996-12-31
The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system with a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor to a plurality of slave processors to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer, a digital signal processor, a parallel transfer controller, and two three-port memory devices. A communication switch within each node connects it to a fast parallel hardware channel through which all high density data arrives or leaves the node. 6 figs.
Development and analysis of the Software Implemented Fault-Tolerance (SIFT) computer
NASA Technical Reports Server (NTRS)
Goldberg, J.; Kautz, W. H.; Melliar-Smith, P. M.; Green, M. W.; Levitt, K. N.; Schwartz, R. L.; Weinstock, C. B.
1984-01-01
SIFT (Software Implemented Fault Tolerance) is an experimental, fault-tolerant computer system designed to meet the extreme reliability requirements for safety-critical functions in advanced aircraft. Errors are masked by performing a majority voting operation over the results of identical computations, and faulty processors are removed from service by reassigning computations to the nonfaulty processors. This scheme has been implemented in a special architecture using a set of standard Bendix BDX930 processors, augmented by a special asynchronous-broadcast communication interface that provides direct, processor-to-processor communication among all processors. Fault isolation is accomplished in hardware; all other fault-tolerance functions, together with scheduling and synchronization, are implemented exclusively by executive system software. The system reliability is predicted by a Markov model. Mathematical consistency of the system software with respect to the reliability model has been partially verified, using recently developed tools for machine-aided proof of program correctness.
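A minimal sketch of the voting scheme described above, assuming results are exchanged and compared per value; the vote granularity and the reassignment comment are illustrative, not SIFT's executive implementation.

from collections import Counter

def majority_vote(results):
    """Return (voted value, set of dissenting processor ids)."""
    winner, _ = Counter(results.values()).most_common(1)[0]
    dissenters = {p for p, v in results.items() if v != winner}
    return winner, dissenters

if __name__ == "__main__":
    results = {"P1": 42, "P2": 42, "P3": 41}   # P3 produced a faulty result
    value, faulty = majority_vote(results)
    print(value, faulty)   # 42, {'P3'} -> reassign P3's computations elsewhere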
NASA Technical Reports Server (NTRS)
Wu, C.; Barkan, B.; Huneycutt, B.; Leang, C.; Pang, S.
1981-01-01
Basic engineering data regarding the Interim Digital SAR Processor (IDP) and the digitally correlated Seasat synthetic aperture radar (SAR) imagery are presented. The correlation function and IDP hardware/software configuration are described, and a preliminary performance assessment is presented. The geometric and radiometric characteristics, with special emphasis on those peculiar to the IDP-produced imagery, are described.
Techniques for the rapid display and manipulation of 3-D biomedical data.
Goldwasser, S M; Reynolds, R A; Talton, D A; Walsh, E S
1988-01-01
The use of fully interactive 3-D workstations with true real-time performance will become increasingly common as technology matures and economical commercial systems become available. This paper provides a comprehensive introduction to high speed approaches to the display and manipulation of 3-D medical objects obtained from tomographic data acquisition systems such as CT, MR, and PET. A variety of techniques are outlined including the use of software on conventional minicomputers, hardware assist devices such as array processors and programmable frame buffers, and special purpose computer architecture for dedicated high performance systems. While both algorithms and architectures are addressed, the major theme centers around the utilization of hardware-based approaches including parallel processors for the implementation of true real-time systems.
Some issues related to simulation of the tracking and communications computer network
NASA Technical Reports Server (NTRS)
Lacovara, Robert C.
1989-01-01
The Communications Performance and Integration branch of the Tracking and Communications Division has an ongoing involvement in the simulation of its flight hardware for Space Station Freedom. Specifically, the communication process between central processor(s) and orbital replaceable units (ORU's) is simulated with varying degrees of fidelity. The results of investigations into three aspects of this simulation effort are given. The most general area involves the use of computer assisted software engineering (CASE) tools for this particular simulation. The second area of interest is simulation methods for systems of mixed hardware and software. The final area investigated is the application of simulation methods to one of the proposed computer network protocols for space station, specifically IEEE 802.4.
LSI logic for phase-control rectifiers
NASA Technical Reports Server (NTRS)
Dolland, C.
1980-01-01
Signals for controlling a phase-controlled rectifier circuit are generated by combinatorial logic that can be implemented in large-scale integration (LSI). The LSI circuit saves space, weight, and assembly time compared to previous controls that employ one-shot multivibrators, latches, and capacitors. The LSI logic functions by sensing the three phases of the ac power source and by comparing actual currents with intended currents.
Physiological and molecular characterization of Si uptake in wild rice species.
Mitani-Ueno, Namiki; Ogai, Hisao; Yamaji, Naoki; Ma, Jian Feng
2014-07-01
Cultivated rice (Oryza sativa) accumulates high concentration of silicon (Si), which is required for its high and sustainable production. High Si accumulation in cultivated rice is achieved by a high expression of both influx (Lsi1) and efflux (Lsi2) Si transporters in roots. Herein, we physiologically investigated Si uptake, isolated and functionally characterized Si transporters in six wild rice species with different genome types. Si uptake by the roots was lower in Oryza rufipogon, Oryza barthii (AA genome), Oryza australiensis (EE genome) and Oryza punctata (BB genome), but similar in Oryza glumaepatula and Oryza meridionalis (AA genome) compared with the cultivated rice (cv. Nipponbare). However, all wild rice species and the cultivated rice showed similar concentration of Si in the shoots when grown in a field. All species with AA genome showed the same amino acid sequence of both Lsi1 and Lsi2 as O. sativa, whereas species with EE and BB genome showed several nucleotide differences in both Lsi1 and Lsi2. However, proteins encoded by these genes also showed transport activity for Si in Xenopus oocyte. The mRNA expression of Lsi1 in all wild rice species was lower than that in the cultivated rice, whereas the expression of Lsi2 was lower in O. rufipogon and O. barthii but similar in other species. Similar cellular localization of Lsi1 and Lsi2 was observed in all wild rice as the cultivated rice. These results indicate that superior Si uptake, the important trait for rice growth, is basically conserved in wild and cultivated rice species. © 2013 Scandinavian Plant Physiology Society.
A fast, programmable hardware architecture for spaceborne SAR processing
NASA Technical Reports Server (NTRS)
Bennett, J. R.; Cumming, I. G.; Lim, J.; Wedding, R. M.
1983-01-01
The launch of spaceborne SARs during the 1980s is discussed. The satellite SARs require high quality and high throughput ground processors. Compression ratios in range and azimuth of greater than 500 and 150, respectively, lead to frequency domain processing and data computation rates in excess of 2000 million real operations per second for the C-band SARs under consideration. Various hardware architectures are examined, two promising candidates are selected, and a fast, programmable hardware architecture for spaceborne SAR processing is recommended. Modularity and programmability are introduced as desirable attributes for the purpose of HTSP hardware selection.
DRFM Cordic Processor and Sea Clutter Modeling for Enhancing Structured False Target Synthesis
2017-09-01
...to achieve accuracy at 5.625°. The resulting design was implemented using the Verilog hardware description language. The second investigation concerns generating sea clutter to impose on the false target...
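For readers unfamiliar with CORDIC, the sketch below shows the shift-and-add rotation iteration that makes it attractive for DRFM-style hardware; this is a generic Python illustration, not the thesis's Verilog design.

import math

def cordic_sin_cos(angle, iterations=16):
    """Compute (cos, sin) of angle (radians, |angle| < pi/2) via CORDIC."""
    # Precomputed rotation angles atan(2^-i) and the aggregate CORDIC gain.
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for a in atans:
        gain *= math.cos(a)
    x, y, z = 1.0, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]
    return x * gain, y * gain                # undo the accumulated gain

if __name__ == "__main__":
    c, s = cordic_sin_cos(math.radians(30))
    print(c, s)                              # ~0.8660, ~0.5000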
Electrically reconfigurable logic array
NASA Technical Reports Server (NTRS)
Agarwal, R. K.
1982-01-01
To compose complicated systems using algorithmically specialized logic circuits or processors, one solution is to perform relational computations such as union, division and intersection directly in hardware. These relations can be pipelined efficiently on a network of processors having an array configuration. These processors can be designed and implemented with a few simple cells. In order to determine the state of the art in Electrically Reconfigurable Logic Arrays (ERLA), a survey of the available programmable logic arrays (PLA) and the logic circuit elements used in such arrays was conducted. Based on this survey, some recommendations are made for ERLA devices.
Tactical Operations Analysis Support Facility.
1983-07-01
are stored in nonvolatile RAM (NVR). Communication with a host processor is via a UART (75-19.2K bps) in full duplex mode. An advanced video option... hardware/firmware "machines." Smart terminals, I/O controllers, and unique peripheral processors are examples of this process. Britton Lee, Inc.... the relational data base for symbol attributes and data retrievals. Generates a grid system for precise cursor positioning for lines, charts, and
Benchmarking hypercube hardware and software
NASA Technical Reports Server (NTRS)
Grunwald, Dirk C.; Reed, Daniel A.
1986-01-01
It was long a truism in computer systems design that balanced systems achieve the best performance. Message passing parallel processors are no different. To quantify the balance of a hypercube design, an experimental methodology was developed and the associated suite of benchmarks was applied to several existing hypercubes. The benchmark suite includes tests of both processor speed in the absence of internode communication and message transmission speed as a function of communication patterns.
An executable specification for the message processor in a simple combining network
NASA Technical Reports Server (NTRS)
Middleton, David
1995-01-01
While the primary function of the network in a parallel computer is to communicate data between processors, it is often useful if the network can also perform rudimentary calculations. That is, some simple processing ability in the network itself, particularly for performing parallel prefix computations, can reduce both the volume of data being communicated and the computational load on the processors proper. Unfortunately, typical implementations of such networks require a large fraction of the hardware budget, and so combining networks are viewed as being impractical. The FFP Machine has such a combining network, and various characteristics of the machine allow a good deal of simplification in the network design. Despite being simple in construction however, the network relies on many subtle details to work correctly. This paper describes an executable model of the network which will serve several purposes. It provides a complete and detailed description of the network which can substantiate its ability to support necessary functions. It provides an environment in which algorithms to be run on the network can be designed and debugged more easily than they would on physical hardware. Finally, it provides the foundation for exploring the design of the message receiving facility which connects the network to the individual processors.
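The parallel prefix computation mentioned above can be sketched as the classic up-sweep/down-sweep tree scan, where values combine on the way up the tree and partial prefixes distribute on the way down; the binary-tree layout, exclusive-scan convention, and power-of-two restriction are illustrative simplifications, not details of the FFP Machine network.

from operator import add

def tree_scan(values, op=add, identity=0):
    """Exclusive prefix scan via an up-sweep/down-sweep over a binary tree."""
    n = len(values)
    assert n and (n & (n - 1)) == 0, "sketch assumes a power-of-two leaf count"
    x = list(values)
    # Up-sweep: each internal node combines its two children.
    step = 1
    while step < n:
        for i in range(2 * step - 1, n, 2 * step):
            x[i] = op(x[i - step], x[i])
        step *= 2
    # Down-sweep: push partial prefixes back toward the leaves.
    x[n - 1] = identity
    step = n // 2
    while step >= 1:
        for i in range(2 * step - 1, n, 2 * step):
            left = x[i - step]
            x[i - step] = x[i]
            x[i] = op(left, x[i])
        step //= 2
    return x

if __name__ == "__main__":
    print(tree_scan([3, 1, 7, 0, 4, 1, 6, 3]))  # [0, 3, 4, 11, 11, 15, 16, 22]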
Enabling Future Robotic Missions with Multicore Processors
NASA Technical Reports Server (NTRS)
Powell, Wesley A.; Johnson, Michael A.; Wilmot, Jonathan; Some, Raphael; Gostelow, Kim P.; Reeves, Glenn; Doyle, Richard J.
2011-01-01
Recent commercial developments in multicore processors (e.g. Tilera, Clearspeed, HyperX) have provided an option for high performance embedded computing that rivals the performance attainable with FPGA-based reconfigurable computing architectures. Furthermore, these processors offer more straightforward and streamlined application development by allowing the use of conventional programming languages and software tools in lieu of hardware design languages such as VHDL and Verilog. With these advantages, multicore processors can significantly enhance the capabilities of future robotic space missions. This paper will discuss these benefits, along with onboard processing applications where multicore processing can offer advantages over existing or competing approaches. This paper will also discuss the key architectural features of current commercial multicore processors. In comparison to the current art, the features and advancements necessary for spaceflight multicore processors will be identified. These include power reduction, radiation hardening, inherent fault tolerance, and support for common spacecraft bus interfaces. Lastly, this paper will explore how multicore processors might evolve with advances in electronics technology and how avionics architectures might evolve once multicore processors are inserted into NASA robotic spacecraft.
Optimization of the Multi-Spectral Euclidean Distance Calculation for FPGA-based Spaceborne Systems
NASA Technical Reports Server (NTRS)
Cristo, Alejandro; Fisher, Kevin; Perez, Rosa M.; Martinez, Pablo; Gualtieri, Anthony J.
2012-01-01
Due to the high quantity of operations that spaceborne processing systems must carry out in space, new methodologies and techniques are being presented as good alternatives in order to free the main processor from work and improve the overall performance. These include the development of ancillary dedicated hardware circuits that carry out the more redundant and computationally expensive operations in a faster way, leaving the main processor free to carry out other tasks while waiting for the result. One of these devices is SpaceCube, a FPGA-based system designed by NASA. The opportunity to use FPGA reconfigurable architectures in space allows not only the optimization of the mission operations with hardware-level solutions, but also the ability to create new and improved versions of the circuits, including error corrections, once the satellite is already in orbit. In this work, we propose the optimization of a common operation in remote sensing: the Multi-Spectral Euclidean Distance calculation. For that, two different hardware architectures have been designed and implemented in a Xilinx Virtex-5 FPGA, the same model of FPGAs used by SpaceCube. Previous results have shown that the communications between the embedded processor and the circuit create a bottleneck that affects the overall performance in a negative way. In order to avoid this, advanced methods including memory sharing, Native Port Interface (NPI) connections and Data Burst Transfers have been used.
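As a plain-software reference for the operation being offloaded, the NumPy sketch below computes the Euclidean distance between every pixel spectrum and a reference spectrum; the band count and data are arbitrary, and the FPGA architectures in the paper would pipeline this computation per pixel.

import numpy as np

def multispectral_distance(cube, reference):
    """cube: (rows, cols, bands); reference: (bands,). Returns (rows, cols)."""
    diff = cube - reference                    # per-band differences
    return np.sqrt(np.sum(diff * diff, axis=-1))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cube = rng.random((4, 4, 6))               # tiny 6-band image
    ref = rng.random(6)                        # reference spectrum
    print(multispectral_distance(cube, ref))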
Vatansever, Recep; Ozyigit, Ibrahim Ilker; Filiz, Ertugrul; Gozukirmizi, Nermin
2017-04-01
Silicon (Si) is a nonessential, beneficial micronutrient for plants. It increases plant stress tolerance in relation to its accumulation capacity. In this work, root Si transporter genes were characterized in 17 different plants and used to infer their Si-accumulation status. A total of 62 Si transporter genes (31 Lsi1 and 31 Lsi2) were identified in the studied plants. Lsi1s were 261-324-residue proteins with a MIP family domain, whereas Lsi2s were 472-547 residues with a citrate transporter family domain. Lsi1s possessed characteristic sequence features that can be employed as benchmarks in predicting the Si-accumulation status/capacity of plants. Silicic acid selectivity in Lsi1s was associated with two highly conserved NPA (Asn-Pro-Ala) motifs and a Gly-Ser-Gly-Arg (GSGR) ar/R filter. The two NPA regions were present in all Lsi1 members, but in some the Ala was substituted with Ser or Val. The GSGR filter was only present in the proposed high and moderate Si accumulators. In the phylogeny, Lsi1s formed three clusters, corresponding to low, moderate and high Si accumulators, based on tree topology and availability of the GSGR filter. Low accumulators contained the filters WIGR, AIGR, FAAR, WVAR and AVAR; high accumulators only the GSGR filter; and moderate accumulators mostly GSGR but some A/CSGR filters. A positive correlation was also found between sequence homology and the Si-accumulation status of the tested plants. Thus, the availability of the GSGR selectivity filter and the degree of sequence homology could be used as signatures in predicting the Si-accumulation status of experimentally uncharacterized plants. Moreover, interaction partner and expression profile analyses implicated the involvement of Si transporters in plant stress tolerance.
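As an illustrative aside, the sequence signatures described above lend themselves to a simple motif scan; the sketch below uses the motif variants and ar/R filters named in the abstract, but the regular expressions, classification rule, and toy sequence are assumptions for demonstration only.

import re

NPA_MOTIF = re.compile(r"N(?:PA|PS|PV)")   # canonical NPA plus Ser/Val substitutions
ARR_FILTER = re.compile(r"GSGR|ASGR|CSGR|WIGR|AIGR|FAAR|WVAR|AVAR")  # filters listed above

def classify(seq):
    """Crude accumulator call from the ar/R filter, per the abstract's clusters."""
    npa_hits = [m.start() for m in NPA_MOTIF.finditer(seq)]
    filt = ARR_FILTER.search(seq)
    if not filt or len(npa_hits) < 2:
        return "not Lsi1-like"
    return {
        "GSGR": "high or moderate accumulator",
        "ASGR": "moderate accumulator",
        "CSGR": "moderate accumulator",
    }.get(filt.group(), "low accumulator")

if __name__ == "__main__":
    toy = "MAGNPAILKGSGRTTVNPVQQ"   # fabricated fragment for demonstration
    print(classify(toy))            # high or moderate accumulator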
Correcting for motion artifact in handheld laser speckle images
NASA Astrophysics Data System (ADS)
Lertsakdadet, Ben; Yang, Bruce Y.; Dunn, Cody E.; Ponticorvo, Adrien; Crouzet, Christian; Bernal, Nicole; Durkin, Anthony J.; Choi, Bernard
2018-03-01
Laser speckle imaging (LSI) is a wide-field optical technique that enables superficial blood flow quantification. LSI is normally performed in a mounted configuration to decrease the likelihood of motion artifact. However, mounted LSI systems are cumbersome and difficult to transport quickly in a clinical setting for which portability is essential in providing bedside patient care. To address this issue, we created a handheld LSI device using scientific grade components. To account for motion artifact of the LSI device used in a handheld setup, we incorporated a fiducial marker (FM) into our imaging protocol and determined the difference between highest and lowest speckle contrast values for the FM within each data set (Kbest and Kworst). The difference between Kbest and Kworst in mounted and handheld setups was 8% and 52%, respectively, thereby reinforcing the need for motion artifact quantification. When using a threshold FM speckle contrast value (KFM) to identify a subset of images with an acceptable level of motion artifact, mounted and handheld LSI measurements of speckle contrast of a flow region (KFLOW) in in vitro flow phantom experiments differed by 8%. Without the use of the FM, mounted and handheld KFLOW values differed by 20%. To further validate our handheld LSI device, we compared mounted and handheld data from an in vivo porcine burn model of superficial and full thickness burns. The speckle contrast within the burn region (KBURN) of the mounted and handheld LSI data differed by <4% when accounting for motion artifact using the FM, which is less than the speckle contrast difference between superficial and full thickness burns. Collectively, our results suggest the potential of handheld LSI with an FM as a suitable alternative to mounted LSI, especially in challenging clinical settings with space limitations such as the intensive care unit.
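The K values quoted above are speckle contrast statistics; as background, here is a minimal sketch of the standard local contrast computation, K = sigma/mean over a small sliding window (the 7x7 window is a common choice but an assumption here, not the paper's stated parameter).

import numpy as np

def speckle_contrast(image, win=7):
    """Local speckle contrast K = sigma/mean over win x win neighborhoods."""
    img = image.astype(float)
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + win, j:j + win]
            mu = patch.mean()
            out[i, j] = patch.std() / mu if mu > 0 else 0.0
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    raw = rng.exponential(100.0, size=(32, 32))   # fully developed static speckle
    K = speckle_contrast(raw)
    print(K.mean())   # near 1 for static speckle; motion drives K toward 0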
Set processing in a network environment. [data bases and magnetic disks and tapes]
NASA Technical Reports Server (NTRS)
Hardgrave, W. T.
1975-01-01
A combination of a local network, a mass storage system, and an autonomous set processor serving as a data/storage management machine is described. Its characteristics include: content-accessible data bases usable from all connected devices; efficient storage/access of large data bases; simple and direct programming with data manipulation and storage management handled by the set processor; simple data base design and entry from source representation to set processor representation with no predefinition necessary; capability available for user sort/order specification; significant reduction in tape/disk pack storage and mounts; flexible environment that allows upgrading hardware/software configuration without causing major interruptions in service; minimal traffic on data communications network; and improved central memory usage on large processors.
Full-field, nonscanning, optical imaging for perfusion indication
NASA Astrophysics Data System (ADS)
Chou, Nee-Yin; Winchester, L. W., Jr.; Naramore, W. J.; Alley, M. S.; Lesnick, A. J.
2010-04-01
Laser speckle imaging (LSI) has been gaining popularity for the past few years. Like other optical imaging modalities such as optical coherence tomography (OCT), orthogonal polarization spectroscopy (OPS), and laser Doppler imaging (LDI), LSI utilizes nonionizing radiation. In LSI, blood flow velocity is obtained by analyzing, temporally or spatially, laser speckle (LS) patterns generated when an expanded laser beam illuminates the tissue. The advantages of LSI are that it is fast, does not require scanning, and provides full-field LS images from which real-time, quantitative hemodynamic information on subtle changes in the tissue vasculature can be extracted. For medical applications, LSI has been used for obtaining blood velocities in the human retina, skin flaps, wounds, and cerebral and sublingual areas. When coupled with optical fibers, LSI can be used for endoscopic measurements in a variety of applications. This paper describes the application of LSI in retinal, sublingual, and skin flap measurements. Evaluation of retinal hemodynamics provides very important diagnostic information, since the human retina offers direct optical access to both the central nervous system (CNS) and afferent and efferent CNS vasculature. The performance of an LSI-based fundus imager for measuring retinal hemodynamics is presented. Sublingual microcirculation may have utility for sepsis indication, since inherent in organ injury caused by sepsis is a profound change in microvascular hemodynamics. Sublingual measurement results using an LSI scope are reported. A wound imager for imaging LS patterns of wounds and skin flaps is described, and results are presented.
Efficient Matrix Models for Relational Learning
2009-10-01
Behaviour of Newton vs. Stochastic Newton on a three-factor model. Comparison to pLSI-pHITS. Caveat: Collective Matrix Factorization makes no guarantees... one case where it leads to better results; and another where a co-clustering model, pLSI-pHITS, has the advantage. pLSI-pHITS [24] is a relational clustering technique
Ultra high speed image processing techniques. [electronic packaging techniques]
NASA Technical Reports Server (NTRS)
Anthony, T.; Hoeschele, D. F.; Connery, R.; Ehland, J.; Billings, J.
1981-01-01
Packaging techniques for ultra high speed image processing were developed. These techniques involve the development of a signal feedthrough technique through LSI/VLSI sapphire substrates, which allows the stacking of LSI/VLSI circuit substrates in a three-dimensional package with greatly reduced lengths of interconnecting lines between the LSI/VLSI circuits. The reduced parasitic capacitances result in higher LSI/VLSI computational speeds at significantly reduced power consumption levels.
Real Time Phase Noise Meter Based on a Digital Signal Processor
NASA Technical Reports Server (NTRS)
Angrisani, Leopoldo; D'Arco, Mauro; Greenhall, Charles A.; Schiano Lo Morille, Rosario
2006-01-01
A digital signal-processing meter for phase noise measurement on sinusoidal signals is described. It relies on a special hardware architecture, made up of a core digital signal processor connected to a data acquisition board, and takes advantage of a quadrature demodulation-based measurement scheme already proposed by the authors. Thanks to an efficient measurement process and an optimized implementation of its fundamental stages, the proposed meter succeeds in exploiting all hardware resources so effectively as to gain high performance and real-time operation. For input frequencies up to some hundreds of kilohertz, the meter is capable of updating the phase noise power spectrum while seamlessly capturing the analyzed signal into its memory, and of granting a frequency resolution as fine as a few hertz.
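As a rough illustration of the quadrature-demodulation scheme the meter builds on (a sketch, not the authors' DSP implementation), the following Python fragment mixes a noisy sinusoid with quadrature local oscillators, low-pass filters, extracts the phase, and estimates its power spectrum; the sample rate, carrier, filter order, and noise model are all assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

fs, f0 = 1_000_000.0, 100_000.0               # assumed sample rate and carrier (Hz)
t = np.arange(2**18) / fs
phase = 1e-3 * np.cumsum(np.random.randn(t.size))   # toy random-walk phase noise
x = np.cos(2 * np.pi * f0 * t + phase)

i = x * np.cos(2 * np.pi * f0 * t)            # in-phase mix
q = -x * np.sin(2 * np.pi * f0 * t)           # quadrature mix
b, a = butter(4, 0.1)                         # low-pass removes the 2*f0 image
phi = np.unwrap(np.arctan2(filtfilt(b, a, q), filtfilt(b, a, i)))

f, s_phi = welch(phi - phi.mean(), fs=fs, nperseg=2**14)  # phase-noise PSD, rad^2/Hz
```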
Shao, Chenzhong; Tanaka, Shuji; Nakayama, Takahiro; Hata, Yoshiyuki; Muroyama, Masanori
2018-01-15
For installing many sensors in a limited space with a limited computing resource, the digitization of the sensor output at the site of sensation has advantages such as a small amount of wiring, low signal interference and high scalability. For this purpose, we have developed a dedicated Complementary Metal-Oxide-Semiconductor (CMOS) Large-Scale Integration (LSI) (referred to as "sensor platform LSI") for bus-networked Micro-Electro-Mechanical-Systems (MEMS)-LSI integrated sensors. In this LSI, collision avoidance, adaptation and event-driven functions are simply implemented to relieve data collision and congestion in asynchronous serial bus communication. In this study, we developed a network system with 48 sensor platform LSIs based on Printed Circuit Board (PCB) in a backbone bus topology with the bus length being 2.4 m. We evaluated the serial communication performance when 48 LSIs operated simultaneously with the adaptation function. The number of data packets received from each LSI was almost identical, and the average sampling frequency of 384 capacitance channels (eight for each LSI) was 73.66 Hz.
The science of computing - Parallel computation
NASA Technical Reports Server (NTRS)
Denning, P. J.
1985-01-01
Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic component technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concomitantly, the development of algorithms for parallel processing lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays; this physical limit implies that 1 Gflop is the maximum operational speed for sequential processors. A recently introduced computer features a 'hypercube' architecture with 128 processors connected in networks at 5, 6, or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Current, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.
The use of emulator-based simulators for on-board software maintenance
NASA Astrophysics Data System (ADS)
Irvine, M. M.; Dartnell, A.
2002-07-01
Traditionally, onboard software maintenance activities within the space sector are performed using hardware-based facilities. These facilities are developed around the use of hardware emulation or breadboards containing target processors. Some sort of environment is provided around the hardware to support the maintenance activities. However, these environments are not easy to use when setting up the required test scenarios, particularly when the onboard software executes in a dynamic I/O environment, e.g. attitude control software or data handling software. In addition, the hardware and/or environment may not support the test set-up required during investigations into software anomalies (e.g. raising a spurious interrupt, failing memory, etc.), and the overall "visibility" of the executing software may be limited. The Software Maintenance Simulator (SOMSIM) is a tool that can complement the traditional maintenance facilities. The following list contains some of the main benefits that SOMSIM can provide: a low-cost, flexible extension to an existing product - an operational simulator containing a software processor emulator; a system-level, high-fidelity test-bed in which the software "executes"; a high degree of control and configuration over the entire "system", including contingency conditions perhaps not possible with real hardware; and high visibility and control over execution of the emulated software. This paper describes the SOMSIM concept in more detail, and also describes the SOMSIM study being carried out for ESA/ESOC by VEGA IT GmbH.
Operating System Support for Shared Hardware Data Structures
2013-01-31
Carbon [73] uses hardware queues to improve fine-grained multitasking for Recognition, Mining, and Synthesis workloads. Compared to software approaches... web transaction processing, data mining, and multimedia. Early work in database processors [114, 96, 79, 111] reduces the costs of relational database... assignment can be solved statically or dynamically. Static assignment determines offline which data structures are assigned to use HWDS resources and at
Solid State Audio/Speech Processor Analysis.
1980-03-01
techniques. The techniques were demonstrated to be worthwhile in an efficient real-time AWR system. Finally, microprocessor architectures were designed to... do not include custom chip development, detailed hardware design, construction, or testing. ITTDCD is very encouraged by the results obtained in this... University of California, Berkeley, was responsible for furnishing the simulation data of OD speech analysis techniques and for the design and development of the hardware OD
Design, processing and testing of LSI arrays hybrid microelectronics task
NASA Technical Reports Server (NTRS)
Himmel, R. P.; Stuhlbarg, S. M.; Salmassy, S.
1978-01-01
Those factors affecting the cost of electronic subsystems utilizing LSI microcircuits were determined, and the most efficient methods for low-cost packaging of LSI devices as a function of density and reliability were developed.
Analog hardware for learning neural networks
NASA Technical Reports Server (NTRS)
Eberhardt, Silvio P. (Inventor)
1991-01-01
This is a recurrent or feedforward analog neural network processor having a multi-level neuron array and a synaptic matrix for storing weighted analog values of synaptic connection strengths which is characterized by temporarily changing one connection strength at a time to determine its effect on system output relative to the desired target. That connection strength is then adjusted based on the effect, whereby the processor is taught the correct response to training examples connection by connection.
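The patent's learning rule is essentially serial weight perturbation. A minimal software sketch of that idea follows (the hardware operates on analog synapse strengths, not NumPy arrays); the network shape, perturbation size, and learning rate are assumptions.

```python
import numpy as np

# Hedged sketch of serial weight perturbation: perturb one connection at a
# time, measure the change in output error against the target, and adjust
# that connection in the direction that reduces the error.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 2))        # 4 inputs -> 2 outputs, one layer

def error(x, target, W):
    return float(np.sum((np.tanh(x @ W) - target) ** 2))

def train_step(x, target, W, delta=1e-3, lr=0.1):
    for idx in np.ndindex(W.shape):           # one connection strength at a time
        base = error(x, target, W)
        W[idx] += delta                       # temporary perturbation
        effect = (error(x, target, W) - base) / delta
        W[idx] -= delta                       # restore the weight...
        W[idx] -= lr * effect                 # ...then adjust per its effect
    return error(x, target, W)

x = np.array([0.5, -0.2, 0.8, 0.1])
target = np.array([0.3, -0.6])
for _ in range(50):
    loss = train_step(x, target, W)
```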
Modernization of the NASA IRTF Telescope Control System
NASA Astrophysics Data System (ADS)
Pilger, Eric J.; Harwood, James V.; Onaka, Peter M.
1994-06-01
We describe the ongoing modernization of the NASA IR Telescope Facility Telescope Control System. A major mandate of this project is to keep the telescope available for observations throughout. Therefore, we have developed an incremental plan that will allow us to replace components of the software and hardware without shutting down the system. The current system, running under FORTH on a DEC LSI 11/23 minicomputer interfaced to a bus and boards developed in-house, will be replaced with a combination of a Sun SPARCstation running SunOS, a MicroSPARC-based Single Board Computer running LynxOS, and various intelligent VME-based peripheral cards. The software is based on a design philosophy originally developed by Pat Wallace for use on the Anglo-Australian Telescope. This philosophy has gained wide acceptance and is currently used in a number of observatories around the world. A key element of this philosophy is the division of the TCS into 'Virtual' and 'Real' parts. This will allow us to replace the higher-level functions of the TCS with software running on the Sun, while still relying on the LSI 11/23 for the lower-level functions. Eventual transfer of the lower-level functions to the MicroSPARC system will then proceed incrementally through use of a Q-Bus to VME-Bus converter.
Crosetto, Dario B.
1996-01-01
The present device provides a dynamically configurable communication network: a multi-processor parallel processing system having a serial communication network and a high-speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200), to effect communication protocol, to control transmission of high-density data among nodes, and to monitor each slave processor's status. The high-speed parallel processing network is used to effect the transmission of high-density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high-density data arrives or leaves the node.
Digital system for structural dynamics simulation
NASA Technical Reports Server (NTRS)
Krauter, A. I.; Lagace, L. J.; Wojnar, M. K.; Glor, C.
1982-01-01
State-of-the-art digital hardware and software were developed for the simulation of complex structural dynamic interactions, such as those which occur in rotating structures (engine systems). They were incorporated in a system designed to use an array of processors in which the computation for each physical subelement or functional subsystem is assigned to a single specific processor in the simulator. These node processors are microprogrammed bit-slice microcomputers which function autonomously and can communicate with each other and with a central control minicomputer over parallel digital lines. Inter-processor nearest-neighbor communication busses pass the constants which represent physical constraints and boundary conditions. Each node processor is connected to its six nearest-neighbor node processors to simulate the actual physical interfaces of real substructures. Computer-generated finite element mesh and force models can be developed with the aid of the central control minicomputer. The control computer also oversees the animation of a graphics display system and disk-based mass storage, along with the individual processing elements.
Next Generation Space Telescope Integrated Science Module Data System
NASA Technical Reports Server (NTRS)
Schnurr, Richard G.; Greenhouse, Matthew A.; Jurotich, Matthew M.; Whitley, Raymond; Kalinowski, Keith J.; Love, Bruce W.; Travis, Jeffrey W.; Long, Knox S.
1999-01-01
The data system for the Next Generation Space Telescope (NGST) Integrated Science Module (ISIM) is the primary data interface between the spacecraft, telescope, and science instrument systems. This poster includes block diagrams of the ISIM data system and its components derived during the pre-phase A Yardstick feasibility study. The poster details the hardware and software components used to acquire and process science data for the Yardstick instrument complement, and depicts the baseline external interfaces to science instruments and other systems. This baseline data system is a fully redundant, high-performance computing system. Each redundant computer contains three 150 MHz PowerPC processors. All processors execute a commercially available real-time multi-tasking operating system supporting preemptive multi-tasking, file management, and network interfaces. The six processors in the system are networked together. The spacecraft interface baseline is an extension of the network which links the six processors. The final selection of processor buses, processor chips, network interfaces, and high-speed data interfaces will be made during mid-2002.
NASA Astrophysics Data System (ADS)
He, Yongli; Huang, Jianping; Li, Dongdong; Xie, Yongkun; Zhang, Guolong; Qi, Yulei; Wang, Shanshan; Totz, Sonja
2017-11-01
The influence of winter and summer land-sea surface thermal contrast on blocking for 1948-2013 is investigated using observations and Coupled Model Intercomparison Project outputs. The land-sea index (LSI) is defined to measure changes in zonally asymmetric thermal forcing under global warming. The summer LSI shows a slower increasing trend than the winter LSI during this period. During the positive phase of summer LSI, the EP flux convergence induced by land-sea thermal forcing at high latitudes becomes weaker than normal, which induces a positive anomaly of the zonal-mean westerlies and a double-jet structure. Based on the quasi-resonance amplification mechanism, the narrow and weakened westerly tunnel between the two jet centers provides a favorable environment for more frequent blocking. Composite analysis demonstrates that summer blocking shows an increasing trend in event numbers and a decreasing trend in duration. The number of short-lived blocking events persisting for 5-9 days increases significantly, while the number of long-lived events persisting for longer than 10 days increases only weakly relative to the negative phase of summer LSI. The increasing transient wave activity induced by the summer LSI is responsible for the decreasing duration of blocking. The increased blocking due to the summer LSI can further strengthen continental warming and increase the summer LSI, forming a positive feedback. The opposite dynamical effects of the LSI on summer and winter blocking are also discussed; the LSI-blocking negative feedback is found to partially offset the influence of the above positive feedback and to induce a weak summer warming rate.
Effect of rebamipide ophthalmic suspension on the success of lacrimal stent intubation.
Mimura, Masashi; Ueki, Mari; Oku, Hidehiro; Sato, Bunpei; Ikeda, Tsunehiko
2016-02-01
To evaluate the effect of the postoperative administration of rebamipide ophthalmic suspension on the success rate of lacrimal stent intubation (LSI) for the treatment of primary acquired nasolacrimal duct obstruction (PANDO). This comparative interventional cohort study investigated 110 consecutive patients with PANDO who were treated with LSI and followed up for 12 months postoperatively at one institution. LSI was performed by one surgeon, and all patients received identical postoperative care. Among the total 110 patients, 71 underwent LSI with postoperative administration of rebamipide ophthalmic suspension, and 39 underwent LSI without administration of the suspension. Data related to patient age, gender, laterality, and postoperative administration of rebamipide ophthalmic suspension were collected and used as independent variables, and logistic regression analyses were performed to compare the anatomical success rate at 12 months postoperatively between patients with and without postoperative administration of the suspension. The anatomical success rate of LSI in patients with and without postoperative administration of rebamipide ophthalmic suspension was 90.1% and 69.2%, respectively. A comparison of these success rates showed statistical significance, in that the rate of treatment success was higher in PANDO patients who underwent LSI with postoperative administration of the suspension [odds ratio (OR), 3.37; P < 0.05]. The findings of this study show that postoperative administration of rebamipide ophthalmic suspension increases the rate of anatomical success in patients who undergo LSI for the treatment of PANDO.
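As a back-of-envelope check of the reported rates (a sketch, not the paper's analysis), the unadjusted 2×2 odds ratio implied by the quoted success percentages can be computed directly; it differs from the paper's OR of 3.37, which came from logistic regression adjusting for age, gender, and laterality.

```python
# Crude odds ratio from the quoted rates (not the paper's adjusted analysis).
n_reb, rate_reb = 71, 0.901     # rebamipide group
n_ctl, rate_ctl = 39, 0.692     # no-suspension group

a = round(n_reb * rate_reb)     # successes with rebamipide: 64
b = n_reb - a                   # failures: 7
c = round(n_ctl * rate_ctl)     # successes without: 27
d = n_ctl - c                   # failures: 12

crude_or = (a * d) / (b * c)
print(f"crude OR = {crude_or:.2f}")   # ~4.06, vs the adjusted OR of 3.37
```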
A digital retina-like low-level vision processor.
Mertoguno, S; Bourbakis, N G
2003-01-01
This correspondence presents the basic design and the simulation of a low-level multilayer vision processor that emulates to some degree the functional behavior of the human retina. This retina-like multilayer processor is the lower part of an autonomous self-organized vision system, called Kydon, that could be used by visually impaired people with a damaged visual cerebral cortex. The Kydon vision system, however, is not presented in this paper. The retina-like processor consists of four major layers, each of which is an array processor based on hexagonal, autonomous processing elements that perform a certain set of low-level vision tasks, such as smoothing and light adaptation, edge detection, segmentation, line recognition, and region-graph generation. At each layer, the array processor is a 2D array of k×m identical autonomous hexagonal cells that simultaneously execute certain low-level vision tasks. The hardware design and the transistor-level simulation of the processing elements (PEs) of the retina-like processor, together with its simulated functionality and illustrative examples, are provided in this paper.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Yao; Balaprakash, Prasanna; Meng, Jiayuan
We present Raexplore, a performance modeling framework for architecture exploration. Raexplore enables rapid, automated, and systematic search of the architecture design space by combining hardware counter-based performance characterization and analytical performance modeling. We demonstrate Raexplore for two recent manycore processors, the IBM BlueGene/Q compute chip and the Intel Xeon Phi, targeting a set of scientific applications. Our framework is able to capture complex interactions between architectural components including instruction pipeline, cache, and memory, and to achieve a 3–22% error for same-architecture and cross-architecture performance predictions. Furthermore, we apply our framework to assess the two processors, and discover and evaluate a list of architectural scaling options for future processor designs.
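Raexplore's analytical model is not spelled out in this abstract, so the following Python fragment is only a much-simplified stand-in for the general approach: a roofline-style bound in which counter-derived compute and traffic totals are combined with machine peak rates. The kernel numbers are invented; the peak figures are roughly the published values for the two chips.

```python
# Simplified stand-in for counter-driven analytical modeling: runtime is
# bounded by either compute throughput or memory bandwidth (roofline style).
def roofline_time(flops: float, bytes_moved: float,
                  peak_gflops: float, peak_gbps: float) -> float:
    """Predicted kernel time (s) as the max of compute and memory bounds."""
    t_compute = flops / (peak_gflops * 1e9)
    t_memory = bytes_moved / (peak_gbps * 1e9)
    return max(t_compute, t_memory)

# Example: a kernel profiled at 2e11 flops and 8e10 bytes, on two machines
# with approximate peak rates (GF/s, GB/s).
for name, gflops, gbps in [("BG/Q-like", 204.8, 42.6), ("Xeon Phi-like", 1010.0, 177.0)]:
    print(name, f"{roofline_time(2e11, 8e10, gflops, gbps):.3f} s")
```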
A 20 MHz CMOS reorder buffer for a superscalar microprocessor
NASA Technical Reports Server (NTRS)
Lenell, John; Wallace, Steve; Bagherzadeh, Nader
1992-01-01
Superscalar processors can achieve increased performance by issuing instructions out of order from the original sequential instruction stream. Implementing an out-of-order instruction issue policy requires a hardware mechanism to prevent incorrectly executed instructions from updating register values. A reorder buffer can be used to allow a superscalar processor to issue instructions out of order while maintaining program correctness. This paper describes the design and implementation of a 20 MHz CMOS reorder buffer for superscalar processors. The reorder buffer is designed to accept and retire two instructions per cycle. A full-custom layout in a 1.2-micron process has been implemented, measuring 1.1058 mm by 1.3542 mm.
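The buffer's contract, accepting and retiring up to two instructions per cycle while enforcing in-order retirement, can be mimicked in a few lines of Python (a behavioral sketch, not the CMOS design; the size and width are assumptions):

```python
from collections import deque

# Behavioral model of a 2-wide reorder buffer: instructions complete out of
# order, but update architectural state only when retired in program order.
class ReorderBuffer:
    def __init__(self, size=16, width=2):
        self.entries = deque()                # held in program order
        self.size, self.width = size, width

    def issue(self, tags):
        """Allocate up to `width` entries per cycle."""
        for tag in tags[:self.width]:
            if len(self.entries) < self.size:
                self.entries.append({"tag": tag, "done": False, "value": None})

    def complete(self, tag, value):
        """Out-of-order completion: mark the matching entry done."""
        for e in self.entries:
            if e["tag"] == tag:
                e["done"], e["value"] = True, value
                return

    def retire(self):
        """In-order retirement of up to `width` finished head entries."""
        retired = []
        while self.entries and self.entries[0]["done"] and len(retired) < self.width:
            retired.append(self.entries.popleft())
        return retired

rob = ReorderBuffer()
rob.issue(["i0", "i1"])
rob.complete("i1", 42)       # i1 finishes first, but cannot retire...
assert rob.retire() == []    # ...until i0 (the head) is done
rob.complete("i0", 7)
assert [e["tag"] for e in rob.retire()] == ["i0", "i1"]
```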
NASA Technical Reports Server (NTRS)
Taylor, B. K.; Casasent, D. P.
1989-01-01
The use of simplified error models to accurately simulate and evaluate the performance of an optical linear-algebra processor is described. The optical architecture used to perform banded matrix-vector products is reviewed, along with a linear dynamic finite-element case study. The laboratory hardware and ac-modulation technique used are presented. The individual processor error-source models and their simulator implementation are detailed. Several significant simplifications are introduced to ease the computational requirements and complexity of the simulations. The error models are verified with a laboratory implementation of the processor, and are used to evaluate its potential performance.
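A banded matrix-vector product with injected per-multiply gain errors captures the flavor of such simplified error models. The sketch below is illustrative only; the tridiagonal example and the 1% Gaussian gain error are assumptions, not the paper's calibrated error sources.

```python
import numpy as np

def banded_matvec(bands, offsets, x, gain_error=0.0, rng=None):
    """y = A @ x for a banded A given as (band values, diagonal offsets),
    with an optional random gain error per multiply standing in for analog
    processor inaccuracy."""
    rng = rng or np.random.default_rng()
    n = x.size
    y = np.zeros(n)
    for band, k in zip(bands, offsets):
        for i in range(n):
            j = i + k
            if 0 <= j < n:
                eps = 1.0 + gain_error * rng.standard_normal()
                y[i] += band[i] * x[j] * eps
    return y

n = 64
x = np.random.rand(n)
bands = [np.full(n, 2.0), np.full(n, -1.0), np.full(n, -1.0)]  # tridiagonal
exact = banded_matvec(bands, [0, -1, 1], x)
noisy = banded_matvec(bands, [0, -1, 1], x, gain_error=0.01)
print("relative error:", np.linalg.norm(noisy - exact) / np.linalg.norm(exact))
```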
FPGA-Based, Self-Checking, Fault-Tolerant Computers
NASA Technical Reports Server (NTRS)
Some, Raphael; Rennels, David
2004-01-01
A proposed computer architecture would exploit the capabilities of commercially available field-programmable gate arrays (FPGAs) to enable computers to detect and recover from bit errors. The main purpose of the proposed architecture is to enable fault-tolerant computing in the presence of single-event upsets (SEUs). [An SEU is a spurious bit flip (also called a soft error) caused by a single impact of ionizing radiation.] The architecture would also enable recovery from some soft errors caused by electrical transients and, to some extent, from intermittent and permanent (hard) errors caused by aging of electronic components. A typical FPGA of the current generation contains one or more complete processor cores, memories, and high-speed serial input/output (I/O) channels, making it possible to shrink a board-level processor node to a single integrated-circuit chip. Custom, highly efficient microcontrollers, general-purpose computers, custom I/O processors, and signal processors can be rapidly and efficiently implemented by use of FPGAs. Unfortunately, FPGAs are susceptible to SEUs. Prior efforts to mitigate the effects of SEUs have yielded solutions that degrade performance of the system and require support from external hardware and software. In comparison with other fault-tolerant-computing architectures (e.g., triple modular redundancy), the proposed architecture could be implemented with less circuitry and lower power demand. Moreover, the fault-tolerant computing functions would require only minimal support from circuitry outside the central processing units (CPUs) of computers, would not require any software support, and would be largely transparent to software and to other computer hardware. There would be two types of modules: a self-checking processor module and a memory system (see figure). The self-checking processor module would be implemented on a single FPGA and would be capable of detecting its own internal errors. It would contain two CPUs executing identical programs in lock step, with comparison of their outputs to detect errors. It would also contain various cache and local memory circuits, communication circuits, and configurable special-purpose processors that would use self-checking checkers. (The basic principle of the self-checking checker method is to utilize logic circuitry that generates error signals whenever there is an error in either the checker or the circuit being checked.) The memory system would comprise a main memory and a hardware-controlled check-pointing system (CPS) based on a buffer memory denoted the recovery cache. The main memory would contain random-access memory (RAM) chips and FPGAs that would, in addition to everything else, implement double-error-detecting and single-error-correcting memory functions to enable recovery from single-bit errors.
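The two mechanisms, lock-step comparison for detection and recovery-cache checkpoints for rollback, can be illustrated behaviorally (a Python sketch of the control flow only; the real design is FPGA circuitry, and the fault model here is a single transient bit flip):

```python
# Conceptual sketch: two identical computations run in lock step; outputs are
# compared every step, and a mismatch triggers rollback to the last
# checkpoint (standing in for the recovery cache) and re-execution.
def run_lockstep(program_steps, inject_fault_at=None):
    state_a = state_b = 0
    checkpoint = (0, 0)                        # (step index, committed state)
    step = 0
    while step < len(program_steps):
        out_a = program_steps[step](state_a)
        out_b = program_steps[step](state_b)
        if inject_fault_at == step:
            out_a ^= 1                         # simulated single-event upset
            inject_fault_at = None             # transient: happens once
        if out_a != out_b:                     # self-checking comparison
            step, state_a = checkpoint         # roll back and re-execute
            state_b = state_a
            continue
        state_a = state_b = out_a
        checkpoint = (step + 1, state_a)       # commit a new checkpoint
        step += 1
    return state_a

steps = [lambda s: s + 3, lambda s: s * 2, lambda s: s - 1]
assert run_lockstep(steps) == run_lockstep(steps, inject_fault_at=1) == 5
```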
Orthorectification by Using Gpgpu Method
NASA Astrophysics Data System (ADS)
Sahin, H.; Kulur, S.
2012-07-01
Thanks to the nature of graphics processing, newly released products offer highly parallel processing units with high memory bandwidth and computational power of more than a teraflop per second. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors with very fast computing capabilities and high memory bandwidth compared to central processing units (CPUs). Data-parallel computation can be described briefly as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capability has attracted the attention of researchers dealing with complex problems that need high-volume calculation. This interest gave rise to the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". Graphics processors are powerful hardware that is cheap and affordable, so they have become an alternative to conventional processors: graphics chips, once fixed-function application hardware, have been transformed into modern, powerful, and programmable processors that meet general computing needs. The biggest obstacle is that graphics processing units use programming models unlike current programming methods; an efficient GPU program requires re-coding the algorithm with the limitations and structure of the graphics hardware in mind, and multi-core processors cannot be programmed using traditional event-procedure methods. GPUs are especially effective at repeating the same computing steps over many data elements when high accuracy is needed, making the computation faster and more accurate; compared to GPUs, CPUs, which perform one computation at a time under flow control, are slower. This study covers how general-purpose parallel programming and the computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm was coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language, and the results were compared with a traditional CPU implementation. In a second application, projective rectification was coded using the GPGPU method and CUDA; sample images of various sizes were processed and the results evaluated. The GPGPU method is especially useful for repeating the same computations on highly dense data, finding the solution quickly.
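For concreteness, a CPU-side sketch of the projective rectification step is shown below (Python/NumPy rather than the paper's CUDA, and not the authors' code): estimate a homography from four point pairs by DLT and warp by inverse mapping. On the GPU, the per-pixel loop is what maps naturally to one thread per output pixel.

```python
import numpy as np

def homography(src, dst):
    """3x3 H with dst ~ H @ src, from four (x, y) point pairs via DLT."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)               # null-space vector as H

def rectify(image, H, out_shape):
    """Inverse-map each output pixel into the source (nearest neighbour)."""
    h_inv = np.linalg.inv(H)
    out = np.zeros(out_shape, dtype=image.dtype)
    for v in range(out_shape[0]):
        for u in range(out_shape[1]):          # one GPU thread per pixel here
            x, y, w = h_inv @ (u, v, 1.0)
            xi, yi = int(round(x / w)), int(round(y / w))
            if 0 <= yi < image.shape[0] and 0 <= xi < image.shape[1]:
                out[v, u] = image[yi, xi]
    return out
```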
Design and evaluation of a miniature laser speckle imaging device to assess gingival health
NASA Astrophysics Data System (ADS)
Regan, Caitlin; White, Sean M.; Yang, Bruce Y.; Takesh, Thair; Ho, Jessica; Wink, Cherie; Wilder-Smith, Petra; Choi, Bernard
2016-10-01
Current methods used to assess gingivitis are qualitative and subjective. We hypothesized that gingival perfusion measurements could provide a quantitative metric of disease severity. We constructed a compact laser speckle imaging (LSI) system that could be mounted in custom-made oral molds. Rigid fixation of the LSI system in the oral cavity enabled measurement of blood flow in the gingiva. In vitro validation performed in controlled flow phantoms demonstrated that the compact LSI system had comparable accuracy and linearity compared to a conventional bench-top LSI setup. In vivo validation demonstrated that the compact LSI system was capable of measuring expected blood flow dynamics during a standard postocclusive reactive hyperemia and that the compact LSI system could be used to measure gingival blood flow repeatedly without significant variation in measured blood flow values (p<0.05). Finally, compact LSI system measurements were collected from the interdental papilla of nine subjects and compared to a clinical assessment of gingival bleeding on probing. A statistically significant correlation (ρ=0.53; p<0.005) was found between these variables, indicating that quantitative gingival perfusion measurements performed using our system may aid in the diagnosis and prognosis of periodontal disease.
Handheld, point-of-care laser speckle imaging
NASA Astrophysics Data System (ADS)
Farraro, Ryan; Fathi, Omid; Choi, Bernard
2016-09-01
Laser speckle imaging (LSI) enables measurement of relative changes in blood flow in biological tissues. We postulate that a point-of-care form factor will lower barriers to routine clinical use of LSI. Here, we describe a first-generation handheld LSI device based on a tablet computer. The coefficient of variation of speckle contrast was <2% after averaging imaging data collected over an acquisition period of 5.3 s. With a single, experienced user, handheld motion artifacts had a negligible effect on data collection. With operation by multiple users, we did not identify any significant difference (p>0.05) between the measured speckle contrast values using either a handheld or mounted configuration. In vivo data collected during occlusion experiments demonstrate that a handheld LSI is capable of both quantitative and qualitative assessment of changes in blood flow. Finally, as a practical application of handheld LSI, we collected data from a 53-day-old neonate with confirmed compromised blood flow in the hand. We readily identified with LSI a region of diminished blood flow in the thumb of the affected hand. Our data collectively suggest that handheld LSI is a promising technique to enable clinicians to obtain point-of-care measurements of blood flow.
Relational Learning via Collective Matrix Factorization
2008-06-01
...well-known example of such a schema is pLSI-pHITS [13], which models document-word counts and document-document citations: E1 = words and E2 = E3... Approaches to relational co-clustering include pLSI, pLSI-pHITS, the symmetric block models of Long et al. [23, 24, 25], and Bregman tensor clustering [5] (which can... Comparison to pLSI-pHITS: in this section we provide an example where the additional flexibility of collective matrix factorization leads to better results; and
Real-time lens distortion correction: speed, accuracy and efficiency
NASA Astrophysics Data System (ADS)
Bax, Michael R.; Shahidi, Ramin
2014-11-01
Optical lens systems suffer from nonlinear geometrical distortion. Optical imaging applications such as image-enhanced endoscopy and image-based bronchoscope tracking require correction of this distortion for accurate localization, tracking, registration, and measurement of image features. Real-time capability is desirable for interactive systems and live video. The use of a texture-mapping graphics accelerator, which is standard hardware on current motherboard chipsets and add-in video graphics cards, to perform distortion correction is proposed. Mesh generation for image tessellation, an error analysis, and performance results are presented. It is shown that distortion correction using commodity graphics hardware is substantially faster than using the main processor and can be performed at video frame rates (faster than 30 frames per second), and that the polar-based method of mesh generation proposed here is more accurate than a conventional grid-based approach. Using graphics hardware to perform distortion correction is not only fast and accurate but also efficient as it frees the main processor for other tasks, which is an important issue in some real-time applications.
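A sketch of the polar mesh idea under a single-coefficient radial distortion model follows (the model, coefficient, centre, and mesh density are assumptions; the paper's actual distortion model is not reproduced here): vertices are generated on rings about the distortion centre, and each corrected vertex is paired with its distorted source position for the texture-mapping hardware to interpolate between.

```python
import numpy as np

# Polar mesh generation sketch: pair corrected (undistorted) vertices with
# their distorted source positions under r_d = r_u * (1 + k1 * r_u^2).
def polar_mesh(center, r_max, k1, n_rings=16, n_spokes=32):
    cx, cy = center
    pairs = []                                  # (undistorted_xy, distorted_xy)
    for ring in range(1, n_rings + 1):
        r_u = r_max * ring / n_rings
        r_d = r_u * (1.0 + k1 * r_u * r_u)      # assumed radial model
        for spoke in range(n_spokes):
            a = 2.0 * np.pi * spoke / n_spokes
            pairs.append(((cx + r_u * np.cos(a), cy + r_u * np.sin(a)),
                          (cx + r_d * np.cos(a), cy + r_d * np.sin(a))))
    return pairs

mesh = polar_mesh(center=(320, 240), r_max=400, k1=-2.5e-7)
# Each pair maps a texture coordinate (distorted image) to a screen vertex
# (corrected image); the graphics hardware interpolates between vertices.
```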
Introduction on performance analysis and profiling methodologies for KVM on ARM virtualization
NASA Astrophysics Data System (ADS)
Motakis, Antonios; Spyridakis, Alexander; Raho, Daniel
2013-05-01
The introduction of hardware virtualization extensions on ARM Cortex-A15 processors has enabled the implementation of full virtualization solutions for this architecture, such as KVM on ARM. This trend motivates the need to quantify and understand the performance impact that emerges from the application of this technology. In this work we start looking into some interesting performance metrics of KVM on ARM processors, which can provide useful insight and may lead to potential improvements in the future. This includes measurements such as interrupt latency and guest exit cost, performed on ARM Versatile Express and Samsung Exynos 5250 hardware platforms. Furthermore, we discuss additional methodologies that can in the future provide a deeper understanding of the performance footprint of KVM. We identify some of the most interesting approaches in this field and perform a tentative analysis of how these may be implemented in the KVM on ARM port. These take into consideration hardware- and software-based counters for profiling, and issues related to the limitations of the simulators that are often used, such as the ARM Fast Models platform.
Space Telecommunications Radio System Software Architecture Concepts and Analysis
NASA Technical Reports Server (NTRS)
Handler, Louis M.; Hall, Charles S.; Briones, Janette C.; Blaser, Tammy M.
2008-01-01
The Space Telecommunications Radio System (STRS) project investigated various Software Defined Radio (SDR) architectures for space. An STRS architecture has been selected that separates the STRS operating environment from its various waveforms and also abstracts any specialized hardware to limit its effect on the operating environment. The design supports software evolution, whereby new functionality is incorporated into the radio. Radio functionality has been moving from hardware-based ASICs into firmware- and software-based processors such as FPGAs, DSPs, and General Purpose Processors (GPPs). Use cases capture the requirements of a system by describing how the system should interact with the users or other systems (the actors) to achieve a specific goal. The Unified Modeling Language (UML) is used to illustrate the use cases in a variety of ways. The top-level use case diagram shows groupings of the use cases and how the actors are involved. The state diagrams depict the various states that a system or object may be in and the transitions between those states. The sequence diagrams show the main flow of activity as described in the use cases.
The implementation and use of Ada on distributed systems with high reliability requirements
NASA Technical Reports Server (NTRS)
Knight, J. C.; Gregory, S. T.; Urquhart, J. I. A.
1984-01-01
The use and implementation of Ada (a trademark of the US Dept. of Defense) in distributed environments in which the hardware is assumed to be unreliable were investigated. The possibility of programming a distributed system entirely in Ada, so that the individual tasks of the system are unconcerned with which processors they are executing on, was examined, along with the handling of failures occurring in the underlying hardware.
Status of the Regenerative ECLSS Water Recovery System
NASA Technical Reports Server (NTRS)
Carter, Donald Layne
2009-01-01
NASA has completed the delivery of the regenerative Water Recovery System (WRS) for the International Space Station (ISS). The major assemblies included in this system are the Water Processor Assembly (WPA) and Urine Processor Assembly (UPA). This paper summarizes the final effort to deliver the hardware to the Kennedy Space Center for launch on STS-126, the on-orbit status as of April 2009, and describes some of the technical challenges encountered and lessons learned over the past year.
2014-10-01
Table 19: Raspberry Pi Information... boards – these are single-board devices targeted at education and embedding, the best known being the Raspberry Pi; and 3. Development boards – these... popular, as it has a high-performance processor (perhaps 4 times the power of a Raspberry Pi) with dual-core processors running at 1.6 GHz, and the cost is
High-performance reconfigurable hardware architecture for restricted Boltzmann machines.
Ly, Daniel Le; Chow, Paul
2010-11-01
Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications has been limited. A primary cause for this lack of adoption is that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can exploit the inherent parallelism in neural networks is desired. This paper investigates how the restricted Boltzmann machine (RBM), which is a popular type of neural network, can be mapped to a high-performance hardware architecture on field-programmable gate array (FPGA) platforms. The proposed modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. A method to partition large RBMs into smaller congruent components is also presented, allowing the distribution of one RBM across multiple FPGA resources. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100 MHz through a variety of different configurations. The maximum performance was obtained by instantiating an RBM of 256 × 256 nodes distributed across four FPGAs, which resulted in a computational speed of 3.13 billion connection-updates-per-second and a speedup of 145-fold over an optimized C program running on a 2.8-GHz Intel processor.
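As a software reference for the computation the FPGA engines accelerate, here is one contrastive-divergence (CD-1) step on a binary 256×256 RBM in Python (biases omitted; the learning rate and initialization are assumptions). Each such step touches 256×256 = 65,536 connections, which is the kind of "connection-update" counted in the paper's 3.13 billion-per-second figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vis = n_hid = 256
W = 0.01 * rng.standard_normal((n_vis, n_hid))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, lr=0.1):
    """One CD-1 weight update for a binary RBM (biases omitted for brevity)."""
    h0_p = sigmoid(v0 @ W)                           # positive phase
    h0 = (rng.random(h0_p.shape) < h0_p).astype(float)
    v1_p = sigmoid(h0 @ W.T)                         # reconstruction
    h1_p = sigmoid(v1_p @ W)                         # negative phase
    W += lr * (np.outer(v0, h0_p) - np.outer(v1_p, h1_p))
    return W

v0 = (rng.random(n_vis) < 0.5).astype(float)         # toy binary visible vector
W = cd1_update(v0, W)
```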
Environmental Control and Life Support Integration Strategy for 6-Crew Operations
NASA Technical Reports Server (NTRS)
Duchesne, Stephanie M.
2009-01-01
The International Space Station (ISS) crew complement has increased in size from 3 to 6 crew members. In order to support this increase in crew on ISS, the United States on-orbit Segment (USOS) has been outfitted with a suite of regenerative Environmental Control and Life Support (ECLS) hardware including an Oxygen Generation System (OGS), Waste and Hygiene Compartment (WHC), and a Water Recovery System (WRS). The WRS includes the Urine Processor Assembly (UPA) and the Water Processor Assembly (WPA). With this additional life support hardware, the ISS has achieved full redundancy in its on-orbit life support system between the USOS and Russian Segment (RS). The additional redundancy created by the regenerative ECLS hardware creates the opportunity for independent support capabilities between segments, and for the first time since the start of ISS, the necessity to revise Life Support strategy agreements. Independent operating strategies coupled with the loss of the Space Shuttle supply and return capabilities in 2010 offer new and unique challenges. This paper will discuss the evolution of the ISS Life Support hardware strategy in support of 6-crew operations on ISS, as well as the continued work that is necessary to ensure the support of crew and ISS Program objectives through the life of the station.
Environmental Control and Life Support Integration Strategy for 6-Crew Operations
NASA Technical Reports Server (NTRS)
Duchesne, Stephanie M.; Tressler, Chad H.
2010-01-01
The International Space Station (ISS) crew complement has increased in size from 3 to 6 crew members. In order to support this increase in crew on ISS, the United States on-orbit Segment (USOS) has been outfitted with a suite of regenerative Environmental Control and Life Support (ECLS) hardware including an Oxygen Generation System (OGS), Waste and Hygiene Compartment (WHC), and a Water Recovery System (WRS). The WRS includes the Urine Processor Assembly (UPA) and the Water Processor Assembly (WPA). With this additional life support hardware, the ISS has achieved full redundancy in its on-orbit life support system between the USOS and Russian Segment (RS). The additional redundancy created by the regenerative ECLS hardware creates the opportunity for independent support capabilities between segments, and for the first time since the start of ISS, the necessity to revise Life Support strategy agreements. Independent operating strategies coupled with the loss of the Space Shuttle supply and return capabilities in 2010 offer new and unique challenges. This paper will discuss the evolution of the ISS Life Support hardware strategy in support of 6-crew operations on ISS, as well as the continued work that is necessary to ensure the support of crew and ISS Program objectives through the life of the station.
A Parallel Rendering Algorithm for MIMD Architectures
NASA Technical Reports Server (NTRS)
Crockett, Thomas W.; Orloff, Tobias
1991-01-01
Applications such as animation and scientific visualization demand high performance rendering of complex three dimensional scenes. To deliver the necessary rendering rates, highly parallel hardware architectures are required. The challenge is then to design algorithms and software which effectively use the hardware parallelism. A rendering algorithm targeted to distributed memory MIMD architectures is described. For maximum performance, the algorithm exploits both object-level and pixel-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. Its performance for large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 shows increasing performance from 1 to 128 processors across a wide range of scene complexities. It is shown that minimal modifications to the algorithm will adapt it for use on shared memory architectures as well.
Advanced Optical Technologies for Defense Trauma and Critical Care
2014-02-04
conventional LSI. We also demonstrated that mcLSI enables improved characterization of curved surfaces of the body by positioning LSI modules at... Implementation of an LED-based clinical spatial frequency domain imaging system. Proc. SPIE Vol. 8254, Emerging Digital Micromirror Device Based
Hot Chips and Hot Interconnects for High End Computing Systems
NASA Technical Reports Server (NTRS)
Saini, Subhash
2005-01-01
I will discuss several processors: 1. the Cray proprietary processor used in the Cray X1; 2. the IBM Power 3 and Power 4 used in IBM SP 3 and IBM SP 4 systems; 3. the Intel Itanium and Xeon, used in the SGI Altix systems and clusters, respectively; 4. the IBM System-on-a-Chip used in the IBM BlueGene/L; 5. the HP Alpha EV68 processor used in the DOE ASCI Q cluster; 6. the SPARC64 V processor, which is used in the Fujitsu PRIMEPOWER HPC2500; 7. an NEC proprietary processor, which is used in the NEC SX-6/7; 8. the Power 4+ processor, which is used in the Hitachi SR11000; and 9. an NEC proprietary processor, which is used in the Earth Simulator. The IBM POWER5 and Red Storm computing systems will also be discussed. The architectures of these processors will first be presented, followed by interconnection networks and a description of high-end computer systems based on these processors and networks. The performance of various hardware/programming model combinations will then be compared, based on the latest NAS Parallel Benchmark results (MPI, OpenMP/HPF, and hybrid MPI + OpenMP). The tutorial will conclude with a discussion of general trends in the field of high-performance computing (quantum computing, DNA computing, cellular engineering, and neural networks).
The Chimera II Real-Time Operating System for advanced sensor-based control applications
NASA Technical Reports Server (NTRS)
Stewart, David B.; Schmitz, Donald E.; Khosla, Pradeep K.
1992-01-01
Attention is given to the Chimera II Real-Time Operating System, which has been developed for advanced sensor-based control applications. Chimera II provides a high-performance real-time kernel and a variety of IPC features. The hardware platform required to run Chimera II consists of commercially available hardware and allows custom hardware to be easily integrated. The design allows it to be used with almost any type of VMEbus-based processors and devices. It allows radically differing hardware to be programmed using a common system, thus providing a first and necessary step towards the standardization of reconfigurable systems, which results in a reduction of development time and cost.
NASA Technical Reports Server (NTRS)
Psiaki, Mark L. (Inventor); Kintner, Jr., Paul M. (Inventor); Ledvina, Brent M. (Inventor); Powell, Steven P. (Inventor)
2007-01-01
A real-time software receiver that executes on a general purpose processor. The software receiver includes data acquisition and correlator modules that perform, in place of hardware correlation, baseband mixing and PRN code correlation using bit-wise parallelism.
NASA Technical Reports Server (NTRS)
Psiaki, Mark L. (Inventor); Ledvina, Brent M. (Inventor); Powell, Steven P. (Inventor); Kintner, Jr., Paul M. (Inventor)
2006-01-01
A real-time software receiver that executes on a general purpose processor. The software receiver includes data acquisition and correlator modules that perform, in place of hardware correlation, baseband mixing and PRN code correlation using bit-wise parallelism.
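The bit-wise parallelism named in both patents can be illustrated with packed sign bits: one XOR plus a population count evaluates many ±1 multiply-accumulates at once. The sketch below (Python, illustrative; the word width and toy data are assumptions, not the patents' implementation) shows the idea for 8 samples.

```python
# Bit-wise parallel correlation sketch: pack the sign bits of one-bit samples
# and PRN code chips into machine words; matching signs contribute +1 to the
# correlation, differing signs contribute -1.
def correlate_words(sample_word: int, code_word: int, nbits: int = 64) -> int:
    disagreements = bin((sample_word ^ code_word) & ((1 << nbits) - 1)).count("1")
    return nbits - 2 * disagreements       # sum of the +/-1 products

# 8-bit toy example: samples +,+,-,+,-,-,+,+  and code +,+,+,+,-,-,-,+
samples = 0b00101100                       # a 1 bit marks a negative sample
code    = 0b00001110
print(correlate_words(samples, code, nbits=8))   # -> 4
```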
Phosphinosilylenes as a novel ligand system for heterobimetallic complexes.
Breit, Nora C; Eisenhut, Carsten; Inoue, Shigeyoshi
2016-04-25
A dihydrophosphinosilylene iron complex [LSi{Fe(CO)4}PH2] has been prepared and utilized in the synthesis of novel heterobimetallic complexes. The phosphine moiety in this phosphinosilylene complex allows coordination towards tungsten leading to the iron-tungsten heterobimetallic complex [LSi{Fe(CO)4}PH2{W(CO)5}]. In contrast, the reaction of [LSi{Fe(CO)4}PH2] with ethylenebis(triphenylphosphine)platinum(0) results in the formation of the iron-platinum heterobimetallic complex [LSi{Fe(CO)4}PH{PtH(PPh3)2}] via oxidative addition.
Hecht, Nils; Woitzik, Johannes; König, Susanne; Horn, Peter; Vajkoczy, Peter
2013-07-01
Currently, there is no adequate technique for intraoperative monitoring of cerebral blood flow (CBF). To evaluate laser speckle imaging (LSI) for assessment of relative CBF, LSI was performed in 30 patients who underwent direct surgical revascularization for treatment of arteriosclerotic cerebrovascular disease (ACVD), Moyamoya disease (MMD), or giant aneurysms, and in 8 control patients who underwent intracranial surgery for reasons other than hemodynamic compromise. The applicability and sensitivity of LSI was investigated through baseline perfusion and CO2 reactivity testing. The dynamics of LSI were assessed during bypass test occlusion and flow initiation procedures. Laser speckle imaging permitted robust (pseudo-) quantitative assessment of relative microcirculatory flow and standard bypass grafting resulted in significantly higher postoperative baseline perfusion values in ACVD and MMD. The applicability and sensitivity of LSI was shown by a significantly reduced CO2 reactivity in ACVD (9.6±9%) and MMD (8.5±8%) compared with control (31.2±5%; P<0.0001). In high- and intermediate-flow bypass patients, LSI was characterized by a dynamic real-time response to acute perfusion changes and ultimately confirmed a sufficient flow substitution through the bypass graft. Thus, LSI can be used for sensitive and continuous, non-invasive real-time visualization and measurement of relative cortical CBF in excellent spatial-temporal resolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wickstrom, Gregory Lloyd; Gale, Jason Carl; Ma, Kwok Kee
The Sandia Secure Processor (SSP) is a new native Java processor that has been specifically designed for embedded applications. The SSP's design is a system composed of a core Java processor that directly executes Java bytecodes, on-chip intelligent IO modules, and a suite of software tools for simulation and compiling executable binary files. The SSP is unique in that it provides a way to control real-time IO modules for embedded applications. The system software for the SSP is a 'class loader' that takes Java .class files (created with your favorite Java compiler), links them together, and compiles a binary. The complete SSP system provides very powerful functionality with very light hardware requirements, with the potential to be used in a wide variety of small-system embedded applications. This paper gives a detailed description of the Sandia Secure Processor and its unique features.
1993-01-01
Deoxyribose nucleic acid; DPP: Digital Post-Processor; DREO: Defence Research Establishment Ottawa; RF: Radio Frequency; TeO2: tellurium dioxide; TIC: Time... TeO2 is 620 m/s; a device with a 100-μs aperture is 62 mm long. To take advantage of the full interaction time of these Bragg cells, the whole... INCLUDED IN THE DIGITAL POST-PROCESSOR HARDWARE. Characteristics of the Bragg cells: glass (bulk interaction), bandwidth 100 MHz, center frequency 150 MHz; LiNbO3
Stand-alone development system using a KIM-1 microcomputer module
NASA Technical Reports Server (NTRS)
Nickum, J. D.
1978-01-01
A small microprocessor-based system is described that was designed to contain all or most of the interface hardware, to make that hardware easy to access and modify, to be compact enough to be strapped to the seat of a small general aviation aircraft, and to be independent of the aircraft power system. The system is used to develop a low-cost Loran C sensor processor, but is designed so that the Loran interface boards may be removed and other hardware interfaces inserted into the same connectors. This flexibility is achieved by memory-mapping the interfaces into the microprocessor's address space.
Cache Hardware Approaches to Multiple Independent Levels of Security (MILS)
2012-10-01
...systems that require that several multicore processors be connected together in a single system. However, no such boards were available on the market... available concerning each module. However, the availability of modules seems to significantly lag the time when the corresponding hardware hits the market... a version of real mode, often referred to as 'Unreal mode', can be entered by loading a Local Descriptor Table (LDT) and Global Descriptor Table (GDT)...
NASA Technical Reports Server (NTRS)
Thakoor, Anil
1990-01-01
Viewgraphs on electronic neural networks for space station are presented. Topics covered include: electronic neural networks; electronic implementations; VLSI/thin film hybrid hardware for neurocomputing; computations with analog parallel processing; features of neuroprocessors; applications of neuroprocessors; neural network hardware for terrain trafficability determination; a dedicated processor for path planning; neural network system interface; neural network for robotic control; error backpropagation algorithm for learning; resource allocation matrix; global optimization neuroprocessor; and electrically programmable read only thin-film synaptic array.
Architecture of security management unit for safe hosting of multiple agents
NASA Astrophysics Data System (ADS)
Gilmont, Tanguy; Legat, Jean-Didier; Quisquater, Jean-Jacques
1999-04-01
In such growing areas as remote applications in large public networks, electronic commerce, digital signatures, intellectual property and copyright protection, and even operating system extensibility, the hardware security level offered by existing processors is insufficient. They lack protection mechanisms that prevent the user from tampering with critical data owned by those applications. Some devices are exceptions, but have neither enough processing power nor enough memory to stand up to such applications (e.g., smart cards). This paper proposes a secure processor architecture in which the classical memory management unit is extended into a new security management unit. It allows ciphered code execution and ciphered data processing. An internal permanent memory can store cipher keys and critical data for several client agents simultaneously. The ordinary supervisor privilege scheme is replaced by a privilege inheritance mechanism that is better suited to operating system extensibility. The result is a secure processor that has hardware support for extensible multitask operating systems, and can be used for both general applications and critical applications needing strong protection. The security management unit and the internal permanent memory can be added to an existing CPU core without loss of performance, and do not require the core to be modified.
NASA Astrophysics Data System (ADS)
Prengaman, R. J.; Thurber, R. E.; Bath, W. G.
The usefulness of radar systems depends on the ability to distinguish between signals returned from desired targets and noise. A retrospective processor uses all contacts (or 'plots') from several past radar scans, taking into account all possible target trajectories formed from stored contacts for each input detection. The processor eliminates many false alarms while retaining those contacts that describe reasonable trajectories. Employing a retrospective processor therefore makes it possible to obtain large improvements in detection sensitivity in certain important clutter environments. Attention is given to the retrospective processing concept, a theoretical analysis of the multiscan detection process, the experimental evaluation of the retrospective data filter, and aspects of retrospective data filter hardware implementation.
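The multiscan gating idea lends itself to a compact illustration. Below is a minimal sketch in Python of a retrospective test, assuming hypothetical contact data and a simple constant-velocity plausibility check; the actual filter described above is more elaborate:

```python
import itertools

# Each contact is (scan_index, x, y). A new detection is confirmed only if
# some pair of stored contacts from past scans forms, together with it, a
# nearly constant-velocity trajectory.
V_MAX = 300.0   # maximum plausible speed in metres per scan (assumed value)
TOL = 50.0      # allowed deviation (m) from the constant-velocity prediction

def plausible(track):
    """Test three time-ordered contacts for near constant-velocity motion."""
    (s0, x0, y0), (s1, x1, y1), (s2, x2, y2) = track
    if not s0 < s1 < s2:
        return False
    vx, vy = (x1 - x0) / (s1 - s0), (y1 - y0) / (s1 - s0)  # per-scan velocity
    if (vx * vx + vy * vy) ** 0.5 > V_MAX:
        return False
    px, py = x1 + vx * (s2 - s1), y1 + vy * (s2 - s1)      # predicted position
    return ((px - x2) ** 2 + (py - y2) ** 2) ** 0.5 <= TOL

def confirm(detection, stored):
    """Retrospective test of a new detection against all stored contacts."""
    return any(plausible((a, b, detection))
               for a, b in itertools.combinations(sorted(stored), 2))

stored = [(1, 0.0, 0.0), (2, 100.0, 0.0), (1, 5000.0, 9000.0)]
print(confirm((3, 200.0, 0.0), stored))    # True: a reasonable trajectory
print(confirm((3, 400.0, 900.0), stored))  # False: rejected as a false alarm
```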
Cancio, Leopoldo C.; Batchinsky, Andriy I.; Baker, William L.; Necsoiu, Corina; Salinas, José; Goldberger, Ary L.; Costa, Madalena D.
2013-01-01
Purpose We found that heart-rate (HR) complexity metrics, such as sample entropy (SampEn), identified trauma patients receiving lifesaving interventions (LSIs). We now aimed: 1) to test a new multiscale entropy (MSE) index; 2) to compare it to single-scale measures including SampEn; and 3) to assess different parameter values for calculation of SampEn and MSE. Methods This was a study of combat casualties in an Emergency Department (ED) in Iraq. ECGs of 70 acutely injured adults were recorded. Twelve underwent LSIs and 58 did not. LSIs included endotracheal intubation (9); tube thoracostomy (9); and emergency transfusion (4). From each ECG, a segment of 800 consecutive beats was selected. Off-line, R waves were detected and R-to-R (RR) interval time series were generated. SampEn, MSE, and time-domain measures of HR variability (mean HR, standard deviation, pNN20, pNN50, rMSSD) were computed. Results Differences in mean HR (LSI=111±33, NonLSI=90±17) were not significant. Systolic arterial pressure was statistically but not clinically different (LSI=123±19, NonLSI=135±19). SampEn (LSI=0.90±0.42, NonLSI=1.19±0.35, p<0.05) and MSE index (LSI = 2.58±2.55, NonLSI=5.67±2.48, p<0.001) differed significantly. Conclusions Complexity of HR dynamics over a range of time scales was lower in high-risk than in low-risk combat casualties and outperformed traditional vital signs. PMID:24140167
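Sample entropy is the core regularity statistic used above. The following is a minimal sketch of its computation on an RR-interval series, assuming the common parameter choices m = 2 and r = 0.2·SD (the study compared several parameter values, which are not reproduced here); the counting below is a slightly simplified variant of the canonical definition:

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.2):
    """SampEn(m, r) = -ln(A/B), where B counts pairs of length-m templates
    matching within tolerance r (Chebyshev distance) and A counts
    length-(m+1) matches. Self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    r = r_frac * x.std()
    n = len(x)

    def count_matches(length):
        templates = np.array([x[i:i + length] for i in range(n - length + 1)])
        count = 0
        for i in range(len(templates) - 1):
            # Distance between template i and all later templates at once.
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("inf")

rng = np.random.default_rng(0)
rr = 0.8 + 0.05 * rng.standard_normal(800)   # synthetic RR intervals (s)
print(f"SampEn = {sample_entropy(rr):.3f}")  # lower values = more regular
```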
LSI-based amperometric sensor for bio-imaging and multi-point biosensing.
Inoue, Kumi Y; Matsudaira, Masahki; Kubo, Reyushi; Nakano, Masanori; Yoshida, Shinya; Matsuzaki, Sakae; Suda, Atsushi; Kunikata, Ryota; Kimura, Tatsuo; Tsurumi, Ryota; Shioya, Toshihito; Ino, Kosuke; Shiku, Hitoshi; Satoh, Shiro; Esashi, Masayoshi; Matsue, Tomokazu
2012-09-21
We have developed an LSI-based amperometric sensor called "Bio-LSI" with 400 measurement points as a platform for electrochemical bio-imaging and multi-point biosensing. The system is comprised of a 10.4 mm × 10.4 mm CMOS sensor chip with 20 × 20 unit cells, an external circuit box, a control unit for data acquisition, and a DC power box. Each unit cell of the chip contains an operational amplifier with a switched-capacitor type I-V converter for in-pixel signal amplification. We successfully realized a wide dynamic range from ±1 pA to ±100 nA with a well-organized circuit design and operating software. In particular, in-pixel signal amplification and an original program to control the signal read-out contribute to the lower detection limit and wide detection range of Bio-LSI. The spatial resolution is 250 μm and the temporal resolution is 18-125 ms/400 points, depending on the desired current detection range. The coefficient of variation of the current for 400 points is within 5%. We also demonstrated the real-time imaging of a biological molecule using Bio-LSI. The LSI coated with an Os-HRP film was successfully applied to the monitoring of changes in hydrogen peroxide concentration in a flow. The Os-HRP-coated LSI was spotted with glucose oxidase (GOx) and used for bioelectrochemical imaging of the GOx-catalyzed oxidation of glucose. Bio-LSI is a promising platform for a wide range of analytical fields, including diagnostics, environmental measurements and basic biochemistry.
Stanford Hardware Development Program
NASA Technical Reports Server (NTRS)
Peterson, A.; Linscott, I.; Burr, J.
1986-01-01
Architectures for high-performance digital signal processing, particularly for high-resolution, wide-band spectrum analysis, were developed. These developments are intended to provide instrumentation for NASA's Search for Extraterrestrial Intelligence (SETI) program. The work on real-time signal processing is both formal and experimental. The efficient organization and optimal scheduling of signal processing algorithms were investigated. The work is complemented by efforts in processor architecture design and implementation. A high-resolution, multichannel spectrometer that incorporates special-purpose microcoded signal processors is being tested. A general-purpose signal processor for the data from the multichannel spectrometer was designed to function as the processing element in a highly concurrent machine. The processor performance required for the spectrometer is in the range of 1000 to 10,000 million instructions per second (MIPS). Multiple-node processor configurations, where each node performs at 100 MIPS, are sought. The nodes are microprogrammable and are interconnected through a network with high bandwidth for neighboring nodes and medium bandwidth for nodes at larger distances. The implementation of both the current multichannel spectrometer and the signal processor as Very Large Scale Integration CMOS chip sets was commenced.
Analog Ranging Modem Code Processor and Generator
DOT National Transportation Integrated Search
1974-05-01
The report details technical development efforts to implement an analog ranging modem using recently developed linear integrated circuits where possible. The breadboard hardware is capable of acquiring frequency and phase of a weak signal in a high n...
Multicore Challenges and Benefits for High Performance Scientific Computing
Nielsen, Ida M. B.; Janssen, Curtis L.
2008-01-01
Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction-level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed-memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
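As a concrete illustration of the hybrid message-passing/multi-threading model, the sketch below distributes block rows of a matrix product across MPI ranks and uses threads within each rank. It assumes mpi4py and NumPy and is only an analogue of the distributed-memory matrix multiply mentioned above, not the authors' code:

```python
# Run with e.g.: mpiexec -n 4 python hybrid_matmul.py  (requires mpi4py, numpy)
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N, THREADS = 1024, 4
rows = N // size                            # rows of A owned by this rank

a_local = np.random.default_rng(rank).standard_normal((rows, N))
b = np.empty((N, N))
if rank == 0:
    b[:] = np.random.default_rng(99).standard_normal((N, N))
comm.Bcast(b, root=0)                       # message passing: replicate B

def col_block(j0, j1):
    # One thread per column slab; np.dot releases the GIL, so threads overlap.
    return j0, a_local @ b[:, j0:j1]

step = N // THREADS
bounds = [(j, j + step) for j in range(0, N, step)]
c_local = np.empty((rows, N))
with ThreadPoolExecutor(THREADS) as pool:   # multi-threading within the rank
    for j0, block in pool.map(col_block, *zip(*bounds)):
        c_local[:, j0:j0 + block.shape[1]] = block

c_rows = comm.gather(c_local, root=0)       # message passing: collect rows
if rank == 0:
    print("assembled C with shape", np.vstack(c_rows).shape)
```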
Performance Comparison of Mainframe, Workstations, Clusters, and Desktop Computers
NASA Technical Reports Server (NTRS)
Farley, Douglas L.
2005-01-01
A performance evaluation of a variety of computers frequently found in a scientific or engineering research environment was conducted using synthetic and application program benchmarks. From a performance perspective, emerging commodity processors have superior performance relative to legacy mainframe computers. In many cases, the PC clusters exhibited performance comparable to traditional mainframe hardware when 8-12 processors were used. The main advantage of the PC clusters was related to their cost. Regardless of whether the clusters were built from new computers or created from retired computers, their performance-to-cost ratio was superior to the legacy mainframe computers. Finally, the typical annual maintenance cost of legacy mainframe computers is several times the cost of new equipment such as multiprocessor PC workstations. The savings from eliminating the annual maintenance fee on legacy hardware can result in a yearly increase in total computational capability for an organization.
Design and realization of the baseband processor in satellite navigation and positioning receiver
NASA Astrophysics Data System (ADS)
Zhang, Dawei; Hu, Xiulin; Li, Chen
2007-11-01
This paper is focused on the design and realization of the baseband processor in a satellite navigation and positioning receiver. The baseband processor is the most important part of the satellite positioning receiver. The design covers the baseband processor's main functions, which include multi-channel digital down-conversion (DDC), acquisition, code tracking, carrier tracking, and demodulation. The realization is based on an Altera FPGA device, so the system can be improved and upgraded without modifying the hardware. It embodies the theory of software-defined radio (SDR) and puts spread-spectrum theory into practice. This paper puts emphasis on the realization of the baseband processor in the FPGA. The design flow is presented in detail, in the order of choosing chips, design entry, debugging, and synthesis. Additionally, the paper details the realization of the digital PLL in order to explain a method of reducing FPGA resource consumption. Finally, the paper presents the synthesis results. This design has been used in BD-1, BD-2 and GPS.
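A behavioral sketch of the carrier-tracking function mentioned above is shown below: an NCO-based digital PLL with a proportional-integral loop filter. The loop gains and signal parameters are assumed for illustration and do not come from the paper's FPGA design:

```python
import numpy as np

fs = 1.0e6                  # sample rate (Hz)
f_carrier = 12_345.0        # received carrier frequency (Hz)
f_nco = 12_000.0            # NCO starts with a 345 Hz frequency error
kp, ki = 0.05, 0.002        # proportional / integral loop gains (assumed)

n = np.arange(20_000)
rx = np.exp(1j * 2 * np.pi * f_carrier * n / fs)   # noiseless input carrier

phase, integ = 0.0, 0.0
f_log = []
for sample in rx:
    err = np.angle(sample * np.exp(-1j * phase))   # phase detector
    integ += ki * err                              # integral branch: frequency
    f_inst = f_nco + (integ + kp * err) * fs / (2 * np.pi)
    phase += 2 * np.pi * f_inst / fs               # NCO phase accumulator
    f_log.append(f_inst)

print(f"NCO frequency after lock: {np.mean(f_log[-1000:]):.1f} Hz")
```

In hardware the phase accumulator is simply a wrapping fixed-point register, which is one reason the structure maps well onto an FPGA.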
NASA Astrophysics Data System (ADS)
Leggett, C.; Binet, S.; Jackson, K.; Levinthal, D.; Tatarkhanov, M.; Yao, Y.
2011-12-01
Thermal limitations have forced CPU manufacturers to shift from simply increasing clock speeds to improve processor performance, to producing chip designs with multi- and many-core architectures. Furthermore, the cores themselves can run multiple threads with a near-zero-overhead context switch, allowing low-level resource sharing (Intel Hyperthreading). To maximize bandwidth and minimize memory latency, memory access has become non-uniform (NUMA). As manufacturers add more cores to each chip, a careful understanding of the underlying architecture is required in order to fully utilize the available resources. We present AthenaMP and the Atlas event loop manager, the driver of the simulation and reconstruction engines, which have been rewritten to make use of multiple cores by means of event-based parallelism and final-stage I/O synchronization. However, initial studies on 8 and 16 core Intel architectures have shown marked non-linearities as parallel process counts increase, with as much as 30% reductions in event throughput in some scenarios. Since the Intel Nehalem architecture (both Gainestown and Westmere) will be the most common choice for the next round of hardware procurements, an understanding of these scaling issues is essential. Using hardware-based event counters and Intel's Performance Tuning Utility, we have studied the performance bottlenecks at the hardware level and discovered optimization schemes to maximize processor throughput. We have also produced optimization mechanisms, common to all large experiments, that address the extreme nature of today's HEP code, which, due to its size, places huge burdens on the memory infrastructure of today's processors.
NASA Astrophysics Data System (ADS)
Blume, H.; Alexandru, R.; Applegate, R.; Giordano, T.; Kamiya, K.; Kresina, R.
1986-06-01
In a digital diagnostic imaging department, the majority of operations for handling and processing of images can be grouped into a small set of basic operations, such as image data buffering and storage, image processing and analysis, image display, image data transmission and image data compression. These operations occur in almost all nodes of the diagnostic imaging communications network of the department. An image processor architecture was developed in which each of these functions has been mapped into hardware and software modules. The modular approach has advantages in terms of economics, service, expandability and upgradeability. The architectural design is based on the principles of hierarchical functionality and distributed and parallel processing, and aims at real-time response. Parallel processing and real-time response are facilitated in part by a dual bus system: a VME control bus and a high-speed image data bus consisting of 8 independent parallel 16-bit busses, capable of a combined throughput of up to 144 MBytes/sec. The presented image processor is versatile enough to meet the video-rate processing needs of digital subtraction angiography, the large pixel matrix processing requirements of static projection radiography, or the broad range of manipulation and display needs of a multi-modality diagnostic workstation. Several hardware modules are described in detail. To illustrate the capabilities of the image processor, processed 2000 x 2000 pixel computed radiographs are shown and estimated computation times for executing the processing operations are presented.
CoNNeCT Baseband Processor Module Boot Code SoftWare (BCSW)
NASA Technical Reports Server (NTRS)
Yamamoto, Clifford K.; Orozco, David S.; Byrne, D. J.; Allen, Steven J.; Sahasrabudhe, Adit; Lang, Minh
2012-01-01
This software provides essential startup and initialization routines for the CoNNeCT baseband processor module (BPM) hardware upon power-up. A command and data handling (C&DH) interface is provided via 1553 and diagnostic serial interfaces to invoke operational, reconfiguration, and test commands within the code. The BCSW has features unique to the hardware it is responsible for managing. In this case, the CoNNeCT BPM is configured with an updated CPU (Atmel AT697 SPARC processor) and a unique set of memory and I/O peripherals that require customized software to operate. These features include configuration of new AT697 registers, interfacing to a new HouseKeeper with a flash controller interface, a new dual Xilinx configuration/scrub interface, and an updated 1553 remote terminal (RT) core. The BCSW is intended to provide a "safe" mode for the BPM when initially powered on or when an unexpected trap occurs, causing the processor to reset. The BCSW allows the 1553 bus controller in the spacecraft or payload controller to operate the BPM over 1553 to upload code; upload Xilinx bit files; perform rudimentary tests; read, write, and copy the non-volatile flash memory; and configure the Xilinx interface. Commands also exist over 1553 to cause the CPU to jump or call a specified address to begin execution of user-supplied code. This may be in the form of a real-time operating system, test routine, or specific application code to run on the BPM.
Implementation of 4-way Superscalar Hash MIPS Processor Using FPGA
NASA Astrophysics Data System (ADS)
Sahib Omran, Safaa; Fouad Jumma, Laith
2018-05-01
Due to rapid advancements in personal communications systems and wireless communications, providing data security has become an increasingly essential subject. Security becomes a more complicated matter when next-generation system requirements and real-time computation speed are considered. Hash functions are among the most essential cryptographic primitives and are utilized in many fields for signature authentication and communication integrity. These functions are used to obtain a fixed-size unique fingerprint, or hash value, from a message of arbitrary length. In this paper, Secure Hash Algorithms (SHA) of types SHA-1, SHA-2 (SHA-224, SHA-256) and SHA-3 (BLAKE) are implemented on a Field-Programmable Gate Array (FPGA) in a processor structure. The design is described and implemented using a hardware description language, namely VHSIC "Very High Speed Integrated Circuit" Hardware Description Language (VHDL). Since the logical operations of these hash types (SHA-1, SHA-224, SHA-256 and SHA-3) are 32 bits wide, a superscalar Hash Microprocessor without Interlocked Pipelines (MIPS) processor was designed with only the few instructions required to invoke the desired hash algorithms. When the four types of hash algorithm are executed sequentially using the designed processor, the total time required is approximately 342 us, with a throughput of 4.8 Mbps, while the time required to execute the same four hash algorithms using the designed four-way superscalar processor is reduced to 237 us, improving the throughput to 5.1 Mbps.
Development of small scale cluster computer for numerical analysis
NASA Astrophysics Data System (ADS)
Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.
2017-09-01
In this study, two personal computers were successfully networked together to form a small-scale cluster. Each of the computers involved has a multicore processor with four cores, giving the cluster eight processor cores. The cluster runs the Ubuntu 14.04 Linux environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test was done to make sure that the computers are able to pass the required information without any problem, and was carried out using a simple MPI "Hello" program written in the C language. In addition, a performance test was done to show that the cluster's computational performance is much better than that of a single-CPU computer. In this performance test, four runs were made executing the same code using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors, the time required to solve the problem decreases; the calculation time is roughly halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer using common hardware which is capable of higher computing power when compared to a single-CPU computer, and this can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics.
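The two tests map naturally onto a few lines of MPI. The sketch below performs a "Hello"-style communication check and a timed reduction; it uses Python with mpi4py as an analogous illustration (the study's original programs were written in C):

```python
# Run with e.g.: mpiexec -n 8 python cluster_test.py  (requires mpi4py, numpy)
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Communication test: every rank announces itself.
print(f"Hello from rank {rank} of {size} on {MPI.Get_processor_name()}")

# Performance test: time a parallel workload; rerun with different -n values
# (1, 2, 4, 8) and compare the elapsed times to observe the speedup.
N = 4_000_000
chunk = np.arange(rank, N, size, dtype=np.float64)  # this rank's share

comm.Barrier()
t0 = MPI.Wtime()
local = np.sum(np.sqrt(chunk))                      # stand-in numerical kernel
total = comm.reduce(local, op=MPI.SUM, root=0)
t1 = MPI.Wtime()

if rank == 0:
    print(f"result={total:.3e}  elapsed={t1 - t0:.3f}s with {size} ranks")
```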
Regan, Caitlin; Hayakawa, Carole; Choi, Bernard
2017-12-01
Due to its simplicity and low cost, laser speckle imaging (LSI) has achieved widespread use in biomedical applications. However, interpretation of the blood-flow maps remains ambiguous, as LSI enables only limited visualization of vasculature below scattering layers such as the epidermis and skull. Here, we describe a computational model that enables flexible in-silico study of the impact of these factors on LSI measurements. The model uses Monte Carlo methods to simulate light and momentum transport in a heterogeneous tissue geometry. The virtual detectors of the model track several important characteristics of light. This model enables study of LSI aspects that may be difficult or unwieldy to address in an experimental setting, and enables detailed study of the fundamental origins of speckle contrast modulation in tissue-specific geometries. We applied the model to an in-depth exploration of the spectral dependence of speckle contrast signal in the skin, the effects of epidermal melanin content on LSI, and the depth-dependent origins of our signal. We found that LSI of transmitted light allows for a more homogeneous integration of the signal from the entire bulk of the tissue, whereas epi-illumination measurements of contrast are limited to a fraction of the light penetration depth. We quantified the spectral depth dependence of our contrast signal in the skin, and did not observe a statistically significant effect of epidermal melanin on speckle contrast. Finally, we corroborated these simulated results with experimental LSI measurements of flow beneath a thin absorbing layer. The results of this study suggest the use of LSI in the clinic to monitor perfusion in patients with different skin types, or inhomogeneous epidermal melanin distributions.
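The quantity under study is the speckle contrast K = σ/⟨I⟩ evaluated over a local window. A minimal sketch of that computation on a raw speckle frame follows, assuming a 7×7 sliding window (a common choice, not necessarily the one used in this work):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def speckle_contrast(frame, window=7):
    """Spatial speckle contrast K = sigma / mean over a sliding window."""
    frame = frame.astype(float)
    mean = uniform_filter(frame, window)
    mean_sq = uniform_filter(frame * frame, window)
    var = np.clip(mean_sq - mean * mean, 0.0, None)
    return np.sqrt(var) / np.maximum(mean, 1e-12)

# Fully developed speckle has exponentially distributed intensity and K ~ 1;
# motion blurs the pattern during the exposure and lowers K.
rng = np.random.default_rng(1)
static = rng.exponential(1.0, (256, 256))
flowing = uniform_filter(static, 3)       # crude stand-in for motion blur
print(speckle_contrast(static).mean(), speckle_contrast(flowing).mean())
```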
Yoo, Hye Jin; Hong, Jin Pyo; Cho, Maeng Je; Fava, Maurizio; Mischoulon, David; Heo, Jung-Yoon; Kim, Kiwon; Jeon, Hong Jin
2016-10-01
Major depressive disorder (MDD) is a well-known risk factor for suicidality, but "depressed mood" has been used non-specifically to describe the emotional state. We sought to compare the influence of MDD versus sustained depressed mood on suicidality. A total of 12,532 adults, randomly selected through the one-person-per-household method, completed a face-to-face interview using the Korean version of the Composite International Diagnostic Interview (K-CIDI) and a questionnaire for lifetime suicidal ideation (LSI) and lifetime suicidal attempt (LSA). Of 12,361 adults, 565 were assessed as a 'sustained depressed mood group' having depressed mood for more than two weeks without MDD (4.6%), and 810 adults were assessed as having full MDD (6.55%), which consisted of an 'MDD with depressed mood group' (6.0%) and an 'MDD without depressed mood group' (0.5%). The MDD with depressed mood group showed higher odds ratios for LSI and LSA than the sustained depressed mood group. In contrast, no significant differences were found in LSI and LSA between the MDD groups with and without depressed mood. MDD showed significant associations with LSI (AOR=2.83, 95%CI 2.12-3.78) and LSA (AOR=2.17, 95%CI 1.34-3.52), whereas sustained depressed mood showed significant associations with neither LSI nor LSA after adjusting for MDD and other psychiatric comorbidities. The interaction effect of sustained depressed mood with MDD was significant for LSI but not for LSA. Sustained depressed mood was not related to LSI and LSA after adjusting for psychiatric comorbidities, whereas MDD was significantly associated with both LSI and LSA regardless of the presence of sustained depressed mood. Copyright © 2016 Elsevier B.V. All rights reserved.
Mission and data operations IBM 360 user's guide
NASA Technical Reports Server (NTRS)
Balakirsky, J.
1973-01-01
The Mission and Data Operations (M&DO) computer systems are introduced and supplemented. The hardware and software status is discussed, along with standard processors and user libraries. Data management techniques are presented, as well as machine independence, debugging facilities, and overlay considerations.
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.
1994-01-01
Electronic and optoelectronic hardware implementations of highly parallel computing architectures address several ill-defined and/or computation-intensive problems not easily solved by conventional computing techniques. The concurrent processing architectures developed are derived from a variety of advanced computing paradigms including neural network models, fuzzy logic, and cellular automata. Hardware implementation technologies range from state-of-the-art digital/analog custom-VLSI to advanced optoelectronic devices such as computer-generated holograms and e-beam fabricated Dammann gratings. JPL's concurrent processing devices group has developed a broad technology base in hardware implementable parallel algorithms, low-power and high-speed VLSI designs and building block VLSI chips, leading to application-specific high-performance embeddable processors. Application areas include high throughput map-data classification using feedforward neural networks, terrain based tactical movement planner using cellular automata, resource optimization (weapon-target assignment) using a multidimensional feedback network with lateral inhibition, and classification of rocks using an inner-product scheme on thematic mapper data. In addition to addressing specific functional needs of DOD and NASA, the JPL-developed concurrent processing device technology is also being customized for a variety of commercial applications (in collaboration with industrial partners), and is being transferred to U.S. industries. This viewgraph presentation focuses on two application-specific processors which solve the computation intensive tasks of resource allocation (weapon-target assignment) and terrain based tactical movement planning using two extremely different topologies. Resource allocation is implemented as an asynchronous analog competitive assignment architecture inspired by the Hopfield network. Hardware realization leads to a two to four order of magnitude speed-up over conventional techniques and enables multiple assignments (many to many) not achievable with standard statistical approaches. Tactical movement planning (finding the best path from A to B) is accomplished with a digital two-dimensional concurrent processor array. By exploiting the natural parallel decomposition of the problem in silicon, a four order of magnitude speed-up over optimized software approaches has been demonstrated.
The use of imprecise processing to improve accuracy in weather & climate prediction
NASA Astrophysics Data System (ADS)
Düben, Peter D.; McNamara, Hugh; Palmer, T. N.
2014-08-01
The use of stochastic processing hardware and low precision arithmetic in atmospheric models is investigated. Stochastic processors allow hardware-induced faults in calculations, sacrificing bit-reproducibility and precision in exchange for improvements in performance and potentially accuracy of forecasts, due to a reduction in power consumption that could allow higher resolution. A similar trade-off is achieved using low precision arithmetic, with improvements in computation and communication speed and savings in storage and memory requirements. As high-performance computing becomes more massively parallel and power intensive, these two approaches may be important stepping stones in the pursuit of global cloud-resolving atmospheric modelling. The impact of both hardware induced faults and low precision arithmetic is tested using the Lorenz '96 model and the dynamical core of a global atmosphere model. In the Lorenz '96 model there is a natural scale separation; the spectral discretisation used in the dynamical core also allows large and small scale dynamics to be treated separately within the code. Such scale separation allows the impact of lower-accuracy arithmetic to be restricted to components close to the truncation scales and hence close to the necessarily inexact parametrised representations of unresolved processes. By contrast, the larger scales are calculated using high precision deterministic arithmetic. Hardware faults from stochastic processors are emulated using a bit-flip model with different fault rates. Our simulations show that both approaches to inexact calculations do not substantially affect the large scale behaviour, provided they are restricted to act only on smaller scales. By contrast, results from the Lorenz '96 simulations are superior when small scales are calculated on an emulated stochastic processor than when those small scales are parametrised. This suggests that inexact calculations at the small scale could reduce computation and power costs without adversely affecting the quality of the simulations. This would allow higher resolution models to be run at the same computational cost.
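A minimal sketch of this experimental design is shown below: the Lorenz '96 system integrated with its tendency evaluated at reduced precision, using float16 as a stand-in for the emulated inexact hardware. For simplicity the reduced precision is applied to the entire tendency rather than only to the small scales, and the paper's stochastic bit-flip fault model is not reproduced:

```python
import numpy as np

N, F = 40, 8.0   # Lorenz '96: dX_i/dt = (X_{i+1} - X_{i-2}) X_{i-1} - X_i + F

def tendency(x, dtype=np.float64):
    """Evaluate the tendency in the given precision, then promote back."""
    x = x.astype(dtype)
    return ((np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x
            + dtype(F)).astype(np.float64)

def integrate(x0, steps=2000, dt=0.01, dtype=np.float64):
    """RK4 forward integration with the tendency computed at `dtype`."""
    x = x0.copy()
    for _ in range(steps):
        k1 = tendency(x, dtype)
        k2 = tendency(x + 0.5 * dt * k1, dtype)
        k3 = tendency(x + 0.5 * dt * k2, dtype)
        k4 = tendency(x + dt * k3, dtype)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

rng = np.random.default_rng(2)
x0 = F + 0.5 * rng.standard_normal(N)
ref = integrate(x0)                        # double-precision reference
low = integrate(x0, dtype=np.float16)      # "inexact processor" run
print("RMS difference:", np.sqrt(np.mean((ref - low) ** 2)))
```

Because the system is chaotic, the two trajectories diverge; the scientific question is whether the low-precision run preserves the large-scale statistics, not whether it is bit-reproducible.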
CPU architecture for a fast and energy-saving calculation of convolution neural networks
NASA Astrophysics Data System (ADS)
Knoll, Florian J.; Grelcke, Michael; Czymmek, Vitali; Holtorf, Tim; Hussmann, Stephan
2017-06-01
One of the most difficult problems in the use of artificial neural networks is the required computational capacity. Although large search-engine companies own specially developed hardware to provide the necessary computing power, the conventional user is left with the state-of-the-art method, which is the use of a graphics processing unit (GPU) as the computational basis. Although these processors are well suited for large matrix computations, they consume massive amounts of energy. Therefore a new processor based on a field programmable gate array (FPGA) has been developed and optimized for the application of deep learning. This processor is presented in this paper. The processor can be adapted for a particular application (in this paper, to an organic farming application). The power consumption is only a fraction of that of a GPU implementation, and the processor should therefore be well suited for energy-saving applications.
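The computational kernel such a processor accelerates is the 2-D convolution. A minimal reference implementation follows, useful as a behavioral model against which a hardware datapath can be checked (shapes and "valid" padding are assumed for illustration):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (correlation form, as in most CNNs)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            # Multiply-accumulate over the kernel window: the operation a
            # hardware design unrolls into parallel MAC units.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

image = np.arange(36.0).reshape(6, 6)
edge = np.array([[1.0, 0.0, -1.0]] * 3)   # simple vertical-edge kernel
print(conv2d(image, edge))
```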
The MasPar MP-1 As a Computer Arithmetic Laboratory
Anuta, Michael A.; Lozier, Daniel W.; Turner, Peter R.
1996-01-01
This paper is a blueprint for the use of a massively parallel SIMD computer architecture for the simulation of various forms of computer arithmetic. The particular system used is a DEC/MasPar MP-1 with 4096 processors in a square array. This architecture has many advantages for such simulations due largely to the simplicity of the individual processors. Arithmetic operations can be spread across the processor array to simulate a hardware chip. Alternatively they may be performed on individual processors to allow simulation of a massively parallel implementation of the arithmetic. Compromises between these extremes permit speed-area tradeoffs to be examined. The paper includes a description of the architecture and its features. It then summarizes some of the arithmetic systems which have been, or are to be, implemented. The implementation of the level-index and symmetric level-index, LI and SLI, systems is described in some detail. An extensive bibliography is included. PMID:27805123
Vectorization with SIMD extensions speeds up reconstruction in electron tomography.
Agulleiro, J I; Garzón, E M; García, I; Fernández, J J
2010-06-01
Electron tomography allows structural studies of cellular structures at molecular detail. Large 3D reconstructions are needed to meet the resolution requirements. The processing time to compute these large volumes may be considerable, and so high-performance computing techniques have traditionally been used. This work presents a vector approach to tomographic reconstruction that relies on the exploitation of the SIMD extensions available in modern processors, in combination with other single-processor optimization techniques. This approach succeeds in producing full-resolution tomograms with an important reduction in processing time, as evaluated with the most common reconstruction algorithms, namely WBP and SIRT. The main advantage stems from the fact that this approach runs on standard computers without the need for specialized hardware, which facilitates the development, use and management of programs. Future trends in processor design open excellent opportunities for vector processing with processors' SIMD extensions in the field of 3D electron microscopy.
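In a high-level sketch, the vector approach amounts to replacing per-pixel scalar loops with whole-array operations, mirroring what SIMD extensions do at the instruction level. The illustrative backprojection step below assumes a toy parallel-beam geometry and is not the authors' WBP/SIRT code:

```python
import numpy as np

def backproject(sinogram, angles, size):
    """Naive backprojection; the inner loop over pixels is replaced by
    whole-array (vector) arithmetic, mirroring SIMD execution."""
    recon = np.zeros((size, size))
    ys, xs = np.mgrid[0:size, 0:size] - size / 2.0
    for proj, theta in zip(sinogram, angles):
        # Detector coordinate of every pixel for this view, all at once.
        t = xs * np.cos(theta) + ys * np.sin(theta) + len(proj) / 2.0
        i0 = np.clip(t.astype(int), 0, len(proj) - 2)
        frac = np.clip(t - i0, 0.0, 1.0)
        recon += (1 - frac) * proj[i0] + frac * proj[i0 + 1]  # linear interp
    return recon / len(angles)

angles = np.linspace(0, np.pi, 60, endpoint=False)
blob = np.exp(-0.5 * ((np.arange(128) - 64) / 6.0) ** 2)
sino = np.tile(blob, (len(angles), 1))    # toy sinogram of a central blob
print(backproject(sino, angles, 128).max())
```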
System Architecture For High Speed Sorting Of Potatoes
NASA Astrophysics Data System (ADS)
Marchant, J. A.; Onyango, C. M.; Street, M. J.
1989-03-01
This paper illustrates an industrial application of vision processing in which potatoes are sorted according to their size and shape at speeds of up to 40 objects per second. The result is a multi-processing approach built around the VME bus. A hardware unit has been designed and constructed to encode the boundary of the potatoes, reducing the amount of data to be processed. A master 68000 processor is used to control this unit and to handle data transfers along the bus. Boundary data are passed to one of three 68010 slave processors, each responsible for a line of potatoes across a conveyor belt. The slave processors calculate attributes such as the shape, size and estimated weight of each potato, and the master processor uses these data to operate the sorting mechanism. The system has been interfaced with a commercial grading machine and performance trials are now in progress.
Practical, redundant, failure-tolerant, self-reconfiguring embedded system architecture
Klarer, Paul R.; Hayward, David R.; Amai, Wendy A.
2006-10-03
This invention relates to system architectures, specifically failure-tolerant and self-reconfiguring embedded system architectures. The invention provides both a method and architecture for redundancy. There can be redundancy in both software and hardware for multiple levels of redundancy. The invention provides a self-reconfiguring architecture for activating redundant modules whenever other modules fail. The architecture comprises: a communication backbone connected to two or more processors and software modules running on each of the processors. Each software module runs on one processor and resides on one or more of the other processors to be available as a backup module in the event of failure. Each module and backup module reports its status over the communication backbone. If a primary module does not report, its backup module takes over its function. If the primary module becomes available again, the backup module returns to its backup status.
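A minimal sketch of the report-and-takeover logic is given below, with an in-process list standing in for the communication backbone and hypothetical module names; the patented architecture itself spans multiple processors rather than one process:

```python
import time

HEARTBEAT_TIMEOUT = 0.05   # seconds of silence before failover (assumed)

class Module:
    """One software module: runs as primary on one processor and resides
    as a standby copy elsewhere, as in the architecture described above."""
    def __init__(self, name, active):
        self.name, self.active = name, active
        self.last_seen = {}                 # module name -> last report time

    def report(self, bus):
        if self.active:
            bus.append((self.name, time.monotonic()))

    def monitor(self, bus, watched):
        for name, t in bus:
            self.last_seen[name] = max(self.last_seen.get(name, 0.0), t)
        silent = time.monotonic() - self.last_seen.get(watched, 0.0)
        if not self.active and silent > HEARTBEAT_TIMEOUT:
            self.active = True              # primary went quiet: take over
        elif self.active and silent <= HEARTBEAT_TIMEOUT:
            self.active = False             # primary is back: stand down

bus = []
primary, backup = Module("nav", True), Module("nav", False)
primary.report(bus)
backup.monitor(bus, watched="nav")
print("backup active:", backup.active)      # False: primary is healthy
time.sleep(0.1)                              # primary misses its reports
backup.monitor(bus, watched="nav")
print("backup active:", backup.active)      # True: backup has taken over
```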
Moving formal methods into practice. Verifying the FTPP Scoreboard: Results, phase 1
NASA Technical Reports Server (NTRS)
Srivas, Mandayam; Bickford, Mark
1992-01-01
This report documents the Phase 1 results of an effort aimed at formally verifying a key hardware component, called Scoreboard, of a Fault-Tolerant Parallel Processor (FTPP) being built at Charles Stark Draper Laboratory (CSDL). The Scoreboard is part of the FTPP virtual bus that guarantees reliable communication between processors in the presence of Byzantine faults in the system. The Scoreboard implements a piece of control logic that approves and validates a message before it can be transmitted. The goal of Phase 1 was to lay the foundation of the Scoreboard verification. A formal specification of the functional requirements and a high-level hardware design for the Scoreboard were developed. The hardware design was based on a preliminary Scoreboard design developed at CSDL. A main correctness theorem, from which the functional requirements can be established as corollaries, was proved for the Scoreboard design. The goal of Phase 2 is to verify the final detailed design of Scoreboard. This task is being conducted as part of a NASA-sponsored effort to explore integration of formal methods in the development cycle of current fault-tolerant architectures being built in the aerospace industry.
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan R.
2015-05-01
New radar applications need to perform complex algorithms and process large quantities of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low-power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementations of adaptive pulse compression for real-time transceiver optimization are presented; they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such as matrix multiplication and matrix inversion, which are essential units in solving the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through high-performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.
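The kernels named above reduce to a covariance-matrix solve of the form w = R⁻¹s. A minimal numerical sketch with NumPy follows, assuming a sample covariance estimate with diagonal loading (the specific adaptive pulse compression recursion implemented on the SoC is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 13, 200                        # filter length, training snapshots

s = np.exp(1j * np.pi * np.arange(N) ** 2 / N)   # assumed chirp-like waveform
x = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))

# Sample covariance (the matrix-multiply workload), with diagonal loading
# for numerical robustness.
R = (x.conj().T @ x) / K + 1e-3 * np.eye(N)

# The matrix-inversion workload: solve R w = s rather than forming inv(R).
w = np.linalg.solve(R, s)
w /= w.conj() @ s                     # normalize for unit-magnitude gain on s

print("filter gain on s:", abs(w.conj() @ s))    # ~1 by construction
```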
The Making of a Government LSI - From Warfare Capability to Operational System
2015-04-30
...continues to evolve and implement Lead System Integrator (LSI) acquisition strategies, they have started to define numerous program initiatives that... employ more integrated engineering and management processes and techniques. These initiatives are developing varying acquisition approaches that define (1)... government LSI transformation. Navy Systems Commands have begun adding a higher level of integration into their acquisition process with the...
Magnetomotive laser speckle imaging
Kim, Jeehyun; Oh, Junghwan; Choi, Bernard
2010-01-01
Laser speckle imaging (LSI) involves analysis of reflectance images collected during coherent optical excitation of an object to compute wide-field maps of tissue blood flow. An intrinsic limitation of LSI for resolving microvascular architecture is that its signal depends on relative motion of interrogated red blood cells. Hence, with LSI, small-diameter arterioles, venules, and capillaries are difficult to resolve due to the slow flow speeds associated with such vasculature. Furthermore, LSI characterization of subsurface blood flow is subject to blurring due to scattering, further limiting the ability of LSI to resolve or quantify blood flow in small vessels. Here, we show that magnetic activation of superparamagnetic iron oxide (SPIO) nanoparticles modulate the speckle flow index (SFI) values estimated from speckle contrast analysis of collected images. With application of an ac magnetic field to a solution of stagnant SPIO particles, an apparent increase in SFI is induced. Furthermore, with application of a focused dc magnetic field, a focal decrease in SFI values is induced. Magnetomotive LSI may enable wide-field mapping of suspicious tissue regions, enabling subsequent high-resolution optical interrogation of these regions. Similarly, subsequent photoactivation of intravascular SPIO nanoparticles could then be performed to induce selective photothermal destruction of unwanted vasculature. PMID:20210436
Shao, Chenzhong; Tanaka, Shuji; Nakayama, Takahiro; Hata, Yoshiyuki
2018-01-01
For installing many sensors in a limited space with a limited computing resource, the digitization of the sensor output at the site of sensation has advantages such as a small amount of wiring, low signal interference and high scalability. For this purpose, we have developed a dedicated Complementary Metal-Oxide-Semiconductor (CMOS) Large-Scale Integration (LSI) (referred to as “sensor platform LSI”) for bus-networked Micro-Electro-Mechanical-Systems (MEMS)-LSI integrated sensors. In this LSI, collision avoidance, adaptation and event-driven functions are simply implemented to relieve data collision and congestion in asynchronous serial bus communication. In this study, we developed a network system with 48 sensor platform LSIs based on Printed Circuit Board (PCB) in a backbone bus topology with the bus length being 2.4 m. We evaluated the serial communication performance when 48 LSIs operated simultaneously with the adaptation function. The number of data packets received from each LSI was almost identical, and the average sampling frequency of 384 capacitance channels (eight for each LSI) was 73.66 Hz. PMID:29342923
Developing software to use parallel processing effectively. Final report, June-December 1987
DOE Office of Scientific and Technical Information (OSTI.GOV)
Center, J.
1988-10-01
This report describes the difficulties involved in writing efficient parallel programs and describes the hardware and software support currently available for generating software that utilizes parallel processing effectively. Historically, the processing rate of single-processor computers has increased by one order of magnitude every five years. However, this pace is slowing since electronic circuitry is coming up against physical barriers. Unfortunately, the complexity of engineering and research problems continues to require ever more processing power (far in excess of the maximum estimated 3 Gflops achievable by single-processor computers). For this reason, parallel-processing architectures are receiving considerable interest, since they offer high performance more cheaply than a single-processor supercomputer, such as the Cray.
Laser Speckle Imaging of Blood Flow Beneath Static Scattering Media
NASA Astrophysics Data System (ADS)
Regan, Caitlin Anderson
Laser speckle imaging (LSI) is a wide-field optical imaging technique that provides information about the movement of scattering particles in biological samples. LSI is used to create maps of relative blood flow and perfusion in samples such as the skin, brain, teeth, gingiva, and other biological tissues. The presence of static, or non-moving, optical scatterers affects the ability of LSI to provide true quantitative and spatially resolved measurements of blood flow. With in vitro experiments using tissue-simulating phantoms, we determined that temporal analysis of raw speckle image sequences improved the quantitative accuracy of LSI to measure flow beneath a static scattering layer. We then applied the temporal algorithm to assess the potential of LSI to monitor oral health. We designed and tested two generations of miniature LSI devices to measure flow in the pulpal chamber of teeth and in the gingiva. Our preliminary clinical pilot data indicated that speckle contrast may correlate with gingival health. To improve visualization of subsurface blood vessels, we developed a technique called photothermal LSI. We applied a short pulse of laser energy to selectively perturb the motion of red blood cells, increasing the signal from vasculature relative to the surroundings. To study the spectral and depth dependence of laser speckle contrast, we developed a Monte Carlo model of light and momentum transport to simulate speckle contrast. With an increase in the thickness of the overlying static-scattering layer, we observed a quadratic decrease in the quantity of dynamically scattered light collected by the detector. We next applied the model to study multi-exposure speckle imaging (MESI), a method that purportedly improves quantitative accuracy of subsurface blood flow measurements. We unexpectedly determined that MESI faced similar depth limitations as conventional LSI, findings that were supported by in vitro experimental data. Finally, we used the model to study the effects of epidermal melanin absorption on LSI, and demonstrated that speckle contrast is less sensitive to varying melanin content than reflectance. We then proposed a two-wavelength measurement protocol that may enable melanin-independent LSI measurements of blood flow in patients with varying skin types. In conclusion, through in vitro and in silico experiments, we were able to further the understanding of the depth dependent origins of laser speckle contrast as well as the inherent limitations of this technology.
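The temporal analysis referred to above computes contrast per pixel along the frame axis rather than over a spatial window. A minimal sketch follows, assuming a 15-frame stack (a typical but not universal choice); note how a static contribution adds a per-pixel constant over time, lowering the contrast without adding temporal variance:

```python
import numpy as np

def temporal_contrast(stack):
    """Per-pixel temporal speckle contrast K_t = sigma_t / mean_t over the
    frame axis of a (frames, rows, cols) stack."""
    stack = stack.astype(float)
    return stack.std(axis=0) / np.maximum(stack.mean(axis=0), 1e-12)

rng = np.random.default_rng(4)
frames = rng.exponential(1.0, (15, 64, 64))       # dynamic speckle: K_t ~ 1
static_layer = 5.0                                # statically scattered light
print(temporal_contrast(frames).mean())                   # ~1
print(temporal_contrast(frames + static_layer).mean())    # reduced K_t
```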
A CPU benchmark for protein crystallographic refinement.
Bourne, P E; Hendrickson, W A
1990-01-01
The CPU time required to complete a cycle of restrained least-squares refinement of a protein structure from X-ray crystallographic data using the FORTRAN codes PROTIN and PROLSQ are reported for 48 different processors, ranging from single-user workstations to supercomputers. Sequential, vector, VLIW, multiprocessor, and RISC hardware architectures are compared using both a small and a large protein structure. Representative compile times for each hardware type are also given, and the improvement in run-time when coding for a specific hardware architecture considered. The benchmarks involve scalar integer and vector floating point arithmetic and are representative of the calculations performed in many scientific disciplines.
High speed quantitative digital microscopy
NASA Technical Reports Server (NTRS)
Castleman, K. R.; Price, K. H.; Eskenazi, R.; Ovadya, M. M.; Navon, M. A.
1984-01-01
Modern digital image processing hardware makes possible quantitative analysis of microscope images at high speed. This paper describes an application to automatic screening for cervical cancer. The system uses twelve MC6809 microprocessors arranged in a pipeline multiprocessor configuration. Each processor executes one part of the algorithm on each cell image as it passes through the pipeline. Each processor communicates with its upstream and downstream neighbors via shared two-port memory. Thus no time is devoted to input-output operations as such. This configuration is expected to be at least ten times faster than previous systems.
Using SDI-12 with ST microelectronics MCU's
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saari, Alexandra; Hinzey, Shawn Adrian; Frigo, Janette Rose
2015-09-03
ST Microelectronics microcontrollers and processors are readily available, capable and economical. Unfortunately, they lack a broad user base like similar offerings from Texas Instruments, Atmel, or Microchip. All of these devices could be useful in economical instruments for remote sensing applications used in environmental sensing. With the increased need for environmental studies and limited budgets, flexibility in hardware is very important. To that end, and in an effort to increase open support of ST devices, I am sharing my team's experience in interfacing a common environmental sensor communication protocol (SDI-12) with ST devices.
The development of an engineering computer graphics laboratory
NASA Technical Reports Server (NTRS)
Anderson, D. C.; Garrett, R. E.
1975-01-01
Hardware and software systems developed to further research and education in interactive computer graphics were described, as well as several of the ongoing application-oriented projects, educational graphics programs, and graduate research projects. The software system consists of a FORTRAN 4 subroutine package, in conjunction with a PDP 11/40 minicomputer as the primary computation processor and the Imlac PDS-1 as an intelligent display processor. The package comprises a comprehensive set of graphics routines for dynamic, structured two-dimensional display manipulation, and numerous routines to handle a variety of input devices at the Imlac.
Computing NLTE Opacities -- Node Level Parallel Calculation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holladay, Daniel
Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities in-line, with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability, compute opacities, and study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware, including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.
NSTAR Ion Thrusters and Power Processors
NASA Technical Reports Server (NTRS)
Bond, T. A.; Christensen, J. A.
1999-01-01
The purpose of the NASA Solar Electric Propulsion Technology Applications Readiness (NSTAR) project is to validate ion propulsion technology for use on future NASA deep space missions. This program, which was initiated in September 1995, focused on the development of two sets of flight quality ion thrusters, power processors, and controllers that provided the same performance as engineering model hardware and also met the dynamic and environmental requirements of the Deep Space 1 Project. One of the flight sets was used for primary propulsion for the Deep Space 1 spacecraft which was launched in October 1998.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Supinski, B.; Caliga, D.
2017-09-28
The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array ("FPGA") based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures a cost-effective solution for large-scale scientific computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshii, Kazutomo; Llopis, Pablo; Zhang, Kaicheng
As CMOS scaling nears its end, parameter variations (process, temperature and voltage) are becoming a major concern. To overcome parameter variations and provide stability, modern processors are becoming dynamic, opportunistically adjusting voltage and frequency based on thermal and energy constraints, which negatively impacts traditional bulk-synchronous-parallelism-minded hardware and software designs. As node-level architecture grows in complexity, implementing variation control mechanisms only in hardware can be a challenging task. In this paper we investigate a software strategy to manage hardware-induced variations, leveraging low-level monitoring/controlling mechanisms.
JSC Wireless Sensor Network Update
NASA Technical Reports Server (NTRS)
Wagner, Robert
2010-01-01
Sensor nodes are composed of three basic components. Radio module: a COTS radio module implementing a standardized WSN protocol, treated as a WSN modem by the main board. Main board: contains the application processor (TI MSP430 microcontroller), memory, and power supply; responsible for sensor data acquisition, pre-processing, and task scheduling; re-used in every application with a growing library of embedded C code. Sensor card: contains application-specific sensors, data conditioning hardware, and any advanced hardware not built into the main board (DSPs, faster A/D, etc.); requires (re-)development for each application.
Solutions and debugging for data consistency in multiprocessors with noncoherent caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernstein, D.; Mendelson, B.; Breternitz, M. Jr.
1995-02-01
We analyze two important problems that arise in shared-memory multiprocessor systems. The stale data problem involves ensuring that data items in the local memory of individual processors are current, independent of writes done by other processors. False sharing occurs when two processors have copies of the same shared data block but update different portions of the block. The false sharing problem involves guaranteeing that subsequent writes are properly combined. In modern architectures these problems are usually solved in hardware, by exploiting mechanisms for hardware-controlled cache consistency. This leads to more expensive and nonscalable designs. Therefore, we are concentrating on software methods for ensuring cache consistency that would allow for affordable and scalable multiprocessing systems. Unfortunately, providing software control is nontrivial, both for the compiler writer and for the application programmer. For this reason we are developing a debugging environment that will facilitate the development of compiler-based techniques and will help the programmer to tune his or her application using explicit cache management mechanisms. We extend the notion of a race condition for the IBM Shared Memory System POWER/4, taking into consideration its noncoherent caches, and propose techniques for detection of false sharing problems. Identification of the stale data problem is discussed as well, and solutions are suggested.
Huang, Kuan-Ju; Shih, Wei-Yeh; Chang, Jui Chung; Feng, Chih Wei; Fang, Wai-Chi
2013-01-01
This paper presents a pipelined VLSI design of a fast singular value decomposition (SVD) processor for a real-time electroencephalography (EEG) system based on on-line recursive independent component analysis (ORICA). Since SVD is used frequently in the computations of the real-time EEG system, a low-latency, high-accuracy SVD processor is essential. During the EEG system process, the proposed SVD processor aims to solve the diagonal, inverse and inverse-square-root matrices of the target matrices in real time. Generally, SVD requires a huge amount of computation in hardware implementation. Therefore, this work proposes a novel design concept for data flow updating to assist the pipelined VLSI implementation. The SVD processor can greatly improve the feasibility of real-time EEG system applications such as brain-computer interfaces (BCIs). The proposed architecture is implemented using TSMC 90 nm CMOS technology. The EEG raw data are sampled at 128 Hz. The core size of the SVD processor is 580×580 μm², and the operating frequency is 20 MHz. It consumes 0.774 mW of power per execution in the 8-channel EEG system.
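Functionally, the processor produces the SVD-derived diagonal, inverse, and inverse-square-root matrices described above. The NumPy reference below reproduces those outputs for a symmetric positive-definite target matrix as a behavioral model (not the VLSI data flow):

```python
import numpy as np

def svd_outputs(m):
    """Return the diagonal (singular-value), inverse, and inverse-square-root
    matrices of a symmetric positive-definite matrix via its SVD."""
    u, s, vt = np.linalg.svd(m)
    diag = np.diag(s)
    inv = vt.T @ np.diag(1.0 / s) @ u.T             # M^{-1}
    inv_sqrt = u @ np.diag(1.0 / np.sqrt(s)) @ u.T  # M^{-1/2} (symmetric M)
    return diag, inv, inv_sqrt

rng = np.random.default_rng(5)
x = rng.standard_normal((8, 128))                   # 8-channel EEG-like data
cov = x @ x.T / x.shape[1]                          # channel covariance
d, inv, inv_sqrt = svd_outputs(cov)
print(np.allclose(inv @ cov, np.eye(8)))            # True
print(np.allclose(inv_sqrt @ cov @ inv_sqrt, np.eye(8)))  # whitening check
```

The inverse-square-root output is what ORICA-style pipelines use to whiten the incoming channel data.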
Cheung, Kit; Schultz, Simon R; Luk, Wayne
2015-01-01
NeuroFlow is a scalable spiking neural network simulation platform for off-the-shelf high-performance computing systems using customizable hardware processors such as Field-Programmable Gate Arrays (FPGAs). Unlike multi-core processors and application-specific integrated circuits, the processor architecture of NeuroFlow can be redesigned and reconfigured to suit a particular simulation to deliver optimized performance, such as the degree of parallelism to employ. The compilation process supports using PyNN, a simulator-independent neural network description language, to configure the processor. NeuroFlow supports a number of commonly used current- or conductance-based neuronal models, such as the integrate-and-fire and Izhikevich models, and the spike-timing-dependent plasticity (STDP) rule for learning. A 6-FPGA system can simulate a network of up to ~600,000 neurons and can achieve real-time performance for 400,000 neurons. Using one FPGA, NeuroFlow delivers a speedup of up to 33.6 times over an 8-core processor, or 2.83 times over GPU-based platforms. With high flexibility and throughput, NeuroFlow provides a viable environment for large-scale neural network simulation.
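As an illustration of the per-timestep work such a processor performs, here is a minimal software sketch of the Izhikevich model it supports (the published update equations with the usual textbook constants and a 1 ms Euler step; these are not NeuroFlow's internal parameters):

```c
/* Sketch: one 1-ms Euler step of the Izhikevich neuron model.
 *   v' = 0.04 v^2 + 5 v + 140 - u + I,   u' = a (b v - u)
 * On a spike (v >= 30 mV): v <- c, u <- u + d. */
typedef struct { double v, u, a, b, c, d; } izh_neuron;

int izh_step(izh_neuron *n, double I)
{
    n->v += 0.04 * n->v * n->v + 5.0 * n->v + 140.0 - n->u + I;
    n->u += n->a * (n->b * n->v - n->u);
    if (n->v >= 30.0) {            /* spike threshold */
        n->v = n->c;               /* reset membrane potential */
        n->u += n->d;              /* reset recovery variable */
        return 1;                  /* spike emitted this step */
    }
    return 0;
}
```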
Multicore Education through Simulation
ERIC Educational Resources Information Center
Ozturk, O.
2011-01-01
A project-oriented course for advanced undergraduate and graduate students is described for simulating multiple processor cores. Simics, a free simulator for academia, was utilized to enable students to explore computer architecture, operating systems, and hardware/software cosimulation. Motivation for including this course in the curriculum is…
Compiler-assisted multiple instruction rollback recovery using a read buffer
NASA Technical Reports Server (NTRS)
Alewine, N. J.; Chen, S.-K.; Fuchs, W. K.; Hwu, W.-M.
1993-01-01
Multiple instruction rollback (MIR) is a technique that has been implemented in mainframe computers to provide rapid recovery from transient processor failures. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs have also been developed which remove rollback data hazards directly with data-flow transformations. This paper focuses on compiler-assisted techniques to achieve multiple instruction rollback recovery. We observe that some data hazards resulting from instruction rollback can be resolved efficiently by providing an operand read buffer while others are resolved more efficiently with compiler transformations. A compiler-assisted multiple instruction rollback scheme is developed which combines hardware-implemented data redundancy with compiler-driven hazard removal transformations. Experimental performance evaluations indicate improved efficiency over previous hardware-based and compiler-based schemes.
Scalable digital hardware for a trapped ion quantum computer
NASA Astrophysics Data System (ADS)
Mount, Emily; Gaultney, Daniel; Vrijsen, Geert; Adams, Michael; Baek, So-Young; Hudek, Kai; Isabella, Louis; Crain, Stephen; van Rynbach, Andre; Maunz, Peter; Kim, Jungsang
2016-12-01
Many of the challenges of scaling quantum computer hardware lie at the interface between the qubits and the classical control signals used to manipulate them. Modular ion trap quantum computer architectures address scalability by constructing individual quantum processors interconnected via a network of quantum communication channels. Successful operation of such quantum hardware requires a fully programmable classical control system capable of frequency-stabilizing the continuous-wave lasers necessary for loading, cooling, initialization, and detection of the ion qubits; stabilizing the optical frequency combs used to drive logic gate operations on the ion qubits; providing a large number of analog voltage sources to drive the trap electrodes; and maintaining phase coherence among all the controllers that manipulate the qubits. In this work, we describe scalable solutions to these hardware development challenges.
Freiberger, Manuel; Egger, Herbert; Liebmann, Manfred; Scharfetter, Hermann
2011-11-01
Image reconstruction in fluorescence optical tomography is a three-dimensional nonlinear ill-posed problem governed by a system of partial differential equations. In this paper we demonstrate that a combination of state-of-the-art numerical algorithms and a careful hardware-optimized implementation makes it possible to solve this large-scale inverse problem in a few seconds on standard desktop PCs with modern graphics hardware. In particular, we present methods to solve not only the forward but also the nonlinear inverse problem by massively parallel programming on graphics processors. A comparison of optimized CPU and GPU implementations shows that the reconstruction can be accelerated by factors of about 15 through the use of the graphics hardware without compromising the accuracy of the reconstructed images.
Design and evaluation of a fault-tolerant multiprocessor using hardware recovery blocks
NASA Technical Reports Server (NTRS)
Lee, Y. H.; Shin, K. G.
1982-01-01
A fault-tolerant multiprocessor with a rollback recovery mechanism is discussed. The rollback mechanism is based on the hardware recovery block which is a hardware equivalent to the software recovery block. The hardware recovery block is constructed by consecutive state-save operations and several state-save units in every processor and memory module. When a fault is detected, the multiprocessor reconfigures itself to replace the faulty component and then the process originally assigned to the faulty component retreats to one of the previously saved states in order to resume fault-free execution. A mathematical model is proposed to calculate both the coverage of multi-step rollback recovery and the risk of restart. A performance evaluation in terms of task execution time is also presented.
Report on phase 1 of the Microprocessor Seminar. [and associated large scale integration
NASA Technical Reports Server (NTRS)
1977-01-01
Proceedings of a seminar on microprocessors and associated large scale integrated (LSI) circuits are presented. The potential for commonality of device requirements, candidate processes and mechanisms for qualifying candidate LSI technologies for high reliability applications, and specifications for testing and testability were among the topics discussed. Various programs and tentative plans of the participating organizations in the development of high reliability LSI circuits are given.
Shehzad, Danish; Bozkuş, Zeki
2016-01-01
The increasing complexity of neuronal network models has escalated efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets among multiple processors to achieve better hardware performance. On parallel machines simulating neuronal networks, interprocessor spike exchange consumes a large fraction of the overall simulation time. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is exercised for spike exchange after each interval across distributed-memory systems. Increasing the number of processors improves concurrency and performance, but it adversely affects MPI_Allgather, whose communication time grows with the processor count. This necessitates an improved communication methodology to decrease the spike-exchange time over distributed-memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving from two-sided to one-sided communication; a recursive doubling mechanism achieves efficient communication between the processors in a precise number of steps. This approach enhances communication concurrency and improves overall runtime, making NEURON more efficient for simulation of large neuronal network models.
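The combination described here, recursive doubling realized with one-sided RMA instead of two-sided messaging, can be sketched as follows (an illustrative sketch, not the authors' implementation; it assumes a power-of-two communicator size and uses fence synchronization):

```c
/* Sketch: recursive-doubling allgather over MPI one-sided RMA.
 * Each rank owns one block of 'blocklen' doubles; after log2(p)
 * rounds every rank holds all p blocks in recvbuf. */
#include <mpi.h>
#include <string.h>

void rma_allgather(const double *sendbuf, int blocklen,
                   double *recvbuf, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);            /* assumed power of two */

    MPI_Win win;
    MPI_Win_create(recvbuf, (MPI_Aint)size * blocklen * sizeof(double),
                   sizeof(double), MPI_INFO_NULL, comm, &win);

    /* Place this rank's block at its final position. */
    memcpy(recvbuf + (size_t)rank * blocklen, sendbuf,
           blocklen * sizeof(double));

    for (int step = 1; step < size; step <<= 1) {
        int partner = rank ^ step;          /* recursive doubling pairing */
        int first = (rank / step) * step;   /* start of blocks held so far */
        MPI_Aint disp = (MPI_Aint)first * blocklen;
        int count = step * blocklen;

        MPI_Win_fence(0, win);
        /* Deposit all blocks gathered so far directly into the
         * partner's receive buffer; no matching receive is posted. */
        MPI_Put(recvbuf + disp, count, MPI_DOUBLE,
                partner, disp, count, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);
    }
    MPI_Win_free(&win);
}
```

With p processors the exchange completes in log2(p) rounds, and because the paired ranks write disjoint regions of each other's buffers, the puts within a round never conflict.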
Fang, Wai-Chi; Huang, Kuan-Ju; Chou, Chia-Ching; Chang, Jui-Chung; Cauwenberghs, Gert; Jung, Tzyy-Ping
2014-01-01
This paper proposes an efficient very-large-scale integration (VLSI) design: a 16-channel on-line recursive independent component analysis (ORICA) processor ASIC for real-time EEG systems, implemented in TSMC 40 nm CMOS technology. ORICA is appropriate for use in a real-time EEG system to separate artifacts because of its highly efficient, real-time processing features. The proposed ORICA processor is composed of an ORICA processing unit and a singular value decomposition (SVD) processing unit. Compared with previous work [1], the proposed ORICA processor has enhanced effectiveness and reduced hardware complexity, achieved by utilizing a deeper pipeline architecture, a shared arithmetic processing unit, and shared registers. Sixteen channels of random signals containing 8 super-Gaussian and 8 sub-Gaussian components are used to analyze the independence of the source components; the average correlation coefficient between the original source signals and the extracted ORICA signals is 0.95452. Finally, the proposed ORICA processor ASIC is implemented in TSMC 40 nm CMOS technology and consumes 15.72 mW at a 100 MHz operating frequency.
NASA Astrophysics Data System (ADS)
Krasilenko, Vladimir G.; Nikolsky, Alexander I.; Lazarev, Alexander A.; Lazareva, Maria V.
2010-05-01
In this paper we show that the biologically motivated conception of time-pulse encoding offers a set of advantages (a single methodological basis, universality, simplicity of tuning, learning, and programming, among others) in the creation and design of sensor systems with parallel input-output and processing for 2D-structure hybrid and next-generation neuro-fuzzy neurocomputers. We show design principles for programmable relational optoelectronic time-pulse-encoded processors based on continuous logic, order logic, and temporal wave processes. We consider a structure that executes analog signal extraction and the sorting of analog and time-pulse-coded variables. We offer an optoelectronic realization of such a base relational order-logic element, consisting of time-pulse-coded photoconverters (pulse-width and pulse-phase modulators) with direct and complementary outputs, a sorting network built on logical elements, and programmable commutation blocks. Simulation and experimental research yield the following estimates of the technical parameters of devices and processors built on such base elements: optical input signal power of 0.2-20 µW, processing time of 1-10 µs, supply voltage of 1-3 V, power consumption of 10-100 µW, extended functional possibilities, and learning possibilities. We discuss some aspects of possible rules and principles for learning and for programmable tuning to a required function or relational operation, and the realization of hardware blocks for modifications of such processors. We show that it is possible to create sorting machines, neural networks, and hybrid data-processing systems with untraditional numerical systems and picture operands on the basis of such quasi-universal, simple hardware blocks with flexible programmable tuning.
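The sorting network mentioned above is built from pairwise compare-exchange elements; a minimal software analogue (a standard five-comparator network for four inputs, offered only to illustrate the order-logic building block, not the authors' optoelectronic circuit) is:

```c
/* Sketch: 4-input sorting network from compare-exchange elements.
 * Each cx() mirrors an order-logic element that outputs the min and
 * max of its two inputs on separate rails. */
static void cx(int *a, int *b)
{
    if (*a > *b) { int t = *a; *a = *b; *b = t; }
}

void sort4(int v[4])
{
    cx(&v[0], &v[1]);  cx(&v[2], &v[3]);   /* stage 1: two parallel pairs */
    cx(&v[0], &v[2]);  cx(&v[1], &v[3]);   /* stage 2: two parallel pairs */
    cx(&v[1], &v[2]);                      /* stage 3: final exchange */
}
```

In hardware the elements within a stage operate in parallel, so the network sorts in three stage-delays regardless of the input values.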
ACE: Automatic Centroid Extractor for real time target tracking
NASA Technical Reports Server (NTRS)
Cameron, K.; Whitaker, S.; Canaris, J.
1990-01-01
A high-performance video image processor has been implemented which is capable of grouping contiguous pixels from a raster-scan image into objects and then calculating centroid information for each object in a frame. The algorithm employed to group pixels is very efficient and is guaranteed to work properly for all convex shapes as well as most concave shapes. Processing speeds are adequate for real-time processing of video images having a pixel rate of up to 20 million pixels per second. Pixels may be up to 8 bits wide. The processor is designed to interface directly to a transputer serial-link communications channel with no additional hardware. The full-custom VLSI processor was implemented in a 1.6 µm CMOS process and measures 7200 µm on a side.
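In software terms, the pixel-grouping and centroid computation amount to connected-component labeling with running coordinate sums; a minimal union-find sketch of that general technique follows (the threshold, 4-connectivity, and image size are illustrative assumptions, and this is not the chip's algorithm):

```c
/* Sketch: group contiguous above-threshold pixels in a raster scan
 * (4-connectivity, union-find) and report each object's centroid. */
#include <stdio.h>

#define W 8
#define H 6
#define THRESH 128

static int parent[W * H];
static int find(int x) { while (parent[x] != x) x = parent[x] = parent[parent[x]]; return x; }
static void unite(int a, int b) { parent[find(a)] = find(b); }

void centroids(const unsigned char img[H][W])
{
    long sx[W * H] = {0}, sy[W * H] = {0}, n[W * H] = {0};

    for (int i = 0; i < W * H; i++) parent[i] = i;

    /* Raster scan: merge each foreground pixel with its left/upper neighbor. */
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            if (img[y][x] < THRESH) continue;
            int id = y * W + x;
            if (x > 0 && img[y][x - 1] >= THRESH) unite(id, id - 1);
            if (y > 0 && img[y - 1][x] >= THRESH) unite(id, id - W);
        }

    /* Accumulate coordinate sums per object root, then print centroids. */
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            if (img[y][x] >= THRESH) {
                int r = find(y * W + x);
                sx[r] += x; sy[r] += y; n[r]++;
            }
    for (int i = 0; i < W * H; i++)
        if (n[i])
            printf("object at root %d: centroid (%.2f, %.2f), %ld px\n",
                   i, (double)sx[i] / n[i], (double)sy[i] / n[i], n[i]);
}
```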
NASA Technical Reports Server (NTRS)
Harrison, D. A., III; Chladek, J. T.
1983-01-01
A real-time signal processor was developed for the NASA/JSC L- and C-band airborne radar scatterometer sensor systems. The purpose of the effort was to reduce ground data processing costs. Two quadrature channels of data (like- and cross-polarized) were converted to obtain Power Spectral Density (PSD) values. A chirp-z transform (CZT) approach was used to filter the Doppler return signal, and improved high-frequency and angular resolution was realized. The processors have been tested with recorded signals and excellent results were obtained. CZT filtering can be readily applied to scatterometers operating at other wavelengths by altering the sample frequency. The design of the hardware and software and the results of the performance tests are described in detail.
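For context, the chirp-z transform evaluates the z-transform along an arbitrary spiral contour, which is what lets a Doppler filter bank be re-centered and zoomed by changing only sampling parameters; in the standard textbook notation (not specific to the NASA/JSC processor):

```latex
X_k = \sum_{n=0}^{N-1} x_n \, A^{-n} W^{nk}, \qquad k = 0, \dots, M-1,
```

where A sets the contour's starting point and W its spacing. Using the Bluestein identity nk = [n^2 + k^2 - (k-n)^2]/2, the sum becomes a convolution that can be evaluated with FFTs.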
DOT National Transportation Integrated Search
2003-04-01
The objective of this study was to assess the feasibility of using commercial off-the-shelf (COTS) processor-based systems for safety-related railroad applications. From the safety perspective, the fundamental challenges of using COTS products are most...
ERIC Educational Resources Information Center
Anderson, Cheryl A.
Designed to answer basic questions educators have about microcomputer hardware and software and their applications in teaching, this paper describes the revolution in computer technology that has resulted from the development of the microchip processor and provides information on the major computer components, i.e., input, central processing unit,…
NASA Astrophysics Data System (ADS)
Langlois, Serge; Fouquet, Olivier; Gouy, Yann; Riant, David
2014-08-01
On-Board Computers (OBCs) increasingly use integrated systems-on-chip (SOCs) that embed processors running from 50 MHz up to several hundreds of MHz, around which are attached dedicated communication controllers together with other input/output channels. For ground testing and On-Board Software (OBSW) validation purposes, a representative simulation of these systems, faster than real time and with cycle-true execution timing, is not achievable with current purely software simulators. For the past few years, hybrid solutions have been put in place ([1], [2]) that include hardware in the loop so as to add accuracy and performance to computer software simulation. This paper presents the results of the work undertaken by Thales Alenia Space (TAS-F) at the end of 2010, which led to a validated hardware simulator of the UT699 by mid-2012 that is now qualified and fully used in operational contexts.
Accelerating list management for MPI.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hemmert, K. Scott; Rodrigues, Arun F.; Underwood, Keith Douglas
2005-07-01
The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero-length-queue case, with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero-length-queue case and by nearly 100% in the presence of queues of 30 messages.
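The traversal being offloaded is the MPI posted-receive match: each arriving message is compared against the queued receives on source, tag, and communicator context, honoring wildcards. A minimal software sketch of that matching loop (illustrative structure and field names, not the NIC firmware):

```c
/* Sketch: first-match traversal of the posted-receive queue.
 * MPI matching rule: context must match, and source/tag must match,
 * where a posted receive may use wildcards. */
#include <stddef.h>

#define ANY_SOURCE (-1)   /* stands in for MPI_ANY_SOURCE */
#define ANY_TAG    (-1)   /* stands in for MPI_ANY_TAG    */

typedef struct recv_entry {
    int source, tag, context;       /* match criteria */
    void *buffer;                   /* destination for the payload */
    struct recv_entry *next;
} recv_entry;

recv_entry *match_and_unlink(recv_entry **head,
                             int src, int tag, int context)
{
    for (recv_entry **p = head; *p != NULL; p = &(*p)->next) {
        recv_entry *e = *p;
        if (e->context == context &&
            (e->source == ANY_SOURCE || e->source == src) &&
            (e->tag    == ANY_TAG    || e->tag    == tag)) {
            *p = e->next;           /* unlink the matched entry */
            return e;               /* cost is O(n) in queue length */
        }
    }
    return NULL;                    /* no match: message is queued
                                       as "unexpected" instead */
}
```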
NASA Technical Reports Server (NTRS)
Farley, Douglas L.
2005-01-01
NASA's Aviation Safety and Security Program is pursuing research in on-board Structural Health Management (SHM) technologies for purposes of reducing or eliminating aircraft accidents due to system and component failures. Under this program, NASA Langley Research Center (LaRC) is developing a strain-based structural health-monitoring concept that incorporates a fiber optic-based measuring system for acquiring strain values. This fiber optic-based measuring system provides for the distribution of thousands of strain sensors embedded in a network of fiber optic cables. The resolution of strain value at each discrete sensor point requires a computationally demanding data reduction software process that, when hosted on a conventional processor, is not suitable for near real-time measurement. This report describes the development and integration of an alternative computing environment using dedicated computing hardware for performing the data reduction. Performance comparison between the existing and the hardware-based system is presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gering, Kevin L.
A method, system, and computer-readable medium are described for characterizing performance loss of an object undergoing an arbitrary aging condition. Baseline aging data may be collected from the object for at least one known baseline aging condition over time, baseline multiple sigmoid model parameters may be determined from the baseline data, and performance loss of the object may be determined over time through multiple sigmoid model parameters associated with the object undergoing the arbitrary aging condition, using a differential deviation-from-baseline approach from the baseline multiple sigmoid model parameters. The system may include an object, monitoring hardware configured to sample performance characteristics of the object, and a processor coupled to the monitoring hardware. The processor is configured to determine performance loss for the arbitrary aging condition from a comparison of the performance characteristics of the object deviating from baseline performance characteristics associated with a baseline aging condition.
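The abstract does not give the model's functional form; one generic multiple-sigmoid expression for cumulative performance loss, offered purely as an assumed illustration of the term, is

```latex
L(t) = \sum_{i=1}^{m} \frac{a_i}{1 + \exp\!\left[-(t - t_i)/\tau_i\right]},
```

where each term contributes a loss amplitude a_i centered at time t_i with time scale τ_i; a deviation-from-baseline approach would then compare the parameters fitted under the arbitrary aging condition against their baseline values.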
EMSE at TREC 2015 Clinical Decision Support Track
2015-11-20
pseudo-relevant documents, semantic resources of UMLS, and a hybrid approach called SMERA that combines LSI- and UMLS-based approaches. Only three of... approach to query expansion uses ontologies (UMLS) and a local approach based on pseudo-relevant feedback documents using LSI. A brief description of... pseudo-relevance feedback documents, and a semantic method based on UMLS concepts. The LSI-based method was used only to expand summary terms that can't
Vapor Compression Distillation Urine Processor Lessons Learned from Development and Life Testing
NASA Technical Reports Server (NTRS)
Hutchens, Cindy F.; Long, David A.
1999-01-01
Vapor Compression Distillation (VCD) is the chosen technology for urine processing aboard the International Space Station (ISS). Development and life testing over the past several years have brought to the forefront problems and solutions for the VCD technology. Testing between 1992 and 1998 has been instrumental in developing estimates of hardware life and reliability. It has also helped improve the hardware design in ways that either correct existing problems or enhance the existing design of the hardware. The testing has increased confidence in the VCD technology and reduced technical and programmatic risks. This paper summarizes the test results and the changes that have been made to the VCD design.
NASA Technical Reports Server (NTRS)
Cole, Richard
1991-01-01
The major goals of this effort are as follows: (1) to examine technology insertion options to optimize Advanced Information Processing System (AIPS) performance in the Advanced Launch System (ALS) environment; (2) to examine the AIPS concepts to ensure that valuable new technologies are not excluded from the AIPS/ALS implementations; (3) to examine advanced microprocessors applicable to AIPS/ALS; (4) to examine radiation-hardening technologies applicable to AIPS/ALS; (5) to reach conclusions on AIPS hardware building-block implementation technologies; and (6) to reach conclusions on appropriate architectural improvements. The hardware building blocks are the Fault-Tolerant Processor, the Input/Output Sequencers (IOS), and the Intercomputer Interface Sequencers (ICIS).
Farabet, Clément; Paz, Rafael; Pérez-Carrasco, Jose; Zamarreño-Ramos, Carlos; Linares-Barranco, Alejandro; LeCun, Yann; Culurciello, Eugenio; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabe
2012-01-01
Most scene segmentation and categorization architectures for the extraction of features in images and patches make exhaustive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures that can implement general-purpose bio-inspired vision systems. In standard digital computers 2D convolutions are usually expensive in terms of resource consumption and impose severe limitations for efficient real-time applications. Nevertheless, neuro-cortex-inspired solutions, like dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. These two approaches share the neural inspiration, but each solves the problem in a different way. Frame-Based ConvNets process video information frame by frame in a very robust and fast way that requires sharing the available hardware resources (such as multipliers and adders); resources are fixed and time-multiplexed by fetching data in and out, so memory bandwidth and size are important for good performance. Spike-based convolution processors, on the other hand, are a frame-free alternative able to perform convolution of a spike-based source of visual information with very low latency, which makes them ideal for very high-speed applications. However, their hardware resources need to be available all the time and cannot be time-multiplexed, so the hardware should be modular, reconfigurable, and expansible. Hardware implementations in both VLSI custom integrated circuits (digital and analog) and FPGAs have already been used to demonstrate the performance of these systems. In this paper we present a comparison study of these two neuro-inspired solutions, with a brief description of both systems and a discussion of their differences, pros, and cons. PMID:22518097
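The operation both processor families accelerate is plain 2D convolution; a minimal frame-based software reference (direct form over the 'valid' output region, illustrative only) makes the multiply-accumulate cost explicit:

```c
/* Sketch: direct 2D convolution (valid region), the multiply-accumulate
 * kernel that frame-based ConvNet hardware time-multiplexes. */
void conv2d_valid(const float *in, int ih, int iw,
                  const float *k, int kh, int kw,
                  float *out)              /* (ih-kh+1) x (iw-kw+1) */
{
    int oh = ih - kh + 1, ow = iw - kw + 1;
    for (int y = 0; y < oh; y++)
        for (int x = 0; x < ow; x++) {
            float acc = 0.0f;
            for (int i = 0; i < kh; i++)   /* kh*kw MACs per pixel */
                for (int j = 0; j < kw; j++)
                    acc += in[(y + i) * iw + (x + j)] * k[i * kw + j];
            out[y * ow + x] = acc;
        }
}
```

Each output pixel costs kh·kw multiply-accumulates, which is precisely the workload frame-based hardware time-multiplexes onto shared multipliers and spike-based hardware distributes across always-available units.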
Radiation hardened microprocessor for small payloads
NASA Technical Reports Server (NTRS)
Shah, Ravi
1993-01-01
The RH-3000 program is developing a rad-hard, space-qualified 32-bit MIPS R-3000 RISC processor under Naval Research Lab sponsorship. In addition, under IR&D, Harris is developing the RHC-3000 for embedded control applications where low cost and radiation tolerance are the primary concerns. The development program leverages heavily from commercial development of the MIPS R-3000. The commercial R-3000 has a large installed user base, and several foundry partners are currently producing a wide variety of R-3000 derivative products. One of the MIPS derivative products, the LR33000 from LSI Logic, was used as the basis for the design of the RH-3000 chipset. The RH-3000 chipset consists of three core chips and two support chips. The core chips include the CPU, which is the R-3000 integer unit, and the FPA/MD chip pair, which performs the R-3010 floating-point functions. The two support chips contain all the support functions required for fault tolerance, real-time support, memory management, timers, and other functions. The Harris development effort had first-pass silicon success in June 1992 with the first rad-hard 32-bit RH-3000 CPU chip. The CPU device is 30 kgates, has a 508 mil by 503 mil die size, and is fabricated at Harris Semiconductor on the rad-hard CMOS Silicon-on-Sapphire (SOS) process. The CPU device successfully passed testing against 600,000 test vectors derived directly from the LSI/MIPS test suite and has been operational as a single-board computer running C code for the past year. In addition, the RH-3000 program has developed a methodology for converting commercially developed designs utilizing logic synthesis techniques based on a combination of VHDL and schematic databases.
Real-Time Spatio-Temporal Twice Whitening for MIMO Energy Detector
DOE Office of Scientific and Technical Information (OSTI.GOV)
Humble, Travis S; Mitra, Pramita; Barhen, Jacob
2010-01-01
While many techniques exist for local spectrum sensing of a primary user, each represents a computationally demanding task for secondary-user receivers. In software-defined radio, computational complexity lengthens the time for a cognitive radio to recognize changes in the transmission environment. This complexity is even more significant for spatially multiplexed receivers, e.g., in SIMO and MIMO, where the spatio-temporal data sets grow in size with the number of antennae. Limits on power and space for the processor hardware further constrain SDR performance. In this report, we discuss improvements in spatio-temporal twice whitening (STTW) for real-time local spectrum sensing by demonstrating a form of STTW well suited to MIMO environments. We implement STTW on the Coherent Logix hx3100 processor, a multicore processor intended for low-power, high-throughput software-defined signal processing. These results demonstrate how coupling the novel capabilities of emerging multicore processors with algorithmic advances can enable real-time, software-defined processing of large spatio-temporal data sets.
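For reference, the core whitening operation is the standard covariance transform (textbook form; the report's specific two-stage spatio-temporal factorization is not reproduced here). With snapshot vectors x and covariance R = E[xxᴴ], the whitened data are

```latex
\tilde{x} = R^{-1/2}\, x, \qquad E\!\left[\tilde{x}\,\tilde{x}^{H}\right] = I,
```

after which an energy detector simply compares the whitened energy against a threshold; as the name suggests, twice whitening applies such transforms to both the spatial and the temporal structure of the data set.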
NASA Astrophysics Data System (ADS)
Yokoyama, Yoshiaki; Kim, Minseok; Arai, Hiroyuki
At present, when space-time processing techniques with multiple antennas are used for mobile radio communication, real-time weight adaptation is necessary. Owing to progress in integrated circuit technology, dedicated processor implementations on ASICs or FPGAs can be employed to realize various wireless applications. This paper presents a resource and performance evaluation of a QRD-RLS systolic array processor based on a fixed-point CORDIC algorithm in an FPGA. To save hardware resources, we propose a shared architecture for a complex CORDIC processor. The required precision of the internal calculations, the circuit area as a function of the number of antenna elements and the wordlength, and the processing speed are evaluated. The resource estimation yields a feasible processor configuration on a current FPGA on the market. Computer simulations assuming a fading channel show a fast convergence property with a finite number of training symbols. The proposed architecture has also been implemented, and its operation was verified by beamforming evaluation through a radio propagation experiment.
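The kernel iterated by each systolic cell is compact enough to sketch; below is a generic fixed-point CORDIC in vectoring mode (the textbook algorithm in Q16.16 arithmetic; the wordlength and iteration count are illustrative choices, not the paper's):

```c
/* Sketch: fixed-point CORDIC, vectoring mode (drives y toward 0).
 * Q16.16 arithmetic, valid for inputs with x0 > 0. Returns the
 * magnitude scaled by the CORDIC gain K ~= 1.6468 and writes the
 * accumulated angle ~atan2(y0, x0), in Q16.16 radians, to *angle. */
#include <stdint.h>

#define ITER 16
static const int32_t atan_tab[ITER] = {  /* round(atan(2^-i) * 2^16) */
    51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
    256, 128, 64, 32, 16, 8, 4, 2
};

int32_t cordic_vectoring(int32_t x, int32_t y, int32_t *angle)
{
    int32_t z = 0;
    for (int i = 0; i < ITER; i++) {
        int32_t xs = x >> i, ys = y >> i;   /* shifts replace multiplies */
        if (y >= 0) { x += ys; y -= xs; z += atan_tab[i]; }
        else        { x -= ys; y += xs; z -= atan_tab[i]; }
    }
    *angle = z;
    return x;   /* ~ K * sqrt(x0^2 + y0^2) */
}
```

Only shifts, adds, and a small angle table are needed per iteration, which is why CORDIC cells map so economically onto FPGA fabric.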
An embedded laser marking controller based on ARM and FPGA processors.
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of laser information processing technology. Existing laser marking machines, based on a PC and the Windows operating system, are large and inconvenient to move, and they cannot work outdoors or in other harsh environments. To compensate for these disadvantages, this paper proposes an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller controls the galvanometers synchronously and can achieve precise marking.
A network control concept for the 30/20 GHz communication system baseband processor
NASA Technical Reports Server (NTRS)
Sabourin, D. J.; Hay, R. E.
1982-01-01
The architecture and system design for a satellite-switched TDMA communication system employing on-board processing was developed by Motorola for NASA's Lewis Research Center. The system design is based on distributed processing techniques that provide extreme flexibility in the selection of a network control protocol without impacting the satellite or ground terminal hardware. A network control concept that includes system synchronization and allows burst synchronization to occur within the system operational requirement is described. This concept integrates the tracking and control links with the communication links via the baseband processor, resulting in an autonomous system operational approach.
NASA Astrophysics Data System (ADS)
Ponticorvo, A.; Rowland, R.; Yang, B.; Lertsakdadet, B.; Crouzet, C.; Bernal, N.; Choi, B.; Durkin, A. J.
2017-02-01
Burn wounds are often characterized by injury depth, which then dictates the wound management strategy. While most superficial burns and full-thickness burns can be diagnosed through visual inspection, clinicians experience difficulty with accurate diagnosis of burns that fall between these extremes. Accurately diagnosing burn severity in a timely manner is critical for starting the appropriate treatment plan at the earliest time points to improve patient outcomes. To address this challenge, research groups have studied the use of commercial laser Doppler imaging (LDI) systems to provide objective characterization of burn-wound severity. Despite initial promising findings, LDI systems are not commonplace, in part due to long acquisition times that can suffer from artifacts in moving patients. Commercial LDI systems are being phased out in favor of laser speckle imaging (LSI) systems that can provide similar information with faster acquisition speeds. To better understand the accuracy and usefulness of commercial LSI systems in burn-oriented research, we studied the performance of a commercial LSI system in three different sample systems and compared its results to those of a research-grade LSI system in the same environments. The first sample system involved laboratory measurements of intralipid (1%) flowing through a tissue-simulating phantom; the second involved preclinical measurements in a controlled burn study in which wounds of graded severity were created on a Yorkshire pig; and the third involved clinical measurements on a small sample of patients. The research-grade LSI system, designed and fabricated in our labs, was used to quantitatively compare the performance of both systems and also to better understand the "Perfusion Unit" output of commercial systems.
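The quantity an LSI system maps is the local speckle contrast, K = σ/⟨I⟩, computed over a small sliding window of the raw speckle image, with flow commonly reported as proportional to 1/K²; a minimal sketch of that computation (the 7×7 window is a common but assumed choice):

```c
/* Sketch: local speckle contrast K = stddev/mean over an NxN window,
 * the core computation behind laser speckle imaging flow maps. */
#include <math.h>

#define N 7                     /* window side (common choice) */

void speckle_contrast(const float *img, int h, int w, float *K)
{
    int r = N / 2;
    for (int y = r; y < h - r; y++)
        for (int x = r; x < w - r; x++) {
            double s = 0.0, s2 = 0.0;
            for (int i = -r; i <= r; i++)
                for (int j = -r; j <= r; j++) {
                    double v = img[(y + i) * w + (x + j)];
                    s += v; s2 += v * v;
                }
            double n = (double)N * N;
            double mean = s / n;
            double var = s2 / n - mean * mean;   /* population variance */
            K[y * w + x] = (mean > 0.0) ? (float)(sqrt(var) / mean) : 0.0f;
        }
}
```

Low contrast indicates motion blurring of the speckle pattern within the exposure, i.e., higher flow, which is why the flow index rises as K falls.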
Travers, Karin; Sallum, Rachel H; Burns, Meghan D; Barr, Charles E; Beattie, Mary S; Pashos, Chris L; Luce, Bryan R
2015-01-01
Purpose: Patient registries are used to monitor safety, examine real-world effectiveness, and may potentially contribute to comparative effectiveness research. To our knowledge, life sciences industry (LSI)-sponsored registries have not been systematically categorized. This study represents a first step toward understanding such registries over time. Methods: Studies described as registries were identified in the ClinicalTrials.gov database. Characteristics from these registry records were abstracted and analyzed. Results: Of 1202 registries identified, approximately 47% reported LSI sponsorship. These 562 LSI registries varied in focus: medical devices (n = 193, 34%), specific drugs (n = 173, 31%), procedures (n = 29, 5%), or particular diseases (n = 139, 25%). Thirty-three registries (<6%) evaluated pregnancy outcomes. The most common therapeutic area was cardiovascular (n = 234, 42%); others included endocrinology, immunology, oncology, musculoskeletal disorders, and neurology. The two most often measured outcomes were clinical effectiveness and safety, each of which appeared in 363/562 (65%) of LSI registries. Other outcomes included real-world clinical practice patterns (n = 122, 22%), patient-reported outcomes (n = 106, 19%), disease epidemiology/natural history (n = 69, 12%), and economic outcomes (n = 30, 5%). The number of LSI registries and their geographic diversity have increased over time. Conclusions: The LSI registries represent a substantial proportion of all patient registries documented in ClinicalTrials.gov. These prospective studies are growing in number and encompass diverse therapeutic areas and geographic regions. Most registries measure multiple outcomes and capture real-world data that may be unavailable through other study designs. This classification of LSI registries documents their use for studying heterogeneity of diseases, examining treatment patterns, measuring patient-reported outcomes, examining economic outcomes, and performing comparative effectiveness research. © 2014 The Authors. Pharmacoepidemiology and Drug Safety published by John Wiley & Sons, Ltd. PMID:25079108
Chen, Defu; Ren, Jie; Wang, Ying; Zhao, Hongyou; Li, Buhong; Gu, Ying
2016-03-01
Laser Doppler imaging (LDI) and laser speckle imaging (LSI) are two major optical techniques for noninvasively imaging skin blood perfusion. However, the relationship between perfusion values determined by LDI and LSI has not been fully explored. Eight healthy volunteers and 13 port wine stain (PWS) patients were recruited. Perfusion in the normal forearm skin of the 8 healthy volunteers was measured simultaneously by LDI and LSI during post-occlusive reactive hyperemia (PORH). Furthermore, the perfusion of PWS lesions and contralateral normal skin was determined in 10 PWS patients. In addition, the perfusion of PWS lesions in 3 patients was monitored successively at 0, 10, and 20 min during vascular-targeted photodynamic therapy (V-PDT). The average perfusion values determined by LSI were compared with those of LDI for each subject. In normal skin during PORH, a power function provided better fits of the perfusion values than a linear function: exponents for individual subjects range from 1.312 to 1.942 (R²=0.8967-0.9951). There was a linear relationship between perfusion values determined by LDI and LSI in PWS and contralateral normal skin (R²=0.7308-0.9623), and in PWS during V-PDT (R²=0.8037-0.9968). The perfusion values determined by LDI and LSI correlate closely in normal skin and PWS over a broad range of skin perfusion. However, the results suggest that the perfusion range and the characteristics of the measured skin should be carefully considered if LDI and LSI measures are compared. Copyright © 2015 Elsevier B.V. All rights reserved.
Use of a Microprocessor to Implement an ADCCP Protocol (Federal Standard 1003).
1980-07-01
results of other studies, to evaluate the operational and economic impact of incorporating various options in Federal Standard 1003. The effort... the LSI interface and the microprocessor; the LSI chip deposits bytes in its buffer as the producer, and the MPU reads this data as the consumer... on the interface between the MPU and the LSI protocol chip. This requires two main processes to be running at the same time--transmit and receive. The
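The producer/consumer coupling described in the fragment above is classically implemented as a ring buffer between the protocol chip's receive side and the MPU; a minimal single-producer/single-consumer sketch (a generic illustration, not the 1980 implementation):

```c
/* Sketch: single-producer/single-consumer byte ring buffer.
 * The LSI protocol chip's receive ISR deposits bytes (producer);
 * the MPU main loop drains them (consumer). A power-of-two size
 * lets free-running indices wrap with a mask. */
#include <stdint.h>

#define RB_SIZE 256                      /* power of two */
#define RB_MASK (RB_SIZE - 1)

static volatile uint8_t  buf[RB_SIZE];
static volatile unsigned head, tail;     /* head: producer, tail: consumer */

int rb_put(uint8_t b)                    /* called from the receive ISR */
{
    if (((head + 1) & RB_MASK) == (tail & RB_MASK))
        return -1;                       /* full: overrun must be flagged */
    buf[head & RB_MASK] = b;
    head++;
    return 0;
}

int rb_get(uint8_t *b)                   /* called from the MPU main loop */
{
    if (head == tail)
        return -1;                       /* empty */
    *b = buf[tail & RB_MASK];
    tail++;
    return 0;
}
```

Because only the producer writes head and only the consumer writes tail, no lock is needed on a single-core MPU with an interrupt-driven producer.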
Amisaki, Takashi; Toyoda, Shinjiro; Miyagawa, Hiroh; Kitamura, Kunihiro
2003-04-15
Evaluation of long-range Coulombic interactions still represents a bottleneck in molecular dynamics (MD) simulations of biological macromolecules. Despite the advent of sophisticated fast algorithms, such as the fast multipole method (FMM), accurate simulations still demand a great amount of computation time due to the accuracy/speed trade-off inherent in these algorithms. Unless higher-order multipole expansions, which are extremely expensive to evaluate, are employed, a large amount of the execution time is still spent directly calculating particle-particle interactions within the nearby region of each particle. To reduce this execution time for pair interactions, we developed a computation unit (board), called MD-Engine II, that calculates nonbonded pairwise interactions using specially designed hardware. Four custom arithmetic processors and a processor for memory manipulation ("particle processor") are mounted on the computation board. The arithmetic processors are responsible for calculation of the pair interactions. The particle processor plays a central role in realizing efficient cooperation with the FMM. The results of a series of 50-ps MD simulations of a protein-water system (50,764 atoms) indicated that a more stringent setting of accuracy in the FMM computation, compared with those previously reported, was required for accurate simulations over long time periods. Such a level of accuracy was efficiently achieved using the cooperative calculations of the FMM and MD-Engine II. On an Alpha 21264 PC, the FMM computation at a moderate but tolerable level of accuracy was accelerated by a factor of 16.0 using three boards. At a high level of accuracy, the cooperative calculation achieved a 22.7-fold acceleration over the corresponding conventional FMM calculation. In the cooperative calculations of the FMM and MD-Engine II, it was possible to achieve more accurate computation at a comparable execution time by incorporating larger nearby regions. Copyright 2003 Wiley Periodicals, Inc. J Comput Chem 24: 582-592, 2003
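The work offloaded to the arithmetic processors is the direct particle-particle portion of the FMM: the inner loop over each particle's nearby region. A minimal software reference for the Coulomb term (illustrative only; the 1/(4πε₀) prefactor is folded into the charges for brevity):

```c
/* Sketch: direct Coulomb force on particle i from the particles in
 * its nearby region, the loop a pair-interaction engine offloads.
 *   F_i = sum_j q_i q_j (r_i - r_j) / |r_i - r_j|^3              */
#include <math.h>

typedef struct { double x, y, z, q; } particle;

void coulomb_nearby(const particle *p, int i,
                    const int *nbr, int n_nbr, double f[3])
{
    f[0] = f[1] = f[2] = 0.0;
    for (int k = 0; k < n_nbr; k++) {
        const particle *j = &p[nbr[k]];
        double dx = p[i].x - j->x;
        double dy = p[i].y - j->y;
        double dz = p[i].z - j->z;
        double r2 = dx * dx + dy * dy + dz * dz;
        double inv_r = 1.0 / sqrt(r2);
        double s = p[i].q * j->q * inv_r * inv_r * inv_r;  /* q_i q_j / r^3 */
        f[0] += s * dx;
        f[1] += s * dy;
        f[2] += s * dz;
    }
}
```

Enlarging the nearby region shifts work from the multipole expansions into exactly this loop, which is why hardware acceleration of the direct part allows a more accurate FMM setting at comparable runtime.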
Friedmann, Simon; Frémaux, Nicolas; Schemmel, Johannes; Gerstner, Wulfram; Meier, Karlheinz
2013-01-01
In this study, we propose and analyze in simulations a new, highly flexible method of implementing synaptic plasticity in a wafer-scale, accelerated neuromorphic hardware system. The study focuses on globally modulated STDP as a special use-case of this method. Flexibility is achieved by embedding a general-purpose processor dedicated to plasticity into the wafer. To evaluate the suitability of the proposed system, we use a reward-modulated STDP rule in a spike train learning task. A single layer of neurons is trained to fire at specific points in time with only the reward as feedback. This model is simulated to measure its performance, i.e., the increase in received reward after learning. Using this performance as a baseline, we then simulate the model with various constraints imposed by the proposed implementation and compare the performance. The simulated constraints include discretized synaptic weights, a restricted interface between analog synapses and the embedded processor, and mismatch of analog circuits. We find that probabilistic updates can increase the performance of low-resolution weights, that a simple interface between analog synapses and the processor is sufficient for learning, and that performance is insensitive to mismatch. Further, we consider communication latency between the wafer and the conventional control computer system that simulates the environment. This latency increases the delay with which the reward is sent to the embedded processor. Because of the time-continuous operation of the analog synapses, delay can cause the updates to deviate from those of the undelayed situation. We find that for highly accelerated systems latency has to be kept to a minimum. This study demonstrates the suitability of the proposed implementation to emulate the selected reward-modulated STDP learning rule. It is therefore an ideal candidate for implementation in an upgraded version of the wafer-scale system developed within the BrainScaleS project.
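The flavor of plasticity rule such an embedded processor executes can be sketched with the usual eligibility-trace formulation of reward-modulated STDP (a generic textbook form with illustrative constants, not the BrainScaleS rule itself):

```c
/* Sketch: reward-modulated STDP with an eligibility trace.
 * Pair-based STDP increments an eligibility trace e; the weight
 * changes only when a (possibly delayed) reward arrives:
 *   e <- e * exp(-dt/TAU_E) + STDP(pre/post timing)
 *   w <- w + ETA * reward * e
 * All constants below are illustrative. */
#include <math.h>

typedef struct { double w, e, pre_trace, post_trace; } synapse;

#define TAU_PLUS  20.0    /* ms, pre-before-post window */
#define TAU_MINUS 20.0    /* ms, post-before-pre window */
#define TAU_E     500.0   /* ms, eligibility decay      */
#define A_PLUS    0.01
#define A_MINUS   0.012
#define ETA       0.1

void decay(synapse *s, double dt)
{
    s->pre_trace  *= exp(-dt / TAU_PLUS);
    s->post_trace *= exp(-dt / TAU_MINUS);
    s->e          *= exp(-dt / TAU_E);
}

void on_pre_spike(synapse *s)  { s->pre_trace  += 1.0; s->e -= A_MINUS * s->post_trace; }
void on_post_spike(synapse *s) { s->post_trace += 1.0; s->e += A_PLUS  * s->pre_trace; }

void on_reward(synapse *s, double reward)   /* delivered by the processor */
{
    s->w += ETA * reward * s->e;            /* modulated weight update */
}
```

The eligibility trace is what makes reward delay tolerable up to roughly TAU_E, which is why, on a highly accelerated substrate, host-to-wafer latency must be kept small relative to the (accelerated) trace time constant.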
Orbiter CIU/IUS communications hardware evaluation
NASA Technical Reports Server (NTRS)
Huth, G. K.
1979-01-01
The DOD and NASA Inertial Upper Stage (IUS) communication system design, hardware specifications, and interfaces were analyzed to determine their compatibility with the Orbiter payload communications equipment (Payload Interrogator, Payload Signal Processors, Communications Interface Unit) and with the Orbiter operational communications equipment (the S-band and Ku-band systems). Topics covered include: (1) IUS/Shuttle Orbiter communications interface definition; (2) Orbiter avionics equipment serving the IUS; (3) IUS communication equipment; (4) IUS/Shuttle Orbiter RF links; (5) STDN/TDRS S-band related activities; and (6) Communications Interface Unit/Orbiter interface issues. A test requirement plan overview is included.
Operating System Abstraction Layer (OSAL)
NASA Technical Reports Server (NTRS)
Yanchik, Nicholas J.
2007-01-01
This viewgraph presentation reviews the concept of the Operating System Abstraction Layer (OSAL) and its benefits. The OSAL is a small layer of software that allows programs to run on many different operating systems and hardware platforms. It runs independently of the underlying OS and hardware, and it is self-contained. The benefit of the OSAL is that it removes dependencies on any one operating system and promotes portable, reusable flight software. It allows core Flight Software (FSW) to be built for multiple processors and operating systems. The presentation discusses the functionality, describes the various OSAL releases, and covers the specifications.
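The pattern can be illustrated with a thin portability shim (a hypothetical sketch of the abstraction-layer idea, not NASA's actual OSAL API; names such as os_task_create are invented here). Application code calls only the neutral interface, and each platform supplies one small implementation file:

```c
/* Sketch: abstraction-layer pattern. Flight code includes only the
 * neutral header; one implementation file exists per OS/platform. */

/* ---- os_shim.h: platform-neutral interface (hypothetical names) ---- */
typedef void *(*os_task_fn)(void *arg);
int  os_task_create(const char *name, os_task_fn fn, void *arg,
                    unsigned stack_bytes, unsigned priority);
void os_sleep_ms(unsigned ms);

/* ---- os_shim_posix.c: one of several interchangeable back ends ---- */
#include <pthread.h>
#include <time.h>

int os_task_create(const char *name, os_task_fn fn, void *arg,
                   unsigned stack_bytes, unsigned priority)
{
    (void)name; (void)stack_bytes; (void)priority;  /* mapped per-OS */
    pthread_t t;
    return pthread_create(&t, NULL, fn, arg) == 0 ? 0 : -1;
}

void os_sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}
```

A VxWorks or RTEMS back end would implement the same two functions with its native task and delay services, leaving the application code above the shim unchanged.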
Emergency product generation for disaster management using RISAT and DMSAR quick look SAR processors
NASA Astrophysics Data System (ADS)
Desai, Nilesh; Sharma, Ritesh; Kumar, Saravana; Misra, Tapan; Gujraty, Virendra; Rana, SurinderSingh
2006-12-01
Over the last few years, ISRO has embarked upon the development of two complex Synthetic Aperture Radar (SAR) missions, viz. the spaceborne Radar Imaging Satellite (RISAT) and the airborne SAR for Disaster Management (DMSAR), as a capacity-building measure under the country's Disaster Management Support (DMS) Program, for estimating the extent of damage over large areas (~75 km) and for assessing the effectiveness of relief measures undertaken during natural disasters such as cyclones, epidemics, earthquakes, floods and landslides, forest fires, and crop diseases. SAR has a unique role to play in mapping and monitoring large areas affected by natural disasters, especially floods, owing to its ability to see through clouds and to image in all weather. The generation of SAR images with quick turnaround time is essential to meet the above DMS objectives; the development of SAR processors for these two SAR systems therefore poses considerable design challenges. Considering the growing user demand and the necessity for a full-fledged, high-throughput processor that can process SAR data and generate images in real or near-real time, the design and development of a generic SAR processor has been taken up; it will meet the SAR processing requirements of both airborne and spaceborne SAR systems. This hardware SAR processor is being built, to the extent possible, using only Commercial-Off-The-Shelf (COTS) DSP and other hardware plug-in modules on a Compact PCI (cPCI) platform. Thus, the major thrust has been on working out a multi-processor Digital Signal Processor (DSP) architecture and on algorithm development and optimization, rather than on hardware design and fabrication. For DMSAR, this generic SAR processor operates as a Quick Look SAR Processor (QLP) on board the aircraft to produce real-time, full-swath DMSAR images, and as a ground-based Near-Real-Time high-precision full-swath Processor (NRTP). It will generate full-swath (6 to 75 km) DMSAR images in the 1 m / 3 m / 5 m / 10 m / 30 m resolution SAR operating modes. For the RISAT mission, this generic Quick Look SAR Processor will be mainly used for browse-product generation at the NRSA-Shadnagar (SAN) ground receive station. The RISAT QLP/NRTP is also proposed to provide an alternative emergency SAR product generation chain. For this, the spacecraft auxiliary data appended in the onboard SAR frame format (x, y, z, x', y', z', roll, pitch, yaw) and the predicted orbit from the previous day's orbit determination data will be used. The QLP/NRTP will produce ground-range images in real or near-real time. For emergency data product generation, additional off-line tasks like geo-tagging, masking, and quality control (QC) need to be performed on the processed image. The QLP/NRTP will generate geo-tagged images from the annotation data available in the SAR payload data itself. Since the orbit and attitude information is taken as-is, the location accuracy will be poorer compared to the product generated using ADIF, where smoothed attitude and orbit data are made available. Additional tasks like masking, output formatting, and quality checking of the data product will be carried out at Balanagar, NRSA, after the image-annotated data from the QLP/NRTP are sent there. The necessary interfaces to the QLP/NRTP for emergency product generation are also being worked out. As is widely acknowledged, the QLP/NRTP for RISAT and DMSAR is an ambitious effort and a technology of the future.
It is expected that by the middle of the next decade, next-generation SAR missions worldwide will carry onboard SAR processors of varying capabilities and will generate SAR data products and information products on board instead of downlinking SAR raw data. It is therefore envisaged that these QLP/NRTP implementation activities for the RISAT ground segment and DMSAR will be a significant step that feeds directly into the development of onboard real-time processing systems for ISRO's future spaceborne SAR missions. This paper describes the design requirements, configuration details, and salient features of these Quick Look SAR processors for RISAT and DMSAR, and highlights their utility for the generation of emergency products for disaster management.
Microfluidic large-scale integration: the evolution of design rules for biological automation.
Melin, Jessica; Quake, Stephen R
2007-01-01
Microfluidic large-scale integration (mLSI) refers to the development of microfluidic chips with thousands of integrated micromechanical valves and control components. This technology is utilized in many areas of biology and chemistry and is a candidate to replace today's conventional automation paradigm, which consists of fluid-handling robots. We review the basic development of mLSI and then discuss design principles of mLSI to assess the capabilities and limitations of the current state of the art and to facilitate the application of mLSI to areas of biology. Many design and practical issues, including economies of scale, parallelization strategies, multiplexing, and multistep biochemical processing, are discussed. Several microfluidic components used as building blocks to create effective, complex, and highly integrated microfluidic networks are also highlighted.
Photon Counting Using Edge-Detection Algorithm
NASA Technical Reports Server (NTRS)
Gin, Jonathan W.; Nguyen, Danh H.; Farr, William H.
2010-01-01
New applications such as high-data-rate, photon-starved, free-space optical communications require photon counting at flux rates into gigaphoton-per-second regimes coupled with subnanosecond timing accuracy. Current single-photon detectors capable of handling such operating conditions are designed in an array format and produce output pulses that span multiple sample times. In order to discern one pulse from another and not overcount the number of incoming photons, a detection algorithm must be applied to the sampled detector output pulses. As flux rates increase, implementing such a detection algorithm becomes difficult within a digital processor that may reside within a field-programmable gate array (FPGA). Systems have been developed and implemented both to characterize gigahertz-bandwidth single-photon detectors and to process photon-count signals at rates into gigaphotons per second, in order to implement communications links at SCPPM (serial concatenated pulse position modulation) encoded data rates exceeding 100 megabits per second with efficiencies greater than two bits per detected photon. A hardware edge-detection algorithm and corresponding signal-combining and deserialization hardware were developed to meet these requirements at sample rates up to 10 GHz. The photon discriminator/deserializer hardware board accepts four inputs, which allows inputs to be taken from a quad-photon counting detector, supporting requirements for optical tracking with a reduced number of hardware components. The four inputs are leading-edge detected independently in hardware. After leading-edge detection, the resultant samples are ORed together prior to deserialization. The deserialization is performed to reduce the rate at which data are passed to a digital signal processor, perhaps residing within an FPGA. The hardware implements four separate analog inputs that are connected through RF connectors. Each analog input is fed to a high-speed 1-bit comparator, which digitizes the input referenced to an adjustable threshold value. This results in four independent serial sample streams of binary 1s and 0s, which are ORed together at rates up to 10 GHz. This single serial stream is then deserialized by a factor of 16 to create 16 signal lines at a rate of 622.5 MHz or lower for input to a high-speed digital processor assembly. The new design and corresponding hardware can be employed with a quad-photon counting detector capable of handling photon rates on the order of multi-gigaphotons per second, whereas prior art was only capable of handling a single input at one quarter of the flux rate. Additionally, the hardware edge-detection algorithm has provided the ability to process 3-10 times higher photon flux rates than previously possible by removing the requirement that photon-counting detector output pulses on the ORed channels not overlap; now, only the leading edges of the pulses are required not to overlap. This new photon-counting digitizer hardware architecture supports a universal front end for an optical communications receiver operating at data rates from kilobits to over one gigabit per second to meet increased mission-data-volume requirements.
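In software terms, the leading-edge detector reduces each pulse to the single sample where the comparator output rises from 0 to 1 (edge = sample AND NOT previous sample), after which the four channels are ORed and the stream is deserialized. A minimal word-at-a-time bitwise sketch (illustrative only, LSB = earliest sample):

```c
/* Sketch: leading-edge detection on 1-bit comparator streams packed
 * 32 samples per word, then OR of four channels, mirroring the
 * photon-counting front end described above. */
#include <stdint.h>

/* Detect 0->1 transitions within a packed word; *prev carries the
 * final sample of the previous word (0 or 1). */
static uint32_t leading_edges(uint32_t samples, uint32_t *prev)
{
    uint32_t shifted = (samples << 1) | (*prev & 1u); /* sample[k-1] per bit */
    *prev = samples >> 31;                            /* carry to next word  */
    return samples & ~shifted;                        /* 1 where 0->1 occurs */
}

/* Combine four detector channels into one event stream. */
uint32_t photon_events(const uint32_t ch[4], uint32_t prev[4])
{
    uint32_t events = 0;
    for (int i = 0; i < 4; i++)
        events |= leading_edges(ch[i], &prev[i]);
    return events;    /* fed to the deserializer / photon counter */
}
```

Because each pulse contributes exactly one set bit regardless of its width, two pulses on different channels may overlap in time without being merged, so long as their leading edges land in different sample slots.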
MicroShell Minimalist Shell for Xilinx Microprocessors
NASA Technical Reports Server (NTRS)
Werne, Thomas A.
2011-01-01
MicroShell is a lightweight shell environment for engineers and software developers working with embedded microprocessors in Xilinx FPGAs. (MicroShell has also been successfully ported to run on ARM Cortex-M1 microprocessors in Actel ProASIC3 FPGAs, but without project-integration support.) MicroShell decreases the time spent performing initial tests of field-programmable gate array (FPGA) designs, simplifies running customizable one-time-only experiments, and provides a familiar-feeling command-line interface. The program comes with a collection of useful functions and enables the designer to add an unlimited number of custom commands, which are callable from the command line. The commands are parameterizable (using the C-based command-line parameter idiom), so the designer can use one function to exercise hardware with different values. Also, since many hardware peripherals instantiated in FPGAs have reasonably simple register-mapped I/O interfaces, the engineer can edit and view hardware parameter settings at any time without stopping the processor. MicroShell comes with a set of support scripts that interface seamlessly with Xilinx's EDK tool. Adding an instance of MicroShell to a project is as simple as marking a check box in a library configuration dialog box and specifying a software project directory. The support scripts then examine the hardware design, build design-specific functions, conditionally include processor-specific functions, and complete the compilation process. For code-size-constrained designs, most of the stock functionality can be excluded from the compiled library. When all of the configurable options are removed from the binary, MicroShell has an unoptimized memory footprint of about 4.8 kB and a size-optimized footprint of about 2.3 kB. Since MicroShell allows unfettered access to all processor-accessible memory locations, it is possible to perform live patching on a running system. This can be useful, for instance, if a bug is discovered in a routine but the system cannot be rebooted: MicroShell allows a skilled operator to directly edit the binary executable in memory. With some forethought, MicroShell code can be located in a different memory location from custom code, permitting the custom functionality to be overwritten at any time without stopping the controlling shell.
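The register-mapped access underlying "edit and view hardware parameter settings" comes down to peek/poke primitives over volatile pointers; a minimal sketch (the example address and bit meaning are hypothetical, not from the original):

```c
/* Sketch: the peek/poke primitives behind viewing and editing
 * register-mapped peripheral settings from a shell. 'volatile'
 * keeps the compiler from caching or eliding the bus accesses. */
#include <stdint.h>

static inline uint32_t peek32(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}

static inline void poke32(uintptr_t addr, uint32_t value)
{
    *(volatile uint32_t *)addr = value;
}

/* Example shell usage (hypothetical peripheral at 0xC0001000):
 *   peek 0xC0001000        -- read the control register
 *   poke 0xC0001000 0x1    -- set an enable bit without halting the CPU */
```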
A Real-Time Capable Software-Defined Receiver Using GPU for Adaptive Anti-Jam GPS Sensors
Seo, Jiwon; Chen, Yu-Hsuan; De Lorenzo, David S.; Lo, Sherman; Enge, Per; Akos, Dennis; Lee, Jiyun
2011-01-01
Due to their weak received signal power, Global Positioning System (GPS) signals are vulnerable to radio frequency interference. Adaptive beam and null steering of the gain pattern of a GPS antenna array can significantly increase the resistance of GPS sensors to signal interference and jamming. Since adaptive array processing requires intensive computational power, beamsteering GPS receivers were usually implemented using hardware such as field-programmable gate arrays (FPGAs). However, a software implementation using general-purpose processors is much more desirable because of its flexibility and cost effectiveness. This paper presents a GPS software-defined radio (SDR) with adaptive beamsteering capability for anti-jam applications. The GPS SDR design is based on an optimized desktop parallel processing architecture using a quad-core Central Processing Unit (CPU) coupled with a new generation Graphics Processing Unit (GPU) having massively parallel processors. This GPS SDR demonstrates sufficient computational capability to support a four-element antenna array and future GPS L5 signal processing in real time. After providing the details of our design and optimization schemes for future GPU-based GPS SDR developments, the jamming resistance of our GPS SDR under synthetic wideband jamming is presented. Since the GPS SDR uses commercial-off-the-shelf hardware and processors, it can be easily adopted in civil GPS applications requiring anti-jam capabilities. PMID:22164116
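For intuition, the numerical kernel of such a receiver is a per-sample weighted sum across the antenna elements. The sketch below shows conventional (non-adaptive) phase steering for a four-element linear array; the paper's adaptive weight computation is more involved, and the element spacing and steering angle here are illustrative assumptions.

```python
import numpy as np

# Minimal narrowband beamsteering sketch for a 4-element uniform linear array.
# This only illustrates the weighted-sum structure evaluated per sample on
# the GPU; it is not the paper's adaptive algorithm.
c, f = 3e8, 1575.42e6          # speed of light, GPS L1 carrier (Hz)
lam = c / f
d = lam / 2                    # assumed half-wavelength element spacing
theta = np.deg2rad(30)         # assumed steering direction

n = np.arange(4)
w = np.exp(-1j * 2 * np.pi * d * n * np.sin(theta) / lam)  # steering weights

x = np.random.randn(4, 1000) + 1j * np.random.randn(4, 1000)  # element samples
y = w.conj() @ x               # beamformed output: one stream from four
print(y.shape)                 # (1000,)
```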
The Tera Multithreaded Architecture and Unstructured Meshes
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.; Mavriplis, Dimitri J.
1998-01-01
The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at the San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine-grained multithreading. The main memory is shared, hardware randomized, and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2-processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared-memory machine) running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data, issues that would be of paramount importance in other parallel architectures.
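The full/empty tag bits mentioned above give each memory word producer/consumer semantics in hardware. A thread-based Python model (illustrative only; the Tera implements this per word with no software locks) behaves like this:

```python
import threading

class FullEmptyWord:
    """Software model of a Tera-style full/empty tagged memory word:
    reads block until the word is full, writes block until it is empty."""
    def __init__(self):
        self._cv = threading.Condition()
        self._full = False
        self._value = None

    def write_ef(self, value):          # write when empty, leave full
        with self._cv:
            self._cv.wait_for(lambda: not self._full)
            self._value, self._full = value, True
            self._cv.notify_all()

    def read_fe(self):                  # read when full, leave empty
        with self._cv:
            self._cv.wait_for(lambda: self._full)
            self._full = False
            self._cv.notify_all()
            return self._value

w = FullEmptyWord()
threading.Thread(target=lambda: w.write_ef(42)).start()
print(w.read_fe())  # 42: producer and consumer synchronize with no user-level lock
```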
Parallel implementation of D-Phylo algorithm for maximum likelihood clusters.
Malik, Shamita; Sharma, Dolly; Khatri, Sunil Kumar
2017-03-01
This study explains a newly developed parallel algorithm for phylogenetic analysis of DNA sequences. The newly designed D-Phylo is a more advanced algorithm for phylogenetic analysis using the maximum likelihood approach. D-Phylo, while exploiting the search capacity of k-means, avoids its main limitation of getting stuck at locally conserved motifs. The authors have tested the behaviour of D-Phylo on an Amazon Linux Amazon Machine Image (Hardware Virtual Machine) i2.4xlarge instance (six central processing units, 122 GiB memory, 8 × 800 solid-state drive Elastic Block Store volumes, high network performance) with up to 15 processors for several real-life datasets. Distributing the clusters evenly on all the processors provides the capacity to accomplish a near-linear speedup in the case of a large number of processors.
NASA Technical Reports Server (NTRS)
Bickford, Mark; Srivas, Mandayam
1991-01-01
Presented here is a formal specification and verification of a property of a quadruply redundant fault-tolerant microprocessor system design. A complete listing of the formal specification of the system and the correctness theorems that were proved is given. The system performs the task of obtaining interactive consistency among the processors using a special instruction on the processors. The design is based on an algorithm proposed by Pease, Shostak, and Lamport. The property verified ensures that an execution of the special instruction by the processors correctly accomplishes interactive consistency, provided certain preconditions hold; the verification was carried out using a computer-aided design verification tool, Spectool, and the theorem prover, Clio. A major contribution of the work is the demonstration of a significant fault-tolerant hardware design that is mechanically verified by a theorem prover.
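A simplified model of the interactive-consistency exchange may help. In the Pease-Shostak-Lamport scheme for four processors tolerating one fault, each processor broadcasts its private value, then relays the values it received, and every non-faulty receiver majority-votes per source. The sketch below compresses this into a single relay table; it is illustrative, not the verified design.

```python
from collections import Counter

def majority(values):
    return Counter(values).most_common(1)[0][0]

# relays[i][j][k]: what relay j forwarded to receiver k about source i's
# broadcast. A faulty relay may send different values to different receivers;
# per-source majority voting masks one such fault among four processors.
def agree(relays, receiver):
    return [majority([relays[i][j][receiver] for j in range(4) if j != i])
            for i in range(4)]

private = [1, 2, 3, 0]
relays = [[[private[i]] * 4 for _ in range(4)] for i in range(4)]
for i in range(4):
    relays[i][3] = [9, 8, 7, 6]                  # processor 3 lies as a relay
print([agree(relays, k) for k in range(3)])      # every honest receiver: [1, 2, 3, 0]
```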
Elements of oxygen production systems using Martian atmosphere
NASA Technical Reports Server (NTRS)
Ash, R. L.; Huang, J.-K.; Johnson, P. B.; Sivertson, W. E., Jr.
1986-01-01
Hardware elements have been studied in terms of their applicability to Mars oxygen production systems. Various aspects of the system design are discussed and areas requiring further research are identified. Initial work on system reliability is discussed and a methodology for applying expert system technology to the oxygen processor is described.
Burbank works on the EPIC in the Node 2
2012-02-28
ISS030-E-114433 (29 Feb. 2012) --- In the International Space Station's Destiny laboratory, NASA astronaut Dan Burbank, Expedition 30 commander, upgrades Multiplexer/Demultiplexer (MDM) computers and Portable Computer System (PCS) laptops and installs the Enhanced Processor & Integrated Communications (EPIC) hardware in the Payload 1 (PL-1) MDM.
Development and Operation of a Database Machine for Online Access and Update of a Large Database.
ERIC Educational Resources Information Center
Rush, James E.
1980-01-01
Reviews the development of a fault tolerant database processor system which replaced OCLC's conventional file system. A general introduction to database management systems and the operating environment is followed by a description of the hardware selection, software processes, and system characteristics. (SW)
The Mercury System: Embedding Computation into Disk Drives
2004-08-20
... enabling technologies to build extremely fast data search engines. We do this by moving the search closer to the data, and performing it in hardware ... an engine searches in parallel across a disk or disk surface. 2. System Parallelism: searching is off-loaded to search engines and the main processor can ...
Virtualization for Cost-Effective Teaching of Assembly Language Programming
ERIC Educational Resources Information Center
Cadenas, José O.; Sherratt, R. Simon; Howlett, Des; Guy, Chris G.; Lundqvist, Karsten O.
2015-01-01
This paper describes a virtual system that emulates an ARM-based processor machine, created to replace a traditional hardware-based system for teaching assembly language. The virtual system proposed here integrates, in a single environment, all the development tools necessary to deliver introductory or advanced courses on modern assembly language…
Network Interface Specification for the T1 Microprocessor
1994-05-01
Features data transfer directly to/from processor registers, and hardware dispatch directly to Active Message handlers (along with limited context ...). The document details the data transfer functional units, interconnect structure, and network operation within an application-layer communication model.
NASA Technical Reports Server (NTRS)
Davies, A. G.; Chien, S.; Baker, V.; Castano, R.; Cichy, B.; Doggett, T.; Dohm, J. M.; Greeley, R.; Ip, F.; Rabideau, G.
2005-01-01
ASE has successfully demonstrated that a spacecraft can be driven by science analysis and autonomously controlled. ASE is available for flight on other missions. Mission hardware design should consider ASE requirements for available onboard data storage, onboard memory size and processor speed.
Model Railroading and Computer Fundamentals
ERIC Educational Resources Information Center
McCormick, John W.
2007-01-01
Less than one half of one percent of all processors manufactured today end up in computers. The rest are embedded in other devices such as automobiles, airplanes, trains, satellites, and nearly every modern electronic device. Developing software for embedded systems requires a greater knowledge of hardware than developing for a typical desktop…
Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Alewine, Neal Jon
1993-01-01
Multiple instruction rollback (MIR) is a technique for rapid recovery from transient processor failures and has been implemented in hardware by researchers and in mainframe computers. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs were also developed which remove rollback data hazards directly with data-flow manipulations, thus eliminating the need for most data redundancy hardware. Compiler-assisted techniques to achieve multiple instruction rollback recovery are addressed. It is observed that some data hazards resulting from instruction rollback can be resolved more efficiently by providing hardware redundancy, while others are resolved more efficiently with compiler transformations. A compiler-assisted multiple instruction rollback scheme is developed which combines hardware-implemented data redundancy with compiler-driven hazard-removal transformations. Experimental performance evaluations were conducted which indicate improved efficiency over previous hardware-based and compiler-based schemes. Various enhancements to the compiler transformations and to the data redundancy hardware developed for the compiler-assisted MIR scheme are described and evaluated. The final topic deals with the application of compiler-assisted MIR techniques to aid in exception repair and branch repair in a speculative execution architecture.
NASA Astrophysics Data System (ADS)
Krasilenko, Vladimir G.; Bardachenko, Vitaliy F.; Nikolsky, Alexander I.; Lazarev, Alexander A.
2007-04-01
In the paper we show that the biologically motivated concept of time-pulse encoding offers a number of advantages (a single methodological basis, universality, simplicity of tuning, training, and programming, among others) in the creation and design of sensor systems with parallel input-output and processing, and of 2D structures for hybrid and neuro-fuzzy neurocomputers of next generations. We show the principles of construction of programmable relational optoelectronic time-pulse coded processors, based on continuous logic, order logic, and temporal wave processes, that underlie their creation. We consider a structure that extracts the analog signal of a given grade (order) and sorts analog and time-pulse coded variables. We offer an optoelectronic realization of such base relational elements of order logic, which consists of time-pulse coded phototransformers (pulse-width and pulse-phase modulators) with direct and complementary outputs, a sorting network of logical elements, and programmable commutation blocks. We estimate the basic technical parameters of such base devices and processors by simulation and experimental research: power of optical input signals of 0.200-20 μW, processing time of microseconds, supply voltage of 1.5-10 V, consumption power of hundreds of microwatts per element, extended functional possibilities, and training possibilities. We discuss some aspects of possible rules and principles of training and of programmable tuning to the required function or relational operation, and the realization of hardware blocks for modifications of such processors. We show that on the basis of such quasi-universal hardware, simple blocks, and flexible programmable tuning it is possible to create sorting machines, neural networks, and hybrid data-processing systems with untraditional numerical systems and picture operands.
Scalable Multiprocessor for High-Speed Computing in Space
NASA Technical Reports Server (NTRS)
Lux, James; Lang, Minh; Nishimoto, Kouji; Clark, Douglas; Stosic, Dorothy; Bachmann, Alex; Wilkinson, William; Steffke, Richard
2004-01-01
A report discusses the continuing development of a scalable multiprocessor computing system for hard real-time applications aboard a spacecraft. "Hard real-time applications" signifies applications, like real-time radar signal processing, in which the data to be processed are generated at hundreds of pulses per second, each pulse requiring millions of arithmetic operations. In these applications, the digital processors must be tightly integrated with analog instrumentation (e.g., radar equipment), and data input/output must be synchronized with the analog instrumentation, controlled to within fractions of a microsecond. The scalable multiprocessor is a cluster of identical commercial-off-the-shelf generic DSP (digital-signal-processing) computers plus generic interface circuits, including analog-to-digital converters, all controlled by software. The processors are computers interconnected by high-speed serial links. Performance can be increased by adding hardware modules and correspondingly modifying the software. Work is distributed among the processors in a parallel or pipeline fashion by means of a flexible master/slave control and timing scheme. Each processor operates under its own local clock; synchronization is achieved by broadcasting master time signals to all the processors, which compute offsets between the master clock and their local clocks.
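The clock-offset bookkeeping described in the last sentence can be sketched directly. The class and numbers below are illustrative assumptions; real propagation and interrupt latencies, which the actual system bounds tightly, are ignored.

```python
# Sketch of the broadcast-synchronization scheme: each node records the
# master's broadcast timestamp against its local clock and keeps the offset,
# so later local times can be mapped onto the master timeline.
class NodeClock:
    def __init__(self):
        self.offset = 0.0                  # master_time - local_time

    def on_master_broadcast(self, master_time, local_receive_time):
        # Ignores propagation delay; the real system keeps it sub-microsecond.
        self.offset = master_time - local_receive_time

    def to_master(self, local_time):
        return local_time + self.offset

n = NodeClock()
n.on_master_broadcast(master_time=1000.0, local_receive_time=973.4)
print(n.to_master(980.0))                  # 1006.6
```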
Replication of Space-Shuttle Computers in FPGAs and ASICs
NASA Technical Reports Server (NTRS)
Ferguson, Roscoe C.
2008-01-01
A document discusses the replication of the functionality of the onboard space-shuttle general-purpose computers (GPCs) in field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs). The purpose of the replication effort is to enable utilization of proven space-shuttle flight software and software-development facilities to the extent possible during development of software for flight computers for a new generation of launch vehicles derived from the space shuttles. The replication involves specifying the instruction set of the central processing unit and the input/output processor (IOP) of the space-shuttle GPC in a hardware description language (HDL). The HDL is synthesized to form a "core" processor in an FPGA or, less preferably, in an ASIC. The core processor can be used to create a flight-control card to be inserted into a new avionics computer. The IOP of the GPC as implemented in the core processor could be designed to support data-bus protocols other than that of a multiplexer interface adapter (MIA) used in the space shuttle. Hence, a computer containing the core processor could be tailored to communicate via the space-shuttle GPC bus and/or one or more other buses.
Development of compact fuel processor for 2 kW class residential PEMFCs
NASA Astrophysics Data System (ADS)
Seo, Yu Taek; Seo, Dong Joo; Jeong, Jin Hyeok; Yoon, Wang Lai
Korea Institute of Energy Research (KIER) has been developing a novel fuel processing system to provide hydrogen-rich gas to a residential polymer electrolyte membrane fuel cell (PEMFC) cogeneration system. For the effective design of a compact hydrogen production system, the unit processes of steam reforming, high- and low-temperature water gas shift, steam generation, and internal heat exchange are thermally and physically integrated into a packaged hardware system. Several prototypes are under development; the prototype I fuel processor showed a thermal efficiency of 73% on an HHV basis with a methane conversion of 81%. The recently tested prototype II showed improved performance, with a thermal efficiency of 76% and a methane conversion of 83%. In both prototypes, two-stage PrOx reactors reduce the CO concentration to less than 10 ppm, the prerequisite CO limit of the product gas for the PEMFC stack. After confirming the initial performance of the prototype I fuel processor, it was coupled with a PEMFC single cell to test durability, and it was demonstrated that the fuel processor operated for 3 days successfully without any failure of fuel cell voltage. The prototype II fuel processor also showed stable performance during the durability test.
Assessment of directionality performances: comparison between Freedom and CP810 sound processors.
Razza, Sergio; Albanese, Greta; Ermoli, Lucilla; Zaccone, Monica; Cristofari, Eliana
2013-10-01
To compare speech recognition in noise for the Nucleus Freedom and CP810 sound processors using different directional settings among those available in the SmartSound portfolio. Single-subject, repeated-measures study. Tertiary care referral center. Thirty-one monaurally and binaurally implanted subjects (24 children and 7 adults) were enrolled. They were all experienced Nucleus Freedom sound processor users and achieved a 100% open-set word recognition score in quiet listening conditions. Each patient was fitted with the Freedom and the CP810 processor. The program setting incorporated Adaptive Dynamic Range Optimization (ADRO) and adopted the directional algorithm BEAM (both devices) and ZOOM (only on CP810). Speech reception threshold (SRT) was assessed in a free-field layout, with disyllabic word lists and interfering multilevel babble noise in the 3 different pre-processing configurations. On average, the CP810 significantly improved patients' SRTs as compared to the Freedom SP after 1 hour of use. In contrast, no significant difference was observed in patients' SRTs between the BEAM and the ZOOM algorithm fitted in the CP810 processor. The results suggest that hardware developments achieved in the design of the CP810 allow an immediate and relevant directional advantage as compared to the previous-generation Freedom device.
An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm
NASA Technical Reports Server (NTRS)
Weir, John M.; Wells, B. Earl
2003-01-01
Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high-speed reconfigurable computing hardware environments such as those present in modern FPGAs. In this paper, a distributed genetic algorithm that can be applied to the agent-based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single-processor implementations.
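The paper's agent-to-FPGA mapping is not reproduced here, but the underlying distributed-GA idea can be sketched as an island model in which independent agents evolve subpopulations and periodically exchange their best individuals. All parameters below are illustrative assumptions.

```python
import random

# Generic island-model GA sketch echoing the agent paradigm above:
# independent agents evolve subpopulations and exchange migrants.
TARGET_LEN = 32                        # maximize the number of 1-bits
fitness = lambda ind: sum(ind)

def evolve(pop, generations=20):
    for _ in range(generations):
        a, b = random.sample(pop, 2)
        cut = random.randrange(TARGET_LEN)
        child = a[:cut] + b[cut:]                         # one-point crossover
        i = random.randrange(TARGET_LEN); child[i] ^= 1   # bit-flip mutation
        pop.sort(key=fitness)
        if fitness(child) > fitness(pop[0]):
            pop[0] = child                                # replace the worst
    return pop

islands = [[[random.randint(0, 1) for _ in range(TARGET_LEN)]
            for _ in range(10)] for _ in range(4)]
for epoch in range(10):
    islands = [evolve(p) for p in islands]
    best = max((ind for p in islands for ind in p), key=fitness)
    for p in islands:                                     # migration step
        p[0] = best[:]
print(fitness(best))
```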
NASA Astrophysics Data System (ADS)
Ishii, Akira; Tai, Haruka; Mitsudo, Jun
2007-10-01
This paper describes a real-time system for measuring the three-dimensional shape of solder bumps arrayed on an LSI chip-size-package (CSP) board presented for inspection based on the shape-from-focus technique. It uses a copper-alloy mirror deformed by a piezoelectric actuator as a varifocal mirror enabling a simple, fast, precise focusing mechanism without moving parts to be built. A practical measuring speed of 1.69 s/package for a small CSP board (4 x 4 mm2) was achieved by incorporating an exclusive field programmable gate array processor to calculate focus measure and by constructing a domed array of LEDs as a high-intensity, uniform illumination system so that a fast (150 fps) and high-resolution (1024 x 1024 pixels/frame) CMOS image sensor could be used. Accurate measurements of bump height were also achieved with errors of 10 μm (2σ) meeting the requirements for testing the coplanarity of a bump array.
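The shape-from-focus principle behind the system reduces to scoring local sharpness per pixel across a focal stack and taking the depth of maximum focus measure. The sketch below assumes a modified-Laplacian focus measure and synthetic data; the paper's FPGA computes its own focus measure in dedicated hardware.

```python
import numpy as np

# Shape-from-focus sketch: for a stack of images taken at known focus depths,
# score per-pixel sharpness and take the depth where the measure peaks.
def focus_measure(img):
    lap_x = np.abs(2 * img - np.roll(img, 1, 0) - np.roll(img, -1, 0))
    lap_y = np.abs(2 * img - np.roll(img, 1, 1) - np.roll(img, -1, 1))
    return lap_x + lap_y

def height_map(stack, depths):
    scores = np.stack([focus_measure(im.astype(float)) for im in stack])
    return depths[np.argmax(scores, axis=0)]   # per-pixel best-focus depth

stack = [np.random.rand(64, 64) for _ in range(16)]   # synthetic focal stack
depths = np.linspace(0, 150e-6, 16)                   # assumed 0-150 um sweep
print(height_map(stack, depths).shape)                # (64, 64)
```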
FPGA Coprocessor for Accelerated Classification of Images
NASA Technical Reports Server (NTRS)
Pingree, Paula J.; Scharenbroich, Lucas J.; Werne, Thomas A.
2008-01-01
An effort related to that described in the preceding article focuses on developing a spaceborne processing platform for fast and accurate onboard classification of image data, a critical part of modern satellite image processing. The approach again has been to exploit the versatility of the recently developed hybrid Virtex-4FX field-programmable gate array (FPGA) to run diverse science applications on embedded processors while taking advantage of the reconfigurable hardware resources of the FPGAs. In this case, the FPGA serves as a coprocessor that implements legacy C-language support-vector-machine (SVM) image-classification algorithms to detect and identify natural phenomena such as flooding, volcanic eruptions, and sea-ice break-up. The FPGA provides hardware acceleration for greater onboard processing capability than previously demonstrated in software. The original C-language program, demonstrated on an imaging instrument aboard the Earth Observing-1 (EO-1) satellite, implements a linear-kernel SVM algorithm for classifying parts of the images as snow, water, ice, land, cloud, or unclassified. Current onboard processors, such as on EO-1, have limited computing power and extremely limited active storage capability, and are no longer considered state-of-the-art. Using commercially available software that translates C-language programs into hardware description language (HDL) files, the legacy C-language program, and two newly formulated programs for a more capable expanded-linear-kernel and a more accurate polynomial-kernel SVM algorithm, have been implemented in the Virtex-4FX FPGA. In tests, the FPGA implementations have exhibited significant speedups over conventional software implementations running on general-purpose hardware.
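The linear-kernel decision rule at the heart of the legacy classifier reduces to a few dot products per pixel, which is what makes it attractive for FPGA acceleration. A sketch with placeholder weights follows; the real weights come from training, and the band count is an assumption.

```python
import numpy as np

# Linear-kernel SVM decision sketch: per-pixel class is the argmax over
# one-vs-rest scores w_c . x + b_c. Weights here are random stand-ins;
# onboard, the same dot products are evaluated in FPGA hardware.
classes = ["snow", "water", "ice", "land", "cloud", "unclassified"]
n_bands = 9                                   # assumed spectral bands per pixel
W = np.random.randn(len(classes), n_bands)    # trained weights (placeholder)
b = np.random.randn(len(classes))             # trained biases (placeholder)

def classify(pixels):                         # pixels: (N, n_bands)
    scores = pixels @ W.T + b                 # (N, n_classes)
    return [classes[i] for i in np.argmax(scores, axis=1)]

print(classify(np.random.rand(3, n_bands)))
```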
Environmental Control and Life Support Integration Strategy for 6-Crew Operations
NASA Technical Reports Server (NTRS)
2009-01-01
The International Space Station (ISS) crew complement will be increasing in size from 3 to 6 crew members in the summer of 2009. In order to support this increase in crew on ISS, the United States On-orbit Segment (USOS) has been outfitted with a suite of regenerative Environmental Control and Life Support (ECLS) hardware including an Oxygen Generation System (OGS), Waste and Hygiene Compartment (WHC), and a Water Recovery System (WRS). The WRS includes the Urine Processor Assembly (UPA) and the Water Processor Assembly (WPA). A critical step in advancing to a 6-crew support capability on ISS is a full checkout and verification of the regenerative ECLS hardware. With a successful checkout, the ISS will achieve full redundancy in its on-orbit life support system between the USOS and the Russian Segment (RS). The additional redundancy created by the regenerative ECLS hardware creates the opportunity for independent support capabilities between segments and, for the first time since the start of ISS, the necessity to revise Life Support strategy agreements. Independent operating strategies coupled with the loss of the Space Shuttle supply and return capabilities in 2010 pose additional challenges. These challenges create the need for a higher level of on-orbit consumables reserve to ensure crew member life support during a system failure. This paper will discuss the evolution of the ISS Life Support hardware strategy in support of 6 crew on ISS, as well as the continued work which will be necessary to ensure the support of crew and ISS Program objectives through the life of the station.
Recent developments in microfluidic large scale integration.
Araci, Ismail Emre; Brisk, Philip
2014-02-01
In 2002, Thorsen et al. integrated thousands of micromechanical valves on a single microfluidic chip and demonstrated that the control of fluidic networks can be simplified through multiplexors [1]. This enabled the realization of highly parallel and automated fluidic processes with a substantial sample-economy advantage. Moreover, the fabrication of these devices by multilayer soft lithography was easy and reliable, which contributed to the power of the technology: microfluidic large scale integration (mLSI). Since then, mLSI has found use in a wide variety of applications in biology and chemistry. In the meantime, efforts to improve the technology have been ongoing. These efforts mostly focus on novel materials, components, micromechanical valve actuation methods, and chip architectures for mLSI. In this review, these technological advances are discussed and recent examples of mLSI applications are summarized. Copyright © 2013 Elsevier Ltd. All rights reserved.
Investigation of transient thermal dissipation in thinned LSI for advanced packaging
NASA Astrophysics Data System (ADS)
Araga, Yuuki; Shimamoto, Haruo; Melamed, Samson; Kikuchi, Katsuya; Aoyagi, Masahiro
2018-04-01
Thinning of LSI is necessary for superior form factor and performance in dense cutting-edge packaging technologies. At the same time, degradation of thermal characteristics caused by the steep thermal gradient in LSIs with thinned base silicon is a concern. To manage the thermal environment in advanced packages, the thermal characteristics of thinned LSIs must be clarified. In this study, static and dynamic thermal dissipation were analyzed before and after thinning the silicon to determine the variations of thermal characteristics in thinned LSI. Measurement results revealed that silicon thinning affects dynamic as well as static thermal characteristics. The transient variations of the thermal characteristics of thinned LSI are verified precisely by analysis using an equivalent model based on the thermal network method. The results of the analysis suggest that transient thermal characteristics can be easily estimated by employing the equivalent model.
Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure.
Zhang, Wen; Xiao, Fan; Li, Bin; Zhang, Siguang
2016-01-01
Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) was proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is usually criticized as having low discriminative power for representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods.
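As a baseline for what SVD on clusters refines, plain LSI can be sketched as a truncated SVD of the term-document matrix followed by cosine similarity in the latent space; the matrix below is a random stand-in for real term-document counts.

```python
import numpy as np

# Plain-LSI baseline sketch: truncated SVD of a term-document matrix, then
# cosine similarity in the reduced space. "SVD on clusters" applies the same
# decomposition per document cluster instead of globally.
A = np.random.rand(500, 40)                   # terms x documents (toy data)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 10                                        # retained latent dimensions
docs_k = (np.diag(s[:k]) @ Vt[:k]).T          # documents in k-dim latent space

def cos_sim(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(cos_sim(docs_k[0], docs_k[1]))          # interdocument similarity
```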
Distributed asynchronous microprocessor architectures in fault tolerant integrated flight systems
NASA Technical Reports Server (NTRS)
Dunn, W. R.
1983-01-01
The paper discusses the implementation of fault-tolerant digital flight control and navigation systems for rotorcraft application. It is shown that in implementing fault tolerance at the systems level using advanced LSI/VLSI technology, aircraft physical layout and flight systems requirements tend to define a system architecture of distributed, asynchronous microprocessors in which fault tolerance can be achieved locally through hardware redundancy and/or globally through application of analytical redundancy. The effects of asynchronism on the execution of dynamic flight software are discussed. It is shown that if the asynchronous microprocessors have knowledge of time, these errors can be significantly reduced through appropriate modifications of the flight software. Finally, the paper extends previous work to show that through the combined use of time referencing and stable flight algorithms, individual microprocessors can be configured to autonomously tolerate intermittent faults.
Causality and Information Dynamics in Networked Systems with Many Agents
2017-05-11
... representation, the LSI filter given by $A(\tau)$ must be invertible, and a sufficient condition for this invertibility is that there is some $c > 0$ such that the ... linear shift-invariant (LSI) filter $\tilde{B}_{ij}(z) = \sum_{\tau=1}^{p} B_{ij}(\tau)\, z^{-\tau}$, whose coefficients are arranged into a column vector $\tilde{B}_{ij} = (B_{ij}(1), \ldots, B_{ij}(p))$. In ... to the adjacency matrix of the underlying graph, so that looking "depth-wise" at location $ij$ gives the coefficients $\tilde{B}_{ij} \in \mathbb{R}^{p}$ of the LSI filter from ...
The artificial retina processor for track reconstruction at the LHC crossing rate
Abba, A.; Bedeschi, F.; Citterio, M.; ...
2015-03-16
We present results of an R&D study for a specialized processor capable of precisely reconstructing, in pixel detectors, hundreds of charged-particle tracks from high-energy collisions at 40 MHz rate. We apply a highly parallel pattern-recognition algorithm, inspired by studies of the processing of visual images by the brain as it happens in nature, and describe in detail an efficient hardware implementation in high-speed, high-bandwidth FPGA devices. This is the first detailed demonstration of reconstruction of offline-quality tracks at 40 MHz and makes the device suitable for processing Large Hadron Collider events at the full crossing frequency.
An Embedded Laser Marking Controller Based on ARM and FPGA Processors
Dongyun, Wang; Xinpiao, Ye
2014-01-01
Laser marking is an important branch of laser information processing technology. Existing laser marking machines, based on PCs and the Windows operating system, are large and inconvenient to move, and they cannot work outdoors or in other harsh environments. In order to compensate for the above-mentioned disadvantages, this paper proposes an embedded laser marking controller based on ARM and FPGA processors. Based on the principle of laser galvanometer scanning marking, the hardware and software were designed for the application. Experiments showed that this new embedded laser marking controller drives the galvanometers synchronously and can achieve precise marking. PMID:24772028
Does the Intel Xeon Phi processor fit HEP workloads?
NASA Astrophysics Data System (ADS)
Nowak, A.; Bitzes, G.; Dotti, A.; Lazzaro, A.; Jarp, S.; Szostek, P.; Valsan, L.; Botezatu, M.; Leduc, J.
2014-06-01
This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis-à-vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization and vectorization on benchmarks involving software frameworks such as Geant4 and ROOT. Finally, we extrapolate current software and hardware trends and project them onto accelerators of the future, with the specifics of offline and online HEP processing in mind.
Digital Platform for Wafer-Level MEMS Testing and Characterization Using Electrical Response
Brito, Nuno; Ferreira, Carlos; Alves, Filipe; Cabral, Jorge; Gaspar, João; Monteiro, João; Rocha, Luís
2016-01-01
The uniqueness of microelectromechanical system (MEMS) devices, with their multiphysics characteristics, presents some limitations to the borrowed test methods from traditional integrated circuits (IC) manufacturing. Although some improvements have been performed, this specific area still lags behind when compared to the design and manufacturing competencies developed over the last decades by the IC industry. A complete digital solution for fast testing and characterization of inertial sensors with built-in actuation mechanisms is presented in this paper, with a fast, full-wafer test as a leading ambition. The full electrical approach and flexibility of modern hardware design technologies allow a fast adaptation for other physical domains with minimum effort. The digital system encloses a processor and the tailored signal acquisition, processing, control, and actuation hardware control modules, capable of the structure position and response analysis when subjected to controlled actuation signals in real time. The hardware performance, together with the simplicity of the sequential programming on a processor, results in a flexible and powerful tool to evaluate the newest and fastest control algorithms. The system enables measurement of resonant frequency (Fr), quality factor (Q), and pull-in voltage (Vpi) within 1.5 s with repeatability better than 5 ppt (parts per thousand). A full-wafer with 420 devices under test (DUTs) has been evaluated detecting the faulty devices and providing important design specification feedback to the designers. PMID:27657087
Hardware Acceleration of Sparse Cognitive Algorithms
2016-05-01
Processor in Memory (PiM) extensions and a 65 nm ASIC version were compared against a 28 nm GPU baseline using the KTH video action recognition dataset ... for improvement of performance per unit of power for customized implementations of the Sparsey and Numenta Hierarchical Temporal Memory (HTM) ...
Read All about It: Motivate Your Students with These Exercises
ERIC Educational Resources Information Center
Tuttle, Harry Grover
2007-01-01
Educators at elementary, middle, and high school levels will find that integrating digital tools and resources--many commonly used by students in their "out of school" lives--can be a springboard to creativity and new skills. In this article, the author describes how word processors, presentation software and hardware, mind-mapping applications,…
1987-05-01
... workload (beyond that of, say, an equivalent academic or corporate technical library) for the Defense Department libraries. Figure 9 illustrates the range ... summer. The hardware configuration for the system is as follows: a Digital Equipment Corporation VAX 11/750 central processor with 6 megabytes of real ...
2008-09-01
... of magnetic UXO. The prototype STAR Sensor comprises: (a) a cubic array of eight fluxgate magnetometers; (b) a 24-channel data acquisition/signal ... array (shaded boxes) of eight low-noise Triaxial Fluxgate Magnetometers (TFM) develops 24 channels of vector B-field data. Processor hardware ...
Simulation of Fault Tolerance in a Hypercube Arrangement of Discrete Processors.
1987-12-01
[Front-matter fragment: table of contents and list of figures. The report covers hypercube geometric and binary properties, the Intel hypercube hardware arrangement, cube-connected cycles (CCC) and their properties and redundancy, and re-configurable cube-connected cycles; figures include hypercubes of different dimensions and hypercube properties.]
DOE Office of Scientific and Technical Information (OSTI.GOV)
2012-12-17
Grizzly is a simulation tool for assessing the effects of age-related degradation on systems, structures, and components of nuclear power plants. Grizzly is built on the MOOSE framework, and uses a Jacobian-free Newton Krylov method to obtain solutions to tightly coupled thermo-mechanical simulations. Grizzly runs on a wide range of hardware, from a single processor to massively parallel machines.
Video sensor architecture for surveillance applications.
Sánchez, Jordi; Benet, Ginés; Simó, José E
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%.
NASA Astrophysics Data System (ADS)
Pillans, Luke; Harmer, Jack; Edwards, Tim; Richardson, Lee
2016-05-01
Geolocation is the process of calculating a target position based on bearing and range relative to the known location of the observer. A high performance thermal imager with integrated geolocation functions is a powerful long range targeting device. Firefly is a software defined camera core incorporating a system-on-a-chip processor running the Android™ operating system. The processor has a range of industry standard serial interfaces which were used to interface to peripheral devices including a laser rangefinder and a digital magnetic compass. The core has built in Global Positioning System (GPS) which provides the third variable required for geolocation. The graphical capability of Firefly allowed flexibility in the design of the man-machine interface (MMI), so the finished system can give access to extensive functionality without appearing cumbersome or over-complicated to the user. This paper covers both the hardware and software design of the system, including how the camera core influenced the selection of peripheral hardware, and the MMI design process which incorporated user feedback at various stages.
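The geolocation arithmetic described in the first sentence is compact enough to sketch. The flat-earth projection below is a reasonable approximation at laser-rangefinder distances and is an assumption; the production system may well use a full geodesic model.

```python
import math

# Geolocation sketch: project a target position from the observer's GPS fix,
# the compass bearing, and the laser range (flat-earth approximation).
def geolocate(lat_deg, lon_deg, bearing_deg, range_m):
    R = 6371000.0                                           # mean Earth radius, m
    b = math.radians(bearing_deg)
    dn, de = range_m * math.cos(b), range_m * math.sin(b)   # north, east offsets
    dlat = math.degrees(dn / R)
    dlon = math.degrees(de / (R * math.cos(math.radians(lat_deg))))
    return lat_deg + dlat, lon_deg + dlon

print(geolocate(51.5, -0.1, bearing_deg=45.0, range_m=2000.0))
```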
FPGA-accelerated algorithm for the regular expression matching system
NASA Astrophysics Data System (ADS)
Russek, P.; Wiatr, K.
2015-01-01
This article describes an algorithm to support a regular-expression matching system. The goal was to achieve an attractive-performance system with low energy consumption. The basic idea of the algorithm comes from the concept of the Bloom filter. It starts from the extraction of static sub-strings from the strings of the regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and by the central processing unit (CPU). A pipelined custom processor architecture is proposed and the software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in a field-programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
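A software model of the filter-then-confirm idea follows. The paper implements the filter stage in FPGA hardware; the fragment length, hash scheme, and patterns below are illustrative assumptions, and the static substrings are listed by hand rather than extracted automatically.

```python
import re

# Bloom-prefilter sketch: hash fixed-length fragments of each regex's static
# substrings into a bit array; scan input windows through the same hashes and
# run the full regex engine only on flagged positions.
M, K, W = 1 << 16, 3, 4                     # filter bits, hash count, fragment length
bits = bytearray(M // 8)
hashes = lambda s: [hash((i, s)) % M for i in range(K)]

def add(fragment):
    for h in hashes(fragment):
        bits[h // 8] |= 1 << (h % 8)

def maybe(fragment):
    return all(bits[h // 8] & (1 << (h % 8)) for h in hashes(fragment))

patterns = [re.compile(p) for p in (rb"EVIL[0-9]+CODE", rb"badstring")]
for static in (b"EVIL", b"CODE", b"bads"):  # static substrings, W-byte chunks
    add(static[:W])

def scan(data):
    for i in range(len(data) - W + 1):
        if maybe(data[i:i + W]):            # cheap filter pass (hardware role)
            for p in patterns:              # expensive confirmation pass (CPU role)
                if p.search(data):
                    return p.pattern
    return None

print(scan(b"xxEVIL123CODEyy"))             # b'EVIL[0-9]+CODE'
```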
ATCA digital controller hardware for vertical stabilization of plasmas in tokamaks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Batista, A. J. N.; Sousa, J.; Varandas, C. A. F.
2006-10-15
The efficient vertical stabilization (VS) of plasmas in tokamaks requires a fast reaction of the VS controller, for example, after detection of edge localized modes (ELMs). For controlling the effects of very large ELMs, a new digital control hardware, based on the Advanced Telecommunications Computing Architecture (ATCA), is being developed, aiming to reduce the VS digital control loop cycle (down to an optimal value of 10 μs) and improve the algorithm performance. The system has 1 ATCA processor module and up to 12 ATCA control modules, each one with 32 analog input channels (12-bit resolution), 4 analog output channels (12-bit resolution), and 8 digital input/output channels. The Aurora and PCI Express communication protocols will be used for data transport between modules, with expected latencies below 2 μs. Control algorithms are implemented on an ix86-based processor with 6 Gflops and on field programmable gate arrays with 80 GMACS, interconnected by serial gigabit links in a full mesh topology.
NASA Technical Reports Server (NTRS)
Butler, Roy
2013-01-01
The growth in computer hardware performance, coupled with reduced energy requirements, has led to a rapid expansion of the resources available to software systems, driving them towards greater logical abstraction, flexibility, and complexity. This shift in focus from compacting functionality into a limited field towards developing layered, multi-state architectures in a grand field has both driven and been driven by the history of embedded processor design in the robotic spacecraft industry. The combinatorial growth of interprocess conditions is accompanied by benefits (concurrent development, situational autonomy, and evolution of goals) and drawbacks (late integration, non-deterministic interactions, and multifaceted anomalies) in achieving mission success, as illustrated by the case of the Mars Reconnaissance Orbiter. Approaches to optimizing the benefits while mitigating the drawbacks have taken the form of the formalization of requirements, modular design practices, extensive system simulation, and spacecraft data trend analysis. The growth of hardware capability and software complexity can be expected to continue, with future directions including stackable commodity subsystems, computer-generated algorithms, runtime reconfigurable processors, and greater autonomy.
Programmable computing with a single magnetoresistive element
NASA Astrophysics Data System (ADS)
Ney, A.; Pampuch, C.; Koch, R.; Ploog, K. H.
2003-10-01
The development of transistor-based integrated circuits for modern computing is a story of great success. However, the proved concept for enhancing computational power by continuous miniaturization is approaching its fundamental limits. Alternative approaches consider logic elements that are reconfigurable at run-time to overcome the rigid architecture of the present hardware systems. Implementation of parallel algorithms on such `chameleon' processors has the potential to yield a dramatic increase of computational speed, competitive with that of supercomputers. Owing to their functional flexibility, `chameleon' processors can be readily optimized with respect to any computer application. In conventional microprocessors, information must be transferred to a memory to prevent it from getting lost, because electrically processed information is volatile. Therefore the computational performance can be improved if the logic gate is additionally capable of storing the output. Here we describe a simple hardware concept for a programmable logic element that is based on a single magnetic random access memory (MRAM) cell. It combines the inherent advantage of a non-volatile output with flexible functionality which can be selected at run-time to operate as an AND, OR, NAND or NOR gate.
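A behavioral model shows how a single threshold element plus a run-time program bias can yield all four functions with a storable output. The thresholds below are illustrative assumptions, not the physical switching fields of the MRAM cell.

```python
# Behavioral model of the reconfigurable-gate idea: two logic inputs plus a
# run-time "program" bias feed one threshold element, and the stored
# (non-volatile) state supplies the output.
def gate(a, b, function):
    bias, invert = {"AND": (0, False), "OR": (1, False),
                    "NAND": (0, True), "NOR": (1, True)}[function]
    out = int(a + b + bias >= 2)       # one threshold element covers all four
    return int(not out) if invert else out

for f in ("AND", "OR", "NAND", "NOR"):
    print(f, [gate(a, b, f) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])
# AND [0,0,0,1]  OR [0,1,1,1]  NAND [1,1,1,0]  NOR [1,0,0,0]
```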
INTEGRATED MONITORING HARDWARE DEVELOPMENTS AT LOS ALAMOS
DOE Office of Scientific and Technical Information (OSTI.GOV)
R. PARKER; J. HALBIG; ET AL
1999-09-01
The hardware of the integrated monitoring system supports a family of instruments having a common internal architecture and firmware. Instruments can be easily configured from application-specific personality boards combined with common master-processor and high- and low-voltage power supply boards, and basic operating firmware. The instruments are designed to function autonomously to survive power and communication outages and to adapt to changing conditions. The personality boards allow measurement of gross gammas and neutrons, neutron coincidence and multiplicity, and gamma spectra. In addition, the Intelligent Local Node (ILON) provides a moderate-bandwidth network to tie together instruments, sensors, and computers.
RAMP: A fault tolerant distributed microcomputer structure for aircraft navigation and control
NASA Technical Reports Server (NTRS)
Dunn, W. R.
1980-01-01
RAMP consists of distributed sets of parallel computers partitioned on the basis of software and packaging constraints. To minimize hardware and software complexity, the processors operate asynchronously. It was shown that through the design of asymptotically stable control laws, data errors due to the asynchronism were minimized. It was further shown that by designing control laws with this property and making minor hardware modifications to the RAMP modules, the system became inherently tolerant of intermittent faults. A laboratory version of RAMP was constructed and is described in the paper along with the experimental results.
A 3D-PIV System for Gas Turbine Applications
NASA Astrophysics Data System (ADS)
Acharya, Sumanta
2002-08-01
Funds were received in April 2001 under the Department of Defense DURIP program for the construction of a 48-processor high-performance computing cluster. This report details the hardware that was purchased and how it has been used to enable and enhance research activities directly supported by, and of interest to, the Air Force Office of Scientific Research and the Department of Defense. The report is divided into two major sections. The first section after the summary describes the computer cluster, its setup, and the cluster hardware, and presents highlights of those efforts since installation of the cluster.
Dense and Sparse Matrix Operations on the Cell Processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel W.; Shalf, John; Oliker, Leonid
2005-05-01
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, using a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.
Color sensor and neural processor on one chip
NASA Astrophysics Data System (ADS)
Fiesler, Emile; Campbell, Shannon R.; Kempem, Lother; Duong, Tuan A.
1998-10-01
Low-cost, compact, and robust color sensors that can operate in real time under various environmental conditions can benefit many applications, including quality control, chemical sensing, food production, medical diagnostics, energy conservation, monitoring of hazardous waste, and recycling. Unfortunately, existing color sensors are either bulky and expensive or do not provide the required speed and accuracy. In this publication we describe the design of an accurate real-time color classification sensor, together with preprocessing and a subsequent neural network processor, integrated on a single complementary metal oxide semiconductor (CMOS) integrated circuit. This one-chip sensor and information processor will be low in cost, robust, and mass-producible using standard commercial CMOS processes. The performance of the chip and the feasibility of its manufacture are proven through computer simulations based on CMOS hardware parameters. Comparisons with competing methodologies show a significantly higher performance for our device.
NASA Astrophysics Data System (ADS)
Selker, Ted
1983-05-01
Lens focusing using a hardware model of a retina (a Reticon RL256 light-sensitive array) with a low-cost processor (an 8085 with 512 bytes of ROM and 512 bytes of RAM) was built. This system was developed and tested on a variety of visual stimuli to demonstrate that: (a) an algorithm which moves a lens to maximize the sum of the differences of light level on adjacent light sensors will converge to best focus in all but contrived situations, a simpler algorithm than any previously suggested; (b) it is feasible to use unmodified video sensor arrays with inexpensive processors to aid video camera use (in the future, software could be developed to extend the processor's usefulness, possibly to track an actor by panning and zooming to give a camera operator increased ease of framing); (c) lateral inhibition is an adequate basis for determining best focus, which supports a simple anatomically motivated model of how our brain focuses our eyes.
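Point (a) is easy to model: the focus measure is the sum of absolute differences between adjacent sensor elements, and the lens is stepped uphill until the measure stops improving. In the sketch below, `read_sensor` is a hypothetical stand-in for the Reticon line-array readout at a given lens position, and the toy lens model is an assumption.

```python
# Sketch of the focusing rule: sharpness = sum of adjacent-element differences;
# hill-climb the lens position until no neighboring position improves it.
def sharpness(samples):
    return sum(abs(a - b) for a, b in zip(samples, samples[1:]))

def autofocus(read_sensor, pos=0, step=1, lo=0, hi=100):
    best = sharpness(read_sensor(pos))
    while True:
        for cand in (pos - step, pos + step):
            if lo <= cand <= hi and sharpness(read_sensor(cand)) > best:
                best, pos = sharpness(read_sensor(cand)), cand
                break
        else:
            return pos            # no neighbor improves: best focus reached

# Toy lens model: scene contrast peaks at lens position 42.
read = lambda p: [((i * 7) % 13) * max(0.0, 1 - abs(p - 42) / 50)
                  for i in range(64)]
print(autofocus(read))            # 42
```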
An optical processor for object recognition and tracking
NASA Technical Reports Server (NTRS)
Sloan, J.; Udomkesmalee, S.
1987-01-01
The design and development of a miniaturized optical processor that performs real-time image correlation are described. The optical correlator utilizes the Vander Lugt matched spatial filter technique. The correlation output, a focused beam of light, is imaged onto a CMOS photodetector array. In addition to performing target recognition, the device also tracks the target. The hardware, composed of optical and electro-optical components, occupies only 590 cu cm of volume. A complete correlator system would also include an input imaging lens. This optical processing system is compact, rugged, requires only 3.5 watts of operating power, and weighs less than 3 kg. It represents a major achievement in miniaturizing optical processors. When considered as a special-purpose processing unit, it is an attractive alternative to conventional digital image recognition processing. It is conceivable that the combined technology of both optical and digital processing could result in a very advanced robot vision system.
Fault-Tolerant, Real-Time, Multi-Core Computer System
NASA Technical Reports Server (NTRS)
Gostelow, Kim P.
2012-01-01
A document discusses a fault-tolerant, self-aware, low-power, multi-core computer for space missions with thousands of simple cores, achieving speed through concurrency. The proposed machine decides how to achieve concurrency in real time, rather than depending on programmers. The driving features of the system are simple hardware that is modular in the extreme, with no shared memory, and software with significant runtime reorganizing capability. The document describes a mechanism for moving ongoing computations and data that is based on a functional model of execution. Because there is no shared memory, the processor connects to its neighbors through a high-speed data link. Messages are sent to a neighbor switch, which in turn forwards that message on to its neighbor until reaching the intended destination. Except for the neighbor connections, processors are isolated and independent of each other. The processors on the periphery also connect chip-to-chip, thus building up a large processor net. There is no particular topology to the larger net, as a function at each processor allows it to forward a message in the correct direction. Some chip-to-chip connections are not necessarily nearest neighbors, providing short cuts for some of the longer physical distances. The peripheral processors also provide the connections to sensors, actuators, radios, science instruments, and other devices with which the computer system interacts.
A measurement-based study of concurrency in a multiprocessor
NASA Technical Reports Server (NTRS)
Mcguire, Patrick John
1987-01-01
A systematic measurement-based methodology for characterizing the amount of concurrency present in a workload, and the effect of concurrency on system performance indices such as cache miss rate and bus activity, is developed. Hardware and software instrumentation of an Alliant FX/8 was used to obtain data from a real workload environment. Results show that 35% of the workload is concurrent, with the concurrent periods typically using all available processors. Measurements of periods of change in concurrency show uneven usage of processors during these times. Other system measures, including cache miss rate and processor bus activity, are analyzed with respect to the concurrency measures. The probability of a cache miss is seen to increase with concurrency. The change in cache miss rate is much more sensitive to the fraction of concurrent code in the workload than to the number of processors active during concurrency. Regression models are developed to quantify the relationships between cache miss rate, bus activity, and the concurrency measures. The model for cache miss rate predicts an increase in the median miss rate value of as much as 300% for a 100% increase in concurrency in the workload.
Electro-optic voltage sensor with Multiple Beam Splitting
Woods, Gregory K.; Renak, Todd W.; Crawford, Thomas M.; Davidson, James R.
2000-01-01
A miniature electro-optic voltage sensor system capable of accurate operation at high voltages without the use of dedicated voltage-dividing hardware. The invention achieves voltage measurement without significant error contributions from neighboring conductors or environmental perturbations. The invention employs a transmitter, a sensor, a detector, and a signal processor. The transmitter produces a beam of electromagnetic radiation which is routed into the sensor. Within the sensor the beam undergoes the Pockels electro-optic effect. The electro-optic effect produces a modulation of the beam's polarization, which is in turn converted to a pair of independent conversely-amplitude-modulated signals, from which the voltage of the E-field is determined by the signal processor. The use of converse AM signals enables the signal processor to better distinguish signal from noise. The sensor converts the beam by splitting the beam in accordance with the axes of the beam's polarization state (an ellipse) into at least two AM signals. These AM signals are fed into a signal processor and processed to determine the voltage between a ground conductor and the conductor on which voltage is being measured.
Multibus-based parallel processor for simulation
NASA Technical Reports Server (NTRS)
Ogrady, E. P.; Wang, C.-H.
1983-01-01
A Multibus-based parallel processor simulation system is described. The system is intended to serve as a vehicle for gaining hands-on experience, testing system and application software, and evaluating parallel processor performance during development of a larger system based on the horizontal/vertical-bus interprocessor communication mechanism. The prototype system consists of up to seven Intel iSBC 86/12A single-board computers which serve as processing elements, a multiple transmission controller (MTC) designed to support system operation, and an Intel Model 225 Microcomputer Development System which serves as the user interface and input/output processor. All components are interconnected by a Multibus/IEEE 796 bus. An important characteristic of the system is that it provides a mechanism for a processing element to broadcast data to other selected processing elements. This parallel transfer capability is provided through the design of the MTC and a minor modification to the iSBC 86/12A board. The operation of the MTC, the basic hardware-level operation of the system, and pertinent details about the iSBC 86/12A and the Multibus are described.
NASA Astrophysics Data System (ADS)
Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos
2011-01-01
General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
Assessing the Progress of Trapped-Ion Processors Towards Fault-Tolerant Quantum Computation
NASA Astrophysics Data System (ADS)
Bermudez, A.; Xu, X.; Nigmatullin, R.; O'Gorman, J.; Negnevitsky, V.; Schindler, P.; Monz, T.; Poschinger, U. G.; Hempel, C.; Home, J.; Schmidt-Kaler, F.; Biercuk, M.; Blatt, R.; Benjamin, S.; Müller, M.
2017-10-01
A quantitative assessment of the progress of small prototype quantum processors towards fault-tolerant quantum computation is a problem of current interest in experimental and theoretical quantum information science. We introduce a necessary and fair criterion for quantum error correction (QEC), which must be achieved in the development of these quantum processors before their sizes are sufficiently big to consider the well-known QEC threshold. We apply this criterion to benchmark the ongoing effort in implementing QEC with topological color codes using trapped-ion quantum processors and, more importantly, to guide the future hardware developments that will be required in order to demonstrate beneficial QEC with small topological quantum codes. In doing so, we present a thorough description of a realistic trapped-ion toolbox for QEC and a physically motivated error model that goes beyond standard simplifications in the QEC literature. We focus on laser-based quantum gates realized in two-species trapped-ion crystals in high-optical aperture segmented traps. Our large-scale numerical analysis shows that, with the foreseen technological improvements described here, this platform is a very promising candidate for fault-tolerant quantum computation.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-01
... inclusion of the standard formula). This is because, in developing the minimum franchise fee to be included... acquiring the existing LSI (and any required new LSI improvements). The minimum franchise fee, accordingly...
Scalable parallel communications
NASA Technical Reports Server (NTRS)
Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.
1992-01-01
Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low-cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCPs running in parallel provide high bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism) also with near linear speed-ups.
Compiler-assisted multiple instruction rollback recovery using a read buffer
NASA Technical Reports Server (NTRS)
Alewine, Neal J.; Chen, Shyh-Kwei; Fuchs, W. Kent; Hwu, Wen-Mei W.
1995-01-01
Multiple instruction rollback (MIR) is a technique that has been implemented in mainframe computers to provide rapid recovery from transient processor failures. Hardware-based MIR designs eliminate rollback data hazards by providing data redundancy implemented in hardware. Compiler-based MIR designs have also been developed which remove rollback data hazards directly with data-flow transformations. This paper describes compiler-assisted techniques to achieve multiple instruction rollback recovery. We observe that some data hazards resulting from instruction rollback can be resolved efficiently by providing an operand read buffer while others are resolved more efficiently with compiler transformations. The compiler-assisted scheme presented consists of hardware that is less complex than shadow files, history files, history buffers, or delayed write buffers, while experimental evaluation indicates performance improvement over compiler-based schemes.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-31
... are applicable to fleets comprised of four or more pieces of equipment powered by LSI engines... comment. If you send an email comment directly to EPA without going through http://www.regulations.gov...
Analyzing CMOS/SOS fabrication for LSI arrays
NASA Technical Reports Server (NTRS)
Ipri, A. C.
1978-01-01
Report discusses a set of design rules that have been developed as a result of work with test arrays. A set of optimum dimensions is given that would maximize process output and correspondingly minimize costs in the fabrication of large-scale integration (LSI) arrays.
NASA Technical Reports Server (NTRS)
Clark, R. T.; Mccallister, R. D.
1982-01-01
The particular coding option identified as providing the best level of coding gain performance in an LSI-efficient implementation was the optimal constraint length five, rate one-half convolutional code. To determine the specific set of design parameters which optimally matches this decoder to the LSI constraints, a breadboard MCD (maximum-likelihood convolutional decoder) was fabricated and used to generate detailed performance trade-off data. The extensive performance testing data gathered during this design tradeoff study are summarized, and the functional and physical MCD chip characteristics are presented.
Screening and Expression of a Silicon Transporter Gene (Lsi1) in Wild-Type Indica Rice Cultivars.
Sahebi, Mahbod; Hanafi, Mohamed M; Rafii, M Y; Azizi, Parisa; Abiri, Rambod; Kalhori, Nahid; Atabaki, Narges
2017-01-01
Silicon (Si) is one of the most prevalent elements in the soil. It is beneficial for plant growth and development, and it contributes to plant defense against different stresses. The Lsi1 gene encodes a Si transporter that was identified in a mutant Japonica rice variety. This gene was not identified in fourteen Malaysian rice varieties during screening. Then, a mutant version of Lsi1 was substituted for the native version in the three most common Malaysian rice varieties, MR219, MR220, and MR276, to evaluate the function of the transgene. Real-time PCR was used to explore the differential expression of Lsi1 in the three transgenic rice varieties. Silicon concentrations in the roots and leaves of transgenic plants were significantly higher than in wild-type plants. Transgenic varieties showed significant increases in the activities of the enzymes SOD, POD, APX, and CAT; photosynthesis; and chlorophyll content; however, the highest chlorophyll A and B levels were observed in transgenic MR276. Transgenic varieties have shown a stronger root and leaf structure, as well as hairier roots, compared to the wild-type plants. This suggests that Lsi1 plays a key role in rice, increasing the absorption and accumulation of Si, then alters antioxidant activities, and improves morphological properties.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuester, W.; Seidl, G.; Linkesch, W.
1981-10-01
For the diagnosis of primary osteoporosis, various semiquantitative radiologic methods were compared in 149 unselected patients, aged over 50 years. Crush fracture syndrome (CFS), lumbar spine index (LSI), and Singh Index (SI) were assessed by three radiologists and, after reevaluation, the intra- and interobserver errors were calculated. The reliability of the subjective grading was improved by joint and repeated reading of the radiographs. Additionally, the peripheral trabecular bone content was measured by photon absorption densitometry (PAD). To test the value of the various semiquantitative methods, LSI, SI, and PAD were compared with sex matching before and after separation into age in decades in CFS-positive and CFS-negative patients. In an attempt to differentiate osteoporotics and non-osteoporotics by CFS, our results indicate that CFS-positive and CFS-negative males cannot be separated by LSI, SI, and PAD, whereas in females these methods can discriminate irrespective of the age in decades. However, in age-related groups, only SI can discriminate significantly between CFS-positive and CFS-negative females. Correlation of the semiquantitative methods, regardless of the diagnosis of a CFS, revealed a significant correlation between SI and PAD, but no correlation between LSI and SI, or LSI and PAD, respectively.
Development of an LSI for Tactile Sensor Systems on the Whole-Body of Robots
NASA Astrophysics Data System (ADS)
Muroyama, Masanori; Makihata, Mitsutoshi; Nakano, Yoshihiro; Matsuzaki, Sakae; Yamada, Hitoshi; Yamaguchi, Ui; Nakayama, Takahiro; Nonomura, Yutaka; Fujiyoshi, Motohiro; Tanaka, Shuji; Esashi, Masayoshi
We have developed a network-type tactile sensor system, which realizes high-density tactile sensing on the whole body of nursing and communication robots. The system consists of three kinds of nodes: host, relay, and sensor nodes. The roles of the sensor node are to sense forces, to encode the sensing data, and to transmit the encoded data on serial channels by interrupt handling. The relay nodes and the host deal with a number of the encoded sensing data from the sensor nodes. A sensor node consists of a capacitive MEMS force sensor and a signal processing/transmission LSI. In this paper, details of an LSI for the sensor node are described. We designed experimental sensor node LSI chips in a commercial 0.18 µm standard CMOS process. The 0.18 µm LSIs were supplied at wafer level for MEMS post-processing. The LSI chip area is 2.4 mm × 2.4 mm, which includes logic, CF converter, and memory circuits. The maximum clock frequency of the chip with a large capacitive load is 10 MHz. Measured power consumption at a 10 MHz clock is 2.23 mW. Experimental results indicate that the size, response time, sensor sensitivity, and power consumption are all sufficient for practical tactile sensor systems.
Reconfigurable Hardware for Compressing Hyperspectral Image Data
NASA Technical Reports Server (NTRS)
Aranki, Nazeeh; Namkung, Jeffrey; Villapando, Carlos; Kiely, Aaron; Klimesh, Matthew; Xie, Hua
2010-01-01
High-speed, low-power, reconfigurable electronic hardware has been developed to implement ICER-3D, an algorithm for compressing hyperspectral-image data. The algorithm and parts thereof have been the topics of several NASA Tech Briefs articles, including Context Modeler for Wavelet Compression of Hyperspectral Images (NPO-43239) and ICER-3D Hyperspectral Image Compression Software (NPO-43238), which appear elsewhere in this issue of NASA Tech Briefs. As described in more detail in those articles, the algorithm includes three main subalgorithms: one for computing wavelet transforms, one for context modeling, and one for entropy encoding. For the purpose of designing the hardware, these subalgorithms are treated as modules to be implemented efficiently in field-programmable gate arrays (FPGAs). The design takes advantage of industry-standard, commercially available FPGAs. The implementation targets the Xilinx Virtex-II Pro architecture, which has embedded PowerPC processor cores with flexible on-chip bus architecture. It incorporates an efficient parallel and pipelined architecture to compress the three-dimensional image data. The design provides for internal buffering to minimize intensive input/output operations while making efficient use of off-chip memory. The design is scalable in that the subalgorithms are implemented as independent hardware modules that can be combined in parallel to increase throughput. The on-chip processor manages the overall operation of the compression system, including execution of the top-level control functions as well as scheduling, initiating, and monitoring processes. The design prototype has been demonstrated to be capable of compressing hyperspectral data at a rate of 4.5 megasamples per second at a conservative clock frequency of 50 MHz, with a potential for substantially greater throughput at a higher clock frequency. The power consumption of the prototype is less than 6.5 W. The reconfigurability (by means of reprogramming) of the FPGAs makes it possible to effectively alter the design to some extent to satisfy different requirements without adding hardware. The implementation could be easily propagated to future FPGA generations and/or to custom application-specific integrated circuits.
Accelerated Adaptive MGS Phase Retrieval
NASA Technical Reports Server (NTRS)
Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang
2011-01-01
The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm that allows parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time, performing the matrix calculations on nVidia graphics cards. The graphics processing unit (GPU) is hardware specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model called CUDA is used to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphics cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.
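For readers unfamiliar with the family of algorithms, the classic two-plane Gerchberg-Saxton iteration (the ancestor of MGS) is easy to state: alternately enforce the known amplitudes in the pupil and focal planes while keeping the evolving phase. The numpy sketch below is that textbook iteration, not JPL's MGS or its CUDA implementation; the repeated FFT calls are exactly the operations the GPU version accelerates.

```python
import numpy as np

def gerchberg_saxton(pupil_amp, focal_intensity, iters=100):
    """Textbook GS: recover pupil phase from a measured focal-plane intensity."""
    focal_amp = np.sqrt(focal_intensity)
    phase0 = np.random.uniform(0, 2 * np.pi, pupil_amp.shape)
    field = pupil_amp * np.exp(1j * phase0)
    for _ in range(iters):
        focal = np.fft.fft2(field)
        focal = focal_amp * np.exp(1j * np.angle(focal))  # impose focal-plane modulus
        field = np.fft.ifft2(focal)
        field = pupil_amp * np.exp(1j * np.angle(field))  # impose pupil-plane modulus
    return np.angle(field)  # estimated wavefront phase in the exit pupil
```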
Spacewire on Earth orbiting scatterometers
NASA Technical Reports Server (NTRS)
Bachmann, Alex; Lang, Minh; Lux, James; Steffke, Richard
2002-01-01
The need for a high-speed, reliable, and easy-to-implement communication link has led to the development of a space-flight-oriented version of IEEE 1355 called SpaceWire. SpaceWire is based on high-speed (200 Mbps) serial point-to-point links using Low Voltage Differential Signaling (LVDS). SpaceWire has provisions for routing messages between a large network of processors, using wormhole routing for low overhead and latency. Additionally, space-qualified hybrids are available which provide the link layer to the user's bus. A test bed of multiple digital signal processor breadboards, demonstrating the ability to meet signal processing requirements for an orbiting scatterometer, has been implemented using three Astrium MCM-DSPs; each breadboard consists of a Multi Chip Module (MCM) that combines a space-qualified digital signal processor and peripherals, including IEEE 1355 links. With the addition of appropriate physical-layer interfaces and software on the DSP, the SpaceWire link is used to communicate between processors on the test bed, e.g., sending timing references, commands, status, and science data among the processors. Results are presented on development issues surrounding the use of SpaceWire in this environment, from physical-layer implementation (cables, connectors, LVDS drivers) to diagnostic tools, driver firmware, and development methodology. The tools, methods, hardware and software challenges, and preliminary performance are investigated and discussed.
Portable parallel stochastic optimization for the design of aeropropulsion components
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Rhodes, G. S.
1994-01-01
This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as a review of portable, parallel programming environments. The second effort was to implement the MSO methodology for an example problem using the portable parallel programming environment Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be applied effectively to large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed Civil Transport and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.
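The quoted efficiency figures follow directly from the definition efficiency = speedup / processors, as this back-of-envelope check shows:

```python
# 75% efficiency on 31 processors implies a ~23x speedup;
# 60% on 50 processors implies ~30x.
for eff, n in [(0.75, 31), (0.60, 50)]:
    print(f"{n} processors at {eff:.0%} efficiency -> speedup ~{eff * n:.0f}x")
```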
Definition and fabrication of an airborne scatterometer radar signal processor
NASA Technical Reports Server (NTRS)
1976-01-01
A hardware/software system which incorporates a microprocessor design and software for the calculation of normalized radar cross section in real time was developed. Interface is provided to decommutate the NASA ADAS data stream for aircraft parameters used in processing and to provide output in the form of strip chart and pcm compatible data recording.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-03
... for audit information on a Crab EDR. Based on experience in these EDR programs, in the final rule... hardware, software, or Internet is restored, the User must enter this same information into the electronic... Fisheries Act catcher vessels, catcher/processor, and mothership sectors as well as representatives for the...
Klijn, Eva; Hulscher, Hester C; Balvers, Rutger K; Holland, Wim P J; Bakker, Jan; Vincent, Arnaud J P E; Dirven, Clemens M F; Ince, Can
2013-02-01
The goal of awake neurosurgery is to maximize resection of brain lesions with minimal injury to functional brain areas. Laser speckle imaging (LSI) is a noninvasive macroscopic technique with high spatial and temporal resolution used to monitor changes in capillary perfusion. In this study, the authors hypothesized that LSI can be useful as a noncontact method of functional brain mapping during awake craniotomy for tumor removal. Such a modality would be an advance in this type of neurosurgery since current practice involves the application of invasive intraoperative single-point electrocortical (electrode) stimulation and measurements. After opening the dura mater, patients were woken up, and LSI was set up to image the exposed brain area. Patients were instructed to follow a rest-activation-rest protocol in which activation consisted of the hand-clenching motor task. Subsequently, exposed brain areas were mapped for functional motor areas by using standard electrocortical stimulation (ECS). Changes in the LSI signal were analyzed offline and compared with the results of ECS. In functional motor areas of the hand mapped with ECS, cortical blood flow measured using LSI significantly increased from 2052 ± 818 AU to 2471 ± 675 AU during hand clenching, whereas capillary blood flow did not change in the control regions (areas mapped using ECS with no functional activity). The main finding of this study was that changes in laser speckle perfusion as a measure of cortical microvascular blood flow when performing a motor task with the hand relate well to the ECS map. The authors have shown the feasibility of using LSI for direct visualization of cortical microcirculatory blood flow changes during neurosurgery.
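The study does not spell out its image-processing pipeline, but LSI perfusion maps are conventionally derived from the spatial speckle contrast K = σ/⟨I⟩ computed over a small sliding window, with flow often summarized as 1/K². The numpy/scipy sketch below shows that generic computation under those standard assumptions; the window size and the 1/K² flow model are common defaults, not values taken from the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def speckle_contrast(raw, window=7):
    """Spatial speckle contrast K = std/mean over a sliding window."""
    img = raw.astype(float)
    mean = uniform_filter(img, window)
    mean_sq = uniform_filter(img**2, window)
    var = np.clip(mean_sq - mean**2, 0.0, None)
    return np.sqrt(var) / np.maximum(mean, 1e-9)

def flow_index(raw, window=7):
    """Common LSI flow surrogate: 1/K^2 rises with faster blood flow."""
    k = np.maximum(speckle_contrast(raw, window), 1e-3)
    return 1.0 / k**2
```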
Hao, Yi-Qi; Zhao, Xin-Feng; She, Deng-Ying; Xu, Bing; Zhang, Da-Yong; Liao, Wan-Jin
2012-01-01
Reduced seed yields following self-pollination have repeatedly been observed, but the underlying mechanisms remain elusive when self-pollen tubes can readily grow into ovaries, because pre-, post-zygotic late-acting self-incompatibility (LSI), or early-acting inbreeding depression (ID) can induce self-sterility. The main objective of this study was to differentiate these processes in Aconitum kusnezoffii, a plant lacking stigmatic or stylar inhibition of self-pollination. We performed a hand-pollination experiment in a natural population of A. kusnezoffii, compared seed set among five pollination treatments, and evaluated the distribution of seed size and seed set. Embryonic development suggested fertilization following self-pollination. A partial pre-zygotic LSI was suggested to account for the reduced seed set by two lines of evidence. The seed set of chase-pollination treatment significantly exceeded that of self-pollination treatment, and the proportion of unfertilized ovules was the highest following self-pollination. Meanwhile, early-acting ID, rather than post-zygotic LSI, was suggested by the findings that the size of aborted selfed seeds varied continuously and widely; and the selfed seed set both exhibited a continuous distribution and positively correlated with the crossed seed set. These results indicated that the embryos were aborted at different stages due to the expression of many deleterious alleles throughout the genome during seed maturation. No signature of post-zygotic LSI was found. Both partial pre-zygotic LSI and early-acting ID contribute to the reduction in selfed seed set in A. kusnezoffii, with pre-zygotic LSI rejecting part of the self-pollen and early-acting ID aborting part of the self-fertilized seeds. PMID:23056570
Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C
1993-05-10
An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor. Although this provides much-needed capability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
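Run-length encoding, the first stage of the storage format described, exploits the long constant runs typical of e-beam pattern data. The sketch below is a minimal byte-oriented encoder for illustration only; the actual Cambridge pattern-data format and the LZW second stage are not reproduced.

```python
def rle_encode(data):
    """Encode a bytes object as (run length, value) pairs, capping runs at 255."""
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        out.append((j - i, data[i]))
        i = j
    return out

print(rle_encode(b"\x00\x00\x00\xff\xff\x00"))  # [(3, 0), (2, 255), (1, 0)]
```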
Close to real life. [solving for transonic flow about lifting airfoils using supercomputers
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Bailey, F. Ron
1988-01-01
NASA's Numerical Aerodynamic Simulation (NAS) facility for CFD modeling of highly complex aerodynamic flows employs as its basic hardware two Cray-2s, an ETA-10 Model Q, an Amdahl 5880 mainframe computer that furnishes both support processing and access to 300 Gbytes of disk storage, several minicomputers and superminicomputers, and a Thinking Machines 16,000-device 'connection machine' processor. NAS, which was the first supercomputer facility to standardize operating-system and communication software on all processors, has performed important Space Shuttle aerodynamics simulations and will be critical to the configurational refinement of the National Aerospace Plane and its integrated powerplant, which will involve complex, high-temperature reactive gasdynamic computations.
GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis
NASA Technical Reports Server (NTRS)
Brent, G. A.
1978-01-01
A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed in-flight information for the pilot. A processor architecture called the Team Architecture was proposed; this is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/I simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs, and results are presented at length. Also included are program input formats, outputs, and listings.
Experience with custom processors in space flight applications
NASA Technical Reports Server (NTRS)
Fraeman, M. E.; Hayes, J. R.; Lohr, D. A.; Ballard, B. W.; Williams, R. L.; Henshaw, R. M.
1991-01-01
The Applied Physics Laboratory (APL) has developed a magnetometer instrument for a Swedish satellite named Freja, with launch scheduled for August 1992 on a Chinese Long March rocket. The magnetometer controller utilized a custom microprocessor designed at APL with the Genesil silicon compiler. The processor evolved from our experience with an older bit-slice design and two prior single-chip efforts. The architecture of our microprocessor greatly lowered software development costs because it was optimized to provide an interactive and extensible programming environment hosted by the target hardware. Radiation tolerance of the microprocessor was also tested and was adequate for Freja's mission: 20 krad(Si) total dose and very infrequent latch-up and single-event upsets.
Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Gaeke, Brian R.; Husbands, Parry; Li, Xiaoye S.; Oliker, Leonid; Yelick, Katherine A.; Biegel, Bryan (Technical Monitor)
2002-01-01
The increasing gap between processor and memory performance has led to new architectural models for memory-intensive applications. In this paper, we explore the performance of a set of memory-intensive benchmarks and use them to compare the performance of conventional cache-based microprocessors to a mixed logic and DRAM processor called VIRAM. The benchmarks are based on problem statements, rather than specific implementations, and in each case we explore the fundamental hardware requirements of the problem, as well as alternative algorithms and data structures that can help expose fine-grained parallelism or simplify memory access patterns. The benchmarks are characterized by their memory access patterns, their basic control structures, and the ratio of computation to memory operation.
ERIC Educational Resources Information Center
Alexander, George
1984-01-01
Discusses small-scale integrated (SSI), medium-scale integrated (MSI), large-scale integrated (LSI), very large-scale integrated (VLSI), and ultra large-scale integrated (ULSI) chips. The development and properties of these chips, uses of gallium arsenide, Josephson devices (two superconducting strips sandwiching a thin insulator), and future…
VizieR Online Data Catalog: Hα observations of LSI+61 303 (Zamanov+, 2013)
NASA Astrophysics Data System (ADS)
Zamanov, R.; Stoyanov, K.; Marti, J.; Tomov, N. A.; Belcheva, G.; Luque-Escamilla, P. L.; Latev, G.
2013-09-01
Optical spectroscopic observations of the Hα emission line (137 spectra obtained during the period September 1998 - January 2013) are presented for the radio- and gamma-ray-emitting Be/X-ray binary LSI+61 303. (2 data files).
Hardware/Software Issues for Video Guidance Systems: The Coreco Frame Grabber
NASA Technical Reports Server (NTRS)
Bales, John W.
1996-01-01
The F64 frame grabber is a high-performance video image acquisition and processing board utilizing the TMS320C40 and TMS34020 processors. The hardware is designed for the 16-bit ISA bus and supports multiple digital or analog cameras. It has an acquisition rate of 40 million pixels per second, with a variable sampling frequency of 510 kHz to 40 MHz. The board has a 4 MB frame buffer memory expandable to 32 MB, and has a simultaneous acquisition and processing capability. It supports both VGA and RGB displays, and accepts all analog and digital video input standards.
Reducing adaptive optics latency using Xeon Phi many-core processors
NASA Astrophysics Data System (ADS)
Barr, David; Basden, Alastair; Dipper, Nigel; Schwartz, Noah
2015-11-01
The next generation of Extremely Large Telescopes (ELTs) for astronomy will rely heavily on the performance of their adaptive optics (AO) systems. Real-time control is at the heart of the critical technologies that will enable telescopes to deliver the best possible science and will require a very significant extrapolation from current AO hardware existing for 4-10 m telescopes. Investigating novel real-time computing architectures and testing their eligibility against anticipated challenges is one of the main priorities of technology development for the ELTs. This paper investigates the suitability of the Intel Xeon Phi, which is a commercial off-the-shelf hardware accelerator. We focus on wavefront reconstruction performance, implementing a straightforward matrix-vector multiplication (MVM) algorithm. We present benchmarking results of the Xeon Phi on a real-time Linux platform, both as a standalone processor and integrated into an existing real-time controller (RTC). Performance of single and multiple Xeon Phis are investigated. We show that this technology has the potential of greatly reducing the mean latency and variations in execution time (jitter) of large AO systems. We present both a detailed performance analysis of the Xeon Phi for a typical E-ELT first-light instrument along with a more general approach that enables us to extend to any AO system size. We show that systematic and detailed performance analysis is an essential part of testing novel real-time control hardware to guarantee optimal science results.
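The MVM reconstruction step being benchmarked is a single matrix-vector product per frame: actuator commands a = R s, where R is the control matrix and s the wavefront-sensor slope vector. The timing sketch below mirrors that kernel in numpy; the dimensions are illustrative stand-ins rather than a real E-ELT configuration, and a production RTC would pin memory and threads rather than rely on numpy.

```python
import time
import numpy as np

n_slopes, n_actuators = 5000, 2500                            # assumed AO system size
R = np.random.rand(n_actuators, n_slopes).astype(np.float32)  # control matrix
s = np.random.rand(n_slopes).astype(np.float32)               # WFS slope vector

latencies = []
for _ in range(100):
    t0 = time.perf_counter()
    a = R @ s                                                 # actuator command vector
    latencies.append(time.perf_counter() - t0)

print(f"mean latency {np.mean(latencies)*1e3:.3f} ms, "
      f"jitter (std) {np.std(latencies)*1e3:.3f} ms")
```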
Hokuto, Toshiki; Yasukawa, Tomoyuki; Kunikata, Ryota; Suda, Atsushi; Inoue, Kumi Y; Ino, Kosuke; Matsue, Tomokazu; Mizutani, Fumio
2016-06-01
Electrochemical imaging is an excellent technique to characterize the activity of biomaterials, such as enzymes and cells. A large-scale-integration-based amperometric sensor (Bio-LSI) has been developed for the simultaneous and continuous detection of the concentration distribution of redox species generated by reactions of biomolecules. In this study, the Bio-LSI system was demonstrated to be applicable for simultaneous detection of different analytes in multiple specimens. The multiple specimens containing human immunoglobulin G (hIgG) and mouse IgG (mIgG) were introduced into each channel of the upper substrate across the antibody lines for hIgG and mIgG on the lower substrate. Hydrogen peroxide generated by the enzyme reaction of glucose oxidase captured at the intersections was simultaneously detected by the 400 microelectrodes of the Bio-LSI chip. The oxidation current increased with increasing concentrations of hIgG, which could be detected in the range of 0.01-1.0 µg mL⁻¹. Simultaneous detection of hIgG and mIgG in multiple specimens was achieved by using line patterns of both antibodies. Therefore, the presence of different target molecules in multiple samples can be quantitatively and simultaneously visualized as a current image by the Bio-LSI system.
Burmeister, David M.; Ponticorvo, Adrien; Yang, Bruce; Becerra, Sandra C.; Choi, Bernard; Durkin, Anthony J.; Christy, Robert J.
2015-01-01
Surgical intervention of second-degree burns is often delayed because of the difficulty of visual diagnosis, which increases the risk of scarring and infection. Non-invasive metrics have shown promise in accurately assessing burn depth. Here, we examine the use of spatial frequency domain imaging (SFDI) and laser speckle imaging (LSI) for predicting burn depth. Contact burn wounds of increasing severity were created on the dorsum of a Yorkshire pig, and wounds were imaged with SFDI/LSI starting immediately after burn and then daily for the next 4 days. In addition, on each day the burn wounds were biopsied for histological analysis of burn depth, defined by collagen coagulation, apoptosis, and adnexal/vascular necrosis. Histological results show that collagen coagulation progressed from day 0 to day 1, and then stabilized. Results of burn wound imaging using non-invasive techniques were able to produce metrics that correlate to different predictors of burn depth. Collagen coagulation and apoptosis correlated with the SFDI scattering coefficient parameter (μs′), and adnexal/vascular necrosis on the day of burn correlated with blood flow determined by LSI. Therefore, incorporation of the SFDI scattering coefficient and blood flow determined by LSI may provide an algorithm for accurate assessment of the severity of burn wounds in real time. PMID:26138371
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
Accelerating artificial intelligence with reconfigurable computing
NASA Astrophysics Data System (ADS)
Cieszewski, Radoslaw
Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Artificial intelligence is one such field, offering many different algorithms that can be accelerated. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms, and Expert Systems.
Inexact hardware for modelling weather & climate
NASA Astrophysics Data System (ADS)
Düben, Peter D.; McNamara, Hugh; Palmer, Tim
2014-05-01
The use of stochastic processing hardware and low-precision arithmetic in atmospheric models is investigated. Stochastic processors allow hardware-induced faults in calculations, sacrificing exact calculations in exchange for improvements in performance, potentially accuracy, and a reduction in power consumption. A similar trade-off is achieved using low-precision arithmetic, with improvements in computation and communication speed and savings in storage and memory requirements. As high-performance computing becomes more massively parallel and power-intensive, these two approaches may be important stepping stones in the pursuit of global cloud-resolving atmospheric modelling. The impact of both hardware-induced faults and low-precision arithmetic is tested in the dynamical core of a global atmosphere model. Our simulations show that both approaches to inexact calculations do not substantially affect the quality of the model simulations, provided they are restricted to act only on smaller scales. This suggests that inexact calculations at the small scale could reduce computation and power costs without adversely affecting the quality of the simulations.
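A toy numpy experiment conveys the low-precision trade-off being tested: run the same time-stepping loop in float64 and float16 and compare the drift. This is purely illustrative; the paper's experiments were performed in the dynamical core of a global atmosphere model, not on this oscillator.

```python
import numpy as np

def integrate(dtype, steps=1000, dt=1e-2):
    """Semi-implicit Euler steps of a harmonic oscillator at a given precision."""
    x, v = dtype(1.0), dtype(0.0)
    for _ in range(steps):
        v = dtype(v - dt * x)
        x = dtype(x + dt * v)
    return float(x)

ref, low = integrate(np.float64), integrate(np.float16)
print(f"float64: {ref:.6f}  float16: {low:.6f}  abs error: {abs(ref - low):.2e}")
```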
Reconfigurable Hardware Adapts to Changing Mission Demands
NASA Technical Reports Server (NTRS)
2003-01-01
A new class of computing architectures and processing systems, which use reconfigurable hardware, is creating a revolutionary approach to implementing future spacecraft systems. With the increasing complexity of electronic components, engineers must design next-generation spacecraft systems with new technologies in both hardware and software. Derivation Systems, Inc., of Carlsbad, California, has been working through NASA's Small Business Innovation Research (SBIR) program to develop key technologies in reconfigurable computing and Intellectual Property (IP) soft cores. Founded in 1993, Derivation Systems has received several SBIR contracts from NASA's Langley Research Center and the U.S. Department of Defense Air Force Research Laboratories in support of its mission to develop hardware and software for high-assurance systems. Through these contracts, Derivation Systems began developing leading-edge technology in formal verification, embedded Java, and reconfigurable computing for its PF3100, Derivational Reasoning System (DRS), FormalCORE IP, FormalCORE PCI/32, FormalCORE DES, and LavaCORE Configurable Java Processor, which are designed for greater flexibility and security on all space missions.
First Results from a Hardware-in-the-Loop Demonstration of Closed-Loop Autonomous Formation Flying
NASA Technical Reports Server (NTRS)
Gill, E.; Naasz, Bo; Ebinuma, T.
2003-01-01
A closed-loop system for the demonstration of autonomous satellite formation flying technologies using hardware-in-the-loop has been developed. Making use of a GPS signal simulator with a dual radio frequency outlet, the system includes two GPS space receivers as well as a powerful onboard navigation processor dedicated to the GPS-based guidance, navigation, and control of a satellite formation in real-time. The closed-loop system allows realistic simulations of autonomous formation flying scenarios, enabling research in the fields of tracking and orbit control strategies for a wide range of applications. The autonomous closed-loop formation acquisition and keeping strategy is based on Lyapunov's direct control method as applied to the standard set of Keplerian elements. This approach not only assures global and asymptotic stability of the control but also maintains valuable physical insight into the applied control vectors. Furthermore, the approach can account for system uncertainties and effectively avoids a computationally expensive solution of the two point boundary problem, which renders the concept particularly attractive for implementation in onboard processors. A guidance law has been developed which strictly separates the relative from the absolute motion, thus avoiding the numerical integration of a target trajectory in the onboard processor. Moreover, upon using precise kinematic relative GPS solutions, a dynamical modeling or filtering is avoided which provides for an efficient implementation of the process on an onboard processor. A sample formation flying scenario has been created aiming at the autonomous transition of a Low Earth Orbit satellite formation from an initial along-track separation of 800 m to a target distance of 100 m. Assuming a low-thrust actuator which may be accommodated on a small satellite, a typical control accuracy of less than 5 m has been achieved which proves the applicability of autonomous formation flying techniques to formations of satellites as close as 50 m.
Ithurburn, Matthew P; Paterno, Mark V; Ford, Kevin R; Hewett, Timothy E; Schmitt, Laura C
2017-09-01
Previous work shows that young athletes after anterior cruciate ligament reconstruction (ACLR) demonstrate single-leg (SL) landing movement asymmetries at the time of return to sport (RTS); however, the effect of movement asymmetries on longitudinal knee-related function after ACLR has not been examined. The purpose of this study was to examine the effect of SL drop-landing movement symmetry at the time of RTS on knee-related function 2 years later in young athletes after ACLR. The first hypothesis was that young athletes who demonstrated SL drop-landing asymmetries at RTS would demonstrate decreased knee function 2 years later compared with those who demonstrated symmetric SL drop-landing mechanics. The second hypothesis was that SL drop-landing movement symmetry at RTS would be associated with knee functional recovery 2 years later. The study design was a cohort study (level of evidence, 2). This study included 48 young athletes who had undergone ACLR and were assessed at the time of RTS (77% female; mean [±SD] age at RTS, 17.6 ± 2.6 years) and followed for 2 years after RTS. Three sagittal-plane landing variables of interest were calculated using 3-dimensional motion analysis during an SL drop-landing task at the time of RTS: knee flexion excursion, peak internal knee extension moment, and peak trunk flexion. The limb symmetry index (LSI) was calculated for each landing variable as LSI = (involved/uninvolved) × 100%. The LSI was used to divide the cohort into symmetric (SYM) and asymmetric (ASYM) groups for each landing variable: knee flexion excursion (SYM: LSI ≥ 90% [n = 23]; ASYM: LSI < 85% [n = 18]), peak internal knee extension moment (SYM: LSI ≥ 90% [n = 19]; ASYM: LSI < 85% [n = 22]), and peak trunk flexion (SYM: LSI ≤ 105% [n = 25]; ASYM: LSI > 115% [n = 19]). At 2 years after RTS, knee-related function was evaluated using the Knee Injury and Osteoarthritis Outcome Score (KOOS), the International Knee Documentation Committee (IKDC) subjective knee form, and performance on SL hop tests. Functional recovery was defined based on literature cutoffs for knee-related functional measures. Differences in 2-year function were compared between the symmetry groups using Mann-Whitney U tests because of nonnormality. Logistic regression was used to determine whether landing symmetry at the time of RTS was associated with 2-year knee functional recovery after RTS. The ASYM knee flexion excursion group demonstrated decreased function at 2 years after RTS compared with the SYM group on the KOOS-Pain (ASYM: 93.0 ± 8.2; SYM: 98.4 ± 3.0; P = .008) and the KOOS-Quality of Life (ASYM: 81.6 ± 16.1; SYM: 94.1 ± 9.7; P = .008). Knee flexion excursion symmetry was associated with knee functional recovery on the KOOS-Pain and the KOOS-Quality of Life (P = .033 and P = .012, respectively) at 2 years after RTS, after controlling for the quadriceps strength LSI and graft type. Young athletes after ACLR with asymmetries in knee kinematics at the time of RTS reported decreased self-reported function 2 years later; however, the clinical importance of these differences needs to be further understood.
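The LSI computation and grouping above reduce to a few lines of code. The sketch below uses the thresholds quoted for knee flexion excursion and peak knee extension moment; note that peak trunk flexion used inverted cutoffs (SYM: LSI ≤ 105%, ASYM: LSI > 115%), which this fragment does not encode.

```python
def limb_symmetry_index(involved, uninvolved):
    """LSI = (involved / uninvolved) x 100%."""
    return involved / uninvolved * 100.0

def classify(lsi, sym_cut=90.0, asym_cut=85.0):
    if lsi >= sym_cut:
        return "SYM"
    if lsi < asym_cut:
        return "ASYM"
    return "unclassified"   # the 85-90% band fell into neither group

print(classify(limb_symmetry_index(42.0, 50.0)))  # LSI = 84% -> ASYM
```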
Electro-optic voltage sensor with beam splitting
Woods, Gregory K.; Renak, Todd W.; Davidson, James R.; Crawford, Thomas M.
2002-01-01
The invention is a miniature electro-optic voltage sensor system capable of accurate operation at high voltages without use of the dedicated voltage dividing hardware typically found in the prior art. The invention achieves voltage measurement without significant error contributions from neighboring conductors or environmental perturbations. The invention employs a transmitter, a sensor, a detector, and a signal processor. The transmitter produces a beam of electromagnetic radiation which is routed into the sensor. Within the sensor the beam undergoes the Pockels electro-optic effect. The electro-optic effect produces a modulation of the beam's polarization, which is in turn converted to a pair of independent conversely-amplitude-modulated signals, from which the voltage of the E-field is determined by the signal processor. The use of converse AM signals enables the signal processor to better distinguish signal from noise. The sensor converts the beam by splitting the beam in accordance with the axes of the beam's polarization state (an ellipse) into at least two AM signals. These AM signals are fed into a signal processor and processed to determine the voltage between a ground conductor and the conductor on which voltage is being measured.
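A standard way to exploit a pair of conversely amplitude-modulated channels, consistent with the description above though not taken from the patent itself, is the normalized difference: the two channels modulate in opposite directions, so their difference carries the signal while their sum cancels common-mode intensity noise.

```python
import numpy as np

def demodulate(chan_a, chan_b):
    """Normalized difference of conversely amplitude-modulated channels."""
    a = np.asarray(chan_a, dtype=float)
    b = np.asarray(chan_b, dtype=float)
    return (a - b) / np.maximum(a + b, 1e-12)  # dimensionless modulation depth
```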
Broadband set-top box using MAP-CA processor
NASA Astrophysics Data System (ADS)
Bush, John E.; Lee, Woobin; Basoglu, Chris
2001-12-01
Advances in broadband access are expected to exert a profound impact on our everyday life. It will be the key to the digital convergence of communication, computer, and consumer equipment. A common thread that facilitates this convergence comprises digital media and the Internet. To address this market, Equator Technologies, Inc., is developing the Dolphin broadband set-top box reference platform using its MAP-CA Broadband Signal Processor chip. The Dolphin reference platform is a universal media platform for display and presentation of digital contents on end-user entertainment systems. The objective of the Dolphin reference platform is to provide a complete set-top box system based on the MAP-CA processor. It includes all the necessary hardware and software components for the emerging broadcast and broadband digital media market based on IP protocols. Such a reference design requires broadband Internet access and high-performance digital signal processing. By using the MAP-CA processor, the Dolphin reference platform is completely programmable, allowing various codecs to be implemented in software, such as MPEG-2, MPEG-4, H.263, and proprietary codecs. The software implementation also enables field upgrades to keep pace with evolving technology and industry demands.
NASA Technical Reports Server (NTRS)
Fura, David A.; Windley, Phillip J.; Cohen, Gerald C.
1993-01-01
This technical report contains the HOL listings of the specification of the design and major portions of the requirements for a commercially developed processor interface unit (or PIU). The PIU is an interface chip performing memory interface, bus interface, and additional support services for a commercial microprocessor within a fault-tolerant computer system. This system, the Fault-Tolerant Embedded Processor (FTEP), is targeted towards applications in avionics and space requiring extremely high levels of mission reliability, extended maintenance-free operation, or both. This report contains the actual HOL listings of the PIU specification as it currently exists. Section two of this report contains general-purpose HOL theories that support the PIU specification. These theories include definitions for the hardware components used in the PIU, our implementation of bit words, and our implementation of temporal logic. Section three contains the HOL listings for the PIU design specification. Aside from the PIU internal bus (I-Bus), this specification is complete. Section four contains the HOL listings for a major portion of the PIU requirements specification. Specifically, it contains most of the definition for the PIU behavior associated with memory accesses initiated by the local processor.
Parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1975-01-01
The letter discusses the parallel and pipeline organization of fast-unitary-transform algorithms such as the fast Fourier transform, and points out the efficiency of a combined parallel-pipeline processor of a transform such as the Haar transform, in which 2^n - 1 hardware 'butterflies' generate a transform of order 2^n every computation cycle.
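As a concrete illustration of the butterfly count, here is a hedged Python sketch (not from the letter) of the unnormalized fast Haar transform; each sum/difference pair is one butterfly, and a length-2^n input consumes 2^n - 1 of them in total.

```python
def haar_transform(x):
    """In-place fast Haar transform; returns (coefficients, butterfly count)."""
    x = list(map(float, x))
    n = len(x)
    assert n & (n - 1) == 0, "length must be a power of two"
    out = x[:]
    butterflies = 0
    while n > 1:
        sums, diffs = [], []
        for i in range(0, n, 2):        # each iteration is one butterfly
            sums.append((out[i] + out[i + 1]) / 2.0)
            diffs.append((out[i] - out[i + 1]) / 2.0)
            butterflies += 1
        out[: n // 2] = sums            # averages feed the next stage
        out[n // 2 : n] = diffs         # details stay in place
        n //= 2
    return out, butterflies

coeffs, count = haar_transform([4, 6, 10, 12, 8, 6, 5, 5])
print(count)  # 7 == 2**3 - 1 butterflies for a length-8 input
```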
Analog hardware for delta-backpropagation neural networks
NASA Technical Reports Server (NTRS)
Eberhardt, Silvio P. (Inventor)
1992-01-01
This is a fully parallel analog backpropagation learning processor. It comprises a plurality of programmable resistive memory elements serving as synapse connections, whose values can be weighted during learning, together with buffer amplifiers, summing circuits, and sample-and-hold circuits arranged in a plurality of neuron layers in accordance with delta-backpropagation algorithms modified to control weight changes due to circuit drift.
Fault-Tolerant Computing: An Overview
1991-06-01
Only fragments of the scanned text are recoverable: references to an Addison-Wesley text (Reading, MA, 1984) and to J. Wakerly, Error Detecting Codes, Self-Checking Circuits and Applications (Elsevier North-Holland, New York); a discussion of time-redundancy techniques applicable to bit-sliced organizations of hardware, in which the normal computation is performed on the operands in the first time step; and material on error detection and fault tolerance in parallel processor systems performing specific computation-intensive applications [11].
Is It Time for a US Cyber Force?
2015-02-17
Only fragments of the text are recoverable: cyberspace described as the network of information technology (IT) and resident data, including the Internet, telecommunications networks, computer systems, and embedded processors and controllers; JP 3-12's explanation of cyberspace in terms of three layers (physical network, logical network, and cyber-persona); and an account of zero-day vulnerabilities against Microsoft operating system code that used trusted hardware vendor certificates to cloak their presence.
Robust media processing on programmable power-constrained systems
NASA Astrophysics Data System (ADS)
McVeigh, Jeff
2005-03-01
To achieve consumer-level quality, media systems must process continuous streams of audio and video data while maintaining exacting tolerances on sampling rate, jitter, synchronization, and latency. While it is relatively straightforward to design fixed-function hardware implementations to satisfy worst-case conditions, there is a growing trend to utilize programmable multi-tasking solutions for media applications. The flexibility of these systems enables support for multiple current and future media formats, which can reduce design costs and time-to-market. This paper provides practical engineering solutions to achieve robust media processing on such systems, with specific attention given to power-constrained platforms. The techniques covered in this article utilize the fundamental concepts of algorithm and software optimization, software/hardware partitioning, stream buffering, hierarchical prioritization, and system resource and power management. A novel enhancement to dynamically adjust processor voltage and frequency based on buffer fullness to reduce system power consumption is examined in detail. The application of these techniques is provided in a case study of a portable video player implementation based on a general-purpose processor running a non real-time operating system that achieves robust playback of synchronized H.264 video and MP3 audio from local storage and streaming over 802.11.
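The buffer-fullness-driven voltage/frequency adjustment lends itself to a compact illustration. The sketch below is a hypothetical Python rendering of that control idea; the thresholds and the four-step frequency table are invented for illustration and are not values from the paper.

```python
# Illustrative frequency ladder (MHz); a real platform would pair each
# step with a corresponding voltage level.
FREQ_STEPS_MHZ = [200, 400, 600, 800]

def select_frequency(buffer_fill: float) -> int:
    """Pick the lowest processor frequency that keeps the output buffer
    from draining: a fuller buffer means decoding is ahead of playback,
    so the clock (and voltage) can be lowered to save power."""
    if buffer_fill > 0.75:
        return FREQ_STEPS_MHZ[0]
    if buffer_fill > 0.50:
        return FREQ_STEPS_MHZ[1]
    if buffer_fill > 0.25:
        return FREQ_STEPS_MHZ[2]
    return FREQ_STEPS_MHZ[3]   # nearly empty: decode at full speed

print(select_frequency(0.9))   # 200 MHz: plenty of margin, save power
print(select_frequency(0.1))   # 800 MHz: catch up before underflow
```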
Architecture for Control of the K9 Rover
NASA Technical Reports Server (NTRS)
Bresina, John L.; Bualat, Maria; Fair, Michael; Wright, Anne; Washington, Richard
2006-01-01
Software featuring a multilevel architecture is used to control the hardware on the K9 Rover, which is a mobile robot used in research on robots for scientific exploration and autonomous operation in general. The software consists of five types of modules: Device Drivers - These modules, at the lowest level of the architecture, directly control motors, cameras, data buses, and other hardware devices. Resource Managers - Each of these modules controls several device drivers. Resource managers can be commanded by either a remote operator or the pilot or conditional-executive modules described below. Behaviors and Data Processors - These modules perform computations for such functions as planning paths, avoiding obstacles, visual tracking, and stereoscopy. These modules can be commanded only by the pilot. Pilot - The pilot receives a possibly complex command from the remote operator or the conditional executive, then decomposes the command into (1) more-specific commands to the resource managers and (2) requests for information from the behaviors and data processors. Conditional Executive - This highest-level module interprets a command plan sent by the remote operator, determines whether resources required for execution of the plan are available, monitors execution, and, if necessary, selects an alternate branch of the plan.
Massively Multithreaded Maxflow for Image Segmentation on the Cray XMT-2
Bokhari, Shahid H.; Çatalyürek, Ümit V.; Gurcan, Metin N.
2014-01-01
Image segmentation is a very important step in the computerized analysis of digital images. The maxflow-mincut approach has been successfully used to obtain minimum-energy segmentations of images in many fields. Classical algorithms for maxflow in networks do not directly lend themselves to efficient parallel implementations on contemporary parallel processors. We present the results of an implementation of the Goldberg-Tarjan preflow-push algorithm on the Cray XMT-2 massively multithreaded supercomputer. This machine has hardware support for 128 threads in each physical processor, a uniformly accessible shared memory of up to 4 TB, and hardware synchronization for each 64-bit word. It is thus well suited to the parallelization of graph-theoretic algorithms, such as preflow-push. We describe the implementation of the preflow-push code on the XMT-2 and present the results of timing experiments on a series of synthetically generated as well as real images. Our results indicate very good performance on large images and pave the way for practical applications of this machine architecture for image analysis in a production setting. The largest images we have run are 32,000 × 32,000 pixels in size, well beyond the largest previously reported in the literature.
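For readers unfamiliar with preflow-push, a minimal serial sketch follows. It is a textbook Goldberg-Tarjan formulation in Python; the paper's XMT-2 version is massively multithreaded and differs substantially, and the graph and node names below are illustrative.

```python
from collections import defaultdict

def preflow_push(cap, s, t):
    """Textbook Goldberg-Tarjan preflow-push; cap[u][v] is edge capacity."""
    nodes = set(cap) | {v for u in cap for v in cap[u]}
    flow = defaultdict(lambda: defaultdict(int))
    excess = defaultdict(int)
    height = {v: 0 for v in nodes}
    height[s] = len(nodes)                  # source starts at height |V|
    for v, c in cap.get(s, {}).items():     # saturate all source edges
        flow[s][v], flow[v][s] = c, -c
        excess[v] += c
    residual = lambda u, v: cap.get(u, {}).get(v, 0) - flow[u][v]
    active = [v for v in nodes if v not in (s, t) and excess[v] > 0]
    while active:
        u = active.pop()
        for v in nodes:                     # push along admissible edges
            if excess[u] == 0:
                break
            if residual(u, v) > 0 and height[u] == height[v] + 1:
                d = min(excess[u], residual(u, v))
                flow[u][v] += d
                flow[v][u] -= d
                excess[u] -= d
                excess[v] += d
                if v not in (s, t) and v not in active:
                    active.append(v)
        if excess[u] > 0:                   # no admissible edge left: relabel
            height[u] = 1 + min(height[v] for v in nodes if residual(u, v) > 0)
            active.append(u)
    return sum(flow[s][v] for v in nodes)

cap = {'s': {'a': 3, 'b': 2}, 'a': {'t': 2, 'b': 1}, 'b': {'t': 3}}
print(preflow_push(cap, 's', 't'))  # 5
```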
Fast multi-core based multimodal registration of 2D cross-sections and 3D datasets.
Scharfe, Michael; Pielot, Rainer; Schreiber, Falk
2010-01-11
Solving bioinformatics tasks often requires extensive computational power. Recent trends in processor architecture combine multiple cores into a single chip to improve overall performance. The Cell Broadband Engine (CBE), a heterogeneous multi-core processor, provides power-efficient and cost-effective high-performance computing. One application area is image analysis and visualisation, in particular registration of 2D cross-sections into 3D image datasets. Such techniques can be used to put different image modalities into spatial correspondence, for example, 2D images of histological cuts into morphological 3D frameworks. We evaluate the CBE-driven PlayStation 3 as a high performance, cost-effective computing platform by adapting a multimodal alignment procedure to several characteristic hardware properties. The optimisations are based on partitioning, vectorisation, branch reducing and loop unrolling techniques with special attention to 32-bit multiplies and limited local storage on the computing units. We show how a typical image analysis and visualisation problem, the multimodal registration of 2D cross-sections and 3D datasets, benefits from the multi-core based implementation of the alignment algorithm. We discuss several CBE-based optimisation methods and compare our results to standard solutions. More information and the source code are available from http://cbe.ipk-gatersleben.de. The results demonstrate that the CBE processor in a PlayStation 3 accelerates computational intensive multimodal registration, which is of great importance in biological/medical image processing. The PlayStation 3 as a low cost CBE-based platform offers an efficient option to conventional hardware to solve computational problems in image processing and bioinformatics.
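The vectorisation step can be illustrated with a NumPy analogue: scoring a 2D cross-section against every slice of a 3D volume in one array operation instead of looping pixel by pixel on a scalar unit. The sum-of-squared-differences metric below is a simple stand-in for the paper's multimodal similarity measure, and all names and sizes are illustrative.

```python
import numpy as np

def best_matching_slice(volume: np.ndarray, section: np.ndarray) -> int:
    """volume: (Z, Y, X) array; section: (Y, X) array.
    Returns the index of the slice minimizing the sum of squared
    differences (a stand-in for a true multimodal similarity)."""
    diffs = volume - section[None, :, :]        # broadcast over all slices
    ssd = np.einsum('zyx,zyx->z', diffs, diffs) # per-slice SSD in one pass
    return int(np.argmin(ssd))

vol = np.random.rand(64, 128, 128)
sec = vol[17] + 0.01 * np.random.rand(128, 128)
print(best_matching_slice(vol, sec))  # expected: 17
```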
STAR: FPGA-based software defined satellite transponder
NASA Astrophysics Data System (ADS)
Davalle, Daniele; Cassettari, Riccardo; Saponara, Sergio; Fanucci, Luca; Cucchi, Luca; Bigongiari, Franco; Errico, Walter
2013-05-01
This paper presents STAR, a flexible Telemetry, Tracking & Command (TT&C) transponder for Earth Observation (EO) small satellites, developed in collaboration with INTECS and SITAEL companies. With respect to state-of-the-art EO transponders, STAR includes the possibility of scientific data transfer thanks to the 40 Mbps downlink data-rate. This feature represents an important optimization in terms of hardware mass, which is important for EO small satellites. Furthermore, in-flight re-configurability of communication parameters via telecommand is important for in-orbit link optimization, which is especially useful for low orbit satellites where visibility can be as short as a few hundred seconds. STAR exploits the principles of digital radio to minimize the analog section of the transceiver. A 70 MHz intermediate frequency (IF) is the interface with an external S/X band radio-frequency front-end. The system is composed of a dedicated configurable high-speed digital signal processing part, the Signal Processor (SP), described in technology-independent VHDL and working with a clock frequency of 184.32 MHz, and a low-speed control part, the Control Processor (CP), based on the 32-bit Gaisler LEON3 processor clocked at 32 MHz, with SpaceWire and CAN interfaces. The quantization parameters were fine-tailored to reach a trade-off between hardware complexity and implementation loss, which is less than 0.5 dB at BER = 10^-5 for the RX chain. The IF ports require 8-bit precision. The system prototype is fitted on the Xilinx Virtex 6 VLX75T-FF484 FPGA, of which a space-qualified version has been announced. The total device occupation is 82%.
Monte Carlo dose calculation using a cell processor based PlayStation 3 system
NASA Astrophysics Data System (ADS)
Chow, James C. L.; Lam, Phil; Jaffray, David A.
2012-02-01
This study investigates the performance of the EGSnrc computer code coupled with Cell-based hardware in Monte Carlo simulation of radiation dose in radiotherapy. Performance evaluations of two processor-intensive functions, namely HOWNEAR and RANMAR_GET in the EGSnrc code, were carried out based on the 20-80 rule (Pareto principle). The execution speeds of the two functions were measured with the gprof profiler, recording the number of executions and the total time spent in each function. A testing architecture designed for the Cell processor was implemented in the evaluation using a PlayStation 3 (PS3) system. The evaluation results show that the algorithms examined are readily parallelizable on the Cell platform, provided that an architectural change to the EGSnrc is made. However, as the EGSnrc performance was limited by the PowerPC Processing Element in the PS3, a PC coupled with graphics processing units (GPGPU computing) may provide a more viable avenue for acceleration.
NASA Technical Reports Server (NTRS)
Phyne, J. R.; Nelson, M. D.
1975-01-01
The design and implementation of hardware and software systems involved in using a 40,000 bit/second communication line as the connecting link between an IMLAC PDS 1-D display computer and a Univac 1108 computer system were described. The IMLAC consists of two independent processors sharing a common memory. The display processor generates the deflection and beam control currents as it interprets a program contained in the memory; the minicomputer has a general instruction set and is responsible for starting and stopping the display processor and for communicating with the outside world through the keyboard, teletype, light pen, and communication line. The processing time associated with each data byte was minimized by designing the input and output processes as finite state machines which automatically sequence from each state to the next. Several tests of the communication link and the IMLAC software were made using a special low capacity computer grade cable between the IMLAC and the Univac.
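The byte-driven finite-state-machine idea can be sketched as follows: each received byte drives one transition-table step, so the per-byte processing cost is a lookup plus a small action. The framing protocol (sync and escape bytes) is invented for illustration, since the abstract does not describe the actual IMLAC link format.

```python
# Hypothetical framing: 0x7E delimits frames, 0x7D escapes the next byte.
IDLE, IN_FRAME, ESCAPED = range(3)
SYNC, ESC = 0x7E, 0x7D

def make_receiver():
    state = IDLE
    frame = bytearray()
    frames = []
    def on_byte(b):
        nonlocal state
        if state == IDLE:
            if b == SYNC:               # start of a frame
                frame.clear()
                state = IN_FRAME
        elif state == IN_FRAME:
            if b == SYNC:               # end of frame: emit and restart
                frames.append(bytes(frame))
                frame.clear()
            elif b == ESC:
                state = ESCAPED
            else:
                frame.append(b)
        else:                           # ESCAPED: take next byte literally
            frame.append(b)
            state = IN_FRAME
    return on_byte, frames

on_byte, frames = make_receiver()
for b in (0x7E, 0x01, 0x7D, 0x7E, 0x02, 0x7E):
    on_byte(b)
print(frames)  # one frame: bytes 01 7E 02
```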
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barhen, Jacob; Imam, Neena
2007-01-01
Revolutionary computing technologies are defined in terms of technological breakthroughs, which leapfrog over near-term projected advances in conventional hardware and software to produce paradigm shifts in computational science. For underwater threat source localization using information provided by a dynamical sensor network, one of the most promising computational advances builds upon the emergence of digital optical-core devices. In this article, we present initial results of sensor network calculations that focus on the concept of signal wavefront time-difference-of-arrival (TDOA). The corresponding algorithms are implemented on the EnLight processing platform recently introduced by Lenslet Laboratories. This tera-scale digital optical core processor is optimized for array operations, which it performs in a fixed-point-arithmetic architecture. Our results (i) illustrate the ability to reach the required accuracy in the TDOA computation, and (ii) demonstrate that a considerable speed-up can be achieved when using the EnLight 64a prototype processor as compared to a dual Intel Xeon™ processor.
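A minimal NumPy sketch of the TDOA concept follows: the arrival-time difference between two sensors is estimated as the lag that maximizes their cross-correlation. The sampling rate and synthetic wavefront are illustrative, and this scalar floating-point version says nothing about the EnLight's fixed-point array implementation.

```python
import numpy as np

def tdoa(sig_a: np.ndarray, sig_b: np.ndarray, fs: float) -> float:
    """Lag (seconds) of sig_b relative to sig_a via cross-correlation."""
    xcorr = np.correlate(sig_b, sig_a, mode='full')
    lag = np.argmax(xcorr) - (len(sig_a) - 1)
    return lag / fs              # positive: wavefront reaches B later

fs = 10_000.0
t = np.arange(0, 0.1, 1 / fs)
ping = np.exp(-((t - 0.02) ** 2) / 1e-5)  # wavefront seen at sensor A
delayed = np.roll(ping, 30)               # arrives 30 samples later at B
print(tdoa(ping, delayed, fs))            # ~ 0.003 s
```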
Selection of a Brine Processor Technology for NASA Manned Missions
NASA Technical Reports Server (NTRS)
Carter, Donald L.; Gleich, Andrew F.
2016-01-01
The current ISS Water Recovery System (WRS) reclaims water from crew urine, humidity condensate, and Sabatier product water. Urine is initially processed by the Urine Processor Assembly (UPA) which recovers 75% of the urine as distillate. The remainder of the water is present in the waste brine which is currently disposed of as trash on ISS. For future missions this additional water must be reclaimed due to the significant resupply penalty for missions beyond Low Earth Orbit (LEO). NASA has pursued various technology development programs for a brine processor in the past several years. This effort has culminated in a technology down-select to identify the optimum technology for future manned missions. The technology selection is based on various criteria, including mass, power, reliability, maintainability, and safety. Beginning in 2016 the selected technology will be transitioned to a flight hardware program for demonstration on ISS. This paper summarizes the technology selection process, the competing technologies, and the rationale for the technology selected for future manned missions.
Airborne optical tracking control system design study
NASA Astrophysics Data System (ADS)
1992-09-01
The Kestrel LOS Tracking Program involves the development of a computer and algorithms for use in passive tracking of airborne targets from a high-altitude balloon platform. The computer receives track-error signals from a video tracker connected to one of the imaging sensors. In addition, an on-board IRU (gyro), accelerometers, a magnetometer, and a two-axis inclinometer provide inputs which are used for initial acquisition and for coarse and fine tracking. Signals received by the control processor from the video tracker, IRU, accelerometers, magnetometer, and inclinometer are used to generate drive signals for the payload azimuth drive, the Gimballed Mirror System (GMS), and the Fast Steering Mirror (FSM). The hardware to be procured under the LOS tracking activity comprises the Controls Processor (CP), the IRU, and the FSM. The performance specifications for the GMS and the payload canister azimuth drive were established by the LOS tracking design team in an effort to achieve a tracking jitter of less than 3 micro-rad (1 sigma) per axis.
Software dependability in the Tandem GUARDIAN system
NASA Technical Reports Server (NTRS)
Lee, Inhwan; Iyer, Ravishankar K.
1995-01-01
Based on extensive field failure data for Tandem's GUARDIAN operating system, this paper discusses the evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process-pair technique in tolerating software faults. A model to describe the impact of software faults on the reliability of an overall system is proposed. The model is used to evaluate the significance of key factors that determine software dependability and to identify areas for improvement. An analysis of the data shows that about 77% of processor failures that are initially considered due to software are confirmed as software problems. The analysis shows that the use of process pairs to provide checkpointing and restart (originally intended for tolerating hardware faults) allows the system to tolerate about 75% of reported software faults that result in processor failures. The loose coupling between processors, which results in the backup execution (the processor state and the sequence of events) being different from the original execution, is a major reason for the measured software fault tolerance. Over two-thirds (72%) of measured software failures are recurrences of previously reported faults. Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.
RTEMS SMP and MTAPI for Efficient Multi-Core Space Applications on LEON3/LEON4 Processors
NASA Astrophysics Data System (ADS)
Cederman, Daniel; Hellstrom, Daniel; Sherrill, Joel; Bloom, Gedare; Patte, Mathieu; Zulianello, Marco
2015-09-01
This paper presents the final result of a European Space Agency (ESA) activity aimed at improving the software support for LEON processors used in SMP configurations. One of the benefits of using a multicore system in an SMP configuration is that in many instances it is possible to better utilize the available processing resources by load balancing between cores. This, however, comes at the cost of having to synchronize operations between cores, leading to increased complexity. While in an AMP system one can use multiple instances of operating systems that are only uni-processor capable, an SMP system requires the operating system to be written to support multicore systems. In this activity we have improved and extended the SMP support of the RTEMS real-time operating system and ensured that it fully supports the multicore-capable LEON processors. The targeted hardware in the activity has been the GR712RC, a dual-core LEON3FT processor, and the functional prototype of ESA's Next Generation Multiprocessor (NGMP), a quad-core LEON4 processor. The final version of the NGMP is now available as a product under the name GR740. An implementation of the Multicore Task Management API (MTAPI) has been developed as part of this activity to aid in the parallelization of applications for RTEMS SMP. It allows for simplified development of parallel applications using the task-based programming model. An existing space application, the Gaia Video Processing Unit, has been ported to RTEMS SMP using the MTAPI implementation to demonstrate the feasibility and usefulness of multicore processors for space payload software. The activity is funded by ESA under contract 4000108560/13/NL/JK. Gedare Bloom is supported in part by NSF CNS-0934725.
Accuracy of a Screening Tool for Early Identification of Language Impairment
ERIC Educational Resources Information Center
Uilenburg, Noëlle; Wiefferink, Karin; Verkerk, Paul; van Denderen, Margot; van Schie, Carla; Oudesluys-Murphy, Ann-Marie
2018-01-01
Purpose: A screening tool called the "VTO Language Screening Instrument" (VTO-LSI) was developed to enable more uniform and earlier detection of language impairment. This report, consisting of 2 retrospective studies, focuses on the effects of using the VTO-LSI compared to regular detection procedures. Method: Study 1 retrospectively…
High-speed real-time animated displays on the ADAGE (trademark) RDS 3000 raster graphics system
NASA Technical Reports Server (NTRS)
Kahlbaum, William M., Jr.; Ownbey, Katrina L.
1989-01-01
Techniques which may be used to increase the animation update rate of real-time computer raster graphic displays are discussed. They were developed on the ADAGE RDS 3000 graphic system in support of the Advanced Concepts Simulator at the NASA Langley Research Center. These techniques involve the use of a special purpose parallel processor, for high-speed character generation. The description of the parallel processor includes the Barrel Shifter which is part of the hardware and is the key to the high-speed character rendition. The final result of this total effort was a fourfold increase in the update rate of an existing primary flight display from 4 to 16 frames per second.
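The barrel shifter's role, rotating a word by an arbitrary amount in a single step, can be modeled in a few lines. The 32-bit width and the glyph-row usage below are illustrative assumptions, not details from the report.

```python
def barrel_rotate_left(word: int, amount: int, width: int = 32) -> int:
    """Pure-Python model of a single-cycle barrel rotate."""
    amount %= width
    mask = (1 << width) - 1
    return ((word << amount) | (word >> (width - amount))) & mask

# Shifting a 1-bit-per-pixel glyph row to an arbitrary screen x offset,
# the kind of operation a hardware barrel shifter does in one step:
row = 0b11110000_00000000_00000000_00000000
print(f"{barrel_rotate_left(row, 8):032b}")
```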
International Space Station (ISS)
2001-02-01
The Marshall Space Flight Center (MSFC) is responsible for designing and building the life support systems that will provide the crew of the International Space Station (ISS) a comfortable environment in which to live and work. Scientists and engineers at the MSFC are working together to provide the ISS with systems that are safe, efficient, and cost-effective. These compact and powerful systems are collectively called the Environmental Control and Life Support Systems, or simply, ECLSS. This photograph shows the development Water Processor located in two racks in the ECLSS test area at the Marshall Space Flight Center. Actual waste water, simulating Space Station waste, is generated and processed through the hardware to evaluate the performance of technologies in the flight Water Processor design.
[Hardware for graphics systems].
Goetz, C
1991-02-01
In all personal computer applications, be it for private or professional use, the decision of which "brand" of computer to buy is of central importance. In the USA Apple computers are mainly used in universities, while in Europe computers of the so-called "industry standard" by IBM (or clones thereof) have been increasingly used for many years. Independently of any brand name considerations, the computer components purchased must meet the current (and projected) needs of the user. Graphic capabilities and standards, processor speed, the use of co-processors, as well as input and output devices such as "mouse", printers and scanners are discussed. This overview is meant to serve as a decision aid. Potential users are given a short but detailed summary of current technical features.
Automated Sequence Processor: Something Old, Something New
NASA Technical Reports Server (NTRS)
Streiffert, Barbara; Schrock, Mitchell; Fisher, Forest; Himes, Terry
2012-01-01
High productivity is required for operations teams to meet schedules, and risk must be minimized. Scripting is used to automate processes, and these scripts perform essential operations functions. The Automated Sequence Processor (ASP) was a grass-roots task built to automate the command uplink process; a system engineering task for ASP revitalization has been organized. ASP is a set of approximately 200 scripts written in Perl, C Shell, AWK, and other scripting languages. ASP processes, checks, and packages non-interactive commands automatically. Non-interactive commands are guaranteed to be safe and have been checked by hardware or software simulators. ASP verifies that commands are non-interactive, processes them through a command simulator, and then packages them if there are no errors. ASP must be active 24 hours a day, 7 days a week.
A high-speed, large-capacity, 'jukebox' optical disk system
NASA Technical Reports Server (NTRS)
Ammon, G. J.; Calabria, J. A.; Thomas, D. T.
1985-01-01
Two optical disk 'jukebox' mass storage systems which provide access to any data in a store of 10^13 bits (1250 gigabytes) within six seconds have been developed. The optical disk jukebox system is divided into two units, including a hardware/software controller and a disk drive. The controller provides flexibility and adaptability through a ROM-based microcode-driven data processor and a ROM-based software-driven control processor. The cartridge storage module contains 125 optical disks housed in protective cartridges. Attention is given to a conceptual view of the disk drive unit, the NASA optical disk system, the NASA database management system configuration, the NASA optical disk system interface, and an open systems interconnect reference model.
Shao, Chenzhong; Tanaka, Shuji; Nakayama, Takahiro; Hata, Yoshiyuki; Bartley, Travis; Nonomura, Yutaka; Muroyama, Masanori
2017-08-28
Robot tactile sensation can enhance human-robot communication in terms of safety, reliability and accuracy. The final goal of our project is to widely cover a robot body with a large number of tactile sensors, which has significant advantages such as accurate object recognition, high sensitivity and high redundancy. In this study, we developed a multi-sensor system with dedicated Complementary Metal-Oxide-Semiconductor (CMOS) Large-Scale Integration (LSI) circuit chips (referred to as "sensor platform LSI") as a framework of a serial bus-based tactile sensor network system. The sensor platform LSI supports three types of sensors: an on-chip temperature sensor, off-chip capacitive and resistive tactile sensors, and communicates with a relay node via a bus line. The multi-sensor system was first constructed on a printed circuit board to evaluate basic functions of the sensor platform LSI, such as capacitance-to-digital and resistance-to-digital conversion. Then, two kinds of external sensors, nine sensors in total, were connected to two sensor platform LSIs, and temperature, capacitive and resistive sensing data were acquired simultaneously. Moreover, we fabricated flexible printed circuit cables to demonstrate the multi-sensor system with 15 sensor platform LSIs operating simultaneously, which showed a more realistic implementation in robots. In conclusion, the multi-sensor system with up to 15 sensor platform LSIs on a bus line supporting temperature, capacitive and resistive sensing was successfully demonstrated.
El Bcheraoui, Charbel; Sutton, Madeline Y; Hardnett, Felicia P; Jones, Sandra B
2013-01-01
Over 1.1 million Americans are living with human immunodeficiency virus (HIV), and African-American youth and young adults are disproportionately affected. Condoms are the most effective prevention tool, yet data regarding condom use patterns for African-American college youth are lacking. To inform and strengthen HIV prevention strategies with African-American college-age youth, we surveyed students attending 24 historically Black colleges and universities regarding condom use patterns. Students were administered anonymous questionnaires online to explore knowledge, attitudes, and practices related to condom use during last sexual intercourse (LSI). Among 824 sexually active respondents (51.8% female, median age 20 years, 90.6% heterosexuals), 526 (63.8%) reported condom use during LSI. Students who used condoms for disease prevention, whose mothers completed high school or had some college education or completed college were more likely to have used a condom during LSI. Spontaneity of sexual encounters, not feeling at risk of HIV, and partner-related perceptions were associated with condom non-use during LSI (p<0.05). Over a third of our college youth sample did not use a condom during LSI and may benefit from increased condom education efforts. These efforts should highlight condoms' effectiveness in protection from HIV. Future HIV education and prevention strategies with similar groups of young adults should explore the implications of maternal education, clarify perceptions of HIV risk, and consider strategies that increase consistent condom use to interrupt sexual transmission of HIV.
Gomez-Huelgas, R; Jansen-Chaparro, S; Baca-Osorio, A J; Mancera-Romero, J; Tinahones, F J; Bernal-López, M R
2015-06-01
The impact of a lifestyle intervention (LSI) program for the long-term management of subjects with metabolic syndrome in a primary care setting is not known. This 3-year prospective controlled trial randomized adult subjects with metabolic syndrome to receive an intensive LSI or usual care in a community health centre in Malaga, Spain. LSI subjects received instruction on the Mediterranean diet and a regular aerobic exercise program from their primary care professionals. The primary outcome included changes from baseline in the different components of metabolic syndrome (abdominal circumference, blood pressure, HDL-cholesterol, fasting plasma glucose and triglycerides). Among the 2,492 subjects screened, 601 subjects with metabolic syndrome (24.1%) were randomized to LSI (n = 298) or to usual care (n = 303); of these, 77% and 58%, respectively, completed the study. At the end of the study period, LSI resulted in significant differences vs. usual care in abdominal circumference (-0.4 ± 6 cm vs. +2.1 ± 6.7 cm, p < 0.001), systolic blood pressure (-5.5 ± 15 mmHg vs. -0.6 ± 19 mmHg, p = 0.004), diastolic blood pressure (-4.6 ± 10 mmHg vs. -0.2 ± 13 mmHg, p < 0.001) and HDL-cholesterol (+4 ± 12 mg/dL vs. +2 ± 12 mg/dL, p = 0.05); however, there were no differences in fasting plasma glucose and triglyceride concentrations (-4 ± 35 mg/dl vs. -1 ± 32 mg/dl, p = 0.43 and -0.4 ± 83 mg/dl vs. +6 ± 113 mg/dl, p = 0.28). Intensive LSI counseling provided by primary care professionals resulted in significant improvements in abdominal circumference, blood pressure and HDL-cholesterol but had limited effects on glucose and triglyceride levels in patients with metabolic syndrome.
Sample entropy predicts lifesaving interventions in trauma patients with normal vital signs.
Naraghi, L; Mejaddam, A Y; Birkhan, O A; Chang, Y; Cropano, C M; Mesar, T; Larentzakis, A; Peev, M; Sideris, A C; Van der Wilden, G M; Imam, A M; Hwabejire, J O; Velmahos, G C; Fagenholz, P J; Yeh, D; de Moya, M A; King, D R
2015-08-01
Heart rate complexity, commonly described as a "new vital sign," has shown promise in predicting injury severity, but its use in clinical practice is not yet widely adopted. We previously demonstrated the ability of this noninvasive technology to predict lifesaving interventions (LSIs) in trauma patients. This study was conducted to prospectively evaluate the utility of real-time, automated, noninvasive, instantaneous sample entropy (SampEn) analysis to predict the need for an LSI in a trauma alert population presenting with normal vital signs. Prospective enrollment of patients who met criteria for trauma team activation and presented with normal vital signs was conducted at a level I trauma center. High-fidelity electrocardiogram recording was used to calculate SampEn and the SD of the normal-to-normal R-R interval (SDNN) continuously in real time for 2 hours with a portable, handheld device. Patients who received an LSI were compared to patients without any intervention (non-LSI). Multivariable analysis was performed to control for differences between the groups. Treating clinicians were blinded to results. Of 129 patients enrolled, 38 (29%) received 136 LSIs within 24 hours of hospital arrival. Initial systolic blood pressure was similar in both groups. Lifesaving intervention patients had a lower Glasgow Coma Scale score. The mean SampEn on presentation was 0.7 (0.4-1.2) in the LSI group compared to 1.5 (1.1-2.0) in the non-LSI group (P < .0001). The area under the curve with initial SampEn alone was 0.73 (95% confidence interval [CI], 0.64-0.81) and increased to 0.93 (95% CI, 0.89-0.98) after adding sedation to the model. Sample entropy of less than 0.8 yields sensitivity, specificity, negative predictive value, and positive predictive value of 58%, 86%, 82%, and 65%, respectively, with an overall accuracy of 76% for predicting an LSI. SDNN had no predictive value. In trauma patients with normal presenting vital signs, decreased SampEn is an independent predictor of the need for LSI. Real-time SampEn analysis may be a useful adjunct to standard vital signs monitoring. Adoption of real-time, instantaneous SampEn monitoring for trauma patients, especially in resource-constrained environments, should be considered.
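Sample entropy itself is well defined in the literature (Richman and Moorman's SampEn). The sketch below is a straightforward NumPy rendering using conventional defaults (m = 2, r = 0.2 × SD), which are illustrative and not necessarily the parameters used in this study.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) = -ln(A/B); B counts template matches of length m and
    A of length m+1 (Chebyshev distance <= r), self-matches excluded."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    n_templates = len(x) - m                # same count for both lengths
    def matches(length):
        tpl = np.array([x[i:i + length] for i in range(n_templates)])
        total = 0
        for i in range(n_templates):
            dist = np.max(np.abs(tpl - tpl[i]), axis=1)
            total += int(np.sum(dist <= r)) - 1  # drop the self-match
        return total
    b, a = matches(m), matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float('inf')

rr = np.random.normal(0.8, 0.05, 300)       # synthetic R-R series (s)
print(sample_entropy(rr))                   # higher = more irregular
```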
Implementing real-time robotic systems using CHIMERA II
NASA Technical Reports Server (NTRS)
Stewart, David B.; Schmitz, Donald E.; Khosla, Pradeep K.
1990-01-01
A description is given of the CHIMERA II programming environment and operating system, which was developed for implementing real-time robotic systems. Sensor-based robotic systems contain both general- and special-purpose hardware, and thus the development of applications tends to be a time-consuming task. The CHIMERA II environment is designed to reduce the development time by providing a convenient software interface between the hardware and the user. CHIMERA II supports flexible hardware configurations which are based on one or more VME-backplanes. All communication across multiple processors is transparent to the user through an extensive set of interprocessor communication primitives. CHIMERA II also provides a high-performance real-time kernel which supports both deadline and highest-priority-first scheduling. The flexibility of CHIMERA II allows hierarchical models for robot control, such as NASREM, to be implemented with minimal programming time and effort.
Design Tools for Reconfigurable Hardware in Orbit (RHinO)
NASA Technical Reports Server (NTRS)
French, Mathew; Graham, Paul; Wirthlin, Michael; Larchev, Gregory; Bellows, Peter; Schott, Brian
2004-01-01
The Reconfigurable Hardware in Orbit (RHinO) project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. These tools leverage an established FPGA design environment and focus primarily on space-effects mitigation and power optimization. The project is creating software to automatically test and evaluate the single-event-upset (SEU) sensitivities of an FPGA design and insert mitigation techniques. Extensions to the tool suite will also allow evolvable algorithm techniques to reconfigure around single-event-latchup (SEL) events. In the power domain, tools are being created for dynamic power visualization and optimization. Thus, this technology seeks to enable the use of reconfigurable hardware in orbit, via an integrated design tool suite aiming to reduce risk, cost, and design time of multimission reconfigurable space processors using SRAM-based FPGAs.
MetAlign 3.0: performance enhancement by efficient use of advances in computer hardware.
Lommen, Arjen; Kools, Harrie J
2012-08-01
A new, multi-threaded version of the GC-MS and LC-MS data processing software, metAlign, has been developed which is able to utilize multiple cores on one PC. This new version was tested using three different multi-core PCs with different operating systems. The performance of noise reduction, baseline correction and peak-picking was 8-19 fold faster compared to the previous version on a single core machine from 2008. The alignment was 5-10 fold faster. Factors influencing the performance enhancement are discussed. Our observations show that performance scales with the increase in processor core numbers we currently see in consumer PC hardware development.
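The parallelization pattern described, independent chunks of spectral data farmed out to one worker per core, translates naturally into a sketch. pick_peaks below is a placeholder for metAlign's actual noise-reduction and peak-picking routine, and the threshold is invented for illustration.

```python
from multiprocessing import Pool

def pick_peaks(scan_window):
    """Placeholder per-window routine: baseline-correct, keep big peaks."""
    baseline = min(scan_window)
    return [v - baseline for v in scan_window if v - baseline > 100.0]

def process_run(scan_windows, n_cores=8):
    """Farm independent scan windows out to one worker per core."""
    with Pool(processes=n_cores) as pool:
        return pool.map(pick_peaks, scan_windows)

if __name__ == "__main__":
    windows = [[90.0, 250.0, 120.0], [100.0, 105.0, 400.0]]
    print(process_run(windows, n_cores=2))  # [[160.0], [300.0]]
```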
NASA Astrophysics Data System (ADS)
Lele, Sanjiva K.
2002-08-01
Funds were received in April 2001 under the Department of Defense DURIP program for construction of a 48 processor high performance computing cluster. This report details the hardware which was purchased and how it has been used to enable and enhance research activities directly supported by, and of interest to, the Air Force Office of Scientific Research and the Department of Defense. The report is divided into two major sections. The first section after this summary describes the computer cluster, its setup, and some cluster performance benchmark results. The second section explains ongoing research efforts which have benefited from the cluster hardware, and presents highlights of those efforts since installation of the cluster.
A digitally implemented preambleless demodulator for maritime and mobile data communications
NASA Astrophysics Data System (ADS)
Chalmers, Harvey; Shenoy, Ajit; Verahrami, Farhad B.
The hardware design and software algorithms for a low-bit-rate, low-cost, all-digital preambleless demodulator are described. The demodulator operates under severe high-noise conditions, fast Doppler frequency shifts, large frequency offsets, and multipath fading. Sophisticated algorithms, including a fast Fourier transform (FFT)-based burst acquisition algorithm, a cycle-slip resistant carrier phase tracker, an innovative Doppler tracker, and a fast acquisition symbol synchronizer, were developed and extensively simulated for reliable burst reception. The compact digital signal processor (DSP)-based demodulator hardware uses a unique personal computer test interface for downloading test data files. The demodulator test results demonstrate a near-ideal performance within 0.2 dB of theory.
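One common form of FFT-based burst acquisition is to strip the modulation by squaring and then locate the spectral peak. The sketch below illustrates that idea under the assumption of BPSK signalling and one symbol per sample, neither of which is specified in the abstract; all parameters are illustrative.

```python
import numpy as np

def coarse_freq_offset(burst: np.ndarray, fs: float) -> float:
    """Estimate carrier frequency offset of a BPSK burst (Hz)."""
    wiped = burst ** 2                  # squaring removes +/-1 modulation,
                                        # doubling the carrier offset
    spec = np.fft.fft(wiped, 4 * len(wiped))   # zero-pad for finer bins
    freqs = np.fft.fftfreq(spec.size, 1 / fs)
    peak = freqs[np.argmax(np.abs(spec))]
    return peak / 2.0                   # undo the frequency doubling

fs = 48_000.0
n = np.arange(2048)
bits = np.sign(np.random.randn(2048))   # one symbol per sample, for brevity
rx = bits * np.exp(2j * np.pi * 1234.5 * n / fs)
print(coarse_freq_offset(rx, fs))       # ~ 1234.5 Hz
```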
Strong Motion Seismograph Based On MEMS Accelerometer
NASA Astrophysics Data System (ADS)
Teng, Y.; Hu, X.
2013-12-01
The MEMS strong-motion seismograph we developed uses a modular design for both its software and its hardware, so it can fit various needs in different application situations. The hardware consists of a MEMS accelerometer, a control processor system, a data-storage system, a wired real-time data transmission system over an IP network, a wireless data transmission module using 3G broadband, a GPS calibration module, and a power supply system with a large-volume lithium battery. The sensor is a three-axis, 14-bit, digital-output MEMS accelerometer with a noise level of about 99 μg/√Hz and a dynamically selectable full scale of ±2 g to ±8 g. Its output data rate ranges from 1.56 Hz to 800 Hz, its maximum current consumption is merely 165 μA, and the device is small enough to be available in a 3 mm × 3 mm × 1 mm QFN package. Furthermore, there is access to both low-pass and high-pass filtered data, which minimizes the data analysis required for earthquake signal detection and simplifies post-processing. The control system uses a 32-bit low-power embedded ARM9 processor (S3C2440) based on the Linux operating system and clocked at 400 MHz, with 64 MB of SDRAM main memory and 256 MB of flash memory; an external high-capacity SD card can easily be added. The system can therefore meet the requirements for data acquisition, processing, transmission, and storage. Both wired and wireless networks support remote real-time monitoring, data transmission, system maintenance, status monitoring, and software updates. The software is built on embedded Linux with a multi-layer design: the middle layer contains the sensor hardware driver (IIC-bus interface driver, I/O driver, and asynchronous-notification driver), data acquisition, and earthquake triggering; the application layer, which is multi-threaded, includes the earthquake-parameter module, local database management, data transmission, remote monitoring, FTP service, and so on. The whole seismograph is encapsulated in a small aluminum box measuring 80 mm × 120 mm × 55 mm, and the internal battery can operate continuously for more than 24 hours. The instrument has the advantages of low cost and easy installation, so units can be concentrated in urban regions or areas needing special care: (a) it automatically picks up earthquake events and saves the data in wave-event files and hourly files, making it suitable for monitoring strong earthquakes, explosions, and bridge or building health; (b) it automatically calculates earthquake parameters and transfers them over the 3G wireless broadband network, so that a ground-motion parameter quick-report sensor network can produce high-resolution shake maps for emergency rescue when a large earthquake occurs; (c) by loading a P-wave detection program module, it can be used for early warning of large earthquakes; and (d) it can easily form a high-density seismic monitoring network with remote control and modern intelligent earthquake sensors.
Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Villa, Oreste; Tumeo, Antonino; Secchi, Simone
Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% accuracy.
Development of 3-Year Roadmap to Transform the Discipline of Systems Engineering
2010-03-31
Only fragments of the text are recoverable: a note that magnetic core memory was entirely constructed by human hands until it was superseded, limiting production to how quickly humans could physically build it; an observation that for their mainframe computers IBM develops the applications, operating system, computer hardware, and microprocessors (with off-the-shelf standard memory); and a remark that processor developers work on potential computational and memory pipelines to support the required performance capabilities using the available transistors.
NASA Technical Reports Server (NTRS)
Akashi, M.; Kawaguchi, S.; Watanabe, Z.; Misaki, A.; Niwa, M.; Okamoto, Y.; Fujinaga, T.; Ichimura, M.; Shibata, T.; Dake, S.
1985-01-01
A reader system for the detection of cascade showers via luminescence induced by heating a sensitive material (BaSO4:Eu) is developed. The reader system is composed of the following six instruments: (1) heater, (2) light guide, (3) image intensifier, (4) CCD camera, (5) image processor, and (6) microcomputer. The efficiency of these apparatuses and the software application for image analysis are reported.
2012-11-01
Only fragments of the text are recoverable: a contrast between designs with few sensors and complex computations versus many sensors and simple computation; and a section on the challenges of nano-enabled neuromorphic chips, noting that neuromorphic processors, based on the highly parallelized computing architecture of the mammalian brain, show great promise in a wide variety of scenarios, and that this fundamentally different approach, frequently referred to as neuromorphic computing, is thought to be better able to solve fuzzy…
Neuron Design in Neuromorphic Computing Systems and Its Application in Wireless Communications
2017-03-01
Only fragments of the text are recoverable: the work aims to (a) develop data representation using hardware spike-timing-dependent encoding for neuromorphic processors, and (b) explore the applications of neuromorphic computing; the envisioned architecture will serve as the foundation for unprecedented capabilities in real-time applications such as MIMO channel estimation.
A note on parallel and pipeline computation of fast unitary transforms
NASA Technical Reports Server (NTRS)
Fino, B. J.; Algazi, V. R.
1974-01-01
The parallel and pipeline organization of fast unitary transform algorithms such as the Fast Fourier Transform are discussed. The efficiency is pointed out of a combined parallel-pipeline processor of a transform such as the Haar transform, in which 2^n - 1 hardware butterflies generate a transform of order 2^n every computation cycle.
Systolic Signal Processor/High Frequency Direction Finding
1990-10-01
Only fragments of the text are recoverable: the mapping of the multiple signal classification (MUSIC) algorithm and the finite impulse response (FIR) filter onto the testbed hardware was supported by joint sponsorship of the block and major bid programs; systolic implementations of a four-channel FIR filter and the MUSIC algorithm were developed for computational throughput; and the MUSIC algorithm was mated to a bank of FIR filters and a four-channel data acquisition subsystem. A complete description…
Distributed Systems Technology Survey.
1987-03-01
Only fragments of the scanned text are recoverable (parts are heavily garbled): a sentence ending "…and protocols"; a section on hardware technology noting that economic factors were a major reason for the proliferation of distributed systems (processors, memory, and magnetic and optical storage); a passage on a node examining destined messages and performing the appropriate forwarding; a statement concerning whether a lightweight process mechanism is essential to support commonly used operations, as in the Xerox PARC environment [31]; a note that shared file servers, discussed below, are essential to the success of such a scheme; and the heading of a section on security in distributed systems.
Trust-Management, Intrusion-Tolerance, Accountability, and Reconstitution Architecture (TIARA)
2009-12-01
Keywords: tainting, tagged metadata, architecture, hardware, processor, microkernel, zero-kernel, co-design. Only fragments of the text are recoverable: early microkernels (e.g., [27]) embraced the idea that it was beneficial to reduce the kernel, separating out services as separate processes isolated from one another, but saw limited adoption; more recently, Tanenbaum [72] notes the security virtues of microkernels and suggests the modern importance of security makes it…
MOBS - A modular on-board switching system
NASA Astrophysics Data System (ADS)
Berner, W.; Grassmann, W.; Piontek, M.
The authors describe a multibeam satellite system that is designed for business services and for communications at a high bit rate. The repeater is regenerative with a modular onboard switching system. It acts not only as baseband switch but also as the central node of the network, performing network control and protocol evaluation. The hardware is based on a modular bus/memory architecture with associated processors.
Parallel ICA and its hardware implementation in hyperspectral image analysis
NASA Astrophysics Data System (ADS)
Du, Hongtao; Qi, Hairong; Peterson, Gregory D.
2004-04-01
Advances in hyperspectral imaging have dramatically boosted remote sensing applications by providing abundant information using hundreds of contiguous spectral bands. However, the high volume of information also results in an excessive computation burden. Since most materials have specific characteristics only at certain bands, much of this information is redundant. This property of hyperspectral images has motivated many researchers to study various dimensionality reduction algorithms, including Projection Pursuit (PP), Principal Component Analysis (PCA), wavelet transform, and Independent Component Analysis (ICA), where ICA is one of the most popular techniques. It searches for a linear or nonlinear transformation which minimizes the statistical dependence between spectral bands. Through this process, ICA can eliminate superfluous information while retaining practical information, given only the observations of hyperspectral images. One hurdle in applying ICA to hyperspectral image (HSI) analysis, however, is its long computation time, especially for high-volume hyperspectral data sets. Even the most efficient method, FastICA, is a very time-consuming process. In this paper, we present a parallel ICA (pICA) algorithm derived from FastICA. During the unmixing process, pICA divides the estimation of the weight matrix into sub-processes which can be conducted in parallel on multiple processors. The decorrelation process is decomposed into internal decorrelation and external decorrelation, which perform weight vector decorrelations within individual processors and between cooperative processors, respectively. In order to further improve the performance of pICA, we seek hardware solutions in the implementation of pICA. To date, there have been very few hardware designs for ICA-related processes, due to the complicated and iterative computation. This paper discusses the capacity limitations of FPGA implementations of pICA in HSI analysis. An Application-Specific Integrated Circuit (ASIC) synthesis is designed for pICA-based dimensionality reduction in HSI analysis. The pICA design is implemented using standard-height cells and targets a TSMC 0.18 micron process. During the synthesis procedure, three ICA-related reconfigurable components are developed for reuse and retargeting purposes. Preliminary results show that the standard-height-cell-based ASIC synthesis provides an effective solution for pICA and ICA-related processes in HSI analysis.
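The unit-estimation step that pICA distributes can be summarized by the standard FastICA one-unit update, w ← E[x·g(wᵀx)] − E[g′(wᵀx)]·w followed by renormalization. The NumPy sketch below shows that serial kernel under the usual assumption of pre-whitened data, with the internal/external decorrelation steps of pICA omitted; all names are illustrative.

```python
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-6):
    """One FastICA weight vector for pre-whitened data X (bands x samples),
    using the tanh contrast nonlinearity."""
    n_bands = X.shape[0]
    rng = np.random.default_rng(0)
    w = rng.standard_normal(n_bands)
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        wx = w @ X
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:   # converged up to sign
            return w_new
        w = w_new
    return w
```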
Launching GUPPI: the Green Bank Ultimate Pulsar Processing Instrument
NASA Astrophysics Data System (ADS)
DuPlain, Ron; Ransom, Scott; Demorest, Paul; Brandt, Patrick; Ford, John; Shelton, Amy L.
2008-08-01
The National Radio Astronomy Observatory (NRAO) is launching the Green Bank Ultimate Pulsar Processing Instrument (GUPPI), a prototype flexible digital signal processor designed for pulsar observations with the Robert C. Byrd Green Bank Telescope (GBT). GUPPI uses field programmable gate array (FPGA) hardware and design tools developed by the Center for Astronomy Signal Processing and Electronics Research (CASPER) at the University of California, Berkeley. The NRAO has been concurrently developing GUPPI software and hardware using minimal software resources. The software handles instrument monitor and control, data acquisition, and hardware interfacing. GUPPI is currently an expert-only spectrometer, but supports future integration with the full GBT production system. The NRAO was able to take advantage of the unique flexibility of the CASPER FPGA hardware platform, develop hardware and software in parallel, and build a suite of software tools for monitoring, controlling, and acquiring data with a new instrument over a short timeline of just a few months. The NRAO interacts regularly with CASPER and its users, and GUPPI stands as an example of what reconfigurable computing and open-source development can do for radio astronomy. GUPPI is modular for portability, and the NRAO provides the results of development as an open-source resource.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-04
...: (1) there is inadequate lead time to permit the development of the necessary technology giving... with section 209(e)(1) of the Clean Air Act, California's LSI regulations must not affect new farming... Act if there is inadequate lead-time to permit the development of technology necessary to meet those...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-22
... franchise fee payable to the Government that is based upon consideration of the probable value to the... depreciation. Thus, if we use the standard LSI formula to establish the required minimum franchise fee for the... franchise fee that we would establish under the standard LSI formula, this business decision relies on...
Deep learning for medical image segmentation - using the IBM TrueNorth neurosynaptic system
NASA Astrophysics Data System (ADS)
Moran, Steven; Gaonkar, Bilwaj; Whitehead, William; Wolk, Aidan; Macyszyn, Luke; Iyer, Subramanian S.
2018-03-01
Deep convolutional neural networks have found success in semantic image segmentation tasks in computer vision and medical imaging. These algorithms are executed on conventional von Neumann processor architectures or GPUs. This is suboptimal. Neuromorphic processors that replicate the structure of the brain are better suited to train and execute deep learning models for image segmentation by relying on massively parallel processing. However, given that they closely emulate the human brain, on-chip hardware and digital memory limitations also constrain them. Adapting deep learning models to execute image segmentation tasks on such chips requires specialized training and validation. In this work, we demonstrate, for the first time, spinal image segmentation performed using a deep learning network implemented on the neuromorphic hardware of the IBM TrueNorth Neurosynaptic System, and we validate the performance of our network by comparing it to human-generated segmentations of spinal vertebrae and disks. To achieve this on neuromorphic hardware, the training model constrains the coefficients of individual neurons to {-1, 0, 1} using the Energy Efficient Deep Neuromorphic (EEDN) network training algorithm. Given the 1 million neurons and 256 million synapses, the scale and size of the neural network implemented by the IBM TrueNorth allow us to execute the requisite mapping between segmented images and non-uniform intensity MR images >20 times faster than on a GPU-accelerated network while using <0.1 W. This speed and efficiency imply that a trained neuromorphic chip can be deployed in intra-operative environments where real-time medical image segmentation is necessary.
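The {-1, 0, 1} weight constraint can be illustrated with a simple ternarization helper: full-precision shadow weights are kept for the gradient step while the deployed copy is snapped to ternary values. The threshold below is an illustrative choice, not the EEDN value.

```python
import numpy as np

def ternarize(w: np.ndarray, threshold: float = 0.05) -> np.ndarray:
    """Snap weights to {-1, 0, 1}: small magnitudes to 0, the rest to sign."""
    q = np.sign(w)
    q[np.abs(w) < threshold] = 0.0
    return q

shadow = np.array([0.30, -0.02, -0.80, 0.04])  # full-precision shadow weights
print(ternarize(shadow))                       # [ 1.  0. -1.  0.]
```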
Research in software allocation for advanced manned mission communications and tracking systems
NASA Technical Reports Server (NTRS)
Warnagiris, Tom; Wolff, Bill; Kusmanoff, Antone
1990-01-01
An assessment of the planned processing hardware and software/firmware for the Communications and Tracking System of the Space Station Freedom (SSF) was performed. The intent of the assessment was to determine the optimum distribution of software/firmware in the processing hardware for maximum throughput with minimum required memory. As a product of the assessment process, an assessment methodology was to be developed that could be used for similar assessments of future manned spacecraft system designs. The assessment process was hampered by changing requirements for the Space Station. As a result, the initial objective of determining the optimum software/firmware allocation was not fulfilled, but several useful conclusions and recommendations resulted from the assessment. It was concluded that the assessment process would not be completely successful for a system with changing requirements. It was also concluded that memory and hardware requirements were being modified to fit as a consequence of the change process, and although throughput could not be quantified, potential problem areas could be identified. Finally, inherent flexibility of the system design was essential for the success of a system design with changing requirements. Recommendations resulting from the assessment included development of common software for some embedded controller functions, reduction of embedded processor requirements by hardwiring some Orbital Replacement Units (ORUs) to make better use of processor capabilities, and improvement in communications between software development personnel to enhance the integration process. Lastly, a critical observation was made that the software integration tasks did not appear to be addressed in the design process to the degree necessary to satisfy the system requirements.
Exascale Hardware Architectures Working Group
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hemmert, S; Ang, J; Chiang, P
2011-03-15
The ASC Exascale Hardware Architecture working group is challenged to provide input on the following areas impacting the future use and usability of potential exascale computer systems: processor, memory, and interconnect architectures, as well as the power and resilience of these systems. Going forward, there are many challenging issues that will need to be addressed. First, power constraints in processor technologies will lead to steady increases in parallelism within a socket. Additionally, all cores may not be fully independent nor fully general purpose. Second, there is a clear trend toward less balanced machines, in terms of compute capability compared to memory and interconnect performance. In order to mitigate the memory issues, memory technologies will introduce 3D stacking, eventually moving on-socket and likely on-die, providing greatly increased bandwidth but unfortunately also likely providing smaller memory capacity per core. Off-socket memory, possibly in the form of non-volatile memory, will create a complex memory hierarchy. Third, communication energy will dominate the energy required to compute, such that interconnect power and bandwidth will have a significant impact. All of the above changes are driven by the need for greatly increased energy efficiency, as current technology will prove unsuitable for exascale due to the unsustainable power requirements of such a system. These changes will have the most significant impact on programming models and algorithms, but they will be felt across all layers of the machine. There is a clear need to engage all ASC working groups in planning for how to deal with technological changes of this magnitude. The primary function of the Hardware Architecture Working Group is to facilitate codesign with hardware vendors to ensure future exascale platforms are capable of efficiently supporting the ASC applications, which in turn need to meet the mission needs of the NNSA Stockpile Stewardship Program. This issue is relatively immediate, as there is only a small window of opportunity to influence hardware design for 2018 machines. Given the short timeline, a firm co-design methodology with vendors is of prime importance.
Sensitivity of subject-specific models to errors in musculo-skeletal geometry.
Carbone, V; van der Krogt, M M; Koopman, H F J M; Verdonschot, N
2012-09-21
Subject-specific musculo-skeletal models of the lower extremity are an important tool for investigating various biomechanical problems, for instance the results of surgery such as joint replacements and tendon transfers. The aim of this study was to assess the potential effects of errors in musculo-skeletal geometry on subject-specific model results. We performed an extensive sensitivity analysis to quantify the effect of perturbing the origin, insertion, and via points of each of the 56 musculo-tendon parts contained in the model. We used two metrics, namely a Local Sensitivity Index (LSI) and an Overall Sensitivity Index (OSI), to distinguish the effect of the perturbation on the predicted force produced by only the perturbed musculo-tendon part and by all the remaining musculo-tendon parts, respectively, during a simulated gait cycle. Results indicated that, for each musculo-tendon part, only two points show a significant sensitivity: its origin, or pseudo-origin, point and its insertion, or pseudo-insertion, point. The most sensitive points belong to those musculo-tendon parts that act as prime movers in the walking movement (insertion point of the Achilles Tendon: LSI=15.56%, OSI=7.17%; origin points of the Rectus Femoris: LSI=13.89%, OSI=2.44%) and as hip stabilizers (insertion points of the Gluteus Medius Anterior: LSI=17.92%, OSI=2.79%; insertion point of the Gluteus Minimus: LSI=21.71%, OSI=2.41%). The proposed priority list provides quantitative information to improve the predictive accuracy of subject-specific musculo-skeletal models.
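To make the two indices concrete, the sketch below computes illustrative local and overall sensitivity values from muscle-tendon force curves recorded before and after perturbing a single attachment point. The normalization and the function signature are assumptions made for illustration; the paper's exact definitions of LSI and OSI may differ.

```python
import numpy as np

def sensitivity_indices(f_nominal, f_perturbed, part_idx):
    """Illustrative Local/Overall Sensitivity Index computation.

    f_nominal, f_perturbed: arrays of shape (n_parts, n_samples)
    holding muscle-tendon forces over a gait cycle before and after
    perturbing one attachment point of part `part_idx`.
    """
    diff = np.abs(f_perturbed - f_nominal)
    scale = np.max(f_nominal, axis=1, keepdims=True) + 1e-9
    rel = diff / scale                        # relative force change per part
    lsi = 100.0 * rel[part_idx].mean()        # effect on the perturbed part
    others = np.delete(rel, part_idx, axis=0)
    osi = 100.0 * others.mean()               # effect on all remaining parts
    return lsi, osi
```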
The CEOS constellation for land surface imaging
Bailey, G.B.; Berger, Marsha; Jeanjean, H.; Gallo, K.P.
2007-01-01
A constellation of satellites that routinely and frequently images the Earth's land surface in consistently calibrated wavelengths from the visible through the microwave, and in spatial detail that ranges from sub-meter to hundreds of meters, would offer enormous potential benefits to society. A well-designed and effectively operated land surface imaging satellite constellation could have great positive impact not only on the quality of life for citizens of all nations, but also on mankind's very ability to sustain life as we know it on this planet long into the future. The primary objective of the Committee on Earth Observation Satellites (CEOS) Land Surface Imaging (LSI) Constellation is to define standards (or guidelines) that describe optimal future LSI Constellation capabilities, characteristics, and practices. Standards defined for an LSI Constellation will be based on a thorough understanding of user requirements, and they will address at least three fundamental areas of the systems comprising a Land Surface Imaging Constellation: the space segments, the ground segments, and relevant policies and plans. Studies conducted by the LSI Constellation Study Team also will address current and shorter-term problems and issues facing the land remote sensing community today, such as seeking ways to work more cooperatively in the operation of existing land surface imaging systems and helping to accomplish tangible benefits to society through application of land surface image data acquired by existing systems. 2007 LSI Constellation studies are designed to establish initial international agreements, develop preliminary standards for a mid-resolution land surface imaging constellation, and contribute data to a global forest assessment.
Accelerating a MPEG-4 video decoder through custom software/hardware co-design
NASA Astrophysics Data System (ADS)
Díaz, Jorge L.; Barreto, Dacil; García, Luz; Marrero, Gustavo; Carballo, Pedro P.; Núñez, Antonio
2007-05-01
In this paper we present a novel methodology to accelerate an MPEG-4 video decoder using software/hardware co-design for wireless DAB/DMB networks. Software support includes the services provided by the embedded kernel μC/OS-II and the application tasks mapped to software. Hardware support includes several custom co-processors and a communication architecture with bridges to the main system bus and with a dual-port SRAM. Synchronization among tasks is achieved at two levels, by a hardware protocol and by kernel-level scheduling services. Our reference application is an MPEG-4 video decoder composed of several software functions and written using a special C++ library named CASSE. Profiling and design-space exploration techniques were previously applied to the Advanced Simple Profile (ASP) MPEG-4 decoder to determine the best HW/SW partition developed here. This research is part of the ARTEMI project, and its main goal is the establishment of methodologies for the design of real-time complex digital systems using Programmable Logic Devices with embedded microprocessors as the target technology and the design of multimedia systems for broadcasting networks as the reference application.
International Space Station USOS Waste and Hygiene Compartment Development
NASA Technical Reports Server (NTRS)
Link, Dwight E., Jr.; Broyan, James Lee, Jr.; Gelmis, Karen; Philistine, Cynthia; Balistreri, Steven
2007-01-01
The International Space Station (ISS) currently provides human waste collection and hygiene facilities in the Russian Segment Service Module (SM) which supports a three person crew. Additional hardware is planned for the United States Operational Segment (USOS) to support expansion of the crew to six person capability. The additional hardware will be integrated in an ISS standard equipment rack structure that was planned to be installed in the Node 3 element; however, the ISS Program Office recently directed implementation of the rack, or Waste and Hygiene Compartment (WHC), into the U.S. Laboratory element to provide early operational capability. In this configuration, preserved urine from the WHC waste collection system can be processed by the Urine Processor Assembly (UPA) in either the U.S. Lab or Node 3 to recover water for crew consumption or oxygen production. The human waste collection hardware is derived from the Service Module system and is provided by RSC-Energia. This paper describes the concepts, design, and integration of the WHC waste collection hardware into the USOS including integration with U.S. Lab and Node 3 systems.
Fast multi-core based multimodal registration of 2D cross-sections and 3D datasets
2010-01-01
Background: Solving bioinformatics tasks often requires extensive computational power. Recent trends in processor architecture combine multiple cores into a single chip to improve overall performance. The Cell Broadband Engine (CBE), a heterogeneous multi-core processor, provides power-efficient and cost-effective high-performance computing. One application area is image analysis and visualisation, in particular registration of 2D cross-sections into 3D image datasets. Such techniques can be used to put different image modalities into spatial correspondence, for example, 2D images of histological cuts into morphological 3D frameworks. Results: We evaluate the CBE-driven PlayStation 3 as a high-performance, cost-effective computing platform by adapting a multimodal alignment procedure to several characteristic hardware properties. The optimisations are based on partitioning, vectorisation, branch-reducing and loop-unrolling techniques, with special attention to 32-bit multiplies and the limited local storage on the computing units. We show how a typical image analysis and visualisation problem, the multimodal registration of 2D cross-sections and 3D datasets, benefits from the multi-core based implementation of the alignment algorithm. We discuss several CBE-based optimisation methods and compare our results to standard solutions. More information and the source code are available from http://cbe.ipk-gatersleben.de. Conclusions: The results demonstrate that the CBE processor in a PlayStation 3 accelerates computationally intensive multimodal registration, which is of great importance in biological/medical image processing. The PlayStation 3 as a low-cost CBE-based platform offers an efficient alternative to conventional hardware for solving computational problems in image processing and bioinformatics. PMID:20064262
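As a flavor of the kind of optimisation involved, the toy sketch below replaces a per-pixel scoring loop with whole-array arithmetic for a slice-to-volume search. It uses a sum-of-squared-differences score rather than the mutual-information style measure a real multimodal method would need, and all names are illustrative rather than taken from the published code.

```python
import numpy as np

def best_slice_match(volume, section):
    """Toy slice-to-volume search: find the volume slice most similar
    to a 2D section using sum of squared differences (SSD).

    Vectorized in the spirit of the paper's optimisations: the scoring
    loop over pixels is replaced by whole-array arithmetic, and the
    per-pixel branches are eliminated.
    """
    diffs = volume - section[None, :, :]         # broadcast over slices
    ssd = np.einsum('zyx,zyx->z', diffs, diffs)  # one SSD per slice
    return int(np.argmin(ssd))

vol = np.random.rand(64, 128, 128).astype(np.float32)
sec = vol[17] + 0.01 * np.random.rand(128, 128).astype(np.float32)
print(best_slice_match(vol, sec))  # expected: 17
```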
System analysis of graphics processor architecture using virtual prototyping
NASA Astrophysics Data System (ADS)
Hancock, William R.; Groat, Jeff; Steeves, Todd; Spaanenburg, Henk; Shackleton, John
1995-06-01
Honeywell has been actively involved in the definition of the next generation of display processors for military and commercial cockpits. A major concern is how to achieve super graphics workstation performance in an avionics application. Most notable are requirements for low volume, low power, harsh environmental conditions, real-time performance and low cost. This paper describes the application of VHDL to the system analysis tasks associated with achieving these goals in a cost-effective manner. The paper will describe the top-level architecture identified to provide the graphical and video processing power needed to drive future high-resolution display devices and to generate more natural panoramic 3D formats. The major discussion, however, will be on the use of VHDL to model the processing elements and customized pipelines needed to realize the architecture and for doing the complex system tradeoff studies necessary to achieve a cost-effective implementation. New software tools have been developed to allow 'virtual' prototyping in the VHDL environment. This results in a hardware/software codesign using VHDL performance and functional models. This unique architectural tool allows simulation and tradeoffs within a standard and tightly integrated toolset, which eventually will be used to specify and design the entire system from the top-level requirements and system performance to the lowest-level individual ASICs. New processing elements, algorithms, and standard graphical inputs can be designed, tested and evaluated without costly hardware prototyping using the innovative 'virtual' prototyping techniques which are evolving on this project. In addition, virtual prototyping of the display processor does not bind the preliminary design to point solutions as a physical prototype will. When the development schedule is known, one can extrapolate processing element performance and design the system around the most current technology.
NASA Astrophysics Data System (ADS)
Olson, Richard F.
2013-05-01
Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real-world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity, emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, high-level parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.
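The core arithmetic being vectorized is simple: each point scatterer contributes a complex return whose phase encodes its two-way delay, and the scene response is the sum over scatterers. A minimal sketch is below; expressing the sum as one vector operation mirrors the register-to-register SIMD style the paper targets with AVX, though the parameter names and single-frequency simplification are assumptions, not AMRDEC's interface.

```python
import numpy as np

def scene_return(amplitudes, ranges, freq, c=3.0e8):
    """Sum complex returns from point scatterers at one radar frequency.

    Each scatterer contributes amplitude * exp(-j*2*pi*f*tau), where
    tau is the two-way propagation delay. The sum over scatterers is
    a single vectorized operation rather than a scalar loop.
    """
    tau = 2.0 * np.asarray(ranges) / c          # two-way delay per scatterer
    phases = -2j * np.pi * freq * tau
    return np.sum(np.asarray(amplitudes) * np.exp(phases))

amp = np.random.rand(10_000)
rng = 1000.0 + 50.0 * np.random.rand(10_000)    # ranges in meters
print(scene_return(amp, rng, 35e9))             # 35 GHz mmW example
```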
Technical Reliability Studies. EOS/ESD Technology Abstracts
1982-01-01
Index fragment; recoverable entries: RESISTANT BIPOLAR TRANSISTOR DESIGN AND ITS APPLICATIONS TO LINEAR INTEGRATED CIRCUITS; 16145 MODULE ELECTROSTATIC DISCHARGE SIMULATOR; 16476 STATIC DISCHARGE MODELING TECHNIQUES FOR EVALUATION OF INTEGRATED (FET) CIRCUIT DESTRUCTION; 13455 EVALUATION OF PLASTIC LSI CIRCUITS.
C-MOS bulk metal design handbook. [LSI standard cell circuits]
NASA Technical Reports Server (NTRS)
Edge, T. M.
1977-01-01
The LSI standard cell array technique was used in the fabrication of more than 20 CMOS custom arrays. This technique consists of a series of computer programs and design automation techniques, referred to as the Computer Aided Design And Test (CADAT) system, that automatically translates a partitioned logic diagram into a set of instructions for driving an automatic plotter, which generates precision mask artwork for complex LSI arrays of CMOS standard cells. The standard cell concept for producing LSI arrays begins with the design, layout, and validation of a group of custom circuits called standard cells. Once validated, these cells are given identification or pattern numbers and are permanently stored. To use one of these cells in a logic design, the user calls for the desired cell by pattern number. The Place, Route in Two Dimensions (PR2D) computer program is then used to automatically generate the metalization and/or tunnels to interconnect the standard cells into the required function. Data sheets that describe the function, artwork, and performance of each of the standard cells, the general procedure for implementation of logic in CMOS standard cells, and additional detailed design information are presented.
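A toy model of the pattern-number convention may help: the library maps each pattern number to validated cell data, and a netlist simply calls for cells by number before placement and routing. Everything below (cell names, numbers, fields) is invented for illustration; it is not CADAT's actual data format.

```python
# Toy standard-cell library addressed by pattern number, loosely
# mirroring the CADAT flow described above. All data are invented.
CELL_LIBRARY = {
    1100: {"name": "NAND2", "width_um": 56,  "pins": ("A", "B", "Y")},
    1130: {"name": "NOR2",  "width_um": 56,  "pins": ("A", "B", "Y")},
    1220: {"name": "DFF",   "width_um": 112, "pins": ("D", "CK", "Q")},
}

def instantiate(netlist):
    """Resolve (pattern_number, instance_name) pairs against the
    library, as a designer 'calls for the desired cell by pattern
    number'; interconnect generation would follow in PR2D."""
    placed, x = [], 0
    for pattern, inst in netlist:
        cell = CELL_LIBRARY[pattern]
        placed.append({"inst": inst, "cell": cell["name"], "x_um": x})
        x += cell["width_um"]          # abut cells along one row
    return placed

print(instantiate([(1100, "U1"), (1220, "FF1")]))
```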
NASA Astrophysics Data System (ADS)
Kobayashi, Takuma; Tagawa, Ayato; Noda, Toshihiko; Sasagawa, Kiyotaka; Tokuda, Takashi; Hatanaka, Yumiko; Tamura, Hideki; Ishikawa, Yasuyuki; Shiosaka, Sadao; Ohta, Jun
2010-11-01
The combination of optical imaging with voltage-sensitive dyes is a powerful tool for studying the spatiotemporal patterns of neural activity and understanding the neural networks of the brain. To visualize the potential status of multiple neurons simultaneously using a compact instrument with high density and a wide range, we present a novel measurement system using an implantable biomedical photonic LSI device with a red absorptive light filter for voltage-sensitive dye imaging (BpLSI-red). The BpLSI-red was developed for sensing fluorescence by the on-chip LSI, which was designed using complementary metal-oxide-semiconductor (CMOS) technology. A micro-electro-mechanical system (MEMS) microfabrication technique was used to postprocess the CMOS sensor chip; light-emitting diodes (LEDs) were integrated for illumination and to enable long-term cell culture. Using the device, we succeeded in visualizing the membrane potential of 2000-3000 cells and the process of depolarization of pheochromocytoma cells (PC12 cells) and mouse cerebral cortical neurons in primary culture with cellular resolution. Therefore, our measurement system enables the detection of multiple neural activities simultaneously.
Zhang, Zhen; Ma, Cheng; Zhu, Rong
2017-08-23
Artificial Neural Networks (ANNs), including Deep Neural Networks (DNNs), have become the state-of-the-art methods in machine learning and have achieved remarkable success in speech recognition, visual object recognition, and many other domains. There are several hardware platforms for developing accelerated implementations of ANN models. Since Field Programmable Gate Array (FPGA) architectures are flexible and can provide high performance per watt of power consumption, they have attracted the attention of scientists for a number of applications. In this paper, we propose an FPGA-based, granularity-variable neuromorphic processor (FBGVNP). The traits of FBGVNP can be summarized as granularity variability, scalability, integrated computing, and addressing ability: first, the number of neurons is variable rather than constant in one core; second, the multi-core network scale can be extended in various forms; third, the neuron addressing and computing processes are executed simultaneously. These make the processor more flexible and better suited for different applications. Moreover, a neural network-based controller is mapped to FBGVNP and applied in a multi-input multi-output (MIMO), real-time, temperature-sensing and control system. Experiments validate the effectiveness of the neuromorphic processor. The FBGVNP provides a new scheme for building ANNs, which is flexible, highly energy-efficient, and applicable in many areas.
A survey on the design of multiprocessing systems for artificial intelligence applications
NASA Technical Reports Server (NTRS)
Wah, Benjamin W.; Li, Guo Jie
1989-01-01
Some issues in designing computers for artificial intelligence (AI) processing are discussed. These issues are divided into three levels: the representation level, the control level, and the processor level. The representation level deals with the knowledge and methods used to solve the problem and the means to represent it. The control level is concerned with the detection of dependencies and parallelism in the algorithmic and program representations of the problem, and with the synchronization and scheduling of concurrent tasks. The processor level addresses the hardware and architectural components needed to evaluate the algorithmic and program representations. Solutions for the problems of each level are illustrated by a number of representative systems. Design decisions in existing projects on AI computers are classified into top-down, bottom-up, and middle-out approaches.
Periodic Application of Concurrent Error Detection in Processor Array Architectures. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Chen, Paul Peichuan
1993-01-01
Processor arrays can provide an attractive architecture for some applications. Featuring modularity, regular interconnection, and high parallelism, such arrays are well-suited for VLSI/WSI implementations and for applications with high computational requirements, such as real-time signal processing. Preserving the integrity of results can be of paramount importance for certain applications. In these cases, fault tolerance should be used to ensure reliable delivery of a system's service. One aspect of fault tolerance is the detection of errors caused by faults. Concurrent error detection (CED) techniques offer the advantage that transient and intermittent faults may be detected with greater probability than with off-line diagnostic tests. Applying time-redundant CED techniques can reduce hardware redundancy costs. However, most time-redundant CED techniques degrade a system's performance.
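A minimal sketch of the idea follows: time redundancy means recomputing on the same hardware and comparing, and periodic application (the subject of the thesis) amortizes the slowdown by checking only some invocations. The code is illustrative, not the dissertation's scheme.

```python
def ced_execute(op, args, check=True):
    """Time-redundant concurrent error detection: run the computation
    twice on the same hardware and compare the results. Periodic
    application amortizes the slowdown by asserting `check` on only
    a fraction of invocations.
    """
    first = op(*args)
    if check:
        second = op(*args)        # recomputation in time, not in space
        if first != second:
            raise RuntimeError("transient fault detected")
    return first

# Periodic checking: verify every 4th operation only.
for i in range(12):
    ced_execute(lambda a, b: a * b, (i, i + 1), check=(i % 4 == 0))
```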
Analysis of the packet formation process in packet-switched networks
NASA Astrophysics Data System (ADS)
Meditch, J. S.
Two new queueing system models for the packet formation process in packet-switched telecommunication networks are developed, and their applications in process stability, performance analysis, and optimization studies are illustrated. The first, an M/M/1 queueing system characterization of the process, is a highly aggregated model useful for preliminary studies. The second, a marked extension of an earlier M/G/1 model, permits one to investigate stability, performance characteristics, and design of the packet formation process in terms of the details of processor architecture and of hardware and software implementations, with processor structure and as many parameters as desired treated as variables. The two new models, together with the earlier M/G/1 characterization, span the spectrum of modeling complexity for the packet formation process from basic to advanced.
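For orientation, the standard M/M/1 relations that such an aggregated model rests on are reproduced below, with λ the packet arrival rate and μ the service rate; these are textbook results, not formulas taken from the paper:

```latex
% Standard M/M/1 background relations:
\rho = \frac{\lambda}{\mu} < 1
  \quad\text{(utilization; stability condition)}
L = \frac{\rho}{1-\rho}
  \quad\text{(mean number of packets in the system)}
W = \frac{L}{\lambda} = \frac{1}{\mu-\lambda}
  \quad\text{(mean time in the system, by Little's law)}
```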
Vectorized program architectures for supercomputer-aided circuit design
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rizzoli, V.; Ferlito, M.; Neri, A.
1986-01-01
Vector processors (supercomputers) can be effectively employed in MIC or MMIC applications to solve problems of large numerical size, such as broad-band nonlinear design or statistical design (yield optimization). In order to fully exploit the capabilities of vector hardware, any program architecture must be structured accordingly. This paper presents a possible approach to the "semantic" vectorization of microwave circuit design software. Speed-up factors of the order of 50 can be obtained on a typical vector processor (Cray X-MP), with respect to the most powerful scalar computers (CDC 7600), with cost reductions of more than one order of magnitude. This could broaden the horizon of microwave CAD techniques to include problems that are practically out of the reach of conventional systems.
NASA Astrophysics Data System (ADS)
Iwasaki, Y.; CP-PACS Collaboration
1998-01-01
The CP-PACS project is a five-year plan, formally started in April 1992 and completed in March 1997, to develop a massively parallel computer for carrying out research in computational physics with primary emphasis on lattice QCD. The initial version of the CP-PACS computer, with a theoretical peak speed of 307 GFLOPS on 1024 processors, was completed in March 1996. The final version, with a peak speed of 614 GFLOPS on 2048 processors, was completed in September 1996 and has been in full operation since October 1996. We describe the architecture, the final specification, the hardware implementation, and the software of the CP-PACS computer. The CP-PACS has been used for hadron spectroscopy production runs since July 1996. The performance for lattice QCD applications and the LINPACK benchmark are given.
A complexity-scalable software-based MPEG-2 video encoder.
Chen, Guo-bin; Lu, Xin-ning; Wang, Xing-guo; Liu, Ji-lin
2004-05-01
With the development of general-purpose processors (GPP) and video signal processing algorithms, it is possible to implement a software-based real-time video encoder on a GPP, and its low cost and easy upgrade path attract developers' interest in moving video encoding from specialized hardware to more flexible software. In this paper, the encoding structure is first set up to support complexity scalability; then high-performance algorithms are applied to the key time-consuming modules in the coding process; finally, at the programming level, processor characteristics are considered to improve data access efficiency and processing parallelism. Other programming methods, such as lookup tables, are adopted to reduce the computational complexity. Simulation results showed that these ideas could not only improve the global performance of video coding, but also provide great flexibility in complexity regulation.
Custom LSI plus hybrid equals cost effectiveness
NASA Astrophysics Data System (ADS)
Friedman, S. N.
The ability to combine various technologies, such as bipolar linear and CMOS digital, makes it feasible to create systems with a tailored performance not available on a single monolithic circuit. The custom LSI 'BLOCK', especially if it is universal in nature, is proving to be a cost-effective way for the developer to improve his product. The custom LSI represents a low-priced part in contrast to the discrete components it replaces. In addition, the hybrid assembly can realize savings in labor as a result of the reduced parts handling and associated wire bonds. The possibility of using automated system manufacturing techniques leads to greater reliability, as the human factor is partly eliminated. Attention is given to reliability predictions, cost considerations, and a product comparison study.
The Adaptive Effects Of Virtual Interfaces: Vestibulo-Ocular Reflex and Simulator Sickness.
1998-08-07
rearrangement: a pattern of stimulation differing from that existing as a result of normal interactions with the real world. Stimulus rearrangements can... is immersive and interactive. virtual interface: a system of transducers, signal processors, computer hardware and software that create an... interactive medium through which: 1) information is transmitted to the senses in the form of two- and three-dimensional virtual images and 2) psychomotor
Some Issues in Programming Multi-Mini-Processors
1975-01-01
Hardware and software are to be combined optimally to perform that specialized task. This in essence is the strategy followed by the BBN group in... large memory is directly addressable. MIXED SOLUTIONS The most promising approach appears to involve mixing several of the previous solutions... mini- or micro-computers. Possibly the problem will be solved by avoiding it. Some new minis are appearing on the market now with large physical
A Heterogeneous Multiprocessor Graphics System Using Processor-Enhanced Memories
1989-02-01
frames per second, font generation directly from conic spline descriptions, and rapid calculation of radiosity form factors. The hardware consists of... generality for rendering curved surfaces, volume data, objects described with Constructive Solid Geometry, for rendering scenes using the radiosity... and for computing a spherical radiosity lighting model (see Section 7.6). [Figure residue: Custom Memory Chips, 208 bits x 128 pixels, Renderer Board.]
NASA Technical Reports Server (NTRS)
Wilmot, Jonathan
2005-01-01
The contents include the following: High availability. Hardware is in a harsh environment. Flight processors vary widely due to power and weight constraints. Software must be remotely modifiable and still operate while changes are being made. Many custom one-of-a-kind interfaces for one-of-a-kind missions. Sustaining engineering. The price of failure is high: tens to hundreds of millions of dollars.
Design of the Bus Interface Unit for the Distributed Processor/Memory System.
1976-12-01
microroutine flowchart developed. Once this had been done, a high-speed, flexible microprocessor that would be adaptable to a hardware... routine) was translated into microcode and provide the mnemonic code and flowchart. Chapter V summarizes and discusses actual system construction... Fig. 11. This diagram shows that the BIU is driven by interrupt stimuli which select the beginning address of the appropriate microroutine rather
Security Primitives for Reconfigurable Hardware-Based Systems
2010-05-01
work, we propose security primitives using ideas centered around the notion of "moats and drawbridges." The primitives encompass four design properties... fingerprint reader), the other to control the ethernet IP core—and an AES encryption engine used by both of the processor cores. These cores are all implemented
Development of an Autonomous Navigation Technology Test Vehicle
2004-08-01
as an independent thread on processors using the Linux operating system. The computer hardware selected for the nodes that host the MRS threads...communications system design. Linux was chosen as the operating system for all of the single board computers used on the Mule. Linux was specifically...used for system analysis and development. The simple realization of multi-thread processing and inter-process communications in Linux made it a
LIBS data analysis using a predictor-corrector based digital signal processor algorithm
NASA Astrophysics Data System (ADS)
Sanders, Alex; Griffin, Steven T.; Robinson, Aaron
2012-06-01
There are many accepted sensor technologies for generating spectra for material classification. Once the spectra are generated, communication bandwidth limitations favor local material classification, with its attendant reduction in data transfer rates and power consumption. Transferring sensor technologies such as Cavity Ring-Down Spectroscopy (CRDS) and Laser Induced Breakdown Spectroscopy (LIBS) requires effective material classifiers. A result of recent efforts has been an emphasis on Partial Least Squares Discriminant Analysis (PLS-DA) and Principal Component Analysis (PCA). Implementation of these via general-purpose computers is difficult in small portable sensor configurations. This paper addresses the creation of a low-mass, low-power, robust hardware spectra classifier for a limited set of predetermined materials in an atmospheric matrix. Crucial to this is the incorporation of PCA or PLS-DA classifiers into a predictor-corrector style implementation. The system configuration guarantees rapid convergence. Software running on multi-core Digital Signal Processors (DSPs) simulates a streamlined plasma physics model estimator, reducing Analog-to-Digital Converter (ADC) power requirements. This paper presents the results of a predictor-corrector model implemented on a low-power multi-core DSP to perform substance classification. This configuration emphasizes the hardware system and software design via a predictor-corrector model that simultaneously decreases the sample rate while performing the classification.
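The general shape of such a classifier can be sketched briefly: project the measured spectrum onto stored principal components (the predictor), then iteratively refine the estimate against known class signatures (the corrector). The update rule, names, and parameters below are illustrative assumptions, not the authors' plasma-model estimator.

```python
import numpy as np

def classify_spectrum(spectrum, components, mean, class_centroids,
                      steps=10, gain=0.5):
    """Sketch of a PCA-based predictor-corrector classifier.

    components: (k, n_bins) principal axes from training spectra
    class_centroids: (n_classes, k) mean PCA scores per material
    Predictor: project the new spectrum into PCA space.
    Corrector: pull the estimate toward the nearest class centroid,
    a simple stand-in for a physics-model-based correction step.
    """
    z = components @ (spectrum - mean)            # predictor step
    for _ in range(steps):
        d = np.linalg.norm(class_centroids - z, axis=1)
        nearest = class_centroids[np.argmin(d)]
        z = z + gain * (nearest - z)              # corrector step
    d = np.linalg.norm(class_centroids - z, axis=1)
    return int(np.argmin(d))                      # class index
```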
Progress Towards a Rad-Hydro Code for Modern Computing Architectures LA-UR-10-02825
NASA Astrophysics Data System (ADS)
Wohlbier, J. G.; Lowrie, R. B.; Bergen, B.; Calef, M.
2010-11-01
We are entering an era of high performance computing where data movement is the overwhelming bottleneck to scalable performance, as opposed to the speed of floating-point operations per processor. All multi-core hardware paradigms, whether heterogeneous or homogeneous, be it the Cell processor, GPGPU, or multi-core x86, share this common trait. In multi-physics applications such as inertial confinement fusion or astrophysics, one may be solving multi-material hydrodynamics with tabular equation of state data lookups, radiation transport, nuclear reactions, and charged particle transport in a single time cycle. The algorithms are intensely data dependent, e.g., EOS, opacity, nuclear data, and multi-core hardware memory restrictions are forcing code developers to rethink code and algorithm design. For the past two years LANL has been funding a small effort referred to as Multi-Physics on Multi-Core to explore ideas for code design as pertaining to inertial confinement fusion and astrophysics applications. The near term goals of this project are to have a multi-material radiation hydrodynamics capability, with tabular equation of state lookups, on cartesian and curvilinear block structured meshes. In the longer term we plan to add fully implicit multi-group radiation diffusion and material heat conduction, and block structured AMR. We will report on our progress to date.
Design and implementation of a random neural network routing engine.
Kocak, T; Seeber, J; Terzioglu, H
2003-01-01
The random neural network (RNN) is an analytically tractable spiked neural network model that has been implemented in software for a wide range of applications for over a decade. This paper presents a hardware implementation of the RNN model. Recently, the cognitive packet network (CPN) has been proposed as an alternative packet network architecture in which there is no routing table; instead, RNN-based reinforcement learning is used to route packets. In particular, we describe implementation details for the RNN-based routing engine of a CPN network processor chip: the smart packet processor (SPP). The SPP is a dual-port device that stores, modifies, and interprets the defining characteristics of multiple RNN models. In addition to hardware design improvements over the software implementation, such as the dual-access memory, output calculation step, and reduced output calculation module, this paper introduces a major modification to the reinforcement learning algorithm used in the original CPN specification such that the number of weight terms is reduced from 2n² to 2n. This not only yields significant memory savings, but also simplifies the calculations for the steady-state probabilities (the neuron outputs in the RNN). Simulations have been conducted to confirm proper functionality for the isolated SPP design as well as for multiple SPPs in a networked environment.
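For context, the steady-state probabilities in question are the fixed point of the Gelenbe-style RNN equations, where each neuron's output q_i is its excitation probability. The sketch below iterates those equations directly with the full 2n² weight matrices; the paper's reduced 2n-term scheme is not reproduced here, and the naming is illustrative.

```python
import numpy as np

def rnn_steady_state(W_plus, W_minus, Lambda, lam, r, iters=200):
    """Fixed-point iteration for RNN neuron outputs.

    q[i] = lam_plus[i] / (r[i] + lam_minus[i]), with
    lam_plus  = Lambda + q @ W_plus   (excitatory spike arrivals)
    lam_minus = lam    + q @ W_minus  (inhibitory spike arrivals)
    W_plus[j, i] = r[j] * p_plus[j, i] are the excitatory weights;
    Lambda/lam are exogenous arrival rates, r the firing rates.
    """
    q = np.zeros(len(r))
    for _ in range(iters):
        lam_plus = Lambda + q @ W_plus
        lam_minus = lam + q @ W_minus
        q = np.clip(lam_plus / (r + lam_minus), 0.0, 1.0)
    return q
```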
Szatmary, J; Hadani, I; Julesz, B
1997-01-01
Rogers and Graham (1979) developed a system to show that head-movement-contingent motion parallax produces monocular depth perception in random dot patterns. Their display system comprised an oscilloscope driven by function generators or a special graphics board that triggered the X and Y deflection of the raster scan signal. Replication of this system required costly hardware that is no longer on the market. In this paper the Rogers-Graham method is reproduced with an Intel processor based IBM PC compatible machine with no additional hardware cost. An adapted joystick sampled through the standard game-port can serve as a provisional head-movement sensor. Monitor resolution for displaying motion is effectively enhanced 16 times by the use of anti-aliasing, enabling the display of thousands of random dots in real-time with a refresh rate of 60 Hz or above. A color monitor enables the use of the anaglyph method, thus combining stereoscopic and monocular parallax on a single display without the loss of speed. The power of this system is demonstrated by a psychophysical measurement in which subjects nulled head-movement-contingent illusory parallax, evoked by a static stereogram, with real parallax. The amount of real parallax required to null the illusory stereoscopic parallax monotonically increased with disparity.
On program restructuring, scheduling, and communication for parallel processor systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Polychronopoulos, Constantine D.
1986-08-01
This dissertation discusses several software and hardware aspects of program execution on large-scale, high-performance parallel processor systems. The issues covered are program restructuring, partitioning, scheduling and interprocessor communication, synchronization, and hardware design issues of specialized units. All this work was performed focusing on a single goal: to maximize program speedup, or equivalently, to minimize parallel execution time. Parafrase, a Fortran restructuring compiler, was used to transform programs into a parallel form and conduct experiments. Two new program restructuring techniques are presented, loop coalescing and subscript blocking. Compile-time and run-time scheduling schemes are covered extensively. Depending on the program construct, these algorithms generate optimal or near-optimal schedules. For the case of arbitrarily nested hybrid loops, two optimal scheduling algorithms for dynamic and static scheduling are presented. Simulation results are given for a new dynamic scheduling algorithm. The performance of this algorithm is compared to that of self-scheduling. Techniques for program partitioning and minimization of interprocessor communication for idealized program models and for real Fortran programs are also discussed. The close relationship between scheduling, interprocessor communication, and synchronization becomes apparent at several points in this work. Finally, the impact of various types of overhead on program speedup and experimental results are presented.
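Loop coalescing, one of the two new transforms, flattens a loop nest into a single loop so that a scheduler can distribute one iteration space; the original indices are recovered by integer division and modulo. Parafrase performed this on Fortran loops; the snippet below is only a sketch of the transformation.

```python
N, M = 6, 4

# Original nest:
for i in range(N):
    for j in range(M):
        pass  # body(i, j)

# Coalesced form: a single iteration space of N*M points that a
# run-time scheduler can hand out in chunks.
for k in range(N * M):
    i, j = divmod(k, M)  # recover the original indices
    pass  # body(i, j)
```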
NASA Astrophysics Data System (ADS)
Peschmann, K. R.; Parker, D. L.; Smith, V.
1982-11-01
A large number of different CT scanner models have been developed in the past ten years, meeting increasing standards of performance. From the beginning they have remained comparatively expensive pieces of equipment. This is due not only to their technical complexity but also to the difficulties involved in assessing "true" specifications (avoiding "overdesign"). Our aim has been to provide, for Radiation Therapy Treatment Planning, a low-cost CT scanner system featuring large freedom in patient positioning. We have taken advantage of the tremendously increased amount of knowledge and experience in the technical area of CT. By way of extensive computer simulations we gained confidence that an inexpensive C-arm simulator gantry and a simple one-phase, two-pulse generator in connection with a standard x-ray tube could be used without sacrificing image quality. These components have been complemented by a commercial high-precision shaft encoder, a simple and effective fan beam collimator, a high-precision, high-efficiency luminescence crystal-silicon photodiode detector with 256 channels, low-noise electronic preamplifier and sampling filter stages, a simplified data acquisition system furnished by Toshiba/Analogic, and an LSI 11/23 microcomputer plus data storage disk, as well as various smaller interfaces linking the electrical components. The quality of CT scan pictures of phantoms, obtained by the end of last year, confirmed that this simple approach works well. As a next step we intend to upgrade this system with an array processor in order to shorten reconstruction time to one minute per slice. We estimate that the system including this processor could be manufactured for a selling price of $210,000.
Computer simulation of a space SAR using a range-sequential processor for soil moisture mapping
NASA Technical Reports Server (NTRS)
Fujita, M.; Ulaby, F. (Principal Investigator)
1982-01-01
The ability of a spaceborne synthetic aperture radar (SAR) to detect soil moisture was evaluated by means of a computer simulation technique. The computer simulation package includes coherent processing of the SAR data using a range-sequential processor, which can be set up through hardware implementations, thereby reducing the amount of telemetry involved. With such a processing approach, it is possible to monitor the earth's surface on a continuous basis, since data storage requirements can be easily met through the use of currently available technology. The development of the simulation package is described, followed by an examination of the application of the technique to actual environments. The results indicate that in estimating soil moisture content with a four-look processor, the difference between the assumed and estimated values of soil moisture is within ±20% of field capacity for 62% of the pixels for agricultural terrain and for 53% of the pixels for hilly terrain. The estimation accuracy for soil moisture may be improved by reducing the effect of fading through non-coherent averaging.
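Non-coherent (multilook) averaging is straightforward to sketch: adjacent single-look intensity samples are averaged, trading resolution for a reduction in fading variance of roughly the number of looks. The snippet below illustrates the mechanism behind the four-look processor; the axis choice and function name are assumptions for illustration.

```python
import numpy as np

def multilook(intensity, looks=4):
    """Non-coherent (multilook) averaging to suppress speckle fading.

    intensity: (rows, cols) single-look SAR intensity image. Averaging
    `looks` adjacent samples along one axis reduces the fading
    variance by roughly 1/looks at the cost of resolution.
    """
    rows = (intensity.shape[0] // looks) * looks
    trimmed = intensity[:rows]
    return trimmed.reshape(-1, looks, intensity.shape[1]).mean(axis=1)

img = np.random.rayleigh(size=(512, 256)) ** 2  # toy speckled intensity
print(multilook(img).shape)                     # (128, 256) with 4 looks
```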
International Space Station (ISS)
2001-02-01
The Marshall Space Flight Center (MSFC) is responsible for designing and building the life support systems that will provide the crew of the International Space Station (ISS) a comfortable environment in which to live and work. Scientists and engineers at the MSFC are working together to provide the ISS with systems that are safe, efficient, and cost-effective. These compact and powerful systems are collectively called the Environmental Control and Life Support Systems, or simply, ECLSS. This photograph shows the fifth-generation Urine Processor Development Hardware. The Urine Processor Assembly (UPA) is a part of the Water Recovery System (WRS) on the ISS. It uses a phase-change process called vapor compression distillation to remove contaminants from urine. The UPA accepts and processes pretreated crewmember urine to allow it to be processed along with other wastewaters in the Water Processor Assembly (WPA). The WPA removes free gas, organic, and nonorganic constituents before the water goes through a series of multifiltration beds for further purification. Product water quality is monitored primarily through conductivity measurements. Unacceptable water is sent back through the WPA for reprocessing. Clean water is sent to a storage tank.
Architectural Techniques For Managing Non-volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
As chip power dissipation becomes a critical challenge in scaling processor performance, computer architects are forced to fundamentally rethink the design of modern processors, and hence the chip-design industry is now at a major inflection point in its hardware roadmap. The high leakage power and low density of SRAM pose serious obstacles to its use for designing large on-chip caches, and for this reason researchers are exploring non-volatile memory (NVM) devices, such as spin torque transfer RAM, phase change RAM, and resistive RAM. However, since NVMs are not strictly superior to SRAM, effective architectural techniques are required for making them a universal memory solution. This book discusses techniques for designing processor caches using NVM devices. It presents algorithms and architectures for improving their energy efficiency, performance, and lifetime. It also provides both qualitative and quantitative evaluation to help the reader gain insights and motivate them to explore further. This book will be highly useful for beginners as well as veterans in computer architecture, chip designers, product managers, and technical marketing professionals.
Multiphase complete exchange on Paragon, SP2 and CS-2
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.
1995-01-01
The overhead of interprocessor communication is a major factor in limiting the performance of parallel computer systems. The complete exchange is the severest communication pattern in that it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. On hypercubes, multiphase complete exchange has been developed and shown to provide optimal performance over varying message sizes. Most commercial multicomputer systems do not have a hypercube interconnect. However, they use special purpose hardware and dedicated communication processors to achieve very high performance communication and can be made to emulate the hypercube quite well. Multiphase complete exchange has been implemented on three contemporary parallel architectures: the Intel Paragon, IBM SP2 and Meiko CS-2. The essential features of these machines are described and their basic interprocessor communication overheads are discussed. The performance of multiphase complete exchange is evaluated on each machine. It is shown that the theoretical ideas developed for hypercubes are also applicable in practice to these machines and that multiphase complete exchange can lead to major savings in execution time over traditional solutions.
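The hypercube algorithm that multiphase complete exchange generalizes can be sketched compactly: in each of log2(p) phases every processor pairs with the node across one hypercube dimension and forwards all messages whose destination differs in that address bit. The simulation below uses Python lists in place of real message passing and is not Bokhari's implementation.

```python
import math

def standard_exchange(p):
    """Simulate the log2(p)-phase hypercube exchange underlying
    multiphase complete exchange. Messages are (src, dst) tags routed
    one address bit per phase."""
    buf = {node: [(node, dst) for dst in range(p)] for node in range(p)}
    for d in range(int(math.log2(p))):
        bit = 1 << d
        for node in range(p):
            partner = node ^ bit
            if node < partner:  # visit each pair once per phase
                up = [m for m in buf[node] if m[1] & bit]
                down = [m for m in buf[partner] if not m[1] & bit]
                buf[node] = [m for m in buf[node] if not m[1] & bit] + down
                buf[partner] = [m for m in buf[partner] if m[1] & bit] + up
    return buf

final = standard_exchange(8)
# After log2(p) phases every message sits at its destination node.
assert all(dst == node for node, msgs in final.items() for _, dst in msgs)
```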
Performances of multiprocessor multidisk architectures for continuous media storage
NASA Astrophysics Data System (ADS)
Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.
1996-03-01
Multimedia interfaces increase the need for large image databases capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes, through bottleneck performance evaluation and simulation, the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.
Design of infrasound-detection system via adaptive LMSTDE algorithm
NASA Technical Reports Server (NTRS)
Khalaf, C. S.; Stoughton, J. W.
1984-01-01
A proposed solution to an aviation safety problem is based on passive detection of turbulent weather phenomena through their infrasonic emission. This thesis describes a system design that is adequate for detection and bearing evaluation of infrasounds. An array of four sensors, with the appropriate hardware, is used for the detection part. Bearing evaluation is based on estimates of time delays between sensor outputs. The generalized cross correlation (GCC), as the conventional time-delay estimation (TDE) method, is first reviewed. An adaptive TDE approach, using the least mean square (LMS) algorithm, is then discussed. A comparison between the two techniques is made and the advantages of the adaptive approach are listed. The behavior of the GCC, as a Roth processor, is examined for the anticipated signals. It is shown that the Roth processor has the desired effect of sharpening the peak of the correlation function. It is also shown that the LMSTDE technique is an equivalent implementation of the Roth processor in the time domain. An LMSTDE lead-lag model, with a variable stability coefficient and a convergence criterion, is designed.
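As a concrete rendering of the frequency-domain form, the sketch below computes the Roth-weighted generalized cross correlation and picks the peak lag. Spectral averaging and windowing are omitted, and the sign convention simply follows the input ordering; this is an illustration of the processor, not the thesis code.

```python
import numpy as np

def roth_tde(x1, x2, fs):
    """Generalized cross-correlation with Roth weighting:
    R(tau) = IFFT( X1 . conj(X2) / |X1|^2 ),
    which whitens by the reference spectrum and sharpens the
    correlation peak. Returns the peak lag in seconds.
    """
    n = 2 * max(len(x1), len(x2))
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cc = np.fft.irfft(X1 * np.conj(X2) / (np.abs(X1) ** 2 + 1e-12), n)
    cc = np.roll(cc, n // 2)                    # center zero lag
    return (np.argmax(np.abs(cc)) - n // 2) / fs
```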
Transputer parallel processing at NASA Lewis Research Center
NASA Technical Reports Server (NTRS)
Ellis, Graham K.
1989-01-01
The transputer parallel processing lab at NASA Lewis Research Center (LeRC) consists of 69 processors (transputers) that can be connected into various networks for use in general purpose concurrent processing applications. The main goal of the lab is to develop concurrent scientific and engineering application programs that will take advantage of the computational speed increases available on a parallel processor over the traditional sequential processor. Current research involves the development of basic programming tools. These tools will help standardize program interfaces to specific hardware by providing a set of common libraries for applications programmers. The thrust of the current effort is in developing a set of tools for graphics rendering/animation. The applications programmer currently has two options for on-screen plotting. One option can be used for static graphics displays and the other can be used for animated motion. The option for static display involves the use of 2-D graphics primitives that can be called from within an application program. These routines perform the standard 2-D geometric graphics operations in real-coordinate space as well as allowing multiple windows on a single screen.
SETI prototype system for NASA's Sky Survey microwave observing project - A progress report
NASA Technical Reports Server (NTRS)
Klein, M. J.; Gulkis, S.; Wilck, H. C.
1990-01-01
Two complementary search strategies, a Targeted Search and a Sky Survey, are part of NASA's SETI microwave observing project scheduled to begin in October of 1992. The current progress in the development of hardware and software elements of the JPL Sky Survey data processing system are presented. While the Targeted Search stresses sensitivity allowing the detection of either continuous or pulsed signals over the 1-3 GHz frequency range, the Sky Survey gives up sensitivity to survey the 99 percent of the sky that is not covered by the Targeted Search. The Sky Survey spans a larger frequency range from 1-10 GHz. The two searches will deploy special-purpose digital signal processing equipment designed and built to automate the observing and data processing activities. A two-million channel digital wideband spectrum analyzer and a signal processor system will serve as a prototype for the SETI Sky Survey processor. The design will permit future expansion to meet the SETI requirement that the processor concurrently search for left and right circularly polarized signals.
Hardware accelerator design for tracking in smart camera
NASA Astrophysics Data System (ADS)
Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil
2011-10-01
Smart cameras are important components in video analysis. For video analysis, smart cameras need to detect interesting moving objects, track such objects from frame to frame, and perform analysis of object tracks in real time. The use of real-time tracking is therefore prominent in smart cameras. A software implementation of a tracking algorithm on a general-purpose processor (such as a PowerPC) achieves a low frame rate, far from real-time requirements. This paper presents a SIMD-approach-based hardware accelerator designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on a Xilinx XUP Virtex-II Pro FPGA. The resulting frame rate is 30 frames per second for 250x200-resolution grayscale video.
Instrumentation & Data Acquisition System (DAS) Engineer
NASA Technical Reports Server (NTRS)
Jackson, Markus Deon
2015-01-01
The primary job of an Instrumentation and Data Acquisition System (DAS) Engineer is to properly measure the physical phenomena of hardware using appropriate instrumentation and DAS equipment designed to record data during a specified test of the hardware. A DAS system includes a CPU or processor; a data storage device, such as a hard drive; a data communication bus, such as Universal Serial Bus; and software to control DAS system processes like calibration, recording of data, and processing of data. It also includes signal conditioning amplifiers and certain sensors for specified measurements. My internship responsibilities have included testing and adjusting Pacific Instruments Model 9355 signal conditioning amplifiers, writing and performing checkout procedures, and writing and performing calibration procedures while learning the basics of instrumentation.
OpenCL Implementation of NeuroIsing
NASA Astrophysics Data System (ADS)
Zapart, C. A.
Recent advances in graphics card hardware, combined with the introduction of the OpenCL standard, promise to accelerate numerical simulations across diverse scientific disciplines. One such field benefiting from new hardware/software paradigms is econophysics. The paper describes an OpenCL implementation of a selected econophysics model: NeuroIsing, which has been designed to execute in parallel on a vendor-independent graphics card. Originally introduced in the paper [C. A. Zapart, "Econophysics in Financial Time Series Prediction", PhD thesis, Graduate University for Advanced Studies, Japan (2009)], it was at first implemented on a CELL processor running inside a SONY PS3 games console. The NeuroIsing framework can be applied to predicting and trading foreign exchange as well as stock market index futures.
Enhancing instruction scheduling with a block-structured ISA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Melvin, S.; Patt, Y.
It is now generally recognized that not enough parallelism exists within the small basic blocks of most general purpose programs to satisfy high performance processors. Thus, a wide variety of techniques have been developed to exploit instruction level parallelism across basic block boundaries. In this paper we discuss some previous techniques along with their hardware and software requirements. Then we propose a new paradigm for an instruction set architecture (ISA): block-structuring. This new paradigm is presented, its hardware and software requirements are discussed, and the results from a simulation study are presented. We show that a block-structured ISA utilizes both dynamic and compile-time mechanisms for exploiting instruction level parallelism and has significant performance advantages over a conventional ISA.
Pulse-coupled neural network implementation in FPGA
NASA Astrophysics Data System (ADS)
Waldemark, Joakim T. A.; Lindblad, Thomas; Lindsey, Clark S.; Waldemark, Karina E.; Oberg, Johnny; Millberg, Mikael
1998-03-01
Pulse Coupled Neural Networks (PCNN) are biologically inspired neural networks, mainly based on studies of the visual cortex of small mammals. The PCNN is very well suited as a pre-processor for image processing, particularly in connection with object isolation, edge detection and segmentation. Several implementations of PCNN on von Neumann computers, as well as on special parallel processing hardware devices (e.g. SIMD), exist. However, these implementations are not as flexible as required for many applications. Here we present an implementation in Field Programmable Gate Arrays (FPGA) together with a performance analysis. The FPGA hardware implementation may be considered a platform for further, extended implementations and can easily be expanded into various applications. The latter may include advanced on-line image analysis with close to real-time performance.
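For readers unfamiliar with the model, the sketch below gives one iteration of a simplified PCNN in C, following the standard feeding/linking/dynamic-threshold formulation; it illustrates the per-pixel recurrence only and is not the paper's FPGA design (parameter names are illustrative).

    #include <math.h>

    /* One simplified PCNN iteration over an image S[w*h].
       F: feeding field, Lk: linking field, T: dynamic threshold,
       Yprev/Y: last and current binary pulse images. */
    void pcnn_step(const float *S, float *F, float *Lk, float *T,
                   const unsigned char *Yprev, unsigned char *Y,
                   int w, int h, float aF, float aL, float aT,
                   float beta, float VT)
    {
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                float link = 0.0f;              /* 3x3 sum of previous pulses */
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy >= 0 && yy < h && xx >= 0 && xx < w)
                            link += Yprev[yy * w + xx];
                    }
                int i = y * w + x;
                F[i]  = expf(-aF) * F[i] + S[i];          /* feeding decay + input */
                Lk[i] = expf(-aL) * Lk[i] + link;         /* linking decay + pulses */
                float U = F[i] * (1.0f + beta * Lk[i]);   /* internal activity */
                T[i]  = expf(-aT) * T[i] + VT * Yprev[i]; /* threshold dynamics */
                Y[i]  = (U > T[i]) ? 1 : 0;               /* pulse output */
            }
    }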
Method for star identification using neural networks
NASA Astrophysics Data System (ADS)
Lindsey, Clark S.; Lindblad, Thomas; Eide, Age J.
1997-04-01
Identification of star constellations with an onboard star tracker provides the highest precision of all attitude determination techniques for spacecraft. A method for identification of star constellations inspired by neural network (NNW) techniques is presented. It compares feature vectors derived from histograms of distances to multiple stars around the unknown star. The NNW method appears most robust with respect to position noise and would require a smaller database than conventional methods, especially for small fields of view. The neural network method is quite slow when performed on a sequential (serial) processor, but would provide very high speed if implemented in special hardware. Such hardware solutions could also yield lower weight and power consumption, both important features for small satellites.
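The feature extraction described above is simple enough to sketch; the C fragment below builds a histogram of distances from a candidate star to its neighbours, whose bins then serve as the feature vector matched against the catalogue. Binning and names are illustrative, not the authors' exact scheme.

    #include <math.h>
    #include <string.h>

    /* Histogram of distances from star `center` to all other detected
       stars, normalized to max_dist over n_bins bins. The resulting
       vector is (nearly) invariant to image rotation, which is what
       makes it usable as a constellation signature. */
    void distance_histogram(const double *x, const double *y, int n_stars,
                            int center, double max_dist,
                            float *hist, int n_bins)
    {
        memset(hist, 0, n_bins * sizeof(float));
        for (int k = 0; k < n_stars; k++) {
            if (k == center) continue;
            double d = hypot(x[k] - x[center], y[k] - y[center]);
            int bin = (int)(d / max_dist * n_bins);
            if (bin < n_bins) hist[bin] += 1.0f;
        }
    }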
Biosequence Similarity Search on the Mercury System
Krishnamurthy, Praveen; Buhler, Jeremy; Chamberlain, Roger; Franklin, Mark; Gyang, Kwame; Jacob, Arpith; Lancaster, Joseph
2007-01-01
Biosequence similarity search is an important application in modern molecular biology. Search algorithms aim to identify sets of sequences whose similarity suggests a common evolutionary origin or function. The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database. Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store and into reconfigurable hardware. An important component of application deployment on the Mercury system is the functional decomposition of the application onto both the reconfigurable hardware and the traditional processor. Both the Mercury BLASTN application design and its performance analysis are described.
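To make the decomposition concrete, the C sketch below shows the first stage of a BLASTN-style search, streaming the database past a table of query w-mers packed two bits per base; in Mercury this word-matching stage is the natural candidate for the reconfigurable hardware sitting next to the data store. The code is illustrative, not the Mercury implementation.

    #include <stdint.h>

    static int base2bits(char c)   /* uppercase DNA assumed */
    {
        switch (c) { case 'A': return 0; case 'C': return 1;
                     case 'G': return 2; default:  return 3; }
    }

    /* Slide a w-mer window over the database stream; query_words is a
       bitmap indexed by the packed word (w <= 12 keeps it to 16 MB or
       less). Each hit position is handed to the next pipeline stage. */
    void scan_stream(const char *db, long len, int w,
                     const uint8_t *query_words, void (*hit)(long pos))
    {
        uint32_t word = 0, mask = (1u << (2 * w)) - 1;
        for (long i = 0; i < len; i++) {
            word = ((word << 2) | (uint32_t)base2bits(db[i])) & mask;
            if (i >= w - 1 && query_words[word])
                hit(i - w + 1);
        }
    }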
A Nonlinear Digital Control Solution for a DC/DC Power Converter
NASA Technical Reports Server (NTRS)
Zhu, Minshao
2002-01-01
A digital Nonlinear Proportional-Integral-Derivative (NPID) control algorithm was proposed to control a 1-kW, PWM, DC/DC switching power converter. The NPID methodology is introduced and a practical hardware control solution is obtained. The design of the controller was completed using Matlab (trademark) Simulink, while the hardware-in-the-loop testing was performed using both the dSPACE (trademark) rapid prototyping system and a stand-alone Texas Instruments (trademark) Digital Signal Processor (DSP)-based system. The final nonlinear digital control algorithm was implemented and tested using the ED408043-1 Westinghouse DC-DC switching power converter. The NPID test results are discussed and compared to the results of a standard Proportional-Integral (PI) controller.
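The abstract does not give the exact nonlinearity used, so the C sketch below shows one common NPID construction: a discrete PID step whose proportional gain grows with the error magnitude, clamped to a PWM duty-cycle range. All gains and names are illustrative assumptions, not the paper's controller.

    #include <math.h>

    typedef struct { double kp0, kp1, ki, kd, integ, prev_e; } npid_t;

    /* One NPID update: kp rises with |e| so the controller reacts hard
       to large transients but stays gentle near the setpoint. Returns
       the duty-cycle command u in [0, 1]. */
    double npid_step(npid_t *c, double setpoint, double measured, double dt)
    {
        double e  = setpoint - measured;
        double kp = c->kp0 + c->kp1 * fabs(e);   /* nonlinear P gain */
        c->integ += e * dt;
        double de = (e - c->prev_e) / dt;
        c->prev_e = e;
        double u = kp * e + c->ki * c->integ + c->kd * de;
        if (u > 1.0) u = 1.0;
        if (u < 0.0) u = 0.0;
        return u;
    }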
Numerical aerodynamic simulation facility feasibility study
NASA Technical Reports Server (NTRS)
1979-01-01
There were three major issues examined in the feasibility study. First, the ability of the proposed system architecture to support the anticipated workload was evaluated. Second, the throughput of the computational engine (the flow model processor) was studied using real application programs. Third, the availability, reliability, and maintainability of the system were modeled. The evaluations were based on the baseline systems. The results show that the implementation of the Numerical Aerodynamic Simulation Facility, in the form considered, would indeed be a feasible project with an acceptable level of risk. The technology required (both hardware and software) either already exists or, in the case of a few parts, is expected to be announced this year. Facets of the work described include the hardware configuration, software, user language, and fault tolerance.
NASA Technical Reports Server (NTRS)
1976-01-01
Technologies required to support the stated OAST thrust to increase information return by X1000, while reducing costs by a factor of 10, are identified. The most significant driver is the need for an overall end-to-end data system management technology. Maximum use of LSI component technology and trade-offs between hardware and software are manifest in nearly all considerations of technology needs. By far the greatest need for data handling technology was identified for the space Exploration and Global Services themes. Major advances are needed in NASA's ability to provide cost-effective mass reduction of space data and automated assessment of earth-looking imagery, with a concomitant reduction in cost per useful bit. A combined approach would be necessary, embodying end-to-end system analysis, onboard data set selection, onboard data processing, highly parallel image processing (both ground and space), low-cost, high-capacity memories, and low-cost user data distribution systems.
The graphics and data acquisition software package
NASA Technical Reports Server (NTRS)
Crosier, W. G.
1981-01-01
A software package was developed for use with micro- and minicomputers, particularly the LSI-11/PDP-11 series. The package has a number of Fortran-callable subroutines which perform a variety of frequently needed tasks for biomedical applications. All routines are well documented, flexible, easy to use and modify, and require minimal programmer knowledge of peripheral hardware. The package is also economical of memory and CPU time. A single subroutine call can perform any one of the following functions: (1) plot an array of integer values from sampled A/D data; (2) plot an array of Y values versus an array of X values; (3) draw horizontal and/or vertical grid lines of selectable type; (4) annotate grid lines with user units; (5) get coordinates of user-controlled crosshairs from the terminal for interactive graphics; (6) sample any analog channel with program-selectable gain; (7) wait a specified time interval; and (8) perform random access I/O of one or more blocks of a sequential disk file. Several miscellaneous functions are also provided.
Moya-Angeler, Joaquín; Vaquero, Javier; Forriol, Francisco
2017-06-01
The purpose of this study was to evaluate the functional status prior to and at different times after anterior cruciate ligament reconstruction (ACLR), and to analyze the changes in the kinetic patterns of the involved and uninvolved lower limb during gait, sprint and three hop tests. Seventy-four male patients with an ACL injury were included in the study. All patients performed a standardized kinetic protocol including gait, sprint and three hop tests (single-leg hop, drop vertical jump and vertical jump tests), preoperatively and at 3, 6, and 12 months after ACLR with a semitendinosus gracilis tendon autograft. Measurements were performed with two force plates. The lower limb symmetry index (LSI) was calculated to determine whether a side-to-side leg difference was classified as normal (LSI >90%) or abnormal (LSI <90%). The LSI presented high values (>90%) at almost all times before and after ACLR in the gait, sprint and single-leg hop tests (p < 0.005), with a tendency to increase postoperatively. A lower LSI (<90%) was observed in tests where both extremities were tested simultaneously, such as the drop vertical jump and vertical jump tests (p < 0.05). We observed a tendency toward restored symmetry in the kinetics of the involved and uninvolved limb up to twelve months after ACLR, especially in those tests in which both limbs were tested individually (gait analysis, sprint and single-leg hop tests). Therefore, the isolation of the involved and uninvolved limb seems to be a critical component in the functional rehabilitation and evaluation of patients before and after ACLR. Level III.
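For reference, the limb symmetry index used here is conventionally computed as the ratio of the involved-limb measure to the uninvolved-limb measure (the quantity compared depends on the test, e.g., hop distance or peak ground-reaction force); this standard definition is consistent with, though not spelled out in, the abstract:

    \mathrm{LSI} = \frac{X_{\text{involved}}}{X_{\text{uninvolved}}} \times 100\%

with LSI > 90% classified as normal and LSI < 90% as abnormal.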
NASA Astrophysics Data System (ADS)
Morishita, Tetsuya
2012-07-01
We report a first-principles molecular-dynamics study of the relaxation dynamics in liquid silicon (l-Si) over a wide temperature range (1000-2200 K). We find that the intermediate scattering function for l-Si exhibits a compressed exponential decay above 1200 K including the supercooled regime, which is in stark contrast to that for normal "dense" liquids which typically show stretched exponential decay in the supercooled regime. The coexistence of particles having ballistic-like motion and those having diffusive-like motion is demonstrated, which accounts for the compressed exponential decay in l-Si. An attempt to elucidate the crossover from the ballistic to the diffusive regime in the "time-dependent" diffusion coefficient is made and the temperature-independent universal feature of the crossover is disclosed.
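The contrast drawn above can be stated with the usual Kohlrausch parameterization of the long-time decay of the intermediate scattering function (a standard form, not quoted from the paper):

    F(k,t) \simeq \exp\left[-\left(t/\tau\right)^{\beta}\right]

where β < 1 yields the stretched exponential typical of dense supercooled liquids, while β > 1 yields the compressed exponential reported here for l-Si above 1200 K.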
Data Relay Board with Protocol for High-Speed, Free-Space Optical Communications
NASA Technical Reports Server (NTRS)
Wright, Malcolm; Clare, Loren; Gould, Gary; Pedyash, Maxim
2004-01-01
In a free-space optical communication system, the mitigation of transient outages through the incorporation of error-control methods is of particular concern, the outages being caused by scintillation fades and obscurants. The focus of this innovative technology is the development of a data relay system for a reliable, high-data-rate, free-space-based optical transport network. The data relay boards will establish the link, maintain synchronous connection, group the data into frames, and provide for automatic retransmission (ARQ) of lost or erred frames. A certain Quality of Service (QoS) can then be ensured, compatible with the required data rate. The protocol to be used by the data relay system is based on the draft CCSDS standard data-link protocol Proximity-1, selected, for example, for links from orbiters to multiple lander assets in the Mars network. In addition to providing data-link protocol capabilities for the free-space optical link and buffering the data, the data relay system will interface directly with user applications over Gigabit Ethernet and/or with high-speed storage resources via Fibre Channel. The hardware implementation is built on a network-processor-based architecture. This technology combines the power of a hardware switch capable of data switching and packet routing at Gbps rates with the flexibility of a software-driven processor that can host highly adaptive and reconfigurable protocols used, for example, in wireless local-area networks (LANs). The system will be implemented in a modular, multi-board fashion. The main hardware elements of the data relay system are the new data relay board developed by Rockwell Scientific, a COTS Gigabit Ethernet board for the user interface, and a COTS Fibre Channel board that connects to local storage. The boards reside in a cPCI backplane and can be housed in a VME-type enclosure.
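As an illustration of the frame-and-retransmit machinery described above, the C sketch below outlines a go-back-N ARQ sender state; it is a generic ARQ skeleton, not the Proximity-1 frame format, and all structure names are illustrative.

    #include <stdint.h>

    #define WINDOW 8
    #define MAXPAY 1024

    typedef struct {
        uint16_t seq;              /* frame sequence number */
        uint16_t len;
        uint32_t crc;              /* CRC over the payload */
        uint8_t  payload[MAXPAY];
    } frame_t;

    typedef struct {
        frame_t  pending[WINDOW];  /* sent but not yet acknowledged */
        uint16_t base, next;       /* oldest unacked seq, next seq to send */
    } arq_tx_t;

    /* Cumulative ACK: everything up to and including `ack` is delivered,
       so the window slides forward; on timeout the sender resends from
       `base` (sequence wraparound omitted for brevity). Riding out a
       scintillation fade is then just a burst of such retransmissions. */
    void arq_on_ack(arq_tx_t *tx, uint16_t ack)
    {
        while (tx->base != tx->next && tx->base <= ack)
            tx->base++;
    }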
New technologies for supporting real-time on-board software development
NASA Astrophysics Data System (ADS)
Kerridge, D.
1995-03-01
The next generation of on-board data management systems will be significantly more complex than current designs, and will be required to perform more complex and demanding tasks in software. Improved hardware technology, in the form of the MA31750 radiation-hard processor, is one key component in addressing the needs of future embedded systems. However, to complement these hardware advances, improved support for the design and implementation of real-time data management software is now needed. This will help to control the cost and risk associated with developing data management software as it becomes an increasingly significant element within embedded systems. One particular problem with developing embedded software is managing the non-functional requirements in a systematic way. This paper identifies how Logica has exploited recent developments in hard real-time theory to address this problem through the use of new hard real-time analysis and design methods which can be supported by specialized tools. The first stage in transferring this technology from the research domain to industrial application has already been completed. The MA31750 Hard Real-Time Embedded Software Support Environment (HESSE) is a loosely integrated set of hardware and software tools which directly support the process of hard real-time analysis for software targeting the MA31750 processor. With further development, HESSE promises to provide embedded system developers with software tools which can reduce the risks associated with developing complex hard real-time software. Supported in this way by more sophisticated software methods and tools, it is foreseen that MA31750-based embedded systems can meet the processing needs of the next generation of on-board data management systems.
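The kind of analysis such tools automate can be shown compactly; the C sketch below implements the classic fixed-priority response-time test (Joseph and Pandya; Audsley et al.), iterating R_i = C_i + sum over higher-priority tasks j of ceil(R_i/T_j)*C_j to a fixed point. It illustrates the published theory, not HESSE's actual implementation.

    #include <math.h>

    /* Tasks indexed in descending priority; C = worst-case execution
       time, T = period, D = deadline. Returns 1 if every task's
       worst-case response time fits within its deadline. */
    int schedulable(const double *C, const double *T, const double *D, int n)
    {
        for (int i = 0; i < n; i++) {
            double R = C[i], prev = 0.0;
            while (R != prev && R <= D[i]) {
                prev = R;
                R = C[i];
                for (int j = 0; j < i; j++)   /* higher-priority interference */
                    R += ceil(prev / T[j]) * C[j];
            }
            if (R > D[i]) return 0;           /* task i can miss its deadline */
        }
        return 1;
    }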
Hardware realization of an SVM algorithm implemented in FPGAs
NASA Astrophysics Data System (ADS)
Wiśniewski, Remigiusz; Bazydło, Grzegorz; Szcześniak, Paweł
2017-08-01
The paper proposes a technique for the hardware realization of space vector modulation (SVM) of state-function switching in a matrix converter (MC), oriented on implementation in a single field programmable gate array (FPGA). In an MC the SVM method is based on the instantaneous space-vector representation of input currents and output voltages. Traditional computation algorithms usually involve digital signal processors (DSPs), which must control a large number of power transistors (18 transistors and 18 independent PWM outputs) and handle non-standard positions of control pulses during the switching sequence. Recently, hardware implementations have become popular since computed operations may be executed much faster and more efficiently due to the nature of digital devices (especially their concurrency). In the paper, we propose a hardware algorithm for SVM computation. In contrast to existing techniques, the presented solution applies the COordinate Rotation DIgital Computer (CORDIC) method to solve the trigonometric operations. Furthermore, adequate arithmetic modules (that is, sub-devices) used for intermediate calculations, such as code converters or proper sector selectors (for output voltages and input currents), are presented in detail. The proposed technique has been implemented as a design described with the use of the Verilog hardware description language. The preliminary results of logic implementation oriented on Xilinx FPGAs (particularly, a low-cost device from the Xilinx Artix-7 family) are also presented.
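The CORDIC idea is worth making concrete: sines and cosines are obtained from nothing but shifts, adds and a small arctangent table, which is why it fits an FPGA better than a DSP's multiplier-based trigonometry. The fixed-point C sketch below is a textbook rotation-mode CORDIC, not the module from the paper.

    #include <stdint.h>

    #define ITER 16
    /* atan(2^-i) in Q16.16 radians */
    static const int32_t atan_tab[ITER] = {
        51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
        256, 128, 64, 32, 16, 8, 4, 2 };
    #define K 39797   /* CORDIC gain correction, 0.60725 in Q16.16 */

    /* Rotation-mode CORDIC: drive the residual angle z to zero with
       +/- atan(2^-i) micro-rotations; x,y converge to cos and sin.
       Valid for |angle| < pi/2 (Q16.16 radians). */
    void cordic_sincos(int32_t angle, int32_t *s, int32_t *c)
    {
        int32_t x = K, y = 0, z = angle;
        for (int i = 0; i < ITER; i++) {
            int32_t dx = y >> i, dy = x >> i;
            if (z >= 0) { x -= dx; y += dy; z -= atan_tab[i]; }
            else        { x += dx; y -= dy; z += atan_tab[i]; }
        }
        *c = x; *s = y;   /* Q16.16 cos(angle), sin(angle) */
    }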