Hypercluster Parallel Processor
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela
1992-01-01
Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
Using a software-defined computer in teaching the basics of computer architecture and operation
NASA Astrophysics Data System (ADS)
Kosowska, Julia; Mazur, Grzegorz
2017-08-01
The paper describes the concept and implementation of SDC_One software-defined computer designed for experimental and didactic purposes. Equipped with extensive hardware monitoring mechanisms, the device enables the students to monitor the computer's operation on bus transfer cycle or instruction cycle basis, providing the practical illustration of basic aspects of computer's operation. In the paper, we describe the hardware monitoring capabilities of SDC_One and some scenarios of using it in teaching the basics of computer architecture and microprocessor operation.
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple- instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1991-01-01
The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
Laboratory for Computer Science Progress Report 19, 1 July 1981-30 June 1982.
1984-05-01
Multiprocessor Architectures 202 4. TRIX Operating System 209 5. VLSI Tools 212 ’SYSTEMATIC PROGRAM DEVELOPMENT, 221 1. Introduction 222 2. Specification...exploring distributed operating systems and the architecture of single-user powerful computers that are interconnected by communication networks. The...to now. In particular, we expect to experiment with languages, operating systems , and applications that establish the feasibility of distributed
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
Design and Training of Limited-Interconnect Architectures
1991-07-16
and signal processing. Neuromorphic (brain like) models, allow an alternative for achieving real-time operation tor such tasks, while having a...compact and robust architecture. Neuromorphic models consist of interconnections of simple computational nodes. In this approach, each node computes a...operational performance. I1. Research Objectives The research objectives were: 1. Development of on- chip local training rules specifically designed for
Optimizing Security of Cloud Computing within the DoD
2010-12-01
information security governance and risk management; application security; cryptography; security architecture and design; operations security; business ...governance and risk management; application security; cryptography; security architecture and design; operations security; business continuity...20 7. Operational Security (OPSEC).........................................................20 8. Business Continuity Planning (BCP) and Disaster
Optical systolic solutions of linear algebraic equations
NASA Technical Reports Server (NTRS)
Neuman, C. P.; Casasent, D.
1984-01-01
The philosophy and data encoding possible in systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix-decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
Computing architecture for autonomous microgrids
Goldsmith, Steven Y.
2015-09-29
A computing architecture that facilitates autonomously controlling operations of a microgrid is described herein. A microgrid network includes numerous computing devices that execute intelligent agents, each of which is assigned to a particular entity (load, source, storage device, or switch) in the microgrid. The intelligent agents can execute in accordance with predefined protocols to collectively perform computations that facilitate uninterrupted control of the .
Blueprint for a microwave trapped ion quantum computer.
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K
2017-02-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1992-01-01
The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.
Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang
2016-12-07
The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
A heterogeneous hierarchical architecture for real-time computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skroch, D.A.; Fornaro, R.J.
The need for high-speed data acquisition and control algorithms has prompted continued research in the area of multiprocessor systems and related programming techniques. The result presented here is a unique hardware and software architecture for high-speed real-time computer systems. The implementation of a prototype of this architecture has required the integration of architecture, operating systems and programming languages into a cohesive unit. This report describes a Heterogeneous Hierarchial Architecture for Real-Time (H{sup 2} ART) and system software for program loading and interprocessor communication.
Modelling parallel programs and multiprocessor architectures with AXE
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Fineman, Charles E.
1991-01-01
AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate for parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior is described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
OS friendly microprocessor architecture: Hardware level computer security
NASA Astrophysics Data System (ADS)
Jungwirth, Patrick; La Fratta, Patrick
2016-05-01
We present an introduction to the patented OS Friendly Microprocessor Architecture (OSFA) and hardware level computer security. Conventional microprocessors have not tried to balance hardware performance and OS performance at the same time. Conventional microprocessors have depended on the Operating System for computer security and information assurance. The goal of the OS Friendly Architecture is to provide a high performance and secure microprocessor and OS system. We are interested in cyber security, information technology (IT), and SCADA control professionals reviewing the hardware level security features. The OS Friendly Architecture is a switched set of cache memory banks in a pipeline configuration. For light-weight threads, the memory pipeline configuration provides near instantaneous context switching times. The pipelining and parallelism provided by the cache memory pipeline provides for background cache read and write operations while the microprocessor's execution pipeline is running instructions. The cache bank selection controllers provide arbitration to prevent the memory pipeline and microprocessor's execution pipeline from accessing the same cache bank at the same time. This separation allows the cache memory pages to transfer to and from level 1 (L1) caching while the microprocessor pipeline is executing instructions. Computer security operations are implemented in hardware. By extending Unix file permissions bits to each cache memory bank and memory address, the OSFA provides hardware level computer security.
Fault tolerant computer control for a Maglev transportation system
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George
1994-01-01
Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1988-01-01
The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
A High Performance Computer Architecture for Embedded And/Or Multi-Computer Applications
1990-09-01
commercially available, real - time operating system . CHOICES and ARTS are real-time operating systems developed at the University of Illinois and CMU...respectively. Selection of a real - time operating system will be made in the next phase of the project. U BIBLIOGRAPHY U Wulf, Wm. A. The WM Computer
Blueprint for a microwave trapped ion quantum computer
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.
2017-01-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
NASA Technical Reports Server (NTRS)
Mielke, R.; Stoughton, J.; Som, S.; Obando, R.; Malekpour, M.; Mandala, B.
1990-01-01
A functional description of the ATAMM Multicomputer Operating System is presented. ATAMM (Algorithm to Architecture Mapping Model) is a marked graph model which describes the implementation of large grained, decomposed algorithms on data flow architectures. AMOS, the ATAMM Multicomputer Operating System, is an operating system which implements the ATAMM rules. A first generation version of AMOS which was developed for the Advanced Development Module (ADM) is described. A second generation version of AMOS being developed for the Generic VHSIC Spaceborne Computer (GVSC) is also presented.
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Radenski, Atanas; Follen, Gregory J. (Technical Monitor)
2001-01-01
The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high -performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high -performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
Digital optical computers at the optoelectronic computing systems center
NASA Technical Reports Server (NTRS)
Jordan, Harry F.
1991-01-01
The Digital Optical Computing Program within the National Science Foundation Engineering Research Center for Opto-electronic Computing Systems has as its specific goal research on optical computing architectures suitable for use at the highest possible speeds. The program can be targeted toward exploiting the time domain because other programs in the Center are pursuing research on parallel optical systems, exploiting optical interconnection and optical devices and materials. Using a general purpose computing architecture as the focus, we are developing design techniques, tools and architecture for operation at the speed of light limit. Experimental work is being done with the somewhat low speed components currently available but with architectures which will scale up in speed as faster devices are developed. The design algorithms and tools developed for a general purpose, stored program computer are being applied to other systems such as optimally controlled optical communication networks.
Virtual Business Operating Environment in the Cloud: Conceptual Architecture and Challenges
NASA Astrophysics Data System (ADS)
Nezhad, Hamid R. Motahari; Stephenson, Bryan; Singhal, Sharad; Castellanos, Malu
Advances in service oriented architecture (SOA) have brought us close to the once imaginary vision of establishing and running a virtual business, a business in which most or all of its business functions are outsourced to online services. Cloud computing offers a realization of SOA in which IT resources are offered as services that are more affordable, flexible and attractive to businesses. In this paper, we briefly study advances in cloud computing, and discuss the benefits of using cloud services for businesses and trade-offs that they have to consider. We then present 1) a layered architecture for the virtual business, and 2) a conceptual architecture for a virtual business operating environment. We discuss the opportunities and research challenges that are ahead of us in realizing the technical components of this conceptual architecture. We conclude by giving the outlook and impact of cloud services on both large and small businesses.
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
2013-11-05
Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
Efficient architecture for spike sorting in reconfigurable hardware.
Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying
2013-11-01
This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors share the same circuit for lowering the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented by field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design for attaining high classification correct rate and high speed computation.
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Bit storage and bit flip operations in an electromechanical oscillator.
Mahboob, I; Yamaguchi, H
2008-05-01
The Parametron was first proposed as a logic-processing system almost 50 years ago. In this approach the two stable phases of an excited harmonic oscillator provide the basis for logic operations. Computer architectures based on LC oscillators were developed for this approach, but high power consumption and difficulties with integration meant that the Parametron was rendered obsolete by the transistor. Here we propose an approach to mechanical logic based on nanoelectromechanical systems that is a variation on the Parametron architecture and, as a first step towards a possible nanomechanical computer, we demonstrate both bit storage and bit flip operations.
First 3 years of operation of RIACS (Research Institute for Advanced Computer Science) (1983-1985)
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
The focus of the Research Institute for Advanced Computer Science (RIACS) is to explore matches between advanced computing architectures and the processes of scientific research. An architecture evaluation of the MIT static dataflow machine, specification of a graphical language for expressing distributed computations, and specification of an expert system for aiding in grid generation for two-dimensional flow problems was initiated. Research projects for 1984 and 1985 are summarized.
NASA Technical Reports Server (NTRS)
Smith, T. B., III; Lala, J. H.
1984-01-01
The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another and hardware fault-detection and fault-masking is provided which is transparent to the software. Operating system design and user software design is thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.
Bit-parallel arithmetic in a massively-parallel associative processor
NASA Technical Reports Server (NTRS)
Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.
1992-01-01
A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m exp 2) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.
Layered Architectures for Quantum Computers and Quantum Repeaters
NASA Astrophysics Data System (ADS)
Jones, Nathan C.
This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.
A distributed parallel storage architecture and its potential application within EOSDIS
NASA Technical Reports Server (NTRS)
Johnston, William E.; Tierney, Brian; Feuquay, Jay; Butzer, Tony
1994-01-01
We describe the architecture, implementation, use of a scalable, high performance, distributed-parallel data storage system developed in the ARPA funded MAGIC gigabit testbed. A collection of wide area distributed disk servers operate in parallel to provide logical block level access to large data sets. Operated primarily as a network-based cache, the architecture supports cooperation among independently owned resources to provide fast, large-scale, on-demand storage to support data handling, simulation, and computation.
NASA Astrophysics Data System (ADS)
Lhamon, Michael Earl
A pattern recognition system which uses complex correlation filter banks requires proportionally more computational effort than single-real valued filters. This introduces increased computation burden but also introduces a higher level of parallelism, that common computing platforms fail to identify. As a result, we consider algorithm mapping to both optical and digital processors. For digital implementation, we develop computationally efficient pattern recognition algorithms, referred to as, vector inner product operators that require less computational effort than traditional fast Fourier methods. These algorithms do not need correlation and they map readily onto parallel digital architectures, which imply new architectures for optical processors. These filters exploit circulant-symmetric matrix structures of the training set data representing a variety of distortions. By using the same mathematical basis as with the vector inner product operations, we are able to extend the capabilities of more traditional correlation filtering to what we refer to as "Super Images". These "Super Images" are used to morphologically transform a complicated input scene into a predetermined dot pattern. The orientation of the dot pattern is related to the rotational distortion of the object of interest. The optical implementation of "Super Images" yields feature reduction necessary for using other techniques, such as artificial neural networks. We propose a parallel digital signal processor architecture based on specific pattern recognition algorithms but general enough to be applicable to other similar problems. Such an architecture is classified as a data flow architecture. Instead of mapping an algorithm to an architecture, we propose mapping the DSP architecture to a class of pattern recognition algorithms. Today's optical processing systems have difficulties implementing full complex filter structures. Typically, optical systems (like the 4f correlators) are limited to phase-only implementation with lower detection performance than full complex electronic systems. Our study includes pseudo-random pixel encoding techniques for approximating full complex filtering. Optical filter bank implementation is possible and they have the advantage of time averaging the entire filter bank at real time rates. Time-averaged optical filtering is computational comparable to billions of digital operations-per-second. For this reason, we believe future trends in high speed pattern recognition will involve hybrid architectures of both optical and DSP elements.
Client-Server: What Is It and Are We There Yet?
ERIC Educational Resources Information Center
Gershenfeld, Nancy
1995-01-01
Discusses client-server architecture in dumb terminals, personal computers, local area networks, and graphical user interfaces. Focuses on functions offered by client personal computers: individualized environments; flexibility in running operating systems; advanced operating system features; multiuser environments; and centralized data…
Cloud Computing for Mission Design and Operations
NASA Technical Reports Server (NTRS)
Arrieta, Juan; Attiyah, Amy; Beswick, Robert; Gerasimantos, Dimitrios
2012-01-01
The space mission design and operations community already recognizes the value of cloud computing and virtualization. However, natural and valid concerns, like security, privacy, up-time, and vendor lock-in, have prevented a more widespread and expedited adoption into official workflows. In the interest of alleviating these concerns, we propose a series of guidelines for internally deploying a resource-oriented hub of data and algorithms. These guidelines provide a roadmap for implementing an architecture inspired in the cloud computing model: associative, elastic, semantical, interconnected, and adaptive. The architecture can be summarized as exposing data and algorithms as resource-oriented Web services, coordinated via messaging, and running on virtual machines; it is simple, and based on widely adopted standards, protocols, and tools. The architecture may help reduce common sources of complexity intrinsic to data-driven, collaborative interactions and, most importantly, it may provide the means for teams and agencies to evaluate the cloud computing model in their specific context, with minimal infrastructure changes, and before committing to a specific cloud services provider.
Controlling Infrastructure Costs: Right-Sizing the Mission Control Facility
NASA Technical Reports Server (NTRS)
Martin, Keith; Sen-Roy, Michael; Heiman, Jennifer
2009-01-01
Johnson Space Center's Mission Control Center is a space vehicle, space program agnostic facility. The current operational design is essentially identical to the original facility architecture that was developed and deployed in the mid-90's. In an effort to streamline the support costs of the mission critical facility, the Mission Operations Division (MOD) of Johnson Space Center (JSC) has sponsored an exploratory project to evaluate and inject current state-of-the-practice Information Technology (IT) tools, processes and technology into legacy operations. The general push in the IT industry has been trending towards a data-centric computer infrastructure for the past several years. Organizations facing challenges with facility operations costs are turning to creative solutions combining hardware consolidation, virtualization and remote access to meet and exceed performance, security, and availability requirements. The Operations Technology Facility (OTF) organization at the Johnson Space Center has been chartered to build and evaluate a parallel Mission Control infrastructure, replacing the existing, thick-client distributed computing model and network architecture with a data center model utilizing virtualization to provide the MCC Infrastructure as a Service. The OTF will design a replacement architecture for the Mission Control Facility, leveraging hardware consolidation through the use of blade servers, increasing utilization rates for compute platforms through virtualization while expanding connectivity options through the deployment of secure remote access. The architecture demonstrates the maturity of the technologies generally available in industry today and the ability to successfully abstract the tightly coupled relationship between thick-client software and legacy hardware into a hardware agnostic "Infrastructure as a Service" capability that can scale to meet future requirements of new space programs and spacecraft. This paper discusses the benefits and difficulties that a migration to cloud-based computing philosophies has uncovered when compared to the legacy Mission Control Center architecture. The team consists of system and software engineers with extensive experience with the MCC infrastructure and software currently used to support the International Space Station (ISS) and Space Shuttle program (SSP).
GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis
NASA Technical Reports Server (NTRS)
Brent, G. A.
1978-01-01
A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed inflight information for the pilot. A processor architecture called the Team Architecture was proposed. This is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/1 simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs and results at length are presented. Also included are program input formats, outputs and listings.
The expanded role of computers in Space Station Freedom real-time operations
NASA Technical Reports Server (NTRS)
Crawford, R. Paul; Cannon, Kathleen V.
1990-01-01
The challenges that NASA and its international partners face in their real-time operation of the Space Station Freedom necessitate an increased role on the part of computers. In building the operational concepts concerning the role of the computer, the Space Station program is using lessons learned experience from past programs, knowledge of the needs of future space programs, and technical advances in the computer industry. The computer is expected to contribute most significantly in real-time operations by forming a versatile operating architecture, a responsive operations tool set, and an environment that promotes effective and efficient utilization of Space Station Freedom resources.
DFT algorithms for bit-serial GaAs array processor architectures
NASA Technical Reports Server (NTRS)
Mcmillan, Gary B.
1988-01-01
Systems and Processes Engineering Corporation (SPEC) has developed an innovative array processor architecture for computing Fourier transforms and other commonly used signal processing algorithms. This architecture is designed to extract the highest possible array performance from state-of-the-art GaAs technology. SPEC's architectural design includes a high performance RISC processor implemented in GaAs, along with a Floating Point Coprocessor and a unique Array Communications Coprocessor, also implemented in GaAs technology. Together, these data processors represent the latest in technology, both from an architectural and implementation viewpoint. SPEC has examined numerous algorithms and parallel processing architectures to determine the optimum array processor architecture. SPEC has developed an array processor architecture with integral communications ability to provide maximum node connectivity. The Array Communications Coprocessor embeds communications operations directly in the core of the processor architecture. A Floating Point Coprocessor architecture has been defined that utilizes Bit-Serial arithmetic units, operating at very high frequency, to perform floating point operations. These Bit-Serial devices reduce the device integration level and complexity to a level compatible with state-of-the-art GaAs device technology.
A FORCEnet Framework for Analysis of Existing Naval C4I Architectures
2003-06-01
best qualities of humans and computers. f. Information Weapons Information weapons integrate the use of military deception, psychological ...operations, to include electronic warfare, psychological operations, computer network attack, computer network defense, operations security, and military...F/A-18 ( ATARS /SHARP), S-3B (SSU), SH-60 LAMPS (HAWKLINK) and P-3C (AIP, Special Projects). CDL-N consists of two antennas (one meter diameter
The symbolic computation and automatic analysis of trajectories
NASA Technical Reports Server (NTRS)
Grossman, Robert
1991-01-01
Research was generally done on computation of trajectories of dynamical systems, especially control systems. Algorithms were further developed for rewriting expressions involving differential operators. The differential operators involved arise in the local analysis of nonlinear control systems. An initial design was completed of the system architecture for software to analyze nonlinear control systems using data base computing.
Space Ultrareliable Modular Computer (SUMC) instruction simulator
NASA Technical Reports Server (NTRS)
Curran, R. T.
1972-01-01
The design principles, description, functional operation, and recommended expansion and enhancements are presented for the Space Ultrareliable Modular Computer interpretive simulator. Included as appendices are the user's manual, program module descriptions, target instruction descriptions, simulator source program listing, and a sample program printout. In discussing the design and operation of the simulator, the key problems involving host computer independence and target computer architectural scope are brought into focus.
A performance analysis of advanced I/O architectures for PC-based network file servers
NASA Astrophysics Data System (ADS)
Huynh, K. D.; Khoshgoftaar, T. M.
1994-12-01
In the personal computing and workstation environments, more and more I/O adapters are becoming complete functional subsystems that are intelligent enough to handle I/O operations on their own without much intervention from the host processor. The IBM Subsystem Control Block (SCB) architecture has been defined to enhance the potential of these intelligent adapters by defining services and conventions that deliver command information and data to and from the adapters. In recent years, a new storage architecture, the Redundant Array of Independent Disks (RAID), has been quickly gaining acceptance in the world of computing. In this paper, we would like to discuss critical system design issues that are important to the performance of a network file server. We then present a performance analysis of the SCB architecture and disk array technology in typical network file server environments based on personal computers (PCs). One of the key issues investigated in this paper is whether a disk array can outperform a group of disks (of same type, same data capacity, and same cost) operating independently, not in parallel as in a disk array.
Exploration of operator method digital optical computers for application to NASA
NASA Technical Reports Server (NTRS)
1990-01-01
Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implements. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternative to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.
Stewart, Terrence C; Eliasmith, Chris
2013-06-01
Quantum probability (QP) theory can be seen as a type of vector symbolic architecture (VSA): mental states are vectors storing structured information and manipulated using algebraic operations. Furthermore, the operations needed by QP match those in other VSAs. This allows existing biologically realistic neural models to be adapted to provide a mechanistic explanation of the cognitive phenomena described in the target article by Pothos & Busemeyer (P&B).
Architectural requirements for the Red Storm computing system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Camp, William J.; Tomkins, James Lee
This report is based on the Statement of Work (SOW) describing the various requirements for delivering 3 new supercomputer system to Sandia National Laboratories (Sandia) as part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative (ASCI) program. This system is named Red Storm and will be a distributed memory, massively parallel processor (MPP) machine built primarily out of commodity parts. The requirements presented here distill extensive architectural and design experience accumulated over a decade and a half of research, development and production operation of similar machines at Sandia. Red Storm will have an unusually high bandwidth, low latencymore » interconnect, specially designed hardware and software reliability features, a light weight kernel compute node operating system and the ability to rapidly switch major sections of the machine between classified and unclassified computing environments. Particular attention has been paid to architectural balance in the design of Red Storm, and it is therefore expected to achieve an atypically high fraction of its peak speed of 41 TeraOPS on real scientific computing applications. In addition, Red Storm is designed to be upgradeable to many times this initial peak capability while still retaining appropriate balance in key design dimensions. Installation of the Red Storm computer system at Sandia's New Mexico site is planned for 2004, and it is expected that the system will be operated for a minimum of five years following installation.« less
Autonomic Computing for Spacecraft Ground Systems
NASA Technical Reports Server (NTRS)
Li, Zhenping; Savkli, Cetin; Jones, Lori
2007-01-01
Autonomic computing for spacecraft ground systems increases the system reliability and reduces the cost of spacecraft operations and software maintenance. In this paper, we present an autonomic computing solution for spacecraft ground systems at NASA Goddard Space Flight Center (GSFC), which consists of an open standard for a message oriented architecture referred to as the GMSEC architecture (Goddard Mission Services Evolution Center), and an autonomic computing tool, the Criteria Action Table (CAT). This solution has been used in many upgraded ground systems for NASA 's missions, and provides a framework for developing solutions with higher autonomic maturity.
Traffic information computing platform for big data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn
Big data environment create data conditions for improving the quality of traffic information service. The target of this article is to construct a traffic information computing platform for big data environment. Through in-depth analysis the connotation and technology characteristics of big data and traffic information service, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee the traffic safety and efficient operation, more intelligent and personalized traffic information service can be used for the traffic information users.
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance
NASA Technical Reports Server (NTRS)
Rushby, John
1999-01-01
Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.
Dynamic array processing for computationally intensive expert systems in CLIPS
NASA Technical Reports Server (NTRS)
Athavale, N. N.; Ragade, R. K.; Fenske, T. E.; Cassaro, M. A.
1990-01-01
This paper puts forth an architecture for implementing a loop for advanced data structure of arrays in CLIPS. An attempt is made to use multi-field variables in such an architecture to process a set of data during the decision making cycle. Also, current limitations on the expert system shells are discussed in brief in this paper. The resulting architecture is designed to circumvent the current limitations set by the expert system shell and also by the operating environment. Such advanced data structures are needed for tightly coupling symbolic and numeric computation modules.
Dual-scale topology optoelectronic processor.
Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H
1991-12-15
The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.
Examining the architecture of cellular computing through a comparative study with a computer
Wang, Degeng; Gribskov, Michael
2005-01-01
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software–hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's ‘hardware’ equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the ‘bandwidth’ of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed. PMID:16849179
Examining the architecture of cellular computing through a comparative study with a computer.
Wang, Degeng; Gribskov, Michael
2005-06-22
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.
Integrated command, control communication and computation system study
NASA Technical Reports Server (NTRS)
1981-01-01
The study was conducted in three phases: a functional requirements phase; a functional architecture phase; and a design plan phase. The major emphasis was on the functional architecture phase and the approaches used for its functional hierarchy, operations concept, and interfaces.
High-performance image processing architecture
NASA Astrophysics Data System (ADS)
Coffield, Patrick C.
1992-04-01
The proposed architecture is a logical design specifically for image processing and other related computations. The design is a hybrid electro-optical concept consisting of three tightly coupled components: a spatial configuration processor (the optical analog portion), a weighting processor (digital), and an accumulation processor (digital). The systolic flow of data and image processing operations are directed by a control buffer and pipelined to each of the three processing components. The image processing operations are defined by an image algebra developed by the University of Florida. The algebra is capable of describing all common image-to-image transformations. The merit of this architectural design is how elegantly it handles the natural decomposition of algebraic functions into spatially distributed, point-wise operations. The effect of this particular decomposition allows convolution type operations to be computed strictly as a function of the number of elements in the template (mask, filter, etc.) instead of the number of picture elements in the image. Thus, a substantial increase in throughput is realized. The logical architecture may take any number of physical forms. While a hybrid electro-optical implementation is of primary interest, the benefits and design issues of an all digital implementation are also discussed. The potential utility of this architectural design lies in its ability to control all the arithmetic and logic operations of the image algebra's generalized matrix product. This is the most powerful fundamental formulation in the algebra, thus allowing a wide range of applications.
Guidance and Control System for an Autonomous Vehicle
1990-06-01
implementing an appropriate computer architecture in support of these goals is also discussed and detailed, along with the choice of associated computer hardware and real - time operating system software. (rh)
Analysis of Disaster Preparedness Planning Measures in DoD Computer Facilities
1993-09-01
city, stae, aod ZP code) 10 Source of Funding Numbers SProgram Element No lProject No ITask No lWork Unit Accesion I 11 Title include security...Computer Disaster Recovery .... 13 a. PC and LAN Lessons Learned . . ..... 13 2. Distributed Architectures . . . .. . 14 3. Backups...amount of expense, but no client problems." (Leeke, 1993, p. 8) 2. Distributed Architectures The majority of operations that were disrupted by the
Manyscale Computing for Sensor Processing in Support of Space Situational Awareness
NASA Astrophysics Data System (ADS)
Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.
2014-09-01
Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Follen, Gregory J. (Technical Monitor); Radenski, Atanas
2003-01-01
The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer to Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing dual -service architecture for P2P high-performance divide-and conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide and conquer computations. The service engine is intended to provide free useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor. Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end Internet nodes. Our project is focused on a generic divide and conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever changing pool of lower-end Internet nodes.
Cloud Computing in Support of Synchronized Disaster Response Operations
2010-09-01
scalable, Web application based on cloud computing technologies to facilitate communication between a broad range of public and private entities without...requiring them to compromise security or competitive advantage. The proposed design applies the unique benefits of cloud computing architectures such as
Digital optical interconnects for photonic computing
NASA Astrophysics Data System (ADS)
Guilfoyle, Peter S.; Stone, Richard V.; Zeise, Frederick F.
1994-05-01
A 32-bit digital optical computer (DOC II) has been implemented in hardware utilizing 8,192 free-space optical interconnects. The architecture exploits parallel interconnect technology by implementing microcode at the primitive level. A burst mode of 0.8192 X 1012 binary operations per sec has been reliably demonstrated. The prototype has been successful in demonstrating general purpose computation. In addition to emulating the RISC instruction set within the UNIX operating environment, relational database text search operations have been implemented on DOC II.
Simulation system architecture design for generic communications link
NASA Technical Reports Server (NTRS)
Tsang, Chit-Sang; Ratliff, Jim
1986-01-01
This paper addresses a computer simulation system architecture design for generic digital communications systems. It addresses the issues of an overall system architecture in order to achieve a user-friendly, efficient, and yet easily implementable simulation system. The system block diagram and its individual functional components are described in detail. Software implementation is discussed with the VAX/VMS operating system used as a target environment.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-10
... design feature associated with the architecture and connectivity capabilities of the airplanes' computer... vulnerabilities to the airplanes' systems. The proposed network architecture includes the following connectivity.... Operator business and administrative support systems, and 3. Passenger entertainment systems, and access by...
NASA Technical Reports Server (NTRS)
Rickard, D. A.; Bodenheimer, R. E.
1976-01-01
Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.
FPGA-based real-time phase measuring profilometry algorithm design and implementation
NASA Astrophysics Data System (ADS)
Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng
2016-11-01
Phase measuring profilometry (PMP) has been widely used in many fields, like Computer Aided Verification (CAV), Flexible Manufacturing System (FMS) et al. High frame-rate (HFR) real-time vision-based feedback control will be a common demands in near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limit the efficiency of data processing. FPGA has the advantages of pipeline architecture and parallel execution, and it fit for handling PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of hardware architecture includes rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy was analyzed.
36 CFR 1194.26 - Desktop and portable computers.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 36 Parks, Forests, and Public Property 3 2014-07-01 2014-07-01 false Desktop and portable computers. 1194.26 Section 1194.26 Parks, Forests, and Public Property ARCHITECTURAL AND TRANSPORTATION... § 1194.26 Desktop and portable computers. (a) All mechanically operated controls and keys shall comply...
36 CFR 1194.26 - Desktop and portable computers.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 36 Parks, Forests, and Public Property 3 2012-07-01 2012-07-01 false Desktop and portable computers. 1194.26 Section 1194.26 Parks, Forests, and Public Property ARCHITECTURAL AND TRANSPORTATION... § 1194.26 Desktop and portable computers. (a) All mechanically operated controls and keys shall comply...
36 CFR 1194.26 - Desktop and portable computers.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 36 Parks, Forests, and Public Property 3 2011-07-01 2011-07-01 false Desktop and portable computers. 1194.26 Section 1194.26 Parks, Forests, and Public Property ARCHITECTURAL AND TRANSPORTATION... § 1194.26 Desktop and portable computers. (a) All mechanically operated controls and keys shall comply...
36 CFR 1194.26 - Desktop and portable computers.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 36 Parks, Forests, and Public Property 3 2010-07-01 2010-07-01 false Desktop and portable computers. 1194.26 Section 1194.26 Parks, Forests, and Public Property ARCHITECTURAL AND TRANSPORTATION... § 1194.26 Desktop and portable computers. (a) All mechanically operated controls and keys shall comply...
Performances of multiprocessor multidisk architectures for continuous media storage
NASA Astrophysics Data System (ADS)
Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.
1996-03-01
Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400Mbytes/s) and that an architecture with addressable local memories located closely to their respective processors could partially remove this bottleneck. The point- to-point architecture is scalable and able to sustain high throughputs for simultaneous compute- bound and data-bound operations.
Petri net model for analysis of concurrently processed complex algorithms
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1986-01-01
This paper presents a Petri-net model suitable for analyzing the concurrent processing of computationally complex algorithms. The decomposed operations are to be processed in a multiple processor, data driven architecture. Of particular interest is the application of the model to both the description of the data/control flow of a particular algorithm, and to the general specification of the data driven architecture. A candidate architecture is also presented.
Gigaflop architecture, a hardware perspective
NASA Technical Reports Server (NTRS)
Feierbach, G. F.
1978-01-01
Any super computer built in the early 1980s will use components that are available by fall 1978. The architecture of such a system cannot depart radically from current super computers if the software experience painfully acquired from these computers in the 70's is to apply. Given the above constraints, 10 billion floating point operations per second (BFLOPS) are attainable and a problem memory of 512 million (64 bit) words could be supported by the technology of the time. In contrast to this, industry is likely to respond with commercially available machines with a performance of less than 150 MFLOPS. This is due to self-imposed constraints on the manufacturers to provide upward compatible architectures (same instruction set) and systems which can be sold in significant volumes. Since this computing speed is inadequate to meet the demands of computational fluid dynamics, a special processor is required. Issues which are felt to be significant in the pursuit of maximum compute capability in this special processor are discussed.
NASA Astrophysics Data System (ADS)
Zhang, Yunju; Chen, Zhongyi; Guo, Ming; Lin, Shunsheng; Yan, Yinyang
2018-01-01
With the large capacity of the power system, the development trend of the large unit and the high voltage, the scheduling operation is becoming more frequent and complicated, and the probability of operation error increases. This paper aims at the problem of the lack of anti-error function, single scheduling function and low working efficiency for technical support system in regional regulation and integration, the integrated construction of the error prevention of the integrated architecture of the system of dispatching anti - error of dispatching anti - error of power network based on cloud computing has been proposed. Integrated system of error prevention of Energy Management System, EMS, and Operation Management System, OMS have been constructed either. The system architecture has good scalability and adaptability, which can improve the computational efficiency, reduce the cost of system operation and maintenance, enhance the ability of regional regulation and anti-error checking with broad development prospects.
NASA Technical Reports Server (NTRS)
Montag, Bruce C.; Bishop, Alfred M.; Redfield, Joe B.
1989-01-01
The findings of a preliminary investigation by Southwest Research Institute (SwRI) in simulation host computer concepts is presented. It is designed to aid NASA in evaluating simulation technologies for use in spaceflight training. The focus of the investigation is on the next generation of space simulation systems that will be utilized in training personnel for Space Station Freedom operations. SwRI concludes that NASA should pursue a distributed simulation host computer system architecture for the Space Station Training Facility (SSTF) rather than a centralized mainframe based arrangement. A distributed system offers many advantages and is seen by SwRI as the only architecture that will allow NASA to achieve established functional goals and operational objectives over the life of the Space Station Freedom program. Several distributed, parallel computing systems are available today that offer real-time capabilities for time critical, man-in-the-loop simulation. These systems are flexible in terms of connectivity and configurability, and are easily scaled to meet increasing demands for more computing power.
Hybrid parallel computing architecture for multiview phase shifting
NASA Astrophysics Data System (ADS)
Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun
2014-11-01
The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphic processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented in GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained paralleling computing. Experimental results verify that the developed system can perform 50 fps (frame per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the performance of the proposed technique using a NVIDIA GT560Ti graphics card rather than a sequential C in a 3.4 GHZ Inter Core i7 3770.
The science of computing - Parallel computation
NASA Technical Reports Server (NTRS)
Denning, P. J.
1985-01-01
Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concommitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.
FFT Computation with Systolic Arrays, A New Architecture
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The use of the Cooley-Tukey algorithm for computing the l-d FFT lends itself to a particular matrix factorization which suggests direct implementation by linearly-connected systolic arrays. Here we present a new systolic architecture that embodies this algorithm. This implementation requires a smaller number of processors and a smaller number of memory cells than other recent implementations, as well as having all the advantages of systolic arrays. For the implementation of the decimation-in-frequency case, word-serial data input allows continuous real-time operation without the need of a serial-to-parallel conversion device. No control or data stream switching is necessary. Computer simulation of this architecture was done in the context of a 1024 point DFT with a fixed point processor, and CMOS processor implementation has started.
Associative architecture for image processing
NASA Astrophysics Data System (ADS)
Adar, Rutie; Akerib, Avidan
1997-09-01
This article presents a new generation in parallel processing architecture for real-time image processing. The approach is implemented in a real time image processor chip, called the XiumTM-2, based on combining a fully associative array which provides the parallel engine with a serial RISC core on the same die. The architecture is fully programmable and can be programmed to implement a wide range of color image processing, computer vision and media processing functions in real time. The associative part of the chip is based on patented pending methodology of Associative Computing Ltd. (ACL), which condenses 2048 associative processors, each of 128 'intelligent' bits. Each bit can be a processing bit or a memory bit. At only 33 MHz and 0.6 micron manufacturing technology process, the chip has a computational power of 3 billion ALU operations per second and 66 billion string search operations per second. The fully programmable nature of the XiumTM-2 chip enables developers to use ACL tools to write their own proprietary algorithms combined with existing image processing and analysis functions from ACL's extended set of libraries.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode
NASA Technical Reports Server (NTRS)
Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William
1986-01-01
The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
Integrating Software Modules For Robot Control
NASA Technical Reports Server (NTRS)
Volpe, Richard A.; Khosla, Pradeep; Stewart, David B.
1993-01-01
Reconfigurable, sensor-based control system uses state variables in systematic integration of reusable control modules. Designed for open-architecture hardware including many general-purpose microprocessors, each having own local memory plus access to global shared memory. Implemented in software as extension of Chimera II real-time operating system. Provides transparent computing mechanism for intertask communication between control modules and generic process-module architecture for multiprocessor realtime computation. Used to control robot arm. Proves useful in variety of other control and robotic applications.
Evaluation of fault-tolerant parallel-processor architectures over long space missions
NASA Technical Reports Server (NTRS)
Johnson, Sally C.
1989-01-01
The impact of a five year space mission environment on fault-tolerant parallel processor architectures is examined. The target application is a Strategic Defense Initiative (SDI) satellite requiring 256 parallel processors to provide the computation throughput. The reliability requirements are that the system still be operational after five years with .99 probability and that the probability of system failure during one-half hour of full operation be less than 10(-7). The fault tolerance features an architecture must possess to meet these reliability requirements are presented, many potential architectures are briefly evaluated, and one candidate architecture, the Charles Stark Draper Laboratory's Fault-Tolerant Parallel Processor (FTPP) is evaluated in detail. A methodology for designing a preliminary system configuration to meet the reliability and performance requirements of the mission is then presented and demonstrated by designing an FTPP configuration.
Benchmarking hardware architecture candidates for the NFIRAOS real-time controller
NASA Astrophysics Data System (ADS)
Smith, Malcolm; Kerley, Dan; Herriot, Glen; Véran, Jean-Pierre
2014-07-01
As a part of the trade study for the Narrow Field Infrared Adaptive Optics System, the adaptive optics system for the Thirty Meter Telescope, we investigated the feasibility of performing real-time control computation using a Linux operating system and Intel Xeon E5 CPUs. We also investigated a Xeon Phi based architecture which allows higher levels of parallelism. This paper summarizes both the CPU based real-time controller architecture and the Xeon Phi based RTC. The Intel Xeon E5 CPU solution meets the requirements and performs the computation for one AO cycle in an average of 767 microseconds. The Xeon Phi solution did not meet the 1200 microsecond time requirement and also suffered from unpredictable execution times. More detailed benchmark results are reported for both architectures.
OFMspert: An architecture for an operator's associate that evolves to an intelligent tutor
NASA Technical Reports Server (NTRS)
Mitchell, Christine M.
1991-01-01
With the emergence of new technology for both human-computer interaction and knowledge-based systems, a range of opportunities exist which enhance the effectiveness and efficiency of controllers of high-risk engineering systems. The design of an architecture for an operator's associate is described. This associate is a stand-alone model-based system designed to interact with operators of complex dynamic systems, such as airplanes, manned space systems, and satellite ground control systems in ways comparable to that of a human assistant. The operator function model expert system (OFMspert) architecture and the design and empirical validation of OFMspert's understanding component are described. The design and validation of OFMspert's interactive and control components are also described. A description of current work in which OFMspert provides the foundation in the development of an intelligent tutor that evolves to an assistant, as operator expertise evolves from novice to expert, is provided.
NASA Technical Reports Server (NTRS)
Harper, Richard E.; Babikyan, Carol A.; Butler, Bryan P.; Clasen, Robert J.; Harris, Chris H.; Lala, Jaynarayan H.; Masotto, Thomas K.; Nagle, Gail A.; Prizant, Mark J.; Treadwell, Steven
1994-01-01
The Army Avionics Research and Development Activity (AVRADA) is pursuing programs that would enable effective and efficient management of large amounts of situational data that occurs during tactical rotorcraft missions. The Computer Aided Low Altitude Night Helicopter Flight Program has identified automated Terrain Following/Terrain Avoidance, Nap of the Earth (TF/TA, NOE) operation as key enabling technology for advanced tactical rotorcraft to enhance mission survivability and mission effectiveness. The processing of critical information at low altitudes with short reaction times is life-critical and mission-critical necessitating an ultra-reliable/high throughput computing platform for dependable service for flight control, fusion of sensor data, route planning, near-field/far-field navigation, and obstacle avoidance operations. To address these needs the Army Fault Tolerant Architecture (AFTA) is being designed and developed. This computer system is based upon the Fault Tolerant Parallel Processor (FTPP) developed by Charles Stark Draper Labs (CSDL). AFTA is hard real-time, Byzantine, fault-tolerant parallel processor which is programmed in the ADA language. This document describes the results of the Detailed Design (Phase 2 and 3 of a 3-year project) of the AFTA development. This document contains detailed descriptions of the program objectives, the TF/TA NOE application requirements, architecture, hardware design, operating systems design, systems performance measurements and analytical models.
A novel VLSI processor architecture for supercomputing arrays
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Pattabiraman, S.; Devanathan, R.; Ahmed, Ashaf; Venkataraman, S.; Ganesh, N.
1993-01-01
Design of the processor element for general purpose massively parallel supercomputing arrays is highly complex and cost ineffective. To overcome this, the architecture and organization of the functional units of the processor element should be such as to suit the diverse computational structures and simplify mapping of complex communication structures of different classes of algorithms. This demands that the computation and communication structures of different class of algorithms be unified. While unifying the different communication structures is a difficult process, analysis of a wide class of algorithms reveals that their computation structures can be expressed in terms of basic IP,IP,OP,CM,R,SM, and MAA operations. The execution of these operations is unified on the PAcube macro-cell array. Based on this PAcube macro-cell array, we present a novel processor element called the GIPOP processor, which has dedicated functional units to perform the above operations. The architecture and organization of these functional units are such to satisfy the two important criteria mentioned above. The structure of the macro-cell and the unification process has led to a very regular and simpler design of the GIPOP processor. The production cost of the GIPOP processor is drastically reduced as it is designed on high performance mask programmable PAcube arrays.
Kepler Science Operations Center Architecture
NASA Technical Reports Server (NTRS)
Middour, Christopher; Klaus, Todd; Jenkins, Jon; Pletcher, David; Cote, Miles; Chandrasekaran, Hema; Wohler, Bill; Girouard, Forrest; Gunter, Jay P.; Uddin, Kamal;
2010-01-01
We give an overview of the operational concepts and architecture of the Kepler Science Data Pipeline. Designed, developed, operated, and maintained by the Science Operations Center (SOC) at NASA Ames Research Center, the Kepler Science Data Pipeline is central element of the Kepler Ground Data System. The SOC charter is to analyze stellar photometric data from the Kepler spacecraft and report results to the Kepler Science Office for further analysis. We describe how this is accomplished via the Kepler Science Data Pipeline, including the hardware infrastructure, scientific algorithms, and operational procedures. The SOC consists of an office at Ames Research Center, software development and operations departments, and a data center that hosts the computers required to perform data analysis. We discuss the high-performance, parallel computing software modules of the Kepler Science Data Pipeline that perform transit photometry, pixel-level calibration, systematic error-correction, attitude determination, stellar target management, and instrument characterization. We explain how data processing environments are divided to support operational processing and test needs. We explain the operational timelines for data processing and the data constructs that flow into the Kepler Science Data Pipeline.
Design of a real-time wind turbine simulator using a custom parallel architecture
NASA Technical Reports Server (NTRS)
Hoffman, John A.; Gluck, R.; Sridhar, S.
1995-01-01
The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named: the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is very much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CU's) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CU's are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU causes several tasks to be done in each cycle, including an IO operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CU's interfaced to each other and to other portions of the simulation using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/ output requirements. CU's can be added in any number to share a given computational load. This flexible bus feature is very different from many other parallel processors which usually have a throughput limit because of rigid bus architecture.
NASA Technical Reports Server (NTRS)
LaValley, Brian W.; Little, Phillip D.; Walter, Chris J.
2011-01-01
This report documents the capabilities of the EDICT tools for error modeling and error propagation analysis when operating with models defined in the Architecture Analysis & Design Language (AADL). We discuss our experience using the EDICT error analysis capabilities on a model of the Scalable Processor-Independent Design for Enhanced Reliability (SPIDER) architecture that uses the Reliable Optical Bus (ROBUS). Based on these experiences we draw some initial conclusions about model based design techniques for error modeling and analysis of highly reliable computing architectures.
Architecture for hospital information integration
NASA Astrophysics Data System (ADS)
Chimiak, William J.; Janariz, Daniel L.; Martinez, Ralph
1999-07-01
The ongoing integration of hospital information systems (HIS) continues. Data storage systems, data networks and computers improve, data bases grow and health-care applications increase. Some computer operating systems continue to evolve and some fade. Health care delivery now depends on this computer-assisted environment. The result is the critical harmonization of the various hospital information systems becomes increasingly difficult. The purpose of this paper is to present an architecture for HIS integration that is computer-language-neutral and computer- hardware-neutral for the informatics applications. The proposed architecture builds upon the work done at the University of Arizona on middleware, the work of the National Electrical Manufacturers Association, and the American College of Radiology. It is a fresh approach to allowing applications engineers to access medical data easily and thus concentrates on the application techniques in which they are expert without struggling with medical information syntaxes. The HIS can be modeled using a hierarchy of information sub-systems thus facilitating its understanding. The architecture includes the resulting information model along with a strict but intuitive application programming interface, managed by CORBA. The CORBA requirement facilitates interoperability. It should also reduce software and hardware development times.
36 CFR § 1194.26 - Desktop and portable computers.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 36 Parks, Forests, and Public Property 3 2013-07-01 2012-07-01 true Desktop and portable computers. § 1194.26 Section § 1194.26 Parks, Forests, and Public Property ARCHITECTURAL AND TRANSPORTATION... § 1194.26 Desktop and portable computers. (a) All mechanically operated controls and keys shall comply...
PCs: Key to the Future. Business Center Provides Sound Skills and Good Attitudes.
ERIC Educational Resources Information Center
Pay, Renee W.
1991-01-01
The Advanced Computing/Management Training Program at Jordan Technical Center (Sandy, Utah) simulates an automated office to teach five sets of skills: computer architecture and operating systems, word processing, data processing, communications skills, and management principles. (SK)
Random noise effects in pulse-mode digital multilayer neural networks.
Kim, Y C; Shanblatt, M A
1995-01-01
A pulse-mode digital multilayer neural network (DMNN) based on stochastic computing techniques is implemented with simple logic gates as basic computing elements. The pulse-mode signal representation and the use of simple logic gates for neural operations lead to a massively parallel yet compact and flexible network architecture, well suited for VLSI implementation. Algebraic neural operations are replaced by stochastic processes using pseudorandom pulse sequences. The distributions of the results from the stochastic processes are approximated using the hypergeometric distribution. Synaptic weights and neuron states are represented as probabilities and estimated as average pulse occurrence rates in corresponding pulse sequences. A statistical model of the noise (error) is developed to estimate the relative accuracy associated with stochastic computing in terms of mean and variance. Computational differences are then explained by comparison to deterministic neural computations. DMNN feedforward architectures are modeled in VHDL using character recognition problems as testbeds. Computational accuracy is analyzed, and the results of the statistical model are compared with the actual simulation results. Experiments show that the calculations performed in the DMNN are more accurate than those anticipated when Bernoulli sequences are assumed, as is common in the literature. Furthermore, the statistical model successfully predicts the accuracy of the operations performed in the DMNN.
Analyzing Resiliency of the Smart Grid Communication Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anas AlMajali, Anas; Viswanathan, Arun; Neuman, Clifford
Smart grids are susceptible to cyber-attack as a result of new communication, control and computation techniques employed in the grid. In this paper, we characterize and analyze the resiliency of smart grid communication architecture, specifically an RF mesh based architecture, under cyber attacks. We analyze the resiliency of the communication architecture by studying the performance of high-level smart grid functions such as metering, and demand response which depend on communication. Disrupting the operation of these functions impacts the operational resiliency of the smart grid. Our analysis shows that it takes an attacker only a small fraction of meters to compromisemore » the communication resiliency of the smart grid. We discuss the implications of our result to critical smart grid functions and to the overall security of the smart grid.« less
Automatic control of a negative ion source
NASA Astrophysics Data System (ADS)
Saadatmand, K.; Sredniawski, J.; Solensten, L.
1989-04-01
A CAMAC based control architecture is devised for a Berkeley-type H - volume ion source [1]. The architecture employs three 80386 TM PCs. One PC is dedicated to control and monitoring of source operation. The other PC functions with digitizers to provide data acquisition of waveforms. The third PC is used for off-line analysis. Initially, operation of the source was put under remote computer control (supervisory). This was followed by development of an automated startup procedure. Finally, a study of the physics of operation is now underway to establish a data base from which automatic beam optimization can be derived.
NASA Astrophysics Data System (ADS)
Marukame, Takao; Nishi, Yoshifumi; Yasuda, Shin-ichi; Tanamoto, Tetsufumi
2018-04-01
The use of memristive devices for creating artificial neurons is promising for brain-inspired computing from the viewpoints of computation architecture and learning protocol. We present an energy-efficient multiplier accumulator based on a memristive array architecture incorporating both analog and digital circuitries. The analog circuitry is used to full advantage for neural networks, as demonstrated by the spike-timing-dependent plasticity (STDP) in fabricated AlO x /TiO x -based metal-oxide memristive devices. STDP protocols for controlling periodic analog resistance with long-range stability were experimentally verified using a variety of voltage amplitudes and spike timings.
A design framework for teleoperators with kinesthetic feedback
NASA Technical Reports Server (NTRS)
Hannaford, Blake
1989-01-01
The application of a hybrid two-port model to teleoperators with force and velocity sensing at the master and slave is presented. The interfaces between human operator and master, and between environment and slave, are ports through which the teleoperator is designed to exchange energy between the operator and the environment. By computing or measuring the input-output properties of this two-port network, the hybrid two-port model of an actual or simulated teleoperator system can be obtained. It is shown that the hybrid model (as opposed to other two-port forms) leads to an intuitive representation of ideal teleoperator performace and applies to several teleoperator architectures. Thus measured values of the h matrix or values computed from a simulation can be used to compare performance with th ideal. The frequency-dependent h matrix is computed from a detailed SPICE model of an actual system, and the method is applied to a proposed architecture.
Real-time FPGA architectures for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2000-03-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low level image processing. The FPGA-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on a dedicated VLSI to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real time performance are discussed. Some results are presented and discussed.
MultiLIS: A Description of the System Design and Operational Features.
ERIC Educational Resources Information Center
Kelly, Glen J.; And Others
1988-01-01
Describes development, hardware requirements, and features of the MultiLIS integrated library software package. A system profile provides pricing information, operational characteristics, and technical specifications. Sidebars discuss MultiLIS integration structure, incremental architecture, and NCR Tower Computers. (4 references) (MES)
Spatial operator factorization and inversion of the manipulator mass matrix
NASA Technical Reports Server (NTRS)
Rodriguez, Guillermo; Kreutz-Delgado, Kenneth
1992-01-01
This paper advances two linear operator factorizations of the manipulator mass matrix. Embedded in the factorizations are many of the techniques that are regarded as very efficient computational solutions to inverse and forward dynamics problems. The operator factorizations provide a high-level architectural understanding of the mass matrix and its inverse, which is not visible in the detailed algorithms. They also lead to a new approach to the development of computer programs or organize complexity in robot dynamics.
NASA Technical Reports Server (NTRS)
Stevens, H. D.; Miles, E. S.; Rock, S. J.; Cannon, R. H.
1994-01-01
Expanding man's presence in space requires capable, dexterous robots capable of being controlled from the Earth. Traditional 'hand-in-glove' control paradigms require the human operator to directly control virtually every aspect of the robot's operation. While the human provides excellent judgment and perception, human interaction is limited by low bandwidth, delayed communications. These delays make 'hand-in-glove' operation from Earth impractical. In order to alleviate many of the problems inherent to remote operation, Stanford University's Aerospace Robotics Laboratory (ARL) has developed the Object-Based Task-Level Control architecture. Object-Based Task-Level Control (OBTLC) removes the burden of teleoperation from the human operator and enables execution of tasks not possible with current techniques. OBTLC is a hierarchical approach to control where the human operator is able to specify high-level, object-related tasks through an intuitive graphical user interface. Infrequent task-level command replace constant joystick operations, eliminating communications bandwidth and time delay problems. The details of robot control and task execution are handled entirely by the robot and computer control system. The ARL has implemented the OBTLC architecture on a set of Free-Flying Space Robots. The capability of the OBTLC architecture has been demonstrated by controlling the ARL Free-Flying Space Robots from NASA Ames Research Center.
Fast semivariogram computation using FPGA architectures
NASA Astrophysics Data System (ADS)
Lagadapati, Yamuna; Shirvaikar, Mukul; Dong, Xuanliang
2015-02-01
The semivariogram is a statistical measure of the spatial distribution of data and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. The semivariogram is a plot of semivariances for different lag distances between pixels. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O(n2). Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz, but they can perform tens of thousands of calculations per clock cycle while operating in the low range of power. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. The design consists of several modules dedicated to the constituent computational tasks. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. Anisotropic semivariogram implementation is anticipated to be an extension of the current architecture, ostensibly based on refinements to the current modules. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from MRI scans are utilized for the experiments. Computational speedup is measured with respect to Matlab implementation on a personal computer with an Intel i7 multi-core processor. Preliminary simulation results indicate that a significant advantage in speed can be attained by the architectures, making the algorithm viable for implementation in medical devices
ASIC-based architecture for the real-time computation of 2D convolution with large kernel size
NASA Astrophysics Data System (ADS)
Shao, Rui; Zhong, Sheng; Yan, Luxin
2015-12-01
Bidimensional convolution is a low-level processing algorithm of interest in many areas, but its high computational cost constrains the size of the kernels, especially in real-time embedded systems. This paper presents a hardware architecture for the ASIC-based implementation of 2-D convolution with medium-large kernels. Aiming to improve the efficiency of storage resources on-chip, reducing off-chip bandwidth of these two issues, proposed construction of a data cache reuse. Multi-block SPRAM to cross cached images and the on-chip ping-pong operation takes full advantage of the data convolution calculation reuse, design a new ASIC data scheduling scheme and overall architecture. Experimental results show that the structure can achieve 40× 32 size of template real-time convolution operations, and improve the utilization of on-chip memory bandwidth and on-chip memory resources, the experimental results show that the structure satisfies the conditions to maximize data throughput output , reducing the need for off-chip memory bandwidth.
A vectorization of the Hess McDonnell Douglas potential flow program NUED for the STAR-100 computer
NASA Technical Reports Server (NTRS)
Boney, L. R.; Smith, R. E., Jr.
1979-01-01
The computer program NUED for analyzing potential flow about arbitrary three dimensional lifting bodies using the panel method was modified to use vector operations and run on the STAR-100 computer. A high speed of computation and ability to approximate the body surface with a large number of panels are characteristics of NUEDV. The new program shows that vector operations can be readily implemented in programs of this type to increase the computational speed on the STAR-100 computer. The virtual memory architecture of the STAR-100 facilitates the use of large numbers of panels to approximate the body surface.
Naval Open Architecture Machinery Control Systems for Next Generation Integrated Power Systems
2012-05-01
PORTABLE) OS / RTOS ADAPTATION MIDDLEWARE (FOR OS PORTABILITY) MACHINERY CONTROLLER FRAMEWORK MACHINERY CONTROL SYSTEM SERVICES POWER CONTROL SYSTEM...SERVICES SHIP SYSTEM SERVICES TTY 0 TTY N … OPERATING SYSTEM ( OS / RTOS ) COMPUTER HARDWARE UDP IP TCP RAW DEV 0 DEV N … POWER MANAGEMENT CONTROLLER...operating systems (DOS, Windows, Linux, OS /2, QNX, SCO Unix ...) COMPUTERS: ISA compatible motherboards, workstations and portables (Compaq, Dell
RICIS Symposium 1992: Mission and Safety Critical Systems Research and Applications
NASA Technical Reports Server (NTRS)
1992-01-01
This conference deals with computer systems which control systems whose failure to operate correctly could produce the loss of life and or property, mission and safety critical systems. Topics covered are: the work of standards groups, computer systems design and architecture, software reliability, process control systems, knowledge based expert systems, and computer and telecommunication protocols.
Software for Collaborative Use of Large Interactive Displays
NASA Technical Reports Server (NTRS)
Trimble, Jay; Shab, Thodore; Wales, Roxana; Vera, Alonso; Tollinger, Irene; McCurdy, Michael; Lyubimov, Dmitriy
2006-01-01
The MERBoard Collaborative Workspace, which is currently being deployed to support the Mars Exploration Rover (MER) Missions, is the first instantiation of a new computing architecture designed to support collaborative and group computing using computing devices situated in NASA mission operations room. It is a software system for generation of large-screen interactive displays by multiple users
A support architecture for reliable distributed computing systems
NASA Technical Reports Server (NTRS)
Dasgupta, Partha; Leblanc, Richard J., Jr.
1988-01-01
The Clouds project is well underway to its goal of building a unified distributed operating system supporting the object model. The operating system design uses the object concept of structuring software at all levels of the system. The basic operating system was developed and work is under progress to build a usable system.
Laboratory on legs: an architecture for adjustable morphology with legged robots
NASA Astrophysics Data System (ADS)
Haynes, G. Clark; Pusey, Jason; Knopf, Ryan; Johnson, Aaron M.; Koditschek, Daniel E.
2012-06-01
For mobile robots, the essential units of actuation, computation, and sensing must be designed to fit within the body of the robot. Additional capabilities will largely depend upon a given activity, and should be easily reconfigurable to maximize the diversity of applications and experiments. To address this issue, we introduce a modular architecture originally developed and tested in the design and implementation of the X-RHex hexapod that allows the robot to operate as a mobile laboratory on legs. In the present paper we will introduce the specification, design and very earliest operational data of Canid, an actively driven compliant-spined quadruped whose completely different morphology and intended dynamical operating point are nevertheless built around exactly the same "Lab on Legs" actuation, computation, and sensing infrastructure. We will review as well, more briefly a second RHex variation, the XRL platform, built using the same components.
One-Click Data Analysis Software for Science Operations
NASA Astrophysics Data System (ADS)
Navarro, Vicente
2015-12-01
One of the important activities of ESA Science Operations Centre is to provide Data Analysis Software (DAS) to enable users and scientists to process data further to higher levels. During operations and post-operations, Data Analysis Software (DAS) is fully maintained and updated for new OS and library releases. Nonetheless, once a Mission goes into the "legacy" phase, there are very limited funds and long-term preservation becomes more and more difficult. Building on Virtual Machine (VM), Cloud computing and Software as a Service (SaaS) technologies, this project has aimed at providing long-term preservation of Data Analysis Software for the following missions: - PIA for ISO (1995) - SAS for XMM-Newton (1999) - Hipe for Herschel (2009) - EXIA for EXOSAT (1983) Following goals have guided the architecture: - Support for all operations, post-operations and archive/legacy phases. - Support for local (user's computer) and cloud environments (ESAC-Cloud, Amazon - AWS). - Support for expert users, requiring full capabilities. - Provision of a simple web-based interface. This talk describes the architecture, challenges, results and lessons learnt gathered in this project.
A High Performance VLSI Computer Architecture For Computer Graphics
NASA Astrophysics Data System (ADS)
Chin, Chi-Yuan; Lin, Wen-Tai
1988-10-01
A VLSI computer architecture, consisting of multiple processors, is presented in this paper to satisfy the modern computer graphics demands, e.g. high resolution, realistic animation, real-time display etc.. All processors share a global memory which are partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcasted to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigurated to satisfy specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection, can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.
DOT National Transportation Integrated Search
2002-04-01
The Logical Architecture is based on a Computer Aided Systems Engineering (CASE) model of the requirements for the flow of data and control through the various functions included in Intelligent Transportation Systems (ITS). Data Dictionary is the com...
Efficient fuzzy C-means architecture for image segmentation.
Li, Hui-Ya; Hwang, Wen-Jyi; Chang, Chia-Yen
2011-01-01
This paper presents a novel VLSI architecture for image segmentation. The architecture is based on the fuzzy c-means algorithm with spatial constraint for reducing the misclassification rate. In the architecture, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. In addition, an efficient pipelined circuit is used for the updating process for accelerating the computational speed. Experimental results show that the the proposed circuit is an effective alternative for real-time image segmentation with low area cost and low misclassification rate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Hsien-Hsin S
The overall objective of this research project is to develop novel architectural techniques as well as system software to achieve a highly secure and intrusion-tolerant computing system. Such system will be autonomous, self-adapting, introspective, with self-healing capability under the circumstances of improper operations, abnormal workloads, and malicious attacks. The scope of this research includes: (1) System-wide, unified introspection techniques for autonomic systems, (2) Secure information-flow microarchitecture, (3) Memory-centric security architecture, (4) Authentication control and its implication to security, (5) Digital right management, (5) Microarchitectural denial-of-service attacks on shared resources. During the period of the project, we developed several architectural techniquesmore » and system software for achieving a robust, secure, and reliable computing system toward our goal.« less
Prospective Architectures for Onboard vs Cloud-Based Decision Making for Unmanned Aerial Systems
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Teubert, Christopher
2017-01-01
This paper investigates propsective architectures for decision-making in unmanned aerial systems. When these unmanned vehicles operate in urban environments, there are several sources of uncertainty that affect their behavior, and decision-making algorithms need to be robust to account for these different sources of uncertainty. It is important to account for several risk-factors that affect the flight of these unmanned systems, and facilitate decision-making by taking into consideration these various risk-factors. In addition, there are several technical challenges related to autonomous flight of unmanned aerial systems; these challenges include sensing, obstacle detection, path planning and navigation, trajectory generation and selection, etc. Many of these activities require significant computational power and in many situations, all of these activities need to be performed in real-time. In order to efficiently integrate these activities, it is important to develop a systematic architecture that can facilitate real-time decision-making. Four prospective architectures are discussed in this paper; on one end of the spectrum, the first architecture considers all activities/computations being performed onboard the vehicle whereas on the other end of the spectrum, the fourth and final architecture considers all activities/computations being performed in the cloud, using a new service known as Prognostics as a Service that is being developed at NASA Ames Research Center. The four different architectures are compared, their advantages and disadvantages are explained and conclusions are presented.
Design and reliability analysis of DP-3 dynamic positioning control architecture
NASA Astrophysics Data System (ADS)
Wang, Fang; Wan, Lei; Jiang, Da-Peng; Xu, Yu-Ru
2011-12-01
As the exploration and exploitation of oil and gas proliferate throughout deepwater area, the requirements on the reliability of dynamic positioning system become increasingly stringent. The control objective ensuring safety operation at deep water will not be met by a single controller for dynamic positioning. In order to increase the availability and reliability of dynamic positioning control system, the triple redundancy hardware and software control architectures were designed and developed according to the safe specifications of DP-3 classification notation for dynamically positioned ships and rigs. The hardware redundant configuration takes the form of triple-redundant hot standby configuration including three identical operator stations and three real-time control computers which connect each other through dual networks. The function of motion control and redundancy management of control computers were implemented by software on the real-time operating system VxWorks. The software realization of task loose synchronization, majority voting and fault detection were presented in details. A hierarchical software architecture was planed during the development of software, consisting of application layer, real-time layer and physical layer. The behavior of the DP-3 dynamic positioning control system was modeled by a Markov model to analyze its reliability. The effects of variation in parameters on the reliability measures were investigated. The time domain dynamic simulation was carried out on a deepwater drilling rig to prove the feasibility of the proposed control architecture.
The MasPar MP-1 As a Computer Arithmetic Laboratory
Anuta, Michael A.; Lozier, Daniel W.; Turner, Peter R.
1996-01-01
This paper is a blueprint for the use of a massively parallel SIMD computer architecture for the simulation of various forms of computer arithmetic. The particular system used is a DEC/MasPar MP-1 with 4096 processors in a square array. This architecture has many advantages for such simulations due largely to the simplicity of the individual processors. Arithmetic operations can be spread across the processor array to simulate a hardware chip. Alternatively they may be performed on individual processors to allow simulation of a massively parallel implementation of the arithmetic. Compromises between these extremes permit speed-area tradeoffs to be examined. The paper includes a description of the architecture and its features. It then summarizes some of the arithmetic systems which have been, or are to be, implemented. The implementation of the level-index and symmetric level-index, LI and SLI, systems is described in some detail. An extensive bibliography is included. PMID:27805123
ERIC Educational Resources Information Center
Mitchell, Christine M.; Govindaraj, T.
1990-01-01
Discusses the use of intelligent tutoring systems as opposed to traditional on-the-job training for training operators of complex dynamic systems and describes the computer architecture for a system for operators of a NASA (National Aeronautics and Space Administration) satellite control system. An experimental evaluation with college students is…
Mobile Computing for Aerospace Applications
NASA Technical Reports Server (NTRS)
Alena, Richard; Swietek, Gregory E. (Technical Monitor)
1994-01-01
The use of commercial computer technology in specific aerospace mission applications can reduce the cost and project cycle time required for the development of special-purpose computer systems. Additionally, the pace of technological innovation in the commercial market has made new computer capabilities available for demonstrations and flight tests. Three areas of research and development being explored by the Portable Computer Technology Project at NASA Ames Research Center are the application of commercial client/server network computing solutions to crew support and payload operations, the analysis of requirements for portable computing devices, and testing of wireless data communication links as extensions to the wired network. This paper will present computer architectural solutions to portable workstation design including the use of standard interfaces, advanced flat-panel displays and network configurations incorporating both wired and wireless transmission media. It will describe the design tradeoffs used in selecting high-performance processors and memories, interfaces for communication and peripheral control, and high resolution displays. The packaging issues for safe and reliable operation aboard spacecraft and aircraft are presented. The current status of wireless data links for portable computers is discussed from a system design perspective. An end-to-end data flow model for payload science operations from the experiment flight rack to the principal investigator is analyzed using capabilities provided by the new generation of computer products. A future flight experiment on-board the Russian MIR space station will be described in detail including system configuration and function, the characteristics of the spacecraft operating environment, the flight qualification measures needed for safety review, and the specifications of the computing devices to be used in the experiment. The software architecture chosen shall be presented. An analysis of the performance characteristics of wireless data links in the spacecraft environment will be discussed. Network performance and operation will be modeled and preliminary test results presented. A crew support application will be demonstrated in conjunction with the network metrics experiment.
Hybrid quantum computing with ancillas
NASA Astrophysics Data System (ADS)
Proctor, Timothy J.; Kendon, Viv
2016-10-01
In the quest to build a practical quantum computer, it is important to use efficient schemes for enacting the elementary quantum operations from which quantum computer programs are constructed. The opposing requirements of well-protected quantum data and fast quantum operations must be balanced to maintain the integrity of the quantum information throughout the computation. One important approach to quantum operations is to use an extra quantum system - an ancilla - to interact with the quantum data register. Ancillas can mediate interactions between separated quantum registers, and by using fresh ancillas for each quantum operation, data integrity can be preserved for longer. This review provides an overview of the basic concepts of the gate model quantum computer architecture, including the different possible forms of information encodings - from base two up to continuous variables - and a more detailed description of how the main types of ancilla-mediated quantum operations provide efficient quantum gates.
An Architectural Experience for Interface Design
ERIC Educational Resources Information Center
Gong, Susan P.
2016-01-01
The problem of human-computer interface design was brought to the foreground with the emergence of the personal computer, the increasing complexity of electronic systems, and the need to accommodate the human operator in these systems. With each new technological generation discovering the interface design problems of its own technologies, initial…
Space and Ground Trades for Human Exploration and Wearable Computing
NASA Technical Reports Server (NTRS)
Lupisella, Mark; Donohue, John; Mandl, Dan; Ly, Vuong; Graves, Corey; Heimerdinger, Dan; Studor, George; Saiz, John; DeLaune, Paul; Clancey, William
2006-01-01
Human exploration of the Moon and Mars will present unique trade study challenges as ground system elements shift to planetary bodies and perhaps eventually to the bodies of human explorers in the form of wearable computing technologies. This presentation will highlight some of the key space and ground trade issues that will face the Exploration Initiative as NASA begins designing systems for the sustained human exploration of the Moon and Mars, with an emphasis on wearable computing. We will present some preliminary test results and scenarios that demonstrate how wearable computing might affect the trade space noted below. We will first present some background on wearable computing and its utility to NASA's Exploration Initiative. Next, we will discuss three broad architectural themes, some key ground and space trade issues within those themes and how they relate to wearable computing. Lastly, we will present some preliminary test results and suggest guidance for proceeding in the assessment and creation of a value-added role for wearable computing in the Exploration Initiative. The three broad ground-space architectural trade themes we will discuss are: 1. Functional Shift and Distribution: To what extent, if any, should traditional ground system functionality be shifted to, and distributed among, the Earth, Moon/Mars, and the human. explorer? 2. Situational Awareness and Autonomy: How much situational awareness (e.g. environmental conditions, biometrics, etc.) and autonomy is required and desired, and where should these capabilities reside? 3. Functional Redundancy: What functions (e.g. command, control, analysis) should exist simultaneously on Earth, the Moon/Mars, and the human explorer? These three themes can serve as the axes of a three-dimensional trade space, within which architectural solutions reside. We will show how wearable computers can fit into this trade space and what the possible implications could be for the rest of the ground and space architecture(s). We intend this to be an example of explorer-centric thinking in a fully integrated explorer paradigm, where integrated explorer refers to a human explorer having instant access to all relevant data, knowledge of the environment, science models, health and safety-related events, and other tools and information via wearable computing technologies. The trade study approach will include involvement from the relevant stakeholders (Constellation Systems, CCCI, EVA Project Office, Astronaut office, Mission Operations, Space Life Sciences, etc.) to develop operations concepts (and/or operations scenarios) from which a basic high-level set of requirements could be extracted. This set of requirements could serve as a foundation (along with stakeholder buy-in) that would help define the trade space and assist in identifying candidate technologies for further study and evolution to higher-level technology readiness levels.
Mark 4A antenna control system data handling architecture study
NASA Technical Reports Server (NTRS)
Briggs, H. C.; Eldred, D. B.
1991-01-01
A high-level review was conducted to provide an analysis of the existing architecture used to handle data and implement control algorithms for NASA's Deep Space Network (DSN) antennas and to make system-level recommendations for improving this architecture so that the DSN antennas can support the ever-tightening requirements of the next decade and beyond. It was found that the existing system is seriously overloaded, with processor utilization approaching 100 percent. A number of factors contribute to this overloading, including dated hardware, inefficient software, and a message-passing strategy that depends on serial connections between machines. At the same time, the system has shortcomings and idiosyncrasies that require extensive human intervention. A custom operating system kernel and an obscure programming language exacerbate the problems and should be modernized. A new architecture is presented that addresses these and other issues. Key features of the new architecture include a simplified message passing hierarchy that utilizes a high-speed local area network, redesign of particular processing function algorithms, consolidation of functions, and implementation of the architecture in modern hardware and software using mainstream computer languages and operating systems. The system would also allow incremental hardware improvements as better and faster hardware for such systems becomes available, and costs could potentially be low enough that redundancy would be provided economically. Such a system could support DSN requirements for the foreseeable future, though thorough consideration must be given to hard computational requirements, porting existing software functionality to the new system, and issues of fault tolerance and recovery.
NASA Technical Reports Server (NTRS)
Jones, J. R.; Bodenheimer, R. E.
1976-01-01
A simple programmable Tse processor organization and arithmetic operations necessary for extraction of the desired topological information are described. Hardware additions to this organization are discussed along with trade-offs peculiar to the tse computing concept. An improved organization is presented along with the complementary software for the various arithmetic operations. The performance of the two organizations is compared in terms of speed, power, and cost. Software routines developed to extract the desired information from an image are included.
A cognitive computational model inspired by the immune system response.
Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim
2014-01-01
The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that is oscillating between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have been always standing out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of immune system, to figure out the nature of response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by the natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, meanwhile illustrating our proposed architecture perspective.
A Cognitive Computational Model Inspired by the Immune System Response
Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim
2014-01-01
The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that is oscillating between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have been always standing out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of immune system, to figure out the nature of response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by the natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, meanwhile illustrating our proposed architecture perspective. PMID:25003131
NASA Technical Reports Server (NTRS)
Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.
1992-01-01
Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None, None
Smart grids are susceptible to cyber-attack as a result of new communication, control and computation techniques employed in the grid. In this paper, we characterize and analyze the resiliency of smart grid communication architecture, specifically an RF mesh based architecture, under cyber attacks. We analyze the resiliency of the communication architecture by studying the performance of high-level smart grid functions such as metering, and demand response which depend on communication. Disrupting the operation of these functions impacts the operational resiliency of the smart grid. Our analysis shows that it takes an attacker only a small fraction of meters to compromisemore » the communication resiliency of the smart grid. We discuss the implications of our result to critical smart grid functions and to the overall security of the smart grid.« less
Bespoke physics for living technology.
Ackley, David H
2013-01-01
In the physics of the natural world, basic tasks of life, such as homeostasis and reproduction, are extremely complex operations, requiring the coordination of billions of atoms even in simple cases. By contrast, artificial living organisms can be implemented in computers using relatively few bits, and copying a data structure is trivial. Of course, the physical overheads of the computers themselves are huge, but since their programmability allows digital "laws of physics" to be tailored like a custom suit, deploying living technology atop an engineered computational substrate might be as or more effective than building directly on the natural laws of physics, for a substantial range of desirable purposes. This article suggests basic criteria and metrics for bespoke physics computing architectures, describes one such architecture, and offers data and illustrations of custom living technology competing to reproduce while collaborating on an externally useful computation.
Systems Architecture for Fully Autonomous Space Missions
NASA Technical Reports Server (NTRS)
Esper, Jamie; Schnurr, R.; VanSteenberg, M.; Brumfield, Mark (Technical Monitor)
2002-01-01
The NASA Goddard Space Flight Center is working to develop a revolutionary new system architecture concept in support of fully autonomous missions. As part of GSFC's contribution to the New Millenium Program (NMP) Space Technology 7 Autonomy and on-Board Processing (ST7-A) Concept Definition Study, the system incorporates the latest commercial Internet and software development ideas and extends them into NASA ground and space segment architectures. The unique challenges facing the exploration of remote and inaccessible locales and the need to incorporate corresponding autonomy technologies within reasonable cost necessitate the re-thinking of traditional mission architectures. A measure of the resiliency of this architecture in its application to a broad range of future autonomy missions will depend on its effectiveness in leveraging from commercial tools developed for the personal computer and Internet markets. Specialized test stations and supporting software come to past as spacecraft take advantage of the extensive tools and research investments of billion-dollar commercial ventures. The projected improvements of the Internet and supporting infrastructure go hand-in-hand with market pressures that provide continuity in research. By taking advantage of consumer-oriented methods and processes, space-flight missions will continue to leverage on investments tailored to provide better services at reduced cost. The application of ground and space segment architectures each based on Local Area Networks (LAN), the use of personal computer-based operating systems, and the execution of activities and operations through a Wide Area Network (Internet) enable a revolution in spacecraft mission formulation, implementation, and flight operations. Hardware and software design, development, integration, test, and flight operations are all tied-in closely to a common thread that enables the smooth transitioning between program phases. The application of commercial software development techniques lays the foundation for delivery of product-oriented flight software modules and models. Software can then be readily applied to support the on-board autonomy required for mission self-management. An on-board intelligent system, based on advanced scripting languages, facilitates the mission autonomy required to offload ground system resources, and enables the spacecraft to manage itself safely through an efficient and effective process of reactive planning, science data acquisition, synthesis, and transmission to the ground. Autonomous ground systems in turn coordinate and support schedule contact times with the spacecraft. Specific autonomy software modules on-board include mission and science planners, instrument and subsystem control, and fault tolerance response software, all residing within a distributed computing environment supported through the flight LAN. Autonomy also requires the minimization of human intervention between users on the ground and the spacecraft, and hence calls for the elimination of the traditional operations control center as a funnel for data manipulation. Basic goal-oriented commands are sent directly from the user to the spacecraft through a distributed internet-based payload operations "center". The ensuing architecture calls for the use of spacecraft as point extensions on the Internet. This paper will detail the system architecture implementation chosen to enable cost-effective autonomous missions with applicability to a broad range of conditions. It will define the structure needed for implementation of such missions, including software and hardware infrastructures. The overall architecture is then laid out as a common thread in the mission life cycle from formulation through implementation and flight operations.
Evaluation of the Intel iWarp parallel processor for space flight applications
NASA Technical Reports Server (NTRS)
Hine, Butler P., III; Fong, Terrence W.
1993-01-01
The potential of a DARPA-sponsored advanced processor, the Intel iWarp, for use in future SSF Data Management Systems (DMS) upgrades is evaluated through integration into the Ames DMS testbed and applications testing. The iWarp is a distributed, parallel computing system well suited for high performance computing applications such as matrix operations and image processing. The system architecture is modular, supports systolic and message-based computation, and is capable of providing massive computational power in a low-cost, low-power package. As a consequence, the iWarp offers significant potential for advanced space-based computing. This research seeks to determine the iWarp's suitability as a processing device for space missions. In particular, the project focuses on evaluating the ease of integrating the iWarp into the SSF DMS baseline architecture and the iWarp's ability to support computationally stressing applications representative of SSF tasks.
Computer Sciences and Data Systems, volume 1
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
A Computational Architecture for Programmable Automation Research
NASA Astrophysics Data System (ADS)
Taylor, Russell H.; Korein, James U.; Maier, Georg E.; Durfee, Lawrence F.
1987-03-01
This short paper describes recent work at the IBM T. J. Watson Research Center directed at developing a highly flexible computational architecture for research on sensor-based programmable automation. The system described here has been designed with a focus on dynamic configurability, layered user inter-faces and incorporation of sensor-based real time operations into new commands. It is these features which distinguish it from earlier work. The system is cur-rently being implemented at IBM for research purposes and internal use and is an outgrowth of programmable automation research which has been ongoing since 1972 [e.g., 1, 2, 3, 4, 5, 6] .
Real-time control system for adaptive resonator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Flath, L; An, J; Brase, J
2000-07-24
Sustained operation of high average power solid-state lasers currently requires an adaptive resonator to produce the optimal beam quality. We describe the architecture of a real-time adaptive control system for correcting intra-cavity aberrations in a heat capacity laser. Image data collected from a wavefront sensor are processed and used to control phase with a high-spatial-resolution deformable mirror. Our controller takes advantage of recent developments in low-cost, high-performance processor technology. A desktop-based computational engine and object-oriented software architecture replaces the high-cost rack-mount embedded computers of previous systems.
An Efficient VLSI Architecture for Multi-Channel Spike Sorting Using a Generalized Hebbian Algorithm
Chen, Ying-Lun; Hwang, Wen-Jyi; Ke, Chi-En
2015-01-01
A novel VLSI architecture for multi-channel online spike sorting is presented in this paper. In the architecture, the spike detection is based on nonlinear energy operator (NEO), and the feature extraction is carried out by the generalized Hebbian algorithm (GHA). To lower the power consumption and area costs of the circuits, all of the channels share the same core for spike detection and feature extraction operations. Each channel has dedicated buffers for storing the detected spikes and the principal components of that channel. The proposed circuit also contains a clock gating system supplying the clock to only the buffers of channels currently using the computation core to further reduce the power consumption. The architecture has been implemented by an application-specific integrated circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture has lower power consumption and hardware area costs for real-time multi-channel spike detection and feature extraction. PMID:26287193
Chen, Ying-Lun; Hwang, Wen-Jyi; Ke, Chi-En
2015-08-13
A novel VLSI architecture for multi-channel online spike sorting is presented in this paper. In the architecture, the spike detection is based on nonlinear energy operator (NEO), and the feature extraction is carried out by the generalized Hebbian algorithm (GHA). To lower the power consumption and area costs of the circuits, all of the channels share the same core for spike detection and feature extraction operations. Each channel has dedicated buffers for storing the detected spikes and the principal components of that channel. The proposed circuit also contains a clock gating system supplying the clock to only the buffers of channels currently using the computation core to further reduce the power consumption. The architecture has been implemented by an application-specific integrated circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture has lower power consumption and hardware area costs for real-time multi-channel spike detection and feature extraction.
The Double-System Architecture for Trusted OS
NASA Astrophysics Data System (ADS)
Zhao, Yong; Li, Yu; Zhan, Jing
With the development of computer science and technology, current secure operating systems failed to respond to many new security challenges. Trusted operating system (TOS) is proposed to try to solve these problems. However, there are no mature, unified architectures for the TOS yet, since most of them cannot make clear of the relationship between security mechanism and the trusted mechanism. Therefore, this paper proposes a double-system architecture (DSA) for the TOS to solve the problem. The DSA is composed of the Trusted System (TS) and the Security System (SS). We constructed the TS by establishing a trusted environment and realized related SS. Furthermore, we proposed the Trusted Information Channel (TIC) to protect the information flow between TS and SS. In a word, the double system architecture we proposed can provide reliable protection for the OS through the SS with the supports provided by the TS.
Reliability models for dataflow computer systems
NASA Technical Reports Server (NTRS)
Kavi, K. M.; Buckles, B. P.
1985-01-01
The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.
Integrated command, control, communications and computation system functional architecture
NASA Technical Reports Server (NTRS)
Cooley, C. G.; Gilbert, L. E.
1981-01-01
The functional architecture for an integrated command, control, communications, and computation system applicable to the command and control portion of the NASA End-to-End Data. System is described including the downlink data processing and analysis functions required to support the uplink processes. The functional architecture is composed of four elements: (1) the functional hierarchy which provides the decomposition and allocation of the command and control functions to the system elements; (2) the key system features which summarize the major system capabilities; (3) the operational activity threads which illustrate the interrelationahip between the system elements; and (4) the interfaces which illustrate those elements that originate or generate data and those elements that use the data. The interfaces also provide a description of the data and the data utilization and access techniques.
Battlefield Object Control via Internet Architecture
2002-01-01
superiority is the best way to reach the goal of competition superiority. Using information technology (IT) in data processing, including computer hardware... technologies : Global Positioning System (GPS), Geographic Information System (GIS), Battlefield Information Transmission System (BITS), and Intelligent...operational environment. Keywords: C4ISR Systems, Information Superiority, Battlefield Objects, Computer - Aided Prototyping System (CAPS), IP-based
YASS: A System Simulator for Operating System and Computer Architecture Teaching and Learning
ERIC Educational Resources Information Center
Mustafa, Besim
2013-01-01
A highly interactive, integrated and multi-level simulator has been developed specifically to support both the teachers and the learners of modern computer technologies at undergraduate level. The simulator provides a highly visual and user configurable environment with many pedagogical features aimed at facilitating deep understanding of concepts…
Avionics System Architecture Tool
NASA Technical Reports Server (NTRS)
Chau, Savio; Hall, Ronald; Traylor, marcus; Whitfield, Adrian
2005-01-01
Avionics System Architecture Tool (ASAT) is a computer program intended for use during the avionics-system-architecture- design phase of the process of designing a spacecraft for a specific mission. ASAT enables simulation of the dynamics of the command-and-data-handling functions of the spacecraft avionics in the scenarios in which the spacecraft is expected to operate. ASAT is built upon I-Logix Statemate MAGNUM, providing a complement of dynamic system modeling tools, including a graphical user interface (GUI), modeling checking capabilities, and a simulation engine. ASAT augments this with a library of predefined avionics components and additional software to support building and analyzing avionics hardware architectures using these components.
System design in an evolving system-of-systems architecture and concept of operations
NASA Astrophysics Data System (ADS)
Rovekamp, Roger N., Jr.
Proposals for space exploration architectures have increased in complexity and scope. Constituent systems (e.g., rovers, habitats, in-situ resource utilization facilities, transfer vehicles, etc) must meet the needs of these architectures by performing in multiple operational environments and across multiple phases of the architecture's evolution. This thesis proposes an approach for using system-of-systems engineering principles in conjunction with system design methods (e.g., Multi-objective optimization, genetic algorithms, etc) to create system design options that perform effectively at both the system and system-of-systems levels, across multiple concepts of operations, and over multiple architectural phases. The framework is presented by way of an application problem that investigates the design of power systems within a power sharing architecture for use in a human Lunar Surface Exploration Campaign. A computer model has been developed that uses candidate power grid distribution solutions for a notional lunar base. The agent-based model utilizes virtual control agents to manage the interactions of various exploration and infrastructure agents. The philosophy behind the model is based both on lunar power supply strategies proposed in literature, as well as on the author's own approaches for power distribution strategies of future lunar bases. In addition to proposing a framework for system design, further implications of system-of-systems engineering principles are briefly explored, specifically as they relate to producing more robust cross-cultural system-of-systems architecture solutions.
HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation
NASA Technical Reports Server (NTRS)
Sterling, Thomas; Bergman, Larry
2000-01-01
Computational Aero Sciences and other numeric intensive computation disciplines demand computing throughputs substantially greater than the Teraflops scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that in combination with sufficient resolution and advanced adaptive techniques may force performance requirements towards Petaflops. This will be especially true for compute intensive models such as Navier-Stokes are or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithms techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops scale computing in the 2004/5 timeframe. The Hybrid-Technology, MultiThreaded parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption. The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) and one percent of the power required by convention semiconductor logic. Wave Division Multiplexing optical communications can approach a peak per fiber bandwidth of 1 Tbps and the new Data Vortex network topology employing this technology can connect tens of thousands of ports providing a bi-section bandwidth on the order of a Petabyte per second with latencies well below 100 nanoseconds, even under heavy loads. Processor-in-Memory (PIM) technology combines logic and memory on the same chip exposing the internal bandwidth of the memory row buffers at low latency. And holographic storage photorefractive storage technologies provide high-density memory with access a thousand times faster than conventional disk technologies. Together these technologies enable a new class of shared memory system architecture with a peak performance in the range of a Petaflops but size and power requirements comparable to today's largest Teraflops scale systems. To achieve high-sustained performance, HTMT combines an advanced multithreading processor architecture with a memory-driven coarse-grained latency management strategy called "percolation", yielding high efficiency while reducing the much of the parallel programming burden. This paper will present the basic system architecture characteristics made possible through this series of advanced technologies and then give a detailed description of the new percolation approach to runtime latency management.
A Quantitative Model for Assessing Visual Simulation Software Architecture
2011-09-01
Software Engineering Arnold Buss Research Associate Professor of MOVES LtCol Jeff Boleng, PhD Associate Professor of Computer Science U.S. Air Force Academy... science (operating and programming systems series). New York, NY, USA: Elsevier Science Ltd. Henry, S., & Kafura, D. (1984). The evaluation of software...Rudy Darken Professor of Computer Science Dissertation Supervisor Ted Lewis Professor of Computer Science Richard Riehle Professor of Practice
Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
NASA Astrophysics Data System (ADS)
Oyarzun, Guillermo; Borrell, Ricard; Gorobets, Andrey; Oliva, Assensi
2017-10-01
Nowadays, high performance computing (HPC) systems experience a disruptive moment with a variety of novel architectures and frameworks, without any clarity of which one is going to prevail. In this context, the portability of codes across different architectures is of major importance. This paper presents a portable implementation model based on an algebraic operational approach for direct numerical simulation (DNS) and large eddy simulation (LES) of incompressible turbulent flows using unstructured hybrid meshes. The strategy proposed consists in representing the whole time-integration algorithm using only three basic algebraic operations: sparse matrix-vector product, a linear combination of vectors and dot product. The main idea is based on decomposing the nonlinear operators into a concatenation of two SpMV operations. This provides high modularity and portability. An exhaustive analysis of the proposed implementation for hybrid CPU/GPU supercomputers has been conducted with tests using up to 128 GPUs. The main objective consists in understanding the challenges of implementing CFD codes on new architectures.
AHaH computing-from metastable switches to attractors to machine learning.
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures-all key capabilities of biological nervous systems and modern machine learning algorithms with real world application.
Architectural Analysis of a LLNL LWIR Sensor System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bond, Essex J.; Curry, Jim R.; LaFortune, Kai N.
The architecture of an LLNL airborne imaging and detection system is considered in this report. The purpose of the system is to find the location of substances of interest by detecting their chemical signatures using a long-wave infrared (LWIR) imager with geo-registration capability. The detection system consists of an LWIR imaging spectrometer as well as a network of computer hardware and analysis software for analyzing the images for the features of interest. The system has been in the operations phase now for well over a year, and as such, there is enough use data and feedback from the primary beneficiarymore » to assess the current successes and shortcomings of the LWIR system architecture. LWIR system has been successful in providing reliable data collection and the delivery of a report with results. The weakness of the architecture has been identified in two areas: with the network of computer hardware and software and with the feedback of the state of the system health. Regarding the former, the system computers and software that carry out the data acquisition are too complicated for routine operations and maintenance. With respect to the latter, the primary beneficiary of the instrument’s data does not have enough metrics to use to filter the large quantity of data to determine its utility. In addition to the needs in these two areas, a latent need of one of the stakeholders is identified. This report documents the strengths and weaknesses, as well as proposes a solution for enhancing the architecture that simultaneously addresses the two areas of weakness and leverages them to meet the newly identified latent need.« less
Real-time field programmable gate array architecture for computer vision
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Torres-Huitzil, Cesar
2001-01-01
This paper presents an architecture for real-time generic convolution of a mask and an image. The architecture is intended for fast low-level image processing. The field programmable gate array (FPGA)-based architecture takes advantage of the availability of registers in FPGAs to implement an efficient and compact module to process the convolutions. The architecture is designed to minimize the number of accesses to the image memory and it is based on parallel modules with internal pipeline operation in order to improve its performance. The architecture is prototyped in a FPGA, but it can be implemented on dedicated very- large-scale-integrated devices to reach higher clock frequencies. Complexity issues, FPGA resources utilization, FPGA limitations, and real-time performance are discussed. Some results are presented and discussed.
Subtlenoise: sonification of distributed computing operations
NASA Astrophysics Data System (ADS)
Love, P. A.
2015-12-01
The operation of distributed computing systems requires comprehensive monitoring to ensure reliability and robustness. There are two components found in most monitoring systems: one being visually rich time-series graphs and another being notification systems for alerting operators under certain pre-defined conditions. In this paper the sonification of monitoring messages is explored using an architecture that fits easily within existing infrastructures based on mature opensource technologies such as ZeroMQ, Logstash, and Supercollider (a synth engine). Message attributes are mapped onto audio attributes based on broad classification of the message (continuous or discrete metrics) but keeping the audio stream subtle in nature. The benefits of audio rendering are described in the context of distributed computing operations and may provide a less intrusive way to understand the operational health of these systems.
1988-12-01
Figures ................................... v List of Tables ................................... vi I. Introduction ..................... 1 Background... 1 Problem ................................ 3 Scope .................................. 4 Approach...Collected Data on Variables ...... 136 Appendix D: Collected Data on Operations...............208 iv List of Figures Figure Page 2- 1 Keyword Search
Proposed hardware architectures of particle filter for object tracking
NASA Astrophysics Data System (ADS)
Abd El-Halym, Howida A.; Mahmoud, Imbaby Ismail; Habib, SED
2012-12-01
In this article, efficient hardware architectures for particle filter (PF) are presented. We propose three different architectures for Sequential Importance Resampling Filter (SIRF) implementation. The first architecture is a two-step sequential PF machine, where particle sampling, weight, and output calculations are carried out in parallel during the first step followed by sequential resampling in the second step. For the weight computation step, a piecewise linear function is used instead of the classical exponential function. This decreases the complexity of the architecture without degrading the results. The second architecture speeds up the resampling step via a parallel, rather than a serial, architecture. This second architecture targets a balance between hardware resources and the speed of operation. The third architecture implements the SIRF as a distributed PF composed of several processing elements and central unit. All the proposed architectures are captured using VHDL synthesized using Xilinx environment, and verified using the ModelSim simulator. Synthesis results confirmed the resource reduction and speed up advantages of our architectures.
A scalable architecture for online anomaly detection of WLCG batch jobs
NASA Astrophysics Data System (ADS)
Kuehn, E.; Fischer, M.; Giffels, M.; Jung, C.; Petzold, A.
2016-10-01
For data centres it is increasingly important to monitor the network usage, and learn from network usage patterns. Especially configuration issues or misbehaving batch jobs preventing a smooth operation need to be detected as early as possible. At the GridKa data and computing centre we therefore operate a tool BPNetMon for monitoring traffic data and characteristics of WLCG batch jobs and pilots locally on different worker nodes. On the one hand local information itself are not sufficient to detect anomalies for several reasons, e.g. the underlying job distribution on a single worker node might change or there might be a local misconfiguration. On the other hand a centralised anomaly detection approach does not scale regarding network communication as well as computational costs. We therefore propose a scalable architecture based on concepts of a super-peer network.
Nonvolatile Memory Materials for Neuromorphic Intelligent Machines.
Jeong, Doo Seok; Hwang, Cheol Seong
2018-04-18
Recent progress in deep learning extends the capability of artificial intelligence to various practical tasks, making the deep neural network (DNN) an extremely versatile hypothesis. While such DNN is virtually built on contemporary data centers of the von Neumann architecture, physical (in part) DNN of non-von Neumann architecture, also known as neuromorphic computing, can remarkably improve learning and inference efficiency. Particularly, resistance-based nonvolatile random access memory (NVRAM) highlights its handy and efficient application to the multiply-accumulate (MAC) operation in an analog manner. Here, an overview is given of the available types of resistance-based NVRAMs and their technological maturity from the material- and device-points of view. Examples within the strategy are subsequently addressed in comparison with their benchmarks (virtual DNN in deep learning). A spiking neural network (SNN) is another type of neural network that is more biologically plausible than the DNN. The successful incorporation of resistance-based NVRAM in SNN-based neuromorphic computing offers an efficient solution to the MAC operation and spike timing-based learning in nature. This strategy is exemplified from a material perspective. Intelligent machines are categorized according to their architecture and learning type. Also, the functionality and usefulness of NVRAM-based neuromorphic computing are addressed. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Paraxial diffractive elements for space-variant linear transforms
NASA Astrophysics Data System (ADS)
Teiwes, Stephan; Schwarzer, Heiko; Gu, Ben-Yuan
1998-06-01
Optical linear transform architectures bear good potential for future developments of very powerful hybrid vision systems and neural network classifiers. The optical modules of such systems could be used as pre-processors to solve complex linear operations at very high speed in order to simplify an electronic data post-processing. However, the applicability of linear optical architectures is strongly connected with the fundamental question of how to implement a specific linear transform by optical means and physical imitations. The large majority of publications on this topic focusses on the optical implementation of space-invariant transforms by the well-known 4f-setup. Only few papers deal with approaches to implement selected space-variant transforms. In this paper, we propose a simple algebraic method to design diffractive elements for an optical architecture in order to realize arbitrary space-variant transforms. The design procedure is based on a digital model of scalar, paraxial wave theory and leads to optimal element transmission functions within the model. Its computational and physical limitations are discussed in terms of complexity measures. Finally, the design procedure is demonstrated by some examples. Firstly, diffractive elements for the realization of different rotation operations are computed and, secondly, a Hough transform element is presented. The correct optical functions of the elements are proved in computer simulation experiments.
NASA Astrophysics Data System (ADS)
Maiti, Anup Kumar; Nath Roy, Jitendra; Mukhopadhyay, Sourangshu
2007-08-01
In the field of optical computing and parallel information processing, several number systems have been used for different arithmetic and algebraic operations. Therefore an efficient conversion scheme from one number system to another is very important. Modified trinary number (MTN) has already taken a significant role towards carry and borrow free arithmetic operations. In this communication, we propose a tree-net architecture based all optical conversion scheme from binary number to its MTN form. Optical switch using nonlinear material (NLM) plays an important role.
Developing a Complete and Effective ACT-R Architecture
2008-01-01
of computational primitives , as contrasted with the predominant “one-off” and “grab-bag” cognitive models in the field. These architectures have...transport/ semaphore protocols connected via a glue script. Both protocols rely on the fact that file rename and file remove operations are atomic...the Trial Log file until just prior to processing the next input request. Thus, to perform synchronous identifications it is necessary to run an
Reverse time migration: A seismic processing application on the connection machine
NASA Technical Reports Server (NTRS)
Fiebrich, Rolf-Dieter
1987-01-01
The implementation of a reverse time migration algorithm on the Connection Machine, a massively parallel computer is described. Essential architectural features of this machine as well as programming concepts are presented. The data structures and parallel operations for the implementation of the reverse time migration algorithm are described. The algorithm matches the Connection Machine architecture closely and executes almost at the peak performance of this machine.
An object-oriented software approach for a distributed human tracking motion system
NASA Astrophysics Data System (ADS)
Micucci, Daniela L.
2003-06-01
Tracking is a composite job involving the co-operation of autonomous activities which exploit a complex information model and rely on a distributed architecture. Both information and activities must be classified and related in several dimensions: abstraction levels (what is modelled and how information is processed); topology (where the modelled entities are); time (when entities exist); strategy (why something happens); responsibilities (who is in charge of processing the information). A proper Object-Oriented analysis and design approach leads to a modular architecture where information about conceptual entities is modelled at each abstraction level via classes and intra-level associations, whereas inter-level associations between classes model the abstraction process. Both information and computation are partitioned according to level-specific topological models. They are also placed in a temporal framework modelled by suitable abstractions. Domain-specific strategies control the execution of the computations. Computational components perform both intra-level processing and intra-level information conversion. The paper overviews the phases of the analysis and design process, presents major concepts at each abstraction level, and shows how the resulting design turns into a modular, flexible and adaptive architecture. Finally, the paper sketches how the conceptual architecture can be deployed into a concrete distribute architecture by relying on an experimental framework.
NASA Technical Reports Server (NTRS)
Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas
2008-01-01
A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor- memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system is comprised of a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.
Architecture for a 1-GHz Digital RADAR
NASA Technical Reports Server (NTRS)
Mallik, Udayan
2011-01-01
An architecture for a Direct RF-digitization Type Digital Mode RADAR was developed at GSFC in 2008. Two variations of a basic architecture were developed for use on RADAR imaging missions using aircraft and spacecraft. Both systems can operate with a pulse repetition rate up to 10 MHz with 8 received RF samples per pulse repetition interval, or at up to 19 kHz with 4K received RF samples per pulse repetition interval. The first design describes a computer architecture for a Continuous Mode RADAR transceiver with a real-time signal processing and display architecture. The architecture can operate at a high pulse repetition rate without interruption for an infinite amount of time. The second design describes a smaller and less costly burst mode RADAR that can transceive high pulse repetition rate RF signals without interruption for up to 37 seconds. The burst-mode RADAR was designed to operate on an off-line signal processing paradigm. The temporal distribution of RF samples acquired and reported to the RADAR processor remains uniform and free of distortion in both proposed architectures. The majority of the RADAR's electronics is implemented in digital CMOS (complementary metal oxide semiconductor), and analog circuits are restricted to signal amplification operations and analog to digital conversion. An implementation of the proposed systems will create a 1-GHz, Direct RF-digitization Type, L-Band Digital RADAR--the highest band achievable for Nyquist Rate, Direct RF-digitization Systems that do not implement an electronic IF downsample stage (after the receiver signal amplification stage), using commercially available off-the-shelf integrated circuits.
Climate, weather, space weather: model development in an operational context
NASA Astrophysics Data System (ADS)
Folini, Doris
2018-05-01
Aspects of operational modeling for climate, weather, and space weather forecasts are contrasted, with a particular focus on the somewhat conflicting demands of "operational stability" versus "dynamic development" of the involved models. Some common key elements are identified, indicating potential for fruitful exchange across communities. Operational model development is compelling, driven by factors that broadly fall into four categories: model skill, basic physics, advances in computer architecture, and new aspects to be covered, from costumer needs over physics to observational data. Evaluation of model skill as part of the operational chain goes beyond an automated skill score. Permanent interaction between "pure research" and "operational forecast" people is beneficial to both sides. This includes joint model development projects, although ultimate responsibility for the operational code remains with the forecast provider. The pace of model development reflects operational lead times. The points are illustrated with selected examples, many of which reflect the author's background and personal contacts, notably with the Swiss Weather Service and the Max Planck Institute for Meteorology, Hamburg, Germany. In view of current and future challenges, large collaborations covering a range of expertise are a must - within and across climate, weather, and space weather. To profit from and cope with the rapid progress of computer architectures, supercompute centers must form part of the team.
High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations
NASA Technical Reports Server (NTRS)
Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.
2003-01-01
Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.
Computational approaches to vision
NASA Technical Reports Server (NTRS)
Barrow, H. G.; Tenenbaum, J. M.
1986-01-01
Vision is examined in terms of a computational process, and the competence, structure, and control of computer vision systems are analyzed. Theoretical and experimental data on the formation of a computer vision system are discussed. Consideration is given to early vision, the recovery of intrinsic surface characteristics, higher levels of interpretation, and system integration and control. A computational visual processing model is proposed and its architecture and operation are described. Examples of state-of-the-art vision systems, which include some of the levels of representation and processing mechanisms, are presented.
NASA Technical Reports Server (NTRS)
Brunelle, J. E.; Eckhardt, D. E., Jr.
1985-01-01
Results are presented of an experiment conducted in the NASA Avionics Integrated Research Laboratory (AIRLAB) to investigate the implementation of fault-tolerant software techniques on fault-tolerant computer architectures, in particular the Software Implemented Fault Tolerance (SIFT) computer. The N-version programming and recovery block techniques were implemented on a portion of the SIFT operating system. The results indicate that, to effectively implement fault-tolerant software design techniques, system requirements will be impacted and suggest that retrofitting fault-tolerant software on existing designs will be inefficient and may require system modification.
Colt: an experiment in wormhole run-time reconfiguration
NASA Astrophysics Data System (ADS)
Bittner, Ray; Athanas, Peter M.; Musgrove, Mark
1996-10-01
Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
An operational computer program to control Self Defense Surface Missile System operations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roe, C.L.
1991-12-01
An account is given of the system architecture and operational protocols of the NATO Seasparrow Surface Missile System (NSSMS) Operational Computer Program (OCP) which has been developed, and is being deployed multinationally, to respond against antiship missiles. Flowcharts are presented for the target detection and tracking, control, and engagement phases of the Self Defense Surface Missile System that is controlled by the OCP. USN and other NATO vessels will carry the NSSMS well into the next century; the OCP presently described will be deployed in the course of 1992 to enhance the self-defense capabilities of the NSSMS-equipped fleet. 8 refs.
An operating system for future aerospace vehicle computer systems
NASA Technical Reports Server (NTRS)
Foudriat, E. C.; Berman, W. J.; Will, R. W.; Bynum, W. L.
1984-01-01
The requirements for future aerospace vehicle computer operating systems are examined in this paper. The computer architecture is assumed to be distributed with a local area network connecting the nodes. Each node is assumed to provide a specific functionality. The network provides for communication so that the overall tasks of the vehicle are accomplished. The O/S structure is based upon the concept of objects. The mechanisms for integrating node unique objects with node common objects in order to implement both the autonomy and the cooperation between nodes is developed. The requirements for time critical performance and reliability and recovery are discussed. Time critical performance impacts all parts of the distributed operating system; e.g., its structure, the functional design of its objects, the language structure, etc. Throughout the paper the tradeoffs - concurrency, language structure, object recovery, binding, file structure, communication protocol, programmer freedom, etc. - are considered to arrive at a feasible, maximum performance design. Reliability of the network system is considered. A parallel multipath bus structure is proposed for the control of delivery time for time critical messages. The architecture also supports immediate recovery for the time critical message system after a communication failure.
NASA Technical Reports Server (NTRS)
Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron
1992-01-01
Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10(Sup 18) floating-point operations per second (FLOP's) and memory capacity greater than 10(Sup 15) words. Also, new parallel computer architectures and new structured numerical methods will make necessary speed and capacity available.
Compensated Crystal Assemblies for Type-II Entangled Photon Generation in Quantum Cluster States
2010-03-01
in quantum computational architectures that operate by principles entirely distinct from any based on classical physics. In contrast with other...of the SPDC spectral function, to enable applications in regions that have not been accessible with other methods. Quantum Information and Computation ...Eliminating frequency and space-time correlations in multi-photon states, PRA 64, 063815, 2001 [2]A. Zeilinger et.al. Experimental One-way computing
Universal Quantum Computing with Arbitrary Continuous-Variable Encoding.
Lau, Hoi-Kwan; Plenio, Martin B
2016-09-02
Implementing a qubit quantum computer in continuous-variable systems conventionally requires the engineering of specific interactions according to the encoding basis states. In this work, we present a unified formalism to conduct universal quantum computation with a fixed set of operations but arbitrary encoding. By storing a qubit in the parity of two or four qumodes, all computing processes can be implemented by basis state preparations, continuous-variable exponential-swap operations, and swap tests. Our formalism inherits the advantages that the quantum information is decoupled from collective noise, and logical qubits with different encodings can be brought to interact without decoding. We also propose a possible implementation of the required operations by using interactions that are available in a variety of continuous-variable systems. Our work separates the "hardware" problem of engineering quantum-computing-universal interactions, from the "software" problem of designing encodings for specific purposes. The development of quantum computer architecture could hence be simplified.
Universal Quantum Computing with Arbitrary Continuous-Variable Encoding
NASA Astrophysics Data System (ADS)
Lau, Hoi-Kwan; Plenio, Martin B.
2016-09-01
Implementing a qubit quantum computer in continuous-variable systems conventionally requires the engineering of specific interactions according to the encoding basis states. In this work, we present a unified formalism to conduct universal quantum computation with a fixed set of operations but arbitrary encoding. By storing a qubit in the parity of two or four qumodes, all computing processes can be implemented by basis state preparations, continuous-variable exponential-swap operations, and swap tests. Our formalism inherits the advantages that the quantum information is decoupled from collective noise, and logical qubits with different encodings can be brought to interact without decoding. We also propose a possible implementation of the required operations by using interactions that are available in a variety of continuous-variable systems. Our work separates the "hardware" problem of engineering quantum-computing-universal interactions, from the "software" problem of designing encodings for specific purposes. The development of quantum computer architecture could hence be simplified.
Scaling to Nanotechnology Limits with the PIMS Computer Architecture and a new Scaling Rule
DOE Office of Scientific and Technical Information (OSTI.GOV)
Debenedictis, Erik P.
2015-02-01
We describe a new approach to computing that moves towards the limits of nanotechnology using a newly formulated sc aling rule. This is in contrast to the current computer industry scali ng away from von Neumann's original computer at the rate of Moore's Law. We extend Moore's Law to 3D, which l eads generally to architectures that integrate logic and memory. To keep pow er dissipation cons tant through a 2D surface of the 3D structure requires using adiabatic principles. We call our newly proposed architecture Processor In Memory and Storage (PIMS). We propose a new computational model that integratesmore » processing and memory into "tiles" that comprise logic, memory/storage, and communications functions. Since the programming model will be relatively stable as a system scales, programs repr esented by tiles could be executed in a PIMS system built with today's technology or could become the "schematic diagram" for implementation in an ultimate 3D nanotechnology of the future. We build a systems software approach that offers advantages over and above the technological and arch itectural advantages. Firs t, the algorithms may be more efficient in the conventional sens e of having fewer steps. Second, the algorithms may run with higher power efficiency per operation by being a better match for the adiabatic scaling ru le. The performance analysis based on demonstrated ideas in physical science suggests 80,000 x improvement in cost per operation for the (arguably) gene ral purpose function of emulating neurons in Deep Learning.« less
NASA Astrophysics Data System (ADS)
Acernese, Fausto; Barone, Fabrizio; De Rosa, Rosario; Eleuteri, Antonio; Milano, Leopoldo; Pardi, Silvio; Ricciardi, Iolanda; Russo, Guido
2004-09-01
One of the main requirements of a digital system for the control of interferometric detectors of gravitational waves is the computing power, that is a direct consequence of the increasing complexity of the digital algorithms necessary for the control signals generation. For this specific task many specialized non standard real-time architectures have been developed, often very expensive and difficult to upgrade. On the other hand, such computing power is generally fully available for off-line applications on standard Pc based systems. Therefore, a possible and obvious solution may be provided by the integration of both the real-time and off-line architecture resulting in a hybrid control system architecture based on standards available components, trying to get both the advantages of the perfect data synchronization provided by the real-time systems and by the large computing power available on Pc based systems. Such integration may be provided by the implementation of the link between the two different architectures through the standard Ethernet network, whose data transfer speed is largely increasing in these years, using the TCP/IP, UDP and raw Ethernet protocols. In this paper we describe the architecture of an hybrid Ethernet based real-time control system prototype we implemented in Napoli, discussing its characteristics and performances. Finally we discuss a possible application to the real-time control of a suspended mass of the mode cleaner of the 3m prototype optical interferometer for gravitational wave detection (IDGW-3P) operational in Napoli.
NASA Astrophysics Data System (ADS)
Guilfoyle, Peter S.; Stone, Richard V.
1991-12-01
OptiComp is currently completing a 32-bit, fully programmable digital optical computer (DOC II) that is designed to operate in a UNIX environment running RISC microcode. OptiComp's DOC II architecture is focused toward parallel microcode implementation where data is input in a dual rail format. By exploiting the physical principals inherent to optics (speed and low power consumption), an architectural balance of optical interconnects and software code efficiency can be achieved including high fan-in and fan-out. OptiComp's DOC II program is jointly sponsored by the Office of Naval Research (ONR), the Strategic Defense Initiative Office (SDIO), NASA space station group and Rome Laboratory (USAF). This paper not only describes the motivational basis behind DOC II but also provides an optical overview and architectural summary of the device that allows the emulation of any digital instruction set.
Platform Architecture for Decentralized Positioning Systems.
Kasmi, Zakaria; Norrdine, Abdelmoumen; Blankenbach, Jörg
2017-04-26
A platform architecture for positioning systems is essential for the realization of a flexible localization system, which interacts with other systems and supports various positioning technologies and algorithms. The decentralized processing of a position enables pushing the application-level knowledge into a mobile station and avoids the communication with a central unit such as a server or a base station. In addition, the calculation of the position on low-cost and resource-constrained devices presents a challenge due to the limited computing, storage capacity, as well as power supply. Therefore, we propose a platform architecture that enables the design of a system with the reusability of the components, extensibility (e.g., with other positioning technologies) and interoperability. Furthermore, the position is computed on a low-cost device such as a microcontroller, which simultaneously performs additional tasks such as data collecting or preprocessing based on an operating system. The platform architecture is designed, implemented and evaluated on the basis of two positioning systems: a field strength system and a time of arrival-based positioning system.
Platform Architecture for Decentralized Positioning Systems
Kasmi, Zakaria; Norrdine, Abdelmoumen; Blankenbach, Jörg
2017-01-01
A platform architecture for positioning systems is essential for the realization of a flexible localization system, which interacts with other systems and supports various positioning technologies and algorithms. The decentralized processing of a position enables pushing the application-level knowledge into a mobile station and avoids the communication with a central unit such as a server or a base station. In addition, the calculation of the position on low-cost and resource-constrained devices presents a challenge due to the limited computing, storage capacity, as well as power supply. Therefore, we propose a platform architecture that enables the design of a system with the reusability of the components, extensibility (e.g., with other positioning technologies) and interoperability. Furthermore, the position is computed on a low-cost device such as a microcontroller, which simultaneously performs additional tasks such as data collecting or preprocessing based on an operating system. The platform architecture is designed, implemented and evaluated on the basis of two positioning systems: a field strength system and a time of arrival-based positioning system. PMID:28445414
AHaH Computing–From Metastable Switches to Attractors to Machine Learning
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures–all key capabilities of biological nervous systems and modern machine learning algorithms with real world application. PMID:24520315
High Performance GPU-Based Fourier Volume Rendering.
Abdellah, Marwan; Eldeib, Ayman; Sharawi, Amr
2015-01-01
Fourier volume rendering (FVR) is a significant visualization technique that has been used widely in digital radiography. As a result of its (N (2)logN) time complexity, it provides a faster alternative to spatial domain volume rendering algorithms that are (N (3)) computationally complex. Relying on the Fourier projection-slice theorem, this technique operates on the spectral representation of a 3D volume instead of processing its spatial representation to generate attenuation-only projections that look like X-ray radiographs. Due to the rapid evolution of its underlying architecture, the graphics processing unit (GPU) became an attractive competent platform that can deliver giant computational raw power compared to the central processing unit (CPU) on a per-dollar-basis. The introduction of the compute unified device architecture (CUDA) technology enables embarrassingly-parallel algorithms to run efficiently on CUDA-capable GPU architectures. In this work, a high performance GPU-accelerated implementation of the FVR pipeline on CUDA-enabled GPUs is presented. This proposed implementation can achieve a speed-up of 117x compared to a single-threaded hybrid implementation that uses the CPU and GPU together by taking advantage of executing the rendering pipeline entirely on recent GPU architectures.
1991-10-01
Real - Time Operating System , Hide Tokuda, et al., Carnegie Mellon University "* MARUTI, Hard Real - Time Operating System , Ashok...Architecture, Fred J. Pollack and Kevin C. Kahn, BiiN 10:00 - 10:20 BREAK 10:20 - 12:20 Session VII - Chair: James G. Smith, ONR • A Real - Time Operating System for...Detailed Description * POSIX: Detailed Description * V: Detailed Description * Real - Time Operating System
A Electro-Optical Image Algebra Processing System for Automatic Target Recognition
NASA Astrophysics Data System (ADS)
Coffield, Patrick Cyrus
The proposed electro-optical image algebra processing system is designed specifically for image processing and other related computations. The design is a hybridization of an optical correlator and a massively paralleled, single instruction multiple data processor. The architecture of the design consists of three tightly coupled components: a spatial configuration processor (the optical analog portion), a weighting processor (digital), and an accumulation processor (digital). The systolic flow of data and image processing operations are directed by a control buffer and pipelined to each of the three processing components. The image processing operations are defined in terms of basic operations of an image algebra developed by the University of Florida. The algebra is capable of describing all common image-to-image transformations. The merit of this architectural design is how it implements the natural decomposition of algebraic functions into spatially distributed, point use operations. The effect of this particular decomposition allows convolution type operations to be computed strictly as a function of the number of elements in the template (mask, filter, etc.) instead of the number of picture elements in the image. Thus, a substantial increase in throughput is realized. The implementation of the proposed design may be accomplished in many ways. While a hybrid electro-optical implementation is of primary interest, the benefits and design issues of an all digital implementation are also discussed. The potential utility of this architectural design lies in its ability to control a large variety of the arithmetic and logic operations of the image algebra's generalized matrix product. The generalized matrix product is the most powerful fundamental operation in the algebra, thus allowing a wide range of applications. No other known device or design has made this claim of processing speed and general implementation of a heterogeneous image algebra.
Data Telemetry and Acquisition System for Acoustic Signal Processing Investigations.
1996-02-20
were VME- based computer systems operating under the VxWorks real - time operating system . Each system shared a common hardware and software... real - time operating system . It interfaces to the Berg PCM Decommutator board, which searches for the embedded synchronization word in the data and re...software were built on top of this architecture. The multi-tasking, message queue and memory management facilities of the VxWorks real - time operating system are
Dense and Sparse Matrix Operations on the Cell Processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel W.; Shalf, John; Oliker, Leonid
2005-05-01
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, usingmore » a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.« less
A fast, programmable hardware architecture for spaceborne SAR processing
NASA Technical Reports Server (NTRS)
Bennett, J. R.; Cumming, I. G.; Lim, J.; Wedding, R. M.
1983-01-01
The launch of spaceborne SARs during the 1980's is discussed. The satellite SARs require high quality and high throughput ground processors. Compression ratios in range and azimuth of greater than 500 and 150 respectively lead to frequency domain processing and data computation rates in excess of 2000 million real operations per second for C-band SARs under consideration. Various hardware architectures are examined and two promising candidates and proceeds to recommend a fast, programmable hardware architecture for spaceborne SAR processing are selected. Modularity and programmability are introduced as desirable attributes for the purpose of HTSP hardware selection.
32 bit digital optical computer - A hardware update
NASA Technical Reports Server (NTRS)
Guilfoyle, Peter S.; Carter, James A., III; Stone, Richard V.; Pape, Dennis R.
1990-01-01
Such state-of-the-art devices as multielement linear laser diode arrays, multichannel acoustooptic modulators, optical relays, and avalanche photodiode arrays, are presently applied to the implementation of a 32-bit supercomputer's general-purpose optical central processing architecture. Shannon's theorem, Morozov's control operator method (in conjunction with combinatorial arithmetic), and DeMorgan's law have been used to design an architecture whose 100 MHz clock renders it fully competitive with emerging planar-semiconductor technology. Attention is given to the architecture's multichannel Bragg cells, thermal design and RF crosstalk considerations, and the first and second anamorphic relay legs.
Integration of High-Performance Computing into Cloud Computing Services
NASA Astrophysics Data System (ADS)
Vouk, Mladen A.; Sills, Eric; Dreher, Patrick
High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).
Fault-tolerant battery system employing intra-battery network architecture
Hagen, Ronald A.; Chen, Kenneth W.; Comte, Christophe; Knudson, Orlin B.; Rouillard, Jean
2000-01-01
A distributed energy storing system employing a communications network is disclosed. A distributed battery system includes a number of energy storing modules, each of which includes a processor and communications interface. In a network mode of operation, a battery computer communicates with each of the module processors over an intra-battery network and cooperates with individual module processors to coordinate module monitoring and control operations. The battery computer monitors a number of battery and module conditions, including the potential and current state of the battery and individual modules, and the conditions of the battery's thermal management system. An over-discharge protection system, equalization adjustment system, and communications system are also controlled by the battery computer. The battery computer logs and reports various status data on battery level conditions which may be reported to a separate system platform computer. A module transitions to a stand-alone mode of operation if the module detects an absence of communication connectivity with the battery computer. A module which operates in a stand-alone mode performs various monitoring and control functions locally within the module to ensure safe and continued operation.
Category-theoretic models of algebraic computer systems
NASA Astrophysics Data System (ADS)
Kovalyov, S. P.
2016-01-01
A computer system is said to be algebraic if it contains nodes that implement unconventional computation paradigms based on universal algebra. A category-based approach to modeling such systems that provides a theoretical basis for mapping tasks to these systems' architecture is proposed. The construction of algebraic models of general-purpose computations involving conditional statements and overflow control is formally described by a reflector in an appropriate category of algebras. It is proved that this reflector takes the modulo ring whose operations are implemented in the conventional arithmetic processors to the Łukasiewicz logic matrix. Enrichments of the set of ring operations that form bases in the Łukasiewicz logic matrix are found.
Silicon CMOS architecture for a spin-based quantum computer.
Veldhorst, M; Eenink, H G J; Yang, C H; Dzurak, A S
2017-12-15
Recent advances in quantum error correction codes for fault-tolerant quantum computing and physical realizations of high-fidelity qubits in multiple platforms give promise for the construction of a quantum computer based on millions of interacting qubits. However, the classical-quantum interface remains a nascent field of exploration. Here, we propose an architecture for a silicon-based quantum computer processor based on complementary metal-oxide-semiconductor (CMOS) technology. We show how a transistor-based control circuit together with charge-storage electrodes can be used to operate a dense and scalable two-dimensional qubit system. The qubits are defined by the spin state of a single electron confined in quantum dots, coupled via exchange interactions, controlled using a microwave cavity, and measured via gate-based dispersive readout. We implement a spin qubit surface code, showing the prospects for universal quantum computation. We discuss the challenges and focus areas that need to be addressed, providing a path for large-scale quantum computing.
Science-Driven Computing: NERSC's Plan for 2006-2010
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simon, Horst D.; Kramer, William T.C.; Bailey, David H.
NERSC has developed a five-year strategic plan focusing on three components: Science-Driven Systems, Science-Driven Services, and Science-Driven Analytics. (1) Science-Driven Systems: Balanced introduction of the best new technologies for complete computational systems--computing, storage, networking, visualization and analysis--coupled with the activities necessary to engage vendors in addressing the DOE computational science requirements in their future roadmaps. (2) Science-Driven Services: The entire range of support activities, from high-quality operations and user services to direct scientific support, that enable a broad range of scientists to effectively use NERSC systems in their research. NERSC will concentrate on resources needed to realize the promise ofmore » the new highly scalable architectures for scientific discovery in multidisciplinary computational science projects. (3) Science-Driven Analytics: The architectural and systems enhancements and services required to integrate NERSC's powerful computational and storage resources to provide scientists with new tools to effectively manipulate, visualize, and analyze the huge data sets derived from simulations and experiments.« less
Atomic switch networks-nanoarchitectonic design of a complex system for natural computing.
Demis, E C; Aguilera, R; Sillin, H O; Scharnhorst, K; Sandouk, E J; Aono, M; Stieg, A Z; Gimzewski, J K
2015-05-22
Self-organized complex systems are ubiquitous in nature, and the structural complexity of these natural systems can be used as a model to design new classes of functional nanotechnology based on highly interconnected networks of interacting units. Conventional fabrication methods for electronic computing devices are subject to known scaling limits, confining the diversity of possible architectures. This work explores methods of fabricating a self-organized complex device known as an atomic switch network and discusses its potential utility in computing. Through a merger of top-down and bottom-up techniques guided by mathematical and nanoarchitectonic design principles, we have produced functional devices comprising nanoscale elements whose intrinsic nonlinear dynamics and memorization capabilities produce robust patterns of distributed activity and a capacity for nonlinear transformation of input signals when configured in the appropriate network architecture. Their operational characteristics represent a unique potential for hardware implementation of natural computation, specifically in the area of reservoir computing-a burgeoning field that investigates the computational aptitude of complex biologically inspired systems.
Molecular Sticker Model Stimulation on Silicon for a Maximum Clique Problem
Ning, Jianguo; Li, Yanmei; Yu, Wen
2015-01-01
Molecular computers (also called DNA computers), as an alternative to traditional electronic computers, are smaller in size but more energy efficient, and have massive parallel processing capacity. However, DNA computers may not outperform electronic computers owing to their higher error rates and some limitations of the biological laboratory. The stickers model, as a typical DNA-based computer, is computationally complete and universal, and can be viewed as a bit-vertically operating machine. This makes it attractive for silicon implementation. Inspired by the information processing method on the stickers computer, we propose a novel parallel computing model called DEM (DNA Electronic Computing Model) on System-on-a-Programmable-Chip (SOPC) architecture. Except for the significant difference in the computing medium—transistor chips rather than bio-molecules—the DEM works similarly to DNA computers in immense parallel information processing. Additionally, a plasma display panel (PDP) is used to show the change of solutions, and helps us directly see the distribution of assignments. The feasibility of the DEM is tested by applying it to compute a maximum clique problem (MCP) with eight vertices. Owing to the limited computing sources on SOPC architecture, the DEM could solve moderate-size problems in polynomial time. PMID:26075867
Interfacing laboratory instruments to multiuser, virtual memory computers
NASA Technical Reports Server (NTRS)
Generazio, Edward R.; Stang, David B.; Roth, Don J.
1989-01-01
Incentives, problems and solutions associated with interfacing laboratory equipment with multiuser, virtual memory computers are presented. The major difficulty concerns how to utilize these computers effectively in a medium sized research group. This entails optimization of hardware interconnections and software to facilitate multiple instrument control, data acquisition and processing. The architecture of the system that was devised, and associated programming and subroutines are described. An example program involving computer controlled hardware for ultrasonic scan imaging is provided to illustrate the operational features.
Baseline Architecture of ITER Control System
NASA Astrophysics Data System (ADS)
Wallander, A.; Di Maio, F.; Journeaux, J.-Y.; Klotz, W.-D.; Makijarvi, P.; Yonekawa, I.
2011-08-01
The control system of ITER consists of thousands of computers processing hundreds of thousands of signals. The control system, being the primary tool for operating the machine, shall integrate, control and coordinate all these computers and signals and allow a limited number of staff to operate the machine from a central location with minimum human intervention. The primary functions of the ITER control system are plant control, supervision and coordination, both during experimental pulses and 24/7 continuous operation. The former can be split in three phases; preparation of the experiment by defining all parameters; executing the experiment including distributed feed-back control and finally collecting, archiving, analyzing and presenting all data produced by the experiment. We define the control system as a set of hardware and software components with well defined characteristics. The architecture addresses the organization of these components and their relationship to each other. We distinguish between physical and functional architecture, where the former defines the physical connections and the latter the data flow between components. In this paper, we identify the ITER control system based on the plant breakdown structure. Then, the control system is partitioned into a workable set of bounded subsystems. This partition considers at the same time the completeness and the integration of the subsystems. The components making up subsystems are identified and defined, a naming convention is introduced and the physical networks defined. Special attention is given to timing and real-time communication for distributed control. Finally we discuss baseline technologies for implementing the proposed architecture based on analysis, market surveys, prototyping and benchmarking carried out during the last year.
NASA Astrophysics Data System (ADS)
Shi, X.
2015-12-01
As NSF indicated - "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and geosciences. With the exponential growth of geodata, the challenge of scalable and high performance computing for big data analytics become urgent because many research activities are constrained by the inability of software or tool that even could not complete the computation process. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may be not processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions to employ massive parallelism and hardware resources to achieve scalability and high performance for data intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environment to achieve the capability for scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high-performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise to achieve scalability and high performance by exploiting task and data levels of parallelism that are not supported by the conventional computing systems. Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data as proved by our prior works, while the potential of such advanced infrastructure remains unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
The science of visual analysis at extreme scale
NASA Astrophysics Data System (ADS)
Nowell, Lucy T.
2011-01-01
Driven by market forces and spanning the full spectrum of computational devices, computer architectures are changing in ways that present tremendous opportunities and challenges for data analysis and visual analytic technologies. Leadership-class high performance computing system will have as many as a million cores by 2020 and support 10 billion-way concurrency, while laptop computers are expected to have as many as 1,000 cores by 2015. At the same time, data of all types are increasing exponentially and automated analytic methods are essential for all disciplines. Many existing analytic technologies do not scale to make full use of current platforms and fewer still are likely to scale to the systems that will be operational by the end of this decade. Furthermore, on the new architectures and for data at extreme scales, validating the accuracy and effectiveness of analytic methods, including visual analysis, will be increasingly important.
Trapped-Ion Quantum Logic with Global Radiation Fields.
Weidt, S; Randall, J; Webster, S C; Lake, K; Webb, A E; Cohen, I; Navickas, T; Lekitsch, B; Retzker, A; Hensinger, W K
2016-11-25
Trapped ions are a promising tool for building a large-scale quantum computer. However, the number of required radiation fields for the realization of quantum gates in any proposed ion-based architecture scales with the number of ions within the quantum computer, posing a major obstacle when imagining a device with millions of ions. Here, we present a fundamentally different approach for trapped-ion quantum computing where this detrimental scaling vanishes. The method is based on individually controlled voltages applied to each logic gate location to facilitate the actual gate operation analogous to a traditional transistor architecture within a classical computer processor. To demonstrate the key principle of this approach we implement a versatile quantum gate method based on long-wavelength radiation and use this method to generate a maximally entangled state of two quantum engineered clock qubits with fidelity 0.985(12). This quantum gate also constitutes a simple-to-implement tool for quantum metrology, sensing, and simulation.
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
NASA Astrophysics Data System (ADS)
Johl, John T.; Baker, Nick C.
1988-10-01
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
An Efficient Hardware Circuit for Spike Sorting Based on Competitive Learning Networks.
Chen, Huan-Yuan; Chen, Chih-Chang; Hwang, Wen-Jyi
2017-09-28
This study aims to present an effective VLSI circuit for multi-channel spike sorting. The circuit supports the spike detection, feature extraction and classification operations. The detection circuit is implemented in accordance with the nonlinear energy operator algorithm. Both the peak detection and area computation operations are adopted for the realization of the hardware architecture for feature extraction. The resulting feature vectors are classified by a circuit for competitive learning (CL) neural networks. The CL circuit supports both online training and classification. In the proposed architecture, all the channels share the same detection, feature extraction, learning and classification circuits for a low area cost hardware implementation. The clock-gating technique is also employed for reducing the power dissipation. To evaluate the performance of the architecture, an application-specific integrated circuit (ASIC) implementation is presented. Experimental results demonstrate that the proposed circuit exhibits the advantages of a low chip area, a low power dissipation and a high classification success rate for spike sorting.
An Efficient Hardware Circuit for Spike Sorting Based on Competitive Learning Networks
Chen, Huan-Yuan; Chen, Chih-Chang
2017-01-01
This study aims to present an effective VLSI circuit for multi-channel spike sorting. The circuit supports the spike detection, feature extraction and classification operations. The detection circuit is implemented in accordance with the nonlinear energy operator algorithm. Both the peak detection and area computation operations are adopted for the realization of the hardware architecture for feature extraction. The resulting feature vectors are classified by a circuit for competitive learning (CL) neural networks. The CL circuit supports both online training and classification. In the proposed architecture, all the channels share the same detection, feature extraction, learning and classification circuits for a low area cost hardware implementation. The clock-gating technique is also employed for reducing the power dissipation. To evaluate the performance of the architecture, an application-specific integrated circuit (ASIC) implementation is presented. Experimental results demonstrate that the proposed circuit exhibits the advantages of a low chip area, a low power dissipation and a high classification success rate for spike sorting. PMID:28956859
micROS: a morphable, intelligent and collective robot operating system.
Yang, Xuejun; Dai, Huadong; Yi, Xiaodong; Wang, Yanzhen; Yang, Shaowu; Zhang, Bo; Wang, Zhiyuan; Zhou, Yun; Peng, Xuefeng
2016-01-01
Robots are developing in much the same way that personal computers did 40 years ago, and robot operating system is the critical basis. Current robot software is mainly designed for individual robots. We present in this paper the design of micROS, a morphable, intelligent and collective robot operating system for future collective and collaborative robots. We first present the architecture of micROS, including the distributed architecture for collective robot system as a whole and the layered architecture for every single node. We then present the design of autonomous behavior management based on the observe-orient-decide-act cognitive behavior model and the design of collective intelligence including collective perception, collective cognition, collective game and collective dynamics. We also give the design of morphable resource management, which first categorizes robot resources into physical, information, cognitive and social domains, and then achieve morphability based on self-adaptive software technology. We finally deploy micROS on NuBot football robots and achieve significant improvement in real-time performance.
CAD/CAE Integration Enhanced by New CAD Services Standard
NASA Technical Reports Server (NTRS)
Claus, Russell W.
2002-01-01
A Government-industry team led by the NASA Glenn Research Center has developed a computer interface standard for accessing data from computer-aided design (CAD) systems. The Object Management Group, an international computer standards organization, has adopted this CAD services standard. The new standard allows software (e.g., computer-aided engineering (CAE) and computer-aided manufacturing software to access multiple CAD systems through one programming interface. The interface is built on top of a distributed computing system called the Common Object Request Broker Architecture (CORBA). CORBA allows the CAD services software to operate in a distributed, heterogeneous computing environment.
NASA Astrophysics Data System (ADS)
Zaveri, Mazad Shaheriar
The semiconductor/computer industry has been following Moore's law for several decades and has reaped the benefits in speed and density of the resultant scaling. Transistor density has reached almost one billion per chip, and transistor delays are in picoseconds. However, scaling has slowed down, and the semiconductor industry is now facing several challenges. Hybrid CMOS/nano technologies, such as CMOL, are considered as an interim solution to some of the challenges. Another potential architectural solution includes specialized architectures for applications/models in the intelligent computing domain, one aspect of which includes abstract computational models inspired from the neuro/cognitive sciences. Consequently in this dissertation, we focus on the hardware implementations of Bayesian Memory (BM), which is a (Bayesian) Biologically Inspired Computational Model (BICM). This model is a simplified version of George and Hawkins' model of the visual cortex, which includes an inference framework based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the BM. This particular methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom hardware architectures using both traditional CMOS and hybrid nanotechnology - CMOL, and investigating the baseline performance/price of these architectures. The results suggest that CMOL is a promising candidate for implementing a BM. Such implementations can utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm2 (TPM) obtained for CMOL based architectures is 32 to 40 times better than the TPM for a CMOS based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a PC implementation. We later use this methodology to investigate the hardware implementations of cortex-scale spiking neural system, which is an approximate neural equivalent of BICM based cortex-scale system. The results of this investigation also suggest that CMOL is a promising candidate to implement such large-scale neuromorphic systems. In general, the assessment of such hypothetical baseline hardware architectures provides the prospects for building large-scale (mammalian cortex-scale) implementations of neuromorphic/Bayesian/intelligent systems using state-of-the-art and beyond state-of-the-art silicon structures.
MAX - An advanced parallel computer for space applications
NASA Technical Reports Server (NTRS)
Lewis, Blair F.; Bunker, Robert L.
1991-01-01
MAX is a fault-tolerant multicomputer hardware and software architecture designed to meet the needs of NASA spacecraft systems. It consists of conventional computing modules (computers) connected via a dual network topology. One network is used to transfer data among the computers and between computers and I/O devices. This network's topology is arbitrary. The second network operates as a broadcast medium for operating system synchronization messages and supports the operating system's Byzantine resilience. A fully distributed operating system supports multitasking in an asynchronous event and data driven environment. A large grain dataflow paradigm is used to coordinate the multitasking and provide easy control of concurrency. It is the basis of the system's fault tolerance and allows both static and dynamical location of tasks. Redundant execution of tasks with software voting of results may be specified for critical tasks. The dataflow paradigm also supports simplified software design, test and maintenance. A unique feature is a method for reliably patching code in an executing dataflow application.
Towards fully analog hardware reservoir computing for speech recognition
NASA Astrophysics Data System (ADS)
Smerieri, Anteo; Duport, François; Paquot, Yvan; Haelterman, Marc; Schrauwen, Benjamin; Massar, Serge
2012-09-01
Reservoir computing is a very recent, neural network inspired unconventional computation technique, where a recurrent nonlinear system is used in conjunction with a linear readout to perform complex calculations, leveraging its inherent internal dynamics. In this paper we show the operation of an optoelectronic reservoir computer in which both the nonlinear recurrent part and the readout layer are implemented in hardware for a speech recognition application. The performance obtained is close to the one of to state-of-the-art digital reservoirs, while the analog architecture opens the way to ultrafast computation.
A low-power and high-quality implementation of the discrete cosine transformation
NASA Astrophysics Data System (ADS)
Heyne, B.; Götze, J.
2007-06-01
In this paper a computationally efficient and high-quality preserving DCT architecture is presented. It is obtained by optimizing the Loeffler DCT based on the Cordic algorithm. The computational complexity is reduced from 11 multiply and 29 add operations (Loeffler DCT) to 38 add and 16 shift operations (which is similar to the complexity of the binDCT). The experimental results show that the proposed DCT algorithm not only reduces the computational complexity significantly, but also retains the good transformation quality of the Loeffler DCT. Therefore, the proposed Cordic based Loeffler DCT is especially suited for low-power and high-quality CODECs in battery-based systems.
Diamond Eye: a distributed architecture for image data mining
NASA Astrophysics Data System (ADS)
Burl, Michael C.; Fowlkes, Charless; Roden, Joe; Stechert, Andre; Mukhtar, Saleem
1999-02-01
Diamond Eye is a distributed software architecture, which enables users (scientists) to analyze large image collections by interacting with one or more custom data mining servers via a Java applet interface. Each server is coupled with an object-oriented database and a computational engine, such as a network of high-performance workstations. The database provides persistent storage and supports querying of the 'mined' information. The computational engine provides parallel execution of expensive image processing, object recognition, and query-by-content operations. Key benefits of the Diamond Eye architecture are: (1) the design promotes trial evaluation of advanced data mining and machine learning techniques by potential new users (all that is required is to point a web browser to the appropriate URL), (2) software infrastructure that is common across a range of science mining applications is factored out and reused, and (3) the system facilitates closer collaborations between algorithm developers and domain experts.
System architecture for asynchronous multi-processor robotic control system
NASA Technical Reports Server (NTRS)
Steele, Robert D.; Long, Mark; Backes, Paul
1993-01-01
The architecture for the Modular Telerobot Task Execution System (MOTES) as implemented in the Supervisory Telerobotics (STELER) Laboratory is described. MOTES is the software component of the remote site of a local-remote telerobotic system which is being developed for NASA for space applications, in particular Space Station Freedom applications. The system is being developed to provide control and supervised autonomous control to support both space based operation and ground-remote control with time delay. The local-remote architecture places task planning responsibilities at the local site and task execution responsibilities at the remote site. This separation allows the remote site to be designed to optimize task execution capability within a limited computational environment such as is expected in flight systems. The local site task planning system could be placed on the ground where few computational limitations are expected. MOTES is written in the Ada programming language for a multiprocessor environment.
The Efficiency and the Scalability of an Explicit Operator on an IBM POWER4 System
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Biegel, Bryan A. (Technical Monitor)
2002-01-01
We present an evaluation of the efficiency and the scalability of an explicit CFD operator on an IBM POWER4 system. The POWER4 architecture exhibits a common trend in HPC architectures: boosting CPU processing power by increasing the number of functional units, while hiding the latency of memory access by increasing the depth of the memory hierarchy. The overall machine performance depends on the ability of the caches-buses-fabric-memory to feed the functional units with the data to be processed. In this study we evaluate the efficiency and scalability of one explicit CFD operator on an IBM POWER4. This operator performs computations at the points of a Cartesian grid and involves a few dozen floating point numbers and on the order of 100 floating point operations per grid point. The computations in all grid points are independent. Specifically, we estimate the efficiency of the RHS operator (SP of NPB) on a single processor as the observed/peak performance ratio. Then we estimate the scalability of the operator on a single chip (2 CPUs), a single MCM (8 CPUs), 16 CPUs, and the whole machine (32 CPUs). Then we perform the same measurements for a chache-optimized version of the RHS operator. For our measurements we use the HPM (Hardware Performance Monitor) counters available on the POWER4. These counters allow us to analyze the obtained performance results.
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan R.
2015-05-01
New radar applications need to perform complex algorithms and process large quantity of data to generate useful information for the users. This situation has motivated the search for better processing solutions that include low power high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementation of adaptive pulse compression for real-time transceiver optimization are presented, they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessor as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such matrix multiplication and matrix inversion which are essential units to solve the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through the high performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band tested together with a low-cost channel emulator for different types of waveforms.
Computers in Academic Architecture Libraries.
ERIC Educational Resources Information Center
Willis, Alfred; And Others
1992-01-01
Computers are widely used in architectural research and teaching in U.S. schools of architecture. A survey of libraries serving these schools sought information on the emphasis placed on computers by the architectural curriculum, accessibility of computers to library staff, and accessibility of computers to library patrons. Survey results and…
NASA Technical Reports Server (NTRS)
Kao, M. H.; Bodenheimer, R. E.
1976-01-01
The tse computer's capability of achieving image congruence between temporal and multiple images with misregistration due to rotational differences is reported. The coordinate transformations are obtained and a general algorithms is devised to perform image rotation using tse operations very efficiently. The details of this algorithm as well as its theoretical implications are presented. Step by step procedures of image registration are described in detail. Numerous examples are also employed to demonstrate the correctness and the effectiveness of the algorithms and conclusions and recommendations are made.
Hardware implementation of CMAC neural network with reduced storage requirement.
Ker, J S; Kuo, Y H; Wen, R C; Liu, B D
1997-01-01
The cerebellar model articulation controller (CMAC) neural network has the advantages of fast convergence speed and low computation complexity. However, it suffers from a low storage space utilization rate on weight memory. In this paper, we propose a direct weight address mapping approach, which can reduce the required weight memory size with a utilization rate near 100%. Based on such an address mapping approach, we developed a pipeline architecture to efficiently perform the addressing operations. The proposed direct weight address mapping approach also speeds up the computation for the generation of weight addresses. Besides, a CMAC hardware prototype used for color calibration has been implemented to confirm the proposed approach and architecture.
Integrated optical circuits for numerical computation
NASA Technical Reports Server (NTRS)
Verber, C. M.; Kenan, R. P.
1983-01-01
The development of integrated optical circuits (IOC) for numerical-computation applications is reviewed, with a focus on the use of systolic architectures. The basic architecture criteria for optical processors are shown to be the same as those proposed by Kung (1982) for VLSI design, and the advantages of IOCs over bulk techniques are indicated. The operation and fabrication of electrooptic grating structures are outlined, and the application of IOCs of this type to an existing 32-bit, 32-Mbit/sec digital correlator, a proposed matrix multiplier, and a proposed pipeline processor for polynomial evaluation is discussed. The problems arising from the inherent nonlinearity of electrooptic gratings are considered. Diagrams and drawings of the application concepts are provided.
Production experience with the ATLAS Event Service
NASA Astrophysics Data System (ADS)
Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.
PIC codes for plasma accelerators on emerging computer architectures (GPUS, Multicore/Manycore CPUS)
NASA Astrophysics Data System (ADS)
Vincenti, Henri
2016-03-01
The advent of exascale computers will enable 3D simulations of a new laser-plasma interaction regimes that were previously out of reach of current Petasale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potentialities of these new computing architectures. Indeed, achieving Exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy consuming part of an algorithm future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute nodes (''fat nodes'') that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, CPU machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPU's also have a reduced clock speed per core and can process Multiple Instructions on Multiple Datas (MIMD). At the software level Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for Multicore/Manycore CPU) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high performance skeleton PIC code PICSAR to both achieve good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the Pseudo-sepctral quasi-cylindrical code FBPIC on GPUs using the Numba python compiler.
A surface code quantum computer in silicon
Hill, Charles D.; Peretz, Eldad; Hile, Samuel J.; House, Matthew G.; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y.; Hollenberg, Lloyd C. L.
2015-01-01
The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel—posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited. PMID:26601310
A surface code quantum computer in silicon.
Hill, Charles D; Peretz, Eldad; Hile, Samuel J; House, Matthew G; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y; Hollenberg, Lloyd C L
2015-10-01
The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel-posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited.
Sengupta, Abhronil; Shim, Yong; Roy, Kaushik
2016-12-01
Non-Boolean computing based on emerging post-CMOS technologies can potentially pave the way for low-power neural computing platforms. However, existing work on such emerging neuromorphic architectures have either focused on solely mimicking the neuron, or the synapse functionality. While memristive devices have been proposed to emulate biological synapses, spintronic devices have proved to be efficient at performing the thresholding operation of the neuron at ultra-low currents. In this work, we propose an All-Spin Artificial Neural Network where a single spintronic device acts as the basic building block of the system. The device offers a direct mapping to synapse and neuron functionalities in the brain while inter-layer network communication is accomplished via CMOS transistors. To the best of our knowledge, this is the first demonstration of a neural architecture where a single nanoelectronic device is able to mimic both neurons and synapses. The ultra-low voltage operation of low resistance magneto-metallic neurons enables the low-voltage operation of the array of spintronic synapses, thereby leading to ultra-low power neural architectures. Device-level simulations, calibrated to experimental results, was used to drive the circuit and system level simulations of the neural network for a standard pattern recognition problem. Simulation studies indicate energy savings by ∼ 100× in comparison to a corresponding digital/analog CMOS neuron implementation.
Recent Trends in Spintronics-Based Nanomagnetic Logic
NASA Astrophysics Data System (ADS)
Das, Jayita; Alam, Syed M.; Bhanja, Sanjukta
2014-09-01
With the growing concerns of standby power in sub-100-nm CMOS technologies, alternative computing techniques and memory technologies are explored. Spin transfer torque magnetoresistive RAM (STT-MRAM) is one such nonvolatile memory relying on magnetic tunnel junctions (MTJs) to store information. It uses spin transfer torque to write information and magnetoresistance to read information. In 2012, Everspin Technologies, Inc. commercialized the first 64Mbit Spin Torque MRAM. On the computing end, nanomagnetic logic (NML) is a promising technique with zero leakage and high data retention. In 2000, Cowburn and Welland first demonstrated its potential in logic and information propagation through magnetostatic interaction in a chain of single domain circular nanomagnetic dots of Supermalloy (Ni80Fe14Mo5X1, X is other metals). In 2006, Imre et al. demonstrated wires and majority gates followed by coplanar cross wire systems demonstration in 2010 by Pulecio et al. Since 2004 researchers have also investigated the potential of MTJs in logic. More recently with dipolar coupling between MTJs demonstrated in 2012, logic-in-memory architecture with STT-MRAM have been investigated. The architecture borrows the computing concept from NML and read and write style from MRAM. The architecture can switch its operation between logic and memory modes with clock as classifier. Further through logic partitioning between MTJ and CMOS plane, a significant performance boost has been observed in basic computing blocks within the architecture. In this work, we have explored the developments in NML, in MTJs and more recent developments in hybrid MTJ/CMOS logic-in-memory architecture and its unique logic partitioning capability.
Hybrid architecture for building secure sensor networks
NASA Astrophysics Data System (ADS)
Owens, Ken R., Jr.; Watkins, Steve E.
2012-04-01
Sensor networks have various communication and security architectural concerns. Three approaches are defined to address these concerns for sensor networks. The first area is the utilization of new computing architectures that leverage embedded virtualization software on the sensor. Deploying a small, embedded virtualization operating system on the sensor nodes that is designed to communicate to low-cost cloud computing infrastructure in the network is the foundation to delivering low-cost, secure sensor networks. The second area focuses on securing the sensor. Sensor security components include developing an identification scheme, and leveraging authentication algorithms and protocols that address security assurance within the physical, communication network, and application layers. This function will primarily be accomplished through encrypting the communication channel and integrating sensor network firewall and intrusion detection/prevention components to the sensor network architecture. Hence, sensor networks will be able to maintain high levels of security. The third area addresses the real-time and high priority nature of the data that sensor networks collect. This function requires that a quality-of-service (QoS) definition and algorithm be developed for delivering the right data at the right time. A hybrid architecture is proposed that combines software and hardware features to handle network traffic with diverse QoS requirements.
An Evaluation of an Ada Implementation of the Rete Algorithm for Embedded Flight Processors
1990-12-01
computers was desired. The VAX VMS operating system has many built-in methods for determining program performance (including VAX PCA), but these methods... overviev , of the target environment-- the MIL-STD-1750A VHSIC Avionic Modular Processor ( VA.IP, running under the Ada Avionics Real-Time Software (AARTS... computers . Mil-STD-1750A, the Air Force’s standard flight computer architecture, however, places severe constraints on applications software processing
Restricted access processor - An application of computer security technology
NASA Technical Reports Server (NTRS)
Mcmahon, E. M.
1985-01-01
This paper describes a security guard device that is currently being developed by Computer Sciences Corporation (CSC). The methods used to provide assurance that the system meets its security requirements include the system architecture, a system security evaluation, and the application of formal and informal verification techniques. The combination of state-of-the-art technology and the incorporation of new verification procedures results in a demonstration of the feasibility of computer security technology for operational applications.
High-performance computing in image registration
NASA Astrophysics Data System (ADS)
Zanin, Michele; Remondino, Fabio; Dalla Mura, Mauro
2012-10-01
Thanks to the recent technological advances, a large variety of image data is at our disposal with variable geometric, radiometric and temporal resolution. In many applications the processing of such images needs high performance computing techniques in order to deliver timely responses e.g. for rapid decisions or real-time actions. Thus, parallel or distributed computing methods, Digital Signal Processor (DSP) architectures, Graphical Processing Unit (GPU) programming and Field-Programmable Gate Array (FPGA) devices have become essential tools for the challenging issue of processing large amount of geo-data. The article focuses on the processing and registration of large datasets of terrestrial and aerial images for 3D reconstruction, diagnostic purposes and monitoring of the environment. For the image alignment procedure, sets of corresponding feature points need to be automatically extracted in order to successively compute the geometric transformation that aligns the data. The feature extraction and matching are ones of the most computationally demanding operations in the processing chain thus, a great degree of automation and speed is mandatory. The details of the implemented operations (named LARES) exploiting parallel architectures and GPU are thus presented. The innovative aspects of the implementation are (i) the effectiveness on a large variety of unorganized and complex datasets, (ii) capability to work with high-resolution images and (iii) the speed of the computations. Examples and comparisons with standard CPU processing are also reported and commented.
Atomic switch networks—nanoarchitectonic design of a complex system for natural computing
NASA Astrophysics Data System (ADS)
Demis, E. C.; Aguilera, R.; Sillin, H. O.; Scharnhorst, K.; Sandouk, E. J.; Aono, M.; Stieg, A. Z.; Gimzewski, J. K.
2015-05-01
Self-organized complex systems are ubiquitous in nature, and the structural complexity of these natural systems can be used as a model to design new classes of functional nanotechnology based on highly interconnected networks of interacting units. Conventional fabrication methods for electronic computing devices are subject to known scaling limits, confining the diversity of possible architectures. This work explores methods of fabricating a self-organized complex device known as an atomic switch network and discusses its potential utility in computing. Through a merger of top-down and bottom-up techniques guided by mathematical and nanoarchitectonic design principles, we have produced functional devices comprising nanoscale elements whose intrinsic nonlinear dynamics and memorization capabilities produce robust patterns of distributed activity and a capacity for nonlinear transformation of input signals when configured in the appropriate network architecture. Their operational characteristics represent a unique potential for hardware implementation of natural computation, specifically in the area of reservoir computing—a burgeoning field that investigates the computational aptitude of complex biologically inspired systems.
NASA Astrophysics Data System (ADS)
Iacobucci, Joseph V.
The research objective for this manuscript is to develop a Rapid Architecture Alternative Modeling (RAAM) methodology to enable traceable Pre-Milestone A decision making during the conceptual phase of design of a system of systems. Rather than following current trends that place an emphasis on adding more analysis which tends to increase the complexity of the decision making problem, RAAM improves on current methods by reducing both runtime and model creation complexity. RAAM draws upon principles from computer science, system architecting, and domain specific languages to enable the automatic generation and evaluation of architecture alternatives. For example, both mission dependent and mission independent metrics are considered. Mission dependent metrics are determined by the performance of systems accomplishing a task, such as Probability of Success. In contrast, mission independent metrics, such as acquisition cost, are solely determined and influenced by the other systems in the portfolio. RAAM also leverages advances in parallel computing to significantly reduce runtime by defining executable models that are readily amendable to parallelization. This allows the use of cloud computing infrastructures such as Amazon's Elastic Compute Cloud and the PASTEC cluster operated by the Georgia Institute of Technology Research Institute (GTRI). Also, the amount of data that can be generated when fully exploring the design space can quickly exceed the typical capacity of computational resources at the analyst's disposal. To counter this, specific algorithms and techniques are employed. Streaming algorithms and recursive architecture alternative evaluation algorithms are used that reduce computer memory requirements. Lastly, a domain specific language is created to provide a reduction in the computational time of executing the system of systems models. A domain specific language is a small, usually declarative language that offers expressive power focused on a particular problem domain by establishing an effective means to communicate the semantics from the RAAM framework. These techniques make it possible to include diverse multi-metric models within the RAAM framework in addition to system and operational level trades. A canonical example was used to explore the uses of the methodology. The canonical example contains all of the features of a full system of systems architecture analysis study but uses fewer tasks and systems. Using RAAM with the canonical example it was possible to consider both system and operational level trades in the same analysis. Once the methodology had been tested with the canonical example, a Suppression of Enemy Air Defenses (SEAD) capability model was developed. Due to the sensitive nature of analyses on that subject, notional data was developed. The notional data has similar trends and properties to realistic Suppression of Enemy Air Defenses data. RAAM was shown to be traceable and provided a mechanism for a unified treatment of a variety of metrics. The SEAD capability model demonstrated lower computer runtimes and reduced model creation complexity as compared to methods currently in use. To determine the usefulness of the implementation of the methodology on current computing hardware, RAAM was tested with system of system architecture studies of different sizes. This was necessary since system of systems may be called upon to accomplish thousands of tasks. It has been clearly demonstrated that RAAM is able to enumerate and evaluate the types of large, complex design spaces usually encountered in capability based design, oftentimes providing the ability to efficiently search the entire decision space. The core algorithms for generation and evaluation of alternatives scale linearly with expected problem sizes. The SEAD capability model outputs prompted the discovery a new issue, the data storage and manipulation requirements for an analysis. Two strategies were developed to counter large data sizes, the use of portfolio views and top 'n' analysis. This proved the usefulness of the RAAM framework and methodology during Pre-Milestone A capability based analysis. (Abstract shortened by UMI.).
Benchmarking Memory Performance with the Data Cube Operator
NASA Technical Reports Server (NTRS)
Frumkin, Michael A.; Shabanov, Leonid V.
2004-01-01
Data movement across a computer memory hierarchy and across computational grids is known to be a limiting factor for applications processing large data sets. We use the Data Cube Operator on an Arithmetic Data Set, called ADC, to benchmark capabilities of computers and of computational grids to handle large distributed data sets. We present a prototype implementation of a parallel algorithm for computation of the operatol: The algorithm follows a known approach for computing views from the smallest parent. The ADC stresses all levels of grid memory and storage by producing some of 2d views of an Arithmetic Data Set of d-tuples described by a small number of integers. We control data intensity of the ADC by selecting the tuple parameters, the sizes of the views, and the number of realized views. Benchmarking results of memory performance of a number of computer architectures and of a small computational grid are presented.
Malleable architecture generator for FPGA computing
NASA Astrophysics Data System (ADS)
Gokhale, Maya; Kaba, James; Marks, Aaron; Kim, Jang
1996-10-01
The malleable architecture generator (MARGE) is a tool set that translates high-level parallel C to configuration bit streams for field-programmable logic based computing systems. MARGE creates an application-specific instruction set and generates the custom hardware components required to perform exactly those computations specified by the C program. In contrast to traditional fixed-instruction processors, MARGE's dynamic instruction set creation provides for efficient use of hardware resources. MARGE processes intermediate code in which each operation is annotated by the bit lengths of the operands. Each basic block (sequence of straight line code) is mapped into a single custom instruction which contains all the operations and logic inherent in the block. A synthesis phase maps the operations comprising the instructions into register transfer level structural components and control logic which have been optimized to exploit functional parallelism and function unit reuse. As a final stage, commercial technology-specific tools are used to generate configuration bit streams for the desired target hardware. Technology- specific pre-placed, pre-routed macro blocks are utilized to implement as much of the hardware as possible. MARGE currently supports the Xilinx-based Splash-2 reconfigurable accelerator and National Semiconductor's CLAy-based parallel accelerator, MAPA. The MARGE approach has been demonstrated on systolic applications such as DNA sequence comparison.
Exploring quantum computing application to satellite data assimilation
NASA Astrophysics Data System (ADS)
Cheung, S.; Zhang, S. Q.
2015-12-01
This is an exploring work on potential application of quantum computing to a scientific data optimization problem. On classical computational platforms, the physical domain of a satellite data assimilation problem is represented by a discrete variable transform, and classical minimization algorithms are employed to find optimal solution of the analysis cost function. The computation becomes intensive and time-consuming when the problem involves large number of variables and data. The new quantum computer opens a very different approach both in conceptual programming and in hardware architecture for solving optimization problem. In order to explore if we can utilize the quantum computing machine architecture, we formulate a satellite data assimilation experimental case in the form of quadratic programming optimization problem. We find a transformation of the problem to map it into Quadratic Unconstrained Binary Optimization (QUBO) framework. Binary Wavelet Transform (BWT) will be applied to the data assimilation variables for its invertible decomposition and all calculations in BWT are performed by Boolean operations. The transformed problem will be experimented as to solve for a solution of QUBO instances defined on Chimera graphs of the quantum computer.
Automation of Shuttle Tile Inspection - Engineering methodology for Space Station
NASA Technical Reports Server (NTRS)
Wiskerchen, M. J.; Mollakarimi, C.
1987-01-01
The Space Systems Integration and Operations Research Applications (SIORA) Program was initiated in late 1986 as a cooperative applications research effort between Stanford University, NASA Kennedy Space Center, and Lockheed Space Operations Company. One of the major initial SIORA tasks was the application of automation and robotics technology to all aspects of the Shuttle tile processing and inspection system. This effort has adopted a systems engineering approach consisting of an integrated set of rapid prototyping testbeds in which a government/university/industry team of users, technologists, and engineers test and evaluate new concepts and technologies within the operational world of Shuttle. These integrated testbeds include speech recognition and synthesis, laser imaging inspection systems, distributed Ada programming environments, distributed relational database architectures, distributed computer network architectures, multimedia workbenches, and human factors considerations.
NASA Astrophysics Data System (ADS)
Guzman, J. C.; Bennett, T.
2008-08-01
The Convergent Radio Astronomy Demonstrator (CONRAD) is a collaboration between the computing teams of two SKA pathfinder instruments, MeerKAT (South Africa) and ASKAP (Australia). Our goal is to produce the required common software to operate, process and store the data from the two instruments. Both instruments are synthesis arrays composed of a large number of antennas (40 - 100) operating at centimeter wavelengths with wide-field capabilities. Key challenges are the processing of high volume of data in real-time as well as the remote mode of operations. Here we present the software architecture for CONRAD. Our design approach is to maximize the use of open solutions and third-party software widely deployed in commercial applications, such as SNMP and LDAP, and to utilize modern web-based technologies for the user interfaces, such as AJAX.
NASA Technical Reports Server (NTRS)
Benyo, Theresa L.
2002-01-01
Integration of a supersonic inlet simulation with a computer aided design (CAD) system is demonstrated. The integration is performed using the Project Integration Architecture (PIA). PIA provides a common environment for wrapping many types of applications. Accessing geometry data from CAD files is accomplished by incorporating appropriate function calls from the Computational Analysis Programming Interface (CAPRI). CAPRI is a CAD vendor neutral programming interface that aids in acquiring geometry data directly from CAD files. The benefits of wrapping a supersonic inlet simulation into PIA using CAPRI are; direct access of geometry data, accurate capture of geometry data, automatic conversion of data units, CAD vendor neutral operation, and on-line interactive history capture. This paper describes the PIA and the CAPRI wrapper and details the supersonic inlet simulation demonstration.
NASA Technical Reports Server (NTRS)
1989-01-01
The results of the refined conceptual design phase (task 5) of the Simulation Computer System (SCS) study are reported. The SCS is the computational portion of the Payload Training Complex (PTC) providing simulation based training on payload operations of the Space Station Freedom (SSF). In task 4 of the SCS study, the range of architectures suitable for the SCS was explored. Identified system architectures, along with their relative advantages and disadvantages for SCS, were presented in the Conceptual Design Report. Six integrated designs-combining the most promising features from the architectural formulations-were additionally identified in the report. The six integrated designs were evaluated further to distinguish the more viable designs to be refined as conceptual designs. The three designs that were selected represent distinct approaches to achieving a capable and cost effective SCS configuration for the PTC. Here, the results of task 4 (input to this task) are briefly reviewed. Then, prior to describing individual conceptual designs, the PTC facility configuration and the SSF systems architecture that must be supported by the SCS are reviewed. Next, basic features of SCS implementation that have been incorporated into all selected SCS designs are considered. The details of the individual SCS designs are then presented before making a final comparison of the three designs.
CLOCS (Computer with Low Context-Switching Time) Architecture Reference Documents
1988-05-06
Peculiarities The only state inside the central processing unit(CPU) is a program status word. All data operations are memory to memory. One result of this... to the challenge "if I whore to design RISC, this is how I would do it." The architecture was designed by Mark Davis and Bill Gallmeister. 1.2...are memory to memory. Any special devices added should be memory mapped. The program counter is even memory mapped. 1.3.1 Working storage There is no
High-throughput sample adaptive offset hardware architecture for high-efficiency video coding
NASA Astrophysics Data System (ADS)
Zhou, Wei; Yan, Chang; Zhang, Jingzhi; Zhou, Xin
2018-03-01
A high-throughput hardware architecture for a sample adaptive offset (SAO) filter in the high-efficiency video coding video coding standard is presented. First, an implementation-friendly and simplified bitrate estimation method of rate-distortion cost calculation is proposed to reduce the computational complexity in the mode decision of SAO. Then, a high-throughput VLSI architecture for SAO is presented based on the proposed bitrate estimation method. Furthermore, multiparallel VLSI architecture for in-loop filters, which integrates both deblocking filter and SAO filter, is proposed. Six parallel strategies are applied in the proposed in-loop filters architecture to improve the system throughput and filtering speed. Experimental results show that the proposed in-loop filters architecture can achieve up to 48% higher throughput in comparison with prior work. The proposed architecture can reach a high-operating clock frequency of 297 MHz with TSMC 65-nm library and meet the real-time requirement of the in-loop filters for 8 K × 4 K video format at 132 fps.
A Distributed Laboratory for Event-Driven Coastal Prediction and Hazard Planning
NASA Astrophysics Data System (ADS)
Bogden, P.; Allen, G.; MacLaren, J.; Creager, G. J.; Flournoy, L.; Sheng, Y. P.; Graber, H.; Graves, S.; Conover, H.; Luettich, R.; Perrie, W.; Ramakrishnan, L.; Reed, D. A.; Wang, H. V.
2006-12-01
The 2005 Atlantic hurricane season was the most active in recorded history. Collectively, 2005 hurricanes caused more than 2,280 deaths and record damages of over 100 billion dollars. Of the storms that made landfall, Dennis, Emily, Katrina, Rita, and Wilma caused most of the destruction. Accurate predictions of storm-driven surge, wave height, and inundation can save lives and help keep recovery costs down, provided the information gets to emergency response managers in time. The information must be available well in advance of landfall so that responders can weigh the costs of unnecessary evacuation against the costs of inadequate preparation. The SURA Coastal Ocean Observing and Prediction (SCOOP) Program is a multi-institution collaboration implementing a modular, distributed service-oriented architecture for real time prediction and visualization of the impacts of extreme atmospheric events. The modular infrastructure enables real-time prediction of multi- scale, multi-model, dynamic, data-driven applications. SURA institutions are working together to create a virtual and distributed laboratory integrating coastal models, simulation data, and observations with computational resources and high speed networks. The loosely coupled architecture allows teams of computer and coastal scientists at multiple institutions to innovate complex system components that are interconnected with relatively stable interfaces. The operational system standardizes at the interface level to enable substantial innovation by complementary communities of coastal and computer scientists. This architectural philosophy solves a long-standing problem associated with the transition from research to operations. The SCOOP Program thereby implements a prototype laboratory consistent with the vision of a national, multi-agency initiative called the Integrated Ocean Observing System (IOOS). Several service- oriented components of the SCOOP enterprise architecture have already been designed and implemented, including data archive and transport services, metadata registry and retrieval (catalog), resource management, and portal interfaces. SCOOP partners are integrating these at the service level and implementing reconfigurable workflows for several kinds of user scenarios, and are working with resource providers to prototype new policies and technologies for on-demand computing.
Multicore Education through Simulation
ERIC Educational Resources Information Center
Ozturk, O.
2011-01-01
A project-oriented course for advanced undergraduate and graduate students is described for simulating multiple processor cores. Simics, a free simulator for academia, was utilized to enable students to explore computer architecture, operating systems, and hardware/software cosimulation. Motivation for including this course in the curriculum is…
ERIC Educational Resources Information Center
Executive Educator, 1994
1994-01-01
This issue of "The Electronic School" features a special forum on computer networking. Articles specifically focus on network operating systems, cabling requirements, and network architecture. Tom Wall argues that virtual reality is not yet ready for classroom use. B.J. Novitsky profiles two high schools experimenting with CD-ROM…
Service Oriented Architecture for Coast Guard Command and Control
2007-03-01
Operations BPEL4WS The Business Process Execution Language for Web Services BPMN Business Process Modeling Notation CASP Computer Aided Search Planning...Business Process Modeling Notation ( BPMN ) provides a standardized graphical notation for drawing business processes in a workflow. Software tools
An efficient three-dimensional Poisson solver for SIMD high-performance-computing architectures
NASA Technical Reports Server (NTRS)
Cohl, H.
1994-01-01
We present an algorithm that solves the three-dimensional Poisson equation on a cylindrical grid. The technique uses a finite-difference scheme with operator splitting. This splitting maps the banded structure of the operator matrix into a two-dimensional set of tridiagonal matrices, which are then solved in parallel. Our algorithm couples FFT techniques with the well-known ADI (Alternating Direction Implicit) method for solving Elliptic PDE's, and the implementation is extremely well suited for a massively parallel environment like the SIMD architecture of the MasPar MP-1. Due to the highly recursive nature of our problem, we believe that our method is highly efficient, as it avoids excessive interprocessor communication.
Ando, S; Sekine, S; Mita, M; Katsuo, S
1989-12-15
An architecture and the algorithms for matrix multiplication using optical flip-flops (OFFs) in optical processors are proposed based on residue arithmetic. The proposed system is capable of processing all elements of matrices in parallel utilizing the information retrieving ability of optical Fourier processors. The employment of OFFs enables bidirectional data flow leading to a simpler architecture and the burden of residue-to-decimal (or residue-to-binary) conversion to operation time can be largely reduced by processing all elements in parallel. The calculated characteristics of operation time suggest a promising use of the system in a real time 2-D linear transform.
Exascale computing and what it means for shock physics
NASA Astrophysics Data System (ADS)
Germann, Timothy
2015-06-01
The U.S. Department of Energy is preparing to launch an Exascale Computing Initiative, to address the myriad challenges required to deploy and effectively utilize an exascale-class supercomputer (i.e., one capable of performing 1018 operations per second) in the 2023 timeframe. Since physical (power dissipation) requirements limit clock rates to at most a few GHz, this will necessitate the coordination of on the order of a billion concurrent operations, requiring sophisticated system and application software, and underlying mathematical algorithms, that may differ radically from traditional approaches. Even at the smaller workstation or cluster level of computation, the massive concurrency and heterogeneity within each processor will impact computational scientists. Through the multi-institutional, multi-disciplinary Exascale Co-design Center for Materials in Extreme Environments (ExMatEx), we have initiated an early and deep collaboration between domain (computational materials) scientists, applied mathematicians, computer scientists, and hardware architects, in order to establish the relationships between algorithms, software stacks, and architectures needed to enable exascale-ready materials science application codes within the next decade. In my talk, I will discuss these challenges, and what it will mean for exascale-era electronic structure, molecular dynamics, and engineering-scale simulations of shock-compressed condensed matter. In particular, we anticipate that the emerging hierarchical, heterogeneous architectures can be exploited to achieve higher physical fidelity simulations using adaptive physics refinement. This work is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research.
Advanced information processing system: Local system services
NASA Technical Reports Server (NTRS)
Burkhardt, Laura; Alger, Linda; Whittredge, Roy; Stasiowski, Peter
1989-01-01
The Advanced Information Processing System (AIPS) is a multi-computer architecture composed of hardware and software building blocks that can be configured to meet a broad range of application requirements. The hardware building blocks are fault-tolerant, general-purpose computers, fault-and damage-tolerant networks (both computer and input/output), and interfaces between the networks and the computers. The software building blocks are the major software functions: local system services, input/output, system services, inter-computer system services, and the system manager. The foundation of the local system services is an operating system with the functions required for a traditional real-time multi-tasking computer, such as task scheduling, inter-task communication, memory management, interrupt handling, and time maintenance. Resting on this foundation are the redundancy management functions necessary in a redundant computer and the status reporting functions required for an operator interface. The functional requirements, functional design and detailed specifications for all the local system services are documented.
Datta, Asit K; Munshi, Soumika
2002-03-10
Based on the negabinary number representation, parallel one-step arithmetic operations (that is, addition and subtraction), logical operations, and matrix-vector multiplication on data have been optically implemented, by use of a two-dimensional spatial-encoding technique. For addition and subtraction, one of the operands in decimal form is converted into the unsigned negabinary form, whereas the other decimal number is represented in the signed negabinary form. The result of operation is obtained in the mixed negabinary form and is converted back into decimal. Matrix-vector multiplication for unsigned negabinary numbers is achieved through the convolution technique. Both of the operands for logical operation are converted to their signed negabinary forms. All operations are implemented by use of a unique optical architecture. The use of a single liquid-crystal-display panel to spatially encode the input data, operational kernels, and decoding masks have simplified the architecture as well as reduced the cost and complexity.
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi; Alkalai, Leon
1996-01-01
Recent changes within NASA's space exploration program favor the design, implementation, and operation of low cost, lightweight, small and micro spacecraft with multiple launches per year. In order to meet the future needs of these missions with regard to the use of spacecraft microelectronics, NASA's advanced flight computing (AFC) program is currently considering industrial cooperation and advanced packaging architectures. In relation to this, the AFC program is reviewed, considering the design and implementation of NASA's AFC multichip module.
The Information Science Experiment System - The computer for science experiments in space
NASA Technical Reports Server (NTRS)
Foudriat, Edwin C.; Husson, Charles
1989-01-01
The concept of the Information Science Experiment System (ISES), potential experiments, and system requirements are reviewed. The ISES is conceived as a computer resource in space whose aim is to assist computer, earth, and space science experiments, to develop and demonstrate new information processing concepts, and to provide an experiment base for developing new information technology for use in space systems. The discussion covers system hardware and architecture, operating system software, the user interface, and the ground communication link.
Nonvolatile “AND,” “OR,” and “NOT” Boolean logic gates based on phase-change memory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Y.; Zhong, Y. P.; Deng, Y. F.
2013-12-21
Electronic devices or circuits that can implement both logic and memory functions are regarded as the building blocks for future massive parallel computing beyond von Neumann architecture. Here we proposed phase-change memory (PCM)-based nonvolatile logic gates capable of AND, OR, and NOT Boolean logic operations verified in SPICE simulations and circuit experiments. The logic operations are parallel computing and results can be stored directly in the states of the logic gates, facilitating the combination of computing and memory in the same circuit. These results are encouraging for ultralow-power and high-speed nonvolatile logic circuit design based on novel memory devices.
2010-06-01
DATES COVEREDAPR 2009 – JAN 2010 (From - To) APR 2009 – JAN 2010 4. TITLE AND SUBTITLE EMERGING NEUROMORPHIC COMPUTING ARCHITECTURES AND ENABLING...14. ABSTRACT The highly cross-disciplinary emerging field of neuromorphic computing architectures for cognitive information processing applications...belief systems, software, computer engineering, etc. In our effort to develop cognitive systems atop a neuromorphic computing architecture, we explored
Hybrid techniques for the digital control of mechanical and optical systems
NASA Astrophysics Data System (ADS)
Acernese, Fausto; Barone, Fabrizio; De Rosa, Rosario; Eleuteri, Antonio; Milano, Leopoldo; Pardi, Silvio; Ricciardi, Iolanda; Russo, Guido
2004-07-01
One of the main requirements of a digital system for the control of interferometric detectors of gravitational waves is the computing power, that is a direct consequence of the increasing complexity of the digital algorithms necessary for the control signals generation. For this specific task many specialised non standard real-time architectures have been developed, often very expensive and difficult to upgrade. On the other hand, such computing power is generally fully available for off-line applications on standard Pc based systems. Therefore, a possible and obvious solution may be provided by the integration of both the the real-time and off-line architecture resulting in a hybrid control system architecture based on standards available components, trying to get both the advantages of the perfect data synchronization provided by the real-time systems and by the large computing power available on Pc based systems. Such integration may be provided by the implementation of the link between the two different architectures through the standard Ethernet network, whose data transfer speed is largely increasing in these years, using the TCP/IP and UDP protocols. In this paper we describe the architecture of an hybrid Ethernet based real-time control system protoype we implemented in Napoli, discussing its characteristics and performances. Finally we discuss a possible application to the real-time control of a suspended mass of the mode cleaner of the 3m prototype optical interferometer for gravitational wave detection (IDGW-3P) operational in Napoli.
Evolving binary classifiers through parallel computation of multiple fitness cases.
Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni
2005-06-01
This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. Such an approach achieves high computation efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized explicitly using parallel computation in the case of cellular programming or implicitly taking advantage of the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
Reconfigurable vision system for real-time applications
NASA Astrophysics Data System (ADS)
Torres-Huitzil, Cesar; Arias-Estrada, Miguel
2002-03-01
Recently, a growing community of researchers has used reconfigurable systems to solve computationally intensive problems. Reconfigurability provides optimized processors for systems on chip designs, and makes easy to import technology to a new system through reusable modules. The main objective of this work is the investigation of a reconfigurable computer system targeted for computer vision and real-time applications. The system is intended to circumvent the inherent computational load of most window-based computer vision algorithms. It aims to build a system for such tasks by providing an FPGA-based hardware architecture for task specific vision applications with enough processing power, using the minimum amount of hardware resources as possible, and a mechanism for building systems using this architecture. Regarding the software part of the system, a library of pre-designed and general-purpose modules that implement common window-based computer vision operations is being investigated. A common generic interface is established for these modules in order to define hardware/software components. These components can be interconnected to develop more complex applications, providing an efficient mechanism for transferring image and result data among modules. Some preliminary results are presented and discussed.
COMBAT: mobile-Cloud-based cOmpute/coMmunications infrastructure for BATtlefield applications
NASA Astrophysics Data System (ADS)
Soyata, Tolga; Muraleedharan, Rajani; Langdon, Jonathan; Funai, Colin; Ames, Scott; Kwon, Minseok; Heinzelman, Wendi
2012-05-01
The amount of data processed annually over the Internet has crossed the zetabyte boundary, yet this Big Data cannot be efficiently processed or stored using today's mobile devices. Parallel to this explosive growth in data, a substantial increase in mobile compute-capability and the advances in cloud computing have brought the state-of-the- art in mobile-cloud computing to an inflection point, where the right architecture may allow mobile devices to run applications utilizing Big Data and intensive computing. In this paper, we propose the MObile Cloud-based Hybrid Architecture (MOCHA), which formulates a solution to permit mobile-cloud computing applications such as object recognition in the battlefield by introducing a mid-stage compute- and storage-layer, called the cloudlet. MOCHA is built on the key observation that many mobile-cloud applications have the following characteristics: 1) they are compute-intensive, requiring the compute-power of a supercomputer, and 2) they use Big Data, requiring a communications link to cloud-based database sources in near-real-time. In this paper, we describe the operation of MOCHA in battlefield applications, by formulating the aforementioned mobile and cloudlet to be housed within a soldier's vest and inside a military vehicle, respectively, and enabling access to the cloud through high latency satellite links. We provide simulations using the traditional mobile-cloud approach as well as utilizing MOCHA with a mid-stage cloudlet to quantify the utility of this architecture. We show that the MOCHA platform for mobile-cloud computing promises a future for critical battlefield applications that access Big Data, which is currently not possible using existing technology.
Signal processing applications of massively parallel charge domain computing devices
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)
1999-01-01
The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the Fourier transform of an incoming vector. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.
Memristor-Based Synapse Design and Training Scheme for Neuromorphic Computing Architecture
2012-06-01
system level built upon the conventional Von Neumann computer architecture [2][3]. Developing the neuromorphic architecture at chip level by...SCHEME FOR NEUROMORPHIC COMPUTING ARCHITECTURE 5a. CONTRACT NUMBER FA8750-11-2-0046 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 62788F 6...creation of memristor-based neuromorphic computing architecture. Rather than the existing crossbar-based neuron network designs, we focus on memristor
Telescience - Optimizing aerospace science return through geographically distributed operations
NASA Technical Reports Server (NTRS)
Rasmussen, Daryl N.; Mian, Arshad M.
1990-01-01
The paper examines the objectives and requirements of teleoperations, defined as the means and process for scientists, NASA operations personnel, and astronauts to conduct payload operations as if these were colocated. This process is described in terms of Space Station era platforms. Some of the enabling technologies are discussed, including open architecture workstations, distributed computing, transaction management, expert systems, and high-speed networks. Recent testbedding experiments are surveyed to highlight some of the human factors requirements.
Dictionary of marine technology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, D.A.
1989-01-01
This book is intended to replace G. O. Watson's Dictionary of Marine Engineering and Nautical Terms (1964). It includes terms from marine and offshore engineering, naval architecture, shipbuilding, shipping, ship operation, and relevant terms from the electronics, control and computing fields. A few nautical terms are also included.
A modular architecture for transparent computation in recurrent neural networks.
Carmantini, Giovanni S; Beim Graben, Peter; Desroches, Mathieu; Rodrigues, Serafim
2017-01-01
Computation is classically studied in terms of automata, formal languages and algorithms; yet, the relation between neural dynamics and symbolic representations and operations is still unclear in traditional eliminative connectionism. Therefore, we suggest a unique perspective on this central issue, to which we would like to refer as transparent connectionism, by proposing accounts of how symbolic computation can be implemented in neural substrates. In this study we first introduce a new model of dynamics on a symbolic space, the versatile shift, showing that it supports the real-time simulation of a range of automata. We then show that the Gödelization of versatile shifts defines nonlinear dynamical automata, dynamical systems evolving on a vectorial space. Finally, we present a mapping between nonlinear dynamical automata and recurrent artificial neural networks. The mapping defines an architecture characterized by its granular modularity, where data, symbolic operations and their control are not only distinguishable in activation space, but also spatially localizable in the network itself, while maintaining a distributed encoding of symbolic representations. The resulting networks simulate automata in real-time and are programmed directly, in the absence of network training. To discuss the unique characteristics of the architecture and their consequences, we present two examples: (i) the design of a Central Pattern Generator from a finite-state locomotive controller, and (ii) the creation of a network simulating a system of interactive automata that supports the parsing of garden-path sentences as investigated in psycholinguistics experiments. Copyright © 2016 Elsevier Ltd. All rights reserved.
Li-ion synaptic transistor for low power analog computing
Fuller, Elliot J.; Gabaly, Farid El; Leonard, Francois; ...
2016-11-22
Nonvolatile redox transistors (NVRTs) based upon Li-ion battery materials are demonstrated as memory elements for neuromorphic computer architectures with multi-level analog states, “write” linearity, low-voltage switching, and low power dissipation. Simulations of back propagation using the device properties reach ideal classification accuracy. Finally, physics-based simulations predict energy costs per “write” operation of <10 aJ when scaled to 200 nm × 200 nm.
Image Processing Using a Parallel Architecture.
1987-12-01
ENG/87D-25 Abstract This study developed a set o± low level image processing tools on a parallel computer that allows concurrent processing of images...environment, the set of tools offers a significant reduction in the time required to perform some commonly used image processing operations. vI IMAGE...step toward developing these systems, a structured set of image processing tools was implemented using a parallel computer. More important than
Model of a programmable quantum processing unit based on a quantum transistor effect
NASA Astrophysics Data System (ADS)
Ablayev, Farid; Andrianov, Sergey; Fetisov, Danila; Moiseev, Sergey; Terentyev, Alexandr; Urmanchev, Andrey; Vasiliev, Alexander
2018-02-01
In this paper we propose a model of a programmable quantum processing device realizable with existing nano-photonic technologies. It can be viewed as a basis for new high performance hardware architectures. Protocols for physical implementation of device on the controlled photon transfer and atomic transitions are presented. These protocols are designed for executing basic single-qubit and multi-qubit gates forming a universal set. We analyze the possible operation of this quantum computer scheme. Then we formalize the physical architecture by a mathematical model of a Quantum Processing Unit (QPU), which we use as a basis for the Quantum Programming Framework. This framework makes it possible to perform universal quantum computations in a multitasking environment.
Airport Simulations Using Distributed Computational Resources
NASA Technical Reports Server (NTRS)
McDermott, William J.; Maluf, David A.; Gawdiak, Yuri; Tran, Peter; Clancy, Daniel (Technical Monitor)
2002-01-01
The Virtual National Airspace Simulation (VNAS) will improve the safety of Air Transportation. In 2001, using simulation and information management software running over a distributed network of super-computers, researchers at NASA Ames, Glenn, and Langley Research Centers developed a working prototype of a virtual airspace. This VNAS prototype modeled daily operations of the Atlanta airport by integrating measured operational data and simulation data on up to 2,000 flights a day. The concepts and architecture developed by NASA for this prototype are integral to the National Airspace Simulation to support the development of strategies improving aviation safety, identifying precursors to component failure.
Architecture for Control of the K9 Rover
NASA Technical Reports Server (NTRS)
Bresina, John L.; Bualat, maria; Fair, Michael; Wright, Anne; Washington, Richard
2006-01-01
Software featuring a multilevel architecture is used to control the hardware on the K9 Rover, which is a mobile robot used in research on robots for scientific exploration and autonomous operation in general. The software consists of five types of modules: Device Drivers - These modules, at the lowest level of the architecture, directly control motors, cameras, data buses, and other hardware devices. Resource Managers - Each of these modules controls several device drivers. Resource managers can be commanded by either a remote operator or the pilot or conditional-executive modules described below. Behaviors and Data Processors - These modules perform computations for such functions as planning paths, avoiding obstacles, visual tracking, and stereoscopy. These modules can be commanded only by the pilot. Pilot - The pilot receives a possibly complex command from the remote operator or the conditional executive, then decomposes the command into (1) more-specific commands to the resource managers and (2) requests for information from the behaviors and data processors. Conditional Executive - This highest-level module interprets a command plan sent by the remote operator, determines whether resources required for execution of the plan are available, monitors execution, and, if necessary, selects an alternate branch of the plan.
NASA Astrophysics Data System (ADS)
Blume, H.; Alexandru, R.; Applegate, R.; Giordano, T.; Kamiya, K.; Kresina, R.
1986-06-01
In a digital diagnostic imaging department, the majority of operations for handling and processing of images can be grouped into a small set of basic operations, such as image data buffering and storage, image processing and analysis, image display, image data transmission and image data compression. These operations occur in almost all nodes of the diagnostic imaging communications network of the department. An image processor architecture was developed in which each of these functions has been mapped into hardware and software modules. The modular approach has advantages in terms of economics, service, expandability and upgradeability. The architectural design is based on the principles of hierarchical functionality, distributed and parallel processing and aims at real time response. Parallel processing and real time response is facilitated in part by a dual bus system: a VME control bus and a high speed image data bus, consisting of 8 independent parallel 16-bit busses, capable of handling combined up to 144 MBytes/sec. The presented image processor is versatile enough to meet the video rate processing needs of digital subtraction angiography, the large pixel matrix processing requirements of static projection radiography, or the broad range of manipulation and display needs of a multi-modality diagnostic work station. Several hardware modules are described in detail. For illustrating the capabilities of the image processor, processed 2000 x 2000 pixel computed radiographs are shown and estimated computation times for executing the processing opera-tions are presented.
Matrix-vector multiplication using digital partitioning for more accurate optical computing
NASA Technical Reports Server (NTRS)
Gary, C. K.
1992-01-01
Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable with or greater than electronic computers as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and the speed required for an equivalent throughput as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient alogrithms if coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
VLSI implementation of a new LMS-based algorithm for noise removal in ECG signal
NASA Astrophysics Data System (ADS)
Satheeskumaran, S.; Sabrigiriraj, M.
2016-06-01
Least mean square (LMS)-based adaptive filters are widely deployed for removing artefacts in electrocardiogram (ECG) due to less number of computations. But they posses high mean square error (MSE) under noisy environment. The transform domain variable step-size LMS algorithm reduces the MSE at the cost of computational complexity. In this paper, a variable step-size delayed LMS adaptive filter is used to remove the artefacts from the ECG signal for improved feature extraction. The dedicated digital Signal processors provide fast processing, but they are not flexible. By using field programmable gate arrays, the pipelined architectures can be used to enhance the system performance. The pipelined architecture can enhance the operation efficiency of the adaptive filter and save the power consumption. This technique provides high signal-to-noise ratio and low MSE with reduced computational complexity; hence, it is a useful method for monitoring patients with heart-related problem.
A global spacecraft control network for spacecraft autonomy research
NASA Technical Reports Server (NTRS)
Kitts, Christopher A.
1996-01-01
The development and implementation of the Automated Space System Experimental Testbed (ASSET) space operations and control network, is reported on. This network will serve as a command and control architecture for spacecraft operations and will offer a real testbed for the application and validation of advanced autonomous spacecraft operations strategies. The proposed network will initially consist of globally distributed amateur radio ground stations at locations throughout North America and Europe. These stations will be linked via Internet to various control centers. The Stanford (CA) control center will be capable of human and computer based decision making for the coordination of user experiments, resource scheduling and fault management. The project's system architecture is described together with its proposed use as a command and control system, its value as a testbed for spacecraft autonomy research, and its current implementation.
SPOT4 Operational Control Center (CMP)
NASA Technical Reports Server (NTRS)
Zaouche, G.
1993-01-01
CNES(F) is responsible for the development of a new generation of Operational Control Center (CMP) which will operate the new heliosynchronous remote sensing satellite (SPOT4). This Operational Control Center takes large benefit from the experience of the first generation of control center and from the recent advances in computer technology and standards. The CMP is designed for operating two satellites all the same time with a reduced pool of controllers. The architecture of this CMP is simple, robust, and flexible, since it is based on powerful distributed workstations interconnected through an Ethernet LAN. The application software uses modern and formal software engineering methods, in order to improve quality and reliability, and facilitate maintenance. This software is table driven so it can be easily adapted to other operational needs. Operation tasks are automated to the maximum extent, so that it could be possible to operate the CMP automatically with very limited human interference for supervision and decision making. This paper provides an overview of the SPOTS mission and associated ground segment. It also details the CMP, its functions, and its software and hardware architecture.
The structure of the clouds distributed operating system
NASA Technical Reports Server (NTRS)
Dasgupta, Partha; Leblanc, Richard J., Jr.
1989-01-01
A novel system architecture, based on the object model, is the central structuring concept used in the Clouds distributed operating system. This architecture makes Clouds attractive over a wide class of machines and environments. Clouds is a native operating system, designed and implemented at Georgia Tech. and runs on a set of generated purpose computers connected via a local area network. The system architecture of Clouds is composed of a system-wide global set of persistent (long-lived) virtual address spaces, called objects that contain persistent data and code. The object concept is implemented at the operating system level, thus presenting a single level storage view to the user. Lightweight treads carry computational activity through the code stored in the objects. The persistent objects and threads gives rise to a programming environment composed of shared permanent memory, dispensing with the need for hardware-derived concepts such as the file systems and message systems. Though the hardware may be distributed and may have disks and networks, the Clouds provides the applications with a logically centralized system, based on a shared, structured, single level store. The current design of Clouds uses a minimalist philosophy with respect to both the kernel and the operating system. That is, the kernel and the operating system support a bare minimum of functionality. Clouds also adheres to the concept of separation of policy and mechanism. Most low-level operating system services are implemented above the kernel and most high level services are implemented at the user level. From the measured performance of using the kernel mechanisms, we are able to demonstrate that efficient implementations are feasible for the object model on commercially available hardware. Clouds provides a rich environment for conducting research in distributed systems. Some of the topics addressed in this paper include distributed programming environments, consistency of persistent data and fault-tolerance.
Procedural Quantum Programming
NASA Astrophysics Data System (ADS)
Ömer, Bernhard
2002-09-01
While classical computing science has developed a variety of methods and programming languages around the concept of the universal computer, the typical description of quantum algorithms still uses a purely mathematical, non-constructive formalism which makes no difference between a hydrogen atom and a quantum computer. This paper investigates, how the concept of procedural programming languages, the most widely used classical formalism for describing and implementing algorithms, can be adopted to the field of quantum computing, and how non-classical features like the reversibility of unitary transformations, the non-observability of quantum states or the lack of copy and erase operations can be reflected semantically. It introduces the key concepts of procedural quantum programming (hybrid target architecture, operator hierarchy, quantum data types, memory management, etc.) and presents the experimental language QCL, which implements these principles.
Approximation algorithms for planning and control
NASA Technical Reports Server (NTRS)
Boddy, Mark; Dean, Thomas
1989-01-01
A control system operating in a complex environment will encounter a variety of different situations, with varying amounts of time available to respond to critical events. Ideally, such a control system will do the best possible with the time available. In other words, its responses should approximate those that would result from having unlimited time for computation, where the degree of the approximation depends on the amount of time it actually has. There exist approximation algorithms for a wide variety of problems. Unfortunately, the solution to any reasonably complex control problem will require solving several computationally intensive problems. Algorithms for successive approximation are a subclass of the class of anytime algorithms, algorithms that return answers for any amount of computation time, where the answers improve as more time is allotted. An architecture is described for allocating computation time to a set of anytime algorithms, based on expectations regarding the value of the answers they return. The architecture described is quite general, producing optimal schedules for a set of algorithms under widely varying conditions.
NASA Technical Reports Server (NTRS)
Farhat, Charbel; Lesoinne, Michel
1993-01-01
Most of the recently proposed computational methods for solving partial differential equations on multiprocessor architectures stem from the 'divide and conquer' paradigm and involve some form of domain decomposition. For those methods which also require grids of points or patches of elements, it is often necessary to explicitly partition the underlying mesh, especially when working with local memory parallel processors. In this paper, a family of cost-effective algorithms for the automatic partitioning of arbitrary two- and three-dimensional finite element and finite difference meshes is presented and discussed in view of a domain decomposed solution procedure and parallel processing. The influence of the algorithmic aspects of a solution method (implicit/explicit computations), and the architectural specifics of a multiprocessor (SIMD/MIMD, startup/transmission time), on the design of a mesh partitioning algorithm are discussed. The impact of the partitioning strategy on load balancing, operation count, operator conditioning, rate of convergence and processor mapping is also addressed. Finally, the proposed mesh decomposition algorithms are demonstrated with realistic examples of finite element, finite volume, and finite difference meshes associated with the parallel solution of solid and fluid mechanics problems on the iPSC/2 and iPSC/860 multiprocessors.
Test and Evaluation of Architecture-Aware Compiler Environment
2011-11-01
biology, medicine, social sciences , and security applications. Challenges include extremely large graphs (the Facebook friend network has over...Operations with Temporal Binning ....................................................................... 32 4.12 Memory behavior and Energy per...five challenge problems empirically, exploring their scaling properties, computation and datatype needs, memory behavior , and temporal behavior
User interface issues in supporting human-computer integrated scheduling
NASA Technical Reports Server (NTRS)
Cooper, Lynne P.; Biefeld, Eric W.
1991-01-01
The topics are presented in view graph form and include the following: characteristics of Operations Mission Planner (OMP) schedule domain; OMP architecture; definition of a schedule; user interface dimensions; functional distribution; types of users; interpreting user interaction; dynamic overlays; reactive scheduling; and transitioning the interface.
Sequoia: A fault-tolerant tightly coupled multiprocessor for transaction processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernstein, P.A.
1988-02-01
The Sequoia computer is a tightly coupled multiprocessor, and thus attains the performance advantages of this style of architecture. It avoids most of the fault-tolerance disadvantages of tight coupling by using a new fault-tolerance design. The Sequoia architecture is similar to other multimicroprocessor architectures, such as those of Encore and Sequent, in that it gives dozens of microprocessors shared access to a large main memory. It resembles the Stratus architecture in its extensive use of hardware fault-detection techniques. It resembles Stratus and Auragen in its ability to quickly recover all processes after a single point failure, transparently to the user.more » However, Sequoia is unique in its combination of a large-scale tightly coupled architecture with a hardware approach to fault tolerance. This article gives an overview of how the hardware architecture and operating systems (OS) work together to provide a high degree of fault tolerance with good system performance.« less
The thermodynamic efficiency of computations made in cells across the range of life
NASA Astrophysics Data System (ADS)
Kempes, Christopher P.; Wolpert, David; Cohen, Zachary; Pérez-Mercader, Juan
2017-11-01
Biological organisms must perform computation as they grow, reproduce and evolve. Moreover, ever since Landauer's bound was proposed, it has been known that all computation has some thermodynamic cost-and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However, this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the useful efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome. This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single- and multicellular eukaryotes. However, the rates of total computation per unit mass are non-monotonic in bacteria with increasing cell size, and also change across different biological architectures, including the shift from unicellular to multicellular eukaryotes. This article is part of the themed issue 'Reconceptualizing the origins of life'.
NASA Astrophysics Data System (ADS)
Prasad, Guru; Jayaram, Sanjay; Ward, Jami; Gupta, Pankaj
2004-08-01
In this paper, Aximetric proposes a decentralized Command and Control (C2) architecture for a distributed control of a cluster of on-board health monitoring and software enabled control systems called SimBOX that will use some of the real-time infrastructure (RTI) functionality from the current military real-time simulation architecture. The uniqueness of the approach is to provide a "plug and play environment" for various system components that run at various data rates (Hz) and the ability to replicate or transfer C2 operations to various subsystems in a scalable manner. This is possible by providing a communication bus called "Distributed Shared Data Bus" and a distributed computing environment used to scale the control needs by providing a self-contained computing, data logging and control function module that can be rapidly reconfigured to perform different functions. This kind of software-enabled control is very much needed to meet the needs of future aerospace command and control functions.
NASA Astrophysics Data System (ADS)
Prasad, Guru; Jayaram, Sanjay; Ward, Jami; Gupta, Pankaj
2004-09-01
In this paper, Aximetric proposes a decentralized Command and Control (C2) architecture for a distributed control of a cluster of on-board health monitoring and software enabled control systems called
NASA Astrophysics Data System (ADS)
Liyanagedera, Chamika M.; Sengupta, Abhronil; Jaiswal, Akhilesh; Roy, Kaushik
2017-12-01
Stochastic spiking neural networks based on nanoelectronic spin devices can be a possible pathway to achieving "brainlike" compact and energy-efficient cognitive intelligence. The computational model attempt to exploit the intrinsic device stochasticity of nanoelectronic synaptic or neural components to perform learning or inference. However, there has been limited analysis on the scaling effect of stochastic spin devices and its impact on the operation of such stochastic networks at the system level. This work attempts to explore the design space and analyze the performance of nanomagnet-based stochastic neuromorphic computing architectures for magnets with different barrier heights. We illustrate how the underlying network architecture must be modified to account for the random telegraphic switching behavior displayed by magnets with low barrier heights as they are scaled into the superparamagnetic regime. We perform a device-to-system-level analysis on a deep neural-network architecture for a digit-recognition problem on the MNIST data set.
Developing a Distributed Computing Architecture at Arizona State University.
ERIC Educational Resources Information Center
Armann, Neil; And Others
1994-01-01
Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…
Multicore Architectures for Multiple Independent Levels of Security Applications
2012-09-01
to bolster the MILS effort. However, current MILS operating systems are not designed for multi-core platforms. They do not have the hardware support...current MILS operating systems are not designed for multi‐core platforms. They do not have the hardware support to ensure that the separation...the availability of information at different security classification levels while increasing the overall security of the computing system . Due to the
2008 Defense Industrial Base Critical Infrastructure Protection Conference (DIB-CBIP)
2008-04-09
a cloak -and- dagger thing. It’s about computer architecture and the soundness of electronic systems." Joel Brenner, ODNI Counterintelligence Office...to support advanced network exploitation and launch attacks on the informational and physical elements of our cyber infrastructure. In order to...entities and is vulnerable to attacks and manipulation. Operations in the cyber domain have the ability to impact operations in other war-fighting
Processing Cones: A Computational Structure for Image Analysis.
1981-12-01
image analysis applications, referred to as a processing cone, is described and sample algorithms are presented. A fundamental characteristic of the structure is its hierarchical organization into two-dimensional arrays of decreasing resolution. In this architecture, a protypical function is defined on a local window of data and applied uniformly to all windows in a parallel manner. Three basic modes of processing are supported in the cone: reduction operations (upward processing), horizontal operations (processing at a single level) and projection operations (downward
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
Force-reflective teleoperated system with shared and compliant control capabilities
NASA Technical Reports Server (NTRS)
Szakaly, Z.; Kim, W. S.; Bejczy, A. K.
1989-01-01
The force-reflecting teleoperator breadboard is described. It is the first system among available Research and Development systems with the following combined capabilities: (1) The master input device is not a replica of the slave arm. It is a general purpose device which can be applied to the control of different robot arms through proper mathematical transformations. (2) Force reflection generated in the master hand controller is referenced to forces and moments measured by a six DOF force-moment sensor at the base of the robot hand. (3) The system permits a smooth spectrum of operations between full manual, shared manual and automatic, and full automatic (called traded) control. (4) The system can be operated with variable compliance or stiffness in force-reflecting control. Some of the key points of the system are the data handling and computing architecture, the communication method, and the handling of mathematical transformations. The architecture is a fully synchronized pipeline. The communication method achieves optimal use of a parallel communication channel between the local and remote computing nodes. A time delay box is also implemented in this communication channel permitting experiments with up to 8 sec time delay. The mathematical transformations are computed faster than 1 msec so that control at each node can be operated at 1 kHz servo rate without interpolation. This results in an overall force-reflecting loop rate of 200 Hz.
NASA Astrophysics Data System (ADS)
Rucinski, Marek; Coates, Adam; Montano, Giuseppe; Allouis, Elie; Jameux, David
2015-09-01
The Lightweight Advanced Robotic Arm Demonstrator (LARAD) is a state-of-the-art, two-meter long robotic arm for planetary surface exploration currently being developed by a UK consortium led by Airbus Defence and Space Ltd under contract to the UK Space Agency (CREST-2 programme). LARAD has a modular design, which allows for experimentation with different electronics and control software. The control system architecture includes the on-board computer, control software and firmware, and the communication infrastructure (e.g. data links, switches) connecting on-board computer(s), sensors, actuators and the end-effector. The purpose of the control system is to operate the arm according to pre-defined performance requirements, monitoring its behaviour in real-time and performing safing/recovery actions in case of faults. This paper reports on the results of a recent study about the feasibility of the development and integration of a novel control system architecture for LARAD fully based on the SpaceWire protocol. The current control system architecture is based on the combination of two communication protocols, Ethernet and CAN. The new SpaceWire-based control system will allow for improved monitoring and telecommanding performance thanks to higher communication data rate, allowing for the adoption of advanced control schemes, potentially based on multiple vision sensors, and for the handling of sophisticated end-effectors that require fine control, such as science payloads or robotic hands.
National Test Bed Security and Communications Architecture Working Group Report
1992-04-01
computer systems via a physical medium. Most of those physical media are tappable or interceptable. This means that all the data that flows across the...provides the capability for NTBN nodes to support users operating in differing COIs to share the computing resources and communication media and for...representation. Again generally speaking, the NTBN must act as the high-speed, wide-bandwidth communications media that would provide the "near real-time
Frances: A Tool for Understanding Computer Architecture and Assembly Language
ERIC Educational Resources Information Center
Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh
2012-01-01
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline that is representative of the time it takes the software application to run on a computing device having a known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provides an actual time that is representative of the time the known software application runs on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
Tutorial: Computer architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gajski, D.D.; Milutinovic, V.M.; Siegel, H.J.
1986-01-01
This book presents the state-of-the-art in advanced computer architecture. It deals with the concepts underlying current architectures and covers approaches and techniques being used in the design of advanced computer systems.
Grossman, Robert L.; Heath, Allison; Murphy, Mark; Patterson, Maria; Wells, Walt
2017-01-01
Data commons collocate data, storage, and computing infrastructure with core services and commonly used tools and applications for managing, analyzing, and sharing data to create an interoperable resource for the research community. An architecture for data commons is described, as well as some lessons learned from operating several large-scale data commons. PMID:29033693
MATREX: A Unifying Modeling and Simulation Architecture for Live-Virtual-Constructive Applications
2007-05-23
Deployment Systems Acquisition Operations & Support B C Sustainment FRP Decision Review FOC LRIP/IOT& ECritical Design Review Pre-Systems...CMS2 – Comprehensive Munitions & Sensor Server • CSAT – C4ISR Static Analysis Tool • C4ISR – Command & Control, Communications, Computers
A New Architecture for Improved Human Behavior in Military Simulations
2008-04-01
forces were using motorcycle couriers to avoid U.S. intelligence capabilities in the early days of Operation Iraqi Freedom (OIF), there were certainly...Games ( MMORPG ) engage millions of game players in near-real-time computing environments. Games such as World of Warcraft® attract players to
An architecture for real-time vision processing
NASA Technical Reports Server (NTRS)
Chien, Chiun-Hong
1994-01-01
To study the feasibility of developing an architecture for real time vision processing, a task queue server and parallel algorithms for two vision operations were designed and implemented on an i860-based Mercury Computing System 860VS array processor. The proposed architecture treats each vision function as a task or set of tasks which may be recursively divided into subtasks and processed by multiple processors coordinated by a task queue server accessible by all processors. Each idle processor subsequently fetches a task and associated data from the task queue server for processing and posts the result to shared memory for later use. Load balancing can be carried out within the processing system without the requirement for a centralized controller. The author concludes that real time vision processing cannot be achieved without both sequential and parallel vision algorithms and a good parallel vision architecture.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.; Som, Sukhamony
1990-01-01
The performance modeling and enhancement for periodic execution of large-grain, decision-free algorithms in data flow architectures is examined. Applications include real-time implementation of control and signal processing algorithms where performance is required to be highly predictable. The mapping of algorithms onto the specified class of data flow architectures is realized by a marked graph model called ATAMM (Algorithm To Architecture Mapping Model). Performance measures and bounds are established. Algorithm transformation techniques are identified for performance enhancement and reduction of resource (computing element) requirements. A systematic design procedure is described for generating operating conditions for predictable performance both with and without resource constraints. An ATAMM simulator is used to test and validate the performance prediction by the design procedure. Experiments on a three resource testbed provide verification of the ATAMM model and the design procedure.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Som, Sukhamoy; Stoughton, John W.; Mielke, Roland R.
1990-01-01
Performance modeling and performance enhancement for periodic execution of large-grain, decision-free algorithms in data flow architectures are discussed. Applications include real-time implementation of control and signal processing algorithms where performance is required to be highly predictable. The mapping of algorithms onto the specified class of data flow architectures is realized by a marked graph model called algorithm to architecture mapping model (ATAMM). Performance measures and bounds are established. Algorithm transformation techniques are identified for performance enhancement and reduction of resource (computing element) requirements. A systematic design procedure is described for generating operating conditions for predictable performance both with and without resource constraints. An ATAMM simulator is used to test and validate the performance prediction by the design procedure. Experiments on a three resource testbed provide verification of the ATAMM model and the design procedure.
Outline of a novel architecture for cortical computation.
Majumdar, Kaushik
2008-03-01
In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.
Advances in Orion's On-Orbit Guidance and Targeting System Architecture
NASA Technical Reports Server (NTRS)
Scarritt, Sara K.; Fill, Thomas; Robinson, Shane
2015-01-01
NASA's manned spaceflight programs have a rich history of advancing onboard guidance and targeting technology. In order to support future missions, the guidance and targeting architecture for the Orion Multi-Purpose Crew Vehicle must be able to operate in complete autonomy, without any support from the ground. Orion's guidance and targeting system must be sufficiently flexible to easily adapt to a wide array of undecided future missions, yet also not cause an undue computational burden on the flight computer. This presents a unique design challenge from the perspective of both algorithm development and system architecture construction. The present work shows how Orion's guidance and targeting system addresses these challenges. On the algorithm side, the system advances the state-of-the-art by: (1) steering burns with a simple closed-loop guidance strategy based on Shuttle heritage, and (2) planning maneuvers with a cutting-edge two-level targeting routine. These algorithms are then placed into an architecture designed to leverage the advantages of each and ensure that they function in concert with one another. The resulting system is characterized by modularity and simplicity. As such, it is adaptable to the on-orbit phases of any future mission that Orion may attempt.
The CEOS Global Observation Strategy for Disaster Risk Management: An Enterprise Architect's View
NASA Astrophysics Data System (ADS)
Moe, K.; Evans, J. D.; Frye, S.
2013-12-01
The Committee on Earth Observation Satellites (CEOS) Working Group on Information Systems and Services (WGISS), on behalf of the Global Earth Observation System of Systems (GEOSS), is defining an enterprise architecture (known as GA.4.D) for the use of satellite observations in international disaster management. This architecture defines the scope and structure of the disaster management enterprise (based on disaster types and phases); its processes (expressed via use cases / system functions); and its core values (in particular, free and open data sharing via standard interfaces). The architecture also details how a disaster management enterprise describes, obtains, and handles earth observations and data products for decision-support; and how it draws on distributed computational services for streamlined operational capability. We have begun to apply this architecture to a new CEOS initiative, the Global Observation Strategy for Disaster Risk Management (DRM). CEOS is defining this Strategy based on the outcomes of three pilot projects focused on seismic hazards, volcanoes, and floods. These pilots offer a unique opportunity to characterize and assess the impacts (benefits / costs) of the GA.4.D architecture in practice. In particular, the DRM Floods Pilot is applying satellite-based optical and radar data to flood mitigation, warning, and response, including monitoring and modeling at regional to global scales. It is focused on serving user needs and building local institutional / technical capacity in the Caribbean, Southern Africa, and Southeast Asia. In the context of these CEOS DRM Pilots, we are characterizing where and how the GA.4D architecture helps participants to: - Understand the scope and nature of hazard events quickly and accurately - Assure timely delivery of observations into analysis, modeling, and decision-making - Streamline user access to products - Lower barriers to entry for users or suppliers - Streamline or focus field operations in disaster reduction - Reduce redundancies and gaps in inter-organizational systems - Assist in planning / managing / prioritizing information and computing resources - Adapt computational resources to new technologies or evolving user needs - Sustain capability for the long term Insights from this exercise are helping us to abstract best practices applicable to other contexts, disaster types, and disaster phases, whereby local communities can improve their use of satellite data for greater preparedness. This effort is also helping to assess the likely impacts and roles of emerging technologies (such as cloud computing, "Big Data" analysis, location-based services, crowdsourcing, semantic services, small satellites, drones, direct broadcast, or model webs) in future disaster management activities.
1977-10-01
APPROVED DATE FUNCTION APPROVED jDATE WRITER J . K-olanek 2/6/76 REVISIONS CHK DESCRIPTION REV CHK DESCRIPTION IREV REVISION jJ ~ ~ ~~~ _ II SHEET NO...DOCUMENT (CDBDD) 45 5.5 COMPUTER PROGRAM PACKAGE (CPP)- j 45 5.6 COMPUTER PROGRAM OPERATOR’S MANUAL (CPOM) 45 5.7 COMPUTER PROGRAM TEST PLAN (CPTPL) 45...I LIST OF FIGURES Number Page 1 JEWS Simplified Block Diagram 4 2 System Controller Architecture 5 SIZE CODE IDENT NO DRAWING NO. A 49956 SCALE REV J
Application of technology developed for flight simulation at NASA. Langley Research Center
NASA Technical Reports Server (NTRS)
Cleveland, Jeff I., II
1991-01-01
In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations including mathematical model computation and data input/output to the simulators must be deterministic and be completed in as short a time as possible. Personnel at NASA's Langley Research Center are currently developing the use of supercomputers for simulation mathematical model computation for real-time simulation. This, coupled with the use of an open systems software architecture, will advance the state-of-the-art in real-time flight simulation.
Software/hardware distributed processing network supporting the Ada environment
NASA Astrophysics Data System (ADS)
Wood, Richard J.; Pryk, Zen
1993-09-01
A high-performance, fault-tolerant, distributed network has been developed, tested, and demonstrated. The network is based on the MIPS Computer Systems, Inc. R3000 Risc for processing, VHSIC ASICs for high speed, reliable, inter-node communications and compatible commercial memory and I/O boards. The network is an evolution of the Advanced Onboard Signal Processor (AOSP) architecture. It supports Ada application software with an Ada- implemented operating system. A six-node implementation (capable of expansion up to 256 nodes) of the RISC multiprocessor architecture provides 120 MIPS of scalar throughput, 96 Mbytes of RAM and 24 Mbytes of non-volatile memory. The network provides for all ground processing applications, has merit for space-qualified RISC-based network, and interfaces to advanced Computer Aided Software Engineering (CASE) tools for application software development.
Wong, Stephen T C; Tjandra, Donny; Wang, Huili; Shen, Weimin
2003-09-01
Few information systems today offer a flexible means to define and manage the automated part of radiology processes, which provide clinical imaging services for the entire healthcare organization. Even fewer of them provide a coherent architecture that can easily cope with heterogeneity and inevitable local adaptation of applications and can integrate clinical and administrative information to aid better clinical, operational, and business decisions. We describe an innovative enterprise architecture of image information management systems to fill the needs. Such a system is based on the interplay of production workflow management, distributed object computing, Java and Web techniques, and in-depth domain knowledge in radiology operations. Our design adapts the approach of "4+1" architectural view. In this new architecture, PACS and RIS become one while the user interaction can be automated by customized workflow process. Clinical service applications are implemented as active components. They can be reasonably substituted by applications of local adaptations and can be multiplied for fault tolerance and load balancing. Furthermore, the workflow-enabled digital radiology system would provide powerful query and statistical functions for managing resources and improving productivity. This paper will potentially lead to a new direction of image information management. We illustrate the innovative design with examples taken from an implemented system.
The implementation of fail-operative functions in integrated digital avionics systems
NASA Technical Reports Server (NTRS)
Osoer, S. S.
1976-01-01
System architectures which incorporate fail operative flight guidance functions within a total integrated avionics complex are described. It is shown that the mixture of flight critical and nonflight critical functions within a common computer complex is an efficient solution to the integration of navigation, guidance, flight control, display, and flight management. Interfacing subsystems retain autonomous capability to avoid vulnerability to total avionics system shutdown as a result of only a few failures.
Proposal for a transmon-based quantum router.
Sala, Arnau; Blaauboer, M
2016-07-13
We propose an implementation of a quantum router for microwave photons in a superconducting qubit architecture consisting of a transmon qubit, SQUIDs and a nonlinear capacitor. We model and analyze the dynamics of operation of the quantum switch using quantum Langevin equations in a scattering approach and compute the photon reflection and transmission probabilities. For parameters corresponding to up-to-date experimental devices we predict successful operation of the router with probabilities above 94%.
A Programmable Five Qubit Quantum Computer Using Trapped Atomic Ions
NASA Astrophysics Data System (ADS)
Debnath, Shantanu
Quantum computers can solve certain problems more efficiently compared to conventional classical methods. In the endeavor to build a quantum computer, several competing platforms have emerged that can implement certain quantum algorithms using a few qubits. However, the demonstrations so far have been done usually by tailoring the hardware to meet the requirements of a particular algorithm implemented for a limited number of instances. Although such proof of principal implementations are important to verify the working of algorithms on a physical system, they further need to have the potential to serve as a general purpose quantum computer allowing the flexibility required for running multiple algorithms and be scaled up to host more qubits. Here we demonstrate a small programmable quantum computer based on five trapped atomic ions each of which serves as a qubit. By optically resolving each ion we can individually address them in order to perform a complete set of single-qubit and fully connected two-qubit quantum gates and alsoperform efficient individual qubit measurements. We implement a computation architecture that accepts an algorithm from a user interface in the form of a standard logic gate sequence and decomposes it into fundamental quantum operations that are native to the hardware using a set of compilation instructions that are defined within the software. These operations are then effected through a pattern of laser pulses that perform coherent rotations on targeted qubits in the chain. The architecture implemented in the experiment therefore gives us unprecedented flexibility in the programming of any quantum algorithm while staying blind to the underlying hardware. As a demonstration we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms on the five-qubit processor and achieve average success rates of 95 and 90 percent, respectively. We also implement a five-qubit coherent quantum Fourier transform and examine its performance in the period finding and phase estimation protocol. We find fidelities of 84 and 62 percent, respectively. While maintaining the same computation architecture the system can be scaled to more ions using resources that scale favorably (O(N. 2)) with the numberof qubits N.
A Grid Infrastructure for Supporting Space-based Science Operations
NASA Technical Reports Server (NTRS)
Bradford, Robert N.; Redman, Sandra H.; McNair, Ann R. (Technical Monitor)
2002-01-01
Emerging technologies for computational grid infrastructures have the potential for revolutionizing the way computers are used in all aspects of our lives. Computational grids are currently being implemented to provide a large-scale, dynamic, and secure research and engineering environments based on standards and next-generation reusable software, enabling greater science and engineering productivity through shared resources and distributed computing for less cost than traditional architectures. Combined with the emerging technologies of high-performance networks, grids provide researchers, scientists and engineers the first real opportunity for an effective distributed collaborative environment with access to resources such as computational and storage systems, instruments, and software tools and services for the most computationally challenging applications.
NASA Astrophysics Data System (ADS)
Saponara, M.; Tramutola, A.; Creten, P.; Hardy, J.; Philippe, C.
2013-08-01
Optimization-based control techniques such as Model Predictive Control (MPC) are considered extremely attractive for space rendezvous, proximity operations and capture applications that require high level of autonomy, optimal path planning and dynamic safety margins. Such control techniques require high-performance computational needs for solving large optimization problems. The development and implementation in a flight representative avionic architecture of a MPC based Guidance, Navigation and Control system has been investigated in the ESA R&T study “On-line Reconfiguration Control System and Avionics Architecture” (ORCSAT) of the Aurora programme. The paper presents the baseline HW and SW avionic architectures, and verification test results obtained with a customised RASTA spacecraft avionics development platform from Aeroflex Gaisler.
Logical NAND and NOR Operations Using Algorithmic Self-assembly of DNA Molecules
NASA Astrophysics Data System (ADS)
Wang, Yanfeng; Cui, Guangzhao; Zhang, Xuncai; Zheng, Yan
DNA self-assembly is the most advanced and versatile system that has been experimentally demonstrated for programmable construction of patterned systems on the molecular scale. It has been demonstrated that the simple binary arithmetic and logical operations can be computed by the process of self assembly of DNA tiles. Here we report a one-dimensional algorithmic self-assembly of DNA triple-crossover molecules that can be used to execute five steps of a logical NAND and NOR operations on a string of binary bits. To achieve this, abstract tiles were translated into DNA tiles based on triple-crossover motifs. Serving as input for the computation, long single stranded DNA molecules were used to nucleate growth of tiles into algorithmic crystals. Our method shows that engineered DNA self-assembly can be treated as a bottom-up design techniques, and can be capable of designing DNA computer organization and architecture.
NASA Astrophysics Data System (ADS)
Papers are presented on ISDN, mobile radio systems and techniques for digital connectivity, centralized and distributed algorithms in computer networks, communications networks, quality assurance and impact on cost, adaptive filters in communications, the spread spectrum, signal processing, video communication techniques, and digital satellite services. Topics discussed include performance evaluation issues for integrated protocols, packet network operations, the computer network theory and multiple-access, microwave single sideband systems, switching architectures, fiber optic systems, wireless local communications, modulation, coding, and synchronization, remote switching, software quality, transmission, and expert systems in network operations. Consideration is given to wide area networks, image and speech processing, office communications application protocols, multimedia systems, customer-controlled network operations, digital radio systems, channel modeling and signal processing in digital communications, earth station/on-board modems, computer communications system performance evaluation, source encoding, compression, and quantization, and adaptive communications systems.
SpaceCubeX: A Framework for Evaluating Hybrid Multi-Core CPU FPGA DSP Architectures
NASA Technical Reports Server (NTRS)
Schmidt, Andrew G.; Weisz, Gabriel; French, Matthew; Flatley, Thomas; Villalpando, Carlos Y.
2017-01-01
The SpaceCubeX project is motivated by the need for high performance, modular, and scalable on-board processing to help scientists answer critical 21st century questions about global climate change, air quality, ocean health, and ecosystem dynamics, while adding new capabilities such as low-latency data products for extreme event warnings. These goals translate into on-board processing throughput requirements that are on the order of 100-1,000 more than those of previous Earth Science missions for standard processing, compression, storage, and downlink operations. To study possible future architectures to achieve these performance requirements, the SpaceCubeX project provides an evolvable testbed and framework that enables a focused design space exploration of candidate hybrid CPU/FPGA/DSP processing architectures. The framework includes ArchGen, an architecture generator tool populated with candidate architecture components, performance models, and IP cores, that allows an end user to specify the type, number, and connectivity of a hybrid architecture. The framework requires minimal extensions to integrate new processors, such as the anticipated High Performance Spaceflight Computer (HPSC), reducing time to initiate benchmarking by months. To evaluate the framework, we leverage a wide suite of high performance embedded computing benchmarks and Earth science scenarios to ensure robust architecture characterization. We report on our projects Year 1 efforts and demonstrate the capabilities across four simulation testbed models, a baseline SpaceCube 2.0 system, a dual ARM A9 processor system, a hybrid quad ARM A53 and FPGA system, and a hybrid quad ARM A53 and DSP system.
Multi-processor including data flow accelerator module
Davidson, George S.; Pierce, Paul E.
1990-01-01
An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.
Understanding Evolutionary Potential in Virtual CPU Instruction Set Architectures
Bryson, David M.; Ofria, Charles
2013-01-01
We investigate fundamental decisions in the design of instruction set architectures for linear genetic programs that are used as both model systems in evolutionary biology and underlying solution representations in evolutionary computation. We subjected digital organisms with each tested architecture to seven different computational environments designed to present a range of evolutionary challenges. Our goal was to engineer a general purpose architecture that would be effective under a broad range of evolutionary conditions. We evaluated six different types of architectural features for the virtual CPUs: (1) genetic flexibility: we allowed digital organisms to more precisely modify the function of genetic instructions, (2) memory: we provided an increased number of registers in the virtual CPUs, (3) decoupled sensors and actuators: we separated input and output operations to enable greater control over data flow. We also tested a variety of methods to regulate expression: (4) explicit labels that allow programs to dynamically refer to specific genome positions, (5) position-relative search instructions, and (6) multiple new flow control instructions, including conditionals and jumps. Each of these features also adds complication to the instruction set and risks slowing evolution due to epistatic interactions. Two features (multiple argument specification and separated I/O) demonstrated substantial improvements in the majority of test environments, along with versions of each of the remaining architecture modifications that show significant improvements in multiple environments. However, some tested modifications were detrimental, though most exhibit no systematic effects on evolutionary potential, highlighting the robustness of digital evolution. Combined, these observations enhance our understanding of how instruction architecture impacts evolutionary potential, enabling the creation of architectures that support more rapid evolution of complex solutions to a broad range of challenges. PMID:24376669
NASA Astrophysics Data System (ADS)
Broten, Gregory S.; Monckton, Simon P.; Collier, Jack; Giesbrecht, Jared
2006-05-01
In 2002 Defence R&D Canada changed research direction from pure tele-operated land vehicles to general autonomy for land, air, and sea craft. The unique constraints of the military environment coupled with the complexity of autonomous systems drove DRDC to carefully plan a research and development infrastructure that would provide state of the art tools without restricting research scope. DRDC's long term objectives for its autonomy program address disparate unmanned ground vehicle (UGV), unattended ground sensor (UGS), air (UAV), and subsea and surface (UUV and USV) vehicles operating together with minimal human oversight. Individually, these systems will range in complexity from simple reconnaissance mini-UAVs streaming video to sophisticated autonomous combat UGVs exploiting embedded and remote sensing. Together, these systems can provide low risk, long endurance, battlefield services assuming they can communicate and cooperate with manned and unmanned systems. A key enabling technology for this new research is a software architecture capable of meeting both DRDC's current and future requirements. DRDC built upon recent advances in the computing science field while developing its software architecture know as the Architecture for Autonomy (AFA). Although a well established practice in computing science, frameworks have only recently entered common use by unmanned vehicles. For industry and government, the complexity, cost, and time to re-implement stable systems often exceeds the perceived benefits of adopting a modern software infrastructure. Thus, most persevere with legacy software, adapting and modifying software when and wherever possible or necessary -- adopting strategic software frameworks only when no justifiable legacy exists. Conversely, academic programs with short one or two year projects frequently exploit strategic software frameworks but with little enduring impact. The open-source movement radically changes this picture. Academic frameworks, open to public scrutiny and modification, now rival commercial frameworks in both quality and economic impact. Further, industry now realizes that open source frameworks can reduce cost and risk of systems engineering. This paper describes the Architecture for Autonomy implemented by DRDC and how this architecture meets DRDC's current needs. It also presents an argument for why this architecture should also satisfy DRDC's future requirements as well.
Supporting Undergraduate Computer Architecture Students Using a Visual MIPS64 CPU Simulator
ERIC Educational Resources Information Center
Patti, D.; Spadaccini, A.; Palesi, M.; Fazzino, F.; Catania, V.
2012-01-01
The topics of computer architecture are always taught using an Assembly dialect as an example. The most commonly used textbooks in this field use the MIPS64 Instruction Set Architecture (ISA) to help students in learning the fundamentals of computer architecture because of its orthogonality and its suitability for real-world applications. This…
Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques
2013-03-01
MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES AND CIRCUIT TECHNIQUES POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY...TECHNICAL REPORT 3. DATES COVERED (From - To) OCT 2010 – OCT 2012 4. TITLE AND SUBTITLE MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES...schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated
NASA Technical Reports Server (NTRS)
Kayatin, Matthew J.; Perry, Jay L.
2017-01-01
Traditional gas-phase trace contaminant control adsorption process flow is constrained as required to maintain high contaminant single-pass adsorption efficiency. Specifically, the bed superficial velocity is controlled to limit the adsorption mass-transfer zone length relative to the physical adsorption bed; this is aided by traditional high-aspect ratio bed design. Through operation in this manner, most contaminants, including those with relatively high potential energy are readily adsorbed. A consequence of this operational approach, however, is a limited available operational flow margin. By considering a paradigm shift in adsorption architecture design and operations, in which flows of high superficial velocity are treated by low-aspect ratio sorbent beds, the range of well-adsorbed contaminants becomes limited, but the process flow is increased such that contaminant leaks or emerging contaminants of interest may be effectively controlled. To this end, the high velocity, low aspect ratio (HVLA) adsorption process architecture was demonstrated against a trace contaminant load representative of the International Space Station atmosphere. Two HVLA concept packaging designs (linear flow and radial flow) were tested. The performance of each design was evaluated and compared against computer simulation. Utilizing the HVLA process, long and sustained control of heavy organic contaminants was demonstrated.
A resilient and secure software platform and architecture for distributed spacecraft
NASA Astrophysics Data System (ADS)
Otte, William R.; Dubey, Abhishek; Karsai, Gabor
2014-06-01
A distributed spacecraft is a cluster of independent satellite modules flying in formation that communicate via ad-hoc wireless networks. This system in space is a cloud platform that facilitates sharing sensors and other computing and communication resources across multiple applications, potentially developed and maintained by different organizations. Effectively, such architecture can realize the functions of monolithic satellites at a reduced cost and with improved adaptivity and robustness. Openness of these architectures pose special challenges because the distributed software platform has to support applications from different security domains and organizations, and where information flows have to be carefully managed and compartmentalized. If the platform is used as a robust shared resource its management, configuration, and resilience becomes a challenge in itself. We have designed and prototyped a distributed software platform for such architectures. The core element of the platform is a new operating system whose services were designed to restrict access to the network and the file system, and to enforce resource management constraints for all non-privileged processes Mixed-criticality applications operating at different security labels are deployed and controlled by a privileged management process that is also pre-configuring all information flows. This paper describes the design and objective of this layer.
Peer-to-peer architectures for exascale computing : LDRD final report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vorobeychik, Yevgeniy; Mayo, Jackson R.; Minnich, Ronald G.
2010-09-01
The goal of this research was to investigate the potential for employing dynamic, decentralized software architectures to achieve reliability in future high-performance computing platforms. These architectures, inspired by peer-to-peer networks such as botnets that already scale to millions of unreliable nodes, hold promise for enabling scientific applications to run usefully on next-generation exascale platforms ({approx} 10{sup 18} operations per second). Traditional parallel programming techniques suffer rapid deterioration of performance scaling with growing platform size, as the work of coping with increasingly frequent failures dominates over useful computation. Our studies suggest that new architectures, in which failures are treated as ubiquitousmore » and their effects are considered as simply another controllable source of error in a scientific computation, can remove such obstacles to exascale computing for certain applications. We have developed a simulation framework, as well as a preliminary implementation in a large-scale emulation environment, for exploration of these 'fault-oblivious computing' approaches. High-performance computing (HPC) faces a fundamental problem of increasing total component failure rates due to increasing system sizes, which threaten to degrade system reliability to an unusable level by the time the exascale range is reached ({approx} 10{sup 18} operations per second, requiring of order millions of processors). As computer scientists seek a way to scale system software for next-generation exascale machines, it is worth considering peer-to-peer (P2P) architectures that are already capable of supporting 10{sup 6}-10{sup 7} unreliable nodes. Exascale platforms will require a different way of looking at systems and software because the machine will likely not be available in its entirety for a meaningful execution time. Realistic estimates of failure rates range from a few times per day to more than once per hour for these platforms. P2P architectures give us a starting point for crafting applications and system software for exascale. In the context of the Internet, P2P applications (e.g., file sharing, botnets) have already solved this problem for 10{sup 6}-10{sup 7} nodes. Usually based on a fractal distributed hash table structure, these systems have proven robust in practice to constant and unpredictable outages, failures, and even subversion. For example, a recent estimate of botnet turnover (i.e., the number of machines leaving and joining) is about 11% per week. Nonetheless, P2P networks remain effective despite these failures: The Conficker botnet has grown to {approx} 5 x 10{sup 6} peers. Unlike today's system software and applications, those for next-generation exascale machines cannot assume a static structure and, to be scalable over millions of nodes, must be decentralized. P2P architectures achieve both, and provide a promising model for 'fault-oblivious computing'. This project aimed to study the dynamics of P2P networks in the context of a design for exascale systems and applications. Having no single point of failure, the most successful P2P architectures are adaptive and self-organizing. While there has been some previous work applying P2P to message passing, little attention has been previously paid to the tightly coupled exascale domain. Typically, the per-node footprint of P2P systems is small, making them ideal for HPC use. The implementation on each peer node cooperates en masse to 'heal' disruptions rather than relying on a controlling 'master' node. Understanding this cooperative behavior from a complex systems viewpoint is essential to predicting useful environments for the inextricably unreliable exascale platforms of the future. We sought to obtain theoretical insight into the stability and large-scale behavior of candidate architectures, and to work toward leveraging Sandia's Emulytics platform to test promising candidates in a realistic (ultimately {ge} 10{sup 7} nodes) setting. Our primary example applications are drawn from linear algebra: a Jacobi relaxation solver for the heat equation, and the closely related technique of value iteration in optimization. We aimed to apply P2P concepts in designing implementations capable of surviving an unreliable machine of 10{sup 6} nodes.« less
OʼHara, Susan
2014-01-01
Nurses have increasingly been regarded as critical members of the planning team as architects recognize their knowledge and value. But the nurses' role as knowledge experts can be expanded to leading efforts to integrate the clinical, operational, and architectural expertise through simulation modeling. Simulation modeling allows for the optimal merge of multifactorial data to understand the current state of the intensive care unit and predict future states. Nurses can champion the simulation modeling process and reap the benefits of a cost-effective way to test new designs, processes, staffing models, and future programming trends prior to implementation. Simulation modeling is an evidence-based planning approach, a standard, for integrating the sciences with real client data, to offer solutions for improving patient care.
CADC and CANFAR: Extending the role of the data centre
NASA Astrophysics Data System (ADS)
Gaudet, Severin
2015-12-01
Over the past six years, the CADC has moved beyond the astronomy archive data centre to a multi-service system for the community. This evolution is based on two major initiatives. The first is the adoption of International Virtual Observatory Alliance (IVOA) standards in both the system and data architecture of the CADC, including a common characterization data model. The second is the Canadian Advanced Network for Astronomical Research (CANFAR), a digital infrastructure combining the Canadian national research network (CANARIE), cloud processing and storage resources (Compute Canada) and a data centre (Canadian Astronomy Data Centre) into a unified ecosystem for storage and processing for the astronomy community. This talk will describe the architecture and integration of IVOA and CANFAR services into CADC operations, the operational experiences, the lessons learned and future directions
Towards Energy-Performance Trade-off Analysis of Parallel Applications
ERIC Educational Resources Information Center
Korthikanti, Vijay Anand Reddy
2011-01-01
Energy consumption by computer systems has emerged as an important concern, both at the level of individual devices (limited battery capacity in mobile systems) and at the societal level (the production of Green House Gases). In parallel architectures, applications may be executed on a variable number of cores and these cores may operate at…
AADL and Model-based Engineering
2014-10-20
and MBE Feiler, Oct 20, 2014 © 2014 Carnegie Mellon University We Rely on Software for Safe Aircraft Operation Embedded software systems ...D eveloper Compute Platform Runtime Architecture Application Software Embedded SW System Engineer Data Stream Characteristics Latency...confusion Hardware Engineer Why do system level failures still occur despite fault tolerance techniques being deployed in systems ? Embedded software
NASA Astrophysics Data System (ADS)
Litinski, Daniel; Kesselring, Markus S.; Eisert, Jens; von Oppen, Felix
2017-07-01
We present a scalable architecture for fault-tolerant topological quantum computation using networks of voltage-controlled Majorana Cooper pair boxes and topological color codes for error correction. Color codes have a set of transversal gates which coincides with the set of topologically protected gates in Majorana-based systems, namely, the Clifford gates. In this way, we establish color codes as providing a natural setting in which advantages offered by topological hardware can be combined with those arising from topological error-correcting software for full-fledged fault-tolerant quantum computing. We provide a complete description of our architecture, including the underlying physical ingredients. We start by showing that in topological superconductor networks, hexagonal cells can be employed to serve as physical qubits for universal quantum computation, and we present protocols for realizing topologically protected Clifford gates. These hexagonal-cell qubits allow for a direct implementation of open-boundary color codes with ancilla-free syndrome read-out and logical T gates via magic-state distillation. For concreteness, we describe how the necessary operations can be implemented using networks of Majorana Cooper pair boxes, and we give a feasibility estimate for error correction in this architecture. Our approach is motivated by nanowire-based networks of topological superconductors, but it could also be realized in alternative settings such as quantum-Hall-superconductor hybrids.
Visualization of decision processes using a cognitive architecture
NASA Astrophysics Data System (ADS)
Livingston, Mark A.; Murugesan, Arthi; Brock, Derek; Frost, Wende K.; Perzanowski, Dennis
2013-01-01
Cognitive architectures are computational theories of reasoning the human mind engages in as it processes facts and experiences. A cognitive architecture uses declarative and procedural knowledge to represent mental constructs that are involved in decision making. Employing a model of behavioral and perceptual constraints derived from a set of one or more scenarios, the architecture reasons about the most likely consequence(s) of a sequence of events. Reasoning of any complexity and depth involving computational processes, however, is often opaque and challenging to comprehend. Arguably, for decision makers who may need to evaluate or question the results of autonomous reasoning, it would be useful to be able to inspect the steps involved in an interactive, graphical format. When a chain of evidence and constraint-based decision points can be visualized, it becomes easier to explore both how and why a scenario of interest will likely unfold in a particular way. In initial work on a scheme for visualizing cognitively-based decision processes, we focus on generating graphical representations of models run in the Polyscheme cognitive architecture. Our visualization algorithm operates on a modified version of Polyscheme's output, which is accomplished by augmenting models with a simple set of tags. We provide example visualizations and discuss properties of our technique that pose challenges for our representation goals. We conclude with a summary of feedback solicited from domain experts and practitioners in the field of cognitive modeling.
Brain architecture: a design for natural computation.
Kaiser, Marcus
2007-12-15
Fifty years ago, John von Neumann compared the architecture of the brain with that of the computers he invented and which are still in use today. In those days, the organization of computers was based on concepts of brain organization. Here, we give an update on current results on the global organization of neural systems. For neural systems, we outline how the spatial and topological architecture of neuronal and cortical networks facilitates robustness against failures, fast processing and balanced network activation. Finally, we discuss mechanisms of self-organization for such architectures. After all, the organization of the brain might again inspire computer architecture.
A new software-based architecture for quantum computer
NASA Astrophysics Data System (ADS)
Wu, Nan; Song, FangMin; Li, Xiangdong
2010-04-01
In this paper, we study a reliable architecture of a quantum computer and a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of the quantum computing. We also try to address some key issues in detail in the software-driven universal quantum computers.
NASA Technical Reports Server (NTRS)
1990-01-01
NASA's Space Station Freedom Program (SSFP) planning efforts have identified a need for a payload training simulator system to serve as both a training facility and as a demonstrator to validate operational concepts. The envisioned MSFC Payload Training Complex (PTC) required to meet this need will train the Space Station payload scientists, station scientists, and ground controllers to operate the wide variety of experiments that will be onboard the Space Station Freedom. The Simulation Computer System (SCS) is the computer hardware, software, and workstations that will support the Payload Training Complex at MSFC. The purpose of this SCS Study is to investigate issues related to the SCS, alternative requirements, simulator approaches, and state-of-the-art technologies to develop candidate concepts and designs.
Scalable digital hardware for a trapped ion quantum computer
NASA Astrophysics Data System (ADS)
Mount, Emily; Gaultney, Daniel; Vrijsen, Geert; Adams, Michael; Baek, So-Young; Hudek, Kai; Isabella, Louis; Crain, Stephen; van Rynbach, Andre; Maunz, Peter; Kim, Jungsang
2016-12-01
Many of the challenges of scaling quantum computer hardware lie at the interface between the qubits and the classical control signals used to manipulate them. Modular ion trap quantum computer architectures address scalability by constructing individual quantum processors interconnected via a network of quantum communication channels. Successful operation of such quantum hardware requires a fully programmable classical control system capable of frequency stabilizing the continuous wave lasers necessary for loading, cooling, initialization, and detection of the ion qubits, stabilizing the optical frequency combs used to drive logic gate operations on the ion qubits, providing a large number of analog voltage sources to drive the trap electrodes, and a scheme for maintaining phase coherence among all the controllers that manipulate the qubits. In this work, we describe scalable solutions to these hardware development challenges.
NASA Technical Reports Server (NTRS)
1990-01-01
NASA's Space Station Freedom Program (SSFP) planning efforts have identified a need for a payload training simulator system to serve as both a training facility and as a demonstrator to validate operational concepts. The envisioned MSFC Payload Training Complex (PTC) required to meet this need will train the Space Station payload scientists, station scientists, and ground controllers to operate the wide variety of experiments that will be onboard the Space Station Freedom. The Simulation Computer System (SCS) is made up of the computer hardware, software, and workstations that will support the Payload Training Complex at MSFC. The purpose of this SCS Study is to investigate issues related to the SCS, alternative requirements, simulator approaches, and state-of-the-art technologies to develop candidate concepts and designs.
Strategic Adaptation of SCA for STRS
NASA Technical Reports Server (NTRS)
Quinn, Todd; Kacpura, Thomas
2007-01-01
The Space Telecommunication Radio System (STRS) architecture is being developed to provide a standard framework for future NASA space radios with greater degrees of interoperability and flexibility to meet new mission requirements. The space environment imposes unique operational requirements with restrictive size, weight, and power constraints that are significantly smaller than terrestrial-based military communication systems. With the harsh radiation environment of space, the computing and processing resources are typically one or two generations behind current terrestrial technologies. Despite these differences, there are elements of the SCA that can be adapted to facilitate the design and implementation of the STRS architecture.
On the Computing Potential of Intracellular Vesicles
Mayne, Richard; Adamatzky, Andrew
2015-01-01
Collision-based computing (CBC) is a form of unconventional computing in which travelling localisations represent data and conditional routing of signals determines the output state; collisions between localisations represent logical operations. We investigated patterns of Ca2+-containing vesicle distribution within a live organism, slime mould Physarum polycephalum, with confocal microscopy and observed them colliding regularly. Vesicles travel down cytoskeletal ‘circuitry’ and their collisions may result in reflection, fusion or annihilation. We demonstrate through experimental observations that naturally-occurring vesicle dynamics may be characterised as a computationally-universal set of Boolean logical operations and present a ‘vesicle modification’ of the archetypal CBC ‘billiard ball model’ of computation. We proceed to discuss the viability of intracellular vesicles as an unconventional computing substrate in which we delineate practical considerations for reliable vesicle ‘programming’ in both in vivo and in vitro vesicle computing architectures and present optimised designs for both single logical gates and combinatorial logic circuits based on cytoskeletal network conformations. The results presented here demonstrate the first characterisation of intracelluar phenomena as collision-based computing and hence the viability of biological substrates for computing. PMID:26431435
Software Systems for High-performance Quantum Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Humble, Travis S; Britt, Keith A
Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present models for the quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventionalmore » computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.« less
Automatic maintenance payload on board of a Mexican LEO microsatellite
NASA Astrophysics Data System (ADS)
Vicente-Vivas, Esaú; García-Nocetti, Fabián; Mendieta-Jiménez, Francisco
2006-02-01
Few research institutions from Mexico work together to finalize the integration of a technological demonstration microsatellite called Satex, aiming the launching of the first ever fully designed and manufactured domestic space vehicle. The project is based on technical knowledge gained in previous space experiences, particularly in developing GASCAN automatic experiments for NASA's space shuttle, and in some support obtained from the local team which assembled the México-OSCAR-30 microsatellites. Satex includes three autonomous payloads and a power subsystem, each one with a local microcomputer to provide intelligent and dedicated control. It also contains a flight computer (FC) with a pair of full redundancies. This enables the remote maintenance of processing boards from the ground station. A fourth communications payload depends on the flight computer for control purposes. A fifth payload was decided to be developed for the satellite. It adds value to the available on-board computers and extends the opportunity for a developing country to learn and to generate domestic space technology. Its aim is to provide automatic maintenance capabilities for the most critical on-board computer in order to achieve continuous satellite operations. This paper presents the virtual computer architecture specially developed to provide maintenance capabilities to the flight computer. The architecture is periodically implemented by software with a small amount of physical processors (FC processors) and virtual redundancies (payload processors) to emulate a hybrid redundancy computer. Communications among processors are accomplished over a fault-tolerant LAN. This allows a versatile operating behavior in terms of data communication as well as in terms of distributed fault tolerance. Obtained results, payload validation and reliability results are also presented.
NASA Technical Reports Server (NTRS)
Gorospe, George E., Jr.; Daigle, Matthew J.; Sankararaman, Shankar; Kulkarni, Chetan S.; Ng, Eley
2017-01-01
Prognostic methods enable operators and maintainers to predict the future performance for critical systems. However, these methods can be computationally expensive and may need to be performed each time new information about the system becomes available. In light of these computational requirements, we have investigated the application of graphics processing units (GPUs) as a computational platform for real-time prognostics. Recent advances in GPU technology have reduced cost and increased the computational capability of these highly parallel processing units, making them more attractive for the deployment of prognostic software. We present a survey of model-based prognostic algorithms with considerations for leveraging the parallel architecture of the GPU and a case study of GPU-accelerated battery prognostics with computational performance results.
A novel parallel architecture for local histogram equalization
NASA Astrophysics Data System (ADS)
Ohannessian, Mesrob I.; Choueiter, Ghinwa F.; Diab, Hassan
2005-07-01
Local histogram equalization is an image enhancement algorithm that has found wide application in the pre-processing stage of areas such as computer vision, pattern recognition and medical imaging. The computationally intensive nature of the procedure, however, is a main limitation when real time interactive applications are in question. This work explores the possibility of performing parallel local histogram equalization, using an array of special purpose elementary processors, through an HDL implementation that targets FPGA or ASIC platforms. A novel parallelization scheme is presented and the corresponding architecture is derived. The algorithm is reduced to pixel-level operations. Processing elements are assigned image blocks, to maintain a reasonable performance-cost ratio. To further simplify both processor and memory organizations, a bit-serial access scheme is used. A brief performance assessment is provided to illustrate and quantify the merit of the approach.
A smart sensor architecture based on emergent computation in an array of outer-totalistic cells
NASA Astrophysics Data System (ADS)
Dogaru, Radu; Dogaru, Ioana; Glesner, Manfred
2005-06-01
A novel smart-sensor architecture is proposed, capable to segment and recognize characters in a monochrome image. It is capable to provide a list of ASCII codes representing the recognized characters from the monochrome visual field. It can operate as a blind's aid or for industrial applications. A bio-inspired cellular model with simple linear neurons was found the best to perform the nontrivial task of cropping isolated compact objects such as handwritten digits or characters. By attaching a simple outer-totalistic cell to each pixel sensor, emergent computation in the resulting cellular automata lattice provides a straightforward and compact solution to the otherwise computationally intensive problem of character segmentation. A simple and robust recognition algorithm is built in a compact sequential controller accessing the array of cells so that the integrated device can provide directly a list of codes of the recognized characters. Preliminary simulation tests indicate good performance and robustness to various distortions of the visual field.
System on a chip with MPEG-4 capability
NASA Astrophysics Data System (ADS)
Yassa, Fathy; Schonfeld, Dan
2002-12-01
Current products supporting video communication applications rely on existing computer architectures. RISC processors have been used successfully in numerous applications over several decades. DSP processors have become ubiquitous in signal processing and communication applications. Real-time applications such as speech processing in cellular telephony rely extensively on the computational power of these processors. Video processors designed to implement the computationally intensive codec operations have also been used to address the high demands of video communication applications (e.g., cable set-top boxes and DVDs). This paper presents an overview of a system-on-chip (SOC) architecture used for real-time video in wireless communication applications. The SOC specifications answer to the system requirements imposed by the application environment. A CAM-based video processor is used to accelerate data intensive video compression tasks such as motion estimations and filtering. Other components are dedicated to system level data processing and audio processing. A rich set of I/Os allows the SOC to communicate with other system components such as baseband and memory subsystems.
KeyWare: an open wireless distributed computing environment
NASA Astrophysics Data System (ADS)
Shpantzer, Isaac; Schoenfeld, Larry; Grindahl, Merv; Kelman, Vladimir
1995-12-01
Deployment of distributed applications in the wireless domain lack equivalent tools, methodologies, architectures, and network management that exist in LAN based applications. A wireless distributed computing environment (KeyWareTM) based on intelligent agents within a multiple client multiple server scheme was developed to resolve this problem. KeyWare renders concurrent application services to wireline and wireless client nodes encapsulated in multiple paradigms such as message delivery, database access, e-mail, and file transfer. These services and paradigms are optimized to cope with temporal and spatial radio coverage, high latency, limited throughput and transmission costs. A unified network management paradigm for both wireless and wireline facilitates seamless extensions of LAN- based management tools to include wireless nodes. A set of object oriented tools and methodologies enables direct asynchronous invocation of agent-based services supplemented by tool-sets matched to supported KeyWare paradigms. The open architecture embodiment of KeyWare enables a wide selection of client node computing platforms, operating systems, transport protocols, radio modems and infrastructures while maintaining application portability.
Architectures for single-chip image computing
NASA Astrophysics Data System (ADS)
Gove, Robert J.
1992-04-01
This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new-generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.
Transitioning ISR architecture into the cloud
NASA Astrophysics Data System (ADS)
Lash, Thomas D.
2012-06-01
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.
Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein
2015-12-01
Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies might include Smartphones and wearable devices, which have attracted the attention of medical applications. These products are used less in critical medical applications because of their resource constraint and failure sensitivity. This is due to the fact that without safety considerations, small-integrated hardware will endanger patients' lives. Therefore, proposing some principals is required to construct wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to cloud. The tiers of this architecture include wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high possible fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.
Architecture Framework for Trapped-Ion Quantum Computer based on Performance Simulation Tool
NASA Astrophysics Data System (ADS)
Ahsan, Muhammad
The challenge of building scalable quantum computer lies in striking appropriate balance between designing a reliable system architecture from large number of faulty computational resources and improving the physical quality of system components. The detailed investigation of performance variation with physics of the components and the system architecture requires adequate performance simulation tool. In this thesis we demonstrate a software tool capable of (1) mapping and scheduling the quantum circuit on a realistic quantum hardware architecture with physical resource constraints, (2) evaluating the performance metrics such as the execution time and the success probability of the algorithm execution, and (3) analyzing the constituents of these metrics and visualizing resource utilization to identify system components which crucially define the overall performance. Using this versatile tool, we explore vast design space for modular quantum computer architecture based on trapped ions. We find that while success probability is uniformly determined by the fidelity of physical quantum operation, the execution time is a function of system resources invested at various layers of design hierarchy. At physical level, the number of lasers performing quantum gates, impact the latency of the fault-tolerant circuit blocks execution. When these blocks are used to construct meaningful arithmetic circuit such as quantum adders, the number of ancilla qubits for complicated non-clifford gates and entanglement resources to establish long-distance communication channels, become major performance limiting factors. Next, in order to factorize large integers, these adders are assembled into modular exponentiation circuit comprising bulk of Shor's algorithm. At this stage, the overall scaling of resource-constraint performance with the size of problem, describes the effectiveness of chosen design. By matching the resource investment with the pace of advancement in hardware technology, we find optimal designs for different types of quantum adders. Conclusively, we show that 2,048-bit Shor's algorithm can be reliably executed within the resource budget of 1.5 million qubits.
Advanced computer architecture specification for automated weld systems
NASA Technical Reports Server (NTRS)
Katsinis, Constantine
1994-01-01
This report describes the requirements for an advanced automated weld system and the associated computer architecture, and defines the overall system specification from a broad perspective. According to the requirements of welding procedures as they relate to an integrated multiaxis motion control and sensor architecture, the computer system requirements are developed based on a proven multiple-processor architecture with an expandable, distributed-memory, single global bus architecture, containing individual processors which are assigned to specific tasks that support sensor or control processes. The specified architecture is sufficiently flexible to integrate previously developed equipment, be upgradable and allow on-site modifications.
NASA Technical Reports Server (NTRS)
Jain, Abhinandan
2011-01-01
Ndarts software provides algorithms for computing quantities associated with the dynamics of articulated, rigid-link, multibody systems. It is designed as a general-purpose dynamics library that can be used for the modeling of robotic platforms, space vehicles, molecular dynamics, and other such applications. The architecture and algorithms in Ndarts are based on the Spatial Operator Algebra (SOA) theory for computational multibody and robot dynamics developed at JPL. It uses minimal, internal coordinate models. The algorithms are low-order, recursive scatter/ gather algorithms. In comparison with the earlier Darts++ software, this version has a more general and cleaner design needed to support a larger class of computational dynamics needs. It includes a frames infrastructure, allows algorithms to operate on subgraphs of the system, and implements lazy and deferred computation for better efficiency. Dynamics modeling modules such as Ndarts are core building blocks of control and simulation software for space, robotic, mechanism, bio-molecular, and material systems modeling.
Recursive computer architecture for VLSI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treleaven, P.C.; Hopkins, R.P.
1982-01-01
A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is being restricted to simple, identical microcomputers each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems are termed fifth generation computers by the Japanese. 30 references.
Knowledge-based processing for aircraft flight control
NASA Technical Reports Server (NTRS)
Painter, John H.
1991-01-01
The purpose is to develop algorithms and architectures for embedding artificial intelligence in aircraft guidance and control systems. With the approach adopted, AI-computing is used to create an outer guidance loop for driving the usual aircraft autopilot. That is, a symbolic processor monitors the operation and performance of the aircraft. Then, based on rules and other stored knowledge, commands are automatically formulated for driving the autopilot so as to accomplish desired flight operations. The focus is on developing a software system which can respond to linguistic instructions, input in a standard format, so as to formulate a sequence of simple commands to the autopilot. The instructions might be a fairly complex flight clearance, input either manually or by data-link. Emphasis is on a software system which responds much like a pilot would, employing not only precise computations, but, also, knowledge which is less precise, but more like common-sense. The approach is based on prior work to develop a generic 'shell' architecture for an AI-processor, which may be tailored to many applications by describing the application in appropriate processor data bases (libraries). Such descriptions include numerical models of the aircraft and flight control system, as well as symbolic (linguistic) descriptions of flight operations, rules, and tactics.
Motion-sensor fusion-based gesture recognition and its VLSI architecture design for mobile devices
NASA Astrophysics Data System (ADS)
Zhu, Wenping; Liu, Leibo; Yin, Shouyi; Hu, Siqi; Tang, Eugene Y.; Wei, Shaojun
2014-05-01
With the rapid proliferation of smartphones and tablets, various embedded sensors are incorporated into these platforms to enable multimodal human-computer interfaces. Gesture recognition, as an intuitive interaction approach, has been extensively explored in the mobile computing community. However, most gesture recognition implementations by now are all user-dependent and only rely on accelerometer. In order to achieve competitive accuracy, users are required to hold the devices in predefined manner during the operation. In this paper, a high-accuracy human gesture recognition system is proposed based on multiple motion sensor fusion. Furthermore, to reduce the energy overhead resulted from frequent sensor sampling and data processing, a high energy-efficient VLSI architecture implemented on a Xilinx Virtex-5 FPGA board is also proposed. Compared with the pure software implementation, approximately 45 times speed-up is achieved while operating at 20 MHz. The experiments show that the average accuracy for 10 gestures achieves 93.98% for user-independent case and 96.14% for user-dependent case when subjects hold the device randomly during completing the specified gestures. Although a few percent lower than the conventional best result, it still provides competitive accuracy acceptable for practical usage. Most importantly, the proposed system allows users to hold the device randomly during operating the predefined gestures, which substantially enhances the user experience.
FPGA design for constrained energy minimization
NASA Astrophysics Data System (ADS)
Wang, Jianwei; Chang, Chein-I.; Cao, Mang
2004-02-01
The Constrained Energy Minimization (CEM) has been widely used for hyperspectral detection and classification. The feasibility of implementing the CEM as a real-time processing algorithm in systolic arrays has been also demonstrated. The main challenge of realizing the CEM in hardware architecture in the computation of the inverse of the data correlation matrix performed in the CEM, which requires a complete set of data samples. In order to cope with this problem, the data correlation matrix must be calculated in a causal manner which only needs data samples up to the sample at the time it is processed. This paper presents a Field Programmable Gate Arrays (FPGA) design of such a causal CEM. The main feature of the proposed FPGA design is to use the Coordinate Rotation DIgital Computer (CORDIC) algorithm that can convert a Givens rotation of a vector to a set of shift-add operations. As a result, the CORDIC algorithm can be easily implemented in hardware architecture, therefore in FPGA. Since the computation of the inverse of the data correlction involves a series of Givens rotations, the utility of the CORDIC algorithm allows the causal CEM to perform real-time processing in FPGA. In this paper, an FPGA implementation of the causal CEM will be studied and its detailed architecture will be also described.
Fast associative memory + slow neural circuitry = the computational model of the brain.
NASA Astrophysics Data System (ADS)
Berkovich, Simon; Berkovich, Efraim; Lapir, Gennady
1997-08-01
We propose a computational model of the brain based on a fast associative memory and relatively slow neural processors. In this model, processing time is expensive but memory access is not, and therefore most algorithmic tasks would be accomplished by using large look-up tables as opposed to calculating. The essential feature of an associative memory in this context (characteristic for a holographic type memory) is that it works without an explicit mechanism for resolution of multiple responses. As a result, the slow neuronal processing elements, overwhelmed by the flow of information, operate as a set of templates for ranking of the retrieved information. This structure addresses the primary controversy in the brain architecture: distributed organization of memory vs. localization of processing centers. This computational model offers an intriguing explanation of many of the paradoxical features in the brain architecture, such as integration of sensors (through DMA mechanism), subliminal perception, universality of software, interrupts, fault-tolerance, certain bizarre possibilities for rapid arithmetics etc. In conventional computer science the presented type of a computational model did not attract attention as it goes against the technological grain by using a working memory faster than processing elements.
Stability and performance tradeoffs in bi-lateral telemanipulation
NASA Technical Reports Server (NTRS)
Hannaford, Blake
1989-01-01
Kinesthetic force feedback provides measurable increase in remote manipulation system performance. Intensive computation time requirements or operation under conditions of time delay can cause serious stability problems in control-system design. Here, a simplified linear analysis of this stability problem is presented for the forward-flow generalized architecture, applying the hybrid two-port representation to express the loop gain of the traditional master-slave architecture, which can be subjected to similar analysis. The hybrid two-port representation is also used to express the effects on the fidelity of manipulation or feel of one design approach used to stabilize the forward-flow architecture. The results suggest that, when local force feedback at the slave side is used to reduce manipulator stability problems, a price is paid in terms of telemanipulation fidelity.
HRST architecture modeling and assessments
NASA Astrophysics Data System (ADS)
Comstock, Douglas A.
1997-01-01
This paper presents work supporting the assessment of advanced concept options for the Highly Reusable Space Transportation (HRST) study. It describes the development of computer models as the basis for creating an integrated capability to evaluate the economic feasibility and sustainability of a variety of system architectures. It summarizes modeling capabilities for use on the HRST study to perform sensitivity analysis of alternative architectures (consisting of different combinations of highly reusable vehicles, launch assist systems, and alternative operations and support concepts) in terms of cost, schedule, performance, and demand. In addition, the identification and preliminary assessment of alternative market segments for HRST applications, such as space manufacturing, space tourism, etc., is described. Finally, the development of an initial prototype model that can begin to be used for modeling alternative HRST concepts at the system level is presented.
An Efficient VLSI Architecture of the Enhanced Three Step Search Algorithm
NASA Astrophysics Data System (ADS)
Biswas, Baishik; Mukherjee, Rohan; Saha, Priyabrata; Chakrabarti, Indrajit
2016-09-01
The intense computational complexity of any video codec is largely due to the motion estimation unit. The Enhanced Three Step Search is a popular technique that can be adopted for fast motion estimation. This paper proposes a novel VLSI architecture for the implementation of the Enhanced Three Step Search Technique. A new addressing mechanism has been introduced which enhances the speed of operation and reduces the area requirements. The proposed architecture when implemented in Verilog HDL on Virtex-5 Technology and synthesized using Xilinx ISE Design Suite 14.1 achieves a critical path delay of 4.8 ns while the area comes out to be 2.9K gate equivalent. It can be incorporated in commercial devices like smart-phones, camcorders, video conferencing systems etc.
Vemuri, Anant S; Wu, Jungle Chi-Hsiang; Liu, Kai-Che; Wu, Hurng-Sheng
2012-12-01
Surgical procedures have undergone considerable advancement during the last few decades. More recently, the availability of some imaging methods intraoperatively has added a new dimension to minimally invasive techniques. Augmented reality in surgery has been a topic of intense interest and research. Augmented reality involves usage of computer vision algorithms on video from endoscopic cameras or cameras mounted in the operating room to provide the surgeon additional information that he or she otherwise would have to recognize intuitively. One of the techniques combines a virtual preoperative model of the patient with the endoscope camera using natural or artificial landmarks to provide an augmented reality view in the operating room. The authors' approach is to provide this with the least number of changes to the operating room. Software architecture is presented to provide interactive adjustment in the registration of a three-dimensional (3D) model and endoscope video. Augmented reality including adrenalectomy, ureteropelvic junction obstruction, and retrocaval ureter and pancreas was used to perform 12 surgeries. The general feedback from the surgeons has been very positive not only in terms of deciding the positions for inserting points but also in knowing the least change in anatomy. The approach involves providing a deformable 3D model architecture and its application to the operating room. A 3D model with a deformable structure is needed to show the shape change of soft tissue during the surgery. The software architecture to provide interactive adjustment in registration of the 3D model and endoscope video with adjustability of every 3D model is presented.
Interaction with Machine Improvisation
NASA Astrophysics Data System (ADS)
Assayag, Gerard; Bloch, George; Cont, Arshia; Dubnov, Shlomo
We describe two multi-agent architectures for an improvisation oriented musician-machine interaction systems that learn in real time from human performers. The improvisation kernel is based on sequence modeling and statistical learning. We present two frameworks of interaction with this kernel. In the first, the stylistic interaction is guided by a human operator in front of an interactive computer environment. In the second framework, the stylistic interaction is delegated to machine intelligence and therefore, knowledge propagation and decision are taken care of by the computer alone. The first framework involves a hybrid architecture using two popular composition/performance environments, Max and OpenMusic, that are put to work and communicate together, each one handling the process at a different time/memory scale. The second framework shares the same representational schemes with the first but uses an Active Learning architecture based on collaborative, competitive and memory-based learning to handle stylistic interactions. Both systems are capable of processing real-time audio/video as well as MIDI. After discussing the general cognitive background of improvisation practices, the statistical modelling tools and the concurrent agent architecture are presented. Then, an Active Learning scheme is described and considered in terms of using different improvisation regimes for improvisation planning. Finally, we provide more details about the different system implementations and describe several performances with the system.
Efficient self-organizing multilayer neural network for nonlinear system modeling.
Han, Hong-Gui; Wang, Li-Dan; Qiao, Jun-Fei
2013-07-01
It has been shown extensively that the dynamic behaviors of a neural system are strongly influenced by the network architecture and learning process. To establish an artificial neural network (ANN) with self-organizing architecture and suitable learning algorithm for nonlinear system modeling, an automatic axon-neural network (AANN) is investigated in the following respects. First, the network architecture is constructed automatically to change both the number of hidden neurons and topologies of the neural network during the training process. The approach introduced in adaptive connecting-and-pruning algorithm (ACP) is a type of mixed mode operation, which is equivalent to pruning or adding the connecting of the neurons, as well as inserting some required neurons directly. Secondly, the weights are adjusted, using a feedforward computation (FC) to obtain the information for the gradient during learning computation. Unlike most of the previous studies, AANN is able to self-organize the architecture and weights, and to improve the network performances. Also, the proposed AANN has been tested on a number of benchmark problems, ranging from nonlinear function approximating to nonlinear systems modeling. The experimental results show that AANN can have better performances than that of some existing neural networks. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan
2006-01-01
Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefi-ont sensors such as optical design simplicity and stability. However, the image-based approach is computational intensive, and therefore, specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures require numerous two-dimensional Fourier Transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64 Node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
ActiveTutor: Towards More Adaptive Features in an E-Learning Framework
ERIC Educational Resources Information Center
Fournier, Jean-Pierre; Sansonnet, Jean-Paul
2008-01-01
Purpose: This paper aims to sketch the emerging notion of auto-adaptive software when applied to e-learning software. Design/methodology/approach: The study and the implementation of the auto-adaptive architecture are based on the operational framework "ActiveTutor" that is used for teaching the topic of computer science programming in first-grade…
Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS
NASA Astrophysics Data System (ADS)
Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.
2018-03-01
Simulation of breaking waves by using Navier-Stokes equation via moving particle semi-implicit method (MPS) over close domain is given. The results show the parallel computing on multicore architecture using OpenMP platform can reduce the computational time almost half of the serial time. Here, the comparison using two computer architectures (AMD and Intel) are performed. The results using Intel architecture is shown better than AMD architecture in CPU time. However, in efficiency, the computer with AMD architecture gives slightly higher than the Intel. For the simulation by 1512 number of particles, the CPU time using Intel and AMD are 12662.47 and 28282.30 respectively. Moreover, the efficiency using similar number of particles, AMD obtains 50.09 % and Intel up to 49.42 %.
A synchronized computational architecture for generalized bilateral control of robot arms
NASA Technical Reports Server (NTRS)
Bejczy, Antal K.; Szakaly, Zoltan
1987-01-01
This paper describes a computational architecture for an interconnected high speed distributed computing system for generalized bilateral control of robot arms. The key method of the architecture is the use of fully synchronized, interrupt driven software. Since an objective of the development is to utilize the processing resources efficiently, the synchronization is done in the hardware level to reduce system software overhead. The architecture also achieves a balaced load on the communication channel. The paper also describes some architectural relations to trading or sharing manual and automatic control.
The Kepler Science Data Processing Pipeline Source Code Road Map
NASA Technical Reports Server (NTRS)
Wohler, Bill; Jenkins, Jon M.; Twicken, Joseph D.; Bryson, Stephen T.; Clarke, Bruce Donald; Middour, Christopher K.; Quintana, Elisa Victoria; Sanderfer, Jesse Thomas; Uddin, Akm Kamal; Sabale, Anima;
2016-01-01
We give an overview of the operational concepts and architecture of the Kepler Science Processing Pipeline. Designed, developed, operated, and maintained by the Kepler Science Operations Center (SOC) at NASA Ames Research Center, the Science Processing Pipeline is a central element of the Kepler Ground Data System. The SOC consists of an office at Ames Research Center, software development and operations departments, and a data center which hosts the computers required to perform data analysis. The SOC's charter is to analyze stellar photometric data from the Kepler spacecraft and report results to the Kepler Science Office for further analysis. We describe how this is accomplished via the Kepler Science Processing Pipeline, including, the software algorithms. We present the high-performance, parallel computing software modules of the pipeline that perform transit photometry, pixel-level calibration, systematic error correction, attitude determination, stellar target management, and instrument characterization.
Distributed intelligence for supervisory control
NASA Technical Reports Server (NTRS)
Wolfe, W. J.; Raney, S. D.
1987-01-01
Supervisory control systems must deal with various types of intelligence distributed throughout the layers of control. Typical layers are real-time servo control, off-line planning and reasoning subsystems and finally, the human operator. Design methodologies must account for the fact that the majority of the intelligence will reside with the human operator. Hierarchical decompositions and feedback loops as conceptual building blocks that provide a common ground for man-machine interaction are discussed. Examples of types of parallelism and parallel implementation on several classes of computer architecture are also discussed.
Open Radio Communications Architecture Core Framework V1.1.0 Volume 1 Software Users Manual
2005-02-01
on a PC utilizing the KDE desktop that comes with Red Hat Linux . The default desktop for most Red Hat Linux installations is the GNOME desktop. The...SCA) v2.2. The software was designed for a desktop computer running the Linux operating system (OS). It was developed in C++, uses ACE/TAO for CORBA...middleware, Xerces for the XML parser, and Red Hat Linux for the Operating System. The software is referred to as, Open Radio Communication
Architecture Specification for PAVE PILLAR Avionics
1987-01-01
PAVE PILLAR system is 99% fault detection. The percent fault detection is determined by the following computation. The number of verified failures de ...reconfiguration or reparameterization requi’red to support manual operations rests w’ith the Mission Supervi’sor. 3.3.8 corm~utr _ De in 3.3.8.1 Hither...1Order Ti.rie Su ’, .S.yStem The Operational Flight Program (OFP) will be de - veloped in accordance with the requirements of the Ada (ANSI/ MIL-STD
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Stocker, John C.; Golomb, Andrew M.
2011-01-01
Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
Boolean and brain-inspired computing using spin-transfer torque devices
NASA Astrophysics Data System (ADS)
Fan, Deliang
Several completely new approaches (such as spintronic, carbon nanotube, graphene, TFETs, etc.) to information processing and data storage technologies are emerging to address the time frame beyond current Complementary Metal-Oxide-Semiconductor (CMOS) roadmap. The high speed magnetization switching of a nano-magnet due to current induced spin-transfer torque (STT) have been demonstrated in recent experiments. Such STT devices can be explored in compact, low power memory and logic design. In order to truly leverage STT devices based computing, researchers require a re-think of circuit, architecture, and computing model, since the STT devices are unlikely to be drop-in replacements for CMOS. The potential of STT devices based computing will be best realized by considering new computing models that are inherently suited to the characteristics of STT devices, and new applications that are enabled by their unique capabilities, thereby attaining performance that CMOS cannot achieve. The goal of this research is to conduct synergistic exploration in architecture, circuit and device levels for Boolean and brain-inspired computing using nanoscale STT devices. Specifically, we first show that the non-volatile STT devices can be used in designing configurable Boolean logic blocks. We propose a spin-memristor threshold logic (SMTL) gate design, where memristive cross-bar array is used to perform current mode summation of binary inputs and the low power current mode spintronic threshold device carries out the energy efficient threshold operation. Next, for brain-inspired computing, we have exploited different spin-transfer torque device structures that can implement the hard-limiting and soft-limiting artificial neuron transfer functions respectively. We apply such STT based neuron (or 'spin-neuron') in various neural network architectures, such as hierarchical temporal memory and feed-forward neural network, for performing "human-like" cognitive computing, which show more than two orders of lower energy consumption compared to state of the art CMOS implementation. Finally, we show the dynamics of injection locked Spin Hall Effect Spin-Torque Oscillator (SHE-STO) cluster can be exploited as a robust multi-dimensional distance metric for associative computing, image/ video analysis, etc. Our simulation results show that the proposed system architecture with injection locked SHE-STOs and the associated CMOS interface circuits can be suitable for robust and energy efficient associative computing and pattern matching.
Architectural Specialization for Inter-Iteration Loop Dependence Patterns
2015-10-01
Architectural Specialization for Inter-Iteration Loop Dependence Patterns Christopher Batten Computer Systems Laboratory School of Electrical and...Trends in Computer Architecture Transistors (Thousands) Frequency (MHz) Typical Power (W) MIPS R2K Intel P4 DEC Alpha 21264 Data collected by M...T as ks p er Jo ule ) Simple Processor Design Power Constraint High-Performance Architectures Embedded Architectures Design Performance
An Architecture to Enable Autonomous Control of Spacecraft
NASA Technical Reports Server (NTRS)
May, Ryan D.; Dever, Timothy P.; Soeder, James F.; George, Patrick J.; Morris, Paul H.; Colombano, Silvano P.; Frank, Jeremy D.; Schwabacher, Mark A.; Wang, Liu; LawLer, Dennis
2014-01-01
Autonomy is required for manned spacecraft missions distant enough that light-time communication delays make ground-based mission control infeasible. Presently, ground controllers develop a complete schedule of power modes for all spacecraft components based on a large number of factors. The proposed architecture is an early attempt to formalize and automate this process using on-vehicle computation resources. In order to demonstrate this architecture, an autonomous electrical power system controller and vehicle Mission Manager are constructed. These two components are designed to work together in order to plan upcoming load use as well as respond to unanticipated deviations from the plan. The communication protocol was developed using "paper" simulations prior to formally encoding the messages and developing software to implement the required functionality. These software routines exchange data via TCP/IP sockets with the Mission Manager operating at NASA Ames Research Center and the autonomous power controller running at NASA Glenn Research Center. The interconnected systems are tested and shown to be effective at planning the operation of a simulated quasi-steady state spacecraft power system and responding to unexpected disturbances.
Ag2S atomic switch-based `tug of war' for decision making
NASA Astrophysics Data System (ADS)
Lutz, C.; Hasegawa, T.; Chikyow, T.
2016-07-01
For a computing process such as making a decision, a software controlled chip of several transistors is necessary. Inspired by how a single cell amoeba decides its movements, the theoretical `tug of war' computing model was proposed but not yet implemented in an analogue device suitable for integrated circuits. Based on this model, we now developed a new electronic element for decision making processes, which will have no need for prior programming. The devices are based on the growth and shrinkage of Ag filaments in α-Ag2+δS gap-type atomic switches. Here we present the adapted device design and the new materials. We demonstrate the basic `tug of war' operation by IV-measurements and Scanning Electron Microscopy (SEM) observation. These devices could be the base for a CMOS-free new computer architecture.For a computing process such as making a decision, a software controlled chip of several transistors is necessary. Inspired by how a single cell amoeba decides its movements, the theoretical `tug of war' computing model was proposed but not yet implemented in an analogue device suitable for integrated circuits. Based on this model, we now developed a new electronic element for decision making processes, which will have no need for prior programming. The devices are based on the growth and shrinkage of Ag filaments in α-Ag2+δS gap-type atomic switches. Here we present the adapted device design and the new materials. We demonstrate the basic `tug of war' operation by IV-measurements and Scanning Electron Microscopy (SEM) observation. These devices could be the base for a CMOS-free new computer architecture. Electronic supplementary information (ESI) available. See DOI: 10.1039/c6nr00690f
NASA Astrophysics Data System (ADS)
Jiang, Yuning; Kang, Jinfeng; Wang, Xinan
2017-03-01
Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
Cognitive Architectures and Human-Computer Interaction. Introduction to Special Issue.
ERIC Educational Resources Information Center
Gray, Wayne D.; Young, Richard M.; Kirschenbaum, Susan S.
1997-01-01
In this introduction to a special issue on cognitive architectures and human-computer interaction (HCI), editors and contributors provide a brief overview of cognitive architectures. The following four architectures represented by articles in this issue are: Soar; LICAI (linked model of comprehension-based action planning and instruction taking);…
Biomimetic design processes in architecture: morphogenetic and evolutionary computational design.
Menges, Achim
2012-03-01
Design computation has profound impact on architectural design methods. This paper explains how computational design enables the development of biomimetic design processes specific to architecture, and how they need to be significantly different from established biomimetic processes in engineering disciplines. The paper first explains the fundamental difference between computer-aided and computational design in architecture, as the understanding of this distinction is of critical importance for the research presented. Thereafter, the conceptual relation and possible transfer of principles from natural morphogenesis to design computation are introduced and the related developments of generative, feature-based, constraint-based, process-based and feedback-based computational design methods are presented. This morphogenetic design research is then related to exploratory evolutionary computation, followed by the presentation of two case studies focusing on the exemplary development of spatial envelope morphologies and urban block morphologies.
A PACS archive architecture supported on cloud services.
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2012-05-01
Diagnostic imaging procedures have continuously increased over the last decade and this trend may continue in coming years, creating a great impact on storage and retrieval capabilities of current PACS. Moreover, many smaller centers do not have financial resources or requirements that justify the acquisition of a traditional infrastructure. Alternative solutions, such as cloud computing, may help address this emerging need. A tremendous amount of ubiquitous computational power, such as that provided by Google and Amazon, are used every day as a normal commodity. Taking advantage of this new paradigm, an architecture for a Cloud-based PACS archive that provides data privacy, integrity, and availability is proposed. The solution is independent from the cloud provider and the core modules were successfully instantiated in examples of two cloud computing providers. Operational metrics for several medical imaging modalities were tabulated and compared for Google Storage, Amazon S3, and LAN PACS. A PACS-as-a-Service archive that provides storage of medical studies using the Cloud was developed. The results show that the solution is robust and that it is possible to store, query, and retrieve all desired studies in a similar way as in a local PACS approach. Cloud computing is an emerging solution that promises high scalability of infrastructures, software, and applications, according to a "pay-as-you-go" business model. The presented architecture uses the cloud to setup medical data repositories and can have a significant impact on healthcare institutions by reducing IT infrastructures.
Adding control to arbitrary unknown quantum operations
Zhou, Xiao-Qi; Ralph, Timothy C.; Kalasuwan, Pruet; Zhang, Mian; Peruzzo, Alberto; Lanyon, Benjamin P.; O'Brien, Jeremy L.
2011-01-01
Although quantum computers promise significant advantages, the complexity of quantum algorithms remains a major technological obstacle. We have developed and demonstrated an architecture-independent technique that simplifies adding control qubits to arbitrary quantum operations—a requirement in many quantum algorithms, simulations and metrology. The technique, which is independent of how the operation is done, does not require knowledge of what the operation is, and largely separates the problems of how to implement a quantum operation in the laboratory and how to add a control. Here, we demonstrate an entanglement-based version in a photonic system, realizing a range of different two-qubit gates with high fidelity. PMID:21811242
NASA Technical Reports Server (NTRS)
Barnes, George H. (Inventor); Lundstrom, Stephen F. (Inventor); Shafer, Philip E. (Inventor)
1983-01-01
A high speed parallel array data processing architecture fashioned under a computational envelope approach includes a data base memory for secondary storage of programs and data, and a plurality of memory modules interconnected to a plurality of processing modules by a connection network of the Omega gender. Programs and data are fed from the data base memory to the plurality of memory modules and from hence the programs are fed through the connection network to the array of processors (one copy of each program for each processor). Execution of the programs occur with the processors operating normally quite independently of each other in a multiprocessing fashion. For data dependent operations and other suitable operations, all processors are instructed to finish one given task or program branch before all are instructed to proceed in parallel processing fashion on the next instruction. Even when functioning in the parallel processing mode however, the processors are not locked-step but execute their own copy of the program individually unless or until another overall processor array synchronization instruction is issued.
Uranus: a rapid prototyping tool for FPGA embedded computer vision
NASA Astrophysics Data System (ADS)
Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.
2007-01-01
The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Sub-nanosecond signal propagation in anisotropy-engineered nanomagnetic logic chains
Gu, Zheng; Nowakowski, Mark E.; Carlton, David B.; ...
2015-03-16
Energy efficient nanomagnetic logic (NML) computing architectures propagate binary information by relying on dipolar field coupling to reorient closely spaced nanoscale magnets. In the past, signal propagation in nanomagnet chains were characterized by static magnetic imaging experiments; however, the mechanisms that determine the final state and their reproducibility over millions of cycles in high-speed operation have yet to be experimentally investigated. Here we present a study of NML operation in a high-speed regime. We perform direct imaging of digital signal propagation in permalloy nanomagnet chains with varying degrees of shape-engineered biaxial anisotropy using full-field magnetic X-ray transmission microscopy and time-resolvedmore » photoemission electron microscopy after applying nanosecond magnetic field pulses. Moreover, an intrinsic switching time of 100 ps per magnet is observed. In conclusion these experiments, and accompanying macrospin and micromagnetic simulations, reveal the underlying physics of NML architectures repetitively operated on nanosecond timescales and identify relevant engineering parameters to optimize performance and reliability.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Radtke, M.A.
This paper will chronicle the activity at Wisconsin Public Service Corporation (WPSC) that resulted in the complete migration of a traditional, late 1970`s vintage, Energy Management System (EMS). The new environment includes networked microcomputers, minicomputers, and the corporate mainframe, and provides on-line access to employees outside the energy control center and some WPSC customers. In the late 1980`s, WPSC was forecasting an EMS computer upgrade or replacement to address both capacity and technology needs. Reasoning that access to diverse computing resources would best position the company to accommodate the uncertain needs of the energy industry in the 90`s, WPSC chosemore » to investigate an in-place migration to a network of computers, able to support heterogeneous hardware and operating systems. The system was developed in a modular fashion, with individual modules being deployed as soon as they were completed. The functional and technical specification was continuously enhanced as operating experience was gained from each operational module. With the migration off the original EMS computers complete, the networked system called DEMAXX (Distributed Energy Management Architecture with eXtensive eXpandability) has exceeded expectations in the areas of: cost, performance, flexibility, and reliability.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Radtke, M.A.
This paper will chronicle the activity at Wisconsin Public Service Corporation (WPSC) that resulted in the complete migration of a traditional, late 1970`s vintage, Energy management System (EMS). The new environment includes networked microcomputers, minicomputers, and the corporate mainframe, and provides on-line access to employees outside the energy control center and some WPSC customers. In the late 1980`s, WPSC was forecasting an EMS computer upgrade or replacement to address both capacity and technology needs. Reasoning that access to diverse computing resources would best position the company to accommodate the uncertain needs of the energy industry in the 90`s, WPSC chosemore » to investigate an in-place migration to a network of computers, able to support heterogeneous hardware and operating systems. The system was developed in a modular fashion, with individual modules being deployed as soon as they were completed. The functional and technical specification was continuously enhanced as operating experience was gained from each operational module. With the migration of the original EMS computers complete, the networked system called DEMAXX (Distributed Energy Management Architecture with eXtensive eXpandability) has exceeded expectations in the areas of: cost, performance, flexibility, and reliability.« less
The flight telerobotic servicer: From functional architecture to computer architecture
NASA Technical Reports Server (NTRS)
Lumia, Ronald; Fiala, John
1989-01-01
After a brief tutorial on the NASA/National Bureau of Standards Standard Reference Model for Telerobot Control System Architecture (NASREM) functional architecture, the approach to its implementation is shown. First, interfaces must be defined which are capable of supporting the known algorithms. This is illustrated by considering the interfaces required for the SERVO level of the NASREM functional architecture. After interface definition, the specific computer architecture for the implementation must be determined. This choice is obviously technology dependent. An example illustrating one possible mapping of the NASREM functional architecture to a particular set of computers which implements it is shown. The result of choosing the NASREM functional architecture is that it provides a technology independent paradigm which can be mapped into a technology dependent implementation capable of evolving with technology in the laboratory and in space.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dennig, Yasmin
Sandia National Laboratories has a long history of significant contributions to the high performance community and industry. Our innovative computer architectures allowed the United States to become the first to break the teraFLOP barrier—propelling us to the international spotlight. Our advanced simulation and modeling capabilities have been integral in high consequence US operations such as Operation Burnt Frost. Strong partnerships with industry leaders, such as Cray, Inc. and Goodyear, have enabled them to leverage our high performance computing (HPC) capabilities to gain a tremendous competitive edge in the marketplace. As part of our continuing commitment to providing modern computing infrastructuremore » and systems in support of Sandia missions, we made a major investment in expanding Building 725 to serve as the new home of HPC systems at Sandia. Work is expected to be completed in 2018 and will result in a modern facility of approximately 15,000 square feet of computer center space. The facility will be ready to house the newest National Nuclear Security Administration/Advanced Simulation and Computing (NNSA/ASC) Prototype platform being acquired by Sandia, with delivery in late 2019 or early 2020. This new system will enable continuing advances by Sandia science and engineering staff in the areas of operating system R&D, operation cost effectiveness (power and innovative cooling technologies), user environment and application code performance.« less
FPGA implementation of self organizing map with digital phase locked loops.
Hikawa, Hiroomi
2005-01-01
The self-organizing map (SOM) has found applicability in a wide range of application areas. Recently new SOM hardware with phase modulated pulse signal and digital phase-locked loops (DPLLs) has been proposed (Hikawa, 2005). The system uses the DPLL as a computing element since the operation of the DPLL is very similar to that of SOM's computation. The system also uses square waveform phase to hold the value of the each input vector element. This paper discuss the hardware implementation of the DPLL SOM architecture. For effective hardware implementation, some components are redesigned to reduce the circuit size. The proposed SOM architecture is described in VHDL and implemented on field programmable gate array (FPGA). Its feasibility is verified by experiments. Results show that the proposed SOM implemented on the FPGA has a good quantization capability, and its circuit size very small.
Virtualization - A Key Cost Saver in NASA Multi-Mission Ground System Architecture
NASA Technical Reports Server (NTRS)
Swenson, Paul; Kreisler, Stephen; Sager, Jennifer A.; Smith, Dan
2014-01-01
With science team budgets being slashed, and a lack of adequate facilities for science payload teams to operate their instruments, there is a strong need for innovative new ground systems that are able to provide necessary levels of capability processing power, system availability and redundancy while maintaining a small footprint in terms of physical space, power utilization and cooling.The ground system architecture being presented is based off of heritage from several other projects currently in development or operations at Goddard, but was designed and built specifically to meet the needs of the Science and Planetary Operations Control Center (SPOCC) as a low-cost payload command, control, planning and analysis operations center. However, this SPOCC architecture was designed to be generic enough to be re-used partially or in whole by other labs and missions (since its inception that has already happened in several cases!)The SPOCC architecture leverages a highly available VMware-based virtualization cluster with shared SAS Direct-Attached Storage (DAS) to provide an extremely high-performing, low-power-utilization and small-footprint compute environment that provides Virtual Machine resources shared among the various tenant missions in the SPOCC. The storage is also expandable, allowing future missions to chain up to 7 additional 2U chassis of storage at an extremely competitive cost if they require additional archive or virtual machine storage space.The software architecture provides a fully-redundant GMSEC-based message bus architecture based on the ActiveMQ middleware to track all health and safety status within the SPOCC ground system. All virtual machines utilize the GMSEC system agents to report system host health over the GMSEC bus, and spacecraft payload health is monitored using the Hammers Integrated Test and Operations System (ITOS) Galaxy Telemetry and Command (TC) system, which performs near-real-time limit checking and data processing on the downlinked data stream and injects messages into the GMSEC bus that are monitored to automatically page the on-call operator or Systems Administrator (SA) when an off-nominal condition is detected. This architecture, like the LTSP thin clients, are shared across all tenant missions.Other required IT security controls are implemented at the ground system level, including physical access controls, logical system-level authentication authorization management, auditing and reporting, network management and a NIST 800-53 FISMA-Moderate IT Security plan Risk Assessment Contingency Plan, helping multiple missions share the cost of compliance with agency-mandated directives.The SPOCC architecture provides science payload control centers and backup mission operations centers with a cost-effective, standardized approach to virtualizing and monitoring resources that were traditionally multiple racks full of physical machines. The increased agility in deploying new virtual systems and thin client workstations can provide significant savings in personnel costs for maintaining the ground system. The cost savings in procurement, power, rack footprint and cooling as well as the shared multi-mission design greatly reduces upfront cost for missions moving into the facility. Overall, the authors hope that this architecture will become a model for how future NASA operations centers are constructed!
Feeding People's Curiosity: Leveraging the Cloud for Automatic Dissemination of Mars Images
NASA Technical Reports Server (NTRS)
Knight, David; Powell, Mark
2013-01-01
Smartphones and tablets have made wireless computing ubiquitous, and users expect instant, on-demand access to information. The Mars Science Laboratory (MSL) operations software suite, MSL InterfaCE (MSLICE), employs a different back-end image processing architecture compared to that of the Mars Exploration Rovers (MER) in order to better satisfy modern consumer-driven usage patterns and to offer greater server-side flexibility. Cloud services are a centerpiece of the server-side architecture that allows new image data to be delivered automatically to both scientists using MSLICE and the general public through the MSL website (http://mars.jpl.nasa.gov/msl/).
Parallel heterogeneous architectures for efficient OMP compressive sensing reconstruction
NASA Astrophysics Data System (ADS)
Kulkarni, Amey; Stanislaus, Jerome L.; Mohsenin, Tinoosh
2014-05-01
Compressive Sensing (CS) is a novel scheme, in which a signal that is sparse in a known transform domain can be reconstructed using fewer samples. The signal reconstruction techniques are computationally intensive and have sluggish performance, which make them impractical for real-time processing applications . The paper presents novel architectures for Orthogonal Matching Pursuit algorithm, one of the popular CS reconstruction algorithms. We show the implementation results of proposed architectures on FPGA, ASIC and on a custom many-core platform. For FPGA and ASIC implementation, a novel thresholding method is used to reduce the processing time for the optimization problem by at least 25%. Whereas, for the custom many-core platform, efficient parallelization techniques are applied, to reconstruct signals with variant signal lengths of N and sparsity of m. The algorithm is divided into three kernels. Each kernel is parallelized to reduce execution time, whereas efficient reuse of the matrix operators allows us to reduce area. Matrix operations are efficiently paralellized by taking advantage of blocked algorithms. For demonstration purpose, all architectures reconstruct a 256-length signal with maximum sparsity of 8 using 64 measurements. Implementation on Xilinx Virtex-5 FPGA, requires 27.14 μs to reconstruct the signal using basic OMP. Whereas, with thresholding method it requires 18 μs. ASIC implementation reconstructs the signal in 13 μs. However, our custom many-core, operating at 1.18 GHz, takes 18.28 μs to complete. Our results show that compared to the previous published work of the same algorithm and matrix size, proposed architectures for FPGA and ASIC implementations perform 1.3x and 1.8x respectively faster. Also, the proposed many-core implementation performs 3000x faster than the CPU and 2000x faster than the GPU.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Donaldson, Thomas; Doty, Karl; Engle, Steven W.; Larson, Robert E.; O'Reilly, John G.
1988-01-01
The results of a Phase I Small Business Innovative Research (SBIR) project performed for the NASA Langley Computational Structural Mechanics Group are described. The project resulted in the identification of a family of chordal-ring interconnection architectures with excellent potential to serve as the basis for new multimicroprocessor (MMP) computers. The paper presents examples of how computational algorithms from structural mechanics can be efficiently implemented on the chordal-ring architecture.
NASA Technical Reports Server (NTRS)
Simon, Donald L.
2010-01-01
Aircraft engine performance trend monitoring and gas path fault diagnostics are closely related technologies that assist operators in managing the health of their gas turbine engine assets. Trend monitoring is the process of monitoring the gradual performance change that an aircraft engine will naturally incur over time due to turbomachinery deterioration, while gas path diagnostics is the process of detecting and isolating the occurrence of any faults impacting engine flow-path performance. Today, performance trend monitoring and gas path fault diagnostic functions are performed by a combination of on-board and off-board strategies. On-board engine control computers contain logic that monitors for anomalous engine operation in real-time. Off-board ground stations are used to conduct fleet-wide engine trend monitoring and fault diagnostics based on data collected from each engine each flight. Continuing advances in avionics are enabling the migration of portions of the ground-based functionality on-board, giving rise to more sophisticated on-board engine health management capabilities. This paper reviews the conventional engine performance trend monitoring and gas path fault diagnostic architecture commonly applied today, and presents a proposed enhanced on-board architecture for future applications. The enhanced architecture gains real-time access to an expanded quantity of engine parameters, and provides advanced on-board model-based estimation capabilities. The benefits of the enhanced architecture include the real-time continuous monitoring of engine health, the early diagnosis of fault conditions, and the estimation of unmeasured engine performance parameters. A future vision to advance the enhanced architecture is also presented and discussed
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.; Lytle, John K. (Technical Monitor)
2002-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS Architecture including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms which has been the focus of NASA Ames. Additional features of the object oriented architecture that support MultiDisciplinary (MD) Coupling, computer aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.
Modeling the evolution of protein domain architectures using maximum parsimony.
Fong, Jessica H; Geer, Lewis Y; Panchenko, Anna R; Bryant, Stephen H
2007-02-09
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.
Modeling the Evolution of Protein Domain Architectures Using Maximum Parsimony
Fong, Jessica H.; Geer, Lewis Y.; Panchenko, Anna R.; Bryant, Stephen H.
2007-01-01
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture “neighbors” identified in this way may lead to new insights about the evolution of protein function. PMID:17166515
Computation Directorate 2008 Annual Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crawford, D L
2009-03-25
Whether a computer is simulating the aging and performance of a nuclear weapon, the folding of a protein, or the probability of rainfall over a particular mountain range, the necessary calculations can be enormous. Our computers help researchers answer these and other complex problems, and each new generation of system hardware and software widens the realm of possibilities. Building on Livermore's historical excellence and leadership in high-performance computing, Computation added more than 331 trillion floating-point operations per second (teraFLOPS) of power to LLNL's computer room floors in 2008. In addition, Livermore's next big supercomputer, Sequoia, advanced ever closer to itsmore » 2011-2012 delivery date, as architecture plans and the procurement contract were finalized. Hyperion, an advanced technology cluster test bed that teams Livermore with 10 industry leaders, made a big splash when it was announced during Michael Dell's keynote speech at the 2008 Supercomputing Conference. The Wall Street Journal touted Hyperion as a 'bright spot amid turmoil' in the computer industry. Computation continues to measure and improve the costs of operating LLNL's high-performance computing systems by moving hardware support in-house, by measuring causes of outages to apply resources asymmetrically, and by automating most of the account and access authorization and management processes. These improvements enable more dollars to go toward fielding the best supercomputers for science, while operating them at less cost and greater responsiveness to the customers.« less
NASA Astrophysics Data System (ADS)
Xu, Shigang; Liu, Yang
2018-03-01
The conventional pseudo-acoustic wave equations (PWEs) in arbitrary orthorhombic anisotropic (OA) media usually have coupled P- and SV-wave modes. These coupled equations may introduce strong SV-wave artifacts and numerical instabilities in P-wave simulation results and reverse-time migration (RTM) profiles. However, pure acoustic wave equations (PAWEs) completely decouple the P-wave component from the full elastic wavefield and naturally solve all the aforementioned problems. In this article, we present a novel PAWE in arbitrary OA media and compare it with the conventional coupled PWEs. Through decomposing the solution of the corresponding eigenvalue equation for the original PWE into an ellipsoidal differential operator (EDO) and an ellipsoidal scalar operator (ESO), the new PAWE in time-space domain is constructed by applying the combination of these two solvable operators and can effectively describe P-wave features in arbitrary OA media. Furthermore, we adopt the optimal finite-difference method (FDM) to solve the newly derived PAWE. In addition, the three-dimensional (3D) hybrid absorbing boundary condition (HABC) with some reasonable modifications is developed for reducing artificial edge reflections in anisotropic media. To improve computational efficiency in 3D case, we adopt graphic processing unit (GPU) with Compute Unified Device Architecture (CUDA) instead of traditional central processing unit (CPU) architecture. Several numerical experiments for arbitrary OA models confirm that the proposed schemes can produce pure, stable and accurate P-wave modeling results and RTM images with higher computational efficiency. Moreover, the 3D numerical simulations can provide us with a comprehensive and real description of wave propagation.
NASA Astrophysics Data System (ADS)
Irving, D. H.; Rasheed, M.; Hillman, C.; O'Doherty, N.
2012-12-01
Oilfield management is moving to a more operational footing with near-realtime seismic and sensor monitoring governing drilling, fluid injection and hydrocarbon extraction workflows within safety, productivity and profitability constraints. To date, the geoscientific analytical architectures employed are configured for large volumes of data, computational power or analytical latency and compromises in system design must be made to achieve all three aspects. These challenges are encapsulated by the phrase 'Big Data' which has been employed for over a decade in the IT industry to describe the challenges presented by data sets that are too large, volatile and diverse for existing computational architectures and paradigms. We present a data-centric architecture developed to support a geoscientific and geotechnical workflow whereby: ●scientific insight is continuously applied to fresh data ●insights and derived information are incorporated into engineering and operational decisions ●data governance and provenance are routine within a broader data management framework Strategic decision support systems in large infrastructure projects such as oilfields are typically relational data environments; data modelling is pervasive across analytical functions. However, subsurface data and models are typically non-relational (i.e. file-based) in the form of large volumes of seismic imaging data or rapid streams of sensor feeds and are analysed and interpreted using niche applications. The key architectural challenge is to move data and insight from a non-relational to a relational, or structured, data environment for faster and more integrated analytics. We describe how a blend of MapReduce and relational database technologies can be applied in geoscientific decision support, and the strengths and weaknesses of each in such an analytical ecosystem. In addition we discuss hybrid technologies that use aspects of both and translational technologies for moving data and analytics across these platforms. Moving to a data-centric architecture requires data management methodologies to be overhauled by default and we show how end-to-end data provenancing and dependency management is implicit in such an environment and how it benefits system administration as well as the user community. Whilst the architectural experiences are drawn from the oil industry, we believe that they are more broadly applicable in academic and government settings where large volumes of data are added to incrementally and require revisiting with low analytical latency and we suggest application to earthquake monitoring and remote sensing networks.
Algorithms and Architectures for Elastic-Wave Inversion Final Report CRADA No. TC02144.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, S.; Lindtjorn, O.
2017-08-15
This was a collaborative effort between Lawrence Livermore National Security, LLC as manager and operator of Lawrence Livermore National Laboratory (LLNL) and Schlumberger Technology Corporation (STC), to perform a computational feasibility study that investigates hardware platforms and software algorithms applicable to STC for Reverse Time Migration (RTM) / Reverse Time Inversion (RTI) of 3-D seismic data.
2014-06-30
steganalysis) in large-scale datasets such as might be obtained by monitoring a corporate network or social network. Identifying guilty actors...guilty’ user (of steganalysis) in large-scale datasets such as might be obtained by monitoring a corporate network or social network. Identifying guilty...floating point operations (1 TFLOPs) for a 1 megapixel image. We designed a new implementation using Compute Unified Device Architecture (CUDA) on NVIDIA
Integrated Environment for Development and Assurance
2015-01-26
Jan 26, 2015 © 2015 Carnegie Mellon University We Rely on Software for Safe Aircraft Operation Embedded software systems introduce a new class of...eveloper Compute Platform Runtime Architecture Application Software Embedded SW System Engineer Data Stream Characteristics Latency jitter affects...Why do system level failures still occur despite fault tolerance techniques being deployed in systems ? Embedded software system as major source of
Context Aware Routing Management Architecture for Airborne Networks
2012-03-22
awareness, increased survivability, 2 higher operation tempo , greater lethality, improve speed of command and certain degree of self-synchronization [35...first two sets of experiments. This error model simulates deviations from predetermined routes as well as variations on signal strength for radio...routes computed using Maximum Concurrent Multi-Commodity flow algorithm are not susceptible to rapid topology variations induced by noise. 57 5
Exploring the Feasibility of a DNA Computer: Design of an ALU Using Sticker-Based DNA Model.
Sarkar, Mayukh; Ghosal, Prasun; Mohanty, Saraju P
2017-09-01
Since its inception, DNA computing has advanced to offer an extremely powerful, energy-efficient emerging technology for solving hard computational problems with its inherent massive parallelism and extremely high data density. This would be much more powerful and general purpose when combined with other existing well-known algorithmic solutions that exist for conventional computing architectures using a suitable ALU. Thus, a specifically designed DNA Arithmetic and Logic Unit (ALU) that can address operations suitable for both domains can mitigate the gap between these two. An ALU must be able to perform all possible logic operations, including NOT, OR, AND, XOR, NOR, NAND, and XNOR; compare, shift etc., integer and floating point arithmetic operations (addition, subtraction, multiplication, and division). In this paper, design of an ALU has been proposed using sticker-based DNA model with experimental feasibility analysis. Novelties of this paper may be in manifold. First, the integer arithmetic operations performed here are 2s complement arithmetic, and the floating point operations follow the IEEE 754 floating point format, resembling closely to a conventional ALU. Also, the output of each operation can be reused for any next operation. So any algorithm or program logic that users can think of can be implemented directly on the DNA computer without any modification. Second, once the basic operations of sticker model can be automated, the implementations proposed in this paper become highly suitable to design a fully automated ALU. Third, proposed approaches are easy to implement. Finally, these approaches can work on sufficiently large binary numbers.
NASA Astrophysics Data System (ADS)
Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia
2017-07-01
The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many imaging systems in real time. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters on the three directions (Horizontal, Vertical and Diagonal) of the image is used to present the maximum of the existing contours. The proposed architectures were designed in VHDL and mapped to a Xilinx Sparten6 FPGA. The results of the synthesis show that the proposed architecture has a low area cost and can operate up to 100 MHz, which can perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
Quantum logic between remote quantum registers
NASA Astrophysics Data System (ADS)
Yao, N. Y.; Gong, Z.-X.; Laumann, C. R.; Bennett, S. D.; Duan, L.-M.; Lukin, M. D.; Jiang, L.; Gorshkov, A. V.
2013-02-01
We consider two approaches to dark-spin-mediated quantum computing in hybrid solid-state spin architectures. First, we review the notion of eigenmode-mediated unpolarized spin-chain state transfer and extend the analysis to various experimentally relevant imperfections: quenched disorder, dynamical decoherence, and uncompensated long-range coupling. In finite-length chains, the interplay between disorder-induced localization and decoherence yields a natural optimal channel fidelity, which we calculate. Long-range dipolar couplings induce a finite intrinsic lifetime for the mediating eigenmode; extensive numerical simulations of dipolar chains of lengths up to L=12 show remarkably high fidelity despite these decay processes. We further briefly consider the extension of the protocol to bosonic systems of coupled oscillators. Second, we introduce a quantum mirror based architecture for universal quantum computing that exploits all of the dark spins in the system as potential qubits. While this dramatically increases the number of qubits available, the composite operations required to manipulate dark-spin qubits significantly raise the error threshold for robust operation. Finally, we demonstrate that eigenmode-mediated state transfer can enable robust long-range logic between spatially separated nitrogen-vacancy registers in diamond; disorder-averaged numerics confirm that high-fidelity gates are achievable even in the presence of moderate disorder.
The technological obsolescence of the Brazilian eletronic ballot box.
Camargo, Carlos Rogério; Faust, Richard; Merino, Eugênio; Stefani, Clarissa
2012-01-01
The electronic ballot box has played a significant role in the consolidation of Brazilian political process. It has enabled paper ballots extinction as a support for the elector's vote as well as for voting counting processes. It is also widely known that election automation has decisively collaborated to the legitimization of Brazilian democracy, getting rid of doubts about the winning candidates. In 1995, when the project was conceived, it represented a compromise solution, balancing technical efficiency and costs trade-offs. However, this architecture currently limits the ergonomic enhancements to the device operation, transportation, maintenance and storage. Nowadays are available in the market devices of reduced dimensions, based on novel computational architecture, namely tablet computers, which emphasizes usability, autonomy, portability, security and low power consumption. Therefore, the proposal under discussion is the replacement of the current electronic ballot boxes for tablet-based devices to improve the ergonomics aspects of the Brazilian voting process. These devices offer a plethora of integrated features (e.g., capacitive touchscreen, speakers, microphone) that enable highly usable and simple user interfaces, in addition to enhancing the voting process security mechanisms. Finally, their operational systems features allow for the development of highly secure applications, suitable to the requirements of a voting process.
Distributed computing environments for future space control systems
NASA Technical Reports Server (NTRS)
Viallefont, Pierre
1993-01-01
The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.
Electro-Optic Computing Architectures. Volume I
1998-02-01
The objective of the Electro - Optic Computing Architecture (EOCA) program was to develop multi-function electro - optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro - optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro - Optic Interface (EOI), an Optical Interconnection Unit (OW
A laboratory breadboard system for dual-arm teleoperation
NASA Technical Reports Server (NTRS)
Bejczy, A. K.; Szakaly, Z.; Kim, W. S.
1990-01-01
The computing architecture of a novel dual-arm teleoperation system is described. The novelty of this system is that: (1) the master arm is not a replica of the slave arm; it is unspecific to any manipulator and can be used for the control of various robot arms with software modifications; and (2) the force feedback to the general purpose master arm is derived from force-torque sensor data originating from the slave hand. The computing architecture of this breadboard system is a fully synchronized pipeline with unique methods for data handling, communication and mathematical transformations. The computing system is modular, thus inherently extendable. The local control loops at both sites operate at 100 Hz rate, and the end-to-end bilateral (force-reflecting) control loop operates at 200 Hz rate, each loop without interpolation. This provides high-fidelity control. This end-to-end system elevates teleoperation to a new level of capabilities via the use of sensors, microprocessors, novel electronics, and real-time graphics displays. A description is given of a graphic simulation system connected to the dual-arm teleoperation breadboard system. High-fidelity graphic simulation of a telerobot (called Phantom Robot) is used for preview and predictive displays for planning and for real-time control under several seconds communication time delay conditions. High fidelity graphic simulation is obtained by using appropriate calibration techniques.
NASA Astrophysics Data System (ADS)
Burnett, W.
2016-12-01
The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology, Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability. The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
Design and Implementation of the PALM-3000 Real-Time Control System
NASA Technical Reports Server (NTRS)
Truong, Tuan N.; Bouchez, Antonin H.; Burruss, Rick S.; Dekany, Richard G.; Guiwits, Stephen R.; Roberts, Jennifer E.; Shelton, Jean C.; Troy, Mitchell
2012-01-01
This paper reflects, from a computational perspective, on the experience gathered in designing and implementing realtime control of the PALM-3000 adaptive optics system currently in operation at the Palomar Observatory. We review the algorithms that serve as functional requirements driving the architecture developed, and describe key design issues and solutions that contributed to the system's low compute-latency. Additionally, we describe an implementation of dense matrix-vector-multiplication for wavefront reconstruction that exceeds 95% of the maximum sustained achievable bandwidth on NVIDIA Geforce 8800GTX GPU.
NASA Astrophysics Data System (ADS)
Singh, Surya P. N.; Thayer, Scott M.
2002-02-01
This paper presents a novel algorithmic architecture for the coordination and control of large scale distributed robot teams derived from the constructs found within the human immune system. Using this as a guide, the Immunology-derived Distributed Autonomous Robotics Architecture (IDARA) distributes tasks so that broad, all-purpose actions are refined and followed by specific and mediated responses based on each unit's utility and capability to timely address the system's perceived need(s). This method improves on initial developments in this area by including often overlooked interactions of the innate immune system resulting in a stronger first-order, general response mechanism. This allows for rapid reactions in dynamic environments, especially those lacking significant a priori information. As characterized via computer simulation of a of a self-healing mobile minefield having up to 7,500 mines and 2,750 robots, IDARA provides an efficient, communications light, and scalable architecture that yields significant operation and performance improvements for large-scale multi-robot coordination and control.
A fast, programmable hardware architecture for the processing of spaceborne SAR data
NASA Technical Reports Server (NTRS)
Bennett, J. R.; Cumming, I. G.; Lim, J.; Wedding, R. M.
1984-01-01
The development of high-throughput SAR processors (HTSPs) for the spaceborne SARs being planned by NASA, ESA, DFVLR, NASDA, and the Canadian Radarsat Project is discussed. The basic parameters and data-processing requirements of the SARs are listed in tables, and the principal problems are identified as real-operations rates in excess of 2 x 10 to the 9th/sec, I/O rates in excess of 8 x 10 to the 6th samples/sec, and control computation loads (as for range cell migration correction) as high as 1.4 x 10 to the 6th instructions/sec. A number of possible HTSP architectures are reviewed; host/array-processor (H/AP) and distributed-control/data-path (DCDP) architectures are examined in detail and illustrated with block diagrams; and a cost/speed comparison of these two architectures is presented. The H/AP approach is found to be adequate and economical for speeds below 1/200 of real time, while DCDP is more cost-effective above 1/50 of real time.
Image-Processing Software For A Hypercube Computer
NASA Technical Reports Server (NTRS)
Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.
1992-01-01
Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.
Experimental Comparison of Two Quantum Computing Architectures
2017-03-28
IN A U G U RA L A RT IC LE CO M PU TE R SC IE N CE S Experimental comparison of two quantum computing architectures Norbert M. Linkea,b,1, Dmitri...the vast computing power a universal quantumcomputer could offer, several candidate systems are being explored. They have allowed experimental ...existing systems and the role of architecture in quantum computer design . These will be crucial for the realization of more advanced future incarna
Airport-Noise Levels and Annoyance Model (ALAMO) user's guide
NASA Technical Reports Server (NTRS)
Deloach, R.; Donaldson, J. L.; Johnson, M. J.
1986-01-01
A guide for the use of the Airport-Noise Level and Annoyance MOdel (ALAMO) at the Langley Research Center computer complex is provided. This document is divided into 5 primary sections, the introduction, the purpose of the model, and an in-depth description of the following subsystems: baseline, noise reduction simulation and track analysis. For each subsystem, the user is provided with a description of architecture, an explanation of subsystem use, sample results, and a case runner's check list. It is assumed that the user is familiar with the operations at the Langley Research Center (LaRC) computer complex, the Network Operating System (NOS 1.4) and CYBER Control Language. Incorporated within the ALAMO model is a census database system called SITE II.
Quantum information processing with a travelling wave of light
NASA Astrophysics Data System (ADS)
Serikawa, Takahiro; Shiozawa, Yu; Ogawa, Hisashi; Takanashi, Naoto; Takeda, Shuntaro; Yoshikawa, Jun-ichi; Furusawa, Akira
2018-02-01
We exploit quantum information processing on a traveling wave of light, expecting emancipation from thermal noise, easy coupling to fiber communication, and potentially high operation speed. Although optical memories are technically challenging, we have an alternative approach to apply multi-step operations on traveling light, that is, continuous-variable one-way computation. So far our achievement includes generation of a one-million-mode entangled chain in time-domain, mode engineering of nonlinear resource states, and real-time nonlinear feedforward. Although they are implemented with free space optics, we are also investigating photonic integration and performed quantum teleportation with a passive liner waveguide chip as a demonstration of entangling, measurement, and feedforward. We also suggest a loop-based architecture as another model of continuous-variable computing.
On some Aitken-like acceleration of the Schwarz method
NASA Astrophysics Data System (ADS)
Garbey, M.; Tromeur-Dervout, D.
2002-12-01
In this paper we present a family of domain decomposition based on Aitken-like acceleration of the Schwarz method seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrids for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that our algorithm has high tolerance to slow network in the context of distributed parallel computing and is attractive, generally speaking, to use with computer architecture for which performance is limited by the memory bandwidth rather than the flop performance of the CPU. This is nowadays the case for most parallel. computer using the RISC processor architecture. We will illustrate this highly desirable property of our algorithm with large-scale computing experiments.
Architecture of next-generation information management systems for digital radiology enterprises
NASA Astrophysics Data System (ADS)
Wong, Stephen T. C.; Wang, Huili; Shen, Weimin; Schmidt, Joachim; Chen, George; Dolan, Tom
2000-05-01
Few information systems today offer a clear and flexible means to define and manage the automated part of radiology processes. None of them provide a coherent and scalable architecture that can easily cope with heterogeneity and inevitable local adaptation of applications. Most importantly, they often lack a model that can integrate clinical and administrative information to aid better decisions in managing resources, optimizing operations, and improving productivity. Digital radiology enterprises require cost-effective solutions to deliver information to the right person in the right place and at the right time. We propose a new architecture of image information management systems for digital radiology enterprises. Such a system is based on the emerging technologies in workflow management, distributed object computing, and Java and Web techniques, as well as Philips' domain knowledge in radiology operations. Our design adapts the approach of '4+1' architectural view. In this new architecture, PACS and RIS will become one while the user interaction can be automated by customized workflow process. Clinical service applications are implemented as active components. They can be reasonably substituted by applications of local adaptations and can be multiplied for fault tolerance and load balancing. Furthermore, it will provide powerful query and statistical functions for managing resources and improving productivity in real time. This work will lead to a new direction of image information management in the next millennium. We will illustrate the innovative design with implemented examples of a working prototype.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broadmore » range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.« less
Porting marine ecosystem model spin-up using transport matrices to GPUs
NASA Astrophysics Data System (ADS)
Siewertsen, E.; Piwonski, J.; Slawig, T.
2013-01-01
We have ported an implementation of the spin-up for marine ecosystem models based on transport matrices to graphics processing units (GPUs). The original implementation was designed for distributed-memory architectures and uses the Portable, Extensible Toolkit for Scientific Computation (PETSc) library that is based on the Message Passing Interface (MPI) standard. The spin-up computes a steady seasonal cycle of ecosystem tracers with climatological ocean circulation data as forcing. Since the transport is linear with respect to the tracers, the resulting operator is represented by matrices. Each iteration of the spin-up involves two matrix-vector multiplications and the evaluation of the used biogeochemical model. The original code was written in C and Fortran. On the GPU, we use the Compute Unified Device Architecture (CUDA) standard, a customized version of PETSc and a commercial CUDA Fortran compiler. We describe the extensions to PETSc and the modifications of the original C and Fortran codes that had to be done. Here we make use of freely available libraries for the GPU. We analyze the computational effort of the main parts of the spin-up for two exemplar ecosystem models and compare the overall computational time to those necessary on different CPUs. The results show that a consumer GPU can compete with a significant number of cluster CPUs without further code optimization.
PISCES: An environment for parallel scientific computation
NASA Technical Reports Server (NTRS)
Pratt, T. W.
1985-01-01
The parallel implementation of scientific computing environment (PISCES) is a project to provide high-level programming environments for parallel MIMD computers. Pisces 1, the first of these environments, is a FORTRAN 77 based environment which runs under the UNIX operating system. The Pisces 1 user programs in Pisces FORTRAN, an extension of FORTRAN 77 for parallel processing. The major emphasis in the Pisces 1 design is in providing a carefully specified virtual machine that defines the run-time environment within which Pisces FORTRAN programs are executed. Each implementation then provides the same virtual machine, regardless of differences in the underlying architecture. The design is intended to be portable to a variety of architectures. Currently Pisces 1 is implemented on a network of Apollo workstations and on a DEC VAX uniprocessor via simulation of the task level parallelism. An implementation for the Flexible Computing Corp. FLEX/32 is under construction. An introduction to the Pisces 1 virtual computer and the FORTRAN 77 extensions is presented. An example of an algorithm for the iterative solution of a system of equations is given. The most notable features of the design are the provision for several granularities of parallelism in programs and the provision of a window mechanism for distributed access to large arrays of data.
A learnable parallel processing architecture towards unity of memory and computing
NASA Astrophysics Data System (ADS)
Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.
2015-08-01
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
A learnable parallel processing architecture towards unity of memory and computing.
Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J
2015-08-14
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
NASA Astrophysics Data System (ADS)
Yue, S. S.; Wen, Y. N.; Lv, G. N.; Hu, D.
2013-10-01
In recent years, the increasing development of cloud computing technologies laid critical foundation for efficiently solving complicated geographic issues. However, it is still difficult to realize the cooperative operation of massive heterogeneous geographical models. Traditional cloud architecture is apt to provide centralized solution to end users, while all the required resources are often offered by large enterprises or special agencies. Thus, it's a closed framework from the perspective of resource utilization. Solving comprehensive geographic issues requires integrating multifarious heterogeneous geographical models and data. In this case, an open computing platform is in need, with which the model owners can package and deploy their models into cloud conveniently, while model users can search, access and utilize those models with cloud facility. Based on this concept, the open cloud service strategies for the sharing of heterogeneous geographic analysis models is studied in this article. The key technology: unified cloud interface strategy, sharing platform based on cloud service, and computing platform based on cloud service are discussed in detail, and related experiments are conducted for further verification.
A Self-Synthesis Approach to Perceptual Learning for Multisensory Fusion in Robotics
Axenie, Cristian; Richter, Christoph; Conradt, Jörg
2016-01-01
Biological and technical systems operate in a rich multimodal environment. Due to the diversity of incoming sensory streams a system perceives and the variety of motor capabilities a system exhibits there is no single representation and no singular unambiguous interpretation of such a complex scene. In this work we propose a novel sensory processing architecture, inspired by the distributed macro-architecture of the mammalian cortex. The underlying computation is performed by a network of computational maps, each representing a different sensory quantity. All the different sensory streams enter the system through multiple parallel channels. The system autonomously associates and combines them into a coherent representation, given incoming observations. These processes are adaptive and involve learning. The proposed framework introduces mechanisms for self-creation and learning of the functional relations between the computational maps, encoding sensorimotor streams, directly from the data. Its intrinsic scalability, parallelisation, and automatic adaptation to unforeseen sensory perturbations make our approach a promising candidate for robust multisensory fusion in robotic systems. We demonstrate this by applying our model to a 3D motion estimation on a quadrotor. PMID:27775621
Multicore Architecture-aware Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srinivasa, Avinash
Modern high performance systems are becoming increasingly complex and powerful due to advancements in processor and memory architecture. In order to keep up with this increasing complexity, applications have to be augmented with certain capabilities to fully exploit such systems. These may be at the application level, such as static or dynamic adaptations or at the system level, like having strategies in place to override some of the default operating system polices, the main objective being to improve computational performance of the application. The current work proposes two such capabilites with respect to multi-threaded scientific applications, in particular a largemore » scale physics application computing ab-initio nuclear structure. The first involves using a middleware tool to invoke dynamic adaptations in the application, so as to be able to adjust to the changing computational resource availability at run-time. The second involves a strategy for effective placement of data in main memory, to optimize memory access latencies and bandwidth. These capabilties when included were found to have a significant impact on the application performance, resulting in average speedups of as much as two to four times.« less
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires of k 2 − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
Computer-aided acquisition and logistics support (CALS): Concept of Operations for Depot Maintenance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bourgeois, N.C.; Greer, D.K.
1993-04-01
This CALS Concept of Operations for Depot Maintenance provides the foundation strategy and the near term tactical plan for CALS implementation in the depot maintenance environment. The user requirements enumerated and the overarching architecture outlined serve as the primary framework for implementation planning. The seamless integration of depot maintenance business processes and supporting information systems with the emerging global CALS environment will be critical to the efficient realization of depot user's information requirements, and as, such will be a fundamental theme in depot implementations.
2003-01-01
dependencies, and conceptual independencies. Taken together, the three views provide a framework to ensure interoperability, regardless of system... products for COP users . It enables a shared situational awareness that significantly improves the ability of commanders at all levels to quickly make... Review , March-April 1998. 5 Eric K. Shinseki, General , U.S. Army. “ The Army Transformation: A Historic Opportunity,” 2001- 02 Army Green Book
Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis
1998-02-01
The objective of the Electro - Optic Computing Architecture (EOCA) program was to develop multi-function electro - optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro - optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro - Optic Interface (EOI), an Optical Interconnection Unit
Software beamforming: comparison between a phased array and synthetic transmit aperture.
Li, Yen-Feng; Li, Pai-Chi
2011-04-01
The data-transfer and computation requirements are compared between software-based beamforming using a phased array (PA) and a synthetic transmit aperture (STA). The advantages of a software-based architecture are reduced system complexity and lower hardware cost. Although this architecture can be implemented using commercial CPUs or GPUs, the high computation and data-transfer requirements limit its real-time beamforming performance. In particular, transferring the raw rf data from the front-end subsystem to the software back-end remains challenging with current state-of-the-art electronics technologies, which offset the cost advantage of the software back end. This study investigated the tradeoff between the data-transfer and computation requirements. Two beamforming methods based on a PA and STA, respectively, were used: the former requires a higher data transfer rate and the latter requires more memory operations. The beamformers were implemente;d in an NVIDIA GeForce GTX 260 GPU and an Intel core i7 920 CPU. The frame rate of PA beamforming was 42 fps with a 128-element array transducer, with 2048 samples per firing and 189 beams per image (with a 95 MB/frame data-transfer requirement). The frame rate of STA beamforming was 40 fps with 16 firings per image (with an 8 MB/frame data-transfer requirement). Both approaches achieved real-time beamforming performance but each had its own bottleneck. On the one hand, the required data-transfer speed was considerably reduced in STA beamforming, whereas this required more memory operations, which limited the overall computation time. The advantages of the GPU approach over the CPU approach were clearly demonstrated.
A Scalable, Out-of-Band Diagnostics Architecture for International Space Station Systems Support
NASA Technical Reports Server (NTRS)
Fletcher, Daryl P.; Alena, Rick; Clancy, Daniel (Technical Monitor)
2002-01-01
The computational infrastructure of the International Space Station (ISS) is a dynamic system that supports multiple vehicle subsystems such as Caution and Warning, Electrical Power Systems and Command and Data Handling (C&DH), as well as scientific payloads of varying size and complexity. The dynamic nature of the ISS configuration coupled with the increased demand for payload support places a significant burden on the inherently resource constrained computational infrastructure of the ISS. Onboard system diagnostics applications are hosted on computers that are elements of the avionics network while ground-based diagnostic applications receive only a subset of available telemetry, down-linked via S-band communications. In this paper we propose a scalable, out-of-band diagnostics architecture for ISS systems support that uses a read-only connection for C&DH data acquisition, which provides a lower cost of deployment and maintenance (versus a higher criticality readwrite connection). The diagnostics processing burden is off-loaded from the avionics network to elements of the on-board LAN that have a lower overall cost of operation and increased computational capacity. A superset of diagnostic data, richer in content than the configured telemetry, is made available to Advanced Diagnostic System (ADS) clients running on wireless handheld devices, affording the crew greater mobility for troubleshooting and providing improved insight into vehicle state. The superset of diagnostic data is made available to the ground in near real-time via an out-of band downlink, providing a high level of fidelity between vehicle state and test, training and operational facilities on the ground.
Adaptive neural coding: from biological to behavioral decision-making
Louie, Kenway; Glimcher, Paul W.; Webb, Ryan
2015-01-01
Empirical decision-making in diverse species deviates from the predictions of normative choice theory, but why such suboptimal behavior occurs is unknown. Here, we propose that deviations from optimality arise from biological decision mechanisms that have evolved to maximize choice performance within intrinsic biophysical constraints. Sensory processing utilizes specific computations such as divisive normalization to maximize information coding in constrained neural circuits, and recent evidence suggests that analogous computations operate in decision-related brain areas. These adaptive computations implement a relative value code that may explain the characteristic context-dependent nature of behavioral violations of classical normative theory. Examining decision-making at the computational level thus provides a crucial link between the architecture of biological decision circuits and the form of empirical choice behavior. PMID:26722666
Accelerating scientific computations with mixed precision algorithms
NASA Astrophysics Data System (ADS)
Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire
2009-12-01
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented. Program summaryProgram title: ITER-REF Catalogue identifier: AECO_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 7211 No. of bytes in distributed program, including test data, etc.: 41 862 Distribution format: tar.gz Programming language: FORTRAN 77 Computer: desktop, server Operating system: Unix/Linux RAM: 512 Mbytes Classification: 4.8 External routines: BLAS (optional) Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration, which then yields the method that is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision. Running time: seconds/minutes
NASA Technical Reports Server (NTRS)
Ido, Haisam; Burns, Rich
2015-01-01
The NASA Goddard Space Science Mission Operations project (SSMO) is performing a technical cost-benefit analysis for centralizing and consolidating operations of a diverse set of missions into a unified and integrated technical infrastructure. The presentation will focus on the notion of normalizing spacecraft operations processes, workflows, and tools. It will also show the processes of creating a standardized open architecture, creating common security models and implementations, interfaces, services, automations, notifications, alerts, logging, publish, subscribe and middleware capabilities. The presentation will also discuss how to leverage traditional capabilities, along with virtualization, cloud computing services, control groups and containers, and possibly Big Data concepts.
NASA Technical Reports Server (NTRS)
Campbell, R. H.; Essick, Ray B.; Johnston, Gary; Kenny, Kevin; Russo, Vince
1987-01-01
Project EOS is studying the problems of building adaptable real-time embedded operating systems for the scientific missions of NASA. Choices (A Class Hierarchical Open Interface for Custom Embedded Systems) is an operating system designed and built by Project EOS to address the following specific issues: the software architecture for adaptable embedded parallel operating systems, the achievement of high-performance and real-time operation, the simplification of interprocess communications, the isolation of operating system mechanisms from one another, and the separation of mechanisms from policy decisions. Choices is written in C++ and runs on a ten processor Encore Multimax. The system is intended for use in constructing specialized computer applications and research on advanced operating system features including fault tolerance and parallelism.
NASA Astrophysics Data System (ADS)
Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng
2018-04-01
In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture shows performances of better speed-device consumption efficiency compared with the previous kernel storage architecture. Further we improve the architecture for a high accuracy and low power computing by utilizing the binary storage and the series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows excellent performances including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) more than 67 times speed boost; 3) 71.4% energy saving.
NASA Technical Reports Server (NTRS)
Hsia, T. C.; Lu, G. Z.; Han, W. H.
1987-01-01
In advanced robot control problems, on-line computation of inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be inplemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
DOT National Transportation Integrated Search
1999-04-22
The CVISN Operational and Architectural Compatibility Handbook (COACH) provides a comprehensive checklist of what is required to conform with the Commercial Vehicle Information Systems and Networks (CVISN) operational concepts and architecture. It is...
Jin, Miaomiao; Cheng, Long; Li, Yi; Hu, Siyu; Lu, Ke; Chen, Jia; Duan, Nian; Wang, Zhuorui; Zhou, Yaxiong; Chang, Ting-Chang; Miao, Xiangshui
2018-06-27
Owing to the capability of integrating the information storage and computing in the same physical location, in-memory computing with memristors has become a research hotspot as a promising route for non von Neumann architecture. However, it is still a challenge to develop high performance devices as well as optimized logic methodologies to realize energy-efficient computing. Herein, filamentary Cu/GeTe/TiN memristor is reported to show satisfactory properties with nanosecond switching speed (< 60 ns), low voltage operation (< 2 V), high endurance (>104 cycles) and good retention (>104 s @85℃). It is revealed that the charge carrier conduction mechanisms in high resistance and low resistance states are Schottky emission and hopping transport between the adjacent Cu clusters, respectively, based on the analysis of current-voltage behaviors and resistance-temperature characteristics. An intuitive picture is given to describe the dynamic processes of resistive switching. Moreover, based on the basic material implication (IMP) logic circuit, we proposed a reconfigurable logic method and experimentally implemented IMP, NOT, OR, and COPY logic functions. Design of a one-bit full adder with reduction in computational sequences and its validation in simulation further demonstrate the potential practical application. The results provide important progress towards understanding of resistive switching mechanism and realization of energy-efficient in-memory computing architecture. © 2018 IOP Publishing Ltd.
Secure Computer System: Unified Exposition and Multics Interpretation
1976-03-01
prearranged code to semaphore critical information to an undercleared subject/process. Neither of these topics is directly addressed by the mathematical...FURTHER CONSIDERATIONS. RULES OF OPERATION FOR A SECURE MULTICS Kernel primitives for a secure Multics will be derived from a higher level user...the Multics architecture as little as possible; this will account to a large extent for radical differences in form between actual kernel primitives
Development of IS2100: An Information Systems Laboratory.
1985-03-01
systems for digital logic; hardware architecture; machine, assembly, and high order language programming; and application packages such as database... applications and limitations. They should be able to define, demonstrate and/or discuss how computers are used, how they do their work, how to use them, and...limitations. Hands on operation of the hardware and software provides experience that aids in future selection of hardware systems and applications
FOS: A Factored Operating Systems for High Assurance and Scalability on Multicores
2012-08-01
computing. It builds on previous work in distributed and microkernel OSes by factoring services out of the kernel, and then further distributing each...2 3.0 Methods, Assumptions, and Procedures (System Design) .................................................. 4 3.1 Microkernel ...cooperating servers. We term such a service a fleet. Figure 2 shows the high-level architecture of fos. A small microkernel runs on every core
Use of Dynamic Models and Operational Architecture to Solve Complex Navy Challenges
NASA Technical Reports Server (NTRS)
Grande, Darby; Black, J. Todd; Freeman, Jared; Sorber, TIm; Serfaty, Daniel
2010-01-01
The United States Navy established 8 Maritime Operations Centers (MOC) to enhance the command and control of forces at the operational level of warfare. Each MOC is a headquarters manned by qualified joint operational-level staffs, and enabled by globally interoperable C41 systems. To assess and refine MOC staffing, equipment, and schedules, a dynamic software model was developed. The model leverages pre-existing operational process architecture, joint military task lists that define activities and their precedence relations, as well as Navy documents that specify manning and roles per activity. The software model serves as a "computational wind-tunnel" in which to test a MOC on a mission, and to refine its structure, staffing, processes, and schedules. More generally, the model supports resource allocation decisions concerning Doctrine, Organization, Training, Material, Leadership, Personnel and Facilities (DOTMLPF) at MOCs around the world. A rapid prototype effort efficiently produced this software in less than five months, using an integrated process team consisting of MOC military and civilian staff, modeling experts, and software developers. The work reported here was conducted for Commander, United States Fleet Forces Command in Norfolk, Virginia, code N5-0LW (Operational Level of War) that facilitates the identification, consolidation, and prioritization of MOC capabilities requirements, and implementation and delivery of MOC solutions.
Study of fault-tolerant software technology
NASA Technical Reports Server (NTRS)
Slivinski, T.; Broglio, C.; Wild, C.; Goldberg, J.; Levitt, K.; Hitt, E.; Webb, J.
1984-01-01
Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance.
Design and implementation of a robot control system with traded and shared control capability
NASA Technical Reports Server (NTRS)
Hayati, S.; Venkataraman, S. T.
1989-01-01
Preliminary results are reported from efforts to design and develop a robotic system that will accept and execute commands from either a six-axis teleoperator device or an autonomous planner, or combine the two. Such a system should have both traded as well as shared control capability. A sharing strategy is presented whereby the overall system, while retaining positive features of teleoperated and autonomous operation, loses its individual negative features. A two-tiered shared control architecture is considered here, consisting of a task level and a servo level. Also presented is a computer architecture for the implementation of this system, including a description of the hardware and software.
NASA Technical Reports Server (NTRS)
Tavenner, Leslie A. (Editor)
1991-01-01
These proceedings overview major space information system projects and lessons learned from current missions. Other topics include the science information system requirements for the 1990s, an information systems design approach for major programs, the technology needs and projections, the standards for space data information systems, the artificial intelligence technology and applications, international interoperability, and spacecraft data systems and architectures advanced communications. Other topics include the software engineering technology and applications, the multimission multidiscipline information system architectures, the distributed planning and scheduling systems and operations, and the computer and information systems architectures. Paper presented include prospects for scientific data analysis systems for solar-terrestrial physics in the 1990s, the Columbus data management system, data storage technologies for the future, the German aerospace research establishment, and launching artificial intelligence in NASA ground systems.
A Survey of Architectural Techniques For Improving Cache Power Efficiency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
Modern processors are using increasingly larger sized on-chip caches. Also, with each CMOS technology generation, there has been a significant increase in their leakage energy consumption. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers have proposed several techniques for improving energy efficiency of cache architectures. This paper surveys recent architectural techniques for improving cache power efficiency and also presents a classification of these techniques based on their characteristics. For providing an application perspective, this paper also reviews several real-worldmore » processor chips that employ cache energy saving techniques. The aim of this survey is to enable engineers and researchers to get insights into the techniques for improving cache power efficiency and motivate them to invent novel solutions for enabling low-power operation of caches.« less
HRST architecture modeling and assessments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Comstock, D.A.
1997-01-01
This paper presents work supporting the assessment of advanced concept options for the Highly Reusable Space Transportation (HRST) study. It describes the development of computer models as the basis for creating an integrated capability to evaluate the economic feasibility and sustainability of a variety of system architectures. It summarizes modeling capabilities for use on the HRST study to perform sensitivity analysis of alternative architectures (consisting of different combinations of highly reusable vehicles, launch assist systems, and alternative operations and support concepts) in terms of cost, schedule, performance, and demand. In addition, the identification and preliminary assessment of alternative market segmentsmore » for HRST applications, such as space manufacturing, space tourism, etc., is described. Finally, the development of an initial prototype model that can begin to be used for modeling alternative HRST concepts at the system level is presented. {copyright} {ital 1997 American Institute of Physics.}« less
On-Line Tracking Controller for Brushless DC Motor Drives Using Artificial Neural Networks
NASA Technical Reports Server (NTRS)
Rubaai, Ahmed
1996-01-01
A real-time control architecture is developed for time-varying nonlinear brushless dc motors operating in a high performance drives environment. The developed control architecture possesses the capabilities of simultaneous on-line identification and control. The dynamics of the motor are modeled on-line and controlled using an artificial neural network, as the system runs. The control architecture combines the experience and dependability of adaptive tracking systems with potential and promise of the neural computing technology. The sensitivity of real-time controller to parametric changes that occur during training is investigated. Such changes are usually manifested by rapid changes in the load of the brushless motor drives. This sudden change in the external load is simulated for the sigmoidal and sinusoidal reference tracks. The ability of the neuro-controller to maintain reasonable tracking accuracy in the presence of external noise is also verified for a number of desired reference trajectories.
NASA Technical Reports Server (NTRS)
Albus, James S.
1996-01-01
The Real-time Control System (RCS) developed at NIST and elsewhere over the past two decades defines a reference model architecture for design and analysis of complex intelligent control systems. The RCS architecture consists of a hierarchically layered set of functional processing modules connected by a network of communication pathways. The primary distinguishing feature of the layers is the bandwidth of the control loops. The characteristic bandwidth of each level is determined by the spatial and temporal integration window of filters, the temporal frequency of signals and events, the spatial frequency of patterns, and the planning horizon and granularity of the planners that operate at each level. At each level, tasks are decomposed into sequential subtasks, to be performed by cooperating sets of subordinate agents. At each level, signals from sensors are filtered and correlated with spatial and temporal features that are relevant to the control function being implemented at that level.
Single instruction computer architecture and its application in image processing
NASA Astrophysics Data System (ADS)
Laplante, Phillip A.
1992-03-01
A single processing computer system using only half-adder circuits is described. In addition, it is shown that only a single hard-wired instruction is needed in the control unit to obtain a complete instruction set for this general purpose computer. Such a system has several advantages. First it is intrinsically a RISC machine--in fact the 'ultimate RISC' machine. Second, because only a single type of logic element is employed the entire computer system can be easily realized on a single, highly integrated chip. Finally, due to the homogeneous nature of the computer's logic elements, the computer has possible implementations as an optical or chemical machine. This in turn suggests possible paradigms for neural computing and artificial intelligence. After showing how we can implement a full-adder, min, max and other operations using the half-adder, we use an array of such full-adders to implement the dilation operation for two black and white images. Next we implement the erosion operation of two black and white images using a relative complement function and the properties of erosion and dilation. This approach was inspired by papers by van der Poel in which a single instruction is used to furnish a complete set of general purpose instructions and by Bohm- Jacopini where it is shown that any problem can be solved using a Turing machine with one entry and one exit.
Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems
NASA Technical Reports Server (NTRS)
Ecer, A.; Chien, Y. P.; Boenisch, T.; Akay, H. U.
2000-01-01
The developed methodology is aimed at improving the efficiency of executing block-structured algorithms on parallel, distributed, heterogeneous computers. The basic approach of these algorithms is to divide the flow domain into many sub- domains called blocks, and solve the governing equations over these blocks. Dynamic load balancing problem is defined as the efficient distribution of the blocks among the available processors over a period of several hours of computations. In environments with computers of different architecture, operating systems, CPU speed, memory size, load, and network speed, balancing the loads and managing the communication between processors becomes crucial. Load balancing software tools for mutually dependent parallel processes have been created to efficiently utilize an advanced computation environment and algorithms. These tools are dynamic in nature because of the chances in the computer environment during execution time. More recently, these tools were extended to a second operating system: NT. In this paper, the problems associated with this application will be discussed. Also, the developed algorithms were combined with the load sharing capability of LSF to efficiently utilize workstation clusters for parallel computing. Finally, results will be presented on running a NASA based code ADPAC to demonstrate the developed tools for dynamic load balancing.
Non-Linear Pattern Formation in Bone Growth and Architecture
Salmon, Phil
2014-01-01
The three-dimensional morphology of bone arises through adaptation to its required engineering performance. Genetically and adaptively bone travels along a complex spatiotemporal trajectory to acquire optimal architecture. On a cellular, micro-anatomical scale, what mechanisms coordinate the activity of osteoblasts and osteoclasts to produce complex and efficient bone architectures? One mechanism is examined here – chaotic non-linear pattern formation (NPF) – which underlies in a unifying way natural structures as disparate as trabecular bone, swarms of birds flying, island formation, fluid turbulence, and others. At the heart of NPF is the fact that simple rules operating between interacting elements, and Turing-like interaction between global and local signals, lead to complex and structured patterns. The study of “group intelligence” exhibited by swarming birds or shoaling fish has led to an embodiment of NPF called “particle swarm optimization” (PSO). This theoretical model could be applicable to the behavior of osteoblasts, osteoclasts, and osteocytes, seeing them operating “socially” in response simultaneously to both global and local signals (endocrine, cytokine, mechanical), resulting in their clustered activity at formation and resorption sites. This represents problem-solving by social intelligence, and could potentially add further realism to in silico computer simulation of bone modeling. What insights has NPF provided to bone biology? One example concerns the genetic disorder juvenile Pagets disease or idiopathic hyperphosphatasia, where the anomalous parallel trabecular architecture characteristic of this pathology is consistent with an NPF paradigm by analogy with known experimental NPF systems. Here, coupling or “feedback” between osteoblasts and osteoclasts is the critical element. This NPF paradigm implies a profound link between bone regulation and its architecture: in bone the architecture is the regulation. The former is the emergent consequence of the latter. PMID:25653638
Non-linear pattern formation in bone growth and architecture.
Salmon, Phil
2014-01-01
The three-dimensional morphology of bone arises through adaptation to its required engineering performance. Genetically and adaptively bone travels along a complex spatiotemporal trajectory to acquire optimal architecture. On a cellular, micro-anatomical scale, what mechanisms coordinate the activity of osteoblasts and osteoclasts to produce complex and efficient bone architectures? One mechanism is examined here - chaotic non-linear pattern formation (NPF) - which underlies in a unifying way natural structures as disparate as trabecular bone, swarms of birds flying, island formation, fluid turbulence, and others. At the heart of NPF is the fact that simple rules operating between interacting elements, and Turing-like interaction between global and local signals, lead to complex and structured patterns. The study of "group intelligence" exhibited by swarming birds or shoaling fish has led to an embodiment of NPF called "particle swarm optimization" (PSO). This theoretical model could be applicable to the behavior of osteoblasts, osteoclasts, and osteocytes, seeing them operating "socially" in response simultaneously to both global and local signals (endocrine, cytokine, mechanical), resulting in their clustered activity at formation and resorption sites. This represents problem-solving by social intelligence, and could potentially add further realism to in silico computer simulation of bone modeling. What insights has NPF provided to bone biology? One example concerns the genetic disorder juvenile Pagets disease or idiopathic hyperphosphatasia, where the anomalous parallel trabecular architecture characteristic of this pathology is consistent with an NPF paradigm by analogy with known experimental NPF systems. Here, coupling or "feedback" between osteoblasts and osteoclasts is the critical element. This NPF paradigm implies a profound link between bone regulation and its architecture: in bone the architecture is the regulation. The former is the emergent consequence of the latter.
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
Collaborative Working Architecture for IoT-Based Applications.
Mora, Higinio; Signes-Pont, María Teresa; Gil, David; Johnsson, Magnus
2018-05-23
The new sensing applications need enhanced computing capabilities to handle the requirements of complex and huge data processing. The Internet of Things (IoT) concept brings processing and communication features to devices. In addition, the Cloud Computing paradigm provides resources and infrastructures for performing the computations and outsourcing the work from the IoT devices. This scenario opens new opportunities for designing advanced IoT-based applications, however, there is still much research to be done to properly gear all the systems for working together. This work proposes a collaborative model and an architecture to take advantage of the available computing resources. The resulting architecture involves a novel network design with different levels which combines sensing and processing capabilities based on the Mobile Cloud Computing (MCC) paradigm. An experiment is included to demonstrate that this approach can be used in diverse real applications. The results show the flexibility of the architecture to perform complex computational tasks of advanced applications.
Optimizing Engineering Tools Using Modern Ground Architectures
2017-12-01
Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
A Framework for Adaptable Operating and Runtime Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sterling, Thomas
The emergence of new classes of HPC systems where performance improvement is enabled by Moore’s Law for technology is manifest through multi-core-based architectures including specialized GPU structures. Operating systems were originally designed for control of uniprocessor systems. By the 1980s multiprogramming, virtual memory, and network interconnection were integral services incorporated as part of most modern computers. HPC operating systems were primarily derivatives of the Unix model with Linux dominating the Top-500 list. The use of Linux for commodity clusters was first pioneered by the NASA Beowulf Project. However, the rapid increase in number of cores to achieve performance gain throughmore » technology advances has exposed the limitations of POSIX general-purpose operating systems in scaling and efficiency. This project was undertaken through the leadership of Sandia National Laboratories and in partnership of the University of New Mexico to investigate the alternative of composable lightweight kernels on scalable HPC architectures to achieve superior performance for a wide range of applications. The use of composable operating systems is intended to provide a minimalist set of services specifically required by a given application to preclude overheads and operational uncertainties (“OS noise”) that have been demonstrated to degrade efficiency and operational consistency. This project was undertaken as an exploration to investigate possible strategies and methods for composable lightweight kernel operating systems towards support for extreme scale systems.« less
NASA Technical Reports Server (NTRS)
Rabelo, Luis C.
2002-01-01
This is a report of my activities as a NASA Fellow during the summer of 2002 at the NASA Kennedy Space Center (KSC). The core of these activities is the assigned project: the Virtual Test Bed (VTB) from the Spaceport Engineering and Technology Directorate. The VTB Project has its foundations in the NASA Ames Research Center (ARC) Intelligent Launch & Range Operations program. The objective of the VTB project is to develop a new and unique collaborative computing environment where simulation models can be hosted and integrated in a seamless fashion. This collaborative computing environment will be used to build a Virtual Range as well as a Virtual Spaceport. This project will work as a technology pipeline to research, develop, test and validate R&D efforts against real time operations without interfering with the actual operations or consuming the operational personnel s time. This report will also focus on the systems issues required to conceptualize and provide form to a systems architecture capable of handling the different demands.
Development of the Tensoral Computer Language
NASA Technical Reports Server (NTRS)
Ferziger, Joel; Dresselhaus, Eliot
1996-01-01
The research scientist or engineer wishing to perform large scale simulations or to extract useful information from existing databases is required to have expertise in the details of the particular database, the numerical methods and the computer architecture to be used. This poses a significant practical barrier to the use of simulation data. The goal of this research was to develop a high-level computer language called Tensoral, designed to remove this barrier. The Tensoral language provides a framework in which efficient generic data manipulations can be easily coded and implemented. First of all, Tensoral is general. The fundamental objects in Tensoral represent tensor fields and the operators that act on them. The numerical implementation of these tensors and operators is completely and flexibly programmable. New mathematical constructs and operators can be easily added to the Tensoral system. Tensoral is compatible with existing languages. Tensoral tensor operations co-exist in a natural way with a host language, which may be any sufficiently powerful computer language such as Fortran, C, or Vectoral. Tensoral is very-high-level. Tensor operations in Tensoral typically act on entire databases (i.e., arrays) at one time and may, therefore, correspond to many lines of code in a conventional language. Tensoral is efficient. Tensoral is a compiled language. Database manipulations are simplified optimized and scheduled by the compiler eventually resulting in efficient machine code to implement them.
The GOES-R Product Generation Architecture
NASA Astrophysics Data System (ADS)
Dittberner, G. J.; Kalluri, S.; Hansen, D.; Weiner, A.; Tarpley, A.; Marley, S.
2011-12-01
The GOES-R system will substantially improve users' ability to succeed in their work by providing data with significantly enhanced instruments, higher resolution, much shorter relook times, and an increased number and diversity of products. The Product Generation architecture is designed to provide the computer and memory resources necessary to achieve the necessary latency and availability for these products. Over time, new and updated algorithms are expected to be added and old ones removed as science advances and new products are developed. The GOES-R GS architecture is being planned to maintain functionality so that when such changes are implemented, operational product generation will continue without interruption. The primary parts of the PG infrastructure are the Service Based Architecture (SBA) and the Data Fabric (DF). SBA is the middleware that encapsulates and manages science algorithms that generate products. It is divided into three parts, the Executive, which manages and configures the algorithm as a service, the Dispatcher, which provides data to the algorithm, and the Strategy, which determines when the algorithm can execute with the available data. SBA is a distributed architecture, with services connected to each other over a compute grid and is highly scalable. This plug-and-play architecture allows algorithms to be added, removed, or updated without affecting any other services or software currently running and producing data. Algorithms require product data from other algorithms, so a scalable and reliable messaging is necessary. The SBA uses the DF to provide this data communication layer between algorithms. The DF provides an abstract interface over a distributed and persistent multi-layered storage system (e.g., memory based caching above disk-based storage) and an event management system that allows event-driven algorithm services to know when instrument data are available and where they reside. Together, the SBA and the DF provide a flexible, high performance architecture that can meet the needs of product processing now and as they grow in the future.
System, methods and apparatus for program optimization for multi-threaded processor architectures
Bastoul, Cedric; Lethin, Richard A; Leung, Allen K; Meister, Benoit J; Szilagyi, Peter; Vasilache, Nicolas T; Wohlford, David E
2015-01-06
Methods, apparatus and computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
FPGA-based coprocessor for matrix algorithms implementation
NASA Astrophysics Data System (ADS)
Amira, Abbes; Bensaali, Faycal
2003-03-01
Matrix algorithms are important in many types of applications including image and signal processing. These areas require enormous computing power. A close examination of the algorithms used in these, and related, applications reveals that many of the fundamental actions involve matrix operations such as matrix multiplication which is of O (N3) on a sequential computer and O (N3/p) on a parallel system with p processors complexity. This paper presents an investigation into the design and implementation of different matrix algorithms such as matrix operations, matrix transforms and matrix decompositions using an FPGA based environment. Solutions for the problem of processing large matrices have been proposed. The proposed system architectures are scalable, modular and require less area and time complexity with reduced latency when compared with existing structures.
Research into display sharing techniques for distributed computing environments
NASA Technical Reports Server (NTRS)
Hugg, Steven B.; Fitzgerald, Paul F., Jr.; Rosson, Nina Y.; Johns, Stephen R.
1990-01-01
The X-based Display Sharing solution for distributed computing environments is described. The Display Sharing prototype includes the base functionality for telecast and display copy requirements. Since the prototype implementation is modular and the system design provided flexibility for the Mission Control Center Upgrade (MCCU) operational consideration, the prototype implementation can be the baseline for a production Display Sharing implementation. To facilitate the process the following discussions are presented: Theory of operation; System of architecture; Using the prototype; Software description; Research tools; Prototype evaluation; and Outstanding issues. The prototype is based on the concept of a dedicated central host performing the majority of the Display Sharing processing, allowing minimal impact on each individual workstation. Each workstation participating in Display Sharing hosts programs to facilitate the user's access to Display Sharing as host machine.
Quantum Simulation of Tunneling in Small Systems
Sornborger, Andrew T.
2012-01-01
A number of quantum algorithms have been performed on small quantum computers; these include Shor's prime factorization algorithm, error correction, Grover's search algorithm and a number of analog and digital quantum simulations. Because of the number of gates and qubits necessary, however, digital quantum particle simulations remain untested. A contributing factor to the system size required is the number of ancillary qubits needed to implement matrix exponentials of the potential operator. Here, we show that a set of tunneling problems may be investigated with no ancillary qubits and a cost of one single-qubit operator per time step for the potential evolution, eliminating at least half of the quantum gates required for the algorithm and more than that in the general case. Such simulations are within reach of current quantum computer architectures. PMID:22916333
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
State-of-the-art in Heterogeneous Computing
Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...
2010-01-01
Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, availablemore » software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.« less
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures.
Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang
2017-03-01
Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ ( h ), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h . Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O ( n 2 ) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor.
Semivariogram Analysis of Bone Images Implemented on FPGA Architectures
Shirvaikar, Mukul; Lagadapati, Yamuna; Dong, Xuanliang
2016-01-01
Osteoporotic fractures are a major concern for the healthcare of elderly and female populations. Early diagnosis of patients with a high risk of osteoporotic fractures can be enhanced by introducing second-order statistical analysis of bone image data using techniques such as variogram analysis. Such analysis is computationally intensive thereby creating an impediment for introduction into imaging machines found in common clinical settings. This paper investigates the fast implementation of the semivariogram algorithm, which has been proven to be effective in modeling bone strength, and should be of interest to readers in the areas of computer-aided diagnosis and quantitative image analysis. The semivariogram is a statistical measure of the spatial distribution of data, and is based on Markov Random Fields (MRFs). Semivariogram analysis is a computationally intensive algorithm that has typically seen applications in the geosciences and remote sensing areas. Recently, applications in the area of medical imaging have been investigated, resulting in the need for efficient real time implementation of the algorithm. A semi-variance, γ(h), is defined as the half of the expected squared differences of pixel values between any two data locations with a lag distance of h. Due to the need to examine each pair of pixels in the image or sub-image being processed, the base algorithm complexity for an image window with n pixels is O (n2) Field Programmable Gate Arrays (FPGAs) are an attractive solution for such demanding applications due to their parallel processing capability. FPGAs also tend to operate at relatively modest clock rates measured in a few hundreds of megahertz. This paper presents a technique for the fast computation of the semivariogram using two custom FPGA architectures. A modular architecture approach is chosen to allow for replication of processing units. This allows for high throughput due to concurrent processing of pixel pairs. The current implementation is focused on isotropic semivariogram computations only. The algorithm is benchmarked using VHDL on a Xilinx XUPV5-LX110T development Kit, which utilizes the Virtex5 FPGA. Medical image data from DXA scans are utilized for the experiments. Implementation results show that a significant advantage in computational speed is attained by the architectures with respect to implementation on a personal computer with an Intel i7 multi-core processor. PMID:28428829
Advanced computer architecture for large-scale real-time applications.
DOT National Transportation Integrated Search
1973-04-01
Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...
Exploring the architectural trade space of NASAs Space Communication and Navigation Program
NASA Astrophysics Data System (ADS)
Sanchez, M.; Selva, D.; Cameron, B.; Crawley, E.; Seas, A.; Seery, B.
NASAs Space Communication and Navigation (SCaN) Program is responsible for providing communication and navigation services to space missions and other users in and beyond low Earth orbit. The current SCaN architecture consists of three independent networks: the Space Network (SN), which contains the TDRS relay satellites in GEO; the Near Earth Network (NEN), which consists of several NASA owned and commercially operated ground stations; and the Deep Space Network (DSN), with three ground stations in Goldstone, Madrid, and Canberra. The first task of this study is the stakeholder analysis. The goal of the stakeholder analysis is to identify the main stakeholders of the SCaN system and their needs. Twenty-one main groups of stakeholders have been identified and put on a stakeholder map. Their needs are currently being elicited by means of interviews and an extensive literature review. The data will then be analyzed by applying Cameron and Crawley's stakeholder analysis theory, with a view to highlighting dominant needs and conflicting needs. The second task of this study is the architectural tradespace exploration of the next generation TDRSS. The space of possible architectures for SCaN is represented by a set of architectural decisions, each of which has a discrete set of options. A computational tool is used to automatically synthesize a very large number of possible architectures by enumerating different combinations of decisions and options. The same tool contains models to evaluate the architectures in terms of performance and cost. The performance model uses the stakeholder needs and requirements identified in the previous steps as inputs, and it is based in the VASSAR methodology presented in a companion paper. This paper summarizes the current status of the MIT SCaN architecture study. It starts by motivating the need to perform tradespace exploration studies in the context of relay data systems through a description of the history NASA's space communicati- n networks. It then presents the generalities of possible architectures for future space communication and navigation networks. Finally, it describes the tools and methods being developed, clearly indicating the architectural decisions that have been taken into account as well as the systematic approach followed to model them. The purpose of this study is to explore the SCaN architectural tradespace by means of a computational tool. This paper describes the tool, while the tradespace exploration is underway.
A Fog Computing and Cloudlet Based Augmented Reality System for the Industry 4.0 Shipyard.
Fernández-Caramés, Tiago M; Fraga-Lamas, Paula; Suárez-Albela, Manuel; Vilar-Montesinos, Miguel
2018-06-02
Augmented Reality (AR) is one of the key technologies pointed out by Industry 4.0 as a tool for enhancing the next generation of automated and computerized factories. AR can also help shipbuilding operators, since they usually need to interact with information (e.g., product datasheets, instructions, maintenance procedures, quality control forms) that could be handled easily and more efficiently through AR devices. This is the reason why Navantia, one of the 10 largest shipbuilders in the world, is studying the application of AR (among other technologies) in different shipyard environments in a project called "Shipyard 4.0". This article presents Navantia's industrial AR (IAR) architecture, which is based on cloudlets and on the fog computing paradigm. Both technologies are ideal for supporting physically-distributed, low-latency and QoS-aware applications that decrease the network traffic and the computational load of traditional cloud computing systems. The proposed IAR communications architecture is evaluated in real-world scenarios with payload sizes according to demanding Microsoft HoloLens applications and when using a cloud, a cloudlet and a fog computing system. The results show that, in terms of response delay, the fog computing system is the fastest when transferring small payloads (less than 128 KB), while for larger file sizes, the cloudlet solution is faster than the others. Moreover, under high loads (with many concurrent IAR clients), the cloudlet in some cases is more than four times faster than the fog computing system in terms of response delay.
Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.
ERIC Educational Resources Information Center
Beltrametti, Monica; English, Will
1994-01-01
Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…
Optimization of atmospheric transport models on HPC platforms
NASA Astrophysics Data System (ADS)
de la Cruz, Raúl; Folch, Arnau; Farré, Pau; Cabezas, Javier; Navarro, Nacho; Cela, José María
2016-12-01
The performance and scalability of atmospheric transport models on high performance computing environments is often far from optimal for multiple reasons including, for example, sequential input and output, synchronous communications, work unbalance, memory access latency or lack of task overlapping. We investigate how different software optimizations and porting to non general-purpose hardware architectures improve code scalability and execution times considering, as an example, the FALL3D volcanic ash transport model. To this purpose, we implement the FALL3D model equations in the WARIS framework, a software designed from scratch to solve in a parallel and efficient way different geoscience problems on a wide variety of architectures. In addition, we consider further improvements in WARIS such as hybrid MPI-OMP parallelization, spatial blocking, auto-tuning and thread affinity. Considering all these aspects together, the FALL3D execution times for a realistic test case running on general-purpose cluster architectures (Intel Sandy Bridge) decrease by a factor between 7 and 40 depending on the grid resolution. Finally, we port the application to Intel Xeon Phi (MIC) and NVIDIA GPUs (CUDA) accelerator-based architectures and compare performance, cost and power consumption on all the architectures. Implications on time-constrained operational model configurations are discussed.
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; ...
2017-06-01
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel
As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM T by using themore » compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.« less
Data storage technology: Hardware and software, Appendix B
NASA Technical Reports Server (NTRS)
Sable, J. D.
1972-01-01
This project involves the development of more economical ways of integrating and interfacing new storage devices and data processing programs into a computer system. It involves developing interface standards and a software/hardware architecture which will make it possible to develop machine independent devices and programs. These will interface with the machine dependent operating systems of particular computers. The development project will not be to develop the software which would ordinarily be the responsibility of the manufacturer to supply, but to develop the standards with which that software is expected to confirm in providing an interface with the user or storage system.
The evolution of the ISOLDE control system
NASA Astrophysics Data System (ADS)
Jonsson, O. C.; Catherall, R.; Deloose, I.; Drumm, P.; Evensen, A. H. M.; Gase, K.; Focker, G. J.; Fowler, A.; Kugler, E.; Lettry, J.; Olesen, G.; Ravn, H. L.; Isolde Collaboration
The ISOLDE on-line mass separator facility is operating on a Personal Computer based control system since spring 1992. Front End Computers accessing the hardware are controlled from consoles running Microsoft Windows ™ through a Novell NetWare4 ™ local area network. The control system is transparently integrated in the CERN wide office network and makes heavy use of the CERN standard office application programs to control and to document the running of the ISOLDE isotope separators. This paper recalls the architecture of the control system, shows its recent developments and gives some examples of its graphical user interface.
The evolution of the ISOLDE control system
NASA Astrophysics Data System (ADS)
Jonsson, O. C.; Catherall, R.; Deloose, I.; Evensen, A. H. M.; Gase, K.; Focker, G. J.; Fowler, A.; Kugler, E.; Lettry, J.; Olesen, G.; Ravn, H. L.; Drumm, P.
1996-04-01
The ISOLDE on-line mass separator facility is operating on a Personal Computer based control system since spring 1992. Front End Computers accessing the hardware are controlled from consoles running Microsoft Windows® through a Novell NetWare4® local area network. The control system is transparently integrated in the CERN wide office network and makes heavy use of the CERN standard office application programs to control and to document the running of the ISOLDE isotope separators. This paper recalls the architecture of the control system, shows its recent developments and gives some examples of its graphical user interface.
ERIC Educational Resources Information Center
Farid, Ayman A.; Zaghloul, Weaam M.; Dewidar, Khaled M.
2014-01-01
The great shift in sustainability and computer aided design in the field of architecture caused a remarkable change in the architecture philosophy, new aspects of beauty and aesthetic values are being introduced, and traditional definitions for beauty cannot fully cover this aspects, which causes a gap between; new architecture works criticism and…
Programmable hardware for reconfigurable computing systems
NASA Astrophysics Data System (ADS)
Smith, Stephen
1996-10-01
In 1945 the work of J. von Neumann and H. Goldstein created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
Execution environment for intelligent real-time control systems
NASA Technical Reports Server (NTRS)
Sztipanovits, Janos
1987-01-01
Modern telerobot control technology requires the integration of symbolic and non-symbolic programming techniques, different models of parallel computations, and various programming paradigms. The Multigraph Architecture, which has been developed for the implementation of intelligent real-time control systems is described. The layered architecture includes specific computational models, integrated execution environment and various high-level tools. A special feature of the architecture is the tight coupling between the symbolic and non-symbolic computations. It supports not only a data interface, but also the integration of the control structures in a parallel computing environment.
ROI-Based On-Board Compression for Hyperspectral Remote Sensing Images on GPU.
Giordano, Rossella; Guccione, Pietro
2017-05-19
In recent years, hyperspectral sensors for Earth remote sensing have become very popular. Such systems are able to provide the user with images having both spectral and spatial information. The current hyperspectral spaceborne sensors are able to capture large areas with increased spatial and spectral resolution. For this reason, the volume of acquired data needs to be reduced on board in order to avoid a low orbital duty cycle due to limited storage space. Recently, literature has focused the attention on efficient ways for on-board data compression. This topic is a challenging task due to the difficult environment (outer space) and due to the limited time, power and computing resources. Often, the hardware properties of Graphic Processing Units (GPU) have been adopted to reduce the processing time using parallel computing. The current work proposes a framework for on-board operation on a GPU, using NVIDIA's CUDA (Compute Unified Device Architecture) architecture. The algorithm aims at performing on-board compression using the target's related strategy. In detail, the main operations are: the automatic recognition of land cover types or detection of events in near real time in regions of interest (this is a user related choice) with an unsupervised classifier; the compression of specific regions with space-variant different bit rates including Principal Component Analysis (PCA), wavelet and arithmetic coding; and data volume management to the Ground Station. Experiments are provided using a real dataset taken from an AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) airborne sensor in a harbor area.
Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy
Hwang, Wen-Jyi; Cheng, Shih-Chang; Cheng, Chau-Jern
2011-01-01
This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipeline fashion to maximize throughput of the computation. Moreover, the number of hardware multipliers and dividers are minimized to reduce the hardware costs. The proposed architecture is used as a custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system. PMID:22163688
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.; ...
2017-08-17
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary at the certain operating condition to keep the line flow within the limit. Remedial Action Scheme (RAS) offers quick control action mechanism to keep reliability and security of the power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize the wind energy integration without compromising the security and reliability of the power system based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide the fast andmore » accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system to support RAS and DLSE implementation. The proposed RAS and DLSE is validated using the modified IEEE-118 Bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of RAS. Real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary at the certain operating condition to keep the line flow within the limit. Remedial Action Scheme (RAS) offers quick control action mechanism to keep reliability and security of the power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize the wind energy integration without compromising the security and reliability of the power system based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide the fast andmore » accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system to support RAS and DLSE implementation. The proposed RAS and DLSE is validated using the modified IEEE-118 Bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of RAS. Real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.« less
Re-engineering Nascom's network management architecture
NASA Technical Reports Server (NTRS)
Drake, Brian C.; Messent, David
1994-01-01
The development of Nascom systems for ground communications began in 1958 with Project Vanguard. The low-speed systems (rates less than 9.6 Kbs) were developed following existing standards; but, there were no comparable standards for high-speed systems. As a result, these systems were developed using custom protocols and custom hardware. Technology has made enormous strides since the ground support systems were implemented. Standards for computer equipment, software, and high-speed communications exist and the performance of current workstations exceeds that of the mainframes used in the development of the ground systems. Nascom is in the process of upgrading its ground support systems and providing additional services. The Message Switching System (MSS), Communications Address Processor (CAP), and Multiplexer/Demultiplexer (MDM) Automated Control System (MACS) are all examples of Nascom systems developed using standards such as, X-windows, Motif, and Simple Network Management Protocol (SNMP). Also, the Earth Observing System (EOS) Communications (Ecom) project is stressing standards as an integral part of its network. The move towards standards has produced a reduction in development, maintenance, and interoperability costs, while providing operational quality improvement. The Facility and Resource Manager (FARM) project has been established to integrate the Nascom networks and systems into a common network management architecture. The maximization of standards and implementation of computer automation in the architecture will lead to continued cost reductions and increased operational efficiency. The first step has been to derive overall Nascom requirements and identify the functionality common to all the current management systems. The identification of these common functions will enable the reuse of processes in the management architecture and promote increased use of automation throughout the Nascom network. The MSS, CAP, MACS, and Ecom projects have indicated the potential value of commercial-off-the-shelf (COTS) and standards through reduced cost and high quality. The FARM will allow the application of the lessons learned from these projects to all future Nascom systems.
GOES-R GS Product Generation Infrastructure Operations
NASA Astrophysics Data System (ADS)
Blanton, M.; Gundy, J.
2012-12-01
GOES-R GS Product Generation Infrastructure Operations: The GOES-R Ground System (GS) will produce a much larger set of products with higher data density than previous GOES systems. This requires considerably greater compute and memory resources to achieve the necessary latency and availability for these products. Over time, new algorithms could be added and existing ones removed or updated, but the GOES-R GS cannot go down during this time. To meet these GOES-R GS processing needs, the Harris Corporation will implement a Product Generation (PG) infrastructure that is scalable, extensible, extendable, modular and reliable. The primary parts of the PG infrastructure are the Service Based Architecture (SBA), which includes the Distributed Data Fabric (DDF). The SBA is the middleware that encapsulates and manages science algorithms that generate products. The SBA is divided into three parts, the Executive, which manages and configures the algorithm as a service, the Dispatcher, which provides data to the algorithm, and the Strategy, which determines when the algorithm can execute with the available data. The SBA is a distributed architecture, with services connected to each other over a compute grid and is highly scalable. This plug-and-play architecture allows algorithms to be added, removed, or updated without affecting any other services or software currently running and producing data. Algorithms require product data from other algorithms, so a scalable and reliable messaging is necessary. The SBA uses the DDF to provide this data communication layer between algorithms. The DDF provides an abstract interface over a distributed and persistent multi-layered storage system (memory based caching above disk-based storage) and an event system that allows algorithm services to know when data is available and to get the data that they need to begin processing when they need it. Together, the SBA and the DDF provide a flexible, high performance architecture that can meet the needs of product processing now and as they grow in the future.
ERIC Educational Resources Information Center
Betts, Janelle Lyon
2001-01-01
Describes a high school art assignment in which students utilize Appleworks or Claris Works to design their own house, after learning about architectural styles and how to use the computer program. States that the project develops student computer skills and increases student knowledge about architecture. (CMK)
Physical-depth architectural requirements for generating universal photonic cluster states
NASA Astrophysics Data System (ADS)
Morley-Short, Sam; Bartolucci, Sara; Gimeno-Segovia, Mercedes; Shadbolt, Pete; Cable, Hugo; Rudolph, Terry
2018-01-01
Most leading proposals for linear-optical quantum computing (LOQC) use cluster states, which act as a universal resource for measurement-based (one-way) quantum computation. In ballistic approaches to LOQC, cluster states are generated passively from small entangled resource states using so-called fusion operations. Results from percolation theory have previously been used to argue that universal cluster states can be generated in the ballistic approach using schemes which exceed the critical threshold for percolation, but these results consider cluster states with unbounded size. Here we consider how successful percolation can be maintained using a physical architecture with fixed physical depth, assuming that the cluster state is continuously generated and measured, and therefore that only a finite portion of it is visible at any one point in time. We show that universal LOQC can be implemented using a constant-size device with modest physical depth, and that percolation can be exploited using simple pathfinding strategies without the need for high-complexity algorithms.
NASA Astrophysics Data System (ADS)
Oztekin, Halit; Temurtas, Feyzullah; Gulbag, Ali
The Arithmetic and Logic Unit (ALU) design is one of the important topics in Computer Architecture and Organization course in Computer and Electrical Engineering departments. There are ALU designs that have non-modular nature to be used as an educational tool. As the programmable logic technology has developed rapidly, it is feasible that ALU design based on Field Programmable Gate Array (FPGA) is implemented in this course. In this paper, we have adopted the modular approach to ALU design based on FPGA. All the modules in the ALU design are realized using schematic structure on Altera's Cyclone II Development board. Under this model, the ALU content is divided into four distinct modules. These are arithmetic unit except for multiplication and division operations, logic unit, multiplication unit and division unit. User can easily design any size of ALU unit since this approach has the modular nature. Then, this approach was applied to microcomputer architecture design named BZK.SAU.FPGA10.0 instead of the current ALU unit.
Compact FPGA hardware architecture for public key encryption in embedded devices
Morales-Sandoval, Miguel; Cumplido, René; Feregrino-Uribe, Claudia; Algredo-Badillo, Ignacio
2018-01-01
Security is a crucial requirement in the envisioned applications of the Internet of Things (IoT), where most of the underlying computing platforms are embedded systems with reduced computing capabilities and energy constraints. In this paper we present the design and evaluation of a scalable low-area FPGA hardware architecture that serves as a building block to accelerate the costly operations of exponentiation and multiplication in GF(p), commonly required in security protocols relying on public key encryption, such as in key agreement, authentication and digital signature. The proposed design can process operands of different size using the same datapath, which exhibits a significant reduction in area without loss of efficiency if compared to representative state of the art designs. For example, our design uses 96% less standard logic than a similar design optimized for performance, and 46% less resources than other design optimized for area. Even using fewer area resources, our design still performs better than its embedded software counterparts (190x and 697x). PMID:29360824
High Performance Radiation Transport Simulations on TITAN
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M
2012-01-01
In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10--20 petaflop ORNLmore » GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.« less
Compact FPGA hardware architecture for public key encryption in embedded devices.
Rodríguez-Flores, Luis; Morales-Sandoval, Miguel; Cumplido, René; Feregrino-Uribe, Claudia; Algredo-Badillo, Ignacio
2018-01-01
Security is a crucial requirement in the envisioned applications of the Internet of Things (IoT), where most of the underlying computing platforms are embedded systems with reduced computing capabilities and energy constraints. In this paper we present the design and evaluation of a scalable low-area FPGA hardware architecture that serves as a building block to accelerate the costly operations of exponentiation and multiplication in [Formula: see text], commonly required in security protocols relying on public key encryption, such as in key agreement, authentication and digital signature. The proposed design can process operands of different size using the same datapath, which exhibits a significant reduction in area without loss of efficiency if compared to representative state of the art designs. For example, our design uses 96% less standard logic than a similar design optimized for performance, and 46% less resources than other design optimized for area. Even using fewer area resources, our design still performs better than its embedded software counterparts (190x and 697x).
Architectures for Cognitive Systems
2010-02-01
highly modular many- node chip was designed which addressed power efficiency to the maximum extent possible. Each node contains an Asynchronous Field...optimization to perform complex cognitive computing operations. This project focused on the design of the core and integration across a four node chip . A...follow on project will focus on creating a 3 dimensional stack of chips that is enabled by the low power usage. The chip incorporates structures to
Leibo, Joel Z.; Liao, Qianli; Freiwald, Winrich A.; Anselmi, Fabio; Poggio, Tomaso
2017-01-01
SUMMARY The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and robust against identity-preserving transformations like depth-rotations [1, 2]. Current computational models of object recognition, including recent deep learning networks, generate these properties through a hierarchy of alternating selectivity-increasing filtering and tolerance-increasing pooling operations, similar to simple-complex cells operations [3, 4, 5, 6]. Here we prove that a class of hierarchical architectures and a broad set of biologically plausible learning rules generate approximate invariance to identity-preserving transformations at the top level of the processing hierarchy. However, all past models tested failed to reproduce the most salient property of an intermediate representation of a three-level face-processing hierarchy in the brain: mirror-symmetric tuning to head orientation [7]. Here we demonstrate that one specific biologically-plausible Hebb-type learning rule generates mirror-symmetric tuning to bilaterally symmetric stimuli like faces at intermediate levels of the architecture and show why it does so. Thus the tuning properties of individual cells inside the visual stream appear to result from group properties of the stimuli they encode and to reflect the learning rules that sculpted the information-processing system within which they reside. PMID:27916522
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.; ...
2011-03-02
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broadmore » range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.« less
Evaluation of Visual Computer Simulator for Computer Architecture Education
ERIC Educational Resources Information Center
Imai, Yoshiro; Imai, Masatoshi; Moritoh, Yoshio
2013-01-01
This paper presents trial evaluation of a visual computer simulator in 2009-2011, which has been developed to play some roles of both instruction facility and learning tool simultaneously. And it illustrates an example of Computer Architecture education for University students and usage of e-Learning tool for Assembly Programming in order to…
Accelerating artificial intelligence with reconfigurable computing
NASA Astrophysics Data System (ADS)
Cieszewski, Radoslaw
Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Such a field, where there is many different algorithms which can be accelerated, is an artificial intelligence. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.