Hypercluster Parallel Processor
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela
1992-01-01
Hypercluster computer system includes multiple digital processors, the operation of which is coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
Developing a Distributed Computing Architecture at Arizona State University.
ERIC Educational Resources Information Center
Armann, Neil; And Others
1994-01-01
Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…
Frances: A Tool for Understanding Computer Architecture and Assembly Language
ERIC Educational Resources Information Center
Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh
2012-01-01
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
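The pre-computed-paths idea described above can be illustrated outside aCe. The following is a minimal Python sketch, not aCe syntax: the function names `precompute_path` and `communicate` are hypothetical, and the communication pattern (a shift around a ring of virtual processors) is an assumption chosen only to keep the example small. The routing table is built once and then reused on every communication step.

```python
def precompute_path(n, offset):
    """Compute, once, the source index that each of n ring positions reads from."""
    return [(i + offset) % n for i in range(n)]


def communicate(values, path):
    """Apply a pre-computed path: position i receives values[path[i]]."""
    return [values[src] for src in path]


# The routing part of the operation is computed a single time...
path = precompute_path(8, 1)

# ...and reused for every communication step instead of being recomputed.
data = list(range(8))
for _ in range(3):
    data = communicate(data, path)

print(data)
```

The saving in this toy case is only the repeated modulo arithmetic; the abstract does not say which part of a communication operation aCe actually pre-computes, so the sketch should be read as an analogy rather than a description of the compiler.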
Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.
Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein
2015-12-01
Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies might include Smartphones and wearable devices, which have attracted the attention of medical applications. These products are used less in critical medical applications because of their resource constraints and failure sensitivity. This is due to the fact that without safety considerations, small-integrated hardware will endanger patients' lives. Therefore, proposing some principles is required to construct wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to cloud. The tiers of this architecture include wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high possible fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.; Lytle, John K. (Technical Monitor)
2002-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS Architecture including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms which has been the focus of NASA Ames. Additional features of the object oriented architecture that support MultiDisciplinary (MD) Coupling, computer aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.
Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.
ERIC Educational Resources Information Center
Beltrametti, Monica; English, Will
1994-01-01
Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…
Execution environment for intelligent real-time control systems
NASA Technical Reports Server (NTRS)
Sztipanovits, Janos
1987-01-01
Modern telerobot control technology requires the integration of symbolic and non-symbolic programming techniques, different models of parallel computations, and various programming paradigms. The Multigraph Architecture, which has been developed for the implementation of intelligent real-time control systems, is described. The layered architecture includes specific computational models, an integrated execution environment and various high-level tools. A special feature of the architecture is the tight coupling between the symbolic and non-symbolic computations. It supports not only a data interface, but also the integration of the control structures in a parallel computing environment.
Computing architecture for autonomous microgrids
Goldsmith, Steven Y.
2015-09-29
A computing architecture that facilitates autonomously controlling operations of a microgrid is described herein. A microgrid network includes numerous computing devices that execute intelligent agents, each of which is assigned to a particular entity (load, source, storage device, or switch) in the microgrid. The intelligent agents can execute in accordance with predefined protocols to collectively perform computations that facilitate uninterrupted control of the .
MIT CSAIL and Lincoln Laboratory Task Force Report
2016-08-01
projects have been very diverse, spanning several areas of CSAIL concentration, including robotics, big data analytics, wireless communications, computing architectures and...to machine learning systems and algorithms, such as recommender systems, and “Big Data” analytics. Advanced computing architectures broadly refer to
NASA Astrophysics Data System (ADS)
Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng
2018-04-01
In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture shows better speed-device consumption efficiency than the previous kernel storage architecture. Further, we improve the architecture for high-accuracy, low-power computing by utilizing binary storage and a series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows excellent performance, including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) more than 67 times speed boost; 3) 71.4% energy saving.
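To make the benchmark size in this abstract concrete, the arithmetic of the workload (one 28 × 28 image, 10 kernels of size 3 × 3) can be sketched in plain Python. This is a software reference only and says nothing about how the RRAM arrays perform the computation. The stride of 1 and the absence of padding are assumptions not stated in the abstract, and the checkerboard image and all-ones kernels are placeholder data.

```python
def conv2d_valid(image, kernel):
    """Direct 2-D convolution (no padding, stride 1) on nested lists."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    out_h, out_w = h - k + 1, w - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Placeholder data: a binary checkerboard image and ten all-ones kernels.
image = [[(r + c) % 2 for c in range(28)] for r in range(28)]
kernels = [[[1] * 3 for _ in range(3)] for _ in range(10)]

feature_maps = [conv2d_valid(image, k) for k in kernels]

# Each of the 10 x 26 x 26 outputs needs 3 x 3 = 9 multiply-accumulates.
macs = len(kernels) * 26 * 26 * 9
print(len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]), macs)
```

Under these assumptions each kernel yields a 26 × 26 feature map, so the whole benchmark amounts to 60,840 multiply-accumulate operations.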
ERIC Educational Resources Information Center
Nikolic, B.; Radivojevic, Z.; Djordjevic, J.; Milutinovic, V.
2009-01-01
Courses in Computer Architecture and Organization are regularly included in Computer Engineering curricula. These courses are usually organized in such a way that students obtain not only a purely theoretical experience, but also a practical understanding of the topics lectured. This practical work is usually done in a laboratory using simulators…
Collaborative Working Architecture for IoT-Based Applications.
Mora, Higinio; Signes-Pont, María Teresa; Gil, David; Johnsson, Magnus
2018-05-23
The new sensing applications need enhanced computing capabilities to handle the requirements of complex and huge data processing. The Internet of Things (IoT) concept brings processing and communication features to devices. In addition, the Cloud Computing paradigm provides resources and infrastructures for performing the computations and outsourcing the work from the IoT devices. This scenario opens new opportunities for designing advanced IoT-based applications; however, there is still much research to be done to properly gear all the systems for working together. This work proposes a collaborative model and an architecture to take advantage of the available computing resources. The resulting architecture involves a novel network design with different levels which combines sensing and processing capabilities based on the Mobile Cloud Computing (MCC) paradigm. An experiment is included to demonstrate that this approach can be used in diverse real applications. The results show the flexibility of the architecture to perform complex computational tasks of advanced applications.
Modelling parallel programs and multiprocessor architectures with AXE
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Fineman, Charles E.
1991-01-01
AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
NASA Technical Reports Server (NTRS)
Torres-Pomales, Wilfredo
2014-01-01
This report presents an example of the application of multi-criteria decision analysis to the selection of an architecture for a safety-critical distributed computer system. The design problem includes constraints on minimum system availability and integrity, and the decision is based on the optimal balance of power, weight and cost. The analysis process includes the generation of alternative architectures, evaluation of individual decision criteria, and the selection of an alternative based on overall value. In the example presented here, iterative application of the quantitative evaluation process made it possible to deliberately generate an alternative architecture that is superior to all others regardless of the relative importance of cost.
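The selection step described in this abstract can be sketched as a weighted-sum evaluation over the three criteria it names (power, weight, cost). The sketch below is a generic illustration in Python, not the report's method: the alternatives "A", "B" and "C", their scores (normalized so that higher is better) and the function names are all hypothetical, and the availability and integrity constraints are not modeled. It also shows one sufficient condition for an alternative to win regardless of how heavily cost is weighted: scoring at least as well as every other alternative on every criterion.

```python
def weighted_value(scores, weights):
    """Overall value as a normalized weighted sum of per-criterion scores."""
    total = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total


def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(a[c] >= b[c] for c in a) and any(a[c] > b[c] for c in a)


# Hypothetical alternatives with scores in [0, 1]; higher is better.
alternatives = {
    "A": {"power": 0.6, "weight": 0.5, "cost": 0.9},
    "B": {"power": 0.7, "weight": 0.8, "cost": 0.4},
    "C": {"power": 0.8, "weight": 0.9, "cost": 0.9},
}


def best(weights):
    """Name of the alternative with the highest overall value."""
    return max(alternatives,
               key=lambda name: weighted_value(alternatives[name], weights))


# Sweep the importance of cost; "C" stays on top because it dominates.
winners = [best({"power": 1.0, "weight": 1.0, "cost": w})
           for w in (0.0, 0.5, 1.0, 5.0, 50.0)]
print(winners)
```

Dominance is a stronger property than the abstract claims for its generated alternative, which is only said to be superior for any relative importance of cost, so the check above is a convenient test rather than a requirement.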
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
Super and parallel computers and their impact on civil engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamat, M.P.
1986-01-01
This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.
Manyscale Computing for Sensor Processing in Support of Space Situational Awareness
NASA Astrophysics Data System (ADS)
Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.
2014-09-01
Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). 
Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis
NASA Technical Reports Server (NTRS)
Brent, G. A.
1978-01-01
A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed inflight information for the pilot. A processor architecture called the Team Architecture was proposed. This is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/I simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs and results are presented at length. Also included are program input formats, outputs and listings.
Electromagnetic Physics Models for Parallel Computing Architectures
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Information Architecture: Notes toward a New Curriculum.
ERIC Educational Resources Information Center
Latham, Don
2002-01-01
Considers the evolution of information architectures as a field of professional education. Topics include the need for an interdisciplinary approach; balancing practical skills with theoretical concepts; and key content areas, including information organization, graphic design, computer science, user and usability studies, and communication.…
ATCA for Machines-- Advanced Telecommunications Computing Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, R.S.; /SLAC
2008-04-22
The Advanced Telecommunications Computing Architecture is a new industry open standard for electronics instrument modules and shelves being evaluated for the International Linear Collider (ILC). It is the first industrial standard designed for High Availability (HA). ILC availability simulations have shown clearly that the capabilities of ATCA are needed in order to achieve acceptable integrated luminosity. The ATCA architecture looks attractive for beam instruments and detector applications as well. This paper provides an overview of ongoing R&D including application of HA principles to power electronics systems.
Computer Technology: State of the Art.
ERIC Educational Resources Information Center
Withington, Frederic G.
1981-01-01
Describes the nature of modern general-purpose computer systems, including hardware, semiconductor electronics, microprocessors, computer architecture, input output technology, and system control programs. Seven suggested readings are cited. (FM)
NASA Astrophysics Data System (ADS)
Zaveri, Mazad Shaheriar
The semiconductor/computer industry has been following Moore's law for several decades and has reaped the benefits in speed and density of the resultant scaling. Transistor density has reached almost one billion per chip, and transistor delays are in picoseconds. However, scaling has slowed down, and the semiconductor industry is now facing several challenges. Hybrid CMOS/nano technologies, such as CMOL, are considered as an interim solution to some of the challenges. Another potential architectural solution includes specialized architectures for applications/models in the intelligent computing domain, one aspect of which includes abstract computational models inspired from the neuro/cognitive sciences. Consequently in this dissertation, we focus on the hardware implementations of Bayesian Memory (BM), which is a (Bayesian) Biologically Inspired Computational Model (BICM). This model is a simplified version of George and Hawkins' model of the visual cortex, which includes an inference framework based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the BM. This particular methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom hardware architectures using both traditional CMOS and hybrid nanotechnology - CMOL, and investigating the baseline performance/price of these architectures. The results suggest that CMOL is a promising candidate for implementing a BM. 
Such implementations can utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm² (TPM) obtained for CMOL based architectures is 32 to 40 times better than the TPM for a CMOS based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a PC implementation. We later use this methodology to investigate the hardware implementations of a cortex-scale spiking neural system, which is an approximate neural equivalent of a BICM based cortex-scale system. The results of this investigation also suggest that CMOL is a promising candidate to implement such large-scale neuromorphic systems. In general, the assessment of such hypothetical baseline hardware architectures provides the prospects for building large-scale (mammalian cortex-scale) implementations of neuromorphic/Bayesian/intelligent systems using state-of-the-art and beyond state-of-the-art silicon structures.
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
Access control and privacy in large distributed systems
NASA Technical Reports Server (NTRS)
Leiner, B. M.; Bishop, M.
1986-01-01
Large scale distributed systems consist of workstations, mainframe computers, supercomputers and other types of servers, all connected by a computer network. These systems are being used in a variety of applications including the support of collaborative scientific research. In such an environment, issues of access control and privacy arise. Access control is required for several reasons, including the protection of sensitive resources and cost control. Privacy is also required for similar reasons, including the protection of a researcher's proprietary results. A possible architecture for integrating available computer and communications security technologies into a system that meets these requirements is described. This architecture is meant as a starting point for discussion, rather than the final answer.
NASA Astrophysics Data System (ADS)
Ragan-Kelley, M.; Perez, F.; Granger, B.; Kluyver, T.; Ivanov, P.; Frederic, J.; Bussonnier, M.
2014-12-01
IPython has provided terminal-based tools for interactive computing in Python since 2001. The notebook document format and multi-process architecture introduced in 2011 have expanded the applicable scope of IPython into teaching, presenting, and sharing computational work, in addition to interactive exploration. The new architecture also allows users to work in any language, with implementations in Python, R, Julia, Haskell, and several other languages. The language agnostic parts of IPython have been renamed to Jupyter, to better capture the notion that a cross-language design can encapsulate commonalities present in computational research regardless of the programming language being used. This architecture offers components like the web-based Notebook interface, that supports rich documents that combine code and computational results with text narratives, mathematics, images, video and any media that a modern browser can display. This interface can be used not only in research, but also for publication and education, as notebooks can be converted to a variety of output formats, including HTML and PDF. Recent developments in the Jupyter project include a multi-user environment for hosting notebooks for a class or research group, a live collaboration notebook via Google Docs, and better support for languages other than Python.
Design of a fault tolerant airborne digital computer. Volume 1: Architecture
NASA Technical Reports Server (NTRS)
Wensley, J. H.; Levitt, K. N.; Green, M. W.; Goldberg, J.; Neumann, P. G.
1973-01-01
This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable and contractible. (4) The design is to be appropriate to post-1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition, SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive.
Electromagnetic physics models for parallel computing architectures
Amadio, G.; Ananya, A.; Apostolakis, J.; ...
2016-11-21
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Finally, the results of preliminary performance evaluation and physics validation are presented as well.
ERIC Educational Resources Information Center
Hill, Linda L.; Crosier, Scott J.; Smith, Terrence R.; Goodchild, Michael; Iannella, Renato; Erickson, John S.; Reich, Vicky; Rosenthal, David S. H.
2001-01-01
Includes five articles. Topics include requirements for a content standard to describe computational models; architectures for digital rights management systems; access control for digital information objects; LOCKSS (Lots of Copies Keep Stuff Safe) that allows libraries to run Web caches for specific journals; and a Web site from the U.S.…
Application of Tessellation in Architectural Geometry Design
NASA Astrophysics Data System (ADS)
Chang, Wei
2018-06-01
Tessellation plays a significant role in architectural geometry design and has been widely used both throughout the history of architecture and in modern architectural design with the help of computer technology. Tessellations have been found since the birth of civilization. In terms of dimensions, there are two-dimensional tessellations and three-dimensional tessellations; in terms of symmetry, there are periodic tessellations and aperiodic tessellations. Some special types of tessellations, such as Voronoi tessellations and Delaunay triangles, are also included. Both geometry and crystallography, the latter of which is the basic theory of three-dimensional tessellations, need to be studied. Historically, tessellation was applied to skins or decorations in architecture. The development of computer technology makes tessellation more powerful, as seen in surface control, surface display, structure design, etc. Therefore, research on the application of tessellation in architectural geometry design is of great necessity in architecture studies.
Performance Analysis of Multilevel Parallel Applications on Shared Memory Architectures
NASA Technical Reports Server (NTRS)
Biegel, Bryan A. (Technical Monitor); Jost, G.; Jin, H.; Labarta, J.; Gimenez, J.; Caubet, J.
2003-01-01
Parallel programming paradigms include process level parallelism, thread level parallelization, and multilevel parallelism. This viewgraph presentation describes a detailed performance analysis of these paradigms for Shared Memory Architecture (SMA). This analysis uses the Paraver Performance Analysis System. The presentation includes diagrams of a flow of useful computations.
Scalable quantum computer architecture with coupled donor-quantum dot qubits
Schenkel, Thomas; Lo, Cheuk Chi; Weis, Christoph; Lyon, Stephen; Tyryshkin, Alexei; Bokor, Jeffrey
2014-08-26
A quantum bit computing architecture includes a plurality of single spin memory donor atoms embedded in a semiconductor layer, a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, wherein a first voltage applied across at least one pair of the aligned quantum dot and donor atom controls a donor-quantum dot coupling. A method of performing quantum computing in a scalable architecture quantum computing apparatus includes arranging a pattern of single spin memory donor atoms in a semiconductor layer, forming a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, applying a first voltage across at least one aligned pair of a quantum dot and donor atom to control a donor-quantum dot coupling, and applying a second voltage between one or more quantum dots to control a Heisenberg exchange J coupling between quantum dots and to cause transport of a single spin polarized electron between quantum dots.
NASA Technical Reports Server (NTRS)
Smith, Paul H.
1988-01-01
The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.
Doing It Right: 366 answers to computing questions you didn't know you had
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herring, Stuart Davis
Slides include information on history: version control, version control: branches, version control: Git, releases, requirements, readability, readability control flow, global variables, architecture, architecture redundancy, processes, input/output, unix, etcetera.
Exploring Gigabyte Datasets in Real Time: Architectures, Interfaces and Time-Critical Design
NASA Technical Reports Server (NTRS)
Bryson, Steve; Gerald-Yamasaki, Michael (Technical Monitor)
1998-01-01
Architectures and Interfaces: The implications of real-time interaction on software architecture design: decoupling of interaction/graphics and computation into asynchronous processes. The performance requirements of graphics and computation for interaction. Time management in such an architecture. Examples of how visualization algorithms must be modified for high performance. Brief survey of interaction techniques and design, including direct manipulation and manipulation via widgets. The talk discusses how human factors considerations drove the design and implementation of the virtual wind tunnel. Time-Critical Design: A survey of time-critical techniques for both computation and rendering. Emphasis on the assignment of a time budget to both the overall visualization environment and to each individual visualization technique in the environment. The estimation of the benefit and cost of an individual technique. Examples of the modification of visualization algorithms to allow time-critical control.
Hardware architecture design of image restoration based on time-frequency domain computation
NASA Astrophysics Data System (ADS)
Wen, Bo; Zhang, Jing; Jiao, Zipeng
2013-10-01
Image restoration algorithms based on time-frequency domain computation (TFDC) are highly mature and widely applied in engineering. To enable high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. First, the main module is designed by analyzing the common processing and numerical calculation. Then, to improve commonality, an iteration control module is planned for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming modules, which include the two-dimensional FFT/IFFT and the plural calculation. Finally, the TFDC hardware architecture is adopted for the hardware design of a real-time image restoration system. The results show that the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability, and high efficiency.
The MasPar MP-1 As a Computer Arithmetic Laboratory
Anuta, Michael A.; Lozier, Daniel W.; Turner, Peter R.
1996-01-01
This paper is a blueprint for the use of a massively parallel SIMD computer architecture for the simulation of various forms of computer arithmetic. The particular system used is a DEC/MasPar MP-1 with 4096 processors in a square array. This architecture has many advantages for such simulations due largely to the simplicity of the individual processors. Arithmetic operations can be spread across the processor array to simulate a hardware chip. Alternatively they may be performed on individual processors to allow simulation of a massively parallel implementation of the arithmetic. Compromises between these extremes permit speed-area tradeoffs to be examined. The paper includes a description of the architecture and its features. It then summarizes some of the arithmetic systems which have been, or are to be, implemented. The implementation of the level-index and symmetric level-index, LI and SLI, systems is described in some detail. An extensive bibliography is included. PMID:27805123
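As a rough illustration of the level-index (LI) idea mentioned above, the sketch below assumes the usual definition from the level-index literature, which this abstract does not spell out: a value X >= 0 is stored as level + index, where the index lies in [0, 1) and X is recovered by applying exp to the index `level` times. The function names are hypothetical, and nothing here reflects the MasPar implementation or the symmetric (SLI) variant.

```python
import math

def to_level_index(x):
    """Return (level, index) for x >= 0, assuming x = exp applied `level` times to index."""
    level = 0
    while x >= 1.0:
        # Each logarithm peels off one level until the value falls in [0, 1).
        x = math.log(x)
        level += 1
    return level, x

def from_level_index(level, index):
    """Invert to_level_index by applying exp `level` times to the index."""
    x = index
    for _ in range(level):
        x = math.exp(x)
    return x
```

For example, `to_level_index(1e10)` yields level 4 with an index a little above 0.13, which shows how slowly the level grows even for very large magnitudes.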
ERIC Educational Resources Information Center
Soares, S. N.; Wagner, F. R.
2011-01-01
Teaching and Design Workbench (T&D-Bench) is a framework aimed at education and research in the areas of computer architecture and embedded systems. It includes a set of features not found in other educational environments. This set of features is the result of an original combination of design requirements for T&D-Bench: that the…
NASA Technical Reports Server (NTRS)
1985-01-01
The second task in the Space Station Data System (SSDS) Analysis/Architecture Study is the development of an information base that will support the conduct of trade studies and provide sufficient data to make key design/programmatic decisions. This volume identifies the preferred options in the technology category and characterizes these options with respect to performance attributes, constraints, cost, and risk. The technology category includes advanced materials, processes, and techniques that can be used to enhance the implementation of SSDS design structures. The specific areas discussed are mass storage, including space and ground on-line storage and off-line storage; man/machine interface; data processing hardware, including flight computers and advanced/fault tolerant computer architectures; and software, including data compression algorithms, on-board high level languages, and software tools. Also discussed are artificial intelligence applications and hard-wire communications.
Engineering and Computing Portal to Solve Environmental Problems
NASA Astrophysics Data System (ADS)
Gudov, A. M.; Zavozkin, S. Y.; Sotnikov, I. Y.
2018-01-01
This paper describes the architecture and services of the Engineering and Computing Portal, a complex solution that provides access to high-performance computing resources and makes it possible to carry out computational experiments, teach parallel technologies, and solve computing tasks, including those related to technogenic safety.
Analog Computation by DNA Strand Displacement Circuits.
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2016-08-19
DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands, respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. On the basis of these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
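The last point can be illustrated numerically. The sketch below is not a model of the DNA circuits themselves (it ignores concentrations and kinetics); it only shows that a truncated Taylor polynomial and a Newton iteration can each be written using nothing but addition, subtraction, and multiplication. The functions `add`, `sub`, and `mul` are hypothetical stand-ins for the three gate types, and the Taylor coefficients are assumed to be fixed constants of the circuit.

```python
# Hypothetical stand-ins for the three analog gates; real gates act on
# strand concentrations, which this sketch does not model.
def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def mul(a, b):
    return a * b

def exp_taylor(x, terms=12):
    """Approximate exp(x) by a truncated Taylor polynomial in Horner form."""
    # Coefficients 1/k! are treated as precomputed constants.
    coeffs = [1.0]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] / k)
    result = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        result = add(mul(result, x), c)
    return result

def reciprocal_newton(a, y0, steps=6):
    """Approximate 1/a with the division-free Newton iteration y <- y * (2 - a*y)."""
    y = y0
    for _ in range(steps):
        y = mul(y, sub(2.0, mul(a, y)))
    return y
```

The Newton iteration converges only when the starting guess satisfies 0 < y0 < 2/a, so a circuit built this way would need a sensible initial value; for instance, `reciprocal_newton(4.0, 0.1)` approaches 0.25.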
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lala, J.H.; Nagle, G.A.; Harper, R.E.
1993-05-01
The Maglev control computer system should be designed to verifiably possess high reliability and safety as well as high availability to make Maglev a dependable and attractive transportation alternative to the public. A Maglev control computer system has been designed using a design-for-validation methodology developed earlier under NASA and SDIO sponsorship for real-time aerospace applications. The present study starts by defining the maglev mission scenario and ends with the definition of a maglev control computer architecture. Key intermediate steps included definitions of functional and dependability requirements, synthesis of two candidate architectures, development of qualitative and quantitative evaluation criteria, and analytical modeling of the dependability characteristics of the two architectures. Finally, the applicability of the design-for-validation methodology was also illustrated by applying it to the German Transrapid TR07 maglev control system.
Multiprocessor architecture: Synthesis and evaluation
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1990-01-01
Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.
NASA Astrophysics Data System (ADS)
Kelley, Troy D.; McGhee, S.
2013-05-01
This paper describes the ongoing development of a robotic control architecture that is inspired by computational cognitive architectures from the discipline of cognitive psychology. The Symbolic and Sub-Symbolic Robotics Intelligence Control System (SS-RICS) combines symbolic and sub-symbolic representations of knowledge into a unified control architecture. The new architecture leverages previous work in cognitive architectures, specifically the development of the Adaptive Character of Thought-Rational (ACT-R) and Soar. This paper details current work on learning from episodes or events. The use of episodic memory as a learning mechanism has, until recently, been largely ignored by computational cognitive architectures. This paper details work on metric level episodic memory streams and methods for translating episodes into abstract schemas. The presentation will include research on learning through novelty and self-generated feedback mechanisms for autonomous systems.
Muller, George; Perkins, Casey J.; Lancaster, Mary J.; MacDonald, Douglas G.; Clements, Samuel L.; Hutton, William J.; Patrick, Scott W.; Key, Bradley Robert
2015-07-28
Computer-implemented security evaluation methods, security evaluation systems, and articles of manufacture are described. According to one aspect, a computer-implemented security evaluation method includes accessing information regarding a physical architecture and a cyber architecture of a facility, building a model of the facility comprising a plurality of physical areas of the physical architecture, a plurality of cyber areas of the cyber architecture, and a plurality of pathways between the physical areas and the cyber areas, identifying a target within the facility, executing the model a plurality of times to simulate a plurality of attacks against the target by an adversary traversing at least one of the areas in the physical domain and at least one of the areas in the cyber domain, and using results of the executing, providing information regarding a security risk of the facility with respect to the target.
GPU-computing in econophysics and statistical physics
NASA Astrophysics Data System (ADS)
Preis, T.
2011-03-01
A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction to the field of GPU computing and includes examples. In particular, computationally expensive analyses employed in a financial market context are coded on a graphics card architecture, which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.
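To make the Ising example concrete, here is a minimal CPU-only sketch of a checkerboard Metropolis sweep, a decomposition commonly used when porting the Ising model to parallel hardware. It is an assumption that this resembles the article's approach; the function names, the coupling J = 1, and the square lattice with periodic boundaries are all illustrative choices, and the loops run sequentially rather than on a GPU.

```python
import math
import random

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice with periodic boundaries.

    Sites are updated one colour at a time. Same-coloured sites share no
    neighbours (for even lattice sizes), so on a GPU each could be handled
    by its own thread; here they are simply visited in a loop.
    """
    n = len(spins)
    for colour in (0, 1):
        for i in range(n):
            for j in range(n):
                if (i + j) % 2 != colour:
                    continue
                neighbours = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                              + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
                delta_e = 2 * spins[i][j] * neighbours  # energy change, J = 1
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]

def magnetisation(spins):
    """Mean spin per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)
```

A typical use would start from `spins = [[1] * 8 for _ in range(8)]`, call `checkerboard_sweep(spins, beta, random.Random(0))` repeatedly, and track `magnetisation(spins)` as the inverse temperature `beta` is varied.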
Analysis of Disaster Preparedness Planning Measures in DoD Computer Facilities
1993-09-01
...Computer Disaster Recovery... PC and LAN Lessons Learned... Distributed Architectures... Backups...amount of expense, but no client problems." (Leeke, 1993, p. 8) 2. Distributed Architectures: The majority of operations that were disrupted by the
Service-Oriented Architecture for NVO and TeraGrid Computing
NASA Technical Reports Server (NTRS)
Jacob, Joseph; Miller, Craig; Williams, Roy; Steenberg, Conrad; Graham, Matthew
2008-01-01
The National Virtual Observatory (NVO) Extensible Secure Scalable Service Infrastructure (NESSSI) is a Web service architecture and software framework that enables Web-based astronomical data publishing and processing on grid computers such as the National Science Foundation's TeraGrid. Characteristics of this architecture include the following: (1) Services are created, managed, and upgraded by their developers, who are trusted users of computing platforms on which the services are deployed. (2) Service jobs can be initiated by means of Java or Python client programs run on a command line or with Web portals. (3) Access is granted within a graduated security scheme in which the size of a job that can be initiated depends on the level of authentication of the user.
NASA Technical Reports Server (NTRS)
Rickard, D. A.; Bodenheimer, R. E.
1976-01-01
Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.
FPGA-based real-time phase measuring profilometry algorithm design and implementation
NASA Astrophysics Data System (ADS)
Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng
2016-11-01
Phase measuring profilometry (PMP) has been widely used in many fields, such as Computer Aided Verification (CAV) and Flexible Manufacturing Systems (FMS). High frame-rate (HFR) real-time vision-based feedback control will be a common demand in the near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limits the efficiency of data processing. An FPGA has the advantages of a pipelined architecture and parallel execution, and it is well suited to handling the PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of the hardware architecture include rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy were analyzed.
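The phase calculation step can be sketched per pixel as follows. The abstract does not say how many phase shifts are used, so this example assumes the common four-step scheme, in which the k-th fringe image has intensity I_k = A + B*cos(phi + k*pi/2). The function names are hypothetical, and a hardware pipeline would need its own arctangent unit, which this software sketch does not model.

```python
import math

def wrapped_phase(i0, i1, i2, i3):
    """Wrapped phase in (-pi, pi] from four fringe intensities at one pixel.

    Assumes I_k = A + B*cos(phi + k*pi/2) for k = 0..3, so that
    I3 - I1 = 2B*sin(phi) and I0 - I2 = 2B*cos(phi).
    """
    return math.atan2(i3 - i1, i0 - i2)

def simulate_pixel(phi, offset=0.5, amplitude=0.4):
    """Synthesise the four intensities for a known phase (illustration only)."""
    return [offset + amplitude * math.cos(phi + k * math.pi / 2) for k in range(4)]
```

Because the offset A and amplitude B cancel in the ratio, the recovered phase does not depend on ambient light or surface reflectivity under this model; the result is still wrapped and would need unwrapping before stereo matching.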
Computers in Academic Architecture Libraries.
ERIC Educational Resources Information Center
Willis, Alfred; And Others
1992-01-01
Computers are widely used in architectural research and teaching in U.S. schools of architecture. A survey of libraries serving these schools sought information on the emphasis placed on computers by the architectural curriculum, accessibility of computers to library staff, and accessibility of computers to library patrons. Survey results and…
NASA Astrophysics Data System (ADS)
Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.
2016-05-01
In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
Apparatuses and Methods for Producing Runtime Architectures of Computer Program Modules
NASA Technical Reports Server (NTRS)
Abi-Antoun, Marwan Elia (Inventor); Aldrich, Jonathan Erik (Inventor)
2013-01-01
Apparatuses and methods for producing run-time architectures of computer program modules. One embodiment includes creating an abstract graph from the computer program module and from containment information corresponding to the computer program module, wherein the abstract graph has nodes including types and objects, and wherein the abstract graph relates an object to a type, and wherein for a specific object the abstract graph relates the specific object to a type containing the specific object; and creating a runtime graph from the abstract graph, wherein the runtime graph is a representation of the true runtime object graph, wherein the runtime graph represents containment information such that, for a specific object, the runtime graph relates the specific object to another object that contains the specific object.
Hybrid parallel computing architecture for multiview phase shifting
NASA Astrophysics Data System (ADS)
Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun
2014-11-01
The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphics processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented on the GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained parallel computing. Experimental results verify that the developed system can perform 50 fps (frames per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the performance of the proposed technique using an NVIDIA GT560Ti graphics card rather than sequential C code on a 3.4 GHz Intel Core i7 3770.
Wolinski, Christophe Czeslaw [Los Alamos, NM; Gokhale, Maya B [Los Alamos, NM; McCabe, Kevin Peter [Los Alamos, NM
2011-01-18
Fabric-based computing systems and methods are disclosed. A fabric-based computing system can include a polymorphous computing fabric that can be customized on a per application basis and a host processor in communication with said polymorphous computing fabric. The polymorphous computing fabric includes a cellular architecture that can be highly parameterized to enable a customized synthesis of fabric instances for a variety of enhanced application performances thereof. A global memory concept can also be included that provides the host processor random access to all variables and instructions associated with the polymorphous computing fabric.
Architecture for hospital information integration
NASA Astrophysics Data System (ADS)
Chimiak, William J.; Janariz, Daniel L.; Martinez, Ralph
1999-07-01
The integration of hospital information systems (HIS) continues. Data storage systems, data networks and computers improve, data bases grow and health-care applications increase. Some computer operating systems continue to evolve and some fade. Health care delivery now depends on this computer-assisted environment. The result is that the critical harmonization of the various hospital information systems becomes increasingly difficult. The purpose of this paper is to present an architecture for HIS integration that is computer-language-neutral and computer-hardware-neutral for the informatics applications. The proposed architecture builds upon the work done at the University of Arizona on middleware, the work of the National Electrical Manufacturers Association, and the American College of Radiology. It is a fresh approach to allowing applications engineers to access medical data easily and thus concentrate on the application techniques in which they are expert without struggling with medical information syntaxes. The HIS can be modeled using a hierarchy of information sub-systems, thus facilitating its understanding. The architecture includes the resulting information model along with a strict but intuitive application programming interface, managed by CORBA. The CORBA requirement facilitates interoperability. It should also reduce software and hardware development times.
The role of architecture and ontology for interoperability.
Blobel, Bernd; González, Carolina; Oemig, Frank; Lopéz, Diego; Nykänen, Pirkko; Ruotsalainen, Pekka
2010-01-01
Turning from organization-centric to process-controlled or even to personalized approaches, advanced healthcare settings have to meet special interoperability challenges. eHealth and pHealth solutions must assure interoperability between actors cooperating to achieve common business objectives. Here, the interoperability chain includes not only individually tailored technical systems but also sensors and actuators. For enabling corresponding pervasive computing and even autonomic computing, individualized systems have to be based on an architecture framework covering many domains, scientifically managed by specialized disciplines using their specific ontologies in a formalized way. Therefore, interoperability has to advance from a communication protocol to an architecture-centric approach mastering ontology coordination challenges.
2010-06-01
Dates covered: APR 2009 – JAN 2010. Title: EMERGING NEUROMORPHIC COMPUTING ARCHITECTURES AND ENABLING... Abstract: The highly cross-disciplinary emerging field of neuromorphic computing architectures for cognitive information processing applications...belief systems, software, computer engineering, etc. In our effort to develop cognitive systems atop a neuromorphic computing architecture, we explored
Optical systolic solutions of linear algebraic equations
NASA Technical Reports Server (NTRS)
Neuman, C. P.; Casasent, D.
1984-01-01
The philosophy and data encoding possible in a systolic array optical processor (SAOP) are reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix-decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.
Memristor-Based Synapse Design and Training Scheme for Neuromorphic Computing Architecture
2012-06-01
...system level built upon the conventional Von Neumann computer architecture [2][3]. Developing the neuromorphic architecture at chip level by... Contract number: FA8750-11-2-0046. Program element number: 62788F. ...creation of memristor-based neuromorphic computing architecture. Rather than the existing crossbar-based neuron network designs, we focus on memristor
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-12
... design feature associated with the architecture and connectivity capabilities of the airplanes' computer... the comment for an association, business, labor union, etc.). DOT's complete Privacy Act Statement can...; facsimile 425-227-1149. SUPPLEMENTARY INFORMATION: The proposed network architecture includes the following...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-10
... design feature associated with the architecture and connectivity capabilities of the airplanes' computer... vulnerabilities to the airplanes' systems. The proposed network architecture includes the following connectivity.... Operator business and administrative support systems, and 3. Passenger entertainment systems, and access by...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Yao; Balaprakash, Prasanna; Meng, Jiayuan
We present Raexplore, a performance modeling framework for architecture exploration. Raexplore enables rapid, automated, and systematic search of architecture design space by combining hardware counter-based performance characterization and analytical performance modeling. We demonstrate Raexplore for two recent manycore processors, the IBM Blue Gene/Q compute chip and Intel Xeon Phi, targeting a set of scientific applications. Our framework is able to capture complex interactions between architectural components including instruction pipeline, cache, and memory, and to achieve a 3–22% error for same-architecture and cross-architecture performance predictions. Furthermore, we apply our framework to assess the two processors, and discover and evaluate a list of architectural scaling options for future processor designs.
ERIC Educational Resources Information Center
Huston, Rick, Ed.; Armel, Donald, Ed.
Topics addressed by 40 papers from a conference on microcomputers include: developing a campus wide computer ethics policy; integrating new technologies into professional education; campus computer networks; computer assisted instruction; client/server architecture; competencies for entry-level computing positions; auditing and professional…
NASA Technical Reports Server (NTRS)
1972-01-01
The design is reported of an advanced modular computer system designated the Automatically Reconfigurable Modular Multiprocessor System, which anticipates requirements for higher computing capacity and reliability for future spaceborne computers. Subjects discussed include: an overview of the architecture, mission analysis, synchronous and nonsynchronous scheduling control, reliability, and data transmission.
Prospective Architectures for Onboard vs Cloud-Based Decision Making for Unmanned Aerial Systems
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Teubert, Christopher
2017-01-01
This paper investigates prospective architectures for decision-making in unmanned aerial systems. When these unmanned vehicles operate in urban environments, there are several sources of uncertainty that affect their behavior, and decision-making algorithms need to be robust to account for these different sources of uncertainty. It is important to account for several risk-factors that affect the flight of these unmanned systems, and facilitate decision-making by taking into consideration these various risk-factors. In addition, there are several technical challenges related to autonomous flight of unmanned aerial systems; these challenges include sensing, obstacle detection, path planning and navigation, trajectory generation and selection, etc. Many of these activities require significant computational power and in many situations, all of these activities need to be performed in real-time. In order to efficiently integrate these activities, it is important to develop a systematic architecture that can facilitate real-time decision-making. Four prospective architectures are discussed in this paper; on one end of the spectrum, the first architecture considers all activities/computations being performed onboard the vehicle whereas on the other end of the spectrum, the fourth and final architecture considers all activities/computations being performed in the cloud, using a new service known as Prognostics as a Service that is being developed at NASA Ames Research Center. The four different architectures are compared, their advantages and disadvantages are explained and conclusions are presented.
Space station needs, attributes and architectural options study
NASA Technical Reports Server (NTRS)
1983-01-01
All the candidate Technology Development missions investigated during the space station needs, attributes, and architectural options study are described. All the mission data forms, plus additional information such as cost, drawings, functional flows, etc., generated in support of these missions are included with a computer generated mission data form.
Integrating Software Modules For Robot Control
NASA Technical Reports Server (NTRS)
Volpe, Richard A.; Khosla, Pradeep; Stewart, David B.
1993-01-01
Reconfigurable, sensor-based control system uses state variables in systematic integration of reusable control modules. Designed for open-architecture hardware including many general-purpose microprocessors, each having own local memory plus access to global shared memory. Implemented in software as extension of Chimera II real-time operating system. Provides transparent computing mechanism for intertask communication between control modules and generic process-module architecture for multiprocessor realtime computation. Used to control robot arm. Proves useful in variety of other control and robotic applications.
An Object-Oriented Network-Centric Software Architecture for Physical Computing
NASA Astrophysics Data System (ADS)
Palmer, Richard
1997-08-01
Recent developments in object-oriented computer languages and infrastructure such as the Internet, Web browsers, and the like provide an opportunity to define a more productive computational environment for scientific programming that is based more closely on the underlying mathematics describing physics than traditional programming languages such as FORTRAN or C++. In this talk I describe an object-oriented software architecture for representing physical problems that includes classes for such common mathematical objects as geometry, boundary conditions, partial differential and integral equations, discretization and numerical solution methods, etc. In practice, a scientific program written using this architecture looks remarkably like the mathematics used to understand the problem, is typically an order of magnitude smaller than traditional FORTRAN or C++ codes, and hence easier to understand, debug, describe, etc. All objects in this architecture are "network-enabled," which means that components of a software solution to a physical problem can be transparently loaded from anywhere on the Internet or other global network. The architecture is expressed as an "API," or application programmers interface specification, with reference embeddings in Java, Python, and C++. A C++ class library for an early version of this API has been implemented for machines ranging from PCs to the IBM SP2, meaning that identical codes run on all architectures.
Software architecture and engineering for patient records: current and future.
Weng, Chunhua; Levine, Betty A; Mun, Seong K
2009-05-01
During the "The National Forum on the Future of the Defense Health Information System," a track focusing on "Systems Architecture and Software Engineering" included eight presenters. These presenters identified three key areas of interest in this field, which include the need for open enterprise architecture and a federated database design, net centrality based on service-oriented architecture, and the need for focus on software usability and reusability. The eight panelists provided recommendations related to the suitability of service-oriented architecture and the enabling technologies of grid computing and Web 2.0 for building health services research centers and federated data warehouses to facilitate large-scale collaborative health care and research. Finally, they discussed the need to leverage industry best practices for software engineering to facilitate rapid software development, testing, and deployment.
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
Vectorization, threading, and cache-blocking considerations for hydrocodes on emerging architectures
Fung, J.; Aulwes, R. T.; Bement, M. T.; ...
2015-07-14
This work reports on considerations for improving computational performance in preparation for current and expected changes to computer architecture. The algorithms studied will include increasingly complex prototypes for radiation hydrodynamics codes, such as gradient routines and diffusion matrix assembly (e.g., in [1-6]). The meshes considered for the algorithms are structured or unstructured meshes. The considerations applied for performance improvements are meant to be general in terms of architecture (not specifically graphics processing units (GPUs) or multi-core machines, for example) and include techniques for vectorization, threading, tiling, and cache blocking. Out of a survey of optimization techniques on applications such as diffusion and hydrodynamics, we make general recommendations with a view toward making these techniques conceptually accessible to the applications code developer. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
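The tiling and cache-blocking idea named above can be sketched as a loop nest that walks a two-dimensional array one small tile at a time. This is a minimal illustration, not code from the report: the function name `transpose_blocked` and the `tile` parameter are hypothetical, and a matrix transpose stands in for the hydrocode kernels because it shows the loop structure compactly. In pure Python the restructuring brings no real speed-up; the pattern pays off in compiled languages where a tile fits in cache.

```python
def transpose_blocked(matrix, tile=2):
    """Transpose a rectangular list-of-lists by visiting it tile by tile.

    The two outer loops pick a tile; the two inner loops stay inside it,
    so reads and writes remain within a small working set.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[None] * n_rows for _ in range(n_cols)]
    for i0 in range(0, n_rows, tile):            # tile origin, row direction
        for j0 in range(0, n_cols, tile):        # tile origin, column direction
            for i in range(i0, min(i0 + tile, n_rows)):
                for j in range(j0, min(j0 + tile, n_cols)):
                    out[j][i] = matrix[i][j]
    return out


if __name__ == "__main__":
    print(transpose_blocked([[1, 2, 3], [4, 5, 6]]))
```

The `min(...)` bounds handle arrays whose dimensions are not a multiple of the tile size, which is the usual edge case when blocking loops.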
Programming a hillslope water movement model on the MPP
NASA Technical Reports Server (NTRS)
Devaney, J. E.; Irving, A. R.; Camillo, P. J.; Gurney, R. J.
1987-01-01
A physically based numerical model was developed of heat and moisture flow within a hillslope on a parallel architecture computer, as a precursor to a model of a complete catchment. Moisture flow within a catchment includes evaporation, overland flow, flow in unsaturated soil, and flow in saturated soil. Because of the empirical evidence that moisture flow in unsaturated soil is mainly in the vertical direction, flow in the unsaturated zone can be modeled as a series of one dimensional columns. This initial version of the hillslope model includes evaporation and a single column of one dimensional unsaturated zone flow. This case has already been solved on an IBM 3081 computer and is now being applied to the massively parallel processor architecture so as to make the extension to the one dimensional case easier and to check the problems and benefits of using a parallel architecture machine.
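Treating the unsaturated zone as a series of one-dimensional columns means each column can be advanced without data from its neighbours, which is what makes the problem a natural fit for a parallel machine. The sketch below shows only that structure. The update rule is an assumption: a generic explicit diffusion step with zero-flux ends stands in for the authors' soil physics, and the names `step_column`, `step_hillslope`, and `alpha` are hypothetical.

```python
def step_column(theta, alpha=0.1):
    """Advance one vertical column by one explicit diffusion step.

    theta: moisture values from top to bottom (placeholder quantity).
    alpha: dimensionless diffusion number (assumed; keep <= 0.5 for stability).
    The ends use zero-flux boundaries by mirroring the edge value.
    """
    n = len(theta)
    new = list(theta)
    for k in range(n):
        up = theta[k - 1] if k > 0 else theta[k]
        down = theta[k + 1] if k < n - 1 else theta[k]
        new[k] = theta[k] + alpha * (up - 2.0 * theta[k] + down)
    return new


def step_hillslope(columns, alpha=0.1):
    """Advance every column; no column reads another column's data."""
    return [step_column(col, alpha) for col in columns]


if __name__ == "__main__":
    print(step_hillslope([[1.0, 0.0, 0.0], [0.3, 0.3, 0.3]]))
```

Because `step_hillslope` is a plain map over columns, the list comprehension could be replaced by one processing element per column on a parallel architecture without changing the per-column code.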
System for Computer Automated Typesetting (SCAT) of Computer Authored Texts.
ERIC Educational Resources Information Center
Keeler, F. Laurence
This description of the System for Computer Automated Typesetting (SCAT), an automated system for typesetting text and inserting special graphic symbols in programmed instructional materials created by the computer aided authoring system AUTHOR, provides an outline of the design architecture of the system and an overview including the component…
Tutorial: Computer architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gajski, D.D.; Milutinovic, V.M.; Siegel, H.J.
1986-01-01
This book presents the state-of-the-art in advanced computer architecture. It deals with the concepts underlying current architectures and covers approaches and techniques being used in the design of advanced computer systems.
Outline of a novel architecture for cortical computation.
Majumdar, Kaushik
2008-03-01
In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.
NASA Astrophysics Data System (ADS)
Tekin, Tolga; Töpper, Michael; Reichl, Herbert
2009-05-01
Technological frontiers between semiconductor technology, packaging, and system design are disappearing. Scaling down geometries [1] alone does not provide improvement of performance, less power, smaller size, and lower cost. It will require "More than Moore" [2] through the tighter integration of system level components at the package level. System-in-Package (SiP) will deliver the efficient use of three dimensions (3D) through innovation in packaging and interconnect technology. A key bottleneck to the implementation of high-performance microelectronic systems, including SiP, is the lack of low-latency, high-bandwidth, and high-density off-chip interconnects. Some of the challenges in achieving high-bandwidth chip-to-chip communication using electrical interconnects include the high losses in the substrate dielectric, reflections and impedance discontinuities, and susceptibility to crosstalk [3]. Obviously, the incentive for the use of photonics to overcome the challenges and leverage low-latency and high-bandwidth communication will enable the vision of optical computing within next generation architectures. Supercomputers of today offer sustained performance of more than petaflops, which can be increased by utilizing optical interconnects. Next generation computing architectures are needed with ultra-low power consumption and ultra-high performance, with novel interconnection technologies. In this paper we will discuss a CMOS compatible underlying technology to enable next generation optical computing architectures. By introducing a new optical layer within the 3D SiP, the development of converged microsystems, deployment for next generation optical computing architecture will be leveraged.
NASA Technical Reports Server (NTRS)
1983-01-01
Various parameters of the orbital space station are discussed. The space station environment, data management system, communication and tracking, environmental control, and life support system are considered. Specific topics reviewed include crew work stations, restraint systems, stowage, computer hardware, and expert systems.
Improving Conceptual Design for Launch Vehicles
NASA Technical Reports Server (NTRS)
Olds, John R.
1998-01-01
This report summarizes activities performed during the second year of a three year cooperative agreement between NASA - Langley Research Center and Georgia Tech. Year 1 of the project resulted in the creation of a new Cost and Business Assessment Model (CABAM) for estimating the economic performance of advanced reusable launch vehicles including non-recurring costs, recurring costs, and revenue. The current year (second year) activities were focused on the evaluation of automated, collaborative design frameworks (computation architectures or computational frameworks) for automating the design process in advanced space vehicle design. Consistent with NASA's new thrust area in developing and understanding Intelligent Synthesis Environments (ISE), the goals of this year's research efforts were to develop and apply computer integration techniques and near-term computational frameworks for conducting advanced space vehicle design. NASA - Langley (VAB) has taken a lead role in developing web-based computing architectures within which the designer can interact with disciplinary analysis tools through a flexible web interface. The advantages of this approach are: 1) flexible access to the designer interface through a simple web browser (e.g. Netscape Navigator), 2) ability to include existing 'legacy' codes, and 3) ability to include distributed analysis tools running on remote computers. To date, VAB's internal emphasis has been on developing this test system for the planetary entry mission under the joint Integrated Design System (IDS) program with NASA - Ames and JPL. Georgia Tech's complementary goals this year were to: 1) examine an alternate 'custom' computational architecture for the three-discipline IDS planetary entry problem to assess the advantages and disadvantages relative to the web-based approach, and 2) develop and examine a web-based interface and framework for a typical launch vehicle design problem.
Pedretti, Kevin
2008-11-18
A compute processor allocator architecture for allocating compute processors to run applications in a multiple processor computing apparatus is distributed among a subset of processors within the computing apparatus. Each processor of the subset includes a compute processor allocator. The compute processor allocators can share a common database of information pertinent to compute processor allocation. A communication path permits retrieval of information from the database independently of the compute processor allocators.
PEM-PCA: a parallel expectation-maximization PCA face recognition architecture.
Rujirakul, Kanokmon; So-In, Chakchai; Arnonkijpanich, Banchar
2014-01-01
Principal component analysis or PCA has been traditionally used as one of the feature extraction techniques in face recognition systems yielding high accuracy when requiring a small number of features. However, the covariance matrix and eigenvalue decomposition stages cause high computational complexity, especially for a large database. Thus, this research presents an alternative approach utilizing an Expectation-Maximization algorithm to reduce the determinant matrix manipulation resulting in the reduction of the stages' complexity. To improve the computational time, a novel parallel architecture was employed to utilize the benefits of parallelization of matrix computation during the feature extraction and classification stages, including parallel preprocessing, and their combinations; the result is called the Parallel Expectation-Maximization PCA (PEM-PCA) architecture. Compared with traditional PCA and its derivatives, the results indicate lower complexity with an insignificant difference in recognition precision, leading to high-speed face recognition systems, with speed-ups of more than nine and three times over PCA and Parallel PCA, respectively.
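To make the idea of replacing an eigenvalue decomposition with Expectation-Maximization concrete, the sketch below estimates only the first principal direction by alternating two cheap steps. It is an assumption-laden illustration and not the PEM-PCA method itself: it uses the widely known EM formulation for PCA restricted to one component, runs serially, and the function name `em_first_component`, the starting vector, and the iteration count are all hypothetical choices.

```python
import math


def em_first_component(rows, iterations=50):
    """Estimate the leading principal direction of `rows` with EM.

    rows: list of equal-length numeric lists (one sample per row).
    Returns a unit-length direction vector; its sign is arbitrary.
    No covariance matrix or eigendecomposition is formed.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    w = [1.0] * d  # arbitrary non-zero starting direction (assumption)
    for _ in range(iterations):
        ww = sum(v * v for v in w)
        # E-step: project every sample onto the current direction.
        x = [sum(wk * yk for wk, yk in zip(w, y)) / ww for y in centered]
        xx = sum(v * v for v in x)
        if xx == 0.0:
            break  # start was orthogonal to the data; nothing to update
        # M-step: direction that best reconstructs the data from x.
        w = [sum(xi * y[k] for xi, y in zip(x, centered)) / xx
             for k in range(d)]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


if __name__ == "__main__":
    print(em_first_component([[1, 2], [2, 4], [3, 6], [4, 8]]))
```

Both steps are sums over samples, so each could be split across processors and combined afterwards, which is the kind of matrix-computation parallelism the abstract refers to.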
Job Superscheduler Architecture and Performance in Computational Grid Environments
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Oliker, Leonid; Biswas, Rupak
2003-01-01
Computational grids hold great promise in utilizing geographically separated heterogeneous resources to solve large-scale complex scientific problems. However, a number of major technical hurdles, including distributed resource management and effective job scheduling, stand in the way of realizing these gains. In this paper, we propose a novel grid superscheduler architecture and three distributed job migration algorithms. We also model the critical interaction between the superscheduler and autonomous local schedulers. Extensive performance comparisons with ideal, central, and local schemes using real workloads from leading computational centers are conducted in a simulation environment. Additionally, synthetic workloads are used to perform a detailed sensitivity analysis of our superscheduler. Several key metrics demonstrate that substantial performance gains can be achieved via smart superscheduling in distributed computational grids.
Integrating the Apache Big Data Stack with HPC for Big Data
NASA Astrophysics Data System (ADS)
Fox, G. C.; Qiu, J.; Jha, S.
2014-12-01
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data intensive computing, even though commercially clouds devote much more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC and ABDS, the Apache big data software stack that is well used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL - Scalable Parallel Interoperable Data Analytics Library -- built on system and data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.
Developing Information Power Grid Based Algorithms and Software
NASA Technical Reports Server (NTRS)
Dongarra, Jack
1998-01-01
This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Hsien-Hsin S
The overall objective of this research project is to develop novel architectural techniques as well as system software to achieve a highly secure and intrusion-tolerant computing system. Such a system will be autonomous, self-adapting, and introspective, with self-healing capability under the circumstances of improper operations, abnormal workloads, and malicious attacks. The scope of this research includes: (1) System-wide, unified introspection techniques for autonomic systems, (2) Secure information-flow microarchitecture, (3) Memory-centric security architecture, (4) Authentication control and its implication to security, (5) Digital rights management, (6) Microarchitectural denial-of-service attacks on shared resources. During the period of the project, we developed several architectural techniques and system software for achieving a robust, secure, and reliable computing system toward our goal.
Thrifty: An Exascale Architecture for Energy Proportional Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torrellas, Josep
2014-12-23
The objective of this project is to design different aspects of a novel exascale architecture called Thrifty. Our goal is to focus on the challenges of power/energy efficiency, performance, and resiliency in exascale systems. The project includes work on computer architecture (Josep Torrellas from University of Illinois), compilation (Daniel Quinlan from Lawrence Livermore National Laboratory), runtime and applications (Laura Carrington from University of California San Diego), and circuits (Wilfred Pinfold from Intel Corporation). In this report, we focus on the progress at the University of Illinois during the last year of the grant (September 1, 2013 to August 31, 2014). We also point to the progress in the other collaborating institutions when needed.
Code of Federal Regulations, 2010 CFR
2010-10-01
... through prolonged study. Examples of these professions include accountancy, actuarial computation, architecture, dentistry, engineering, law, medicine, nursing, pharmacy, the sciences (such as biology...
Performance Analysis of Distributed Object-Oriented Applications
NASA Technical Reports Server (NTRS)
Schoeffler, James D.
1998-01-01
The purpose of this research was to evaluate the efficiency of a distributed simulation architecture which creates individual modules which are made self-scheduling through the use of a message-based communication system used for requesting input data from another module which is the source of that data. To make the architecture as general as possible, the message-based communication architecture was implemented using standard remote object architectures (Common Object Request Broker Architecture (CORBA) and/or Distributed Component Object Model (DCOM)). A series of experiments were run in which different systems are distributed in a variety of ways across multiple computers and the performance evaluated. The experiments were duplicated in each case so that the overhead due to message communication and data transmission can be separated from the time required to actually perform the computational update of a module each iteration. The software used to distribute the modules across multiple computers was developed in the first year of the current grant and was modified considerably to add a message-based communication scheme supported by the DCOM distributed object architecture. The resulting performance was analyzed using a model created during the first year of this grant which predicts the overhead due to CORBA and DCOM remote procedure calls and includes the effects of data passed to and from the remote objects. A report covering the distributed simulation software and the results of the performance experiments has been submitted separately. The above report also discusses possible future work to apply the methodology to dynamically distribute the simulation modules so as to minimize overall computation time.
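The separation of communication overhead from computation time described above comes down to simple arithmetic on two timed runs. The sketch below assumes, hypothetically, that the duplicated experiment exchanges the same messages but skips the module update; the abstract does not say how the duplicate runs were configured, and the name `split_iteration_time` and its parameters are illustrative only.

```python
def split_iteration_time(full_run_s, comm_only_run_s, iterations):
    """Split measured wall time into per-iteration overhead and compute.

    full_run_s:      seconds for the run that communicates and computes.
    comm_only_run_s: seconds for the assumed duplicate run that only
                     exchanges messages and data (no module update).
    iterations:      number of simulation iterations in each run.
    Returns (overhead_per_iteration, compute_per_iteration) in seconds.
    """
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    overhead = comm_only_run_s / iterations
    compute = (full_run_s - comm_only_run_s) / iterations
    return overhead, compute


if __name__ == "__main__":
    overhead, compute = split_iteration_time(12.0, 3.0, 100)
    print(f"overhead per iteration: {overhead:.3f} s")
    print(f"compute per iteration:  {compute:.3f} s")
```

Comparing the two per-iteration figures for each way of distributing the modules shows whether spreading work across more computers saves more compute time than it costs in remote calls.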
Supporting Undergraduate Computer Architecture Students Using a Visual MIPS64 CPU Simulator
ERIC Educational Resources Information Center
Patti, D.; Spadaccini, A.; Palesi, M.; Fazzino, F.; Catania, V.
2012-01-01
The topics of computer architecture are always taught using an Assembly dialect as an example. The most commonly used textbooks in this field use the MIPS64 Instruction Set Architecture (ISA) to help students in learning the fundamentals of computer architecture because of its orthogonality and its suitability for real-world applications. This…
Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques
2013-03-01
MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES AND CIRCUIT TECHNIQUES POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY...TECHNICAL REPORT 3. DATES COVERED (From - To) OCT 2010 – OCT 2012 4. TITLE AND SUBTITLE MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES...schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated
VENI, video, VICI: The merging of computer and video technologies
NASA Technical Reports Server (NTRS)
Horowitz, Jay G.
1993-01-01
The topics covered include the following: High Definition Television (HDTV) milestones; visual information bandwidth; television frequency allocation and bandwidth; horizontal scanning; workstation RGB color domain; NTSC color domain; American HDTV time-table; HDTV image size; digital HDTV hierarchy; task force on digital image architecture; open architecture model; future displays; and the ULTIMATE imaging system.
Submicron Systems Architecture Project
1981-11-01
This project is concerned with the architecture, design, and testing of VLSI Systems. The principal activities in this report period include: The Tree Machine; COPE, The Homogeneous Machine; Computational Arrays; Switch-Level Model for MOS Logic Design; Testing; Local Network and Designer Workstations; Self-timed Systems; Characterization of Deadlock Free Resource Contention; Concurrency Algebra; Language Design and Logic for Program Verification.
Brain architecture: a design for natural computation.
Kaiser, Marcus
2007-12-15
Fifty years ago, John von Neumann compared the architecture of the brain with that of the computers he invented and which are still in use today. In those days, the organization of computers was based on concepts of brain organization. Here, we give an update on current results on the global organization of neural systems. For neural systems, we outline how the spatial and topological architecture of neuronal and cortical networks facilitates robustness against failures, fast processing and balanced network activation. Finally, we discuss mechanisms of self-organization for such architectures. After all, the organization of the brain might again inspire computer architecture.
A Standard Platform for Testing and Comparison of MDAO Architectures
NASA Technical Reports Server (NTRS)
Gray, Justin S.; Moore, Kenneth T.; Hearn, Tristan A.; Naylor, Bret A.
2012-01-01
The Multidisciplinary Design Analysis and Optimization (MDAO) community has developed a multitude of algorithms and techniques, called architectures, for performing optimizations on complex engineering systems which involve coupling between multiple discipline analyses. These architectures seek to efficiently handle optimizations with computationally expensive analyses including multiple disciplines. We propose a new testing procedure that can provide a quantitative and qualitative means of comparison among architectures. The proposed test procedure is implemented within the open source framework, OpenMDAO, and comparative results are presented for five well-known architectures: MDF, IDF, CO, BLISS, and BLISS-2000. We also demonstrate how using open source software development methods can allow the MDAO community to submit new problems and architectures to keep the test suite relevant.
A new software-based architecture for quantum computer
NASA Astrophysics Data System (ADS)
Wu, Nan; Song, FangMin; Li, Xiangdong
2010-04-01
In this paper, we study a reliable architecture of a quantum computer and a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of the quantum computing. We also try to address some key issues in detail in the software-driven universal quantum computers.
NASA Astrophysics Data System (ADS)
Titov, A. G.; Okladnikov, I. G.; Gordov, E. P.
2017-11-01
The use of large geospatial datasets in climate change studies requires the development of a set of Spatial Data Infrastructure (SDI) elements, including geoprocessing and cartographical visualization web services. This paper presents the architecture of a geospatial OGC web service system as an integral part of a virtual research environment (VRE) general architecture for statistical processing and visualization of meteorological and climatic data. The architecture is a set of interconnected standalone SDI nodes with corresponding data storage systems. Each node runs specialized software, such as a geoportal, cartographical web services (WMS/WFS), a metadata catalog, and a MySQL database of technical metadata describing geospatial datasets available for the node. It also contains geospatial data processing services (WPS) based on a modular computing backend realizing statistical processing functionality and, thus, providing analysis of large datasets with the results of visualization and export into files of standard formats (XML, binary, etc.). Some cartographical web services have been developed in a system’s prototype to provide capabilities to work with raster and vector geospatial data based on OGC web services. The distributed architecture presented allows easy addition of new nodes, computing and data storage systems, and provides a solid computational infrastructure for regional climate change studies based on modern Web and GIS technologies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Keyes, D.; McInnes, L. C.; Woodward, C.
This report is an outcome of the workshop Multiphysics Simulations: Challenges and Opportunities, sponsored by the Institute of Computing in Science (ICiS). Additional information about the workshop, including relevant reading and presentations on multiphysics issues in applications, algorithms, and software, is available via https://sites.google.com/site/icismultiphysics2011/. We consider multiphysics applications from algorithmic and architectural perspectives, where 'algorithmic' includes both mathematical analysis and computational complexity and 'architectural' includes both software and hardware environments. Many diverse multiphysics applications can be reduced, en route to their computational simulation, to a common algebraic coupling paradigm. Mathematical analysis of multiphysics coupling in this form is not always practical for realistic applications, but model problems representative of applications discussed herein can provide insight. A variety of software frameworks for multiphysics applications have been constructed and refined within disciplinary communities and executed on leading-edge computer systems. We examine several of these, expose some commonalities among them, and attempt to extrapolate best practices to future systems. From our study, we summarize challenges and forecast opportunities. We also initiate a modest suite of test problems encompassing features present in many applications.
NASA Technical Reports Server (NTRS)
Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas
2008-01-01
A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor-memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system is comprised of a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.
Why is a computational framework for motivational and metacognitive control needed?
NASA Astrophysics Data System (ADS)
Sun, Ron
2018-01-01
This paper discusses, in the context of computational modelling and simulation of cognition, the relevance of deeper structures in the control of behaviour. Such deeper structures include motivational control of behaviour, which provides underlying causes for actions, and also metacognitive control, which provides higher-order processes for monitoring and regulation. It is argued that such deeper structures are important and thus cannot be ignored in computational cognitive architectures. A general framework based on the Clarion cognitive architecture is outlined that emphasises the interaction amongst action selection, motivation, and metacognition. The upshot is that it is necessary to incorporate all essential processes; short of that, the understanding of cognition can only be incomplete.
Design and optimization of a portable LQCD Monte Carlo code using OpenACC
NASA Astrophysics Data System (ADS)
Bonati, Claudio; Coscetti, Simone; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Calore, Enrico; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele
The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPU processors, supporting a wide class of applications but delivering moderate computing performance, to many-core Graphics Processor Units (GPUs), exploiting aggressive data-parallelism and delivering higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing, where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work, we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how codes should be mapped onto the target architecture. We describe the implementation of a code fully written in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing with GPU-specific implementations and showing that a good level of performance-portability can be reached.
NASA Astrophysics Data System (ADS)
Ford, Eric B.; Dindar, Saleh; Peters, Jorg
2015-08-01
The realism of astrophysical simulations and statistical analyses of astronomical data are set by the available computational resources. Thus, astronomers and astrophysicists are constantly pushing the limits of computational capabilities. For decades, astronomers benefited from massive improvements in computational power that were driven primarily by increasing clock speeds and required relatively little attention to details of the computational hardware. For nearly a decade, increases in computational capabilities have come primarily from increasing the degree of parallelism, rather than increasing clock speeds. Further increases in computational capabilities will likely be led by many-core architectures such as Graphical Processing Units (GPUs) and Intel Xeon Phi. Successfully harnessing these new architectures requires significantly more understanding of the hardware architecture, cache hierarchy, compiler capabilities and network characteristics. I will provide an astronomer's overview of the opportunities and challenges provided by modern many-core architectures and elastic cloud computing. The primary goal is to help an astronomical audience understand what types of problems are likely to yield more than an order of magnitude speed-up and which problems are unlikely to parallelize efficiently enough to be worth the development time and/or costs. I will draw on my experience leading a team in developing the Swarm-NG library for parallel integration of large ensembles of small n-body systems on GPUs, as well as several smaller software projects. I will share lessons learned from collaborating with computer scientists, including both technical and soft skills. Finally, I will discuss the challenges of training the next generation of astronomers to be proficient in this new era of high-performance computing, drawing on experience teaching a graduate class on High-Performance Scientific Computing for Astrophysics and organizing a 2014 advanced summer school on Bayesian Computing for Astronomical Data Analysis with support of the Penn State Center for Astrostatistics and Institute for CyberScience.
Architectures for single-chip image computing
NASA Astrophysics Data System (ADS)
Gove, Robert J.
1992-04-01
This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.
Transitioning ISR architecture into the cloud
NASA Astrophysics Data System (ADS)
Lash, Thomas D.
2012-06-01
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
NASA Astrophysics Data System (ADS)
Lhamon, Michael Earl
A pattern recognition system which uses complex correlation filter banks requires proportionally more computational effort than single real-valued filters. This increases the computational burden but also introduces a higher level of parallelism that common computing platforms fail to identify. As a result, we consider algorithm mapping to both optical and digital processors. For digital implementation, we develop computationally efficient pattern recognition algorithms, referred to as vector inner product operators, that require less computational effort than traditional fast Fourier methods. These algorithms do not need correlation and they map readily onto parallel digital architectures, which imply new architectures for optical processors. These filters exploit circulant-symmetric matrix structures of the training set data representing a variety of distortions. By using the same mathematical basis as with the vector inner product operations, we are able to extend the capabilities of more traditional correlation filtering to what we refer to as "Super Images". These "Super Images" are used to morphologically transform a complicated input scene into a predetermined dot pattern. The orientation of the dot pattern is related to the rotational distortion of the object of interest. The optical implementation of "Super Images" yields the feature reduction necessary for using other techniques, such as artificial neural networks. We propose a parallel digital signal processor architecture based on specific pattern recognition algorithms but general enough to be applicable to other similar problems. Such an architecture is classified as a data flow architecture. Instead of mapping an algorithm to an architecture, we propose mapping the DSP architecture to a class of pattern recognition algorithms. Today's optical processing systems have difficulties implementing full complex filter structures. Typically, optical systems (like the 4f correlators) are limited to phase-only implementation with lower detection performance than full complex electronic systems. Our study includes pseudo-random pixel encoding techniques for approximating full complex filtering. Optical filter bank implementation is possible, and it has the advantage of time averaging the entire filter bank at real-time rates. Time-averaged optical filtering is computationally comparable to billions of digital operations per second. For this reason, we believe future trends in high-speed pattern recognition will involve hybrid architectures of both optical and DSP elements.
Satellite on-board processing for earth resources data
NASA Technical Reports Server (NTRS)
Bodenheimer, R. E.; Gonzalez, R. C.; Gupta, J. N.; Hwang, K.; Rochelle, R. W.; Wilson, J. B.; Wintz, P. A.
1975-01-01
Results of a survey of earth resources user applications and their data requirements, earth resources multispectral scanner sensor technology, and preprocessing algorithms for correcting the sensor outputs and for data bulk reduction are presented along with a candidate data format. The computational requirements for implementing the data analysis algorithms are included along with a review of computer architectures and organizations. Computer architectures capable of handling the algorithm computational requirements are suggested and the environmental effects of an on-board processor discussed. By relating performance parameters to the system requirements for each user, the feasibility of on-board processing is determined for each user. A tradeoff analysis is performed to determine the sensitivity of results to each of the system parameters. Significant results and conclusions are discussed, and recommendations are presented.
NASA Technical Reports Server (NTRS)
1998-01-01
Under a NASA SBIR (Small Business Innovative Research) contract, (NAS5-30905), EAI Simulation Associates, Inc., developed a new digital simulation computer, Starlight(tm). With an architecture based on the analog model of computation, Starlight(tm) outperforms all other computers on a wide range of continuous system simulation. This system is used in a variety of applications, including aerospace, automotive, electric power and chemical reactors.
Advanced computer architecture specification for automated weld systems
NASA Technical Reports Server (NTRS)
Katsinis, Constantine
1994-01-01
This report describes the requirements for an advanced automated weld system and the associated computer architecture, and defines the overall system specification from a broad perspective. According to the requirements of welding procedures as they relate to an integrated multiaxis motion control and sensor architecture, the computer system requirements are developed based on a proven multiple-processor architecture with an expandable, distributed-memory, single global bus architecture, containing individual processors which are assigned to specific tasks that support sensor or control processes. The specified architecture is sufficiently flexible to integrate previously developed equipment, be upgradable and allow on-site modifications.
Space Ultrareliable Modular Computer (SUMC) instruction simulator
NASA Technical Reports Server (NTRS)
Curran, R. T.
1972-01-01
The design principles, description, functional operation, and recommended expansion and enhancements are presented for the Space Ultrareliable Modular Computer interpretive simulator. Included as appendices are the user's manual, program module descriptions, target instruction descriptions, simulator source program listing, and a sample program printout. In discussing the design and operation of the simulator, the key problems involving host computer independence and target computer architectural scope are brought into focus.
Recursive computer architecture for VLSI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treleaven, P.C.; Hopkins, R.P.
1982-01-01
A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is restricted to simple, identical microcomputers, each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems is termed fifth generation computers by the Japanese. 30 references.
Integrated command, control, communications and computation system functional architecture
NASA Technical Reports Server (NTRS)
Cooley, C. G.; Gilbert, L. E.
1981-01-01
The functional architecture for an integrated command, control, communications, and computation system applicable to the command and control portion of the NASA End-to-End Data System is described, including the downlink data processing and analysis functions required to support the uplink processes. The functional architecture is composed of four elements: (1) the functional hierarchy which provides the decomposition and allocation of the command and control functions to the system elements; (2) the key system features which summarize the major system capabilities; (3) the operational activity threads which illustrate the interrelationship between the system elements; and (4) the interfaces which illustrate those elements that originate or generate data and those elements that use the data. The interfaces also provide a description of the data and the data utilization and access techniques.
Production Level CFD Code Acceleration for Hybrid Many-Core Architectures
NASA Technical Reports Server (NTRS)
Duffy, Austen C.; Hammond, Dana P.; Nielsen, Eric J.
2012-01-01
In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of the time.
Adiabatic Quantum Computing via the Rydberg Blockade
NASA Astrophysics Data System (ADS)
Keating, Tyler; Goyal, Krittika; Deutsch, Ivan
2012-06-01
We study an architecture for implementing adiabatic quantum computation with trapped neutral atoms. Ground state atoms are dressed by laser fields in a manner conditional on the Rydberg blockade mechanism, thereby providing the requisite entangling interactions. As a benchmark we study the performance of a Quadratic Unconstrained Binary Optimization (QUBO) problem whose solution is found in the ground state spin configuration of an Ising-like model. We model a realistic architecture, including the effects of magnetic level structure, with qubits encoded into the clock states of ^133Cs, effective B-fields implemented through microwaves and light shifts, and atom-atom coupling achieved by excitation to a high-lying Rydberg level. Including the fundamental effects of photon scattering we find a high fidelity for the two-qubit implementation.
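The link between a QUBO problem and an Ising-like ground state mentioned in this abstract can be illustrated with a small sketch. It assumes the common substitution x = (1 + s) / 2 between binary variables x and spins s, and an upper-triangular coefficient matrix; the function names and the 2-variable example matrix are hypothetical and are not taken from the paper.

```python
from itertools import product


def qubo_energy(q, x):
    """Energy sum_{i<=j} q[i][j]*x[i]*x[j] for binary x (upper-triangular q)."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def qubo_to_ising(q):
    """Map an upper-triangular QUBO to Ising (h, coupling, offset) via x = (1 + s) / 2."""
    n = len(q)
    h = [q[i][i] / 2 for i in range(n)]          # local fields from diagonal terms
    offset = sum(q[i][i] for i in range(n)) / 2  # constant energy shift
    coupling = {}
    for i in range(n):
        for j in range(i + 1, n):
            coupling[(i, j)] = q[i][j] / 4
            h[i] += q[i][j] / 4
            h[j] += q[i][j] / 4
            offset += q[i][j] / 4
    return h, coupling, offset


def ising_energy(h, coupling, offset, s):
    """Energy of spin configuration s (entries -1 or +1)."""
    field = sum(h_i * s_i for h_i, s_i in zip(h, s))
    pair = sum(j_ij * s[i] * s[j] for (i, j), j_ij in coupling.items())
    return field + pair + offset


def brute_force_ground_state(q):
    """Exhaustively find the binary vector with the lowest QUBO energy."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: qubo_energy(q, x))


# Hypothetical 2-variable example.
Q_EXAMPLE = [[-1, 2], [0, -2]]
GROUND = brute_force_ground_state(Q_EXAMPLE)
```

Exhaustive search is only practical for a handful of variables; it is used here purely to check that the two energy functions agree on every configuration.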
Application of multigrid methods to the solution of liquid crystal equations on a SIMD computer
NASA Technical Reports Server (NTRS)
Farrell, Paul A.; Ruttan, Arden; Zeller, Reinhardt R.
1993-01-01
We will describe a finite difference code for computing the equilibrium configurations of the order-parameter tensor field for nematic liquid crystals in rectangular regions by minimization of the Landau-de Gennes Free Energy functional. The implementation of the free energy functional described here includes magnetic fields, quadratic gradient terms, and scalar bulk terms through the fourth order. Boundary conditions include the effects of strong surface anchoring. The target architectures for our implementation are SIMD machines, with interconnection networks which can be configured as 2 or 3 dimensional grids, such as the Wavetracer DTC. We also discuss the relative efficiency of a number of iterative methods for the solution of the linear systems arising from this discretization on such architectures.
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan
2006-01-01
Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefront sensors, such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures requires numerous two-dimensional Fourier transforms, which necessitate all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64-node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
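Why a distributed two-dimensional Fourier transform needs all-to-all communication can be seen from the standard row-column decomposition, sketched below in a single process. This is a generic textbook decomposition using a naive DFT, not the authors' implementation; the function names are illustrative, and the transpose stands in for the data exchange between nodes.

```python
import cmath


def dft(x):
    """Naive O(n^2) 1-D discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def transpose(m):
    """Swap rows and columns; on a cluster this is the all-to-all exchange."""
    return [list(col) for col in zip(*m)]


def dft2_row_column(a):
    """2-D DFT as row transforms, a transpose, row transforms, a transpose."""
    rows_done = [dft(row) for row in a]      # each node transforms the rows it owns
    cols_as_rows = transpose(rows_done)      # every node needs data from every other
    cols_done = [dft(row) for row in cols_as_rows]
    return transpose(cols_done)


def dft2_direct(a):
    """2-D DFT straight from the definition, used only as a reference."""
    n_rows, n_cols = len(a), len(a[0])
    return [
        [
            sum(
                a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / n_rows + v * c / n_cols))
                for r in range(n_rows)
                for c in range(n_cols)
            )
            for v in range(n_cols)
        ]
        for u in range(n_rows)
    ]
```

If each node holds a block of rows, the first pass is purely local, but the second pass needs whole columns, so every node must send part of its data to every other node.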
Mark 4A antenna control system data handling architecture study
NASA Technical Reports Server (NTRS)
Briggs, H. C.; Eldred, D. B.
1991-01-01
A high-level review was conducted to provide an analysis of the existing architecture used to handle data and implement control algorithms for NASA's Deep Space Network (DSN) antennas and to make system-level recommendations for improving this architecture so that the DSN antennas can support the ever-tightening requirements of the next decade and beyond. It was found that the existing system is seriously overloaded, with processor utilization approaching 100 percent. A number of factors contribute to this overloading, including dated hardware, inefficient software, and a message-passing strategy that depends on serial connections between machines. At the same time, the system has shortcomings and idiosyncrasies that require extensive human intervention. A custom operating system kernel and an obscure programming language exacerbate the problems and should be modernized. A new architecture is presented that addresses these and other issues. Key features of the new architecture include a simplified message passing hierarchy that utilizes a high-speed local area network, redesign of particular processing function algorithms, consolidation of functions, and implementation of the architecture in modern hardware and software using mainstream computer languages and operating systems. The system would also allow incremental hardware improvements as better and faster hardware for such systems becomes available, and costs could potentially be low enough that redundancy would be provided economically. Such a system could support DSN requirements for the foreseeable future, though thorough consideration must be given to hard computational requirements, porting existing software functionality to the new system, and issues of fault tolerance and recovery.
Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS
NASA Astrophysics Data System (ADS)
Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.
2018-03-01
Simulation of breaking waves using the Navier-Stokes equations via the moving particle semi-implicit (MPS) method over a closed domain is presented. The results show that parallel computing on a multicore architecture using OpenMP can reduce the computational time to almost half of the serial time. Two computer architectures (AMD and Intel) are compared. The Intel architecture performs better than the AMD architecture in CPU time. However, in efficiency, the AMD architecture scores slightly higher than the Intel. For the simulation with 1512 particles, the CPU times using Intel and AMD are 12662.47 and 28282.30, respectively. Moreover, for a similar number of particles, the efficiency reaches 50.09% on AMD and 49.42% on Intel.
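The speedup and efficiency figures in this abstract presumably follow the usual definitions, which can be sketched as follows. The function names are illustrative, and the thread count in the worked example is a hypothetical value: the abstract does not state how many threads were used or the units of the reported CPU times.

```python
def speedup(serial_time, parallel_time):
    """Ratio of serial to parallel run time (larger is better)."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, n_threads):
    """Speedup divided by the thread count, as a fraction of the ideal."""
    return speedup(serial_time, parallel_time) / n_threads


# CPU times reported in the abstract for 1512 particles (units not stated).
intel_time = 12662.47
amd_time = 28282.30
# Ratio of the two reported CPU times; roughly 2.23.
ratio = amd_time / intel_time

# Hypothetical example: halving the serial time on 4 threads gives a
# speedup of 2 and an efficiency of 0.5 (50 %).
example_eff = efficiency(100.0, 50.0, 4)
```

Under these definitions, a machine can be slower in absolute CPU time and still show slightly higher parallel efficiency, which is the pattern the abstract reports for AMD versus Intel.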
A synchronized computational architecture for generalized bilateral control of robot arms
NASA Technical Reports Server (NTRS)
Bejczy, Antal K.; Szakaly, Zoltan
1987-01-01
This paper describes a computational architecture for an interconnected high-speed distributed computing system for generalized bilateral control of robot arms. The key method of the architecture is the use of fully synchronized, interrupt-driven software. Since an objective of the development is to utilize the processing resources efficiently, the synchronization is done at the hardware level to reduce system software overhead. The architecture also achieves a balanced load on the communication channel. The paper also describes some architectural relations to trading or sharing manual and automatic control.
Remote voice training: A case study on space shuttle applications, appendix C
NASA Technical Reports Server (NTRS)
Mollakarimi, Cindy; Hamid, Tamin
1990-01-01
The Tile Automation System includes applications of automation and robotics technology to all aspects of the Shuttle tile processing and inspection system. An integrated set of rapid prototyping testbeds was developed which include speech recognition and synthesis, laser imaging systems, distributed Ada programming environments, distributed relational data base architectures, distributed computer network architectures, multi-media workbenches, and human factors considerations. Remote voice training in the Tile Automation System is discussed. The user is prompted over a headset by synthesized speech for the training sequences. The voice recognition units and the voice output units are remote from the user and are connected by Ethernet to the main computer system. A supervisory channel is used to monitor the training sequences. Discussions include the training approaches as well as the human factors problems and solutions for this system utilizing remote training techniques.
Deep learning with coherent nanophotonic circuits
NASA Astrophysics Data System (ADS)
Shen, Yichen; Harris, Nicholas C.; Skirlo, Scott; Prabhu, Mihika; Baehr-Jones, Tom; Hochberg, Michael; Sun, Xin; Zhao, Shijie; Larochelle, Hugo; Englund, Dirk; Soljačić, Marin
2017-07-01
Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today's computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach-Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.
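As a rough illustration of the building block named in this abstract, a single Mach-Zehnder interferometer can be modelled as a 2x2 unitary acting on two optical modes. The sketch below assumes one common textbook convention (ideal 50:50 beam splitters and a phase shifter on the upper arm); the paper's own parameterization may differ, and all names here are illustrative.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def phase_shifter(angle):
    """Phase delay applied to the upper mode only (assumed convention)."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


# Ideal lossless 50:50 beam splitter (assumed convention).
BEAM_SPLITTER = [
    [1 / math.sqrt(2), 1j / math.sqrt(2)],
    [1j / math.sqrt(2), 1 / math.sqrt(2)],
]


def mzi(theta, phi):
    """Transfer matrix: phase phi, splitter, phase theta, splitter (applied in that order)."""
    m = matmul2(BEAM_SPLITTER, phase_shifter(phi))
    m = matmul2(phase_shifter(theta), m)
    return matmul2(BEAM_SPLITTER, m)


def output_powers(u, amplitudes):
    """Optical power in each output mode for given input amplitudes."""
    out = [u[i][0] * amplitudes[0] + u[i][1] * amplitudes[1] for i in range(2)]
    return [abs(v) ** 2 for v in out]
```

With this convention, light entering the upper port leaves with power sin^2(theta/2) in the upper output and cos^2(theta/2) in the lower one, so theta sets the splitting ratio while phi only adjusts a relative input phase.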
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Stocker, John C.; Golomb, Andrew M.
2011-01-01
Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
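The discrete event simulation approach described in this abstract can be illustrated with a very small model: service requests drawn from a probability distribution compete for a single constrained resource. The sketch below is a generic single-server FIFO queue with exponential arrival and service times; it is not the authors' two-part model, and the function name, rates, and seed are assumptions chosen for illustration.

```python
import heapq
import random
from collections import deque


def simulate_queue(arrival_rate, service_rate, n_requests, seed=0):
    """Event-driven simulation of one FIFO server; returns mean wait in queue."""
    rng = random.Random(seed)
    events = []  # min-heap of (time, sequence number, kind)
    t = 0.0
    for i in range(n_requests):
        t += rng.expovariate(arrival_rate)  # demand: random inter-arrival times
        heapq.heappush(events, (t, i, "arrival"))
    seq = n_requests
    waiting = deque()   # arrival times of requests not yet in service
    busy = False        # resource constraint: a single server
    total_wait = 0.0
    served = 0
    while events:
        now, _, kind = heapq.heappop(events)
        if kind == "arrival":
            waiting.append(now)
        else:  # a departure frees the server
            busy = False
        if not busy and waiting:
            arrived = waiting.popleft()
            total_wait += now - arrived
            served += 1
            busy = True
            done = now + rng.expovariate(service_rate)
            heapq.heappush(events, (done, seq, "departure"))
            seq += 1
    return total_wait / served


# Hypothetical comparison: the same server under light and heavy demand.
light_load_wait = simulate_queue(arrival_rate=0.5, service_rate=1.0, n_requests=2000)
heavy_load_wait = simulate_queue(arrival_rate=0.9, service_rate=1.0, n_requests=2000)
```

Running the same model at different demand levels gives the kind of quantitative comparison the abstract describes, here reduced to one metric (mean time spent waiting for the server).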
Colt: an experiment in wormhole run-time reconfiguration
NASA Astrophysics Data System (ADS)
Bittner, Ray; Athanas, Peter M.; Musgrove, Mark
1996-10-01
Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
Avionics System Architecture Tool
NASA Technical Reports Server (NTRS)
Chau, Savio; Hall, Ronald; Traylor, Marcus; Whitfield, Adrian
2005-01-01
Avionics System Architecture Tool (ASAT) is a computer program intended for use during the avionics-system-architecture-design phase of the process of designing a spacecraft for a specific mission. ASAT enables simulation of the dynamics of the command-and-data-handling functions of the spacecraft avionics in the scenarios in which the spacecraft is expected to operate. ASAT is built upon I-Logix Statemate MAGNUM, providing a complement of dynamic system modeling tools, including a graphical user interface (GUI), modeling checking capabilities, and a simulation engine. ASAT augments this with a library of predefined avionics components and additional software to support building and analyzing avionics hardware architectures using these components.
Architectural Specialization for Inter-Iteration Loop Dependence Patterns
2015-10-01
Architectural Specialization for Inter-Iteration Loop Dependence Patterns Christopher Batten Computer Systems Laboratory School of Electrical and...Trends in Computer Architecture Transistors (Thousands) Frequency (MHz) Typical Power (W) MIPS R2K Intel P4 DEC Alpha 21264 Data collected by M...Tasks per Joule) Simple Processor Design Power Constraint High-Performance Architectures Embedded Architectures Design Performance
78 FR 39617 - Data Practices, Computer III Further Remand: BOC Provision of Enhanced Services
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-02
... Docket No. 10-132; FCC 13-69] Data Practices, Computer III Further Remand: BOC Provision of Enhanced... eliminates comparably efficient interconnection (CEI) and open network architecture (ONA) narrowband... disseminates data, including by altering or eliminating collections that are no longer useful or necessary to...
Study on Global GIS architecture and its key technologies
NASA Astrophysics Data System (ADS)
Cheng, Chengqi; Guan, Li; Lv, Xuefeng
2009-09-01
Global GIS (G2IS) is a system that supports processing of huge data volumes and direct global manipulation on a global grid based on a spheroid or ellipsoid surface. Based on the global subdivision grid (GSG), a Global GIS architecture is presented in this paper, taking advantage of computer cluster theory, space-time integration technology and virtual reality technology. The Global GIS system architecture is composed of five layers: data storage layer, data representation layer, network and cluster layer, data management layer and data application layer. Within this architecture, the functions of a four-level protocol framework and a three-layer data management pattern are designed around the organization, management and publication of spatial information. The three core supporting technologies (computer cluster theory, space-time integration technology and virtual reality technology) and their application patterns in Global GIS are introduced in detail. The primary ideas of Global GIS presented in this paper point to an important development trend for GIS.
Study on Global GIS architecture and its key technologies
NASA Astrophysics Data System (ADS)
Cheng, Chengqi; Guan, Li; Lv, Xuefeng
2010-11-01
Global GIS (G2IS) is a system that supports processing of huge data volumes and direct global manipulation on a global grid based on a spheroid or ellipsoid surface. Based on the global subdivision grid (GSG), a Global GIS architecture is presented in this paper, taking advantage of computer cluster theory, space-time integration technology and virtual reality technology. The Global GIS system architecture is composed of five layers: data storage layer, data representation layer, network and cluster layer, data management layer and data application layer. Within this architecture, the functions of a four-level protocol framework and a three-layer data management pattern are designed around the organization, management and publication of spatial information. The three core supporting technologies (computer cluster theory, space-time integration technology and virtual reality technology) and their application patterns in Global GIS are introduced in detail. The primary ideas of Global GIS presented in this paper point to an important development trend for GIS.
Transportable GPU (General Processor Units) chip set technology for standard computer architectures
NASA Astrophysics Data System (ADS)
Fosdick, R. E.; Denison, H. C.
1982-11-01
The USAFR-developed GPU Chip Set has been utilized by Tracor to implement both USAF and Navy Standard 16-Bit Airborne Computer Architectures. Both configurations are currently being delivered into DOD full-scale development programs. Leadless Hermetic Chip Carrier packaging has facilitated implementation of both architectures on single 4 1/2 x 5 substrates. The CMOS and CMOS/SOS implementations of the GPU Chip Set have allowed both CPU implementations to use less than 3 watts of power each. Recent efforts by Tracor for USAF have included the definition of a next-generation GPU Chip Set that will retain the application-proven architecture of the current chip set while offering the added cost advantages of transportability across ISO-CMOS and CMOS/SOS processes and across numerous semiconductor manufacturers using a newly-defined set of common design rules. The Enhanced GPU Chip Set will increase speed by an approximate factor of 3 while significantly reducing chip counts and costs of standard CPU implementations.
Performance study of a data flow architecture
NASA Technical Reports Server (NTRS)
Adams, George
1985-01-01
Teams of scientists studied data flow concepts, static data flow machine architecture, and the VAL language. Each team mapped its application onto the machine and coded it in VAL. The principal findings of the study were: (1) Five of the seven applications used the full power of the target machine. The galactic simulation and multigrid fluid flow teams found that a significantly smaller version of the machine (16 processing elements) would suffice. (2) A number of machine design parameters including processing element (PE) function unit numbers, array memory size and bandwidth, and routing network capability were found to be crucial for optimal machine performance. (3) The study participants readily acquired VAL programming skills. (4) Participants learned that application-based performance evaluation is a sound method of evaluating new computer architectures, even those that are not fully specified. During the course of the study, participants developed models for using computers to solve numerical problems and for evaluating new architectures. These models form the bases for future evaluation studies.
Research on the architecture and key technologies of SIG
NASA Astrophysics Data System (ADS)
Fu, Zhongliang; Meng, Qingxiang; Huang, Yan; Liu, Shufan
2007-06-01
Along with the development of computer networks, Grid has become one of the hottest research topics on sharing and cooperation of Internet resources throughout the world. This paper presents a new five-hierarchy architecture of SIG (including Data Collecting Layer, Grid Layer, Service Layer, Application Layer and Client Layer), extended from the traditional three hierarchies (only including resource layer, service layer and client layer). The author proposes a new mixed network mode of Spatial Information Grid which integrates CAG (Certificate Authority of Grid) and P2P (Peer to Peer) in the Grid Layer. In addition, the author discusses some key technologies of SIG and analyzes the functions of these key technologies.
NASA Astrophysics Data System (ADS)
Jiang, Yuning; Kang, Jinfeng; Wang, Xinan
2017-03-01
Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
Cognitive Architectures and Human-Computer Interaction. Introduction to Special Issue.
ERIC Educational Resources Information Center
Gray, Wayne D.; Young, Richard M.; Kirschenbaum, Susan S.
1997-01-01
In this introduction to a special issue on cognitive architectures and human-computer interaction (HCI), editors and contributors provide a brief overview of cognitive architectures. The following four architectures represented by articles in this issue are: Soar; LICAI (linked model of comprehension-based action planning and instruction taking);…
Biomimetic design processes in architecture: morphogenetic and evolutionary computational design.
Menges, Achim
2012-03-01
Design computation has profound impact on architectural design methods. This paper explains how computational design enables the development of biomimetic design processes specific to architecture, and how they need to be significantly different from established biomimetic processes in engineering disciplines. The paper first explains the fundamental difference between computer-aided and computational design in architecture, as the understanding of this distinction is of critical importance for the research presented. Thereafter, the conceptual relation and possible transfer of principles from natural morphogenesis to design computation are introduced and the related developments of generative, feature-based, constraint-based, process-based and feedback-based computational design methods are presented. This morphogenetic design research is then related to exploratory evolutionary computation, followed by the presentation of two case studies focusing on the exemplary development of spatial envelope morphologies and urban block morphologies.
Reliability models for dataflow computer systems
NASA Technical Reports Server (NTRS)
Kavi, K. M.; Buckles, B. P.
1985-01-01
The demands for concurrent operation within a computer system and the representation of parallelism in programming languages have yielded a new form of program representation known as data flow (DENN 74, DENN 75, TREL 82a). A new model based on data flow principles for parallel computations and parallel computer systems is presented. Necessary conditions for liveness and deadlock freeness in data flow graphs are derived. The data flow graph is used as a model to represent asynchronous concurrent computer architectures including data flow computers.
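The data flow firing rule mentioned above can be illustrated with a minimal Python sketch. It is not the model from the paper: it assumes an acyclic graph in which every arc carries at most one token, and the names `run_dataflow`, `nodes` and `inputs` are hypothetical. A node is enabled once all of its input tokens are present, and a state in which work remains but no node is enabled is reported as a deadlock.

```python
import operator

def run_dataflow(nodes, inputs):
    """Execute a single-token, acyclic data flow graph (illustrative assumption).

    nodes  maps a node name to (function, tuple of source names).
    inputs maps an input name to its initial token value.
    """
    tokens = dict(inputs)      # values produced so far
    pending = dict(nodes)      # nodes that have not fired yet
    while pending:
        # A node is enabled when every one of its inputs holds a token.
        ready = [name for name, (_, srcs) in pending.items()
                 if all(s in tokens for s in srcs)]
        if not ready:
            raise RuntimeError("deadlock: no node is enabled")
        for name in ready:     # all enabled nodes could fire concurrently
            op, srcs = pending.pop(name)
            tokens[name] = op(*(tokens[s] for s in srcs))
    return tokens

# (a + b) * (a - b): "sum" and "diff" are independent and fire in the same step.
graph = {
    "sum": (operator.add, ("a", "b")),
    "diff": (operator.sub, ("a", "b")),
    "prod": (operator.mul, ("sum", "diff")),
}
result = run_dataflow(graph, {"a": 5, "b": 3})
```

In this sketch a node that waits on its own output never becomes enabled, so the run ends with the deadlock error rather than looping forever.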
The flight telerobotic servicer: From functional architecture to computer architecture
NASA Technical Reports Server (NTRS)
Lumia, Ronald; Fiala, John
1989-01-01
After a brief tutorial on the NASA/National Bureau of Standards Standard Reference Model for Telerobot Control System Architecture (NASREM) functional architecture, the approach to its implementation is shown. First, interfaces must be defined which are capable of supporting the known algorithms. This is illustrated by considering the interfaces required for the SERVO level of the NASREM functional architecture. After interface definition, the specific computer architecture for the implementation must be determined. This choice is obviously technology dependent. An example illustrating one possible mapping of the NASREM functional architecture to a particular set of computers which implements it is shown. The result of choosing the NASREM functional architecture is that it provides a technology independent paradigm which can be mapped into a technology dependent implementation capable of evolving with technology in the laboratory and in space.
Suborbital Telepresence and Over-the-Horizon Networking
NASA Technical Reports Server (NTRS)
Freudinger, Lawrence C.
2007-01-01
A viewgraph presentation describing the suborbital telepresence project utilizing in-flight network computing is shown. The topics include: 1) Motivation; 2) Suborbital Telepresence and Global Test Range; 3) Tropical Composition, Cloud, and Climate Coupling Experiment (TC4); 4) Data Sets for TC4 Real-time Monitoring; 5) TC-4 Notional Architecture; 6) An Application Integration View; 7) Telepresence: Architectural Framework; and 8) Disruption Tolerant Networks.
Computer Sciences and Data Systems, volume 2
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: data storage; information network architecture; VHSIC technology; fiber optics; laser applications; distributed processing; spaceborne optical disk controller; massively parallel processors; and advanced digital SAR processors.
Study objectives: Will commercial avionics do the job? Improvements needed?
NASA Technical Reports Server (NTRS)
Nasr, Hatem
1992-01-01
Improvements in commercial avionics are covered in a viewgraph format. Topics include the following: computer architecture, user requirements, Boeing 777 aircraft, cost effectiveness, and implementation.
NASA Astrophysics Data System (ADS)
Guilfoyle, Peter S.; Stone, Richard V.
1991-12-01
OptiComp is currently completing a 32-bit, fully programmable digital optical computer (DOC II) that is designed to operate in a UNIX environment running RISC microcode. OptiComp's DOC II architecture is focused toward parallel microcode implementation where data is input in a dual rail format. By exploiting the physical principles inherent to optics (speed and low power consumption), an architectural balance of optical interconnects and software code efficiency can be achieved including high fan-in and fan-out. OptiComp's DOC II program is jointly sponsored by the Office of Naval Research (ONR), the Strategic Defense Initiative Office (SDIO), NASA space station group and Rome Laboratory (USAF). This paper not only describes the motivational basis behind DOC II but also provides an optical overview and architectural summary of the device that allows the emulation of any digital instruction set.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Donaldson, Thomas; Doty, Karl; Engle, Steven W.; Larson, Robert E.; O'Reilly, John G.
1988-01-01
The results of a Phase I Small Business Innovative Research (SBIR) project performed for the NASA Langley Computational Structural Mechanics Group are described. The project resulted in the identification of a family of chordal-ring interconnection architectures with excellent potential to serve as the basis for new multimicroprocessor (MMP) computers. The paper presents examples of how computational algorithms from structural mechanics can be efficiently implemented on the chordal-ring architecture.
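As a rough illustration of a chordal-ring interconnection, the Python sketch below builds one commonly cited degree-3 variant and measures its diameter with breadth-first search. The abstract does not say which chordal-ring family the project identified, so the definition used here is an assumption: an even number of nodes `n` on a ring, an odd chord length `w`, odd-numbered nodes chorded to `i + w` and even-numbered nodes to `i - w`. The function names are hypothetical.

```python
from collections import deque

def neighbors(i, n, w):
    """Neighbors of node i in a degree-3 chordal ring (assumed definition)."""
    # Odd nodes chord forward by w, even nodes chord backward by w,
    # which keeps every chord bidirectional when w is odd and n is even.
    chord = (i + w) % n if i % 2 else (i - w) % n
    return {(i - 1) % n, (i + 1) % n, chord}

def diameter(n, w):
    """Longest shortest-path length between any two nodes, found by BFS."""
    worst = 0
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors(u, n, w):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst
```

Comparing `diameter(n, w)` with `n // 2`, the diameter of a plain ring of `n` nodes, gives a quick feel for how much a given chord length shortens worst-case message paths.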
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1992-01-01
The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures
NASA Astrophysics Data System (ADS)
Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.
2016-12-01
The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the-art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which includes several new features, including an on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
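For readers unfamiliar with the underlying algorithm, here is a minimal serial sketch of plain k-means (Lloyd's iteration) in Python. It is not the accelerated, parallel implementation the abstract describes: there is no distance-pruning acceleration, no SIMD and no distribution across nodes. Seeding the centroids from the first `k` points and the function name `kmeans` are assumptions made for illustration.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    centroids = [tuple(p) for p in points[:k]]   # assumed seeding: first k points
    assign = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])     # keep an empty cluster in place
        if updated == centroids:                 # converged
            break
        centroids = updated
    return centroids, assign

data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centers, labels = kmeans(data, 2)
```

The assignment step dominates the cost, since it computes a distance from every point to every centroid; that is the part accelerated and vectorized variants aim to reduce.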
The potential of multi-port optical memories in digital computing
NASA Technical Reports Server (NTRS)
Alford, C. O.; Gaylord, T. K.
1975-01-01
A high-capacity memory with a relatively high data transfer rate and multi-port simultaneous access capability may serve as the basis for new computer architectures. The implementation of a multi-port optical memory is discussed. Several computer structures are presented that might profitably use such a memory. These structures include (1) a simultaneous record access system, (2) a simultaneously shared memory computer system, and (3) a parallel digital processing structure.
Distributed computing environments for future space control systems
NASA Technical Reports Server (NTRS)
Viallefont, Pierre
1993-01-01
The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.
Electro-Optic Computing Architectures. Volume I
1998-02-01
The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro-Optic Interface (EOI), an Optical Interconnection Unit (OW
The Workstation Approach to Laboratory Computing
Crosby, P.A.; Malachowski, G.C.; Hall, B.R.; Stevens, V.; Gunn, B.J.; Hudson, S.; Schlosser, D.
1985-01-01
There is a need for a Laboratory Workstation which specifically addresses the problems associated with computing in the scientific laboratory. A workstation based on the IBM PC architecture and including a front end data acquisition system which communicates with a host computer via a high speed communications link; a new graphics display controller with hardware window management and window scrolling; and an integrated software package is described.
Image-Processing Software For A Hypercube Computer
NASA Technical Reports Server (NTRS)
Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.
1992-01-01
Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.
Experimental Comparison of Two Quantum Computing Architectures
2017-03-28
INAUGURAL ARTICLE COMPUTER SCIENCES Experimental comparison of two quantum computing architectures Norbert M. Linke, Dmitri...the vast computing power a universal quantum computer could offer, several candidate systems are being explored. They have allowed experimental...existing systems and the role of architecture in quantum computer design. These will be crucial for the realization of more advanced future incarna
Battlefield Object Control via Internet Architecture
2002-01-01
superiority is the best way to reach the goal of competition superiority. Using information technology (IT) in data processing, including computer hardware...technologies: Global Positioning System (GPS), Geographic Information System (GIS), Battlefield Information Transmission System (BITS), and Intelligent...operational environment. Keywords: C4ISR Systems, Information Superiority, Battlefield Objects, Computer-Aided Prototyping System (CAPS), IP-based
An Architectural Design System Based on Computer Graphics.
ERIC Educational Resources Information Center
MacDonald, Stephen L.; Wehrli, Robert
The recent developments in computer hardware and software are presented to inform architects of this design tool. Technical advancements in equipment include--(1) cathode ray tube displays, (2) light pens, (3) print-out and photo copying attachments, (4) controls for comparison and selection of images, (5) chording keyboards, (6) plotters, and (7)…
Production experience with the ATLAS Event Service
NASA Astrophysics Data System (ADS)
Benjamin, D.; Calafiura, P.; Childers, T.; De, K.; Guan, W.; Maeno, T.; Nilsson, P.; Tsulaia, V.; Van Gemmeren, P.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The ATLAS Event Service (AES) has been designed and implemented for efficient running of ATLAS production workflows on a variety of computing platforms, ranging from conventional Grid sites to opportunistic, often short-lived resources, such as spot market commercial clouds, supercomputers and volunteer computing. The Event Service architecture allows real time delivery of fine grained workloads to running payload applications which process dispatched events or event ranges and immediately stream the outputs to highly scalable Object Stores. Thanks to its agile and flexible architecture the AES is currently being used by grid sites for assigning low priority workloads to otherwise idle computing resources; similarly harvesting HPC resources in an efficient back-fill mode; and massively scaling out to the 50-100k concurrent core level on the Amazon spot market to efficiently utilize those transient resources for peak production needs. Platform ports in development include ATLAS@Home (BOINC) and the Google Compute Engine, and a growing number of HPC platforms. After briefly reviewing the concept and the architecture of the Event Service, we will report the status and experience gained in AES commissioning and production operations on supercomputers, and our plans for extending ES application beyond Geant4 simulation to other workflows, such as reconstruction and data analysis.
Computational approaches to vision
NASA Technical Reports Server (NTRS)
Barrow, H. G.; Tenenbaum, J. M.
1986-01-01
Vision is examined in terms of a computational process, and the competence, structure, and control of computer vision systems are analyzed. Theoretical and experimental data on the formation of a computer vision system are discussed. Consideration is given to early vision, the recovery of intrinsic surface characteristics, higher levels of interpretation, and system integration and control. A computational visual processing model is proposed and its architecture and operation are described. Examples of state-of-the-art vision systems, which include some of the levels of representation and processing mechanisms, are presented.
Hybrid architecture for building secure sensor networks
NASA Astrophysics Data System (ADS)
Owens, Ken R., Jr.; Watkins, Steve E.
2012-04-01
Sensor networks have various communication and security architectural concerns. Three approaches are defined to address these concerns for sensor networks. The first area is the utilization of new computing architectures that leverage embedded virtualization software on the sensor. Deploying a small, embedded virtualization operating system on the sensor nodes that is designed to communicate to low-cost cloud computing infrastructure in the network is the foundation to delivering low-cost, secure sensor networks. The second area focuses on securing the sensor. Sensor security components include developing an identification scheme, and leveraging authentication algorithms and protocols that address security assurance within the physical, communication network, and application layers. This function will primarily be accomplished through encrypting the communication channel and integrating sensor network firewall and intrusion detection/prevention components to the sensor network architecture. Hence, sensor networks will be able to maintain high levels of security. The third area addresses the real-time and high priority nature of the data that sensor networks collect. This function requires that a quality-of-service (QoS) definition and algorithm be developed for delivering the right data at the right time. A hybrid architecture is proposed that combines software and hardware features to handle network traffic with diverse QoS requirements.
PLAYGROUND: Preparing Students for the Cyber Battleground
ERIC Educational Resources Information Center
Nielson, Seth James
2017-01-01
Attempting to educate practitioners of computer security can be difficult if for no other reason than the breadth of knowledge required today. The security profession includes widely diverse subfields including cryptography, network architectures, programming, programming languages, design, coding practices, software testing, pattern recognition,…
Thin Client Architecture: The Promise and the Problems.
ERIC Educational Resources Information Center
Machovec, George S.
1997-01-01
Describes thin clients, a networking technology that allows organizations to provide software applications over networked workstations connected to a central server. Topics include corporate settings; major advantages, including cost effectiveness and increased computer security; problems; and possible applications for large public and academic…
A learnable parallel processing architecture towards unity of memory and computing
NASA Astrophysics Data System (ADS)
Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.
2015-08-01
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
A learnable parallel processing architecture towards unity of memory and computing.
Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J
2015-08-14
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
Computer Sciences and Data Systems, volume 1
NASA Technical Reports Server (NTRS)
1987-01-01
Topics addressed include: software engineering; university grants; institutes; concurrent processing; sparse distributed memory; distributed operating systems; intelligent data management processes; expert system for image analysis; fault tolerant software; and architecture research.
DFT algorithms for bit-serial GaAs array processor architectures
NASA Technical Reports Server (NTRS)
Mcmillan, Gary B.
1988-01-01
Systems and Processes Engineering Corporation (SPEC) has developed an innovative array processor architecture for computing Fourier transforms and other commonly used signal processing algorithms. This architecture is designed to extract the highest possible array performance from state-of-the-art GaAs technology. SPEC's architectural design includes a high performance RISC processor implemented in GaAs, along with a Floating Point Coprocessor and a unique Array Communications Coprocessor, also implemented in GaAs technology. Together, these data processors represent the latest in technology, both from an architectural and implementation viewpoint. SPEC has examined numerous algorithms and parallel processing architectures to determine the optimum array processor architecture. SPEC has developed an array processor architecture with integral communications ability to provide maximum node connectivity. The Array Communications Coprocessor embeds communications operations directly in the core of the processor architecture. A Floating Point Coprocessor architecture has been defined that utilizes Bit-Serial arithmetic units, operating at very high frequency, to perform floating point operations. These Bit-Serial devices reduce the device integration level and complexity to a level compatible with state-of-the-art GaAs device technology.
Neuromorphic Kalman filter implementation in IBM’s TrueNorth
NASA Astrophysics Data System (ADS)
Carney, R.; Bouchard, K.; Calafiura, P.; Clark, D.; Donofrio, D.; Garcia-Sciveres, M.; Livezey, J.
2017-10-01
Following the advent of a post-Moore’s law field of computation, novel architectures continue to emerge. With composite, multi-million connection neuromorphic chips like IBM’s TrueNorth, neural engineering has now become a feasible technology in this novel computing paradigm. High Energy Physics experiments are continuously exploring new methods of computation and data handling, including neuromorphic, to support the growing challenges of the field and be prepared for future commodity computing trends. This work details the first instance of a Kalman filter implementation in IBM’s neuromorphic architecture, TrueNorth, for both parallel and serial spike trains. The implementation is tested on multiple simulated systems and its performance is evaluated with respect to an equivalent non-spiking Kalman filter. The limits of the implementation are explored whilst varying the size of weight and threshold registers, the number of spikes used to encode a state, size of neuron block for spatial encoding, and neuron potential reset schemes.
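For orientation, the conventional (non-spiking) filter that such an implementation is compared against can be sketched in a few lines. This is a minimal scalar Kalman filter under an assumed constant-state model; the function name and the parameters `q`, `r`, `x0`, and `p0` are illustrative and are not taken from the paper, and nothing here models TrueNorth's spiking encoding.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant-state model (illustrative sketch).

    q  -- process noise variance (assumed)
    r  -- measurement noise variance (assumed)
    x0 -- initial state estimate, p0 -- initial estimate variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is modeled as constant, so only the variance grows.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0, 5.0, 5.0, 5.0], q=0.0, r=1.0)
```

With noiseless repeated measurements the estimate moves steadily from the initial guess toward the measured value, which is the behavior a spiking implementation would need to reproduce within its register and spike-count limits.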
The science of visual analysis at extreme scale
NASA Astrophysics Data System (ADS)
Nowell, Lucy T.
2011-01-01
Driven by market forces and spanning the full spectrum of computational devices, computer architectures are changing in ways that present tremendous opportunities and challenges for data analysis and visual analytic technologies. Leadership-class high performance computing systems will have as many as a million cores by 2020 and support 10 billion-way concurrency, while laptop computers are expected to have as many as 1,000 cores by 2015. At the same time, data of all types are increasing exponentially and automated analytic methods are essential for all disciplines. Many existing analytic technologies do not scale to make full use of current platforms and fewer still are likely to scale to the systems that will be operational by the end of this decade. Furthermore, on the new architectures and for data at extreme scales, validating the accuracy and effectiveness of analytic methods, including visual analysis, will be increasingly important.
Enhancement of computer system for applications software branch
NASA Technical Reports Server (NTRS)
Bykat, Alex
1987-01-01
Presented is a compilation of the history of a two-month project concerned with a survey, evaluation, and specification of a new computer system for the Applications Software Branch of the Software and Data Management Division of Information and Electronic Systems Laboratory of Marshall Space Flight Center, NASA. Information gathering consisted of discussions and surveys of branch activities, evaluation of computer manufacturer literature, and presentations by vendors. Information gathering was followed by evaluation of their systems. The criteria of the latter were: the (tentative) architecture selected for the new system, type of network architecture supported, software tools, and to some extent the price. The information received from the vendors, as well as additional research, led to a detailed design of a suitable system. This design included considerations of hardware and software environments as well as personnel issues such as training. Design of the system culminated in a recommendation for a new computing system for the Branch.
East-West paths to unconventional computing.
Adamatzky, Andrew; Akl, Selim; Burgin, Mark; Calude, Cristian S; Costa, José Félix; Dehshibi, Mohammad Mahdi; Gunji, Yukio-Pegio; Konkoli, Zoran; MacLennan, Bruce; Marchal, Bruno; Margenstern, Maurice; Martínez, Genaro J; Mayne, Richard; Morita, Kenichi; Schumann, Andrew; Sergeyev, Yaroslav D; Sirakoulis, Georgios Ch; Stepney, Susan; Svozil, Karl; Zenil, Hector
2017-12-01
Unconventional computing is about breaking boundaries in thinking, acting and computing. Typical topics of this non-typical field include, but are not limited to physics of computation, non-classical logics, new complexity measures, novel hardware, mechanical, chemical and quantum computing. Unconventional computing encourages a new style of thinking while practical applications are obtained from uncovering and exploiting principles and mechanisms of information processing in and functional properties of, physical, chemical and living systems; in particular, efficient algorithms are developed, (almost) optimal architectures are designed and working prototypes of future computing devices are manufactured. This article includes idiosyncratic accounts of 'unconventional computing' scientists reflecting on their personal experiences, what attracted them to the field, their inspirations and discoveries. Copyright © 2017 Elsevier Ltd. All rights reserved.
Design of a real-time wind turbine simulator using a custom parallel architecture
NASA Technical Reports Server (NTRS)
Hoffman, John A.; Gluck, R.; Sridhar, S.
1995-01-01
The design of a new parallel-processing digital simulator is described. The new simulator has been developed specifically for analysis of wind energy systems in real time. The new processor has been named the Wind Energy System Time-domain simulator, version 3 (WEST-3). Like previous WEST versions, WEST-3 performs many computations in parallel. The modules in WEST-3 are pure digital processors, however. These digital processors can be programmed individually and operated in concert to achieve real-time simulation of wind turbine systems. Because of this programmability, WEST-3 is very much more flexible and general than its two predecessors. The design features of WEST-3 are described to show how the system produces high-speed solutions of nonlinear time-domain equations. WEST-3 has two very fast Computational Units (CU's) that use minicomputer technology plus special architectural features that make them many times faster than a microcomputer. These CU's are needed to perform the complex computations associated with the wind turbine rotor system in real time. The parallel architecture of the CU causes several tasks to be done in each cycle, including an IO operation and the combination of a multiply, add, and store. The WEST-3 simulator can be expanded at any time for additional computational power. This is possible because the CU's are interfaced to each other and to other portions of the simulation using special serial buses. These buses can be 'patched' together in essentially any configuration (in a manner very similar to the programming methods used in analog computation) to balance the input/output requirements. CU's can be added in any number to share a given computational load. This flexible bus feature is very different from many other parallel processors which usually have a throughput limit because of rigid bus architecture.
SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications
Kalinin, Alexandr A.; Palanimalai, Selvam; Dinov, Ivo D.
2018-01-01
The modern web is a successful platform for large scale interactive web applications, including visualizations. However, there are no established design principles for building complex visual analytics (VA) web applications that could efficiently integrate visualizations with data management, computational transformation, hypothesis testing, and knowledge discovery. This imposes a time-consuming design and development process on many researchers and developers. To address these challenges, we consider the design requirements for the development of a module-based VA system architecture, adopting existing practices of large scale web application development. We present the preliminary design and implementation of an open-source platform for Statistics Online Computational Resource Analytical Toolbox (SOCRAT). This platform defines: (1) a specification for an architecture for building VA applications with multi-level modularity, and (2) methods for optimizing module interaction, re-usage, and extension. To demonstrate how this platform can be used to integrate a number of data management, interactive visualization, and analysis tools, we implement an example application for simple VA tasks including raw data input and representation, interactive visualization and analysis. PMID:29630069
SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications.
Kalinin, Alexandr A; Palanimalai, Selvam; Dinov, Ivo D
2017-04-01
The modern web is a successful platform for large scale interactive web applications, including visualizations. However, there are no established design principles for building complex visual analytics (VA) web applications that could efficiently integrate visualizations with data management, computational transformation, hypothesis testing, and knowledge discovery. This imposes a time-consuming design and development process on many researchers and developers. To address these challenges, we consider the design requirements for the development of a module-based VA system architecture, adopting existing practices of large scale web application development. We present the preliminary design and implementation of an open-source platform for Statistics Online Computational Resource Analytical Toolbox (SOCRAT). This platform defines: (1) a specification for an architecture for building VA applications with multi-level modularity, and (2) methods for optimizing module interaction, re-usage, and extension. To demonstrate how this platform can be used to integrate a number of data management, interactive visualization, and analysis tools, we implement an example application for simple VA tasks including raw data input and representation, interactive visualization and analysis.
Digital optical computers at the optoelectronic computing systems center
NASA Technical Reports Server (NTRS)
Jordan, Harry F.
1991-01-01
The Digital Optical Computing Program within the National Science Foundation Engineering Research Center for Opto-electronic Computing Systems has as its specific goal research on optical computing architectures suitable for use at the highest possible speeds. The program can be targeted toward exploiting the time domain because other programs in the Center are pursuing research on parallel optical systems, exploiting optical interconnection and optical devices and materials. Using a general purpose computing architecture as the focus, we are developing design techniques, tools and architecture for operation at the speed of light limit. Experimental work is being done with the somewhat low speed components currently available but with architectures which will scale up in speed as faster devices are developed. The design algorithms and tools developed for a general purpose, stored program computer are being applied to other systems such as optimally controlled optical communication networks.
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires k² − 1 comparisons per sample for a direct implementation; thus, the cost grows quadratically with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds (about 120 frames per second) at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
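The kernel decomposition mentioned above can be illustrated in software. The following is a minimal sketch of the one-dimensional van Herk/Gil-Werman running max and its separable use on a 2-D image; it is a plain Python illustration, not the paper's FPGA design, and the function names are hypothetical. The work per sample in the 1-D pass does not depend on k.

```python
def running_max_hgw(values, k):
    """1-D running max over windows of length k (van Herk/Gil-Werman scheme).

    Returns len(values) - k + 1 window maxima.
    """
    n = len(values)
    if k <= 0 or k > n:
        raise ValueError("window length k must satisfy 1 <= k <= len(values)")
    # Pad with -inf so the input splits into complete blocks of length k.
    x = list(values) + [float("-inf")] * ((-n) % k)
    m = len(x)
    prefix = [0] * m  # max from the start of the enclosing block up to i
    suffix = [0] * m  # max from i up to the end of the enclosing block
    for i in range(m):
        prefix[i] = x[i] if i % k == 0 else max(prefix[i - 1], x[i])
    for i in range(m - 1, -1, -1):
        suffix[i] = x[i] if i % k == k - 1 else max(suffix[i + 1], x[i])
    # A window [i, i + k - 1] touches at most two blocks: suffix covers the
    # part in the first block, prefix covers the part in the second.
    return [max(suffix[i], prefix[i + k - 1]) for i in range(n - k + 1)]


def running_max_2d(image, k):
    """k x k running max via two 1-D passes (rows, then columns)."""
    rows = [running_max_hgw(row, k) for row in image]
    cols = [running_max_hgw(list(col), k) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


window_maxima = running_max_hgw([1, 3, 2, 5, 4, 0, 6], 3)
```

A running min follows by swapping `max` for `min` and padding with `float("inf")`.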
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGhee, J.M.; Roberts, R.M.; Morel, J.E.
1997-06-01
A spherical harmonics research code (DANTE) has been developed which is compatible with parallel computer architectures. DANTE provides 3-D, multi-material, deterministic, transport capabilities using an arbitrary finite element mesh. The linearized Boltzmann transport equation is solved in a second order self-adjoint form utilizing a Galerkin finite element spatial differencing scheme. The core solver utilizes a preconditioned conjugate gradient algorithm. Other distinguishing features of the code include options for discrete-ordinates and simplified spherical harmonics angular differencing, an exact Marshak boundary treatment for arbitrarily oriented boundary faces, in-line matrix construction techniques to minimize memory consumption, and an effective diffusion based preconditioner for scattering dominated problems. Algorithm efficiency is demonstrated for a massively parallel SIMD architecture (CM-5), and compatibility with MPP multiprocessor platforms or workstation clusters is anticipated.
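As a rough illustration of the kind of core solver named above, the following is a minimal preconditioned conjugate gradient sketch for a small symmetric positive definite system. It substitutes a simple Jacobi (diagonal) preconditioner for the diffusion-based preconditioner described in the abstract, uses dense Python lists rather than a finite element matrix, and all names are hypothetical.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (dense list of lists).

    Uses a Jacobi preconditioner (an assumption made for this sketch only).
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


solution = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The matrix-vector product and the dot products are the steps that a parallel implementation would distribute across processors.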
DESPIC: Detecting Early Signatures of Persuasion in Information Cascades
2015-08-27
over NoSQL Databases, Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014). 26-MAY-14, . : , P...over NoSQL Databases. Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014). Chicago, IL, USA...distributed NoSQL databases including HBase and Riak, we finalized the requirements of the optimal computational architecture to support our framework
A Fuzzy Evaluation Method for System of Systems Meta-architectures
2013-03-01
Procedia Computer Science Procedia Computer Science 00 (2013) 000–000 www.elsevier.com/locate/ procedia Conference on Systems Engineering...boundary includes integration of technical systems as well as cognitive and social processes, which alter system behavior [2]. Most system architects...unclassified c. THIS PAGE unclassified Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 Pape/ Procedia Computer Science 00 (2013) 000
Gauss Elimination: Workhorse of Linear Algebra.
1995-08-05
linear algebra computation for solving systems, computing determinants and determining the rank of matrix. All of these are discussed in varying contexts. These include different arithmetic or algebraic setting such as integer arithmetic or polynomial rings as well as conventional real (floating-point) arithmetic. These have effects on both accuracy and complexity analyses of the algorithm. These, too, are covered here. The impact of modern parallel computer architecture on GE is also
A FORCEnet Framework for Analysis of Existing Naval C4I Architectures
2003-06-01
best qualities of humans and computers. f. Information Weapons Information weapons integrate the use of military deception, psychological ...operations, to include electronic warfare, psychological operations, computer network attack, computer network defense, operations security, and military...F/A-18 ( ATARS /SHARP), S-3B (SSU), SH-60 LAMPS (HAWKLINK) and P-3C (AIP, Special Projects). CDL-N consists of two antennas (one meter diameter
Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis
1998-02-01
The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro-Optic Interface (EOI), an Optical Interconnection Unit
Dictionary of marine technology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, D.A.
1989-01-01
This book is intended to replace G. O. Watson's Dictionary of Marine Engineering and Nautical Terms (1964). It includes terms from marine and offshore engineering, naval architecture, shipbuilding, shipping, ship operation, and relevant terms from the electronics, control and computing fields. A few nautical terms are also included.
Meta assembler enhancements and generalized linkage editor
NASA Technical Reports Server (NTRS)
1979-01-01
A meta Assembler for NASA was developed. The initial development of the Meta Assembler for the SUMC was performed. The capabilities included assembly for both main and micro level programs. A period of checkout and utilization to verify the performance of the Meta Assembler was undertaken. Additional enhancements were made to the Meta Assembler which expanded the target computer family to include architectures represented by the PDP-11, MODCOMP 2, and Raytheon 706 computers.
Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode
NASA Technical Reports Server (NTRS)
Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William
1986-01-01
The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
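The hypercube topology mentioned above has a compact description: with 2^d nodes labeled by d-bit addresses, two nodes are directly linked when their addresses differ in exactly one bit. The sketch below assumes that standard binary labeling (the abstract does not describe the NSC's actual addressing), and the function names are hypothetical. For 128 nodes, d = 7.

```python
def hypercube_dimension(num_nodes):
    """Return d such that num_nodes == 2**d, or raise if not a power of two."""
    d = num_nodes.bit_length() - 1
    if num_nodes <= 0 or (1 << d) != num_nodes:
        raise ValueError("a hypercube needs a power-of-two node count")
    return d


def hypercube_neighbors(node, dim):
    """Neighbors of a node: flip each of the dim address bits in turn."""
    return [node ^ (1 << bit) for bit in range(dim)]


def hop_count(a, b):
    """Minimum number of links between nodes a and b (Hamming distance)."""
    return bin(a ^ b).count("1")


dim = hypercube_dimension(128)
neighbors_of_zero = hypercube_neighbors(0, dim)
```

Under this labeling every node of a 128-node machine has 7 direct neighbors, and no two nodes are more than 7 hops apart.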
NASA Technical Reports Server (NTRS)
Hsia, T. C.; Lu, G. Z.; Han, W. H.
1987-01-01
In advanced robot control problems, on-line computation of the inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be implemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
Hypercube matrix computation task
NASA Technical Reports Server (NTRS)
Calalo, Ruel H.; Imbriale, William A.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Lyons, James R.; Manshadi, Farzin; Patterson, Jean E.
1988-01-01
A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, give a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort are summarized. The summary includes both new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).
NASA Technical Reports Server (NTRS)
Rutishauser, David
2006-01-01
The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. 
This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.
ERIC Educational Resources Information Center
Buechley, Leah, Ed.; Peppler, Kylie, Ed.; Eisenberg, Michael, Ed.; Yasmin, Kafai, Ed.
2013-01-01
"Textile Messages" focuses on the emerging field of electronic textiles, or e-textiles--computers that can be soft, colorful, approachable, and beautiful. E-textiles are articles of clothing, home furnishings, or architectures that include embedded computational and electronic elements. This book introduces a collection of tools that…
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
Optimizing Engineering Tools Using Modern Ground Architectures
2017-12-01
Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
Multi-processor including data flow accelerator module
Davidson, George S.; Pierce, Paul E.
1990-01-01
An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.
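The firing rule described above (a cell becomes ready once the tag bits show that all of its operand slots are filled, and its result is then forwarded through its linking slots) can be mimicked in software. The following is a minimal sketch of that idea only; the class, the table of primitives, and the example graph for (a + b) * (c - d) are hypothetical and do not reflect the patent's circuitry.

```python
import operator
from collections import deque


class Cell:
    """One memory cell: operand slots, tag bits, a primitive id, and links."""

    def __init__(self, primitive, arity, links=()):
        self.primitive = primitive          # index into the primitive table
        self.slots = [None] * arity         # operand values
        self.tags = [False] * arity         # True once the slot holds a value
        self.links = list(links)            # (target cell, target slot) pairs


def run(cells, primitives, initial_tokens):
    """Fire cells as their operands arrive; return each cell's result."""
    ready = deque()
    results = {}

    def deliver(cell_id, slot, value):
        cell = cells[cell_id]
        cell.slots[slot] = value
        cell.tags[slot] = True
        if all(cell.tags):                  # every operand present: schedule
            ready.append(cell_id)

    for cell_id, slot, value in initial_tokens:
        deliver(cell_id, slot, value)
    while ready:
        cell_id = ready.popleft()
        cell = cells[cell_id]
        value = primitives[cell.primitive](*cell.slots)
        cell.tags = [False] * len(cell.tags)   # clear tags after firing
        results[cell_id] = value
        for target, slot in cell.links:        # distribute result downstream
            deliver(target, slot, value)
    return results


# Hypothetical graph for (a + b) * (c - d) with a=2, b=3, c=10, d=4.
primitives = [operator.add, operator.sub, operator.mul]
cells = [Cell(0, 2, [(2, 0)]), Cell(1, 2, [(2, 1)]), Cell(2, 2)]
results = run(cells, primitives, [(0, 0, 2), (0, 1, 3), (1, 0, 10), (1, 1, 4)])
```

Execution order is driven entirely by operand arrival, so the add and subtract cells could fire in either order, or concurrently on separate processors, before the multiply cell becomes ready.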
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
Engineering Technology Education: Bibliography 1989.
ERIC Educational Resources Information Center
Dyrud, Marilyn A., Comp.
1990-01-01
Over 200 references divided into 24 different areas are presented. Topics include administration, aeronautics, architecture, biomedical technology, CAD/CAM, civil engineering, computers, curriculum, electrical/electronics engineering, industrial engineering, industry and employment, instructional technology, laboratories, lasers, liberal studies,…
School Architecture: New Activities Dictate New Designs.
ERIC Educational Resources Information Center
Hill, Robert
1984-01-01
Changing educational requirements have led to many school building design developments in recent years, including technologically sophisticated music and computer rooms, large school kitchens, and Title IX-mandated equal facilities available for both sexes. (MLF)
State-of-the-art in Heterogeneous Computing
Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...
2010-01-01
Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, available software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1991-01-01
The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
Advanced computer architecture for large-scale real-time applications.
DOT National Transportation Integrated Search
1973-04-01
Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...
Bruemmer, David J [Idaho Falls, ID; Few, Douglas A [Idaho Falls, ID
2010-09-21
The present invention provides methods, computer readable media, and apparatuses for a generic robot architecture providing a framework that is easily portable to a variety of robot platforms and is configured to provide hardware abstractions, abstractions for generic robot attributes, environment abstractions, and robot behaviors. The generic robot architecture includes a hardware abstraction level and a robot abstraction level. The hardware abstraction level is configured for developing hardware abstractions that define, monitor, and control hardware modules available on a robot platform. The robot abstraction level is configured for defining robot attributes and provides a software framework for building robot behaviors from the robot attributes. Each of the robot attributes includes hardware information from at least one hardware abstraction. In addition, each robot attribute is configured to substantially isolate the robot behaviors from the at least one hardware abstraction.
NASA Technical Reports Server (NTRS)
Hyde, Patricia R.; Loftin, R. Bowen
1993-01-01
The volume 2 proceedings from the 1993 Conference on Intelligent Computer-Aided Training and Virtual Environment Technology are presented. Topics discussed include intelligent computer assisted training (ICAT) systems architectures, ICAT educational and medical applications, virtual environment (VE) training and assessment, human factors engineering and VE, ICAT theory and natural language processing, ICAT military applications, VE engineering applications, ICAT knowledge acquisition processes and applications, and ICAT aerospace applications.
Lambda Data Grid: Communications Architecture in Support of Grid Computing
2006-12-21
number of paradigm shifts in the 20th century, including the growth of large geographically dispersed teams and the use of simulations and computational...get results. The work in this thesis automates the orchestration of networks with other resources, better utilizing all resources in a time efficient...domains, over transatlantic links in around minute. The main goal of this thesis is to build a new grid-computing paradigm that fully harnesses the
ERIC Educational Resources Information Center
Farid, Ayman A.; Zaghloul, Weaam M.; Dewidar, Khaled M.
2014-01-01
The great shift in sustainability and computer-aided design in the field of architecture has caused a remarkable change in architectural philosophy: new aspects of beauty and aesthetic values are being introduced, and traditional definitions of beauty cannot fully cover these aspects, which causes a gap between new architecture works criticism and…
Programmable hardware for reconfigurable computing systems
NASA Astrophysics Data System (ADS)
Smith, Stephen
1996-10-01
In 1945 the work of J. von Neumann and H. Goldstine created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless, alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
NASA Technical Reports Server (NTRS)
Tavenner, Leslie A. (Editor)
1991-01-01
These proceedings overview major space information system projects and lessons learned from current missions. Other topics include the science information system requirements for the 1990s, an information systems design approach for major programs, the technology needs and projections, the standards for space data information systems, the artificial intelligence technology and applications, international interoperability, spacecraft data systems and architectures, and advanced communications. Other topics include the software engineering technology and applications, the multimission multidiscipline information system architectures, the distributed planning and scheduling systems and operations, and the computer and information systems architectures. Papers presented include prospects for scientific data analysis systems for solar-terrestrial physics in the 1990s, the Columbus data management system, data storage technologies for the future, the German aerospace research establishment, and launching artificial intelligence in NASA ground systems.
Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy
Hwang, Wen-Jyi; Cheng, Shih-Chang; Cheng, Chau-Jern
2011-01-01
This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipeline fashion to maximize throughput of the computation. Moreover, the number of hardware multipliers and dividers is minimized to reduce the hardware costs. The proposed architecture is used as custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system. PMID:22163688
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broad range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.
NASA Technical Reports Server (NTRS)
Dias, W. C.
1994-01-01
RISK D/C is a prototype program which attempts to do program risk modeling for the Space Exploration Initiative (SEI) architectures proposed in the Synthesis Group Report. Risk assessment is made with respect to risk events, their probabilities, and the severities of potential results. The program allows risk mitigation strategies to be proposed for an exploration program architecture and to be ranked with respect to their effectiveness. RISK D/C allows for the fact that risk assessment in early planning phases is subjective. Although specific to the SEI in its present form, RISK D/C can be used as a framework for developing a risk assessment program for other specific uses. RISK D/C is organized into files, or stacks, of information, including the architecture, the hazard, and the risk event stacks. Although predefined, all stacks can be upgraded by a user. The architecture stack contains information concerning the general program alternatives, which are subsequently broken down into waypoints, missions, and mission phases. The hazard stack includes any background condition which could result in a risk event. A risk event is anything unfavorable that could happen during the course of a specific point within an architecture, and the risk event stack provides the probabilities, consequences, severities, and any mitigation strategies which could be used to reduce the risk of the event, and how much the risk is reduced. RISK D/C was developed for Macintosh series computers. It requires HyperCard 2.0 or later, as well as 2Mb of RAM and System 6.0.8 or later. A Macintosh II series computer is recommended due to speed concerns. The standard distribution medium for this package is one 3.5 inch 800K Macintosh format diskette. RISK D/C was developed in 1991 and is a copyrighted work with all copyright vested in NASA. Macintosh and HyperCard are trademarks of Apple Computer, Inc.
NASA Astrophysics Data System (ADS)
Hadade, Ioan; di Mare, Luca
2016-08-01
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assert their efficiency for each distinct architecture. We report significant speedups for single thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
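The Roofline model mentioned in the abstract above bounds a kernel's attainable performance by the lower of the compute peak and the product of memory bandwidth and arithmetic intensity. The following is a minimal sketch of that bound; the function names and the hardware figures in the usage lines are illustrative assumptions, not values from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline bound: the lower of the compute roof and the memory roof.

    peak_gflops    -- peak floating-point rate of the device (GFLOP/s)
    bandwidth_gbs  -- sustained memory bandwidth (GB/s)
    flops_per_byte -- arithmetic intensity of the kernel (FLOP per byte moved)
    """
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical device: 100 GFLOP/s peak, 50 GB/s bandwidth (placeholders).
    peak, bandwidth = 100.0, 50.0
    for intensity in (0.5, 2.0, 4.0):
        bound = attainable_gflops(peak, bandwidth, intensity)
        kind = "memory bound" if intensity < ridge_point(peak, bandwidth) else "compute bound"
        print(f"intensity {intensity}: at most {bound} GFLOP/s ({kind})")
```

Comparing a measured kernel rate against this bound indicates how close an optimised kernel is to what the architecture allows.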
ERIC Educational Resources Information Center
Betts, Janelle Lyon
2001-01-01
Describes a high school art assignment in which students utilize Appleworks or Claris Works to design their own house, after learning about architectural styles and how to use the computer program. States that the project develops student computer skills and increases student knowledge about architecture. (CMK)
The architecture of a distributed medical dictionary.
Fowler, J; Buffone, G; Moreau, D
1995-01-01
Exploiting high-speed computer networks to provide a national medical information infrastructure is a goal for medical informatics. The Distributed Medical Dictionary under development at Baylor College of Medicine is a model for an architecture that supports collaborative development of a distributed online medical terminology knowledge-base. A prototype is described that illustrates the concept. Issues that must be addressed by such a system include high availability, acceptable response time, support for local idiom, and control of vocabulary.
Advanced flight control system study
NASA Technical Reports Server (NTRS)
Hartmann, G. L.; Wall, J. E., Jr.; Rang, E. R.; Lee, H. P.; Schulte, R. W.; Ng, W. K.
1982-01-01
A fly by wire flight control system architecture designed for high reliability includes spare sensor and computer elements to permit safe dispatch with failed elements, thereby reducing unscheduled maintenance. A methodology capable of demonstrating that the architecture does achieve the predicted performance characteristics consists of a hierarchy of activities ranging from analytical calculations of system reliability and formal methods of software verification to iron bird testing followed by flight evaluation. Interfacing this architecture to the Lockheed S-3A aircraft for flight test is discussed. This testbed vehicle can be expanded to support flight experiments in advanced aerodynamics, electromechanical actuators, secondary power systems, flight management, new displays, and air traffic control concepts.
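The kind of analytical reliability calculation that such a methodology starts from can be illustrated with a k-out-of-n model: the system works if at least k of n redundant elements work. This is a sketch under the assumption that elements fail independently with identical reliability, which the report does not state; the function name is hypothetical.

```python
from math import comb


def k_of_n_reliability(k, n, p):
    """Probability that at least k of n independent elements are working.

    p is the reliability of a single element (assumed identical for all).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Illustrative numbers only: three sensors, two needed, each 0.9 reliable.
    print(k_of_n_reliability(2, 3, 0.9))  # approximately 0.972
    # Adding one spare sensor raises the figure.
    print(k_of_n_reliability(2, 4, 0.9))
```

Under these assumptions, each added spare raises the probability that enough elements remain, which is what allows dispatch with a failed element.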
Evaluation of Visual Computer Simulator for Computer Architecture Education
ERIC Educational Resources Information Center
Imai, Yoshiro; Imai, Masatoshi; Moritoh, Yoshio
2013-01-01
This paper presents trial evaluation of a visual computer simulator in 2009-2011, which has been developed to play some roles of both instruction facility and learning tool simultaneously. And it illustrates an example of Computer Architecture education for University students and usage of e-Learning tool for Assembly Programming in order to…
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
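To illustrate the idea of bit-serial SIMD processing in general terms, the sketch below simulates an array of 1-bit processing elements (PEs) that all execute the same full-adder step on one bit position per cycle. It is a software illustration with hypothetical names, not the design described in the report.

```python
def bit_serial_add(a_values, b_values, width):
    """Add two vectors element-wise, one bit position per step.

    Each list index stands for one processing element. Every PE keeps a
    1-bit carry and applies the same full-adder logic at each step, so
    the result is the sum modulo 2**width.
    """
    n = len(a_values)
    carry = [0] * n
    result = [0] * n
    for bit in range(width):      # one broadcast instruction per bit position
        for pe in range(n):       # conceptually simultaneous across all PEs
            a = (a_values[pe] >> bit) & 1
            b = (b_values[pe] >> bit) & 1
            s = a ^ b ^ carry[pe]                        # sum bit
            carry[pe] = (a & b) | (carry[pe] & (a ^ b))  # carry out
            result[pe] |= s << bit
    return result


if __name__ == "__main__":
    print(bit_serial_add([1, 2, 3], [4, 5, 6], width=8))  # [5, 7, 9]
```

The time per addition grows with the word width rather than with the number of elements, which is why such machines rely on very large PE counts for throughput.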
Architectural Strategies for Enabling Data-Driven Science at Scale
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Law, E. S.; Doyle, R. J.; Little, M. M.
2017-12-01
The analysis of large data collections from NASA or other agencies is often executed through traditional computational and data analysis approaches, which require users to bring data to their desktops and perform local data analysis. Alternatively, data are hauled to large computational environments that provide centralized data analysis via traditional High Performance Computing (HPC). Scientific data archives, however, are not only growing massive, but are also becoming highly distributed. Neither traditional approach provides a good solution for optimizing analysis into the future. Assumptions across the NASA mission and science data lifecycle, which historically assume that all data can be collected, transmitted, processed, and archived, will not scale as more capable instruments stress legacy-based systems. New paradigms are needed to increase the productivity and effectiveness of scientific data analysis. This paradigm must recognize that architectural and analytical choices are interrelated, and must be carefully coordinated in any system that aims to allow efficient, interactive scientific exploration and discovery to exploit massive data collections, from point of collection (e.g., onboard) to analysis and decision support. The most effective approach to analyzing a distributed set of massive data may involve some exploration and iteration, putting a premium on the flexibility afforded by the architectural framework. The framework should enable scientist users to assemble workflows efficiently, manage the uncertainties related to data analysis and inference, and optimize deep-dive analytics to enhance scalability. In many cases, this "data ecosystem" needs to be able to integrate multiple observing assets, ground environments, archives, and analytics, evolving from stewardship of measurements of data to using computational methodologies to better derive insight from the data that may be fused with other sets of data. 
This presentation will discuss architectural strategies, including a 2015-2016 NASA AIST Study on Big Data, for evolving scientific research towards massively distributed data-driven discovery. It will include example use cases across earth science, planetary science, and other disciplines.
A heterogeneous hierarchical architecture for real-time computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skroch, D.A.; Fornaro, R.J.
The need for high-speed data acquisition and control algorithms has prompted continued research in the area of multiprocessor systems and related programming techniques. The result presented here is a unique hardware and software architecture for high-speed real-time computer systems. The implementation of a prototype of this architecture has required the integration of architecture, operating systems and programming languages into a cohesive unit. This report describes a Heterogeneous Hierarchical Architecture for Real-Time (H² ART) and system software for program loading and interprocessor communication.
NASA Astrophysics Data System (ADS)
Rucinski, Marek; Coates, Adam; Montano, Giuseppe; Allouis, Elie; Jameux, David
2015-09-01
The Lightweight Advanced Robotic Arm Demonstrator (LARAD) is a state-of-the-art, two-meter long robotic arm for planetary surface exploration currently being developed by a UK consortium led by Airbus Defence and Space Ltd under contract to the UK Space Agency (CREST-2 programme). LARAD has a modular design, which allows for experimentation with different electronics and control software. The control system architecture includes the on-board computer, control software and firmware, and the communication infrastructure (e.g. data links, switches) connecting on-board computer(s), sensors, actuators and the end-effector. The purpose of the control system is to operate the arm according to pre-defined performance requirements, monitoring its behaviour in real-time and performing safing/recovery actions in case of faults. This paper reports on the results of a recent study about the feasibility of the development and integration of a novel control system architecture for LARAD fully based on the SpaceWire protocol. The current control system architecture is based on the combination of two communication protocols, Ethernet and CAN. The new SpaceWire-based control system will allow for improved monitoring and telecommanding performance thanks to higher communication data rate, allowing for the adoption of advanced control schemes, potentially based on multiple vision sensors, and for the handling of sophisticated end-effectors that require fine control, such as science payloads or robotic hands.
Special Feature: Teaching about High Tech.
ERIC Educational Resources Information Center
Kopf, Michael; And Others
1992-01-01
Includes four articles: "Virtual Reality" (Kopf), description of its uses in computer-assisted design, architecture, and technical training; "SME (Society of Manufacturing Engineers) Robotics Contest Opens Doors to Future" (Wagner); "Superconductivity" (Canady), description of classroom demonstrations and experiments;…
Advanced Computing Architectures for Cognitive Processing
2009-07-01
Evolution...Figure 9: Logic diagram smart block-based neuron...Figure 21: Naive Grid Potential Kernel...processing would be helpful for Air Force systems acquisition. Specific cognitive processing approaches addressed herein include global information grid
Distributive, Non-destructive Real-time System and Method for Snowpack Monitoring
NASA Technical Reports Server (NTRS)
Frolik, Jeff (Inventor); Skalka, Christian (Inventor)
2013-01-01
A ground-based system that provides quasi real-time measurement and collection of snow-water equivalent (SWE) data in remote settings is provided. The disclosed invention is significantly less expensive and easier to deploy than current methods and less susceptible to terrain and snow bridging effects. Embodiments of the invention include remote data recovery solutions. Compared to current infrastructure using existing SWE technology, the disclosed invention allows more SWE sites to be installed for similar cost and effort, in a greater variety of terrain; thus, enabling data collection at improved spatial resolutions. The invention integrates a novel computational architecture with new sensor technologies. The invention's computational architecture is based on wireless sensor networks, comprised of programmable, low-cost, low-powered nodes capable of sophisticated sensor control and remote data communication. The invention also includes measuring attenuation of electromagnetic radiation, an approach that is immune to snow bridging and significantly reduces sensor footprints.
Three Program Architecture for Design Optimization
NASA Technical Reports Server (NTRS)
Miura, Hirokazu; Olson, Lawrence E. (Technical Monitor)
1998-01-01
In this presentation, I would like to review the historical perspective on the program architecture used to build design optimization capabilities based on mathematical programming and other numerical search techniques. It is rather straightforward to classify the program architecture in three categories as shown above. However, the relative importance of each of the three approaches has not been static; it has changed dynamically as the capabilities of available computational resources have increased. For example, we once considered that the direct coupling architecture would never be used for practical problems, but the availability of such computer systems as multiprocessors has changed that. In this presentation, I would like to review the roles of the three architectures from historical as well as current and future perspectives. There may also be some possibility for the emergence of a hybrid architecture. I hope to provide some seeds for active discussion of where we are heading in the very dynamic environment for high speed computing and communication.
ERIC Educational Resources Information Center
Arumi, Francisco N.
Computer programs capable of describing the thermal behavior of buildings are used to help architectural students understand environmental systems. The Numerical Simulation Laboratory at the Architectural School of the University of Texas at Austin was developed to provide the necessary software capable of simulating the energy transactions…
Lunar Applications in Reconfigurable Computing
NASA Technical Reports Server (NTRS)
Somervill, Kevin
2008-01-01
NASA's Constellation Program is developing a lunar surface outpost in which reconfigurable computing will play a significant role. Reconfigurable systems provide a number of benefits over conventional software-based implementations including performance and power efficiency, while the use of standardized reconfigurable hardware provides opportunities to reduce logistical overhead. The current vision for the lunar surface architecture includes habitation, mobility, and communications systems, each of which greatly benefits from reconfigurable hardware in applications including video processing, natural feature recognition, data formatting, IP offload processing, and embedded control systems. In deploying reprogrammable hardware, considerations similar to those of software systems must be managed. There needs to be a mechanism for discovery enabling applications to locate and utilize the available resources. Also, application interfaces are needed to provide for both configuring the resources as well as transferring data between the application and the reconfigurable hardware. Each of these topics is explored in the context of deploying reconfigurable resources as an integral aspect of the lunar exploration architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Y.; Cameron, K.W.
1998-11-24
Workload characterization has proven an essential tool for architecture design and performance evaluation in both scientific and commercial computing areas. Traditional workload characterization techniques include FLOPS rate, cache miss ratios, CPI (cycles per instruction) or IPC (instructions per cycle), etc. With the complexity of sophisticated modern superscalar microprocessors, these traditional characterization techniques are not powerful enough to pinpoint the performance bottleneck of an application on a specific microprocessor. They are also incapable of immediately demonstrating the potential performance benefit of any architectural or functional improvement in a new processor design. To solve these problems, many people rely on simulators, which have substantial constraints especially on large-scale scientific computing applications. This paper presents a new technique of characterizing applications at the instruction level using hardware performance counters. It has the advantage of collecting instruction-level characteristics in a few runs virtually without overhead or slowdown. A variety of instruction counts can be utilized to calculate some average abstract workload parameters corresponding to microprocessor pipelines or functional units. Based on the microprocessor architectural constraints and these calculated abstract parameters, the architectural performance bottleneck for a specific application can be estimated. In particular, the analysis results can provide some insight to the problem that only a small percentage of processor peak performance can be achieved even for many very cache-friendly codes. Meanwhile, the bottleneck estimation can provide suggestions about viable architectural/functional improvement for certain workloads. Eventually, these abstract parameters can lead to the creation of an analytical microprocessor pipeline model and memory hierarchy model.
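As a rough illustration of deriving workload parameters from hardware counter readings, the sketch below computes CPI, IPC, the fraction of peak issue rate, and the instruction mix from raw counts. The counter names, the dictionary layout, and the sample values are assumptions made for this example; they are not the parameters defined in the paper.

```python
def characterize(counters, issue_width):
    """Derive simple workload parameters from raw counter values.

    counters    -- dict with "cycles", "instructions" and a "by_class"
                   dict of per-class instruction counts (hypothetical layout)
    issue_width -- maximum instructions the processor can issue per cycle
    """
    cycles = counters["cycles"]
    instructions = counters["instructions"]
    ipc = instructions / cycles
    return {
        "cpi": cycles / instructions,
        "ipc": ipc,
        "fraction_of_peak": ipc / issue_width,
        "mix": {name: count / instructions
                for name, count in counters["by_class"].items()},
    }


if __name__ == "__main__":
    # Made-up counter readings for illustration.
    sample = {
        "cycles": 2000,
        "instructions": 1000,
        "by_class": {"fp": 250, "load_store": 500, "branch": 250},
    }
    print(characterize(sample, issue_width=4))
```

A load/store fraction that is high relative to the number of memory ports, for instance, would point to the memory pipeline rather than the floating-point units as the likely bottleneck.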
NASA Astrophysics Data System (ADS)
Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.
2017-12-01
As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.
Practical Application of Model-based Programming and State-based Architecture to Space Missions
NASA Technical Reports Server (NTRS)
Horvath, Gregory; Ingham, Michel; Chung, Seung; Martin, Oliver; Williams, Brian
2006-01-01
A viewgraph presentation to develop models from systems engineers that accomplish mission objectives and manage the health of the system is shown. The topics include: 1) Overview; 2) Motivation; 3) Objective/Vision; 4) Approach; 5) Background: The Mission Data System; 6) Background: State-based Control Architecture System; 7) Background: State Analysis; 8) Overview of State Analysis; 9) Background: MDS Software Frameworks; 10) Background: Model-based Programming; 10) Background: Titan Model-based Executive; 11) Model-based Execution Architecture; 12) Compatibility Analysis of MDS and Titan Architectures; 13) Integrating Model-based Programming and Execution into the Architecture; 14) State Analysis and Modeling; 15) IMU Subsystem State Effects Diagram; 16) Titan Subsystem Model: IMU Health; 17) Integrating Model-based Programming and Execution into the Software IMU; 18) Testing Program; 19) Computationally Tractable State Estimation & Fault Diagnosis; 20) Diagnostic Algorithm Performance; 21) Integration and Test Issues; 22) Demonstrated Benefits; and 23) Next Steps
An Evaluation of an Ada Implementation of the Rete Algorithm for Embedded Flight Processors
1990-12-01
computers was desired. The VAX VMS operating system has many built-in methods for determining program performance (including VAX PCA), but these methods... overview of the target environment-- the MIL-STD-1750A VHSIC Avionic Modular Processor (VAMP), running under the Ada Avionics Real-Time Software (AARTS... computers. MIL-STD-1750A, the Air Force’s standard flight computer architecture, however, places severe constraints on applications software processing
Restricted access processor - An application of computer security technology
NASA Technical Reports Server (NTRS)
Mcmahon, E. M.
1985-01-01
This paper describes a security guard device that is currently being developed by Computer Sciences Corporation (CSC). The methods used to provide assurance that the system meets its security requirements include the system architecture, a system security evaluation, and the application of formal and informal verification techniques. The combination of state-of-the-art technology and the incorporation of new verification procedures results in a demonstration of the feasibility of computer security technology for operational applications.
High-performance computing with quantum processing units
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...
2017-03-01
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
High-performance computing with quantum processing units
DOE Office of Scientific and Technical Information (OSTI.GOV)
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
Exploring Asynchronous Many-Task Runtime Systems toward Extreme Scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Knight, Samuel; Baker, Gavin Matthew; Gamell, Marc
2015-10-01
Major exascale computing reports indicate a number of software challenges in meeting the dramatic change of system architectures in the near future. While a several-orders-of-magnitude increase in parallelism is the most commonly cited of those, hurdles also include performance heterogeneity of compute nodes across the system, increased imbalance between computational capacity and I/O capabilities, frequent system interrupts, and complex hardware architectures. Asynchronous task-parallel programming models show great promise in addressing these issues, but are not yet fully understood nor developed sufficiently for computational science and engineering application codes. We address these knowledge gaps through quantitative and qualitative exploration of leading candidate solutions in the context of engineering applications at Sandia. In this poster, we evaluate MiniAero code ported to three leading candidate programming models (Charm++, Legion and UINTAH) to examine the feasibility of these models that permit insertion of new programming model elements into an existing code base.
Automated Generation of Message-Passing Programs: An Evaluation Using CAPTools
NASA Technical Reports Server (NTRS)
Hribar, Michelle R.; Jin, Haoqiang; Yan, Jerry C.; Saini, Subhash (Technical Monitor)
1998-01-01
Scientists at NASA Ames Research Center have been developing computational aeroscience applications on highly parallel architectures over the past ten years. During that same time period, a steady transition of hardware and system software also occurred, forcing us to expend great effort on migrating and re-coding our applications. As applications and machine architectures become increasingly complex, the cost and time required for this process will become prohibitive. In this paper, we present the first set of results in our evaluation of interactive parallelization tools. In particular, we evaluate CAPTools' ability to parallelize computational aeroscience applications. CAPTools was tested on serial versions of the NAS Parallel Benchmarks and ARC3D, a computational fluid dynamics application, on two platforms: the SGI Origin 2000 and the Cray T3E. This evaluation includes performance, amount of user interaction required, limitations and portability. Based on these results, a discussion on the feasibility of computer-aided parallelization of aerospace applications is presented along with suggestions for future work.
NASA Technical Reports Server (NTRS)
Weeks, Cindy Lou
1986-01-01
Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas; Schuman, Catherine; Patton, Robert
The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale: high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery: new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures: assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshop we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.
Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuller, Ivan K.; Stevens, Rick; Pino, Robinson
2015-10-29
Computation in its many forms is the engine that fuels our modern civilization. Modern computation, based on the von Neumann architecture, has allowed, until now, the development of continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably, within the next 10 years, reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS-based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: (1) the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; (2) new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; (3) device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; and (4) comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas E; Schuman, Catherine D; Young, Steven R
Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.
ERIC Educational Resources Information Center
Amenyo, John-Thones
2012-01-01
Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
Program optimizations: The interplay between power, performance, and energy
Leon, Edgar A.; Karlin, Ian; Grant, Ryan E.; ...
2016-05-16
Practical considerations for future supercomputer designs will impose limits on both instantaneous power consumption and total energy consumption. Working within these constraints while providing the maximum possible performance, application developers will need to optimize their code for speed alongside power and energy concerns. This paper analyzes the effectiveness of several code optimizations including loop fusion, data structure transformations, and global allocations. A per component measurement and analysis of different architectures is performed, enabling the examination of code optimizations on different compute subsystems. Using an explicit hydrodynamics proxy application from the U.S. Department of Energy, LULESH, we show how code optimizations impact different computational phases of the simulation. This provides insight for simulation developers into the best optimizations to use during particular simulation compute phases when optimizing code for future supercomputing platforms. Here, we examine and contrast both x86 and Blue Gene architectures with respect to these optimizations.
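As an illustration of the loop fusion optimization named in this abstract, the following minimal Python sketch contrasts two separate passes over an array with a single fused pass. The functions `unfused` and `fused` and the arithmetic inside them are hypothetical and are not taken from LULESH; they only show the general shape of the transformation.

```python
def unfused(xs):
    # Pass 1: build an intermediate array (extra memory traffic).
    scaled = [2.0 * x for x in xs]
    # Pass 2: read the intermediate array back to accumulate.
    total = 0.0
    for s in scaled:
        total += s * s
    return total


def fused(xs):
    # One pass: each element is scaled and accumulated while it is
    # still at hand, so no intermediate array is created.
    total = 0.0
    for x in xs:
        s = 2.0 * x
        total += s * s
    return total
```

Both functions compute the same value; the fused form simply avoids materializing the intermediate list, which is the kind of change whose power and energy effects the paper measures.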
The Road to a New Unified Command
2008-01-01
The Directorate of Command, Control, Communications, and Computers (C4) Systems is chartered with information architecture (including in Africa...Friday afternoon cinema presentations where a documentary or feature film covering an African historic event was played, followed by dialogue
Understanding Evolutionary Potential in Virtual CPU Instruction Set Architectures
Bryson, David M.; Ofria, Charles
2013-01-01
We investigate fundamental decisions in the design of instruction set architectures for linear genetic programs that are used as both model systems in evolutionary biology and underlying solution representations in evolutionary computation. We subjected digital organisms with each tested architecture to seven different computational environments designed to present a range of evolutionary challenges. Our goal was to engineer a general purpose architecture that would be effective under a broad range of evolutionary conditions. We evaluated six different types of architectural features for the virtual CPUs: (1) genetic flexibility: we allowed digital organisms to more precisely modify the function of genetic instructions, (2) memory: we provided an increased number of registers in the virtual CPUs, (3) decoupled sensors and actuators: we separated input and output operations to enable greater control over data flow. We also tested a variety of methods to regulate expression: (4) explicit labels that allow programs to dynamically refer to specific genome positions, (5) position-relative search instructions, and (6) multiple new flow control instructions, including conditionals and jumps. Each of these features also adds complication to the instruction set and risks slowing evolution due to epistatic interactions. Two features (multiple argument specification and separated I/O) demonstrated substantial improvements in the majority of test environments, along with versions of each of the remaining architecture modifications that show significant improvements in multiple environments. However, some tested modifications were detrimental, though most exhibit no systematic effects on evolutionary potential, highlighting the robustness of digital evolution. 
Combined, these observations enhance our understanding of how instruction architecture impacts evolutionary potential, enabling the creation of architectures that support more rapid evolution of complex solutions to a broad range of challenges. PMID:24376669
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures are best suited to which classes of problems. The architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
Switching from computer to microcomputer architecture education
NASA Astrophysics Data System (ADS)
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-03-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to microcomputer architecture. The authors present their strategies towards a successful crossing of boundaries between engineering disciplines. This communication aims at providing a different aspect on professional courses that are, nowadays, addressed at the expense of traditional courses.
Optimized planning methodologies of ASON implementation
NASA Astrophysics Data System (ADS)
Zhou, Michael M.; Tamil, Lakshman S.
2005-02-01
Advanced network planning concerns effective network-resource allocation for a dynamic and open business environment. Planning methodologies for ASON implementation based on qualitative analysis and mathematical modeling are presented in this paper. The methodology includes methods for rationalizing technology and architecture, building network and nodal models, and developing dynamic programming for multi-period deployment. The multi-layered nodal architecture proposed here can accommodate various nodal configurations for a multi-plane optical network, and the network modeling presented here computes the required network elements for optimizing resource allocation.
EHR standards--A comparative study.
Blobel, Bernd; Pharow, Peter
2006-01-01
For ensuring quality and efficiency of patient's care, the care paradigm moves from organization-centered over process-controlled towards personal care. Such health system paradigm change leads to new paradigms for analyzing, designing, implementing and deploying supporting health information systems including EHR systems as core application in a distributed eHealth environment. The paper defines the architectural paradigm for future-proof EHR systems. It compares advanced EHR architectures referencing them at the Generic Component Model. The paper introduces the evolving paradigm of autonomous computing for self-organizing health information systems.
Three-Dimensional Nanobiocomputing Architectures With Neuronal Hypercells
2007-06-01
Neumann architectures, and CMOS fabrication. Novel solutions of massive parallel distributed computing and processing (pipelined due to systolic... and processing platforms utilizing molecular hardware within an enabling organization and architecture. The design technology is based on utilizing a...Microsystems and Nanotechnologies investigated a novel 3D3 (Hardware Software Nanotechnology) technology to design super-high performance computing
SpaceCubeX: A Framework for Evaluating Hybrid Multi-Core CPU FPGA DSP Architectures
NASA Technical Reports Server (NTRS)
Schmidt, Andrew G.; Weisz, Gabriel; French, Matthew; Flatley, Thomas; Villalpando, Carlos Y.
2017-01-01
The SpaceCubeX project is motivated by the need for high performance, modular, and scalable on-board processing to help scientists answer critical 21st century questions about global climate change, air quality, ocean health, and ecosystem dynamics, while adding new capabilities such as low-latency data products for extreme event warnings. These goals translate into on-board processing throughput requirements that are on the order of 100-1,000 times more than those of previous Earth Science missions for standard processing, compression, storage, and downlink operations. To study possible future architectures to achieve these performance requirements, the SpaceCubeX project provides an evolvable testbed and framework that enables a focused design space exploration of candidate hybrid CPU/FPGA/DSP processing architectures. The framework includes ArchGen, an architecture generator tool populated with candidate architecture components, performance models, and IP cores, that allows an end user to specify the type, number, and connectivity of a hybrid architecture. The framework requires minimal extensions to integrate new processors, such as the anticipated High Performance Spaceflight Computer (HPSC), reducing time to initiate benchmarking by months. To evaluate the framework, we leverage a wide suite of high performance embedded computing benchmarks and Earth science scenarios to ensure robust architecture characterization. We report on our project's Year 1 efforts and demonstrate the capabilities across four simulation testbed models: a baseline SpaceCube 2.0 system, a dual ARM A9 processor system, a hybrid quad ARM A53 and FPGA system, and a hybrid quad ARM A53 and DSP system.
Switching from Computer to Microcomputer Architecture Education
ERIC Educational Resources Information Center
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-01-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to…
An Architecture for Cross-Cloud System Management
NASA Astrophysics Data System (ADS)
Dodda, Ravi Teja; Smith, Chris; van Moorsel, Aad
The emergence of the cloud computing paradigm promises flexibility and adaptability through on-demand provisioning of compute resources. As the utilization of cloud resources extends beyond a single provider, for business as well as technical reasons, the issue of effectively managing such resources comes to the fore. Different providers expose different interfaces to their compute resources utilizing varied architectures and implementation technologies. This heterogeneity poses a significant system management problem, and can limit the extent to which the benefits of cross-cloud resource utilization can be realized. We address this problem through the definition of an architecture to facilitate the management of compute resources from different cloud providers in a homogeneous manner. This preserves the flexibility and adaptability promised by the cloud computing paradigm, whilst enabling the benefits of cross-cloud resource utilization to be realized. The practical efficacy of the architecture is demonstrated through an implementation utilizing compute resources managed through different interfaces on the Amazon Elastic Compute Cloud (EC2) service. Additionally, we provide empirical results highlighting the performance differential of these different interfaces, and discuss the impact of this performance differential on efficiency and profitability.
A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.
Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang
2016-12-07
The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
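The feature extraction idea in this abstract (split a spike at its peak and use the area of each portion as a feature) can be sketched in a few lines of Python. This is a software illustration only, not the paper's hardware design: the function names, the choice of the maximum sample as the peak, the convention that the peak sample belongs to the first portion, and the use of summed absolute values as "area" are all assumptions. The `neo` helper uses the standard discrete Non-linear Energy Operator, psi[n] = x[n]^2 - x[n-1]*x[n+1].

```python
def area_features(spike):
    """Return (peak_index, area_before, area_after) for one spike.

    Assumptions (not from the paper): the peak is the largest sample,
    the peak sample counts toward the first portion, and "area" is the
    sum of absolute sample values.
    """
    peak = max(range(len(spike)), key=lambda i: spike[i])
    area_before = sum(abs(v) for v in spike[:peak + 1])
    area_after = sum(abs(v) for v in spike[peak + 1:])
    return peak, area_before, area_after


def neo(x):
    """Non-linear Energy Operator for the interior samples of x."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
```

For example, `area_features([0, 1, 4, 2, -1])` splits the waveform at index 2 and yields the two areas 5 and 3, which would then be passed to a classifier.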
Computing NLTE Opacities -- Node Level Parallel Calculation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holladay, Daniel
Presentation. The goal: to produce a robust library capable of computing reasonably accurate opacities inline with the assumption of LTE relaxed (non-LTE). Near term: demonstrate acceleration of non-LTE opacity computation. Far term (if funded): connect to application codes with in-line capability and compute opacities. Study science problems. Use efficient algorithms that expose many levels of parallelism and utilize good memory access patterns for use on advanced architectures. Portability to multiple types of hardware including multicore processors, manycore processors such as KNL, GPUs, etc. Easily coupled to radiation hydrodynamics and thermal radiative transfer codes.
NASA Technical Reports Server (NTRS)
Kemeny, Sabrina E.
1994-01-01
Electronic and optoelectronic hardware implementations of highly parallel computing architectures address several ill-defined and/or computation-intensive problems not easily solved by conventional computing techniques. The concurrent processing architectures developed are derived from a variety of advanced computing paradigms including neural network models, fuzzy logic, and cellular automata. Hardware implementation technologies range from state-of-the-art digital/analog custom-VLSI to advanced optoelectronic devices such as computer-generated holograms and e-beam fabricated Dammann gratings. JPL's concurrent processing devices group has developed a broad technology base in hardware implementable parallel algorithms, low-power and high-speed VLSI designs and building block VLSI chips, leading to application-specific high-performance embeddable processors. Application areas include high throughput map-data classification using feedforward neural networks, terrain based tactical movement planner using cellular automata, resource optimization (weapon-target assignment) using a multidimensional feedback network with lateral inhibition, and classification of rocks using an inner-product scheme on thematic mapper data. In addition to addressing specific functional needs of DOD and NASA, the JPL-developed concurrent processing device technology is also being customized for a variety of commercial applications (in collaboration with industrial partners), and is being transferred to U.S. industries. This viewgraph presentation focuses on two application-specific processors which solve the computation intensive tasks of resource allocation (weapon-target assignment) and terrain based tactical movement planning using two extremely different topologies. Resource allocation is implemented as an asynchronous analog competitive assignment architecture inspired by the Hopfield network.
Hardware realization leads to a two to four order of magnitude speed-up over conventional techniques and enables multiple assignments, (many to many), not achievable with standard statistical approaches. Tactical movement planning (finding the best path from A to B) is accomplished with a digital two-dimensional concurrent processor array. By exploiting the natural parallel decomposition of the problem in silicon, a four order of magnitude speed-up over optimized software approaches has been demonstrated.
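To make the "natural parallel decomposition" of grid path planning concrete, here is a minimal Python sketch of a cellular-automaton-style update in which every cell repeatedly sets its cost to its own terrain cost plus the cheapest neighbouring cost. It is a software stand-in under stated assumptions, not the JPL processor's actual algorithm: the function name `plan_costs`, the four-neighbour rule, and the convention that `terrain[r][c]` is the positive cost of entering a cell are all hypothetical.

```python
INF = float("inf")


def plan_costs(terrain, goal):
    """Cost-to-goal for every cell of a grid with positive entry costs."""
    rows, cols = len(terrain), len(terrain[0])
    cost = [[INF] * cols for _ in range(rows)]
    cost[goal[0]][goal[1]] = 0
    changed = True
    while changed:
        changed = False
        new = [row[:] for row in cost]  # synchronous update, as in a cell array
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal:
                    continue
                neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                best = min(
                    (cost[nr][nc] for nr, nc in neighbours
                     if 0 <= nr < rows and 0 <= nc < cols),
                    default=INF,
                )
                candidate = best + terrain[r][c]
                if candidate < new[r][c]:
                    new[r][c] = candidate
                    changed = True
        cost = new
    return cost
```

Each cell's update depends only on its neighbours' previous values, so in hardware all cells could update at once; the best path from any start cell is then found by stepping to the cheapest neighbour until the goal is reached.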
Li, Xumeng; Wang, Xiaohui; Wei, Hailin; Zhu, Xinguang; Peng, Yulin; Li, Ming; Li, Tao; Huang, Huang
2017-01-01
This study developed a technique system for the measurement, reconstruction, and trait extraction of rice canopy architectures, which have challenged functional–structural plant modeling for decades and have become the foundation of the design of ideo-plant architectures. The system uses the location-separation-measurement method (LSMM) for the collection of data on the canopy architecture and the analytic geometry method for the reconstruction and visualization of the three-dimensional (3D) digital architecture of the rice plant. It also uses the virtual clipping method for extracting the key traits of the canopy architecture such as the leaf area, inclination, and azimuth distribution in spatial coordinates. To establish the technique system, we developed (i) simple tools to measure the spatial position of the stem axis and azimuth of the leaf midrib and to capture images of tillers and leaves; (ii) computer software programs for extracting data on stem diameter, leaf nodes, and leaf midrib curves from the tiller images and data on leaf length, width, and shape from the leaf images; (iii) a database of digital architectures that stores the measured data and facilitates the reconstruction of the 3D visual architecture and the extraction of architectural traits; and (iv) computation algorithms for virtual clipping to stratify the rice canopy, to extend the stratified surface from the horizontal plane to a general curved surface (including a cylindrical surface), and to implement in silico. Each component of the technique system was quantitatively validated and visually compared to images, and the sensitivity of the virtual clipping algorithms was analyzed. This technique is inexpensive and accurate and provides high throughput for the measurement, reconstruction, and trait extraction of rice canopy architectures. 
The technique provides a more practical method of data collection to serve functional–structural plant models of rice and for the optimization of rice canopy types. Moreover, the technique can be easily adapted for other cereal crops such as wheat, which has numerous stems and leaves sheltering each other. PMID:28558045
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Radenski, Atanas; Follen, Gregory J. (Technical Monitor)
2001-01-01
The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
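The master task-pool idea described in this abstract can be illustrated with a small single-process Python sketch. It only simulates the pattern: a queue stands in for the master task-pool server, and each loop iteration stands in for a satellite server taking one task and either solving it or dividing it. The function name `divide_and_conquer_sum`, the `threshold` parameter, and the choice of summation as the problem are hypothetical and not part of the project's actual interface.

```python
from collections import deque


def divide_and_conquer_sum(data, threshold=4):
    """Sum a list by splitting it into tasks held in a shared pool."""
    threshold = max(1, threshold)   # guard: a task of length 1 is always solvable
    pool = deque([list(data)])      # stands in for the master task pool
    partials = []                   # results reported back to the master
    while pool:
        task = pool.popleft()       # a "satellite" takes the next task
        if len(task) <= threshold:
            partials.append(sum(task))   # small enough: conquer directly
        else:
            mid = len(task) // 2         # too large: divide and return halves
            pool.append(task[:mid])
            pool.append(task[mid:])
    return sum(partials)            # the master combines partial results
```

Because tasks in the pool are independent, a real deployment could hand them to any available node in any order, which is what makes the pattern tolerant of a loose, changing set of volunteers.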
THE COMPUTER AND THE ARCHITECTURAL PROFESSION.
ERIC Educational Resources Information Center
HAVILAND, DAVID S.
THE ROLE OF ADVANCING TECHNOLOGY IN THE FIELD OF ARCHITECTURE IS DISCUSSED IN THIS REPORT. PROBLEMS IN COMMUNICATION AND THE DESIGN PROCESS ARE IDENTIFIED. ADVANTAGES AND DISADVANTAGES OF COMPUTERS ARE MENTIONED IN RELATION TO MAN AND MACHINE INTERACTION. PRESENT AND FUTURE IMPLICATIONS OF COMPUTER USAGE ARE IDENTIFIED AND DISCUSSED WITH RESPECT…
The Contribution of Visualization to Learning Computer Architecture
ERIC Educational Resources Information Center
Yehezkel, Cecile; Ben-Ari, Mordechai; Dreyfus, Tommy
2007-01-01
This paper describes a visualization environment and associated learning activities designed to improve learning of computer architecture. The environment, EasyCPU, displays a model of the components of a computer and the dynamic processes involved in program execution. We present the results of a research program that analysed the contribution of…
Technology advances and market forces: Their impact on high performance architectures
NASA Technical Reports Server (NTRS)
Best, D. R.
1978-01-01
Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vineyard, Craig Michael; Verzi, Stephen Joseph
As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.
Blueprint for a microwave trapped ion quantum computer.
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K
2017-02-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
Yang, Shu; Qiu, Yuyan; Shi, Bo
2016-09-01
This paper explores methods of building an Internet of Things for regional ECG monitoring, focusing on the implementation of an ECG monitoring center based on a cloud computing platform. It analyzes the implementation principles of automatic identification of arrhythmia types. It also studies the system architecture and key techniques of the cloud computing platform, including server load balancing technology, reliable storage of massive small files, and the implications of the quick search function.
PERCLOS: A Valid Psychophysiological Measure of Alertness As Assessed by Psychomotor Vigilance
DOT National Transportation Integrated Search
2002-04-01
The Logical Architecture is based on a Computer Aided Systems Engineering (CASE) model of the requirements for the flow of data and control through the various functions included in Intelligent Transportation Systems (ITS). Process Specifications pro...
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Youngblood, John N.; Saha, Aindam
1987-01-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
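The abstract above bases its task allocation on critical path analysis. As a hedged sketch of that one step only (not the authors' allocation algorithm), the following computes the critical path of a small task graph. The names `critical_path`, `durations`, and `deps`, and the example graph, are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, deps):
    """Return (length, path) of the longest chain through a task graph.

    durations: task -> execution time (must cover every task).
    deps: task -> set of tasks that must finish first.
    """
    finish = {}  # earliest finish time of each task
    prev = {}    # predecessor that determines each task's start time
    for task in TopologicalSorter(deps).static_order():
        # The task starts when its slowest predecessor has finished.
        best = max(deps.get(task, ()), key=lambda p: finish[p], default=None)
        start = finish[best] if best is not None else 0
        finish[task] = start + durations[task]
        prev[task] = best
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    length = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return length, path[::-1]


# Hypothetical four-task graph: c needs a and b, d needs c.
durations = {"a": 2, "b": 3, "c": 1, "d": 4}
deps = {"c": {"a", "b"}, "d": {"c"}}
print(critical_path(durations, deps))
```

The critical path length is a lower bound on the completion time regardless of how many processing elements are available, which is why it is a natural starting point for an allocation scheme.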
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, C.C.; Youngblood, J.N.; Saha, A.
1987-12-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
SU (2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.; ...
2011-03-02
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broad range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.
Neural codes of seeing architectural styles
Choo, Heeyoung; Nasar, Jack L.; Nikrahei, Bardia; Walther, Dirk B.
2017-01-01
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people’s visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture. PMID:28071765
Neural codes of seeing architectural styles.
Choo, Heeyoung; Nasar, Jack L; Nikrahei, Bardia; Walther, Dirk B
2017-01-10
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people's visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture.
NASA Astrophysics Data System (ADS)
Litinski, Daniel; Kesselring, Markus S.; Eisert, Jens; von Oppen, Felix
2017-07-01
We present a scalable architecture for fault-tolerant topological quantum computation using networks of voltage-controlled Majorana Cooper pair boxes and topological color codes for error correction. Color codes have a set of transversal gates which coincides with the set of topologically protected gates in Majorana-based systems, namely, the Clifford gates. In this way, we establish color codes as providing a natural setting in which advantages offered by topological hardware can be combined with those arising from topological error-correcting software for full-fledged fault-tolerant quantum computing. We provide a complete description of our architecture, including the underlying physical ingredients. We start by showing that in topological superconductor networks, hexagonal cells can be employed to serve as physical qubits for universal quantum computation, and we present protocols for realizing topologically protected Clifford gates. These hexagonal-cell qubits allow for a direct implementation of open-boundary color codes with ancilla-free syndrome read-out and logical T gates via magic-state distillation. For concreteness, we describe how the necessary operations can be implemented using networks of Majorana Cooper pair boxes, and we give a feasibility estimate for error correction in this architecture. Our approach is motivated by nanowire-based networks of topological superconductors, but it could also be realized in alternative settings such as quantum-Hall-superconductor hybrids.
Atzori, Manfredo; Cognolato, Matteo; Müller, Henning
2016-01-01
Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to run several tests in order to evaluate the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate if larger networks can increase sEMG classification accuracy too. PMID:27656140
Biomorphic Multi-Agent Architecture for Persistent Computing
NASA Technical Reports Server (NTRS)
Lodding, Kenneth N.; Brewster, Paul
2009-01-01
A multi-agent software/hardware architecture, inspired by the multicellular nature of living organisms, has been proposed as the basis of design of a robust, reliable, persistent computing system. Just as a multicellular organism can adapt to changing environmental conditions and can survive despite the failure of individual cells, a multi-agent computing system, as envisioned, could adapt to changing hardware, software, and environmental conditions. In particular, the computing system could continue to function (perhaps at a reduced but still reasonable level of performance) if one or more component(s) of the system were to fail. One of the defining characteristics of a multicellular organism is unity of purpose. In biology, the purpose is survival of the organism. The purpose of the proposed multi-agent architecture is to provide a persistent computing environment in harsh conditions in which repair is difficult or impossible. A multi-agent, organism-like computing system would be a single entity built from agents or cells. Each agent or cell would be a discrete hardware processing unit that would include a data processor with local memory, an internal clock, and a suite of communication equipment capable of both local line-of-sight communications and global broadcast communications. Some cells, denoted specialist cells, could contain such additional hardware as sensors and emitters. Each cell would be independent in the sense that there would be no global clock, no global (shared) memory, no pre-assigned cell identifiers, no pre-defined network topology, and no centralized brain or control structure. Like each cell in a living organism, each agent or cell of the computing system would contain a full description of the system encoded as genes, but in this case, the genes would be components of a software genome.
AHaH computing-from metastable switches to attractors to machine learning.
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures-all key capabilities of biological nervous systems and modern machine learning algorithms with real world application.
PIMS: Memristor-Based Processing-in-Memory-and-Storage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cook, Jeanine
Continued progress in computing has augmented the quest for higher performance with a new quest for higher energy efficiency. This has led to the re-emergence of Processing-In-Memory (PIM) architectures that offer higher density and performance with some boost in energy efficiency. Past PIM work either integrated a standard CPU with a conventional DRAM to improve the CPU-memory link, or used a bit-level processor with Single Instruction Multiple Data (SIMD) control, but neither matched the energy consumption of the memory to the computation. We originally proposed to develop a new architecture derived from PIM that more effectively addressed energy efficiency for high performance scientific, data analytics, and neuromorphic applications. We also originally planned to implement a von Neumann architecture with arithmetic/logic units (ALUs) that matched the power consumption of an advanced storage array to maximize energy efficiency. Implementing this architecture in storage was our original idea, since by augmenting storage (instead of memory), the system could address both in-memory computation and applications that accessed larger data sets directly from storage, hence Processing-in-Memory-and-Storage (PIMS). However, as our research matured, we discovered several things that changed our original direction, the most important being that a PIM that implements a standard von Neumann-type architecture results in significant energy efficiency improvement, but only about an O(10) performance improvement. In addition to this, the emergence of new memory technologies moved us to proposing a non-von Neumann architecture, called Superstrider, implemented not in storage, but in a new DRAM technology called High Bandwidth Memory (HBM). HBM is a stacked DRAM technology that includes a logic layer where an architecture such as Superstrider could potentially be implemented.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.; Som, Sukhamony
1990-01-01
The performance modeling and enhancement for periodic execution of large-grain, decision-free algorithms in data flow architectures is examined. Applications include real-time implementation of control and signal processing algorithms where performance is required to be highly predictable. The mapping of algorithms onto the specified class of data flow architectures is realized by a marked graph model called ATAMM (Algorithm To Architecture Mapping Model). Performance measures and bounds are established. Algorithm transformation techniques are identified for performance enhancement and reduction of resource (computing element) requirements. A systematic design procedure is described for generating operating conditions for predictable performance both with and without resource constraints. An ATAMM simulator is used to test and validate the performance prediction by the design procedure. Experiments on a three resource testbed provide verification of the ATAMM model and the design procedure.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Som, Sukhamoy; Stoughton, John W.; Mielke, Roland R.
1990-01-01
Performance modeling and performance enhancement for periodic execution of large-grain, decision-free algorithms in data flow architectures are discussed. Applications include real-time implementation of control and signal processing algorithms where performance is required to be highly predictable. The mapping of algorithms onto the specified class of data flow architectures is realized by a marked graph model called algorithm to architecture mapping model (ATAMM). Performance measures and bounds are established. Algorithm transformation techniques are identified for performance enhancement and reduction of resource (computing element) requirements. A systematic design procedure is described for generating operating conditions for predictable performance both with and without resource constraints. An ATAMM simulator is used to test and validate the performance prediction by the design procedure. Experiments on a three resource testbed provide verification of the ATAMM model and the design procedure.
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
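The abstract above builds its transform from identical inner product processors. The sketch below is not the systolic FFT design studied in the report: it computes a plain discrete Fourier transform, chosen only to show the multiply-accumulate step that such a processor would repeat. The function names are hypothetical.

```python
import cmath


def inner_product_step(acc, weight, sample):
    """One multiply-accumulate, the basic operation of an inner product cell."""
    return acc + weight * sample


def dft_by_inner_products(samples):
    """Direct DFT: each output bin is the inner product of the input
    with one row of complex twiddle factors."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            twiddle = cmath.exp(-2j * cmath.pi * k * t / n)
            acc = inner_product_step(acc, twiddle, x)
        spectrum.append(acc)
    return spectrum
```

This direct form needs on the order of n squared steps; an FFT reorganizes the same multiply-accumulate operations to reduce that count, and a systolic array pipelines them across a line of cells.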
An S_N Algorithm for Modern Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
2016-08-29
LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
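The active-list update strategy described in the abstract above can be sketched in a much simpler setting. The code below is an illustrative assumption, not the paper's method: it runs serially on a regular 2D grid with unit speed and a standard first-order upwind update, whereas the paper targets unstructured tetrahedral meshes on SIMD hardware. All function and parameter names are hypothetical.

```python
import math


def solve_local(u, i, j, h=1.0):
    """First-order upwind update for |grad u| = 1 at grid node (i, j)."""
    rows, cols = len(u), len(u[0])
    a = min(u[i - 1][j] if i > 0 else math.inf,
            u[i + 1][j] if i < rows - 1 else math.inf)
    b = min(u[i][j - 1] if j > 0 else math.inf,
            u[i][j + 1] if j < cols - 1 else math.inf)
    if math.isinf(a) and math.isinf(b):
        return math.inf  # no upwind information yet
    if abs(a - b) >= h:
        return min(a, b) + h  # only one direction contributes
    return 0.5 * (a + b + math.sqrt(2 * h * h - (a - b) ** 2))


def fast_iterative_eikonal(rows, cols, source, h=1.0, eps=1e-9):
    """Iterate over an active list until every node has converged."""
    u = [[math.inf] * cols for _ in range(rows)]
    u[source[0]][source[1]] = 0.0

    def neighbors(i, j):
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                yield ni, nj

    active = set(neighbors(*source))
    while active:
        for i, j in list(active):
            old = u[i][j]
            new = min(old, solve_local(u, i, j, h))
            u[i][j] = new
            if abs(old - new) < eps:
                # Converged: try to improve neighbors, then retire the node.
                for ni, nj in neighbors(i, j):
                    if (ni, nj) not in active:
                        cand = solve_local(u, ni, nj, h)
                        if cand < u[ni][nj] - eps:
                            u[ni][nj] = cand
                            active.add((ni, nj))
                active.discard((i, j))
    return u
```

Nodes in the active list are updated independently of any global ordering, which is the property that lets this family of methods map onto parallel hardware; the sequential loop here only mimics that behavior.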
Hierarchial parallel computer architecture defined by computational multidisciplinary mechanics
NASA Technical Reports Server (NTRS)
Padovan, Joe; Gute, Doug; Johnson, Keith
1989-01-01
The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosophical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, and the impact on solvers are shown in viewgraph form.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uhr, L.
1987-01-01
This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
Computer Architecture's Changing Role in Rebooting Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik P.
Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Computer Architecture's Changing Role in Rebooting Computing
DeBenedictis, Erik P.
2017-04-26
Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Using a software-defined computer in teaching the basics of computer architecture and operation
NASA Astrophysics Data System (ADS)
Kosowska, Julia; Mazur, Grzegorz
2017-08-01
The paper describes the concept and implementation of the SDC_One software-defined computer, designed for experimental and didactic purposes. Equipped with extensive hardware monitoring mechanisms, the device enables students to monitor the computer's operation on a bus transfer cycle or instruction cycle basis, providing a practical illustration of basic aspects of the computer's operation. In the paper, we describe the hardware monitoring capabilities of SDC_One and some scenarios for using it in teaching the basics of computer architecture and microprocessor operation.
A Serial Bus Architecture for Parallel Processing Systems
1986-09-01
...chip. The wider the communication path, the more pins are needed to effect the data transfer. As integrated circuits grow in computational power, more communication capacity is needed, pushing...
ERIC Educational Resources Information Center
Online-Offline, 1998
1998-01-01
This theme issue on recreation includes annotated listings of Web sites, CD-ROMs, computer software, videos, books, magazines, and professional resources that deal with recreation for K-8 language arts, art/architecture, music/dance, science, math, social studies, and health/physical education. Sidebars discuss fun and games, recess recreation,…
A bibliography on parallel and vector numerical algorithms
NASA Technical Reports Server (NTRS)
Ortega, James M.; Voigt, Robert G.; Romine, Charles H.
1988-01-01
This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.
A bibliography on parallel and vector numerical algorithms
NASA Technical Reports Server (NTRS)
Ortega, J. M.; Voigt, R. G.
1987-01-01
This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also.
A bibliography on parallel and vector numerical algorithms
NASA Technical Reports Server (NTRS)
Ortega, James M.; Voigt, Robert G.; Romine, Charles H.
1990-01-01
This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.
Multicore Education through Simulation
ERIC Educational Resources Information Center
Ozturk, O.
2011-01-01
A project-oriented course for advanced undergraduate and graduate students is described for simulating multiple processor cores. Simics, a free simulator for academia, was utilized to enable students to explore computer architecture, operating systems, and hardware/software cosimulation. Motivation for including this course in the curriculum is…
Accelerating next generation sequencing data analysis with system level optimizations.
Kathiresan, Nagarajan; Temanni, Ramzi; Almabrazi, Hakeem; Syed, Najeeb; Jithesh, Puthen V; Al-Ali, Rashid
2017-08-22
Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features of modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system level parameters before running the application. We studied the GATK-HaplotypeCaller, a part of common NGS workflows that consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked and the execution time of HaplotypeCaller was optimized by various system level parameters, which included: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing, (ii) architecture-specific tuning in the PairHMM library for vectorization, (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer, and (iv) overriding the default 'on-demand' CPU frequency mode with 'performance-mode' to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of the NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7, respectively.
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance
NASA Technical Reports Server (NTRS)
Rushby, John
1999-01-01
Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline that is representative of the time it takes the software application to run on a computing device having a known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provides an actual time that is representative of the time the known software application runs on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
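The timing comparison described above can be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the patented system: the helper names (`time_function`, `build_baseline`, `is_suspicious`) and the z-score threshold are hypothetical, and a simple z-score on the mean run time stands in for whatever statistical test the actual system uses.

```python
import statistics
import time


def time_function(fn, repeats=30):
    """Run an instrumented function several times and collect wall-clock timings."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def build_baseline(samples):
    """Summarize timings from the known-clean device as (mean, sample stdev)."""
    return statistics.mean(samples), statistics.stdev(samples)


def is_suspicious(baseline, observed_samples, z_threshold=3.0):
    """Flag the device under test if its mean run time deviates from the baseline.

    Assumption: a plain z-score against the baseline spread is good enough
    for illustration.
    """
    mean, stdev = baseline
    observed = statistics.mean(observed_samples)
    if stdev == 0:
        return observed != mean
    return abs(observed - mean) / stdev > z_threshold
```

In use, `build_baseline(time_function(workload))` would be computed once on the trusted machine, and `is_suspicious` would then be called with timings of the same workload collected on the machine of unknown pedigree.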
Modeling driver behavior in a cognitive architecture.
Salvucci, Dario D
2006-01-01
This paper explores the development of a rigorous computational model of driver behavior in a cognitive architecture--a computational framework with underlying psychological theories that incorporate basic properties and limitations of the human system. Computational modeling has emerged as a powerful tool for studying the complex task of driving, allowing researchers to simulate driver behavior and explore the parameters and constraints of this behavior. An integrated driver model developed in the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture is described that focuses on the component processes of control, monitoring, and decision making in a multilane highway environment. This model accounts for the steering profiles, lateral position profiles, and gaze distributions of human drivers during lane keeping, curve negotiation, and lane changing. The model demonstrates how cognitive architectures facilitate understanding of driver behavior in the context of general human abilities and constraints and how the driving domain benefits cognitive architectures by pushing model development toward more complex, realistic tasks. The model can also serve as a core computational engine for practical applications that predict and recognize driver behavior and distraction.
High performance semantic factoring of giga-scale semantic graph databases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
al-Saffar, Sinan; Adolf, Bob; Haglin, David
2010-10-01
As semantic graph database technology grows to address components ranging from extant large triple stores to SPARQL endpoints over SQL-structured relational databases, it will become increasingly important to be able to bring high performance computational resources to bear on their analysis, interpretation, and visualization, especially with respect to their innate semantic structure. Our research group built a novel high performance hybrid system comprising computational capability for semantic graph database processing utilizing the large multithreaded architecture of the Cray XMT platform, conventional clusters, and large data stores. In this paper we describe that architecture, and present the results of our deploying that for the analysis of the Billion Triple dataset with respect to its semantic factors, including basic properties, connected components, namespace interaction, and typed paths.
Advanced cloud fault tolerance system
NASA Astrophysics Data System (ADS)
Sumangali, K.; Benny, Niketa
2017-11-01
Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.
Heavy Lift Vehicle (HLV) Avionics Flight Computing Architecture Study
NASA Technical Reports Server (NTRS)
Hodson, Robert F.; Chen, Yuan; Morgan, Dwayne R.; Butler, A. Marc; Sdhuh, Joseph M.; Petelle, Jennifer K.; Gwaltney, David A.; Coe, Lisa D.; Koelbl, Terry G.; Nguyen, Hai D.
2011-01-01
A NASA multi-Center study team was assembled from LaRC, MSFC, KSC, JSC and WFF to examine potential flight computing architectures for a Heavy Lift Vehicle (HLV) to better understand avionics drivers. The study examined Design Reference Missions (DRMs) and vehicle requirements that could impact the vehicle's avionics. The study considered multiple self-checking and voting architectural variants and examined reliability, fault-tolerance, mass, power, and redundancy management impacts. Furthermore, a goal of the study was to develop the skills and tools needed to rapidly assess additional architectures should requirements or assumptions change.
The Fifth Generation. An annotated bibliography
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bramer, M.; Bramer, D.
The Japanese Fifth Generation Computer System project constitutes a radical reappraisal of the functions which an advanced computer system should be able to perform, the programming languages needed to implement such functions, and the machine architectures suitable for supporting the chosen languages. The book guides the reader through the ever-growing literature on the project, and the international responses, including the United Kingdom Government's Alvey Program and the MCC Program in the United States. Evaluative abstracts are given, including books, journal articles, unpublished reports and material at both overview and technical levels.
The new landscape of parallel computer architecture
NASA Astrophysics Data System (ADS)
Shalf, John
2007-07-01
The past few years have seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
A convergent model for distributed processing of Big Sensor Data in urban engineering networks
NASA Astrophysics Data System (ADS)
Parygin, D. S.; Finogeev, A. G.; Kamaev, V. A.; Finogeev, A. A.; Gnedkova, E. P.; Tyukov, A. P.
2017-01-01
The problems of developing and researching a convergent model of grid, cloud, fog and mobile computing for analytical Big Sensor Data processing are reviewed. The model is meant for building monitoring systems for spatially distributed objects of urban engineering networks and processes. The proposed approach is a convergence model for organizing distributed data processing. The fog computing model is used for the processing and aggregation of sensor data at the network nodes and/or industrial controllers. Program agents are loaded to perform computing tasks for primary processing and data aggregation. The grid and cloud computing models are used for mining and accumulating integral indicators. The computing cluster has a three-tier architecture, which includes the main server at the first level, a cluster of SCADA system servers at the second level, and a large number of GPU video cards supporting the Compute Unified Device Architecture (CUDA) at the third level. The mobile computing model is applied to visualize the results of intelligent analysis with elements of augmented reality and geo-information technologies. The integrated indicators are transferred to the data center for accumulation in multidimensional storage for the purpose of data mining and knowledge acquisition.
Architectures, Models, Algorithms, and Software Tools for Configurable Computing
2000-03-06
...The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for...
Blueprint for a microwave trapped ion quantum computer
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.
2017-01-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course
ERIC Educational Resources Information Center
Arbelaitz, Olatz; Martín, José I.; Muguerza, Javier
2015-01-01
This paper presents an analysis of introducing active methodologies in the Computer Architecture course taught in the second year of the Computer Engineering Bachelor's degree program at the University of the Basque Country (UPV/EHU), Spain. The paper reports the experience from three academic years, 2011-2012, 2012-2013, and 2013-2014, in which…
A Project-Based Learning Approach to Programmable Logic Design and Computer Architecture
ERIC Educational Resources Information Center
Kellett, C. M.
2012-01-01
This paper describes a course in programmable logic design and computer architecture as it is taught at the University of Newcastle, Australia. The course is designed around a major design project and has two supplemental assessment tasks that are also described. The context of the Computer Engineering degree program within which the course is…
ERIC Educational Resources Information Center
Stanley, Timothy D.; Wong, Lap Kei; Prigmore, Daniel; Benson, Justin; Fishler, Nathan; Fife, Leslie; Colton, Don
2007-01-01
Students learn better when they both hear and do. In computer architecture courses "doing" can be difficult in small schools without hardware laboratories hosted by computer engineering, electrical engineering, or similar departments. Software solutions exist. Our success with George Mills' Multimedia Logic (MML) is the focus of this paper. MML…
Programming with process groups: Group and multicast semantics
NASA Technical Reports Server (NTRS)
Birman, Kenneth P.; Cooper, Robert; Gleeson, Barry
1991-01-01
Process groups are a natural tool for distributed programming and are increasingly important in distributed computing environments. Discussed here is a new architecture that arose from an effort to simplify Isis process group semantics. The findings include a refined notion of how the clients of a group should be treated, what the properties of a multicast primitive should be when systems contain large numbers of overlapping groups, and a new construct called the causality domain. A system based on this architecture is now being implemented in collaboration with the Chorus and Mach projects.
Mars Aerocapture Systems Study
NASA Technical Reports Server (NTRS)
Wright, Henry S.; Oh, David Y.; Westhelle, Carlos H.; Fisher, Jody L.; Dyke, R. Eric; Edquist, Karl T.; Brown, James L.; Justh, Hilary L.; Munk, Michelle M.
2006-01-01
Mars Aerocapture Systems Study (MASS) is a detailed study of the application of aerocapture to a large Mars robotic orbiter to assess and identify key technology gaps. The study addressed the use of an Opposition-class return segment in the Mars Sample Return architecture. It addressed mission architecture issues as well as system design. Key trade studies focused on design of the aerocapture aeroshell, spacecraft design and packaging, guidance, navigation and control with simulation, computational fluid dynamics, and thermal protection system sizing. Detailed master equipment lists are included as well as a cursory cost assessment.
Future Approach to tier-0 extension
NASA Astrophysics Data System (ADS)
Jones, B.; McCance, G.; Cordeiro, C.; Giordano, D.; Traylen, S.; Moreno García, D.
2017-10-01
The current tier-0 processing at CERN is done on two managed sites, the CERN computer centre and the Wigner computer centre. With the proliferation of public cloud resources at increasingly competitive prices, we have been investigating how to transparently increase our compute capacity to include these providers. The approach taken has been to integrate these resources using our existing deployment and computer management tools and to provide them in a way that exposes them to users as part of the same site. The paper will describe the architecture, the toolset and the current production experiences of this model.
Modeling Subsurface Reactive Flows Using Leadership-Class Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mills, Richard T; Hammond, Glenn; Lichtner, Peter
2009-01-01
We describe our experiences running PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - on leadership-class supercomputers, including initial experiences running on the petaflop incarnation of Jaguar, the Cray XT5 at the National Center for Computational Sciences at Oak Ridge National Laboratory. PFLOTRAN utilizes fully implicit time-stepping and is built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc). We discuss some of the hurdles to 'at scale' performance with PFLOTRAN and the progress we have made in overcoming them on leadership-class computer architectures.
Study of a unified hardware and software fault-tolerant architecture
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan; Alger, Linda; Friend, Steven; Greeley, Gregory; Sacco, Stephen; Adams, Stuart
1989-01-01
A unified architectural concept, called the Fault Tolerant Processor Attached Processor (FTP-AP), that can tolerate hardware as well as software faults is proposed for applications requiring ultrareliable computation capability. An emulation of the FTP-AP architecture, consisting of a breadboard Motorola 68010-based quadruply redundant Fault Tolerant Processor, four VAX 750s as attached processors, and four versions of a transport aircraft yaw damper control law, is used as a testbed in the AIRLAB to examine a number of critical issues. Solutions of several basic problems associated with N-Version software are proposed and implemented on the testbed. This includes a confidence voter to resolve coincident errors in N-Version software. A reliability model of N-Version software that is based upon the recent understanding of software failure mechanisms is also developed. The basic FTP-AP architectural concept appears suitable for hosting N-Version application software while at the same time tolerating hardware failures. Architectural enhancements for greater efficiency, software reliability modeling, and N-Version issues that merit further research are identified.
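To make the N-Version voting idea concrete, here is a minimal sketch of a plain majority voter over numeric outputs from independently written versions. It is not the confidence voter proposed in the report: the function name `majority_vote`, the `resolution` parameter, and the binning of nearly equal outputs are assumptions made purely for illustration.

```python
from collections import defaultdict


def majority_vote(outputs, resolution=1e-6):
    """Return the agreed value if a strict majority of versions agree, else None.

    Outputs are grouped into bins of width `resolution` so that small numeric
    differences between versions still count as agreement. Values that
    straddle a bin edge may be split; a real voter would need a more careful
    agreement test.
    """
    bins = defaultdict(list)
    for value in outputs:
        bins[round(value / resolution)].append(value)
    # Pick the bin with the most votes.
    winners = max(bins.values(), key=len)
    if 2 * len(winners) > len(outputs):
        # Strict majority: report the mean of the agreeing outputs.
        return sum(winners) / len(winners)
    return None
```

For example, with four versions of a control law producing `[1.00, 1.01, 0.99, 5.0]` and `resolution=0.1`, three outputs fall in the same bin and outvote the outlier; when no strict majority exists the voter returns `None` so the caller can fall back to another recovery strategy.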
The Role of Sketch in Architecture Design
NASA Astrophysics Data System (ADS)
Li, Yanjin; Ning, Wen
2017-06-01
With the continuous development of computer technology, we rely more and more on the computer and pay more and more attention to the final design results, so that we ignore the importance of the sketch. However, the sketch is the most basic and effective way of architecture design. Based on a study of the sketches of the Tjibaou Cultural Center, the paper explores the role of the sketch in architecture design.
SU (2) lattice gauge theory simulations on Fermi GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardoso, Nuno, E-mail: nunocardoso@cftp.ist.utl.p; Bicudo, Pedro, E-mail: bicudo@ist.utl.p
2011-05-10
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200x the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2x slower) than single precision computations.
Hybrid architecture for encoded measurement-based quantum computation
Zwerger, M.; Briegel, H. J.; Dür, W.
2014-01-01
We present a hybrid scheme for quantum computation that combines the modular structure of elementary building blocks used in the circuit model with the advantages of a measurement-based approach to quantum computation. We show how to construct optimal resource states of minimal size to implement elementary building blocks for encoded quantum computation in a measurement-based way, including states for error correction and encoded gates. The performance of the scheme is determined by the quality of the resource states, where within the considered error model a threshold of the order of 10% local noise per particle is found for fault-tolerant quantum computation and quantum communication. PMID:24946906
Solution of partial differential equations on vector and parallel computers
NASA Technical Reports Server (NTRS)
Ortega, J. M.; Voigt, R. G.
1985-01-01
The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
Exploration of operator method digital optical computers for application to NASA
NASA Technical Reports Server (NTRS)
1990-01-01
Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implements. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternatives to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.
Changing (Almost) Everything and Keeping (Almost) Everyone Happy.
ERIC Educational Resources Information Center
Stewart, Craig A.; Grover, Douglas; Vernon, R. David
1998-01-01
In 1994, the information technology organization at Indiana University, Bloomington, undertook a major computing technology conversion that affected 40,000 people. The project is described, and factors contributing to its success are discussed, including system architecture, marketing and customer communications, and migration of information…
NASA Technical Reports Server (NTRS)
Maluf, David A.; Koga, Dennis (Technical Monitor)
2002-01-01
This presentation discusses NASA's proposed NETMARK knowledge management tool, which aims 'to control and interoperate with every block in a document, email, spreadsheet, power point, database, etc. across the lifecycle'. Topics covered include: system software requirements and hardware requirements, seamless information systems, computer architecture issues, and potential benefits to NETMARK users.
Design and Implementation of a Tool for Teaching Programming.
ERIC Educational Resources Information Center
Goktepe, Mesut; And Others
1989-01-01
Discussion of the use of computers in education focuses on a graphics-based system for teaching the Pascal programing language for problem solving. Topics discussed include user interface; notification based systems; communication processes; object oriented programing; workstations; graphics architecture; and flowcharts. (18 references) (LRW)
DOT National Transportation Integrated Search
2002-04-01
The Logical Architecture is based on a Computer Aided Systems Engineering (CASE) model of the requirements for the flow of data and control through the various functions included in Intelligent Transportation Systems (ITS). Data Dictionary is the com...
NASA Astrophysics Data System (ADS)
Menzel, R.; Paynter, D.; Jones, A. L.
2017-12-01
Due to their relatively low computational cost, radiative transfer models in global climate models (GCMs) run on traditional CPU architectures generally consist of shortwave and longwave parameterizations over a small number of wavelength bands. With the rise of newer GPU and MIC architectures, however, the performance of high resolution line-by-line radiative transfer models may soon approach that of the physical parameterizations currently employed in GCMs. Here we present an analysis of the current performance of a new line-by-line radiative transfer model currently under development at GFDL. Although originally designed to specifically exploit GPU architectures through the use of CUDA, the radiative transfer model has recently been extended to include OpenMP in an effort to also effectively target MIC architectures such as Intel's Xeon Phi. Using input data provided by the upcoming Radiative Forcing Model Intercomparison Project (RFMIP, as part of CMIP 6), we compare model results and performance data for various model configurations and spectral resolutions run on both GPU and Intel Knights Landing architectures to analogous runs of the standard Oxford Reference Forward Model on traditional CPUs.
IPython: components for interactive and parallel computing across disciplines. (Invited)
NASA Astrophysics Data System (ADS)
Perez, F.; Bussonnier, M.; Frederic, J. D.; Froehle, B. M.; Granger, B. E.; Ivanov, P.; Kluyver, T.; Patterson, E.; Ragan-Kelley, B.; Sailer, Z.
2013-12-01
Scientific computing is an inherently exploratory activity that requires constantly cycling between code, data and results, each time adjusting the computations as new insights and questions arise. To support such a workflow, good interactive environments are critical. The IPython project (http://ipython.org) provides a rich architecture for interactive computing with: 1. Terminal-based and graphical interactive consoles. 2. A web-based Notebook system with support for code, text, mathematical expressions, inline plots and other rich media. 3. Easy to use, high performance tools for parallel computing. Despite its roots in Python, the IPython architecture is designed in a language-agnostic way to facilitate interactive computing in any language. This allows users to mix Python with Julia, R, Octave, Ruby, Perl, Bash and more, as well as to develop native clients in other languages that reuse the IPython clients. In this talk, I will show how IPython supports all stages in the lifecycle of a scientific idea: 1. Individual exploration. 2. Collaborative development. 3. Production runs with parallel resources. 4. Publication. 5. Education. In particular, the IPython Notebook provides an environment for "literate computing" with a tight integration of narrative and computation (including parallel computing). These Notebooks are stored in a JSON-based document format that provides an "executable paper": notebooks can be version controlled, exported to HTML or PDF for publication, and used for teaching.
Quantum Devices Bonded Beneath a Superconducting Shield: Part 2
NASA Astrophysics Data System (ADS)
McRae, Corey Rae; Abdallah, Adel; Bejanin, Jeremy; Earnest, Carolyn; McConkey, Thomas; Pagel, Zachary; Mariantoni, Matteo
The next-generation quantum computer will rely on physical quantum bits (qubits) organized into arrays to form error-robust logical qubits. In the superconducting quantum circuit implementation, this architecture will require the use of larger and larger chip sizes. In order for on-chip superconducting quantum computers to be scalable, various issues found in large chips must be addressed, including the suppression of box modes (due to the sample holder) and the suppression of slot modes (due to fractured ground planes). By bonding a metallized shield layer over a superconducting circuit using thin-film indium as a bonding agent, we have demonstrated proof of concept of an extensible circuit architecture that holds the key to the suppression of spurious modes. Microwave characterization of shielded transmission lines and measurement of superconducting resonators were compared to identical unshielded devices. The elimination of box modes was investigated, as well as bond characteristics including bond homogeneity and the presence of a superconducting connection.
New Developments in Modeling MHD Systems on High Performance Computing Architectures
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D. J.; Bhattacharjee, A.
2009-04-01
Modeling the wide range of time and length scales present even in fluid models of plasmas like MHD and X-MHD (Extended MHD including two fluid effects like Hall term, electron inertia, electron pressure gradient) is challenging even on state-of-the-art supercomputers. In recent years, HPC capacity has continued to grow exponentially, but at the expense of making the computer systems more and more difficult to program in order to get maximum performance. In this paper, we will present a new approach to managing the complexity caused by the need to write efficient codes: separating the numerical description of the problem, in our case a discretized right hand side (r.h.s.), from the actual implementation of efficiently evaluating it. An automatic code generator is used to describe the r.h.s. in a quasi-symbolic form while leaving the translation into efficient and parallelized code to a computer program itself. We implemented this approach for OpenGGCM (Open General Geospace Circulation Model), a model of the Earth's magnetosphere, which was accelerated by a factor of three on regular x86 architecture and a factor of 25 on the Cell BE architecture (commonly known for its deployment in Sony's PlayStation 3).
EON: software for long time simulations of atomic scale systems
NASA Astrophysics Data System (ADS)
Chill, Samuel T.; Welborn, Matthew; Terrell, Rye; Zhang, Liang; Berthet, Jean-Claude; Pedersen, Andreas; Jónsson, Hannes; Henkelman, Graeme
2014-07-01
The EON software is designed for simulations of the state-to-state evolution of atomic scale systems over timescales greatly exceeding that of direct classical dynamics. States are defined as collections of atomic configurations from which a minimization of the potential energy gives the same inherent structure. The time evolution is assumed to be governed by rare events, where transitions between states are uncorrelated and infrequent compared with the timescale of atomic vibrations. Several methods for calculating the state-to-state evolution have been implemented in EON, including parallel replica dynamics, hyperdynamics and adaptive kinetic Monte Carlo. Global optimization methods, including simulated annealing, basin hopping and minima hopping are also implemented. The software has a client/server architecture where the computationally intensive evaluations of the interatomic interactions are calculated on the client-side and the state-to-state evolution is managed by the server. The client supports optimization for different computer architectures to maximize computational efficiency. The server is written in Python so that developers have access to the high-level functionality without delving into the computationally intensive components. Communication between the server and clients is abstracted so that calculations can be deployed on a single machine, clusters using a queuing system, large parallel computers using a message passing interface, or within a distributed computing environment. A generic interface to the evaluation of the interatomic interactions is defined so that empirical potentials, such as in LAMMPS, and density functional theory as implemented in VASP and GPAW can be used interchangeably. Examples are given to demonstrate the range of systems that can be modeled, including surface diffusion and island ripening of adsorbed atoms on metal surfaces, molecular diffusion on the surface of ice and global structural optimization of nanoparticles.
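The kinetic Monte Carlo idea mentioned in the abstract above can be illustrated with a minimal sketch of one rejection-free step. This is a generic textbook step, not EON's implementation; the function name `kmc_step` and the list-of-rates input are assumptions made for illustration only.

```python
import math
import random


def kmc_step(rates, rng):
    """One rejection-free kinetic Monte Carlo step (illustrative sketch).

    rates: non-negative rate constants for the escape events from the
           current state (hypothetical input, not EON's data model).
    rng:   a random.Random instance.
    Returns (index of the chosen event, elapsed time for this step).
    """
    total = sum(rates)
    if total <= 0.0:
        raise ValueError("at least one event must have a positive rate")

    # Pick event i with probability rates[i] / total.
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against floating-point round-off
    for i, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            chosen = i
            break

    # The waiting time is exponentially distributed with mean 1 / total.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt
```

Because transitions are assumed uncorrelated, each step needs only the current list of rates; in an adaptive scheme that list would be rebuilt from saddle-point searches whenever a new state is reached.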
Layered Architectures for Quantum Computers and Quantum Repeaters
NASA Astrophysics Data System (ADS)
Jones, Nathan C.
This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.
Neural Simulations on Multi-Core Architectures
Eichner, Hubert; Klug, Tobias; Borst, Alexander
2009-01-01
Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing. PMID:19636393
Advanced flight computer. Special study
NASA Technical Reports Server (NTRS)
Coo, Dennis
1995-01-01
This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets the AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.
Advanced information processing system for advanced launch system: Avionics architecture synthesis
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.
1991-01-01
The Advanced Information Processing System (AIPS) is a fault-tolerant distributed computer system architecture that was developed to meet the real time computational needs of advanced aerospace vehicles. One such vehicle is the Advanced Launch System (ALS) being developed jointly by NASA and the Department of Defense to launch heavy payloads into low earth orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An avionics architecture that utilizes the AIPS hardware and software building blocks was synthesized for ALS. The AIPS for ALS architecture synthesis process starting with the ALS mission requirements and ending with an analysis of the candidate ALS avionics architecture is described.
Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Duong, Vu A.
2012-01-01
A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, the bio-inspired models of visual pathway and olfactory receptor processing are combined as processing components, to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual data and odorant.
Reference Architecture Model Enabling Standards Interoperability.
Blobel, Bernd
2017-01-01
Advanced health and social services paradigms are supported by a comprehensive set of domains managed by different scientific disciplines. Interoperability has to evolve beyond information and communication technology (ICT) concerns, including the real world business domains and their processes, but also the individual context of all actors involved. So, the system must properly reflect the environment in front and around the computer as essential and even defining part of the health system. This paper introduces an ICT-independent system-theoretical, ontology-driven reference architecture model allowing the representation and harmonization of all domains involved including the transformation into an appropriate ICT design and implementation. The entire process is completely formalized and can therefore be fully automated.
Computational structures for robotic computations
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chang, P. R.
1987-01-01
The computational problem of inverse kinematics and inverse dynamics of robot manipulators by taking advantage of parallelism and pipelining architectures is discussed. For the computation of inverse kinematic position solution, a maximum pipelined CORDIC architecture has been designed based on a functional decomposition of the closed-form joint equations. For the inverse dynamics computation, an efficient p-fold parallel algorithm to overcome the recurrence problem of the Newton-Euler equations of motion to achieve the time lower bound of O(log_2 n) has also been developed.
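For readers unfamiliar with CORDIC, the following is a minimal floating-point sketch of the basic rotation-mode iteration that computes sine and cosine. It is not the pipelined architecture described in the report; the function name `cordic_sin_cos` and the iteration count are assumptions. In hardware each iteration reduces to shifts and adds, which is why the method maps naturally onto a pipeline with one stage per iteration.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC sketch; valid for |theta| <= pi/2.

    Returns (sin(theta), cos(theta)) approximations.
    """
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector; pre-divide by the total gain.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0  # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x
```

A quick check is to compare `cordic_sin_cos(0.5)` against `math.sin(0.5)` and `math.cos(0.5)`.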
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Intelligent Agent Architectures: Reactive Planning Testbed
NASA Technical Reports Server (NTRS)
Rosenschein, Stanley J.; Kahn, Philip
1993-01-01
An Integrated Agent Architecture (IAA) is a framework or paradigm for constructing intelligent agents. Intelligent agents are collections of sensors, computers, and effectors that interact with their environments in real time in goal-directed ways. Because of the complexity involved in designing intelligent agents, it has been found useful to approach the construction of agents with some organizing principle, theory, or paradigm that gives shape to the agent's components and structures their relationships. Given the wide variety of approaches being taken in the field, the question naturally arises: Is there a way to compare and evaluate these approaches? The purpose of the present work is to develop common benchmark tasks and evaluation metrics to which intelligent agents, including complex robotic agents, constructed using various architectural approaches can be subjected.
Scaling deep learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gawande, Nitin A.; Landwehr, Joshua B.; Daily, Jeffrey A.
Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors --- including NVIDIA, Intel, AMD, and IBM --- have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating large DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. This paper provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path or Cray Aries. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, AlexNet, GoogLeNet, and ResNet50 topologies using the Cifar10 and ImageNet datasets. The workloads are vendor-optimized for each architecture. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and the KNL can be competitive in performance/watt. We find that NVLink facilitates scaling efficiency on GPUs. However, its importance is heavily dependent on neural network architecture. Furthermore, for weak-scaling --- sometimes encouraged by restricted GPU memory --- NVLink is less important.
Impact of plant shoot architecture on leaf cooling: a coupled heat and mass transfer model
Bridge, L. J.; Franklin, K. A.; Homer, M. E.
2013-01-01
Plants display a range of striking architectural adaptations when grown at elevated temperatures. In the model plant Arabidopsis thaliana, these include elongation of petioles, and increased petiole and leaf angles from the soil surface. The potential physiological significance of these architectural changes remains speculative. We address this issue computationally by formulating a mathematical model and performing numerical simulations, testing the hypothesis that elongated and elevated plant configurations may reflect a leaf-cooling strategy. This sets in place a new basic model of plant water use and interaction with the surrounding air, which couples heat and mass transfer within a plant to water vapour diffusion in the air, using a transpiration term that depends on saturation, temperature and vapour concentration. A two-dimensional, multi-petiole shoot geometry is considered, with added leaf-blade shape detail. Our simulations show that increased petiole length and angle generally result in enhanced transpiration rates and reduced leaf temperatures in well-watered conditions. Furthermore, our computations also reveal plant configurations for which elongation may result in decreased transpiration rate owing to decreased leaf liquid saturation. We offer further qualitative and quantitative insights into the role of architectural parameters as key determinants of leaf-cooling capacity. PMID:23720538
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.
2003-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT).
Solving the Cauchy-Riemann equations on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations, a set of coupled first-order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit-serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, M.R.
1991-02-01
In recent years the NASA Langley Research Center has funded several contractors to conduct conceptual designs defining architectures for fault tolerant computer systems. Such a system is referred to as a Multi-Path Redundant Avionics Suite (MPRAS), and would form the basis for avionics systems that would be used in future families of space vehicles in a variety of missions. The principal contractors were General Dynamics, Boeing, and Draper Laboratories. These contractors participated in a series of review meetings, and submitted final reports defining their candidate architectures. NASA then commissioned the Research Triangle Institute (RTI) to perform an assessment of these architectures to identify strengths and weaknesses of each. This report is a separate, independent review of the RTI assessment, done primarily to assure that the assessment was comprehensive and objective. The report also includes general recommendations relative to further MPRAS development.
Wood, Dennis Patrick; Wiederhold, Brenda K; Spira, James
2010-02-01
Virtual-reality (VR) therapy has been distinguished from other psychotherapy interventions through the use of computer-assisted interventions that rely on the concepts of "immersion," "presence," and "synchrony." In this work, these concepts are defined, and their uses, within the VR treatment architecture, are discussed. VR therapy's emphasis on the incorporation of biofeedback and meditation, as a component of the VR treatment architecture, is also reviewed. A growing body of research has documented VR therapy as a successful treatment for combat-related Posttraumatic Stress Disorder (PTSD). The VR treatment architecture, utilized to treat 30 warriors diagnosed with combat-related PTSD, is summarized. Lastly, case summaries of two warriors successfully treated with VR therapy are included to assist with the goal of better understanding a VR treatment architecture paradigm. Continued validation of the VR treatment model is encouraged.
MindModeling@Home . . . and Anywhere Else You Have Idle Processors
2009-12-01
was SETI@Home. It was established in 1999 for the purpose of demonstrating the utility of “distributed grid computing” by providing a mechanism for...the public imagination, and SETI@Home remains the longest running and one of the most popular volunteer computing projects in the world. This...pursuits. Most of them, including SETI@Home, run on a software architecture called the Berkeley Open Infrastructure for Network Computing (BOINC). Some of
Manifold parametrization of the left ventricle for a statistical modelling of its complete anatomy
NASA Astrophysics Data System (ADS)
Gil, D.; Garcia-Barnes, J.; Hernández-Sabate, A.; Marti, E.
2010-03-01
Distortion of Left Ventricle (LV) external anatomy is related to some dysfunctions, such as hypertrophy. The architecture of myocardial fibers determines LV electromechanical activation patterns as well as mechanics. Thus, their joint modelling would allow the design of specific interventions (such as pacemaker implantation and LV remodelling) and therapies (such as resynchronization). On one hand, accurate modelling of external anatomy requires either a dense sampling or a continuous infinite dimensional approach, which requires non-Euclidean statistics. On the other hand, computation of fiber models requires statistics on Riemannian spaces. Most approaches compute separate statistical models for external anatomy and fiber architecture. In this work we propose a general mathematical framework based on differential geometry concepts for computing a statistical model including both external and fiber anatomy. Our framework provides a continuous approach to external anatomy supporting standard statistics. We also provide a straightforward formula for the computation of the Riemannian fiber statistics. We have applied our methodology to the computation of a complete anatomical atlas of canine hearts from diffusion tensor studies. The orientation of fibers over the average external geometry agrees with the segmental description of orientations reported in the literature.
AHaH Computing–From Metastable Switches to Attractors to Machine Learning
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures–all key capabilities of biological nervous systems and modern machine learning algorithms with real world application. PMID:24520315
Space and Ground Trades for Human Exploration and Wearable Computing
NASA Technical Reports Server (NTRS)
Lupisella, Mark; Donohue, John; Mandl, Dan; Ly, Vuong; Graves, Corey; Heimerdinger, Dan; Studor, George; Saiz, John; DeLaune, Paul; Clancey, William
2006-01-01
Human exploration of the Moon and Mars will present unique trade study challenges as ground system elements shift to planetary bodies and perhaps eventually to the bodies of human explorers in the form of wearable computing technologies. This presentation will highlight some of the key space and ground trade issues that will face the Exploration Initiative as NASA begins designing systems for the sustained human exploration of the Moon and Mars, with an emphasis on wearable computing. We will present some preliminary test results and scenarios that demonstrate how wearable computing might affect the trade space noted below. We will first present some background on wearable computing and its utility to NASA's Exploration Initiative. Next, we will discuss three broad architectural themes, some key ground and space trade issues within those themes and how they relate to wearable computing. Lastly, we will present some preliminary test results and suggest guidance for proceeding in the assessment and creation of a value-added role for wearable computing in the Exploration Initiative. The three broad ground-space architectural trade themes we will discuss are: 1. Functional Shift and Distribution: To what extent, if any, should traditional ground system functionality be shifted to, and distributed among, the Earth, Moon/Mars, and the human explorer? 2. Situational Awareness and Autonomy: How much situational awareness (e.g. environmental conditions, biometrics, etc.) and autonomy is required and desired, and where should these capabilities reside? 3. Functional Redundancy: What functions (e.g. command, control, analysis) should exist simultaneously on Earth, the Moon/Mars, and the human explorer? These three themes can serve as the axes of a three-dimensional trade space, within which architectural solutions reside.
We will show how wearable computers can fit into this trade space and what the possible implications could be for the rest of the ground and space architecture(s). We intend this to be an example of explorer-centric thinking in a fully integrated explorer paradigm, where integrated explorer refers to a human explorer having instant access to all relevant data, knowledge of the environment, science models, health and safety-related events, and other tools and information via wearable computing technologies. The trade study approach will include involvement from the relevant stakeholders (Constellation Systems, CCCI, EVA Project Office, Astronaut office, Mission Operations, Space Life Sciences, etc.) to develop operations concepts (and/or operations scenarios) from which a basic high-level set of requirements could be extracted. This set of requirements could serve as a foundation (along with stakeholder buy-in) that would help define the trade space and assist in identifying candidate technologies for further study and evolution to higher-level technology readiness levels.
A language comparison for scientific computing on MIMD architectures
NASA Technical Reports Server (NTRS)
Jones, Mark T.; Patrick, Merrell L.; Voigt, Robert G.
1989-01-01
Choleski's method for solving banded symmetric, positive definite systems is implemented on a multiprocessor computer using three FORTRAN based parallel programming languages, the Force, PISCES and Concurrent FORTRAN. The capabilities of the languages for expressing parallelism and their user friendliness are discussed, including readability of the code, debugging assistance offered, and expressiveness of the languages. The performance of the different implementations is compared. It is argued that PISCES, using the Force for medium-grained parallelism, is the appropriate choice for programming Choleski's method on the multiprocessor computer, Flex/32.
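As a point of reference for the algorithm being compared, here is a minimal serial sketch of Choleski factorization that skips entries outside the band. It is written in Python rather than the FORTRAN-based languages of the report, uses full matrix storage for readability, and the name `banded_cholesky` and its `bandwidth` parameter (the half-bandwidth) are assumptions for illustration.

```python
import math


def banded_cholesky(a, bandwidth):
    """Return lower-triangular L with a = L * L^T (illustrative sketch).

    a:         symmetric positive definite matrix as a list of lists;
               entries with |i - j| > bandwidth are assumed to be zero.
    bandwidth: half-bandwidth of a (1 for a tridiagonal matrix).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Only columns inside the band contribute to the diagonal entry.
        lo = max(0, j - bandwidth)
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(lo, j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)

        # Rows below the band in column j stay zero, so stop at j + bandwidth.
        for i in range(j + 1, min(n, j + bandwidth + 1)):
            lo_i = max(0, i - bandwidth)
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(lo_i, j))
            L[i][j] = s / L[j][j]
    return L
```

The inner loop over `i` for a fixed column `j` has no dependencies between rows, which is the kind of medium-grained parallelism a multiprocessor implementation could exploit.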
Application of technology developed for flight simulation at NASA Langley Research Center
NASA Technical Reports Server (NTRS)
Cleveland, Jeff I., II
1991-01-01
In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations including mathematical model computation and data input/output to the simulators must be deterministic and be completed in as short a time as possible. Personnel at NASA's Langley Research Center are currently developing the use of supercomputers for simulation mathematical model computation for real-time simulation. This, coupled with the use of an open systems software architecture, will advance the state-of-the-art in real-time flight simulation.
NASA Technical Reports Server (NTRS)
Jones, J. R.; Bodenheimer, R. E.
1976-01-01
A simple programmable Tse processor organization and arithmetic operations necessary for extraction of the desired topological information are described. Hardware additions to this organization are discussed along with trade-offs peculiar to the Tse computing concept. An improved organization is presented along with the complementary software for the various arithmetic operations. The performance of the two organizations is compared in terms of speed, power, and cost. Software routines developed to extract the desired information from an image are included.
Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gawande, Nitin A.; Landwehr, Joshua B.; Daily, Jeffrey A.
Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors, including NVIDIA, Intel, AMD and IBM, have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. This paper provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, CaffeNet, AlexNet and GoogleNet topologies using the Cifar10 and ImageNet datasets. The workloads are vendor optimized for each architecture. GPUs provide the highest overall raw performance. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and KNL can be competitive when considering performance/watt. Furthermore, NVLink is critical to GPU scaling.
Comparison of Classifier Architectures for Online Neural Spike Sorting.
Saeed, Maryam; Khan, Amir Ali; Kamboh, Awais Mehmood
2017-04-01
High-density, intracranial recordings from micro-electrode arrays need to undergo spike sorting in order to associate the recorded neuronal spikes with particular neurons. This involves spike detection, feature extraction, and classification. To reduce the data transmission and power requirements, on-chip real-time processing is becoming very popular. However, high computational resources are required for classifiers in on-chip spike-sorters, making scalability a great challenge. In this review paper, we analyze several popular classifiers to propose five new hardware architectures using the off-chip training with on-chip classification approach. These include support vector classification, fuzzy C-means classification, self-organizing maps classification, moving-centroid K-means classification, and cosine distance classification. The performance of these architectures is analyzed in terms of accuracy and resource requirement. We establish that the neural-network-based Self-Organizing Maps classifier offers the most viable solution. A spike sorter based on the Self-Organizing Maps classifier requires only 7.83% of the computational resources of the best-reported spike sorter, hierarchical adaptive means, while offering 3% better accuracy at 7 dB SNR.
A single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder
NASA Technical Reports Server (NTRS)
Hsu, I. S.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.
1986-01-01
A description of a single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder is presented. The architecture that leads to this single VLSI chip design makes use of the dual basis multiplication algorithm. The same architecture can be applied to design VLSI chips to compute various kinds of number theoretic transforms.
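As a rough illustration of what "computing syndromes" means for a (255, 223) code, the sketch below evaluates the received polynomial at 32 consecutive powers of a primitive element of GF(2^8). It is not the chip's dual-basis architecture: the field polynomial 0x11D, the roots alpha^1 to alpha^32, the conventional basis, and all function names are assumptions chosen for illustration.

```python
# Sketch of Reed-Solomon syndrome computation over GF(2^8).
# Assumptions (not taken from the abstract): primitive polynomial 0x11D,
# generator roots alpha^1 .. alpha^32, conventional (not dual) basis.

PRIM_POLY = 0x11D
N, K = 255, 223
NUM_SYNDROMES = N - K  # 32 syndromes for the (255, 223) code

# Exponential and logarithm tables for GF(2^8); EXP is doubled in length
# so that LOG[a] + LOG[b] can be used as an index without a modulo.
EXP = [0] * 510
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= PRIM_POLY
for power in range(255, 510):
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received):
    """Return [S_1, ..., S_32] with S_j = r(alpha^j), via Horner's rule.

    received[0] is the coefficient of the highest power of x.
    """
    result = []
    for j in range(1, NUM_SYNDROMES + 1):
        root = EXP[j]
        acc = 0
        for symbol in received:
            acc = gf_mul(acc, root) ^ symbol
        result.append(acc)
    return result
```

With these conventions an error-free codeword yields all-zero syndromes, and a single symbol error of value e at the coefficient of x^p yields S_j = e * alpha^(j*p), which is the information a decoder goes on to use.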
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientific study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
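For readers unfamiliar with the kernel, a minimal serial SpMV sketch follows. It assumes the common compressed sparse row (CSR) layout, which the abstract does not name, and the function and argument names are illustrative only. The indirect access `x[col_idx[k]]` is the kind of irregular memory traffic that makes the kernel memory-bound.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        total = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            total += values[k] * x[col_idx[k]]  # indirect, irregular load
        y[row] = total
    return y


# Example: A = [[4, 0, 1], [0, 3, 0], [2, 0, 5]], x = [1, 2, 3]
y = spmv_csr([4.0, 1.0, 3.0, 2.0, 5.0], [0, 2, 1, 0, 2], [0, 2, 3, 5],
             [1.0, 2.0, 3.0])
```

Rows are independent, so a parallel version can assign disjoint row ranges to different cores; balancing nonzeros per core is then the main concern.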
Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Biswas, Rupak
1999-01-01
The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.
KeyWare: an open wireless distributed computing environment
NASA Astrophysics Data System (ADS)
Shpantzer, Isaac; Schoenfeld, Larry; Grindahl, Merv; Kelman, Vladimir
1995-12-01
Deployment of distributed applications in the wireless domain lacks the equivalent tools, methodologies, architectures, and network management that exist in LAN-based applications. A wireless distributed computing environment (KeyWare™) based on intelligent agents within a multiple client multiple server scheme was developed to resolve this problem. KeyWare renders concurrent application services to wireline and wireless client nodes encapsulated in multiple paradigms such as message delivery, database access, e-mail, and file transfer. These services and paradigms are optimized to cope with temporal and spatial radio coverage, high latency, limited throughput and transmission costs. A unified network management paradigm for both wireless and wireline facilitates seamless extensions of LAN-based management tools to include wireless nodes. A set of object oriented tools and methodologies enables direct asynchronous invocation of agent-based services supplemented by tool-sets matched to supported KeyWare paradigms. The open architecture embodiment of KeyWare enables a wide selection of client node computing platforms, operating systems, transport protocols, radio modems and infrastructures while maintaining application portability.
Digital quantum simulators in a scalable architecture of hybrid spin-photon qubits
Chiesa, Alessandro; Santini, Paolo; Gerace, Dario; Raftery, James; Houck, Andrew A.; Carretta, Stefano
2015-01-01
Resolving quantum many-body problems represents one of the greatest challenges in physics and physical chemistry, due to the prohibitively large computational resources that would be required by using classical computers. A solution has been foreseen by directly simulating the time evolution through sequences of quantum gates applied to arrays of qubits, i.e. by implementing a digital quantum simulator. Superconducting circuits and resonators are emerging as an extremely promising platform for quantum computation architectures, but a digital quantum simulator proposal that is straightforwardly scalable, universal, and realizable with state-of-the-art technology is presently lacking. Here we propose a viable scheme to implement a universal quantum simulator with hybrid spin-photon qubits in an array of superconducting resonators, which is intrinsically scalable and allows for local control. As representative examples we consider the transverse-field Ising model, a spin-1 Hamiltonian, and the two-dimensional Hubbard model and we numerically simulate the scheme by including the main sources of decoherence. PMID:26563516
Test and Evaluation of Architecture-Aware Compiler Environment
2011-11-01
biology, medicine, social sciences, and security applications. Challenges include extremely large graphs (the Facebook friend network has over...Operations with Temporal Binning ....................................................................... 32 4.12 Memory behavior and Energy per...five challenge problems empirically, exploring their scaling properties, computation and datatype needs, memory behavior, and temporal behavior
User interface issues in supporting human-computer integrated scheduling
NASA Technical Reports Server (NTRS)
Cooper, Lynne P.; Biefeld, Eric W.
1991-01-01
The topics are presented in viewgraph form and include the following: characteristics of the Operations Mission Planner (OMP) schedule domain; OMP architecture; definition of a schedule; user interface dimensions; functional distribution; types of users; interpreting user interaction; dynamic overlays; reactive scheduling; and transitioning the interface.
Evidence of common and separate eye and hand accumulators underlying flexible eye-hand coordination
Jana, Sumitash; Gopal, Atul
2016-01-01
Eye and hand movements are initiated by anatomically separate regions in the brain, and yet these movements can be flexibly coupled and decoupled, depending on the need. The computational architecture that enables this flexible coupling of independent effectors is not understood. Here, we studied the computational architecture that enables flexible eye-hand coordination using a drift diffusion framework, which predicts that the variability of the reaction time (RT) distribution scales with its mean. We show that a common stochastic accumulator to threshold, followed by a noisy effector-dependent delay, explains eye-hand RT distributions and their correlation in a visual search task that required decision-making, while an interactive eye and hand accumulator model did not. In contrast, in an eye-hand dual task, an interactive model better predicted the observed correlations and RT distributions than a common accumulator model. Notably, these two models could only be distinguished on the basis of the variability and not the means of the predicted RT distributions. Additionally, signatures of separate initiation signals were also observed in a small fraction of trials in the visual search task, implying that these distinct computational architectures were not a manifestation of the task design per se. Taken together, our results suggest two unique computational architectures for eye-hand coordination, with task context biasing the brain toward instantiating one of the two architectures. NEW & NOTEWORTHY Previous studies on eye-hand coordination have considered mainly the means of eye and hand reaction time (RT) distributions. Here, we leverage the approximately linear relationship between the mean and standard deviation of RT distributions, as predicted by the drift-diffusion model, to propose the existence of two distinct computational architectures underlying coordinated eye-hand movements. 
These architectures, for the first time, provide a computational basis for the flexible coupling between eye and hand movements. PMID:27784809
NASA Astrophysics Data System (ADS)
Hassanzadeh, Iraj; Janabi-Sharifi, Farrokh
2005-12-01
In this paper, a new open architecture for visual servo control tasks is illustrated. A Puma 560 robotic manipulator is used to prove the concept. This design enables hybrid force/visual servo control in an unstructured environment in different modes. It can also be controlled through the Internet in teleoperation mode using a haptic device. Our proposed structure includes two major parts, hardware and software. In terms of hardware, it consists of a master (host) computer, a slave (target) computer, a Puma 560 manipulator, a CCD camera, a force sensor and a haptic device. There are five DAQ cards interfacing the Puma 560 and the slave computer. An open architecture package is developed using Matlab (R), Simulink (R) and the xPC Target toolbox. This package has the Hardware-In-the-Loop (HIL) property, i.e., it enables one to readily implement different configurations of force, visual or hybrid control in real time. The implementation includes the following stages. First, retrofitting of the Puma was carried out. Then a modular joint controller for the Puma 560 was realized using Simulink (R). The force sensor driver and force control implementation were written using S-function blocks of Simulink (R). Visual images were captured through the Image Acquisition Toolbox of Matlab (R) and processed using the Image Processing Toolbox. A haptic device interface was also written in Simulink (R). Thus, this setup can be readily reconfigured to accommodate any other robotic manipulator and/or other sensors without the trouble of the external issues relevant to the control, interface and software, while providing flexibility in component modification.
A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.
Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan
2017-11-01
The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
Federated data storage system prototype for LHC experiments and data intensive science
NASA Astrophysics Data System (ADS)
Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.
2017-10-01
The rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and universities' clusters scattered over a large area aim at the task of uniting their resources for future productive work, at the same time giving an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kinds of operations such as read/write/transfer and access via WAN from Grid centres, university clusters, supercomputers, academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and reformations of computing style, for instance how a bioinformatics program running on supercomputers can read/write data from the federated storage.
Computer graphics in architecture and engineering
NASA Technical Reports Server (NTRS)
Greenberg, D. P.
1975-01-01
The present status of the application of computer graphics to the building profession or architecture and its relationship to other scientific and technical areas were discussed. It was explained that, due to the fragmented nature of architecture and building activities (in contrast to the aerospace industry), a comprehensive, economic utilization of computer graphics in this area is not practical and its true potential cannot now be realized due to the present inability of architects and structural, mechanical, and site engineers to rely on a common data base. Future emphasis will therefore have to be placed on a vertical integration of the construction process and effective use of a three-dimensional data base, rather than on waiting for any technological breakthrough in interactive computing.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Larson, Robert E.
1989-01-01
The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
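To make the appeal of chordal rings concrete, the sketch below builds a ring of processors with optional chords and measures the network diameter (worst-case hop count) by breadth-first search. The symmetric variant used here, in which every node i links to i±1 and i±chord, is an assumption for illustration; the report may use a different chordal ring definition, and the function names are hypothetical.

```python
from collections import deque


def chordal_ring(num_nodes, chord=None):
    """Adjacency sets for a ring with links i±1 and, if given, i±chord."""
    adj = {i: set() for i in range(num_nodes)}
    offsets = [1] if chord is None else [1, chord]
    for i in range(num_nodes):
        for off in offsets:
            j = (i + off) % num_nodes
            adj[i].add(j)
            adj[j].add(i)
    return adj


def diameter(adj):
    """Largest shortest-path length over all node pairs (BFS from each node)."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


plain = diameter(chordal_ring(12))           # simple 12-node ring
chorded = diameter(chordal_ring(12, chord=4))  # same ring with chords of length 4
```

In this 12-node example the chords halve the diameter (from 6 hops to 3) at the cost of two extra links per node, which is the kind of trade-off such a design study weighs.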
Fault tolerant architectures for integrated aircraft electronics systems
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1983-01-01
Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.
HyperForest: A high performance multi-processor architecture for real-time intelligent systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.
1997-04-01
Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated with a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event-address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Topical perspective on massive threading and parallelism.
Farber, Robert M
2011-09-01
Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive-threading and massive-parallelism such as the 1.3 million threads Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world--is how to capitalize on these new massively threaded computational architectures, especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism with some insight into the future. Published by Elsevier Inc.
Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
A high performance parallel computing architecture for robust image features
NASA Astrophysics Data System (ADS)
Zhou, Renyan; Liu, Leibo; Wei, Shaojun
2014-03-01
A parallel architecture for image feature detection and description is proposed in this article. The major component of this architecture is a 2D cellular network composed of simple reprogrammable processors, enabling the Hessian Blob Detector and Haar Response Calculation, which are the most computing-intensive stages of the Speeded Up Robust Features (SURF) algorithm. Combining this 2D cellular network and dedicated hardware for SURF descriptors, this architecture achieves real-time image feature detection with minimal software in the host processor. A prototype FPGA implementation of the proposed architecture achieves 1318.9 GOPS general pixel processing at a 100 MHz clock and achieves up to 118 fps in VGA (640 × 480) image feature detection. The proposed architecture is stand-alone and scalable, so it is easy to migrate to a VLSI implementation.
The RISC (Reduced Instruction Set Computer) Architecture and Computer Performance Evaluation.
1986-03-01
time where the main emphasis of the evaluation process is put on the software. The model is intended to provide a tool for computer architects to use...program, or 3) Was to be implemented in random logic more effectively than the equivalent sequence of software instructions. Both data and address...definition is the IEEE standard 729-1983 stating Computer Architecture as: "The process of defining a collection of hardware and software components and
First 3 years of operation of RIACS (Research Institute for Advanced Computer Science) (1983-1985)
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
The focus of the Research Institute for Advanced Computer Science (RIACS) is to explore matches between advanced computing architectures and the processes of scientific research. An architecture evaluation of the MIT static dataflow machine, specification of a graphical language for expressing distributed computations, and specification of an expert system for aiding in grid generation for two-dimensional flow problems were initiated. Research projects for 1984 and 1985 are summarized.
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2018-01-19
A main goal in DNA computing is to build DNA circuits to compute designated functions using a minimal number of DNA strands. Here, we propose a novel architecture to build compact DNA strand displacement circuits to compute a broad scope of functions in an analog fashion. A circuit by this architecture is composed of three autocatalytic amplifiers, and the amplifiers interact to perform computation. We show DNA circuits to compute functions sqrt(x), ln(x) and exp(x) for x in tunable ranges with simulation results. A key innovation in our architecture, inspired by Napier's use of logarithm transforms to compute square roots on a slide rule, is to make use of autocatalytic amplifiers to do logarithmic and exponential transforms in concentration and time. In particular, we convert from the input that is encoded by the initial concentration of the input DNA strand, to time, and then back again to the output encoded by the concentration of the output DNA strand at equilibrium. This combined use of strand-concentration and time encoding of computational values may have impact on other forms of molecular computation.
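The slide-rule idea mentioned above rests on a simple identity, written out here for clarity. The second relation is an idealized model added for illustration and is not taken from the paper: it assumes an amplifier whose output grows exponentially at rate k from initial concentration c_0 and is read out when it crosses a threshold theta, which is one way a concentration could be turned into a logarithmically scaled time.

```latex
% Square root via logarithmic and exponential transforms
\sqrt{x} \;=\; \exp\!\left(\tfrac{1}{2}\,\ln x\right), \qquad x > 0 .

% Idealized exponential amplifier (assumption): c(t) = c_0 e^{k t}.
% The time t^{*} at which c reaches a threshold \theta is logarithmic in c_0:
c(t^{*}) = \theta
\quad\Longrightarrow\quad
t^{*} \;=\; \frac{1}{k}\,\ln\!\frac{\theta}{c_0} .
```

Under that assumption, scaling a time interval corresponds to scaling a logarithm, and a second exponential process converts the result back into a concentration.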
NASA Astrophysics Data System (ADS)
Jeffery, Keith; Harrison, Matt; Bailo, Daniele
2016-04-01
The EPOS-PP Project 2010-2014 proposed an architecture and demonstrated feasibility with a prototype. Requirements based on use cases were collected and an inventory of assets (e.g. datasets, software, users, computing resources, equipment/detectors, laboratory services) (RIDE) was developed. The architecture evolved through three stages of refinement with much consultation both with the EPOS community representing EPOS users and participants in geoscience and with the overall ICT community especially those working on research such as the RDA (Research Data Alliance) community. The architecture consists of a central ICS (Integrated Core Services) consisting of a portal and catalog, the latter providing to end-users a 'map' of all EPOS resources (datasets, software, users, computing, equipment/detectors etc.). ICS is extended to ICS-d (distributed ICS) for certain services (such as visualisation software services or Cloud computing resources) and CES (Computational Earth Science) for specific simulation or analytical processing. ICS also communicates with TCS (Thematic Core Services) which represent European-wide portals to national and local assets, resources and services in the various specific domains (e.g. seismology, volcanology, geodesy) of EPOS. The EPOS-IP project 2015-2019 started October 2015. Two work-packages cover the ICT aspects; WP6 involves interaction with the TCS while WP7 concentrates on ICS including interoperation with ICS-d and CES offerings: in short the ICT architecture. Based on the experience and results of EPOS-PP the ICT team held a pre-meeting in July 2015 and set out a project plan. The first major activity involved requirements (re-)collection with use cases and also updating the inventory of assets held by the various TCS in EPOS. The RIDE database of assets is currently being converted to CERIF (Common European Research Information Format - an EU Recommendation to Member States) to provide the basis for the EPOS-IP ICS Catalog. 
In parallel the ICT team is tracking developments in ICT for relevance to EPOS-IP. In particular, the potential utilisation of e-Is (e-Infrastructures) such as GEANT (network), AARC (security), EGI (GRID computing), EUDAT (data curation), PRACE (High Performance Computing), HELIX-Nebula / Open Science Cloud (Cloud computing) is being assessed. Similarly, relationships with other e-RIs (e-Research Infrastructures) such as ENVRI+, EXCELERATE and other ESFRI (European Strategic Forum for Research Infrastructures) projects are being developed to share experience and technology and to promote interoperability. EPOS ICT team members are also involved in VRE4EIC, a project developing a reference architecture and component software services for a Virtual Research Environment to be superimposed on EPOS-ICS. The challenge now being tackled is therefore to maintain consistency and interoperability among the different modules, initiatives and actors that participate in the process of running the EPOS platform. It implies both continuous updates on the IT aspects of the mentioned initiatives and a refinement of the e-architecture designed so far. One major aspect of EPOS-IP is the ICT support for legalistic, financial and governance aspects of the EPOS ERIC to be initiated during EPOS-IP. This implies a sophisticated AAAI (Authentication, Authorization, Accounting Infrastructure) with consistency throughout the software, communications and data stack.
Performances of multiprocessor multidisk architectures for continuous media storage
NASA Astrophysics Data System (ADS)
Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.
1996-03-01
Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.
Parallel mutual information estimation for inferring gene regulatory networks on GPUs
2011-01-01
Background Mutual information is a measure of similarity between two variables. It has been widely used in various application domains including computational biology, machine learning, statistics, image processing, and financial computing. Previously used simple histogram-based mutual information estimators lack precision compared to kernel-based methods. The recently introduced B-spline function based mutual information estimation method is competitive with the kernel-based methods in terms of quality but at a lower computational complexity. Results We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. The comparisons to existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. Conclusions CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup over sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:21672264
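For orientation, the simple histogram-based estimator that the abstract contrasts with B-spline and kernel methods can be sketched as below. This is a minimal illustration and not CUDA-MI's algorithm: the function name, the equal-width binning and the default bin count are assumptions made for the example.

```python
import math
from collections import Counter


def histogram_mutual_information(x, y, bins=4):
    """Plug-in estimate of I(X;Y) in bits from equal-width histograms.

    Illustrative baseline only (hypothetical helper, not part of CUDA-MI).
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        # A constant input gives width 0, so fall back to 1.0 (everything in bin 0).
        width = (hi - lo) / bins or 1.0
        # The maximum value would land in bin `bins`, so clamp it to the last bin.
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(x), discretize(y)
    n = len(x)
    joint = Counter(zip(bx, by))      # joint bin counts
    px, py = Counter(bx), Counter(by)  # marginal bin counts

    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2( p(i,j) / (p(i) * p(j)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[i] * py[j]))
    return mi
```

Each sample contributes to exactly one bin, which is the source of the coarseness the abstract mentions; B-spline estimators instead spread each sample over several neighbouring bins.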
Architectural Implications of Cloud Computing
2011-10-24
Public Cloud Infrastructure-as-a-Service (IaaS) Software-as-a-Service (SaaS) Cloud Computing Types Platform-as-a-Service (PaaS) Based on Type of...Software-as-a-Service (SaaS) Model of software deployment in which a third-party...and System Solutions (RTSS) Program. Her current interests and projects are in service-oriented architecture (SOA), cloud computing, and context
Generic Software for Emulating Multiprocessor Architectures.
1985-05-01
RD-A157 662. Generic Software for Emulating Multiprocessor Architectures. Massachusetts Institute of Technology, Cambridge, Laboratory for Computer Science...MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139...Keywords: computer architecture, emulation, simulation, dataflow.
Sigint Application for Polymorphous Computing Architecture (PCA): Wideband DF
2006-08-01
Polymorphous Computing Architecture (PCA) program as stated by Robert Graybill is to Develop the computing foundation for agile systems by establishing...ubiquitous MUSIC algorithm rely upon an underlying narrowband signal model [8]. In this case, narrowband means that the signal bandwidth is less than...a wideband DF algorithm is needed to compensate for this model inadequacy. Among the various wideband DF techniques available, the coherent signal
Takeda, Shuntaro; Furusawa, Akira
2017-09-22
We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.
NASA Astrophysics Data System (ADS)
Takeda, Shuntaro; Furusawa, Akira
2017-09-01
Houston Area Survey of Employment Trends for College Graduates.
ERIC Educational Resources Information Center
Somers, Coralie; Small, David
The actual and projected level of demand in the employment of college graduates in the Houston, Texas, area was surveyed. Responses from 74 employers provided information on methods for recruiting college graduates and hiring levels for 13 occupational groups, including advertising, architecture, banking, computer software, construction,…
Universal Design: Implications for Computing Education
ERIC Educational Resources Information Center
Burgstahler, Sheryl
2011-01-01
Universal design (UD), a concept that grew from the field of architecture, has recently emerged as a paradigm for designing instructional methods, curriculum, and assessments that are welcoming and accessible to students with a wide range of characteristics, including those related to race, ethnicity, native language, gender, age, and disability.…
Investigating Architectural Issues in Neuromorphic Computing
2012-05-01
term grasp. Some of these include learning, vision, audition and olfaction, ability to navigate an environment, and goal seeking. These abilities have...Figure 14: Word/sentence level accuracy versus the ambiguity: (a) Word accuracy vs. letter ambiguity, (b) Sentence...accuracy vs. letter ambiguity, and (c) Sentence accuracy vs. word ambiguity
Perseus Project: Interactive Teaching and Research Tools for Ancient Greek Civilization.
ERIC Educational Resources Information Center
Crane, Gregory; Harward, V. Judson
1987-01-01
Describes the Perseus Project, an educational program utilizing computer technology to study ancient Greek civilization. Including approximately 10 percent of all ancient literature and visual information on architecture, sculpture, ceramics, topography, and archaeology, the project spans a range of disciplines. States that Perseus fuels student…
Polymorphous Computing Architectures
2007-12-12
provide a multiprocessor implementation. In this work, we introduce the Atomos transactional programming language, which is the first to include...implicit transactions, strong atomicity, and a scalable multiprocessor implementation [47]. Atomos is derived from Java, but replaces its synchronization...and conditional waiting constructs with transactional alternatives. The Atomos conditional waiting proposal is tailored to allow efficient
PCI-based WILDFIRE reconfigurable computing engines
NASA Astrophysics Data System (ADS)
Fross, Bradley K.; Donaldson, Robert L.; Palmer, Douglas J.
1996-10-01
WILDFORCE is the first PCI-based custom reconfigurable computer that is based on the Splash 2 technology transferred from the National Security Agency and the Institute for Defense Analyses, Supercomputing Research Center (SRC). The WILDFORCE architecture has many of the features of the WILDFIRE computer, such as field-programmable gate array (FPGA) based processing elements, linear array and crossbar interconnection, and high-performance memory and I/O subsystems. New features introduced in the PCI-based WILDFIRE systems include memory/processor options that can be added to any processing element. These options include static and dynamic memory, digital signal processors (DSPs), FPGAs, and microprocessors. In addition to memory/processor options, many different application-specific connectors can be used to extend the I/O capabilities of the system, including systolic I/O, camera input and video display output. This paper also discusses how this new PCI-based reconfigurable computing engine is used for rapid prototyping, real-time video processing and other DSP applications.
Klonoff, David C
2017-07-01
The Internet of Things (IoT) is generating an immense volume of data. With cloud computing, medical sensor and actuator data can be stored and analyzed remotely by distributed servers. The results can then be delivered via the Internet. IoT devices include wireless diabetes devices such as blood glucose monitors, continuous glucose monitors, insulin pens, insulin pumps, and closed-loop systems. The cloud model for data storage and analysis is increasingly unable to process the data avalanche, and processing is being pushed out to the edge of the network closer to where the data-generating devices are. Fog computing and edge computing are two architectures for data handling that can offload data from the cloud, process it near the patient, and transmit information machine-to-machine or machine-to-human in milliseconds or seconds. Sensor data can be processed near the sensing and actuating devices with fog computing (with local nodes) and with edge computing (within the sensing devices). Compared to cloud computing, fog computing and edge computing offer five advantages: (1) greater data transmission speed, (2) less dependence on limited bandwidths, (3) greater privacy and security, (4) greater control over data generated in foreign countries where laws may limit use or permit unwanted governmental access, and (5) lower costs because more sensor-derived data are used locally and less data are transmitted remotely. Connected diabetes devices almost all use fog computing or edge computing because diabetes patients require a very rapid response to sensor input and cannot tolerate delays for cloud computing.
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.
Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros
2014-06-25
The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphics Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
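The cost of exhaustive evaluation follows from the number of unordered SNP pairs, n(n-1)/2. The short sketch below works through that arithmetic; the function names and the throughput figure in the usage note are hypothetical and are not measurements from the paper.

```python
import math


def snp_pair_count(n_snps):
    """Number of unordered SNP pairs an exhaustive scan must score: n choose 2."""
    return math.comb(n_snps, 2)


def estimated_days(n_snps, pairs_per_second):
    """Wall-clock days for an exhaustive scan at an assumed, constant scoring rate."""
    seconds = snp_pair_count(n_snps) / pairs_per_second
    return seconds / 86_400  # seconds per day
```

For 500,000 SNPs this gives 124,999,750,000 pairs and for 1,000,000 SNPs 499,999,500,000 pairs. At an assumed rate of 100,000 pairs per second, the larger scan would take roughly 58 days, which is the order of magnitude the abstract describes for single-threaded runs.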
Heterogeneous computing architecture for fast detection of SNP-SNP interactions
2014-01-01
Background The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphics Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCaskey, Alexander J.
Hybrid programming models for beyond-CMOS technologies will prove critical for integrating new computing technologies alongside our existing infrastructure. Unfortunately the software infrastructure required to enable this is lacking or not available. XACC is a programming framework for extreme-scale, post-exascale accelerator architectures that integrates alongside existing conventional applications. It is a pluggable framework for programming languages developed for next-gen computing hardware architectures like quantum and neuromorphic computing. It lets computational scientists efficiently off-load classically intractable work to attached accelerators through user-friendly Kernel definitions. XACC makes post-exascale hybrid programming approachable for domain computational scientists.
Application of computational physics within Northrop
NASA Technical Reports Server (NTRS)
George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.
1987-01-01
An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analogous developments in CEM. The relationship between advances in these areas and the development of advanced (parallel) architectures and expert systems is also presented.
All-memristive neuromorphic computing with level-tuned neurons
NASA Astrophysics Data System (ADS)
Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos
2016-09-01
In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amounts of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogeneous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represent a significant step towards the development of ultrahigh-density neuromorphic co-processors.
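The spike-timing-dependent plasticity mentioned in the abstract is often introduced through the textbook pair-based exponential rule, sketched below. This is a generic software illustration and not the phase-change memristor mechanism of the paper: the amplitudes, time constants and weight bounds are assumed values chosen for the example.

```python
import math


def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) potentiates; post-before-pre (dt_ms < 0) depresses.
    All parameter values are hypothetical.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def apply_stdp(weight, dt_ms, w_min=0.0, w_max=1.0):
    """Apply one STDP update and clip the synaptic weight to [w_min, w_max]."""
    return min(w_max, max(w_min, weight + stdp_delta_w(dt_ms)))
```

The magnitude of the change decays exponentially with the spike-time difference, so only near-coincident spikes alter the synapse appreciably; this is what lets such a rule pick out temporally correlated inputs.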
All-memristive neuromorphic computing with level-tuned neurons.
Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos
2016-09-02
Aerodynamic optimization studies on advanced architecture computers
NASA Technical Reports Server (NTRS)
Chawla, Kalpana
1995-01-01
The approach to carrying out multi-discipline aerospace design studies in the future, especially in massively parallel computing environments, consists of choosing (1) suitable solvers to compute solutions to equations characterizing a discipline, and (2) efficient optimization methods. In addition, for aerodynamic optimization problems, (3) smart methodologies must be selected to modify the surface shape. In this research effort, a 'direct' optimization method is implemented on the Cray C-90 to improve aerodynamic design. It is coupled with an existing implicit Navier-Stokes solver, OVERFLOW, to compute flow solutions. The optimization method is chosen such that it can accommodate multi-discipline optimization in future computations. In this work, however, only single-discipline aerodynamic optimization will be included.
Specialized computer architectures for computational aerodynamics
NASA Technical Reports Server (NTRS)
Stevenson, D. K.
1978-01-01
In recent years, computational fluid dynamics has made significant progress in modelling aerodynamic phenomena. Currently, one of the major barriers to future development lies in the compute-intensive nature of the numerical formulations and the relatively high cost of performing these computations on commercially available general purpose computers, a cost high with respect to dollar expenditure and/or elapsed time. Today's computing technology will support a program designed to create specialized computing facilities to be dedicated to the important problems of computational aerodynamics. One of the still unresolved questions is the organization of the computing components in such a facility. The characteristics of fluid dynamic problems which will have significant impact on the choice of computer architecture for a specialized facility are reviewed.
Genten: Software for Generalized Tensor Decompositions v. 1.0.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phipps, Eric T.; Kolda, Tamara G.; Dunlavy, Daniel
Tensors, or multidimensional arrays, are a powerful mathematical means of describing multiway data. This software provides computational means for decomposing or approximating a given tensor in terms of smaller tensors of lower dimension, focusing on decomposition of large, sparse tensors. These techniques have applications in many scientific areas, including signal processing, linear algebra, computer vision, numerical analysis, data mining, graph analysis, neuroscience and more. The software is designed to take advantage of parallelism present in emerging computer architectures such as multi-core CPUs, many-core accelerators such as the Intel Xeon Phi, and computation-oriented GPUs to enable efficient processing of large tensors.
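As a small illustration of approximating a sparse tensor by smaller factors, the sketch below stores a tensor in coordinate (COO) form and evaluates a rank-R canonical polyadic (CP) model at the stored entries. It is not Genten's API: the function names and the dictionary-based storage are assumptions made for the example.

```python
def cp_entry(factors, weights, index):
    """Value of a rank-R CP model at one multi-index.

    factors[mode][i][r] is row i, component r of the factor matrix for `mode`;
    weights[r] scales component r. (Hypothetical layout for illustration.)
    """
    total = 0.0
    for r, lam in enumerate(weights):
        term = lam
        for mode, i in enumerate(index):
            term *= factors[mode][i][r]
        total += term
    return total


def sparse_residual_sq(coo, factors, weights):
    """Sum of squared errors over the stored nonzeros of a COO tensor.

    `coo` maps index tuples to values. A full least-squares fit would also
    penalize model values at the implicit zeros; this sketch omits that term.
    """
    return sum((v - cp_entry(factors, weights, idx)) ** 2 for idx, v in coo.items())


# Rank-1 example: x[i][j][k] = a[i] * b[j] * c[k], a = (1, 2), b = (1, 3), c = (1, 0)
coo = {(0, 0, 0): 1.0, (0, 1, 0): 3.0, (1, 0, 0): 2.0, (1, 1, 0): 6.0}
factors = [[[1.0], [2.0]], [[1.0], [3.0]], [[1.0], [0.0]]]
weights = [1.0]
```

Each stored entry is evaluated independently of the others, which is the kind of data parallelism that multi-core and GPU implementations can exploit.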
Mobile Computing for Aerospace Applications
NASA Technical Reports Server (NTRS)
Alena, Richard; Swietek, Gregory E. (Technical Monitor)
1994-01-01
The use of commercial computer technology in specific aerospace mission applications can reduce the cost and project cycle time required for the development of special-purpose computer systems. Additionally, the pace of technological innovation in the commercial market has made new computer capabilities available for demonstrations and flight tests. Three areas of research and development being explored by the Portable Computer Technology Project at NASA Ames Research Center are the application of commercial client/server network computing solutions to crew support and payload operations, the analysis of requirements for portable computing devices, and testing of wireless data communication links as extensions to the wired network. This paper will present computer architectural solutions to portable workstation design including the use of standard interfaces, advanced flat-panel displays and network configurations incorporating both wired and wireless transmission media. It will describe the design tradeoffs used in selecting high-performance processors and memories, interfaces for communication and peripheral control, and high resolution displays. The packaging issues for safe and reliable operation aboard spacecraft and aircraft are presented. The current status of wireless data links for portable computers is discussed from a system design perspective. An end-to-end data flow model for payload science operations from the experiment flight rack to the principal investigator is analyzed using capabilities provided by the new generation of computer products. A future flight experiment on-board the Russian MIR space station will be described in detail including system configuration and function, the characteristics of the spacecraft operating environment, the flight qualification measures needed for safety review, and the specifications of the computing devices to be used in the experiment. The software architecture chosen shall be presented. 
An analysis of the performance characteristics of wireless data links in the spacecraft environment will be discussed. Network performance and operation will be modeled and preliminary test results presented. A crew support application will be demonstrated in conjunction with the network metrics experiment.
Efficient architecture for spike sorting in reconfigurable hardware.
Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying
2013-11-01
This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computations of different weight vectors share the same circuit to lower the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroids are merged into a single updating process to avoid the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented on a field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design that attains a high correct classification rate and high-speed computation.
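For reference, one iteration of the standard two-step FCM procedure (update memberships, then centroids) can be sketched in software as follows. The paper merges these steps into a single hardware update; this sketch does not reproduce that design, and the function name, the fuzzifier default m = 2 and the small eps guard are assumptions.

```python
def fcm_step(points, centers, m=2.0, eps=1e-12):
    """One standard fuzzy C-means iteration on a list of coordinate tuples.

    Returns (memberships, new_centers); memberships[i][j] is the degree to
    which point i belongs to cluster j. Illustrative software version only.
    """
    exponent = 1.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Squared Euclidean distance to each center; eps avoids division by zero.
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) + eps for c in centers]
        inv = [(1.0 / d) ** exponent for d in d2]
        s = sum(inv)
        memberships.append([v / s for v in inv])  # each row sums to 1

    new_centers = []
    for j in range(len(centers)):
        w = [u[j] ** m for u in memberships]  # fuzzified weights for cluster j
        total = sum(w)
        new_centers.append(tuple(
            sum(wi * p[k] for wi, p in zip(w, points)) / total
            for k in range(len(centers[j]))
        ))
    return memberships, new_centers


points = [(0.0,), (0.1,), (5.0,), (5.1,)]
memberships, centers = fcm_step(points, [(0.0,), (5.0,)])
```

In this form the full membership matrix is held between the two steps, which is the storage cost the abstract says the merged hardware update is designed to avoid.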
NASA Technical Reports Server (NTRS)
Clark, David A.
1998-01-01
In light of the escalation of terrorism, the Department of Defense spearheaded the development of new antiterrorist software for all Government agencies by issuing a Broad Agency Announcement to solicit proposals. This Government-wide competition resulted in a team that includes NASA Lewis Research Center's Computer Services Division, who will develop the graphical user interface (GUI) and test it in their usability lab. The team launched a program entitled Joint Sphere of Security (JSOS), crafted a design architecture (see the following figure), and is testing the interface. This software system has a state-of-the-art, object-oriented architecture, with a main kernel composed of the Dynamic Information Architecture System (DIAS) developed by Argonne National Laboratory. DIAS will be used as the software "breadboard" for assembling the components of explosions, such as blast and collapse simulations.
Optical Scanning Architectures For Electronic Printing Applications
NASA Astrophysics Data System (ADS)
Johnson, Richard V.
1987-06-01
The explosive growth of computer technology in recent years has precipitated an equally dramatic growth in the market for nonimpact electronic printers. One of the most popular methods for implementing a high quality nonimpact electronic printer is to integrate a laser scanner with a xerographic copier/duplicator. The subject of this article is a discussion of alternative optical scanner architectures, including both traditional designs which are well represented in the marketplace, and also more exotic designs configured with spatial light modulators, designs which to date have had scant penetration into the marketplace but which can offer superior image quality.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Campbell, Andrea Beth
2004-07-01
This is a case study of the NuMAC nuclear accountability system developed at a private fuel fabrication facility. This paper investigates nuclear material accountability and safeguards by researching expert knowledge applied in the system design and development. Presented is a system developed to detect and deter the theft of weapons grade nuclear material. Examined is the system architecture that includes: issues for the design and development of the system; stakeholder issues; how the system was built and evolved; software design, database design, and development tool considerations; security and computing ethics. (author)
The science of computing - Parallel computation
NASA Technical Reports Server (NTRS)
Denning, P. J.
1985-01-01
Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concomitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.
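The idea of breaking a computation into simpler parts that are processed simultaneously can be illustrated with a sum split into independent partial sums. This is a modern sketch using Python's standard thread pool, not anything from the article: the function names and chunking scheme are assumptions, and in CPython threads show the decomposition rather than a real speed-up for pure-Python arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(f, lo, hi):
    """Sequentially sum f(i) for i in [lo, hi); one independent piece of work."""
    return sum(f(i) for i in range(lo, hi))


def parallel_sum(f, n, workers=4):
    """Sum f(0) + ... + f(n-1) by splitting the index range across workers."""
    if n <= 0:
        return 0
    step = -(-n // workers)  # ceiling division: chunk size per worker
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: partial_sum(f, *b), bounds)
        return sum(parts)  # combine the partial results
```

Because the partial sums share no state, they can run in any order or at the same time, and only the final combination step is sequential.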
Advanced Architectures for Astrophysical Supercomputing
NASA Astrophysics Data System (ADS)
Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.
2010-12-01
Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.
Computer Security Primer: Systems Architecture, Special Ontology and Cloud Virtual Machines
ERIC Educational Resources Information Center
Waguespack, Leslie J.
2014-01-01
With the increasing proliferation of multitasking and Internet-connected devices, security has reemerged as a fundamental design concern in information systems. The shift of IS curricula toward a largely organizational perspective of security leaves little room for focus on its foundation in systems architecture, the computational underpinnings of…
Techniques for the rapid display and manipulation of 3-D biomedical data.
Goldwasser, S M; Reynolds, R A; Talton, D A; Walsh, E S
1988-01-01
The use of fully interactive 3-D workstations with true real-time performance will become increasingly common as technology matures and economical commercial systems become available. This paper provides a comprehensive introduction to high speed approaches to the display and manipulation of 3-D medical objects obtained from tomographic data acquisition systems such as CT, MR, and PET. A variety of techniques are outlined including the use of software on conventional minicomputers, hardware assist devices such as array processors and programmable frame buffers, and special purpose computer architecture for dedicated high performance systems. While both algorithms and architectures are addressed, the major theme centers around the utilization of hardware-based approaches including parallel processors for the implementation of true real-time systems.
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi
1989-01-01
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Sensing and perception: Connectionist approaches to subcognitive computing
NASA Technical Reports Server (NTRS)
Skrzypek, J.
1987-01-01
New approaches to machine sensing and perception are presented. The motivation for crossdisciplinary studies of perception in terms of AI and neurosciences is suggested. The question of computing architecture granularity as related to global/local computation underlying perceptual function is considered and examples of two environments are given. Finally, the examples of using one of the environments, UCLA PUNNS, to study neural architectures for visual function are presented.
Computer architecture evaluation for structural dynamics computations: Project summary
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1989-01-01
The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.
2009-03-01
SENSOR NETWORKS THESIS Presented to the Faculty Department of Electrical and Computer Engineering Graduate School of Engineering and...hierarchical, and Secure Lock within a wireless sensor network (WSN) under the Hubenko architecture. Using a Matlab computer simulation, the impact of the...rekeying protocol should be applied given particular network parameters, such as WSN size. 10 1.3 Experimental Approach A computer simulation in
Kazakis, Georgios; Kanellopoulos, Ioannis; Sotiropoulos, Stefanos; Lagaros, Nikos D
2017-10-01
The construction industry has a major impact on the environment in which we spend most of our lives. Therefore, it is important that the outcome of architectural intuition performs well and complies with the design requirements. Architects usually describe as "optimal design" their choice among a rather limited set of design alternatives, dictated by their experience and intuition. However, modern design of structures requires accounting for a great number of criteria derived from multiple disciplines, often of conflicting nature. Such criteria are derived from structural engineering, eco-design, and bioclimatic and acoustic performance. The resulting vast number of alternatives enhances the need for computer-aided architecture in order to increase the possibility of arriving at a more preferable solution. Therefore, the incorporation of smart, automatic tools in the design process, able to further guide the designer's intuition, becomes even more indispensable. The principal aim of this study is to present possibilities to integrate automatic computational techniques related to topology optimization in the phase of intuition of civil structures as part of computer aided architectural design. In this direction, different aspects of a new computer aided architectural era related to the interpretation of the optimized designs, difficulties resulting from the increased computational effort and 3D printing capabilities are covered herein.
micROS: a morphable, intelligent and collective robot operating system.
Yang, Xuejun; Dai, Huadong; Yi, Xiaodong; Wang, Yanzhen; Yang, Shaowu; Zhang, Bo; Wang, Zhiyuan; Zhou, Yun; Peng, Xuefeng
2016-01-01
Robots are developing in much the same way that personal computers did 40 years ago, and the robot operating system is the critical basis. Current robot software is mainly designed for individual robots. We present in this paper the design of micROS, a morphable, intelligent and collective robot operating system for future collective and collaborative robots. We first present the architecture of micROS, including the distributed architecture for the collective robot system as a whole and the layered architecture for every single node. We then present the design of autonomous behavior management based on the observe-orient-decide-act cognitive behavior model and the design of collective intelligence including collective perception, collective cognition, collective game and collective dynamics. We also give the design of morphable resource management, which first categorizes robot resources into physical, information, cognitive and social domains, and then achieves morphability based on self-adaptive software technology. We finally deploy micROS on NuBot football robots and achieve significant improvement in real-time performance.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
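For readers unfamiliar with the Generalized Hebbian Algorithm named in the abstract above, the following is a minimal software sketch of one weight update in its standard textbook form (Sanger's rule). It is not the paper's hardware architecture; the function name `gha_update`, the learning rate `eta`, and the list-of-lists weight layout are assumptions made for illustration.

```python
def gha_update(weights, x, eta):
    """Apply one Generalized Hebbian Algorithm (Sanger's rule) step.

    weights: list of m rows, each a list of n floats (one row per principal component).
    x:       input vector of length n.
    eta:     learning rate (hypothetical parameter name).
    Returns (new_weights, y), where y holds the m output projections.
    """
    m, n = len(weights), len(x)
    # Outputs y_i = w_i . x
    y = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(m)]
    new_weights = []
    for i in range(m):
        row = []
        for j in range(n):
            # Reconstruction of x_j from components 0..i (lower-triangular term).
            recon = sum(y[k] * weights[k][j] for k in range(i + 1))
            row.append(weights[i][j] + eta * y[i] * (x[j] - recon))
        new_weights.append(row)
    return new_weights, y


if __name__ == "__main__":
    w, y = gha_update([[1.0, 0.0], [0.0, 1.0]], [2.0, 1.0], eta=0.1)
    print(w, y)
```

The update for row i depends only on rows 0 through i, which is why the rule extracts components in order of decreasing variance.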
NASA Technical Reports Server (NTRS)
Hale, Mark A.; Craig, James I.; Mistree, Farrokh; Schrage, Daniel P.
1995-01-01
Computing architectures are being assembled that extend concurrent engineering practices by providing more efficient execution and collaboration on distributed, heterogeneous computing networks. Built on the successes of initial architectures, requirements for a next-generation design computing infrastructure can be developed. These requirements concentrate on those needed by a designer in decision-making processes from product conception to recycling and can be categorized in two areas: design process and design information management. A designer both designs and executes design processes throughout design time to achieve better product and process capabilities while expending fewer resources. In order to accomplish this, information, or more appropriately design knowledge, needs to be adequately managed during product and process decomposition as well as recomposition. A foundation has been laid that captures these requirements in a design architecture called DREAMS (Developing Robust Engineering Analysis Models and Specifications). In addition, a computing infrastructure, called IMAGE (Intelligent Multidisciplinary Aircraft Generation Environment), is being developed that satisfies design requirements defined in DREAMS and incorporates enabling computational technologies.
Gigaflop architecture, a hardware perspective
NASA Technical Reports Server (NTRS)
Feierbach, G. F.
1978-01-01
Any super computer built in the early 1980s will use components that are available by fall 1978. The architecture of such a system cannot depart radically from current super computers if the software experience painfully acquired from these computers in the 70's is to apply. Given the above constraints, 10 billion floating point operations per second (BFLOPS) are attainable and a problem memory of 512 million (64 bit) words could be supported by the technology of the time. In contrast to this, industry is likely to respond with commercially available machines with a performance of less than 150 MFLOPS. This is due to self-imposed constraints on the manufacturers to provide upward compatible architectures (same instruction set) and systems which can be sold in significant volumes. Since this computing speed is inadequate to meet the demands of computational fluid dynamics, a special processor is required. Issues which are felt to be significant in the pursuit of maximum compute capability in this special processor are discussed.
High performance flight computer developed for deep space applications
NASA Technical Reports Server (NTRS)
Bunker, Robert L.
1993-01-01
The development of an advanced space flight computer for real time embedded deep space applications which embodies the lessons learned on Galileo and modern computer technology is described. The requirements are listed and the design implementation that meets those requirements is described. The development of SPACE-16 (Spaceborne Advanced Computing Engine) (where 16 designates the databus width) was initiated to support the MM2 (Mariner Mark II) project. The computer is based on a radiation hardened emulation of a modern 32 bit microprocessor and its family of support devices including a high performance floating point accelerator. Additional custom devices, which include a coprocessor to improve input/output capabilities, a memory interface chip, and an additional support chip that provides management of all fault tolerant features, are described. Detailed supporting analyses and rationale that justify specific design and architectural decisions are provided. The six chip types were designed and fabricated. Testing and evaluation of a brassboard was initiated.
Using Multimedia for Teaching Analysis in History of Modern Architecture.
ERIC Educational Resources Information Center
Perryman, Garry
This paper presents a case for the development and support of a computer-based interactive multimedia program for teaching analysis in community college architecture design programs. Analysis in architecture design is an extremely important strategy for the teaching of higher-order thinking skills, which senior schools of architecture look for in…
Progress in a novel architecture for high performance processing
NASA Astrophysics Data System (ADS)
Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin
2018-04-01
The high performance processing (HPP) is an innovative architecture which targets high performance computing with excellent power efficiency and computing performance. It is suitable for data intensive applications like supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores, which is the first generation of HPP cores, has been taped out successfully under the Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP has made great improvement in architecture. The chip with 32 HPP cores is being developed under the TSMC 16 nm FinFET Compact (FFC) technology process and is planned for commercial use. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).
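As a rough consistency check on the two figures quoted at the end of the abstract above, dividing peak throughput by power efficiency gives the power draw they would imply. This is illustrative arithmetic only; it assumes both figures refer to the same operating point, which the abstract does not state.

```python
# Figures quoted in the abstract above.
peak_tflops = 4.3                 # peak performance, TFLOPS
efficiency_gflops_per_w = 89.5    # power efficiency, GFLOPS/W

# Assumption (not stated in the abstract): both numbers describe the same operating point.
peak_gflops = peak_tflops * 1000.0
implied_power_w = peak_gflops / efficiency_gflops_per_w

print(f"Implied power at peak: {implied_power_w:.1f} W")
```

Under that assumption the 32-core chip would draw on the order of tens of watts at peak.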
Multiple Embedded Processors for Fault-Tolerant Computing
NASA Technical Reports Server (NTRS)
Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy
2005-01-01
A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
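The error-detection idea described above (two processors running the same work, with a comparator checking their outputs) can be sketched in software as follows. This is a toy model, not the FPGA design: the helper names `flip_bit`, `compare_outputs`, and `run_duplicated` are hypothetical, and the single-event upset is simulated by inverting one bit of one replica's result.

```python
def flip_bit(value: int, bit: int) -> int:
    """Simulate a single-event upset by inverting one bit of an integer result."""
    return value ^ (1 << bit)


def compare_outputs(a: int, b: int) -> bool:
    """Comparator: True when the two replica outputs agree."""
    return a == b


def run_duplicated(compute, x, upset_bit=None):
    """Run `compute` on two simulated replicas and compare the results.

    If `upset_bit` is given, that bit of the second replica's output is flipped.
    Returns (output_of_first_replica, outputs_agree).
    """
    out_a = compute(x)
    out_b = compute(x)
    if upset_bit is not None:
        out_b = flip_bit(out_b, upset_bit)
    return out_a, compare_outputs(out_a, out_b)


if __name__ == "__main__":
    square_plus_one = lambda v: v * v + 1
    print(run_duplicated(square_plus_one, 12))               # replicas agree
    print(run_duplicated(square_plus_one, 12, upset_bit=3))  # mismatch detected
```

With only two replicas a mismatch shows that an error occurred but not which replica is wrong, which is consistent with the abstract describing the two-processor prototype as detection only.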
A static data flow simulation study at Ames Research Center
NASA Technical Reports Server (NTRS)
Barszcz, Eric; Howard, Lauri S.
1987-01-01
Demands in computational power, particularly in the area of computational fluid dynamics (CFD), led NASA Ames Research Center to study advanced computer architectures. One architecture being studied is the static data flow architecture based on research done by Jack B. Dennis at MIT. To improve understanding of this architecture, a static data flow simulator, written in Pascal, has been implemented for use on a Cray X-MP/48. A matrix multiply and a two-dimensional fast Fourier transform (FFT), two algorithms used in CFD work at Ames, have been run on the simulator. Execution times can vary by a factor of more than 2 depending on the partitioning method used to assign instructions to processing elements. Service time for matching tokens has proved to be a major bottleneck. Loop control and array address calculation overhead can double the execution time. The best sustained MFLOPS rates were less than 50% of the maximum capability of the machine.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1988-01-01
The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
Enterprise application architecture development based on DoDAF and TOGAF
NASA Astrophysics Data System (ADS)
Tao, Zhi-Gang; Luo, Yun-Feng; Chen, Chang-Xin; Wang, Ming-Zhe; Ni, Feng
2017-05-01
For the purpose of supporting the design and analysis of enterprise application architecture, here, we report a tailored enterprise application architecture description framework and its corresponding design method. The presented framework can effectively support service-oriented architecting and cloud computing by creating the metadata model based on architecture content framework (ACF), DoDAF metamodel (DM2) and Cloud Computing Modelling Notation (CCMN). The framework also makes an effort to extend and improve the mapping between The Open Group Architecture Framework (TOGAF) application architectural inputs/outputs, deliverables and Department of Defence Architecture Framework (DoDAF)-described models. The roadmap of 52 DoDAF-described models is constructed by creating the metamodels of these described models and analysing the constraint relationship among metamodels. By combining the tailored framework and the roadmap, this article proposes a service-oriented enterprise application architecture development process. Finally, a case study is presented to illustrate the results of implementing the tailored framework in the Southern Base Management Support and Information Platform construction project using the development process proposed by the paper.
Selecting a Benchmark Suite to Profile High-Performance Computing (HPC) Machines
2014-11-01
architectures. Machines now contain central processing units (CPUs), graphics processing units (GPUs), and many integrated core (MIC) architecture all...evaluate the feasibility and applicability of a new architecture just released to the market. Researchers are often unsure how available resources will...architectures. Having a suite of programs running on different architectures, such as GPUs, MICs, and CPUs, adds complexity and technical challenges
Phipps, Eric T.; D'Elia, Marta; Edwards, Harold C.; ...
2017-04-18
In this study, quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).
GaAs Supercomputing: Architecture, Language, And Algorithms For Image Processing
NASA Astrophysics Data System (ADS)
Johl, John T.; Baker, Nick C.
1988-10-01
The application of high-speed GaAs processors in a parallel system matches the demanding computational requirements of image processing. The architecture of the McDonnell Douglas Astronautics Company (MDAC) vector processor is described along with the algorithms and language translator. Most image and signal processing algorithms can utilize parallel processing and show a significant performance improvement over sequential versions. The parallelization performed by this system is within each vector instruction. Since each vector has many elements, each requiring some computation, useful concurrent arithmetic operations can easily be performed. Balancing the memory bandwidth with the computation rate of the processors is an important design consideration for high efficiency and utilization. The architecture features a bus-based execution unit consisting of four to eight 32-bit GaAs RISC microprocessors running at a 200 MHz clock rate for a peak performance of 1.6 BOPS. The execution unit is connected to a vector memory with three buses capable of transferring two input words and one output word every 10 nsec. The address generators inside the vector memory perform different vector addressing modes and feed the data to the execution unit. The functions discussed in this paper include basic MATRIX OPERATIONS, 2-D SPATIAL CONVOLUTION, HISTOGRAM, and FFT. For each of these algorithms, assembly language programs were run on a behavioral model of the system to obtain performance figures.
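The peak figures quoted above can be reproduced with simple arithmetic. The sketch below assumes one operation per processor per clock cycle and a 32-bit memory word; neither is stated explicitly in the abstract, so treat the derived bandwidth and ratio as illustrative.

```python
# Figures quoted in the abstract above.
NUM_PROCESSORS = 8          # upper end of the "four to eight" range
CLOCK_HZ = 200_000_000      # 200 MHz clock
BUS_COUNT = 3               # two input buses plus one output bus
BUS_CYCLE_NS = 10           # one word per bus every 10 nsec

# Assumptions (not stated in the abstract).
OPS_PER_CYCLE = 1           # one operation per processor per clock
WORD_BYTES = 4              # 32-bit memory words, matching the 32-bit processors

peak_ops_per_s = NUM_PROCESSORS * CLOCK_HZ * OPS_PER_CYCLE
words_per_s_per_bus = 1_000_000_000 // BUS_CYCLE_NS
memory_bytes_per_s = BUS_COUNT * words_per_s_per_bus * WORD_BYTES
# Operations the execution unit can issue per input word delivered (two input buses).
ops_per_input_word = peak_ops_per_s / (2 * words_per_s_per_bus)

print(f"Peak rate:        {peak_ops_per_s / 1e9:.1f} BOPS")
print(f"Memory bandwidth: {memory_bytes_per_s / 1e9:.1f} GB/s")
print(f"Ops per input word at peak: {ops_per_input_word:.0f}")
```

Under these assumptions, eight processors at 200 MHz give the quoted 1.6 BOPS, and the gap between operation rate and word delivery rate illustrates why the abstract stresses balancing memory bandwidth against computation rate.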
Bioinspired Cellular Structures: Additive Manufacturing and Mechanical Properties
NASA Astrophysics Data System (ADS)
Stampfl, J.; Pettermann, H. E.; Liska, R.
Biological materials (e.g., wood, trabecular bone, marine skeletons) rely heavily on the use of cellular architecture, which provides several advantages. (1) The resulting structures can bear the variety of "real life" load spectra using a minimum of a given bulk material, featuring engineering lightweight design principles. (2) The inside of the structures is accessible to body fluids which deliver the required nutrients. (3) Furthermore, cellular architectures can grow organically by adding or removing individual struts or by changing the shape of the constituting elements. All these facts make the use of cellular architectures a reasonable choice for nature. Using additive manufacturing technologies (AMT), it is now possible to fabricate such structures for applications in engineering and biomedicine. In this chapter, we present methods that allow the 3D computational analysis of the mechanical properties of cellular structures with open porosity. Various different cellular architectures including disorder are studied. In order to quantify the influence of architecture, the apparent density is always kept constant. Furthermore, it is shown how new advanced photopolymers can be used to tailor the mechanical and functional properties of the fabricated structures.
A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.
2014-12-01
Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distribution of scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth Science observing satellites and the magnitude of data from climate model output are predicted to grow into the tens of petabytes, challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines is defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges.
Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while managing the uncertainties of scientific conclusions derived from such capabilities. This talk will provide an overview of JPL's efforts in developing a comprehensive architectural approach to data science.
Two-way cable television project
NASA Astrophysics Data System (ADS)
Wilkens, H.; Guenther, P.; Kiel, F.; Kraus, F.; Mahnkopf, P.; Schnee, R.
1982-02-01
The market demand for a multiuser computer system with interactive services was studied. Mean system work load at peak use hours was estimated and the complexity of dialog with a central computer was determined. Man machine communication by broadband cable television transmission, using digital techniques, was assumed. The end to end system is described. It is user friendly, able to handle 10,000 subscribers, and provides color television display. The central computer system architecture with remote audiovisual terminals is depicted and software is explained. Signal transmission requirements are dealt with. International availability of the test system, including sample programs, is indicated.
NASA Technical Reports Server (NTRS)
Sorini, Chris; Chattopadhyay, Aditi; Goldberg, Robert K.; Kohlman, Lee W.
2016-01-01
Understanding the high velocity impact response of polymer matrix composites with complex architectures is critical to many aerospace applications, including engine fan blade containment systems where the structure must be able to completely contain fan blades in the event of a blade-out. Despite the benefits offered by these materials, the complex nature of textile composites presents a significant challenge for the prediction of deformation and damage under both quasi-static and impact loading conditions. The relatively large mesoscale repeating unit cell (in comparison to the size of structural components) causes the material to behave like a structure rather than a homogeneous material. Impact experiments conducted at NASA Glenn Research Center have shown the damage patterns to be a function of the underlying material architecture. Traditional computational techniques that involve modeling these materials using smeared homogeneous, orthotropic material properties at the macroscale result in simulated damage patterns that are a function of the structural geometry, but not the material architecture. In order to preserve heterogeneity at the highest length scale in a robust yet computationally efficient manner, and capture the architecturally dependent damage patterns, a previously-developed subcell modeling approach where the braided composite unit cell is approximated as a series of four adjacent laminated composites is utilized. This work discusses the implementation of the subcell methodology into the commercial transient dynamic finite element code LS-DYNA (Livermore Software Technology Corp.). Verification and validation studies are also presented, including simulation of the tensile response of straight-sided and notched quasi-static coupons composed of a T700/PR520 triaxially braided [0°/60°/-60°] composite.
Based on the results of the verification and validation studies, advantages and limitations of the methodology as well as plans for future work are discussed.
NASA Astrophysics Data System (ADS)
Mehta, Neville; Kompalli, Suryaprakash; Chaudhary, Vipin
Teleradiology is the electronic transmission of radiological patient images, such as x-rays, CT, or MR across multiple locations. The goal could be interpretation, consultation, or medical records keeping. Information technology solutions have enabled electronic records and their associated benefits are evident in health care today. However, salient aspects of collaborative interfaces, and computer assisted diagnostic (CAD) tools are yet to be integrated into workflow designs. The Computer Assisted Diagnostics and Interventions (CADI) group at the University at Buffalo has developed an architecture that facilitates web-enabled use of CAD tools, along with the novel concept of synchronized collaboration. The architecture can support multiple teleradiology applications and case studies are presented here.
Design of a modular digital computer system, CDRL no. D001, final design plan
NASA Technical Reports Server (NTRS)
Easton, R. A.
1975-01-01
The engineering breadboard implementation for the CDRL no. D001 modular digital computer system developed during design of the logic system was documented. This effort followed the architecture study completed and documented previously, and was intended to verify the concepts of a fault tolerant, automatically reconfigurable, modular version of the computer system conceived during the architecture study. The system has a microprogrammed 32 bit word length, general register architecture and an instruction set consisting of a subset of the IBM System 360 instruction set plus additional fault tolerance firmware. The following areas were covered: breadboard packaging, central control element, central processing element, memory, input/output processor, and maintenance/status panel and electronics.
NASA Astrophysics Data System (ADS)
Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.
2017-11-01
The current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created huge demand for data processing, and such throughput-intensive applications inherently contain data-level parallelism, which is well suited to SIMD architecture based GPUs. This paper reviews the architectural aspects of multi/many core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.
Design and reliability analysis of DP-3 dynamic positioning control architecture
NASA Astrophysics Data System (ADS)
Wang, Fang; Wan, Lei; Jiang, Da-Peng; Xu, Yu-Ru
2011-12-01
As the exploration and exploitation of oil and gas proliferate throughout deepwater areas, the requirements on the reliability of dynamic positioning systems become increasingly stringent. The control objective of ensuring safe operation in deep water cannot be met by a single dynamic positioning controller. In order to increase the availability and reliability of the dynamic positioning control system, triple-redundant hardware and software control architectures were designed and developed according to the safety specifications of the DP-3 classification notation for dynamically positioned ships and rigs. The hardware takes the form of a triple-redundant hot standby configuration including three identical operator stations and three real-time control computers that are connected to each other through dual networks. The motion control and redundancy management functions of the control computers were implemented in software on the real-time operating system VxWorks. The software realization of loose task synchronization, majority voting and fault detection is presented in detail. A hierarchical software architecture was planned during the development of the software, consisting of an application layer, a real-time layer and a physical layer. The behavior of the DP-3 dynamic positioning control system was modeled by a Markov model to analyze its reliability. The effects of variation in parameters on the reliability measures were investigated. A time domain dynamic simulation was carried out on a deepwater drilling rig to prove the feasibility of the proposed control architecture.
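The majority voting mentioned in the abstract above can be illustrated with a generic two-out-of-three voter. This is a minimal sketch, not the paper's VxWorks implementation: the function name `majority_vote` is hypothetical, and it assumes the three controller outputs are directly comparable values (for example, quantized commands) rather than floating-point signals that would need a tolerance band.

```python
from collections import Counter


def majority_vote(outputs):
    """Vote over the outputs of three redundant controllers.

    Returns (voted_value, faulty_indices), where faulty_indices lists the
    controllers that disagreed with the majority. Raises ValueError when
    no value is shared by at least two controllers.
    """
    if len(outputs) != 3:
        raise ValueError("expected exactly three redundant outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority among redundant outputs")
    faulty = [i for i, v in enumerate(outputs) if v != value]
    return value, faulty


if __name__ == "__main__":
    print(majority_vote([10, 10, 10]))  # all controllers agree
    print(majority_vote([10, 10, 11]))  # controller 2 is outvoted
```

Returning the indices of dissenting controllers, rather than only the voted value, gives a fault-detection layer something to act on, such as isolating a controller that disagrees repeatedly.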
Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
Gawande, Nitin A.; Daily, Jeff A.; Siegel, Charles; ...
2018-05-05
Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors—including NVIDIA, Intel, AMD, and IBM—have architectural road maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating large DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. Here, this article provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path or Cray Aries. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, AlexNet, GoogLeNet, and ResNet50 topologies using the Cifar10 and ImageNet datasets. The workloads are vendor-optimized for each architecture. We use sequentially equivalent implementations to maintain iso-accuracy between parallel and sequential DL models. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and the KNL can be competitive in performance/watt. We find that NVLink facilitates scaling efficiency on GPUs. However, its importance is heavily dependent on neural network architecture. Furthermore, for weak-scaling—sometimes encouraged by restricted GPU memory—NVLink is less important.
Scaling Deep Learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gawande, Nitin A.; Daily, Jeff A.; Siegel, Charles
Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors—including NVIDIA, Intel, AMD, and IBM—have architectural road maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating large DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. Here, this article provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path or Cray Aries. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, AlexNet, GoogLeNet, and ResNet50 topologies using the Cifar10 and ImageNet datasets. The workloads are vendor-optimized for each architecture. We use sequentially equivalent implementations to maintain iso-accuracy between parallel and sequential DL models. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and the KNL can be competitive in performance/watt. We find that NVLink facilitates scaling efficiency on GPUs. However, its importance is heavily dependent on neural network architecture. Furthermore, for weak-scaling—sometimes encouraged by restricted GPU memory—NVLink is less important.
1993-03-01
values themselves. The tools perform risk-adjusted present-value comparisons and compute the ROI using discount factors. The assessment of risk in a...developed X Window system, the de facto industry standard window system in the UNIX environment. An X-terminal’s use is limited to display. It has no...2.1 IT HARDWARE The DOS-based PC used in this analysis costs $2,060. It includes an ASL 486DX-33 Industry Standard Architecture (ISA) computer with 8
Lockheed Martin Idaho Technologies Company information management technology architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hughes, M.J.; Lau, P.K.S.
1996-05-01
The Information Management Technology Architecture (TA) is being driven by the business objectives of reducing costs and improving effectiveness. The strategy is to reduce the cost of computing through standardization. The Lockheed Martin Idaho Technologies Company (LMITCO) TA is a set of standards and products for use at the Idaho National Engineering Laboratory (INEL). The TA will provide direction for information management resource acquisitions, development of information systems, formulation of plans, and resolution of issues involving LMITCO computing resources. Exceptions to the preferred products may be granted by the Information Management Executive Council (IMEC). Certain implementation and deployment strategies are inherent in the design and structure of LMITCO TA. These include: migration from centralized toward distributed computing; deployment of the networks, servers, and other information technology infrastructure components necessary for a more integrated information technology support environment; increased emphasis on standards to make it easier to link systems and to share information; and improved use of the company's investment in desktop computing resources. The intent is for the LMITCO TA to be a living document constantly being reviewed to take advantage of industry directions to reduce costs while balancing technological diversity with business flexibility.
Medical Signal-Conditioning and Data-Interface System
NASA Technical Reports Server (NTRS)
Braun, Jeffrey; Jacobus, Charles; Booth, Scott; Suarez, Michael; Smith, Derek; Hartnagle, Jeffrey; LePrell, Glenn
2006-01-01
A general-purpose portable, wearable electronic signal-conditioning and data-interface system is being developed for medical applications. The system can acquire multiple physiological signals (e.g., electrocardiographic, electroencephalographic, and electromyographic signals) from sensors on the wearer's body, digitize those signals that are received in analog form, preprocess the resulting data, and transmit the data to one or more remote location(s) via a radiocommunication link and/or the Internet. The system includes a computer running data-object-oriented software that can be programmed to configure the system to accept almost any analog or digital input signals from medical devices. The computing hardware and software implement a general-purpose data-routing-and-encapsulation architecture that supports tagging of input data and routing the data in a standardized way through the Internet and other modern packet-switching networks to one or more computer(s) for review by physicians. The architecture supports multiple-site buffering of data for redundancy and reliability, and supports both real-time and slower-than-real-time collection, routing, and viewing of signal data. Routing and viewing stations support insertion of automated analysis routines to aid in encoding, analysis, viewing, and diagnosis.
Examining the architecture of cellular computing through a comparative study with a computer
Wang, Degeng; Gribskov, Michael
2005-01-01
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software–hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's ‘hardware’ equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the ‘bandwidth’ of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed. PMID:16849179
Examining the architecture of cellular computing through a comparative study with a computer.
Wang, Degeng; Gribskov, Michael
2005-06-22
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.
Automation of Data Traffic Control on DSM Architecture
NASA Technical Reports Server (NTRS)
Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry
2001-01-01
The design of distributed shared memory (DSM) computers liberates users from the duty to distribute data across processors and allows for the incremental development of parallel programs using, for example, OpenMP or Java threads. DSM architecture greatly simplifies the development of parallel programs having good performance on a few processors. However, to achieve a good program scalability on DSM computers requires that the user understand data flow in the application and use various techniques to avoid data traffic congestions. In this paper we discuss a number of such techniques, including data blocking, data placement, data transposition and page size control and evaluate their efficiency on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks. We also present a tool which automates the detection of constructs causing data congestions in Fortran array oriented codes and advises the user on code transformations for improving data traffic in the application.
A Survey Of Techniques for Managing and Leveraging Caches in GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2014-09-01
Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as unique architecture of GPU, rise of CPU–GPU heterogeneous computing, etc., demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges of cache management in GPUs. The aim of this paper is to provide the readers insights into cache management techniques for GPUs and motivate them to propose even better techniques for leveraging the full potential of caches in the GPUs of tomorrow.
Portable multi-node LQCD Monte Carlo simulations using OpenACC
NASA Astrophysics Data System (ADS)
Bonati, Claudio; Calore, Enrico; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Sanfilippo, Francesco; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization on multiple computing nodes using OpenACC to manage parallelism within the node, and OpenMPI to manage parallelism among the nodes. We first discuss the available strategies for maximizing performance, then describe selected relevant details of the code, and finally measure the level of performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly higher level of performance for this application, but also compares with results measured on other processors.
SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.
Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi
2018-01-01
The Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Array (FPGA) devices to provide high-performance execution and the flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using special-purpose Processing Elements (PEs) for computing SNNs, and by analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor the SNN's activity. Our contribution intends to provide a tool that allows SNNs to be prototyped faster than on CPU/GPU architectures but significantly more cheaply than by fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Design and Analysis of a Neuromemristive Reservoir Computing Architecture for Biosignal Processing
Kudithipudi, Dhireesha; Saleh, Qutaiba; Merkel, Cory; Thesing, James; Wysocki, Bryant
2016-01-01
Reservoir computing (RC) is gaining traction in several signal processing domains, owing to its non-linear stateful computation, spatiotemporal encoding, and reduced training complexity over recurrent neural networks (RNNs). Previous studies have shown the effectiveness of software-based RCs for a wide spectrum of applications. A parallel body of work indicates that realizing RNN architectures using custom integrated circuits and reconfigurable hardware platforms yields significant improvements in power and latency. In this research, we propose a neuromemristive RC architecture, with doubly twisted toroidal structure, that is validated for biosignal processing applications. We exploit the device mismatch to implement the random weight distributions within the reservoir and propose mixed-signal subthreshold circuits for energy efficiency. A comprehensive analysis is performed to compare the efficiency of the neuromemristive RC architecture in both digital (reconfigurable) and subthreshold mixed-signal realizations. Both Electroencephalogram (EEG) and Electromyogram (EMG) biosignal benchmarks are used for validating the RC designs. The proposed RC architecture demonstrated an accuracy of 90 and 84% for epileptic seizure detection and EMG prosthetic finger control, respectively. PMID:26869876
Yoo, Dongjin
2012-07-01
Advanced additive manufacture (AM) techniques are now being developed to fabricate scaffolds with controlled internal pore architectures in the field of tissue engineering. In general, these techniques use a hybrid method which combines computer-aided design (CAD) with computer-aided manufacturing (CAM) tools to design and fabricate complicated three-dimensional (3D) scaffold models. The mathematical descriptions of micro-architectures along with the macro-structures of the 3D scaffold models are limited by current CAD technologies as well as by the difficulty of transferring the designed digital models to standard formats for fabrication. To overcome these difficulties, we have developed an efficient internal pore architecture design system based on triply periodic minimal surface (TPMS) unit cell libraries and associated computational methods to assemble TPMS unit cells into an entire scaffold model. In addition, we have developed a process planning technique based on TPMS internal architecture pattern of unit cells to generate tool paths for freeform fabrication of tissue engineering porous scaffolds. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
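To make the idea of a TPMS unit cell concrete, the following Python sketch evaluates two well-known triply periodic minimal surface approximations (the gyroid and the Schwarz P surface) as implicit functions and estimates the solid volume fraction of one unit cell on a voxel grid. It is a generic illustration, not the design system described in the abstract: the function names, the convention that material occupies the region where the function is at or below a threshold, and the grid resolution are all assumptions.

```python
import math


def gyroid(x, y, z):
    """Level-set approximation of the gyroid surface (period 2*pi)."""
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))


def schwarz_p(x, y, z):
    """Level-set approximation of the Schwarz P surface (period 2*pi)."""
    return math.cos(x) + math.cos(y) + math.cos(z)


def solid_fraction(surface, threshold=0.0, n=24):
    """Estimate the solid volume fraction of one unit cell.

    Assumption: a voxel is solid when surface(x, y, z) <= threshold.
    Samples are taken at the centres of an n x n x n voxel grid that
    spans one period [0, 2*pi) along each axis.
    """
    step = 2 * math.pi / n
    coords = [(i + 0.5) * step for i in range(n)]
    solid = sum(
        1
        for x in coords
        for y in coords
        for z in coords
        if surface(x, y, z) <= threshold
    )
    return solid / n**3


if __name__ == "__main__":
    print(solid_fraction(gyroid))            # close to one half at threshold 0
    print(solid_fraction(schwarz_p, 0.5))    # a higher threshold gives more solid
```

Raising or lowering the threshold is one simple way to tune porosity; tiling the unit cell along each axis then fills a larger scaffold volume.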
A Distributed Laboratory for Event-Driven Coastal Prediction and Hazard Planning
NASA Astrophysics Data System (ADS)
Bogden, P.; Allen, G.; MacLaren, J.; Creager, G. J.; Flournoy, L.; Sheng, Y. P.; Graber, H.; Graves, S.; Conover, H.; Luettich, R.; Perrie, W.; Ramakrishnan, L.; Reed, D. A.; Wang, H. V.
2006-12-01
The 2005 Atlantic hurricane season was the most active in recorded history. Collectively, 2005 hurricanes caused more than 2,280 deaths and record damages of over 100 billion dollars. Of the storms that made landfall, Dennis, Emily, Katrina, Rita, and Wilma caused most of the destruction. Accurate predictions of storm-driven surge, wave height, and inundation can save lives and help keep recovery costs down, provided the information gets to emergency response managers in time. The information must be available well in advance of landfall so that responders can weigh the costs of unnecessary evacuation against the costs of inadequate preparation. The SURA Coastal Ocean Observing and Prediction (SCOOP) Program is a multi-institution collaboration implementing a modular, distributed service-oriented architecture for real time prediction and visualization of the impacts of extreme atmospheric events. The modular infrastructure enables real-time prediction of multi-scale, multi-model, dynamic, data-driven applications. SURA institutions are working together to create a virtual and distributed laboratory integrating coastal models, simulation data, and observations with computational resources and high speed networks. The loosely coupled architecture allows teams of computer and coastal scientists at multiple institutions to innovate complex system components that are interconnected with relatively stable interfaces. The operational system standardizes at the interface level to enable substantial innovation by complementary communities of coastal and computer scientists. This architectural philosophy solves a long-standing problem associated with the transition from research to operations. The SCOOP Program thereby implements a prototype laboratory consistent with the vision of a national, multi-agency initiative called the Integrated Ocean Observing System (IOOS).
Several service-oriented components of the SCOOP enterprise architecture have already been designed and implemented, including data archive and transport services, metadata registry and retrieval (catalog), resource management, and portal interfaces. SCOOP partners are integrating these at the service level and implementing reconfigurable workflows for several kinds of user scenarios, and are working with resource providers to prototype new policies and technologies for on-demand computing.
Embedded Data Processor and Portable Computer Technology testbeds
NASA Technical Reports Server (NTRS)
Alena, Richard; Liu, Yuan-Kwei; Goforth, Andre; Fernquist, Alan R.
1993-01-01
Attention is given to current activities in the Embedded Data Processor and Portable Computer Technology testbed configurations that are part of the Advanced Data Systems Architectures Testbed at the Information Sciences Division at NASA Ames Research Center. The Embedded Data Processor Testbed evaluates advanced microprocessors for potential use in mission and payload applications within the Space Station Freedom Program. The Portable Computer Technology (PCT) Testbed integrates and demonstrates advanced portable computing devices and data system architectures. The PCT Testbed uses both commercial and custom-developed devices to demonstrate the feasibility of functional expansion and networking for portable computers in flight missions.
A supportive architecture for CFD-based design optimisation
NASA Astrophysics Data System (ADS)
Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong
2014-03-01
Multi-disciplinary design optimisation (MDO) is one of the critical methodologies for the implementation of enterprise systems (ES). MDO that requires the analysis of fluid dynamics raises a special challenge because of its extremely intensive computation. The rapid development of computational fluid dynamics (CFD) techniques has led to a rise in their application in various fields. Especially for the exterior design of vehicles, CFD has become one of the three main design tools, comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating CFD analysis into an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which usually has high dimensionality and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review of existing work has found that very few researchers have studied assistive tools to facilitate CFD-based design optimisation. In the paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing techniques and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation.
To illustrate the effectiveness of the proposed architecture and algorithms, the case studies on aerodynamic shape design of a hypersonic cruising vehicle are provided, and the result has shown that the proposed architecture and developed algorithms have performed successfully and efficiently in dealing with the design optimisation with over 200 design variables.
p88110: A Graphical Simulator for Computer Architecture and Organization Courses
ERIC Educational Resources Information Center
Garcia, M. I.; Rodriguez, S.; Perez, A.; Garcia, A.
2009-01-01
Studying fundamental Computer Architecture and Organization topics requires a significant amount of practical work if students are to acquire a good grasp of the theoretical concepts presented in classroom lectures or textbooks. The use of simulators is commonly adopted in order to reach this objective. However, as most of the available…
Component architecture in drug discovery informatics.
Smith, Peter M
2002-05-01
This paper reviews the characteristics of a new model of computing that has been spurred on by the Internet, known as Netcentric computing. Developments in this model led to distributed component architectures, which, although not new ideas, are now realizable with modern tools such as Enterprise Java. The application of this approach to scientific computing, particularly in pharmaceutical discovery research, is discussed and highlighted by a particular case involving the management of biological assay data.
FPGA-based architecture for motion recovering in real-time
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar
2002-03-01
A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher level goals in different applications. Motion estimation algorithms pose a significant computational load for sequential processors, limiting their use in practical applications. In this work we propose a hardware architecture for motion estimation in real time based on FPGA technology. The technique used for motion estimation is Optical Flow, chosen for its accuracy and the density of its velocity estimates; however, other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates near gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility to cover different applications. Some results will be presented and the real-time performance will be discussed and analyzed. The architecture is prototyped in an FPGA board with a Virtex device interfaced to a digital imager.
Parallel, stochastic measurement of molecular surface area.
Juba, Derek; Varshney, Amitabh
2008-08-01
Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
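The general idea of a stochastic surface-area estimate can be sketched in a few lines of Python. This is a generic Monte Carlo illustration, not the authors' GPU algorithm: it assumes the molecule is modelled as a union of spheres, samples points uniformly on each sphere, and counts the fraction that is not buried inside any other sphere. The function names, the sample count and the fixed seed are assumptions. Each sphere (and each sample) is independent, which is what makes this style of estimate easy to parallelise, and increasing the sample count refines the estimate progressively.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform direction on the unit sphere (normalised Gaussian triple)."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:
            return x / norm, y / norm, z / norm


def estimate_surface_area(spheres, samples_per_sphere=2000, seed=0):
    """Monte Carlo estimate of the exposed area of a union of spheres.

    spheres: list of (cx, cy, cz, r) tuples.
    For each sphere, the exposed fraction of sampled surface points is
    multiplied by the full sphere area 4*pi*r^2.
    """
    rng = random.Random(seed)
    total = 0.0
    for i, (cx, cy, cz, r) in enumerate(spheres):
        exposed = 0
        for _ in range(samples_per_sphere):
            ux, uy, uz = random_unit_vector(rng)
            px, py, pz = cx + r * ux, cy + r * uy, cz + r * uz
            # A point is buried if it lies strictly inside another sphere.
            buried = any(
                (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2 < orad ** 2
                for j, (ox, oy, oz, orad) in enumerate(spheres)
                if j != i
            )
            if not buried:
                exposed += 1
        total += 4 * math.pi * r * r * exposed / samples_per_sphere
    return total


if __name__ == "__main__":
    # Two unit spheres whose centres are one unit apart; the exact
    # exposed area of this union is 6*pi.
    print(estimate_surface_area([(0, 0, 0, 1.0), (1, 0, 0, 1.0)]))
```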
Neural networks and applications tutorial
NASA Astrophysics Data System (ADS)
Guyon, I.
1991-09-01
The importance of neural networks has grown dramatically during this decade. While only a few years ago they were primarily of academic interest, now dozens of companies and many universities are investigating the potential use of these systems and products are beginning to appear. The idea of building a machine whose architecture is inspired by that of the brain has roots which go far back in history. Nowadays, technological advances in computers and the availability of custom integrated circuits permit simulations of hundreds or even thousands of neurons. In conjunction, the growing interest in learning machines, non-linear dynamics and parallel computation has spurred renewed attention to artificial neural networks. Many tentative applications have been proposed, including decision systems (associative memories, classifiers, data compressors and optimizers), or parametric models for signal processing purposes (system identification, automatic control, noise canceling, etc.). While they do not always outperform standard methods, neural network approaches are already used in some real world applications for pattern recognition and signal processing tasks. The tutorial is divided into six lectures, which were presented at the Third Graduate Summer Course on Computational Physics (September 3-7, 1990) on Parallel Architectures and Applications, organized by the European Physical Society: (1) Introduction: machine learning and biological computation. (2) Adaptive artificial neurons (perceptron, ADALINE, sigmoid units, etc.): learning rules and implementations. (3) Neural network systems: architectures, learning algorithms. (4) Applications: pattern recognition, signal processing, etc. (5) Elements of learning theory: how to build networks which generalize. (6) A case study: a neural network for on-line recognition of handwritten alphanumeric characters.
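As a small illustration of the adaptive artificial neurons listed in lecture (2), the Python sketch below trains a single perceptron with the classic perceptron learning rule on the logical AND function. It is a textbook example rather than material taken from the tutorial itself; the function names, the learning rate, the epoch limit and the choice of AND as the training task are assumptions.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 when the weighted sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=20):
    """Perceptron learning rule: w <- w + lr * (target - output) * x.

    samples: list of (inputs, target) pairs with targets in {0, 1}.
    Training stops early once an epoch produces no errors.
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if errors == 0:
            break
    return weights, bias


if __name__ == "__main__":
    and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w, b = train_perceptron(and_samples)
    for inputs, target in and_samples:
        print(inputs, predict(w, b, inputs), target)
```

Because AND is linearly separable, the rule is guaranteed to converge; a function such as XOR is not separable by a single unit, which is one motivation for the multi-unit network architectures covered in lecture (3).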
Fault tolerant architectures for integrated aircraft electronics systems, task 2
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1984-01-01
The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
NASA Astrophysics Data System (ADS)
Liu, Lei; Hong, Xiaobin; Wu, Jian; Lin, Jintong
As Grid computing continues to gain popularity in industry and the research community, it also attracts more attention at the consumer level. The large number of users and the high frequency of job requests in the consumer market make it challenging. Clearly, current Client/Server (C/S)-based architectures will become infeasible for supporting large-scale Grid applications because of their poor scalability and poor fault tolerance. In this paper, based on our previous works [1, 2], a novel self-organized architecture to realize a highly scalable and flexible platform for Grids is proposed. Experimental results show that this architecture is suitable and efficient for consumer-oriented Grids.
A Cognitive Computational Model Inspired by the Immune System Response
Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim
2014-01-01
The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that is oscillating between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have been always standing out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of immune system, to figure out the nature of response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by the natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, meanwhile illustrating our proposed architecture perspective. PMID:25003131
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Follen, Gregory J. (Technical Monitor); Radenski, Atanas
2003-01-01
The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer to Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing a dual-service architecture for P2P high-performance divide-and-conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide-and-conquer computations. The service engine is intended to provide free useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor. Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end Internet nodes. Our project is focused on a generic divide-and-conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever changing pool of lower-end Internet nodes.
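The idea of generic support for divide-and-conquer computations can be sketched as a single higher-order function that is parameterized by the four problem-specific pieces. This is only a local, single-process illustration under assumed names (`divide_and_conquer`, `merge`, `merge_sort`); it does not model the project's master server, satellite servers, or distribution of subproblems across Internet nodes.

```python
def divide_and_conquer(problem, is_base, solve_base, divide, combine):
    """Generic skeleton: the caller supplies the four problem-specific hooks."""
    if is_base(problem):
        return solve_base(problem)
    # In a distributed setting each subproblem could be shipped to another
    # node; here they are simply solved recursively in the same process.
    subresults = [
        divide_and_conquer(sub, is_base, solve_base, divide, combine)
        for sub in divide(problem)
    ]
    return combine(subresults)


def merge(parts):
    """Combine step for merge sort: merge two sorted lists into one."""
    left, right = parts
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]


def merge_sort(items):
    """Merge sort expressed as an instance of the generic skeleton."""
    return divide_and_conquer(
        list(items),
        is_base=lambda p: len(p) <= 1,
        solve_base=lambda p: p,
        divide=lambda p: [p[: len(p) // 2], p[len(p) // 2:]],
        combine=merge,
    )
```

Keeping the skeleton separate from the hooks is what makes the support generic: a different computation only needs different `divide` and `combine` functions.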
Data Compression for Maskless Lithography Systems: Architecture, Algorithms and Implementation
2008-05-19
Vito Dai, Electrical Engineering and Computer Sciences... Copyright 2008 by Vito Dai.
The EPOS Vision for the Open Science Cloud
NASA Astrophysics Data System (ADS)
Jeffery, Keith; Harrison, Matt; Cocco, Massimo
2016-04-01
Cloud computing offers dynamic elastic scalability for data processing on demand. For much research activity, demand for computing is uneven over time and so CLOUD computing offers both cost-effectiveness and capacity advantages. However, as reported repeatedly by the EC Cloud Expert Group, there are barriers to the uptake of Cloud Computing: (1) security and privacy; (2) interoperability (avoidance of lock-in); (3) lack of appropriate systems development environments for application programmers to characterise their applications to allow CLOUD middleware to optimize their deployment and execution. From CERN, the Helix-Nebula group has proposed the architecture for the European Open Science Cloud. They are discussing with other e-Infrastructure groups such as EGI (GRIDs), EUDAT (data curation), AARC (network authentication and authorisation) and also with the EIROFORUM group of 'international treaty' RIs (Research Infrastructures) and the ESFRI (European Strategic Forum for Research Infrastructures) RIs including EPOS. Many of these RIs are either e-RIs (electronic-RIs) or have an e-RI interface for access and use. The EPOS architecture is centred on a portal: ICS (Integrated Core Services). The architectural design already allows for access to e-RIs (which may include any or all of data, software, users and resources such as computers or instruments). Those within any one domain (subject area) of EPOS are considered within the TCS (Thematic Core Services). Those outside, or available across multiple domains of EPOS, are ICS-d (Integrated Core Services-Distributed) since the intention is that they will be used by any or all of the TCS via the ICS. Another such service type is CES (Computational Earth Science); effectively an ICS-d specializing in high performance computation, analytics, simulation or visualization offered by a TCS for others to use. 
Discussions are already underway between EPOS and EGI, EUDAT, AARC and Helix-Nebula for those offerings to be considered as ICS-ds by EPOS. Provision of access to ICS-ds from ICS-C concerns several aspects: (a) technical: it may be more or less difficult to connect and pass from ICS-C to the ICS-d/CES the 'package' (probably a virtual machine) of data and software; (b) security/privacy: including passing personal information, e.g. related to AAAI (Authentication, Authorization, Accounting Infrastructure); (c) financial and legal: such as payment and licence conditions. Appropriate interfaces from ICS-C to ICS-d are being designed to accommodate these aspects. The Open Science Cloud is timely because it provides a framework to discuss governance and sustainability for computational resource provision as well as an effective interpretation of a federated approach to HPC (High Performance Computing)-HTC (High Throughput Computing). It will be a unique opportunity to share and adopt procurement policies to provide access to computational resources for RIs. The current state of discussions and expected roadmap for the EPOS-Open Science Cloud relationship are presented.
Behavioral Reference Model for Pervasive Healthcare Systems.
Tahmasbi, Arezoo; Adabi, Sahar; Rezaee, Ali
2016-12-01
The emergence of mobile healthcare systems is an important outcome of the application of pervasive computing concepts for medical care purposes. These systems provide the facilities and infrastructure required for automatic and ubiquitous sharing of medical information. Healthcare systems have a dynamic structure and configuration; therefore, having an architecture is essential for future development of these systems. The need for faster response, the problem of limited storage, the need for accelerated processing, and the tendency toward creating a new generation of healthcare system architecture highlight the need for further focus on cloud-based solutions to data transfer and data processing challenges. Integrity and reliability of healthcare systems are of critical importance, as even the slightest error may put the patients' lives in danger; therefore, acquiring a behavioral model for these systems and developing the tools required to model their behaviors are of significant importance. The high-level designs may contain some flaws; therefore, the system must be fully examined for different scenarios and conditions. This paper presents a software architecture for development of healthcare systems based on pervasive computing concepts, and then models the behavior of the described system. A set of solutions is then proposed to improve the design's qualitative characteristics, including availability, interoperability, and performance.
NASA Astrophysics Data System (ADS)
Suarez, Hernan; Zhang, Yan R.
2015-05-01
New radar applications need to perform complex algorithms and process large quantities of data to generate useful information for users. This situation has motivated the search for better processing solutions that include low-power, high-performance processors, efficient algorithms, and high-speed interfaces. In this work, hardware implementations of adaptive pulse compression for real-time transceiver optimization are presented; they are based on a System-on-Chip architecture for Xilinx devices. This study also evaluates the performance of dedicated coprocessors as hardware accelerator units to speed up and improve the computation of computing-intensive tasks such as matrix multiplication and matrix inversion, which are essential units to solve the covariance matrix. The tradeoffs between latency and hardware utilization are also presented. Moreover, the system architecture takes advantage of the embedded processor, which is interconnected with the logic resources through the high-performance AXI buses, to perform floating-point operations, control the processing blocks, and communicate with an external PC through a customized software interface. The overall system functionality is demonstrated and tested for real-time operations using a Ku-band testbed together with a low-cost channel emulator for different types of waveforms.
Fault-tolerant battery system employing intra-battery network architecture
Hagen, Ronald A.; Chen, Kenneth W.; Comte, Christophe; Knudson, Orlin B.; Rouillard, Jean
2000-01-01
A distributed energy storing system employing a communications network is disclosed. A distributed battery system includes a number of energy storing modules, each of which includes a processor and communications interface. In a network mode of operation, a battery computer communicates with each of the module processors over an intra-battery network and cooperates with individual module processors to coordinate module monitoring and control operations. The battery computer monitors a number of battery and module conditions, including the potential and current state of the battery and individual modules, and the conditions of the battery's thermal management system. An over-discharge protection system, equalization adjustment system, and communications system are also controlled by the battery computer. The battery computer logs and reports various status data on battery level conditions which may be reported to a separate system platform computer. A module transitions to a stand-alone mode of operation if the module detects an absence of communication connectivity with the battery computer. A module which operates in a stand-alone mode performs various monitoring and control functions locally within the module to ensure safe and continued operation.
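The mode transition described above can be sketched as a small state machine. Everything concrete here is a hypothetical illustration: the `Module` class, the timeout value, and the choice to return to network mode when a message arrives again are assumptions, since the abstract only states that a module enters stand-alone mode when it detects an absence of communication connectivity with the battery computer.

```python
from enum import Enum


class Mode(Enum):
    NETWORK = "network"
    STAND_ALONE = "stand-alone"


class Module:
    """One energy storing module that watches for battery-computer traffic."""

    def __init__(self, timeout_s=2.0):
        self.timeout_s = timeout_s  # assumed silence threshold, in seconds
        self.mode = Mode.NETWORK
        self.last_heard_s = 0.0

    def on_message(self, now_s):
        """Record a message from the battery computer (assumed to restore network mode)."""
        self.last_heard_s = now_s
        self.mode = Mode.NETWORK

    def tick(self, now_s):
        """Periodic check: fall back to local control after prolonged silence."""
        if now_s - self.last_heard_s > self.timeout_s:
            self.mode = Mode.STAND_ALONE
        return self.mode
```

A caller would invoke `tick` from the module's periodic control loop and run the local monitoring and control functions whenever it returns `Mode.STAND_ALONE`.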
NASA Astrophysics Data System (ADS)
Singh, Surya P. N.; Thayer, Scott M.
2002-02-01
This paper presents a novel algorithmic architecture for the coordination and control of large scale distributed robot teams derived from the constructs found within the human immune system. Using this as a guide, the Immunology-derived Distributed Autonomous Robotics Architecture (IDARA) distributes tasks so that broad, all-purpose actions are refined and followed by specific and mediated responses based on each unit's utility and capability to timely address the system's perceived need(s). This method improves on initial developments in this area by including often overlooked interactions of the innate immune system, resulting in a stronger first-order, general response mechanism. This allows for rapid reactions in dynamic environments, especially those lacking significant a priori information. As characterized via computer simulation of a self-healing mobile minefield having up to 7,500 mines and 2,750 robots, IDARA provides an efficient, communications-light, and scalable architecture that yields significant operation and performance improvements for large-scale multi-robot coordination and control.
Deep Space Network information system architecture study
NASA Technical Reports Server (NTRS)
Beswick, C. A.; Markley, R. W. (Editor); Atkinson, D. J.; Cooper, L. P.; Tausworthe, R. C.; Masline, R. C.; Jenkins, J. S.; Crowe, R. A.; Thomas, J. L.; Stoloff, M. J.
1992-01-01
The purpose of this article is to describe an architecture for the DSN information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990's. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies--i.e., computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control.
Autonomic Computing for Spacecraft Ground Systems
NASA Technical Reports Server (NTRS)
Li, Zhenping; Savkli, Cetin; Jones, Lori
2007-01-01
Autonomic computing for spacecraft ground systems increases the system reliability and reduces the cost of spacecraft operations and software maintenance. In this paper, we present an autonomic computing solution for spacecraft ground systems at NASA Goddard Space Flight Center (GSFC), which consists of an open standard for a message oriented architecture referred to as the GMSEC architecture (Goddard Mission Services Evolution Center), and an autonomic computing tool, the Criteria Action Table (CAT). This solution has been used in many upgraded ground systems for NASA's missions, and provides a framework for developing solutions with higher autonomic maturity.
Traffic information computing platform for big data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn
A big data environment creates the data conditions for improving the quality of traffic information services. The aim of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotation and technical characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. In the big data environment, this type of traffic atomic information computing architecture helps to guarantee safe and efficient traffic operation, and more intelligent and personalized traffic information services can be provided to traffic information users.
Low-cost space-varying FIR filter architecture for computational imaging systems
NASA Astrophysics Data System (ADS)
Feng, Guotong; Shoaib, Mohammed; Schwartz, Edward L.; Dirk Robinson, M.
2010-01-01
Recent research demonstrates the advantage of designing electro-optical imaging systems by jointly optimizing the optical and digital subsystems. The optical systems designed using this joint approach intentionally introduce large and often space-varying optical aberrations that produce blurry optical images. Digital sharpening restores reduced contrast due to these intentional optical aberrations. Computational imaging systems designed in this fashion have several advantages including extended depth-of-field, lower system costs, and improved low-light performance. Currently, most consumer imaging systems lack the necessary computational resources to compensate for these optical systems with large aberrations in the digital processor. Hence, the exploitation of the advantages of the jointly designed computational imaging system requires low-complexity algorithms enabling space-varying sharpening. In this paper, we describe a low-cost algorithmic framework and associated hardware enabling the space-varying finite impulse response (FIR) sharpening required to restore largely aberrated optical images. Our framework leverages the space-varying properties of optical images formed using rotationally-symmetric optical lens elements. First, we describe an approach to leverage the rotational symmetry of the point spread function (PSF) about the optical axis allowing computational savings. Second, we employ a specially designed bank of sharpening filters tuned to the specific radial variation common to optical aberrations. We evaluate the computational efficiency and image quality achieved by using this low-cost space-varying FIR filter architecture.
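One way to picture a radially tuned filter bank is to pick the sharpening kernel for each pixel from its distance to the optical center. The sketch below does only that; the zone layout, the kernel values, the border clamping, and all function names are assumptions made for illustration, and it does not reproduce the paper's symmetry-based computational savings or its hardware design.

```python
import math


def radial_zone(row, col, center, zone_width, n_zones):
    """Map a pixel to a ring index based on its distance from the center."""
    r = math.hypot(row - center[0], col - center[1])
    return min(int(r // zone_width), n_zones - 1)


def space_varying_fir(image, kernels, zone_width):
    """Filter each pixel with the kernel assigned to its radial zone.

    `image` is a list of rows; `kernels` is a list of square, odd-sized
    2-D kernels ordered from the center zone outward. The kernels are
    applied as a correlation, which equals convolution for the symmetric
    kernels used here.
    """
    h, w = len(image), len(image[0])
    center = ((h - 1) / 2.0, (w - 1) / 2.0)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            k = kernels[radial_zone(i, j, center, zone_width, len(kernels))]
            half = len(k) // 2
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(max(i + di, 0), h - 1)  # clamp at the borders
                    jj = min(max(j + dj, 0), w - 1)
                    acc += k[di + half][dj + half] * image[ii][jj]
            out[i][j] = acc
    return out


# Illustrative two-kernel bank: no sharpening near the axis, more at the edge.
IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]
```

Both example kernels sum to one, so a uniformly flat image passes through unchanged while edges in the outer zone are boosted.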
Lai, Chin-Feng; Chen, Min; Pan, Jeng-Shyang; Youn, Chan-Hyun; Chao, Han-Chieh
2014-03-01
As cloud computing and wireless body sensor network technologies mature, ubiquitous healthcare services can prevent accidents instantly and effectively, as well as provide relevant information to reduce related processing time and cost. This study proposes a co-processing intermediary framework that integrates cloud and wireless body sensor networks and is mainly applied to fall detection and 3-D motion reconstruction. The main focuses of this study include distributed computing and resource allocation for processing sensing data over the computing architecture, network conditions, and performance evaluation. Through this framework, the transmission and computing time of sensing data are reduced to enhance overall performance for the services of fall event detection and 3-D motion reconstruction.
Dynamic array processing for computationally intensive expert systems in CLIPS
NASA Technical Reports Server (NTRS)
Athavale, N. N.; Ragade, R. K.; Fenske, T. E.; Cassaro, M. A.
1990-01-01
This paper puts forth an architecture for implementing a loop for advanced data structure of arrays in CLIPS. An attempt is made to use multi-field variables in such an architecture to process a set of data during the decision making cycle. Also, current limitations on the expert system shells are discussed in brief in this paper. The resulting architecture is designed to circumvent the current limitations set by the expert system shell and also by the operating environment. Such advanced data structures are needed for tightly coupling symbolic and numeric computation modules.
A model for architectural comparison
NASA Astrophysics Data System (ADS)
Ho, Sam; Snyder, Larry
1988-04-01
Recently, architectures for sequential computers became a topic of much discussion and controversy. At the center of this storm is the Reduced Instruction Set Computer, or RISC, first described at Berkeley in 1980. While the merits of the RISC architecture cannot be ignored, its opponents have tried to do just that, while its proponents have expanded and frequently exaggerated them. This state of affairs has persisted to this day. No attempt is made to settle this controversy, since indeed there is likely no one answer. A qualitative framework is provided for a rational discussion of the issues.
Dual-scale topology optoelectronic processor.
Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H
1991-12-15
The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.
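The generalization of multiplication and summation to other operations can be illustrated in software by passing the two operators as parameters of a matrix-vector product. This is a conceptual sketch only; the function name and the min-plus example are assumptions chosen for illustration and say nothing about how D-STOP realizes the operations in hardware.

```python
import operator
from functools import reduce


def generalized_matvec(matrix, vector, combine=operator.mul, accumulate=operator.add):
    """Matrix-vector product with pluggable element and reduction operators.

    With the defaults this is the ordinary product; other operator pairs
    give nonlinear or symbolic variants. Rows are assumed to be non-empty.
    """
    return [
        reduce(accumulate, (combine(a, x) for a, x in zip(row, vector)))
        for row in matrix
    ]


A = [[1, 2], [3, 4]]
v = [5, 6]

ordinary = generalized_matvec(A, v)  # multiply, then sum
min_plus = generalized_matvec(A, v, combine=operator.add, accumulate=min)  # add, then take the minimum
```

The min-plus pairing is the one used in shortest-path style computations, which is one example of the broader class of problems such a generalized product can express.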
Computational Cosmology at the Bleeding Edge
NASA Astrophysics Data System (ADS)
Habib, Salman
2013-04-01
Large-area sky surveys are providing a wealth of cosmological information to address the mysteries of dark energy and dark matter. Observational probes based on tracking the formation of cosmic structure are essential to this effort, and rely crucially on N-body simulations that solve the Vlasov-Poisson equation in an expanding Universe. As statistical errors from survey observations continue to shrink, and cosmological probes increase in number and complexity, simulations are entering a new regime in their use as tools for scientific inference. Changes in supercomputer architectures provide another rationale for developing new parallel simulation and analysis capabilities that can scale to computational concurrency levels measured in the millions to billions. In this talk I will outline the motivations behind the development of the HACC (Hardware/Hybrid Accelerated Cosmology Code) extreme-scale cosmological simulation framework and describe its essential features. By exploiting a novel algorithmic structure that allows flexible tuning across diverse computer architectures, including accelerated and many-core systems, HACC has attained a performance of 14 PFlops on the IBM BG/Q Sequoia system at 69% of peak, using more than 1.5 million cores.
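The quoted performance figures imply a machine peak that can be backed out with one line of arithmetic. The symbols for sustained rate, peak rate, and efficiency are notation introduced here for illustration, and the inputs are the rounded values from the abstract, so the results are approximate.

```latex
\eta = \frac{R_{\text{sustained}}}{R_{\text{peak}}}
\quad\Longrightarrow\quad
R_{\text{peak}} = \frac{R_{\text{sustained}}}{\eta}
\approx \frac{14\ \text{PFlops}}{0.69}
\approx 20.3\ \text{PFlops},
\qquad
\frac{R_{\text{sustained}}}{N_{\text{cores}}}
\lesssim \frac{14 \times 10^{15}}{1.5 \times 10^{6}}
\approx 9.3\ \text{GFlops per core}
```

The per-core figure is an upper bound because the abstract states that more than 1.5 million cores were used.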
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore-specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one of the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
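For readers unfamiliar with the kernel, a plain serial SpMV can be sketched as follows. The compressed sparse row (CSR) layout is an assumption made for this illustration, since the abstract does not name a storage format, and none of the multicore optimizations studied in the work are shown; the function names are likewise hypothetical.

```python
def to_csr(dense):
    """Convert a dense list-of-rows matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # marks where the next row starts
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x, touching only the stored nonzeros of A."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect access to x through col_idx is what makes SpMV
            # irregular in its memory access pattern.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


dense = [[4, 0, 0],
         [0, 0, 2],
         [1, 0, 3]]
values, col_idx, row_ptr = to_csr(dense)
y = spmv_csr(values, col_idx, row_ptr, [1, 2, 3])
```

Each stored nonzero contributes one multiply and one add but requires loading a value, a column index, and an entry of `x`, which is why the kernel is typically limited by memory bandwidth rather than arithmetic.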
Multicore Architecture-aware Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srinivasa, Avinash
Modern high performance systems are becoming increasingly complex and powerful due to advancements in processor and memory architecture. In order to keep up with this increasing complexity, applications have to be augmented with certain capabilities to fully exploit such systems. These may be at the application level, such as static or dynamic adaptations, or at the system level, like having strategies in place to override some of the default operating system policies, the main objective being to improve computational performance of the application. The current work proposes two such capabilities with respect to multi-threaded scientific applications, in particular a large scale physics application computing ab-initio nuclear structure. The first involves using a middleware tool to invoke dynamic adaptations in the application, so as to be able to adjust to the changing computational resource availability at run-time. The second involves a strategy for effective placement of data in main memory, to optimize memory access latencies and bandwidth. These capabilities, when included, were found to have a significant impact on the application performance, resulting in average speedups of as much as two to four times.
E-Governance and Service Oriented Computing Architecture Model
NASA Astrophysics Data System (ADS)
Tejasvee, Sanjay; Sarangdevot, S. S.
2010-11-01
E-Governance is the effective application of information and communication technology (ICT) in government processes to accomplish safe and reliable information lifecycle management. The information lifecycle involves various processes such as capturing, preserving, manipulating, and delivering information. E-Governance is meant to transform governance for citizens so that it is transparent, reliable, participatory, and accountable. The purpose of this paper is to propose an e-governance model focused on the Service Oriented Computing Architecture (SOCA), which combines information and services provided by the government, supports innovation, and seeks optimal service delivery to citizens and implementation in a transparent and accountable practice. This paper also focuses on the E-government Service Manager as a key factor of the service-oriented computing model, which provides a dynamically extensible structural design in which every area or branch can bring in innovative services. The heart of this paper is a conceptual model that enables E-government communication among trade and business, citizens, government, and autonomous bodies.
Eastman, Peter; Friedrichs, Mark S; Chodera, John D; Radmer, Randall J; Bruns, Christopher M; Ku, Joy P; Beauchamp, Kyle A; Lane, Thomas J; Wang, Lee-Ping; Shukla, Diwakar; Tye, Tony; Houston, Mike; Stich, Timo; Klein, Christoph; Shirts, Michael R; Pande, Vijay S
2013-01-08
OpenMM is a software toolkit for performing molecular simulations on a range of high performance computing architectures. It is based on a layered architecture: the lower layers function as a reusable library that can be invoked by any application, while the upper layers form a complete environment for running molecular simulations. The library API hides all hardware-specific dependencies and optimizations from the users and developers of simulation programs: they can be run without modification on any hardware on which the API has been implemented. The current implementations of OpenMM include support for graphics processing units using the OpenCL and CUDA frameworks. In addition, OpenMM was designed to be extensible, so new hardware architectures can be accommodated and new functionality (e.g., energy terms and integrators) can be easily added.
Eastman, Peter; Friedrichs, Mark S.; Chodera, John D.; Radmer, Randall J.; Bruns, Christopher M.; Ku, Joy P.; Beauchamp, Kyle A.; Lane, Thomas J.; Wang, Lee-Ping; Shukla, Diwakar; Tye, Tony; Houston, Mike; Stich, Timo; Klein, Christoph; Shirts, Michael R.; Pande, Vijay S.
2012-01-01
OpenMM is a software toolkit for performing molecular simulations on a range of high performance computing architectures. It is based on a layered architecture: the lower layers function as a reusable library that can be invoked by any application, while the upper layers form a complete environment for running molecular simulations. The library API hides all hardware-specific dependencies and optimizations from the users and developers of simulation programs: they can be run without modification on any hardware on which the API has been implemented. The current implementations of OpenMM include support for graphics processing units using the OpenCL and CUDA frameworks. In addition, OpenMM was designed to be extensible, so new hardware architectures can be accommodated and new functionality (e.g., energy terms and integrators) can be easily added. PMID:23316124
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Howser, Lona M.
1993-01-01
The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different from those of the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and X-based debugger, CXdb. The document is intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.
NASA Space Engineering Research Center for VLSI systems design
NASA Technical Reports Server (NTRS)
1991-01-01
This annual review reports the center's activities and findings on very large scale integration (VLSI) systems design for 1990, including project status, financial support, publications, the NASA Space Engineering Research Center (SERC) Symposium on VLSI Design, research results, and outreach programs. Processor chips completed or under development are listed. Research results summarized include a design technique to harden complementary metal oxide semiconductors (CMOS) memory circuits against single event upset (SEU); improved circuit design procedures; and advances in computer aided design (CAD), communications, computer architectures, and reliability design. Also described is a high school teacher program that exposes teachers to the fundamentals of digital logic design.
ERIC Educational Resources Information Center
Hung, Y.-C.
2012-01-01
This paper investigates the impact of combining self explaining (SE) with computer architecture diagrams to help novice students learn assembly language programming. Pre- and post-test scores for the experimental and control groups were compared and subjected to covariance (ANCOVA) statistical analysis. Results indicate that the SE-plus-diagram…
A Model for Minimizing Numeric Function Generator Complexity and Delay
2007-12-01
allow computation of difficult mathematical functions in less time and with less hardware than commonly employed methods. They compute piecewise...Programmable Gate Arrays (FPGAs). The algorithms and estimation techniques apply to various NFG architectures and mathematical functions. This...thesis compares hardware utilization and propagation delay for various NFG architectures, mathematical functions, word widths, and segmentation methods
Usage of Thin-Client/Server Architecture in Computer Aided Education
ERIC Educational Resources Information Center
Cimen, Caghan; Kavurucu, Yusuf; Aydin, Halit
2014-01-01
With the advances of technology, thin-client/server architecture has become popular in multi-user/single network environments. Thin-client is a user terminal in which the user can login to a domain and run programs by connecting to a remote server. Recent developments in network and hardware technologies (cloud computing, virtualization, etc.)…
Optical Computing Based on Neuronal Models
1988-05-01
walking, and cognition are far too complex for existing sequential digital computers. Therefore new architectures, hardware, and algorithms modeled...collective behavior, and iterative processing into optical processing and artificial neurodynamical systems. Another intriguing promise of neural nets is...with architectures, implementations, and programming; and material research is called for. Our future research in neurodynamics will continue to
Using SPEEDES to simulate the blue gene interconnect network
NASA Technical Reports Server (NTRS)
Springer, P.; Upchurch, E.
2003-01-01
JPL and the Center for Advanced Computer Architecture (CACR) are conducting application and simulation analyses of BG/L in order to establish a range of effectiveness for the Blue Gene/L MPP architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real world ASCI application execution.
The Use of Metaphors as a Parametric Design Teaching Model: A Case Study
ERIC Educational Resources Information Center
Agirbas, Asli
2018-01-01
Teaching methodologies for parametric design are being researched all over the world, since there is a growing demand for computer programming logic and its fabrication process in architectural education. The computer programming courses in architectural education are usually done in a very short period of time, and so students have no chance to…
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.
1992-01-01
An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.
A multitasking finite state architecture for computer control of an electric powertrain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burba, J.C.
1984-01-01
Finite state techniques provide a common design language between the control engineer and the computer engineer for event driven computer control systems. They simplify communication and provide a highly maintainable control system understandable by both. This paper describes the development of a control system for an electric vehicle powertrain utilizing finite state concepts. The basics of finite state automata are provided as a framework to discuss a unique multitasking software architecture developed for this application. The architecture employs conventional time-sliced techniques with task scheduling controlled by a finite state machine representation of the control strategy of the powertrain. The complexities of excitation variable sampling in this environment are also considered.
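The scheduling idea in this abstract, time-sliced tasks whose eligibility is decided by a finite state machine, can be illustrated with a minimal sketch. The states, events, and task names below are hypothetical and chosen only for illustration; the abstract does not list the actual states or tasks of the powertrain controller.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical powertrain control states (not taken from the paper)."""
    IDLE = auto()
    DRIVE = auto()
    REGEN = auto()
    FAULT = auto()


# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "accelerate"): State.DRIVE,
    (State.DRIVE, "brake"): State.REGEN,
    (State.DRIVE, "stop"): State.IDLE,
    (State.REGEN, "accelerate"): State.DRIVE,
    (State.REGEN, "stop"): State.IDLE,
}

# Tasks that receive a time slice while the machine is in each state.
TASKS = {
    State.IDLE: ["monitor_battery"],
    State.DRIVE: ["monitor_battery", "control_motor"],
    State.REGEN: ["monitor_battery", "control_regen"],
    State.FAULT: ["shutdown"],
}


def step(state, event):
    """Return the next state; a 'fault' event always wins, unknown events are ignored."""
    if event == "fault":
        return State.FAULT
    return TRANSITIONS.get((state, event), state)


def run(events):
    """Process events in order, giving one time slice to each task of the current state."""
    state = State.IDLE
    log = []
    for event in events:
        state = step(state, event)
        for task in TASKS[state]:
            log.append((state.name, task))
    return state, log
```

Keeping the control strategy in a table, rather than in nested conditionals, is one way to get the shared "design language" the abstract mentions: the table can be reviewed by a control engineer without reading the scheduler code.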
Network architecture test-beds as platforms for ubiquitous computing.
Roscoe, Timothy
2008-10-28
Distributed systems research, and in particular ubiquitous computing, has traditionally assumed the Internet as a basic underlying communications substrate. Recently, however, the networking research community has come to question the fundamental design or 'architecture' of the Internet. This has been led by two observations: first, that the Internet as it stands is now almost impossible to evolve to support new functionality; and second, that modern applications of all kinds now use the Internet rather differently, and frequently implement their own 'overlay' networks above it to work around its perceived deficiencies. In this paper, I discuss recent academic projects to allow disruptive change to the Internet architecture, and also outline a radically different view of networking for ubiquitous computing that such proposals might facilitate.
Improving TOGAF ADM 9.1 Migration Planning Phase by ITIL V3 Service Transition
NASA Astrophysics Data System (ADS)
Hanum Harani, Nisa; Akhmad Arman, Arry; Maulana Awangga, Rolly
2018-04-01
Planning a business transformation that relies on technology requires a transition and migration planning process, and planning the system migration activity is the most important part. The migration process includes complex elements such as business re-engineering, transition scheme mapping, data transformation, application development, individual involvement by computer, and trial interaction. TOGAF ADM is a framework and method for implementing enterprise architecture, and it provides guidance on architecture and migration planning. The planning includes an implementation solution, in this case an IT solution, but TOGAF cannot handle the point at which the solution becomes IT operational planning. This paper presents a new framework model that details the transition process by integrating TOGAF and ITIL. We evaluated our model in a field study at a private university.
JPL control/structure interaction test bed real-time control computer architecture
NASA Technical Reports Server (NTRS)
Briggs, Hugh C.
1989-01-01
The Control/Structure Interaction Program is a technology development program for spacecraft that exhibit interactions between the control system and structural dynamics. The program objectives include development and verification of new design concepts - such as active structure - and new tools - such as combined structure and control optimization algorithm - and their verification in ground and possibly flight test. A focus mission spacecraft was designed based upon a space interferometer and is the basis for design of the ground test article. The ground test bed objectives include verification of the spacecraft design concepts, the active structure elements and certain design tools such as the new combined structures and controls optimization tool. In anticipation of CSI technology flight experiments, the test bed control electronics must emulate the computation capacity and control architectures of space qualifiable systems as well as the command and control networks that will be used to connect investigators with the flight experiment hardware. The Test Bed facility electronics were functionally partitioned into three units: a laboratory data acquisition system for structural parameter identification and performance verification; an experiment supervisory computer to oversee the experiment, monitor the environmental parameters and perform data logging; and a multilevel real-time control computing system. The design of the Test Bed electronics is presented along with hardware and software component descriptions. The system should break new ground in experimental control electronics and is of interest to anyone working in the verification of control concepts for large structures.
High-throughput neuroimaging-genetics computational infrastructure
Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Hobel, Sam; Vespa, Paul; Woo Moon, Seok; Van Horn, John D.; Franco, Joseph; Toga, Arthur W.
2014-01-01
Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Result interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. 
Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating, and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure. PMID:24795619
OS friendly microprocessor architecture: Hardware level computer security
NASA Astrophysics Data System (ADS)
Jungwirth, Patrick; La Fratta, Patrick
2016-05-01
We present an introduction to the patented OS Friendly Microprocessor Architecture (OSFA) and hardware level computer security. Conventional microprocessors have not tried to balance hardware performance and OS performance at the same time. Conventional microprocessors have depended on the Operating System for computer security and information assurance. The goal of the OS Friendly Architecture is to provide a high performance and secure microprocessor and OS system. We are interested in cyber security, information technology (IT), and SCADA control professionals reviewing the hardware level security features. The OS Friendly Architecture is a switched set of cache memory banks in a pipeline configuration. For light-weight threads, the memory pipeline configuration provides near instantaneous context switching times. The pipelining and parallelism provided by the cache memory pipeline allow background cache read and write operations while the microprocessor's execution pipeline is running instructions. The cache bank selection controllers provide arbitration to prevent the memory pipeline and microprocessor's execution pipeline from accessing the same cache bank at the same time. This separation allows the cache memory pages to transfer to and from level 1 (L1) caching while the microprocessor pipeline is executing instructions. Computer security operations are implemented in hardware. By extending Unix file permission bits to each cache memory bank and memory address, the OSFA provides hardware level computer security.
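As a rough software analogy for the last point, the sketch below attaches Unix-style permission bits to a cache bank and checks them on each access. The `CacheBank` class and `may_access` function are hypothetical names invented for this illustration; the abstract does not describe how the OSFA hardware encodes or checks the bits, and the group class is omitted here for brevity.

```python
import stat


class CacheBank:
    """Hypothetical model of a cache bank tagged with an owner and Unix mode bits."""

    def __init__(self, owner, mode):
        self.owner = owner  # id of the owning thread or process
        self.mode = mode    # e.g. 0o640, interpreted like a file mode


def may_access(bank, uid, want_write):
    """Check the owner or 'other' permission bit for a read or write request."""
    if uid == bank.owner:
        bit = stat.S_IWUSR if want_write else stat.S_IRUSR
    else:
        bit = stat.S_IWOTH if want_write else stat.S_IROTH
    return bool(bank.mode & bit)
```

With `CacheBank(owner=1, mode=0o640)`, the owner may read and write, while any other id is refused both, which mirrors how the same mode would behave on a file.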
Internet Architecture: Lessons Learned and Looking Forward
2006-12-01
Internet Architecture: Lessons Learned and Looking Forward Geoffrey G. Xie Department of Computer Science Naval Postgraduate School April 2006... Internet architecture. Report Documentation Page Form ApprovedOMB No. 0704-0188 Public reporting burden for the collection of information is...readers are referred there for more information about a specific protocol or concept. 2. Origin of Internet Architecture The Internet is easily
ERIC Educational Resources Information Center
Uwakonye, Obioha; Alagbe, Oluwole; Oluwatayo, Adedapo; Alagbe, Taiye; Alalade, Gbenga
2015-01-01
As a result of globalization of digital technology, intellectual discourse on what constitutes the basic body of architectural knowledge to be imparted to future professionals has been on the increase. This digital revolution has brought to the fore the need to review the already overloaded architectural education curriculum of Nigerian schools of…
Sabne, Amit J.; Sakdhnagool, Putt; Lee, Seyong; ...
2015-07-13
Accelerator-based heterogeneous computing is gaining momentum in the high-performance computing arena. However, the increased complexity of heterogeneous architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle this problem. Although the abstraction provided by OpenACC offers productivity, it raises questions concerning both functional and performance portability. In this article, the authors propose HeteroIR, a high-level, architecture-independent intermediate representation, to map high-level programming models, such as OpenACC, to heterogeneous architectures. They present a compiler approach that translates OpenACC programs into HeteroIR and accelerator kernels to obtain OpenACC functional portability. They then evaluate the performance portability obtained by OpenACC with their approach on 12 OpenACC programs on Nvidia CUDA, AMD GCN, and Intel Xeon Phi architectures. They study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.
The future of computing--new architectures and new technologies.
Warren, P
2004-02-01
All modern computers are designed using the 'von Neumann' architecture and built using silicon transistor technology. Both architecture and technology have been remarkably successful. Yet there are a range of problems for which this conventional architecture is not particularly well adapted, and new architectures are being proposed to solve these problems, in particular based on insight from nature. Transistor technology has enjoyed 50 years of continuing progress. However, the laws of physics dictate that within a relatively short time period this progress will come to an end. New technologies, based on molecular and biological sciences as well as quantum physics, are vying to replace silicon, or at least coexist with it and extend its capability. The paper describes these novel architectures and technologies, places them in the context of the kinds of problems they might help to solve, and predicts their possible manner and time of adoption. Finally it describes some key questions and research problems associated with their use.
Scalable service architecture for providing strong service guarantees
NASA Astrophysics Data System (ADS)
Christin, Nicolas; Liebeherr, Joerg
2002-07-01
For the past decade, a lot of Internet research has been devoted to providing different levels of service to applications. Initial proposals for service differentiation provided strong service guarantees, with strict bounds on delays, loss rates, and throughput, but required high overhead in terms of computational complexity and memory, both of which raise scalability concerns. Recently, the interest has shifted to service architectures with low overhead. However, these newer service architectures only provide weak service guarantees, which do not always address the needs of applications. In this paper, we describe a service architecture that supports strong service guarantees, can be implemented with low computational complexity, and requires maintaining only a small amount of state information. A key mechanism of the proposed service architecture is that it addresses scheduling and buffer management in a single algorithm. The presented architecture offers no solution for controlling the amount of traffic that enters the network. Instead, we plan on exploiting feedback mechanisms of TCP congestion control algorithms for the purpose of regulating the traffic entering the network.
NASA Technical Reports Server (NTRS)
Hall, Justin R.; Hastrup, Rolf C.
1990-01-01
The principal challenges in providing effective deep space navigation, telecommunications, and information management architectures and designs for Mars exploration support are presented. The fundamental objectives are to provide the mission with the means to monitor and control mission elements, obtain science, navigation, and engineering data, compute state vectors and navigate, and to move these data efficiently and automatically between mission nodes for timely analysis and decision making. New requirements are summarized, and related issues and challenges including the robust connectivity for manned and robotic links, are identified. Enabling strategies are discussed, and candidate architectures and driving technologies are described.
Architecture of the software for LAMOST fiber positioning subsystem
NASA Astrophysics Data System (ADS)
Peng, Xiaobo; Xing, Xiaozheng; Hu, Hongzhuan; Zhai, Chao; Li, Weimin
2004-09-01
The architecture of the software that controls the LAMOST fiber positioning subsystem is described. The software is composed of two parts: a main control program on a computer and a unit controller program in the ROM of an MCS51 single-chip microcomputer. The functions of the software include client/server model establishment, observation planning, collision handling, data transmission, pulse generation, CCD control, image capture and processing, and data analysis. Particular attention is paid to the ways in which different parts of the software communicate. Software techniques for multithreading, socket programming, Microsoft Windows message response, and serial communications are also discussed.
Multiprocessor and memory architecture of the neurocomputer SYNAPSE-1.
Ramacher, U; Raab, W; Anlauf, J; Hachmann, U; Beichter, J; Brüls, N; Wesseling, M; Sicheneder, E; Männer, R; Glass, J
1993-12-01
A general purpose neurocomputer, SYNAPSE-1, which exhibits a multiprocessor and memory architecture is presented. It offers wide flexibility with respect to neural algorithms and a speed-up factor of several orders of magnitude--including learning. The computational power is provided by a 2-dimensional systolic array of neural signal processors. Since the weights are stored outside these NSPs, memory size and processing power can be adapted individually to the application needs. A neural algorithms programming language, embedded in C++, has been defined for the user to cope with the neurocomputer. In a benchmark test, the prototype of SYNAPSE-1 was 8000 times as fast as a standard workstation.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
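To make the reduction step concrete, the following is a sketch for the simplest model case only: a constant-coefficient block tridiagonal system, which is an assumption made here for illustration (the abstract's BCR is stated to cover nonconstant coefficients as well). The symbols $A$, $x_j$, and $b_j$ are notation introduced for this sketch, not taken from the paper.

```latex
% Model system: block row j of a constant-coefficient block tridiagonal system
\[
  -x_{j-1} + A\,x_j - x_{j+1} = b_j , \qquad j = 1,\dots,n .
\]
% One reduction step: multiply row j by A and add rows j-1 and j+1.
% The terms in x_{j-1} and x_{j+1} cancel, leaving a system of the same form
% that couples only every second unknown:
\[
  -x_{j-2} + \left(A^{2} - 2I\right) x_j - x_{j+2}
  = b_{j-1} + A\,b_j + b_{j+1} .
\]
```

Each step halves the number of coupled block unknowns, and all reduced rows at a given level can be formed independently, which is what makes the method a natural candidate for parallel and vector machines.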
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Youcef
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Modeling plant growth and development.
Prusinkiewicz, Przemyslaw
2004-02-01
Computational plant models or 'virtual plants' are increasingly seen as a useful tool for comprehending complex relationships between gene function, plant physiology, plant development, and the resulting plant form. The theory of L-systems, which was introduced by Lindenmayer in 1968, has led to a well-established methodology for simulating the branching architecture of plants. Many current architectural models provide insights into the mechanisms of plant development by incorporating physiological processes, such as the transport and allocation of carbon. Other models aim at elucidating the geometry of plant organs, including flower petals and apical meristems, and are beginning to address the relationship between patterns of gene expression and the resulting plant form.
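The core of an L-system is parallel string rewriting: at every step, each symbol is replaced simultaneously according to a production rule. A minimal sketch follows; the function name `rewrite` and the specific branching rule are illustrative choices, not taken from the paper. The first rule set is Lindenmayer's well-known algae example, and the second uses the common bracket convention in which `[` and `]` open and close a side branch.

```python
def rewrite(axiom, rules, steps):
    """Apply the production rules to every symbol in parallel, `steps` times.

    Symbols without a rule are copied unchanged.
    """
    current = axiom
    for _ in range(steps):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Lindenmayer's algae system: A -> AB, B -> A.
ALGAE = {"A": "AB", "B": "A"}

# An illustrative branching rule: F is a stem segment, + and - are turns,
# and brackets delimit a side branch.
BRANCHING = {"F": "F[+F]F[-F]F"}

print(rewrite("A", ALGAE, 3))      # ABAAB
print(rewrite("F", BRANCHING, 1))  # F[+F]F[-F]F
```

The resulting string is typically interpreted afterwards by a drawing routine (often a "turtle") to produce the branching geometry; that interpretation step is outside this sketch.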
NASA Astrophysics Data System (ADS)
Hall, Justin R.; Hastrup, Rolf C.
1990-10-01
The principal challenges in providing effective deep space navigation, telecommunications, and information management architectures and designs for Mars exploration support are presented. The fundamental objectives are to provide the mission with the means to monitor and control mission elements, obtain science, navigation, and engineering data, compute state vectors and navigate, and to move these data efficiently and automatically between mission nodes for timely analysis and decision making. New requirements are summarized, and related issues and challenges including the robust connectivity for manned and robotic links, are identified. Enabling strategies are discussed, and candidate architectures and driving technologies are described.
Power Efficient Hardware Architecture of SHA-1 Algorithm for Trusted Mobile Computing
NASA Astrophysics Data System (ADS)
Kim, Mooseop; Ryou, Jaecheol
The Trusted Mobile Platform (TMP) is developed and promoted by the Trusted Computing Group (TCG), an industry standards body that aims to enhance the security of the mobile computing environment. The built-in SHA-1 engine in TMP is one of the most important circuit blocks and contributes to the performance of the whole platform because it is used as a key primitive supporting platform integrity and command authentication. Mobile platforms have very stringent limitations with respect to available power, physical circuit area, and cost. Therefore, special architecture and design methods for a low-power SHA-1 circuit are required. In this paper, we present a novel and efficient hardware architecture for a low-power SHA-1 design for TMP. Our low-power SHA-1 hardware can process a 512-bit data block using fewer than 7,000 gates and consumes about 1.1 mA on a 0.25μm CMOS process.
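The 512-bit block mentioned above is the unit on which SHA-1 operates: a message is padded and split into 64-byte blocks, and the compression function runs once per block. The software sketch below only illustrates that block structure using Python's standard `hashlib`; it says nothing about the paper's circuit, and the helper name `sha1_block_count` is introduced here for illustration.

```python
import hashlib

BLOCK_BYTES = 64  # SHA-1 processes the message in 512-bit (64-byte) blocks


def sha1_block_count(message_len_bytes):
    """Number of 512-bit blocks hashed for a message of the given length.

    Padding appends one 0x80 byte, zero bytes, and an 8-byte length field,
    so at least 9 extra bytes must fit before the next 64-byte boundary.
    """
    return (message_len_bytes + 8) // BLOCK_BYTES + 1


# Standard test vector: the 3-byte message "abc" fits in a single block.
digest = hashlib.sha1(b"abc").hexdigest()
print(digest)               # a9993e364706816aba3e25717850c26c9cd0d89d
print(sha1_block_count(3))  # 1
```

A 55-byte message still fits in one block, while a 56-byte message needs two, which is why per-block cost is the natural figure of merit for a hardware engine.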
An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations
NASA Technical Reports Server (NTRS)
Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.
1996-01-01
We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.
Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms
NASA Technical Reports Server (NTRS)
Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.
1997-01-01
We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.