Sample records for computing reference architecture

  1. Recursive computer architecture for VLSI

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Treleaven, P.C.; Hopkins, R.P.

    1982-01-01

    A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is being restricted to simple, identical microcomputers each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems is termed fifth-generation computers by the Japanese. 30 references.

  2. The flight telerobotic servicer: From functional architecture to computer architecture

    NASA Technical Reports Server (NTRS)

    Lumia, Ronald; Fiala, John

    1989-01-01

    After a brief tutorial on the NASA/National Bureau of Standards Standard Reference Model for Telerobot Control System Architecture (NASREM) functional architecture, the approach to its implementation is shown. First, interfaces must be defined which are capable of supporting the known algorithms. This is illustrated by considering the interfaces required for the SERVO level of the NASREM functional architecture. After interface definition, the specific computer architecture for the implementation must be determined. This choice is obviously technology dependent. An example illustrating one possible mapping of the NASREM functional architecture to a particular set of computers which implements it is shown. The result of choosing the NASREM functional architecture is that it provides a technology independent paradigm which can be mapped into a technology dependent implementation capable of evolving with technology in the laboratory and in space.

  3. Heavy Lift Vehicle (HLV) Avionics Flight Computing Architecture Study

    NASA Technical Reports Server (NTRS)

    Hodson, Robert F.; Chen, Yuan; Morgan, Dwayne R.; Butler, A. Marc; Schuh, Joseph M.; Petelle, Jennifer K.; Gwaltney, David A.; Coe, Lisa D.; Koelbl, Terry G.; Nguyen, Hai D.

    2011-01-01

    A NASA multi-Center study team was assembled from LaRC, MSFC, KSC, JSC and WFF to examine potential flight computing architectures for a Heavy Lift Vehicle (HLV) to better understand avionics drivers. The study examined Design Reference Missions (DRMs) and vehicle requirements that could impact the vehicle's avionics. The study considered multiple self-checking and voting architectural variants and examined reliability, fault-tolerance, mass, power, and redundancy management impacts. Furthermore, a goal of the study was to develop the skills and tools needed to rapidly assess additional architectures should requirements or assumptions change.

  4. MIT CSAIL and Lincoln Laboratory Task Force Report

    DTIC Science & Technology

    2016-08-01

    projects have been very diverse, spanning several areas of CSAIL concentration, including robotics, big data analytics, wireless communications, computing architectures and...to machine learning systems and algorithms, such as recommender systems, and "Big Data" analytics. Advanced computing architectures broadly refer to

  5. Internet Architecture: Lessons Learned and Looking Forward

    DTIC Science & Technology

    2006-12-01

    Internet Architecture: Lessons Learned and Looking Forward. Geoffrey G. Xie, Department of Computer Science, Naval Postgraduate School, April 2006... Internet architecture. ...readers are referred there for more information about a specific protocol or concept. 2. Origin of Internet Architecture: The Internet is easily

  6. Autonomic Computing for Spacecraft Ground Systems

    NASA Technical Reports Server (NTRS)

    Li, Zhenping; Savkli, Cetin; Jones, Lori

    2007-01-01

    Autonomic computing for spacecraft ground systems increases the system reliability and reduces the cost of spacecraft operations and software maintenance. In this paper, we present an autonomic computing solution for spacecraft ground systems at NASA Goddard Space Flight Center (GSFC), which consists of an open standard for a message-oriented architecture referred to as the GMSEC (Goddard Mission Services Evolution Center) architecture, and an autonomic computing tool, the Criteria Action Table (CAT). This solution has been used in many upgraded ground systems for NASA's missions, and provides a framework for developing solutions with higher autonomic maturity.

  7. Can the Computer Design a School Building?

    ERIC Educational Resources Information Center

    Roberts, Charles

    The implications of computer technology and architecture are discussed with reference to school building design. A brief introduction is given of computer applications in other fields leading to the conclusions that computers alone cannot design school buildings but may serve as a useful tool in the overall design process. Specific examples are…

  8. Information Interaction: Providing a Framework for Information Architecture.

    ERIC Educational Resources Information Center

    Toms, Elaine G.

    2002-01-01

    Discussion of information architecture focuses on a model of information interaction that bridges the gap between human and computer and between information behavior and information retrieval. Illustrates how the process of information interaction is affected by the user, the system, and the content. (Contains 93 references.) (LRW)

  9. Patterns and Practices for Future Architectures

    DTIC Science & Technology

    2014-08-01

    Subject terms: computing architecture, graph algorithms, high-performance computing, big data, GPU. [Report front-matter and list-of-figures residue; the figures cover data structures created by Kernel 1 of the single-CPU list implementation, Kernel 2 of the Graph500 BFS reference implementation (single CPU, list), and data structures for the sequential CSR algorithm.]

  10. Hybrid Architectures for Evolutionary Computing Algorithms

    DTIC Science & Technology

    2008-01-01

    other EC algorithms to FPGA Core Burns P1026/MAPLD 2005. Genetic Algorithm Hardware References: S. Scott, A. Samal, and S. Seth, "HGA: A Hardware Based...on Parallel and Distributed Processing (IPPS/SPDP '98), pp. 316-320, Proceedings, IEEE Computer Society, 1998. [12] Scott, S. D., Samal, A., and..."HGA: A Hardware Based Genetic Algorithm", Proceedings of the 1995 ACM Third

  11. NASA/NBS (National Aeronautics and Space Administration/National Bureau of Standards) standard reference model for telerobot control system architecture (NASREM)

    NASA Technical Reports Server (NTRS)

    Albus, James S.; Mccain, Harry G.; Lumia, Ronald

    1989-01-01

    The document describes the NASA Standard Reference Model (NASREM) architecture for the Space Station Telerobot Control System. It defines the functional requirements and high-level specifications of the control system for the NASA Space Station IOC Flight Telerobot Servicer, serving as a reference for the functional specification and as a guideline for the development of the control system architecture. The NASREM telerobot control system architecture defines a set of standard modules and interfaces which facilitate software design, development, validation, and test, and make possible the integration of telerobotics software from a wide variety of sources. Standard interfaces also provide the software hooks necessary to incrementally upgrade future Flight Telerobot Systems as new capabilities develop in computer science, robotics, and autonomous system control.

  12. Implementation theory of distortion-invariant pattern recognition for optical and digital signal processing systems

    NASA Astrophysics Data System (ADS)

    Lhamon, Michael Earl

    A pattern recognition system which uses complex correlation filter banks requires proportionally more computational effort than single real-valued filters. This introduces an increased computational burden, but also a higher level of parallelism that common computing platforms fail to exploit. As a result, we consider algorithm mapping to both optical and digital processors. For digital implementation, we develop computationally efficient pattern recognition algorithms, referred to as vector inner product operators, that require less computational effort than traditional fast Fourier methods. These algorithms do not need correlation and they map readily onto parallel digital architectures, which implies new architectures for optical processors. These filters exploit circulant-symmetric matrix structures of the training set data representing a variety of distortions. By using the same mathematical basis as the vector inner product operations, we are able to extend the capabilities of more traditional correlation filtering to what we refer to as "Super Images". These "Super Images" are used to morphologically transform a complicated input scene into a predetermined dot pattern. The orientation of the dot pattern is related to the rotational distortion of the object of interest. The optical implementation of "Super Images" yields the feature reduction necessary for using other techniques, such as artificial neural networks. We propose a parallel digital signal processor architecture based on specific pattern recognition algorithms but general enough to be applicable to other similar problems. Such an architecture is classified as a data flow architecture. Instead of mapping an algorithm to an architecture, we propose mapping the DSP architecture to a class of pattern recognition algorithms. Today's optical processing systems have difficulties implementing full complex filter structures. Typically, optical systems (like the 4f correlators) are limited to phase-only implementation with lower detection performance than full complex electronic systems. Our study includes pseudo-random pixel encoding techniques for approximating full complex filtering. Optical filter bank implementations are possible, and they have the advantage of time averaging the entire filter bank at real-time rates. Time-averaged optical filtering is computationally comparable to billions of digital operations per second. For this reason, we believe future trends in high speed pattern recognition will involve hybrid architectures of both optical and DSP elements.
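
    As a rough illustration of the computational trade-off described above, the hedged Python sketch below contrasts a full FFT-based correlation over a filter bank with a single inner product per filter. The scene, the filter bank, and the 64x64 shapes are illustrative placeholders, not the paper's vector inner product operators.

      # Hedged sketch: filter-bank scoring by full correlation vs. one
      # inner product per filter. Data are random stand-ins.
      import numpy as np

      rng = np.random.default_rng(0)
      scene = rng.normal(size=(64, 64))
      bank = rng.normal(size=(8, 64, 64))          # 8 distortion-trained filters

      # Correlation-filter route: full FFT correlation plane, then peak search.
      def correlate_scores(scene, bank):
          S = np.fft.fft2(scene)
          return [np.abs(np.fft.ifft2(S * np.conj(np.fft.fft2(h)))).max() for h in bank]

      # Inner-product route: one dot product per filter, no correlation plane.
      def inner_product_scores(scene, bank):
          x = scene.ravel()
          return [float(h.ravel() @ x) for h in bank]

      print(correlate_scores(scene, bank)[:2])
      print(inner_product_scores(scene, bank)[:2])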

  13. Motion camera based on a custom vision sensor and an FPGA architecture

    NASA Astrophysics Data System (ADS)

    Arias-Estrada, Miguel

    1998-09-01

    A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated with a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event-address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
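
    A hedged sketch of the time-of-travel idea described above: the sensor emits (x, y, timestamp) edge events, and velocity is estimated from the time difference between events at adjacent pixels. The pixel pitch, the event list, and the horizontal-only estimate are illustrative assumptions, not the camera's actual algorithm.

      # Hedged sketch: velocity from address-event timestamps of adjacent pixels.
      PIXEL_PITCH_UM = 20.0          # assumed pixel spacing in micrometres

      # (x, y, t) edge events, t in microseconds, as the FPGA might receive them
      events = [(10, 5, 1000), (11, 5, 1250), (12, 5, 1498), (13, 5, 1751)]

      last_event_time = {}           # (x, y) -> timestamp of last edge event

      def velocity_from_event(x, y, t):
          """Horizontal velocity estimate (um/us) from the previous pixel's event."""
          prev = last_event_time.get((x - 1, y))
          last_event_time[(x, y)] = t
          if prev is None or t == prev:
              return None
          time_of_travel = t - prev               # us to travel one pixel pitch
          return PIXEL_PITCH_UM / time_of_travel  # um per us

      for x, y, t in events:
          v = velocity_from_event(x, y, t)
          if v is not None:
              print(f"pixel ({x},{y}): ~{v:.3f} um/us")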

  14. Hardware-based Artificial Neural Networks for Size, Weight, and Power Constrained Platforms (Preprint)

    DTIC Science & Technology

    2012-11-01

    few sensors/complex computations, and many sensors/simple computation. II. CHALLENGES WITH NANO-ENABLED NEUROMORPHIC CHIPS A wide variety of...scenarios. Neuromorphic processors, which are based on the highly parallelized computing architecture of the mammalian brain, show great promise in...in the brain. This fundamentally different approach, frequently referred to as neuromorphic computing, is thought to be better able to solve fuzzy

  15. Requirements for plug and play information infrastructure frameworks and architectures to enable virtual enterprises

    NASA Astrophysics Data System (ADS)

    Bolton, Richard W.; Dewey, Allen; Horstmann, Paul W.; Laurentiev, John

    1997-01-01

    This paper examines the role virtual enterprises will have in supporting future business engagements and the resulting technology requirements. Two representative end-user scenarios are proposed that define the requirements for 'plug-and-play' information infrastructure frameworks and architectures necessary to enable 'virtual enterprises' in US manufacturing industries. The scenarios provide a high-level 'needs analysis' for identifying key technologies, defining a reference architecture, and developing compliant reference implementations. Virtual enterprises are short-term consortia or alliances of companies formed to address fast-changing opportunities. Members of a virtual enterprise carry out their tasks as if they all worked for a single organization under 'one roof', using 'plug-and-play' information infrastructure frameworks and architectures to access and manage all information needed to support the product cycle. 'Plug-and-play' information infrastructure frameworks and architectures are required to enhance collaboration between companies working together on different aspects of a manufacturing process. This new form of collaborative computing will decrease cycle time and increase responsiveness to change.

  16. Essential issues in multiprocessor systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gajski, D.D.; Peir, J.K.

    1985-06-01

    During the past several years, a great number of proposals have been made with the objective of increasing supercomputer performance by an order of magnitude through the use of new computer architectures. The present paper is concerned with a suitable classification scheme for comparing these architectures. It is pointed out that there are basically four schools of thought as to the most important factor for an enhancement of computer performance. According to one school, the development of faster circuits will make it possible to retain present architectures, except, possibly, for a mechanism providing synchronization of parallel processes. A second school assigns priority to the optimization and vectorization of compilers, which will detect parallelism and help users to write better parallel programs. A third school believes in the predominant importance of new parallel algorithms, while the fourth school supports new models of computation. The merits of the four approaches are critically evaluated. 50 references.

  17. An Object-Oriented Network-Centric Software Architecture for Physical Computing

    NASA Astrophysics Data System (ADS)

    Palmer, Richard

    1997-08-01

    Recent developments in object-oriented computer languages and infrastructure such as the Internet, Web browsers, and the like provide an opportunity to define a more productive computational environment for scientific programming that is based more closely on the underlying mathematics describing physics than traditional programming languages such as FORTRAN or C++. In this talk I describe an object-oriented software architecture for representing physical problems that includes classes for such common mathematical objects as geometry, boundary conditions, partial differential and integral equations, discretization and numerical solution methods, etc. In practice, a scientific program written using this architecture looks remarkably like the mathematics used to understand the problem, is typically an order of magnitude smaller than traditional FORTRAN or C++ codes, and hence easier to understand, debug, describe, etc. All objects in this architecture are ``network-enabled,'' which means that components of a software solution to a physical problem can be transparently loaded from anywhere on the Internet or other global network. The architecture is expressed as an ``API,'' or application programmer's interface specification, with reference embeddings in Java, Python, and C++. A C++ class library for an early version of this API has been implemented for machines ranging from PCs to the IBM SP2, meaning that identical codes run on all architectures.

  18. Architectures for Quantum Simulation Showing a Quantum Speedup

    NASA Astrophysics Data System (ADS)

    Bermejo-Vega, Juan; Hangleiter, Dominik; Schwarz, Martin; Raussendorf, Robert; Eisert, Jens

    2018-04-01

    One of the main aims in the field of quantum simulation is to achieve a quantum speedup, often referred to as "quantum computational supremacy": the experimental realization of a quantum device that computationally outperforms classical computers. In this work, we show that one can devise versatile and feasible schemes of two-dimensional, dynamical, quantum simulators showing such a quantum speedup, building on intermediate problems involving nonadaptive, measurement-based, quantum computation. In each of the schemes, an initial product state is prepared, potentially involving an element of randomness as in disordered models, followed by a short-time evolution under a basic translationally invariant Hamiltonian with simple nearest-neighbor interactions and a mere sampling measurement in a fixed basis. The correctness of the final-state preparation in each scheme is fully efficiently certifiable. We discuss experimental necessities and possible physical architectures, inspired by platforms of cold atoms in optical lattices and a number of others, as well as specific assumptions that enter the complexity-theoretic arguments. This work shows that benchmark settings exhibiting a quantum speedup may require little control, in contrast to universal quantum computing. Thus, our proposal puts a convincing experimental demonstration of a quantum speedup within reach in the near term.

  19. Engineering Technology Education: Bibliography 1989.

    ERIC Educational Resources Information Center

    Dyrud, Marilyn A., Comp.

    1990-01-01

    Over 200 references divided into 24 different areas are presented. Topics include administration, aeronautics, architecture, biomedical technology, CAD/CAM, civil engineering, computers, curriculum, electrical/electronics engineering, industrial engineering, industry and employment, instructional technology, laboratories, lasers, liberal studies,…

  20. Reference Points: Engineering Technology Education Bibliography, 1987.

    ERIC Educational Resources Information Center

    Engineering Education, 1989

    1989-01-01

    Lists articles and books published in 1987. Selects the following headings: administration, aeronautical, architectural, CAD/CAM, civil, computers, curriculum, electrical/electronics, industrial, industry/government/employers, instructional technology, laboratories, liberal studies, manufacturing, mechanical, minorities, research, robotics,…

  1. Chemical calculations on Cray computers

    NASA Technical Reports Server (NTRS)

    Taylor, Peter R.; Bauschlicher, Charles W., Jr.; Schwenke, David W.

    1989-01-01

    The influence of recent developments in supercomputing on computational chemistry is discussed with particular reference to Cray computers and their pipelined vector/limited parallel architectures. After reviewing Cray hardware and software, the performance of different elementary program structures is examined, and effective methods for improving program performance are outlined. The computational strategies appropriate for obtaining optimum performance in applications to quantum chemistry and dynamics are discussed. Finally, some discussion is given of new developments and future hardware and software improvements.

  2. Intelligent supercomputers: the Japanese computer sputnik

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Walter, G.

    1983-11-01

    Japan's government-supported fifth-generation computer project has had a pronounced effect on the American computer and information systems industry. The US firms are intensifying their research on and production of intelligent supercomputers, a combination of computer architecture and artificial intelligence software programs. While the present generation of computers is built for the processing of numbers, the new supercomputers will be designed specifically for the solution of symbolic problems and the use of artificial intelligence software. This article discusses new and exciting developments that will increase computer capabilities in the 1990s. 4 references.

  3. Examining the architecture of cellular computing through a comparative study with a computer

    PubMed Central

    Wang, Degeng; Gribskov, Michael

    2005-01-01

    The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software–hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's ‘hardware’ equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the ‘bandwidth’ of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed. PMID:16849179

  4. Examining the architecture of cellular computing through a comparative study with a computer.

    PubMed

    Wang, Degeng; Gribskov, Michael

    2005-06-22

    The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.

  5. Parallel Architectures for Planetary Exploration Requirements (PAPER)

    NASA Technical Reports Server (NTRS)

    Cezzar, Ruknet; Sen, Ranjan K.

    1989-01-01

    The Parallel Architectures for Planetary Exploration Requirements (PAPER) project is essentially research oriented towards technology insertion issues for NASA's unmanned planetary probes. It was initiated to complement and augment the long-term efforts for space exploration with particular reference to NASA/LaRC's (NASA Langley Research Center) research needs for planetary exploration missions of the mid and late 1990s. The requirements for space missions as given in the somewhat dated Advanced Information Processing Systems (AIPS) requirements document are contrasted with the new requirements from JPL/Caltech involving sensor data capture and scene analysis. It is shown that more stringent requirements have arisen as a result of technological advancements. Two possible architectures, the AIPS Proof of Concept (POC) configuration and the MAX fault-tolerant dataflow multiprocessor, were evaluated. The main observation was that the AIPS design is biased towards fault tolerance and may not be an ideal architecture for planetary and deep space probes due to high cost and complexity. The MAX concept appears to be a promising candidate, except that more detailed information is required. The feasibility of adding neural computation capability to this architecture needs to be studied. Key impact issues for architectural design of computing systems meant for planetary missions were also identified.

  6. Reference Architecture Model Enabling Standards Interoperability.

    PubMed

    Blobel, Bernd

    2017-01-01

    Advanced health and social services paradigms are supported by a comprehensive set of domains managed by different scientific disciplines. Interoperability has to evolve beyond information and communication technology (ICT) concerns to include the real-world business domains and their processes, as well as the individual context of all actors involved. So the system must properly reflect the environment in front of and around the computer as an essential and even defining part of the health system. This paper introduces an ICT-independent, system-theoretical, ontology-driven reference architecture model allowing the representation and harmonization of all domains involved, including the transformation into an appropriate ICT design and implementation. The entire process is completely formalized and can therefore be fully automated.

  7. On-Line Tracking Controller for Brushless DC Motor Drives Using Artificial Neural Networks

    NASA Technical Reports Server (NTRS)

    Rubaai, Ahmed

    1996-01-01

    A real-time control architecture is developed for time-varying nonlinear brushless dc motors operating in a high performance drives environment. The developed control architecture possesses the capabilities of simultaneous on-line identification and control. The dynamics of the motor are modeled on-line and controlled using an artificial neural network as the system runs. The control architecture combines the experience and dependability of adaptive tracking systems with the potential and promise of neural computing technology. The sensitivity of the real-time controller to parametric changes that occur during training is investigated. Such changes are usually manifested by rapid changes in the load of the brushless motor drives. This sudden change in the external load is simulated for sigmoidal and sinusoidal reference tracks. The ability of the neuro-controller to maintain reasonable tracking accuracy in the presence of external noise is also verified for a number of desired reference trajectories.
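
    The sketch below illustrates, in hedged form, what simultaneous on-line identification and control can look like: a tiny linear-in-the-weights model is updated at every step from the latest measurement (LMS-style), while a simple tracking law follows a sinusoidal reference through a sudden load change. The toy motor model, gains, and feature choice are assumptions for illustration, not the paper's neuro-controller.

      # Hedged sketch: on-line identification (LMS update) alongside a
      # simple tracking law, with a sudden external load change at step 100.
      import numpy as np

      rng = np.random.default_rng(0)
      w = rng.normal(scale=0.1, size=3)      # identification model weights
      eta = 0.02                             # learning rate

      def plant(speed, volts, load):
          """Toy brushless-motor stand-in: next speed from current speed/input."""
          return 0.9 * speed + 0.5 * volts - 0.2 * load

      speed, load = 0.0, 0.0
      for k in range(200):
          if k == 100:
              load = 1.0                     # sudden external load change
          ref = np.sin(0.05 * k)             # sinusoidal reference track
          volts = 2.0 * (ref - speed)        # simple tracking law
          x = np.array([speed, volts, 1.0])  # features seen by the model
          pred = w @ x                       # one-step prediction
          speed = plant(speed, volts, load)  # what the motor actually did
          w += eta * (speed - pred) * x      # on-line LMS-style weight update

      print("final weights:", np.round(w, 3))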

  8. The NIST Real-Time Control System (RCS): A Reference Model Architecture for Computational Intelligence

    NASA Technical Reports Server (NTRS)

    Albus, James S.

    1996-01-01

    The Real-time Control System (RCS) developed at NIST and elsewhere over the past two decades defines a reference model architecture for design and analysis of complex intelligent control systems. The RCS architecture consists of a hierarchically layered set of functional processing modules connected by a network of communication pathways. The primary distinguishing feature of the layers is the bandwidth of the control loops. The characteristic bandwidth of each level is determined by the spatial and temporal integration window of filters, the temporal frequency of signals and events, the spatial frequency of patterns, and the planning horizon and granularity of the planners that operate at each level. At each level, tasks are decomposed into sequential subtasks, to be performed by cooperating sets of subordinate agents. At each level, signals from sensors are filtered and correlated with spatial and temporal features that are relevant to the control function being implemented at that level.
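
    The following minimal sketch illustrates the layering idea: each level runs its loop at its own characteristic rate, reads the goal posted by the level above, and posts a decomposed subgoal to the level below. The level names, periods, and the placeholder decomposition are illustrative assumptions, not the NIST RCS specification.

      # Hedged sketch: hierarchical control levels with different loop rates.
      LEVELS = [            # ordered top (slowest) to bottom (fastest)
          ("task",      1000),   # period in ticks
          ("e-move",     100),
          ("primitive",   10),
          ("servo",        1),
      ]

      goals = {"task": "assemble-part"}          # goal posted to the top level

      def decompose(level, goal, t):
          """Placeholder for the planning/filtering done at one level."""
          return f"{level}({goal})@t={t}"

      def run(ticks):
          for t in range(ticks):
              for i, (name, period) in enumerate(LEVELS):
                  if t % period == 0 and name in goals:
                      subgoal = decompose(name, goals[name], t)
                      if i + 1 < len(LEVELS):
                          goals[LEVELS[i + 1][0]] = subgoal   # post to level below
                      # the bottom (servo) level would command actuators here

      run(2000)
      print(goals["servo"])   # most recent servo-level command string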

  9. Rosen's (M,R) system as an X-machine.

    PubMed

    Palmer, Michael L; Williams, Richard A; Gatherer, Derek

    2016-11-07

    Robert Rosen's (M,R) system is an abstract biological network architecture that is allegedly both irreducible to sub-models of its component states and non-computable on a Turing machine. (M,R) stands as an obstacle to both reductionist and mechanistic presentations of systems biology, principally due to its self-referential structure. If (M,R) has the properties claimed for it, computational systems biology will not be possible, or at best will be a science of approximate simulations rather than accurate models. Several attempts have been made, at both empirical and theoretical levels, to disprove this assertion by instantiating (M,R) in software architectures. So far, these efforts have been inconclusive. In this paper, we attempt to demonstrate why - by showing how both finite state machine and stream X-machine formal architectures fail to capture the self-referential requirements of (M,R). We then show that a solution may be found in communicating X-machines, which remove self-reference using parallel computation, and then synthesise such machine architectures with object-orientation to create a formal basis for future software instantiations of (M,R) systems. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. A bibliography on parallel and vector numerical algorithms

    NASA Technical Reports Server (NTRS)

    Ortega, James M.; Voigt, Robert G.; Romine, Charles H.

    1988-01-01

    This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.

  11. A bibliography on parallel and vector numerical algorithms

    NASA Technical Reports Server (NTRS)

    Ortega, J. M.; Voigt, R. G.

    1987-01-01

    This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also.

  12. A bibliography on parallel and vector numerical algorithms

    NASA Technical Reports Server (NTRS)

    Ortega, James M.; Voigt, Robert G.; Romine, Charles H.

    1990-01-01

    This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.

  13. MultiLIS: A Description of the System Design and Operational Features.

    ERIC Educational Resources Information Center

    Kelly, Glen J.; And Others

    1988-01-01

    Describes development, hardware requirements, and features of the MultiLIS integrated library software package. A system profile provides pricing information, operational characteristics, and technical specifications. Sidebars discuss MultiLIS integration structure, incremental architecture, and NCR Tower Computers. (4 references) (MES)

  14. CLOCS (Computer with Low Context-Switching Time) Architecture Reference Documents

    DTIC Science & Technology

    1988-05-06

    Peculiarities The only state inside the central processing unit (CPU) is a program status word. All data operations are memory to memory. One result of this... to the challenge "if I were to design RISC, this is how I would do it." The architecture was designed by Mark Davis and Bill Gallmeister. 1.2...are memory to memory. Any special devices added should be memory mapped. The program counter is even memory mapped. 1.3.1 Working storage There is no

  15. A Bandwidth-Optimized Multi-Core Architecture for Irregular Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Secchi, Simone; Tumeo, Antonino; Villa, Oreste

    This paper presents an architecture template for next-generation high performance computing systems specifically targeted to irregular applications. We start our work by considering that future generation interconnection and memory bandwidth full-system numbers are expected to grow by a factor of 10. In order to keep up with such a communication capacity, while still resorting to fine-grained multithreading as the main way to tolerate unpredictable memory access latencies of irregular applications, we show how overall performance scaling can benefit from the multi-core paradigm. At the same time, we also show how such an architecture template must be coupled with specific techniques in order to optimize bandwidth utilization and achieve the maximum scalability. We propose a technique based on memory references aggregation, together with the related hardware implementation, as one of such optimization techniques. We explore the proposed architecture template by focusing on the Cray XMT architecture and, using a dedicated simulation infrastructure, validate the performance of our template with two typical irregular applications. Our experimental results prove the benefits provided by both the multi-core approach and the bandwidth optimization reference aggregation technique.
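
    A hedged sketch of the reference-aggregation idea: fine-grained remote loads are buffered per destination node and issued as one block request once a threshold is reached, trading a little latency for better bandwidth utilization. The class, threshold, and access pattern are illustrative, not the paper's hardware design.

      # Hedged sketch: aggregating fine-grained remote memory references
      # into per-destination block requests.
      from collections import defaultdict

      AGGREGATION_THRESHOLD = 8          # references per destination before flush

      class Aggregator:
          def __init__(self, issue_block):
              self.pending = defaultdict(list)   # destination node -> addresses
              self.issue_block = issue_block     # callback sending one block request

          def remote_load(self, node, addr):
              self.pending[node].append(addr)
              if len(self.pending[node]) >= AGGREGATION_THRESHOLD:
                  self.flush(node)

          def flush(self, node):
              addrs = self.pending.pop(node, [])
              if addrs:
                  self.issue_block(node, addrs)

          def flush_all(self):
              for node in list(self.pending):
                  self.flush(node)

      issued = []
      agg = Aggregator(lambda node, addrs: issued.append((node, addrs)))

      # Irregular access pattern: 20 single-word loads scattered over 3 nodes.
      for i in range(20):
          agg.remote_load(node=i % 3, addr=0x1000 + 8 * i)
      agg.flush_all()

      print(len(issued), "block requests issued for 20 single-word loads")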

  16. Design and Implementation of a Tool for Teaching Programming.

    ERIC Educational Resources Information Center

    Goktepe, Mesut; And Others

    1989-01-01

    Discussion of the use of computers in education focuses on a graphics-based system for teaching the Pascal programming language for problem solving. Topics discussed include user interface; notification based systems; communication processes; object-oriented programming; workstations; graphics architecture; and flowcharts. (18 references) (LRW)

  17. Integrating Green Building Criteria Into Housing Design Processes Case Study: Tropical Apartment At Kebon Melati, Jakarta

    NASA Astrophysics Data System (ADS)

    Farid, V. L.; Wonorahardjo, S.

    2018-05-01

    The implementation of Green Building criteria is relatively new in architectural practice, especially in Indonesia. Consequently, the integration of these criteria into the design process has the potential to change the design process itself. The implementation of the green building criteria into the conventional design process is discussed in this paper. The concept of this project is to design a residential unit with a natural air-conditioning system. To achieve this purpose, the Green Building criteria have been applied from the beginning of the design process through the detailing process at the end of the project. Several studies were performed throughout the design process, such as: (1) conceptual review, where several professionally proven theories related to Tropical Architecture and passive design are used as a reference, and (2) computer simulations, such as Computational Fluid Dynamics (CFD) and wind tunnel simulation, used to represent the dynamic response of the surrounding environment towards the building. Hopefully this paper may become a reference for designing a green residential building.

  18. Evolving binary classifiers through parallel computation of multiple fitness cases.

    PubMed

    Cagnoni, Stefano; Bergenti, Federico; Mordonini, Monica; Adorni, Giovanni

    2005-06-01

    This paper describes two versions of a novel approach to developing binary classifiers, based on two evolutionary computation paradigms: cellular programming and genetic programming. Such an approach achieves high computation efficiency both during evolution and at runtime. Evolution speed is optimized by allowing multiple solutions to be computed in parallel. Runtime performance is optimized explicitly using parallel computation in the case of cellular programming or implicitly taking advantage of the intrinsic parallelism of bitwise operators on standard sequential architectures in the case of genetic programming. The approach was tested on a digit recognition problem and compared with a reference classifier.
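
    The sketch below illustrates the "intrinsic parallelism of bitwise operators" route in hedged form: 64 fitness cases are packed into the bits of one machine word per input feature, so a single bitwise expression evaluates a candidate classifier on all cases at once. The features, the example expression, and the fitness count are illustrative stand-ins for an evolved program.

      # Hedged sketch: evaluating 64 fitness cases of a boolean classifier
      # in parallel with bitwise operators on a sequential architecture.
      import random

      N_CASES = 64                      # one fitness case per bit of a 64-bit word
      MASK = (1 << N_CASES) - 1

      def pack(bits):
          """Pack a list of 0/1 values (one per fitness case) into an int."""
          word = 0
          for i, b in enumerate(bits):
              word |= (b & 1) << i
          return word

      # Two input features and a target label, each as 64 packed cases.
      x0 = pack([random.randint(0, 1) for _ in range(N_CASES)])
      x1 = pack([random.randint(0, 1) for _ in range(N_CASES)])
      target = pack([random.randint(0, 1) for _ in range(N_CASES)])

      def evaluate(x0, x1):
          """Candidate classifier: x0 XOR x1, written with primitive gates.
          All 64 cases are evaluated by this single expression."""
          return ((x0 & ~x1) | (~x0 & x1)) & MASK

      def fitness(pred, target):
          """Number of fitness cases classified correctly."""
          agree = ~(pred ^ target) & MASK   # 1-bits where prediction == target
          return bin(agree).count("1")

      print(fitness(evaluate(x0, x1), target))  # correct cases out of 64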

  19. Analyzing and designing object-oriented missile simulations with concurrency

    NASA Astrophysics Data System (ADS)

    Randorf, Jeffrey Allen

    2000-11-01

    A software object model for the six degree-of-freedom missile modeling domain is presented. As a precursor, a domain analysis of the missile modeling domain was started, based on the Feature-Oriented Domain Analysis (FODA) technique described by the Software Engineering Institute (SEI). It was subsequently determined that the FODA methodology is functionally equivalent to the Object Modeling Technique. The analysis used legacy software documentation and code from the ENDOSIM, KDEC, and TFrames 6-DOF modeling tools, including other technical literature. The SEI Object Connection Architecture (OCA) was the template for designing the object model. Three variants of the OCA were considered---a reference structure, a recursive structure, and a reference structure with augmentation for flight vehicle modeling. The reference OCA design option was chosen for maintaining simplicity while not compromising the expressive power of the OMT model. The missile architecture was then analyzed for potential areas of concurrent computing. It was shown how protected objects could be used for data passing between OCA object managers, allowing concurrent access without changing the OCA reference design intent or structure. The implementation language was the 1995 release of Ada. OCA software components were shown to be expressible as Ada child packages. While acceleration of several low-level and higher-level operations is possible on appropriate hardware, there was a 33% degradation of 4th-order Runge-Kutta integrator performance for two simultaneous ordinary differential equations when using Ada tasking on a single-processor machine. The Defense Department's High Level Architecture was introduced and explained in context with the OCA. It was shown that the HLA and OCA are not mutually exclusive architectures, but complementary. HLA was shown as an interoperability solution, with the OCA as an architectural vehicle for software reuse. Further directions for implementing a 6-DOF missile modeling environment are discussed.
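
    For reference, the benchmark mentioned above is the classical 4th-order Runge-Kutta method applied to two simultaneous ODEs; the hedged sketch below shows one such integrator on a simple harmonic oscillator (a stand-in for the missile state equations), written in Python rather than the thesis's Ada.

      # Hedged sketch: classical RK4 on two coupled ODEs (harmonic oscillator).
      def rk4_step(f, t, y, h):
          k1 = f(t, y)
          k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
          k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
          k4 = f(t + h,     [yi + h * ki     for yi, ki in zip(y, k3)])
          return [yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

      def oscillator(t, y):          # y = [position, velocity]
          return [y[1], -y[0]]       # dy/dt

      y, t, h = [1.0, 0.0], 0.0, 0.01
      for _ in range(1000):          # integrate out to t = 10
          y = rk4_step(oscillator, t, y, h)
          t += h
      print(y)                       # close to [cos(10), -sin(10)]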

  20. Towards Formalizing the Java Security Architecture of JDK 1.2

    DTIC Science & Technology

    1998-01-01

    and Richard E. Newman for their contributions to this paper. References 1. Balfanz, D. and Gong, L.: Experience with Secure Multi-Processing in Java...Privacy, IEEE Computer Society, Oakland, California, pages 122-136, 1992. 18. Wallach, D. S., Balfanz, D., Dean, D., and Felten, E. W.: Extensible

  1. Data flow language and interpreter for a reconfigurable distributed data processor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hurt, A.D.; Heath, J.R.

    1982-01-01

    An analytic language and an interpreter whereby an application's data flow graph may serve as an input to a reconfigurable distributed data processor are proposed. The architecture considered consists of a number of loosely coupled computing elements (CEs) which may be linked to data and file memories through fully nonblocking interconnect networks. The real-time performance of such an architecture depends upon its ability to alter its topology in response to changes in application, asynchronous data rates and faults. Such a data flow language enhances the versatility of a reconfigurable architecture by allowing the user to specify the machine's topology at a very high level. 11 references.

  2. Cognitive Modeling of Individual Variation in Reference Production and Comprehension

    PubMed Central

    Hendriks, Petra

    2016-01-01

    A challenge for most theoretical and computational accounts of linguistic reference is the observation that language users vary considerably in their referential choices. Part of the variation observed among and within language users and across tasks may be explained from variation in the cognitive resources available to speakers and listeners. This paper presents a computational model of reference production and comprehension developed within the cognitive architecture ACT-R. Through simulations with this ACT-R model, it is investigated how cognitive constraints interact with linguistic constraints and features of the linguistic discourse in speakers’ production and listeners’ comprehension of referring expressions in specific tasks, and how this interaction may give rise to variation in referential choice. The ACT-R model of reference explains and predicts variation among language users in their referential choices as a result of individual and task-related differences in processing speed and working memory capacity. Because of limitations in their cognitive capacities, speakers sometimes underspecify or overspecify their referring expressions, and listeners sometimes choose incorrect referents or are overly liberal in their interpretation of referring expressions. PMID:27092101

  3. Processor Emulator with Benchmark Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lloyd, G. Scott; Pearce, Roger; Gokhale, Maya

    2015-11-13

    A processor emulator and a suite of benchmark applications have been developed to assist in characterizing the performance of data-centric workloads on current and future computer architectures. Some of the applications have been collected from other open source projects. For more details on the emulator and an example of its usage, see reference [1].

  4. 76 FR 11407 - Review of Wireline Competition Bureau Data Practices, Computer III Further Remand Proceedings...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-02

    ... removal of the narrowband comparably efficient interconnection (CEI) and open network architecture (ONA... and copying during normal business hours in the FCC Reference Information Center, Portals II, 445 12th... addition, pursuant to the Small Business Paperwork Relief Act of 2002, we will seek specific comment on how...

  5. New architecture for dynamic frame-skipping transcoder.

    PubMed

    Fung, Kai-Tat; Chan, Yui-Lam; Siu, Wan-Chi

    2002-01-01

    Transcoding is a key technique for reducing the bit rate of a previously compressed video signal. A high transcoding ratio may result in an unacceptable picture quality when the full frame rate of the incoming video bitstream is used. Frame skipping is often used as an efficient scheme to allocate more bits to the representative frames, so that an acceptable quality for each frame can be maintained. However, the skipped frame must still be decompressed completely, since it might act as a reference frame for reconstructing nonskipped frames. The newly quantized discrete cosine transform (DCT) coefficients of the prediction errors need to be re-computed for the nonskipped frame with reference to the previous nonskipped frame; this can create undesirable complexity as well as introduce re-encoding errors. In this paper, we propose new algorithms and a novel architecture for frame-rate reduction to improve picture quality and to reduce complexity. The proposed architecture operates mainly in the DCT domain to achieve a transcoder with low complexity. With the direct addition of DCT coefficients and an error compensation feedback loop, re-encoding errors are reduced significantly. Furthermore, we propose a frame-rate control scheme which can dynamically adjust the number of skipped frames according to the incoming motion vectors and the re-encoding errors due to transcoding, such that the decoded sequence has smooth motion as well as better transcoded pictures. Experimental results show that, compared to the conventional transcoder, the new architecture for the frame-skipping transcoder is more robust, produces fewer requantization errors, and has reduced computational complexity.
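
    A hedged sketch of the two ideas named above, under a zero-motion simplification: the skipped frame is folded into the current frame by direct addition of DCT residuals, and the requantization error of each block is fed back into the next block before quantization. The quantization step, block shapes, and random residuals are illustrative assumptions, not the paper's transcoder.

      # Hedged sketch: DCT-domain re-referencing by direct addition of
      # residuals, plus an error-compensation feedback loop.
      import numpy as np

      Q = 16.0  # illustrative quantization step

      def quantize(block):
          return np.round(block / Q)

      def dequantize(levels):
          return levels * Q

      rng = np.random.default_rng(0)
      res_skipped = rng.normal(0, 20, (8, 8))     # DCT residual of the skipped frame
      res_current = rng.normal(0, 20, (8, 8))     # DCT residual of the current frame

      # Direct addition in the DCT domain: re-reference the current frame to
      # the last non-skipped frame (no inverse DCT or motion compensation here).
      combined = res_skipped + res_current

      carried_error = np.zeros((8, 8))            # requantization error carried over

      def transcode_block(res_dct, carried_error):
          adjusted = res_dct + carried_error
          levels = quantize(adjusted)
          new_error = adjusted - dequantize(levels)   # what quantization lost
          return levels, new_error

      levels, carried_error = transcode_block(combined, carried_error)
      print(levels[:2, :2], carried_error[:2, :2])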

  6. On some methods for improving time of reachability sets computation for the dynamic system control problem

    NASA Astrophysics Data System (ADS)

    Zimovets, Artem; Matviychuk, Alexander; Ushakov, Vladimir

    2016-12-01

    The paper presents two different approaches to reducing the computation time of reachability sets. The first approach uses different data structures for storing the reachability sets in computer memory for calculation in single-threaded mode. The second approach is based on parallel algorithms operating on the data structures from the first approach. Within the framework of this paper, a parallel algorithm for approximate reachability set calculation on a computer with SMP architecture is proposed. The results of numerical modelling are presented in the form of tables which demonstrate the high efficiency of parallel computing technology and also show how computing time depends on the data structure used.
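
    As a hedged illustration of what is being accelerated, the sketch below performs one step of a grid-based reachable-set approximation: the new set is the union of the images of every occupied cell under all admissible controls, and the loop over cells is the part an SMP (parallel) implementation would distribute across threads. The dynamics, grid, and control set are illustrative, not the paper's system.

      # Hedged sketch: one reachability step on a boolean occupancy grid.
      import numpy as np

      N = 200
      reach = np.zeros((N, N), dtype=bool)
      reach[100, 100] = True                     # initial set: one cell

      controls = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]   # admissible moves

      def step(reach):
          new = np.zeros_like(reach)
          cells = np.argwhere(reach)             # a parallel version splits this list
          for i, j in cells:
              for di, dj in controls:
                  ni, nj = i + di, j + dj
                  if 0 <= ni < N and 0 <= nj < N:
                      new[ni, nj] = True
          return new

      for _ in range(10):
          reach = step(reach)
      print(int(reach.sum()), "cells reachable after 10 steps")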

  7. Applying Strategic Visualization(Registered Trademark) to Lunar and Planetary Mission Design

    NASA Technical Reports Server (NTRS)

    Frassanito, John R.; Cooke, D. R.

    2002-01-01

    NASA teams, such as the NASA Exploration Team (NEXT), utilize advanced computational visualization processes to develop mission designs and architectures for lunar and planetary missions. One such process, Strategic Visualization (trademark), is a tool used extensively to help mission designers visualize various design alternatives and present them to other participants of their team. The participants, which may include NASA, industry, and the academic community, are distributed within a virtual network. Consequently, computer animation and other digital techniques provide an efficient means to communicate top-level technical information among team members. Today, Strategic Visualization (trademark) is used extensively both in the mission design process within the technical community, and to communicate the value of space exploration to the general public. Movies and digital images have been generated and shown on nationally broadcast television and the Internet, as well as in magazines and digital media. In our presentation we will show excerpts of a computer-generated animation depicting the reference Earth/Moon L1 Libration Point Gateway architecture. The Gateway serves as a staging corridor for human expeditions to the lunar poles and other surface locations. Also shown are crew transfer systems and current reference lunar excursion vehicles, as well as the human and robotic construction of an inflatable telescope array for deployment to the Sun/Earth Libration Point.

  8. AltiVec performance increases for autonomous robotics for the MARSSCAPE architecture program

    NASA Astrophysics Data System (ADS)

    Gothard, Benny M.

    2002-02-01

    One of the main tall poles that must be overcome to develop a fully autonomous vehicle is the inability of the computer to understand its surrounding environment to the level required for the intended task. The military mission scenario requires a robot to interact in a complex, unstructured, dynamic environment (reference: A High Fidelity Multi-Sensor Scene Understanding System for Autonomous Navigation). The Mobile Autonomous Robot Software Self Composing Adaptive Programming Environment (MarsScape) perception research addresses three aspects of the problem: sensor system design, processing architectures, and algorithm enhancements. A prototype perception system has been demonstrated on robotic High Mobility Multi-purpose Wheeled Vehicle and All Terrain Vehicle testbeds. This paper addresses the tall pole of processing requirements and the performance improvements based on the selected MarsScape processing architecture. The processor chosen is the Motorola AltiVec-G4 PowerPC (PPC) (1998 Motorola, Inc.), a highly parallelized commercial Single Instruction Multiple Data processor. Both derived perception benchmarks and actual perception subsystem code will be benchmarked and compared against previous Demo II Semi-autonomous Surrogate Vehicle processing architectures along with desktop personal computers (PCs). Performance gains are highlighted with progress to date, and lessons learned and future directions are described.

  9. A reference architecture for the component factory

    NASA Technical Reports Server (NTRS)

    Basili, Victor R.; Caldiera, Gianluigi; Cantone, Giovanni

    1992-01-01

    Software reuse can be achieved through an organization that focuses on utilization of life cycle products from previous developments. The component factory is both an example of the more general concepts of experience and domain factory and an organizational unit worth being considered independently. The critical features of such an organization are flexibility and continuous improvement. In order to achieve these features we can represent the architecture of the factory at different levels of abstraction and define a reference architecture from which specific architectures can be derived by instantiation. A reference architecture is an implementation and organization independent representation of the component factory and its environment. The paper outlines this reference architecture, discusses the instantiation process, and presents some examples of specific architectures by comparing them in the framework of the reference model.

  10. New ARCH: Future Generation Internet Architecture

    DTIC Science & Technology

    2004-08-01

    a vocabulary to talk about a system. This provides a framework (a "reference model ...layered model. Modularity and abstraction are central tenets of Computer Science thinking. Modularity breaks a system into parts, normally to permit...this complexity is hidden. Abstraction suggests a structure for the system. A popular and simple structure is a layered model: lower layer

  11. An independent review of the Multi-Path Redundant Avionics Suite (MPRAS) architecture assessment and characterization report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, M.R.

    1991-02-01

    In recent years the NASA Langley Research Center has funded several contractors to conduct conceptual designs defining architectures for fault tolerant computer systems. Such a system is referred to as a Multi-Path Redundant Avionics Suite (MPRAS), and would form the basis for avionics systems that would be used in future families of space vehicles in a variety of missions. The principal contractors were General Dynamics, Boeing, and Draper Laboratories. These contractors participated in a series of review meetings, and submitted final reports defining their candidate architectures. NASA then commissioned the Research Triangle Institute (RTI) to perform an assessment of these architectures to identify strengths and weaknesses of each. This report is a separate, independent review of the RTI assessment, done primarily to assure that the assessment was comprehensive and objective. The report also includes general recommendations relative to further MPRAS development.

  12. A Standard-Based and Context-Aware Architecture for Personal Healthcare Smart Gateways.

    PubMed

    Santos, Danilo F S; Gorgônio, Kyller C; Perkusich, Angelo; Almeida, Hyggo O

    2016-10-01

    The rising availability of Personal Health Devices (PHDs) capable of Personal Area Network (PAN) communication and the desire of keeping a high quality of life are the ingredients of the Connected Health vision. In parallel, a growing number of personal and portable devices, like smartphones and tablet computers, are becoming capable of taking the role of health gateway, that is, a data collector for the sensor PHDs. However, as the number of PHDs increases, the number of other peripherals connected in the PAN also increases. Therefore, PHDs are now competing for medium access with other devices, decreasing the Quality of Service (QoS) of health applications in the PAN. In this article we present a reference architecture to prioritize PHD connections based on their state and requirements, creating a healthcare Smart Gateway. Healthcare context information is extracted by observing the traffic through the gateway. A standard-based approach was used to identify health traffic based on the ISO/IEEE 11073 family of standards. A reference implementation was developed showing the relevance of the problem and how the proposed architecture can assist in the prioritization. The reference Smart Gateway solution was integrated with a Connected Health System for the Internet of Things, validating its use in a real case scenario.
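
    A hedged sketch of the prioritization idea: observed PAN connections are scored by whether their traffic looks like IEEE 11073 health data and by device state, and medium access is granted in priority order. The is_health_traffic test, the profile tags, and the scoring weights are illustrative placeholders, not the article's classification rules.

      # Hedged sketch: prioritizing PHD connections at a health gateway.
      import heapq

      def is_health_traffic(profile):
          # Placeholder: a real gateway would inspect the exchanged traffic
          # against the ISO/IEEE 11073 family of standards.
          return profile == "hdp-11073"

      def priority(conn):
          score = 0
          if is_health_traffic(conn["profile"]):
              score += 100
          if conn.get("alarm_active"):       # e.g. measurement out of range
              score += 50
          return -score                      # heapq pops the smallest value first

      connections = [
          {"id": "headset",       "profile": "a2dp"},
          {"id": "glucose-meter", "profile": "hdp-11073"},
          {"id": "bp-monitor",    "profile": "hdp-11073", "alarm_active": True},
      ]

      queue = [(priority(c), c["id"]) for c in connections]
      heapq.heapify(queue)
      while queue:
          _, cid = heapq.heappop(queue)
          print("grant medium access to:", cid)
      # -> bp-monitor, glucose-meter, headset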

  13. Performance Analysis of GFDL's GCM Line-By-Line Radiative Transfer Model on GPU and MIC Architectures

    NASA Astrophysics Data System (ADS)

    Menzel, R.; Paynter, D.; Jones, A. L.

    2017-12-01

    Due to their relatively low computational cost, radiative transfer models in global climate models (GCMs) run on traditional CPU architectures generally consist of shortwave and longwave parameterizations over a small number of wavelength bands. With the rise of newer GPU and MIC architectures, however, the performance of high resolution line-by-line radiative transfer models may soon approach that of the physical parameterizations currently employed in GCMs. Here we present an analysis of the current performance of a new line-by-line radiative transfer model currently under development at GFDL. Although originally designed to specifically exploit GPU architectures through the use of CUDA, the radiative transfer model has recently been extended to include OpenMP in an effort to also effectively target MIC architectures such as Intel's Xeon Phi. Using input data provided by the upcoming Radiative Forcing Model Intercomparison Project (RFMIP, as part of CMIP 6), we compare model results and performance data for various model configurations and spectral resolutions run on both GPU and Intel Knights Landing architectures to analogous runs of the standard Oxford Reference Forward Model on traditional CPUs.
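
    To illustrate why line-by-line models are costly yet parallel-friendly, the hedged sketch below accumulates optical depth line by line on a fine wavenumber grid and applies the Beer-Lambert law; every grid point is independent, which is what GPU and MIC architectures exploit. The line positions, strengths, and widths are random stand-ins, not HITRAN data or GFDL's model.

      # Hedged sketch: monochromatic transmittance from a sum of Lorentz lines.
      import numpy as np

      rng = np.random.default_rng(1)
      nu = np.linspace(600.0, 700.0, 1_000_000)      # fine wavenumber grid (cm^-1)

      n_lines = 500
      line_pos = rng.uniform(600.0, 700.0, n_lines)  # line centers (cm^-1)
      strength = rng.uniform(0.01, 1.0, n_lines)     # arbitrary line strengths
      gamma = 0.05                                   # Lorentz half-width (cm^-1)

      tau = np.zeros_like(nu)
      for s, nu0 in zip(strength, line_pos):
          # Lorentz profile; each line contributes over the whole grid
          tau += s * (gamma / np.pi) / ((nu - nu0) ** 2 + gamma ** 2)

      transmittance = np.exp(-tau)                   # Beer-Lambert law
      print(transmittance.min(), transmittance.mean())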

  14. Understanding Evolutionary Potential in Virtual CPU Instruction Set Architectures

    PubMed Central

    Bryson, David M.; Ofria, Charles

    2013-01-01

    We investigate fundamental decisions in the design of instruction set architectures for linear genetic programs that are used as both model systems in evolutionary biology and underlying solution representations in evolutionary computation. We subjected digital organisms with each tested architecture to seven different computational environments designed to present a range of evolutionary challenges. Our goal was to engineer a general purpose architecture that would be effective under a broad range of evolutionary conditions. We evaluated six different types of architectural features for the virtual CPUs: (1) genetic flexibility: we allowed digital organisms to more precisely modify the function of genetic instructions, (2) memory: we provided an increased number of registers in the virtual CPUs, (3) decoupled sensors and actuators: we separated input and output operations to enable greater control over data flow. We also tested a variety of methods to regulate expression: (4) explicit labels that allow programs to dynamically refer to specific genome positions, (5) position-relative search instructions, and (6) multiple new flow control instructions, including conditionals and jumps. Each of these features also adds complication to the instruction set and risks slowing evolution due to epistatic interactions. Two features (multiple argument specification and separated I/O) demonstrated substantial improvements in the majority of test environments, along with versions of each of the remaining architecture modifications that show significant improvements in multiple environments. However, some tested modifications were detrimental, though most exhibit no systematic effects on evolutionary potential, highlighting the robustness of digital evolution. Combined, these observations enhance our understanding of how instruction architecture impacts evolutionary potential, enabling the creation of architectures that support more rapid evolution of complex solutions to a broad range of challenges. PMID:24376669
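
    To make the design space concrete, the following is a toy linear-genome virtual CPU in the spirit of the systems studied here, with a register file, separated input/output instructions, and a label-based jump; the opcode names and semantics are illustrative only and do not reproduce the instruction sets evaluated in the paper.

      # Toy linear-genome virtual CPU with a register file, separated input/output,
      # and a label-based flow-control instruction. Illustrative only; not the ISA
      # variants evaluated in the paper.
      class VirtualCPU:
          def __init__(self, genome, inputs, n_registers=4):
              self.genome = genome              # list of (opcode, *args) tuples
              self.regs = [0] * n_registers     # feature (2): more registers
              self.inputs = list(inputs)        # feature (3): decoupled sensors...
              self.outputs = []                 # ...and actuators
              self.ip = 0                       # instruction pointer

          def step(self):
              op, *args = self.genome[self.ip % len(self.genome)]
              if op == "inc":
                  self.regs[args[0]] += 1
              elif op == "add":
                  self.regs[args[0]] = self.regs[args[1]] + self.regs[args[2]]
              elif op == "input":
                  self.regs[args[0]] = self.inputs.pop(0) if self.inputs else 0
              elif op == "output":
                  self.outputs.append(self.regs[args[0]])
              elif op == "jump-label":          # features (4)/(6): jump to a genome label
                  targets = [i for i, g in enumerate(self.genome)
                             if g[0] == "label" and g[1] == args[0]]
                  if targets:
                      self.ip = targets[0]
              # "label" and unknown opcodes act as no-ops
              self.ip += 1

      # Example: read two inputs, add them, write the sum.
      cpu = VirtualCPU([("input", 0), ("input", 1), ("add", 2, 0, 1), ("output", 2)],
                       inputs=[3, 4])
      for _ in range(4):
          cpu.step()
      print(cpu.outputs)   # [7]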

  15. Computers in Academic Architecture Libraries.

    ERIC Educational Resources Information Center

    Willis, Alfred; And Others

    1992-01-01

    Computers are widely used in architectural research and teaching in U.S. schools of architecture. A survey of libraries serving these schools sought information on the emphasis placed on computers by the architectural curriculum, accessibility of computers to library staff, and accessibility of computers to library patrons. Survey results and…

  16. Open Systems Architecture for Command, Control and Communications

    DTIC Science & Technology

    1991-07-01

    Contents: Executive Summary; Terms of Reference; Panel Membership; Introduction; Industrial Revolution. Initial manifestations of computer and communications standards emerged in the early seventies, largely… [OSI layer diagram: Application, Presentation, Session, Transport, Internet, Data Link, Physical.]

  17. DAsHER CD: Developing a Data-Oriented Human-Centric Enterprise Architecture for EarthCube

    NASA Astrophysics Data System (ADS)

    Yang, C. P.; Yu, M.; Sun, M.; Qin, H.; Robinson, E.

    2015-12-01

    One of the biggest challenges facing Earth scientists is discovering, accessing, and sharing resources in the desired fashion. EarthCube aims to enable geoscientists to address these challenges by fostering community-governed efforts that develop a common cyberinfrastructure for the purpose of collecting, accessing, analyzing, sharing and visualizing all forms of data and related resources, through the use of advanced technological and computational capabilities. Here we design an Enterprise Architecture (EA) for EarthCube to facilitate knowledge management, communication and human collaboration in pursuit of unprecedented data sharing across the geosciences. The design results will provide EarthCube with a reference framework for developing geoscience cyberinfrastructure in collaboration with different stakeholders, and for identifying topics that should invoke high interest in the community. The development of this EarthCube EA framework leverages popular frameworks, such as Zachman, Gartner, DoDAF, and FEAF. The science driver of this design is the needs of the EarthCube community, including the analyzed user requirements from EarthCube End User Workshop reports and EarthCube working group roadmaps, and feedback and comments from scientists obtained through workshops. The final product of this Enterprise Architecture is a four-volume reference document: 1) Volume one is this document and comprises an executive summary of the EarthCube architecture, serving as an overview in the initial phases of architecture development; 2) Volume two is the major body of the design product and outlines all the architectural design components or viewpoints; 3) Volume three provides a taxonomy of the EarthCube enterprise augmented with semantic relations; 4) Volume four describes an example of utilizing this architecture for a geoscience project.

  18. CONVEX mini manual

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    The use of the CONVEX computers that are an integral part of the Supercomputing Network Subsystems (SNS) of the Central Scientific Computing Complex of LaRC is briefly described. Features of the CONVEX computers that are significantly different from the CRAY supercomputers are covered, including: FORTRAN, C, architecture of the CONVEX computers, the CONVEX environment, batch job submittal, debugging, performance analysis, utilities unique to CONVEX, and documentation. This revision reflects the addition of the Applications Compiler and the X-based debugger, CXdb. The document is intended for all CONVEX users as a ready reference to frequently asked questions and to more detailed information contained within the vendor manuals. It is appropriate for both the novice and the experienced user.

  19. Using Computing and Data Grids for Large-Scale Science and Engineering

    NASA Technical Reports Server (NTRS)

    Johnston, William E.

    2001-01-01

    We use the term "Grid" to refer to a software system that provides uniform and location independent access to geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. These emerging data and computing Grids promise to provide a highly capable and scalable environment for addressing large-scale science problems. We describe the requirements for science Grids, the resulting services and architecture of NASA's Information Power Grid (IPG) and DOE's Science Grid, and some of the scaling issues that have come up in their implementation.

  20. Mesoscopic modelling and simulation of soft matter.

    PubMed

    Schiller, Ulf D; Krüger, Timm; Henrich, Oliver

    2017-12-20

    The deformability of soft condensed matter often requires modelling of hydrodynamical aspects to gain quantitative understanding. This, however, requires specialised methods that can resolve the multiscale nature of soft matter systems. We review a number of the most popular simulation methods that have emerged, such as Langevin dynamics, dissipative particle dynamics, multi-particle collision dynamics, sometimes also referred to as stochastic rotation dynamics, and the lattice-Boltzmann method. We conclude this review with a short glance at current compute architectures for high-performance computing and community codes for soft matter simulation.

  1. JETSPIN: A specific-purpose open-source software for simulations of nanofiber electrospinning

    NASA Astrophysics Data System (ADS)

    Lauricella, Marco; Pontrelli, Giuseppe; Coluzza, Ivan; Pisignano, Dario; Succi, Sauro

    2015-12-01

    We present the open-source computer program JETSPIN, specifically designed to simulate the electrospinning process of nanofibers. Its capabilities are shown with proper reference to the underlying model, as well as a description of the relevant input variables and associated test-case simulations. The various interactions included in the electrospinning model implemented in JETSPIN are discussed in detail. The code is designed to exploit different computational architectures, from single to parallel processor workstations. This paper provides an overview of JETSPIN, focusing primarily on its structure, parallel implementations, functionality, performance, and availability.

  2. Emerging Neuromorphic Computing Architectures & Enabling Hardware for Cognitive Information Processing Applications

    DTIC Science & Technology

    2010-06-01

    The highly cross-disciplinary emerging field of neuromorphic computing architectures for cognitive information processing applications… belief systems, software, computer engineering, etc. In our effort to develop cognitive systems atop a neuromorphic computing architecture, we explored…

  3. A Reference Architecture for Space Information Management

    NASA Technical Reports Server (NTRS)

    Mattmann, Chris A.; Crichton, Daniel J.; Hughes, J. Steven; Ramirez, Paul M.; Berrios, Daniel C.

    2006-01-01

    We describe a reference architecture for space information management systems that elegantly overcomes the rigid design of common information systems in many domains. The reference architecture consists of a set of flexible, reusable, independent models and software components that function in unison, but remain separately managed entities. The main guiding principle of the reference architecture is to separate the various models of information (e.g., data, metadata, etc.) from implemented system code, allowing each to evolve independently. System modularity, systems interoperability, and dynamic evolution of information system components are the primary benefits of the design of the architecture. The architecture requires the use of information models that are substantially more advanced than those used by the vast majority of information systems. These models are more expressive and can be more easily modularized, distributed and maintained than simpler models, e.g., configuration files and data dictionaries. Our current work focuses on formalizing the architecture within a CCSDS Green Book and evaluating the architecture within the context of the C3I initiative.
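
    The guiding principle of keeping information models separate from system code can be illustrated with a small sketch in which a metadata model is loaded as data and interpreted by a generic validator; the element names and JSON layout are hypothetical and are not the reference architecture's actual models.

      # Illustrative sketch of the guiding principle: the information model is data that
      # the system loads and interprets, not logic compiled into the system. The element
      # names and the validator are hypothetical, not the reference architecture's models.
      import json

      MODEL_JSON = """
      {
        "ProductType": "SpaceImage",
        "elements": {
          "mission":     {"type": "string", "required": true},
          "start_time":  {"type": "string", "required": true},
          "exposure_s":  {"type": "number", "required": false}
        }
      }
      """

      def validate(record: dict, model: dict) -> list:
          """Return a list of violations of the (externally defined) information model."""
          problems = []
          for name, spec in model["elements"].items():
              if spec["required"] and name not in record:
                  problems.append(f"missing required element: {name}")
              elif name in record:
                  expected = str if spec["type"] == "string" else (int, float)
                  if not isinstance(record[name], expected):
                      problems.append(f"wrong type for {name}")
          return problems

      model = json.loads(MODEL_JSON)      # the model can evolve without code changes
      print(validate({"mission": "TestSat"}, model))   # ['missing required element: start_time']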

  4. Memristor-Based Synapse Design and Training Scheme for Neuromorphic Computing Architecture

    DTIC Science & Technology

    2012-06-01

    …system level built upon the conventional Von Neumann computer architecture [2][3]. Developing the neuromorphic architecture at chip level by… …creation of memristor-based neuromorphic computing architecture. Rather than the existing crossbar-based neuron network designs, we focus on memristor…

  5. Theatre and Cinema Architecture: A Guide to Information Sources.

    ERIC Educational Resources Information Center

    Stoddard, Richard

    This annotated bibliography cites works related to theatres, movie houses, opera houses, and dance facilities. It is divided into three parts: general references, theatre architecture, and cinema architecture. The part on general references includes bibliographies and periodicals. The second and main part of the guide, on theatre architecture,…

  6. Developing a Distributed Computing Architecture at Arizona State University.

    ERIC Educational Resources Information Center

    Armann, Neil; And Others

    1994-01-01

    Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…

  7. Behavioral Reference Model for Pervasive Healthcare Systems.

    PubMed

    Tahmasbi, Arezoo; Adabi, Sahar; Rezaee, Ali

    2016-12-01

    The emergence of mobile healthcare systems is an important outcome of applying pervasive computing concepts for medical care purposes. These systems provide the facilities and infrastructure required for automatic and ubiquitous sharing of medical information. Healthcare systems have a dynamic structure and configuration; therefore, having an architecture is essential for the future development of these systems. The need for increased response rates, the problem of limited storage, the demand for accelerated processing, and the tendency toward creating a new generation of healthcare system architectures highlight the need for further focus on cloud-based solutions to data transfer and data processing challenges. Integrity and reliability of healthcare systems are of critical importance, as even the slightest error may put patients' lives in danger; therefore, acquiring a behavioral model for these systems and developing the tools required to model their behaviors are of significant importance. The high-level designs may contain some flaws; therefore, the system must be fully examined for different scenarios and conditions. This paper presents a software architecture for the development of healthcare systems based on pervasive computing concepts, and then models the behavior of the described system. A set of solutions is then proposed to improve the design's qualitative characteristics, including availability, interoperability and performance.

  8. Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures

    DTIC Science & Technology

    2017-10-04

    …algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures. These…

  9. User's guide for FRMOD, a zero dimensional FRM burn code

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Driemeryer, D.; Miley, G.H.

    1979-10-15

    The zero-dimensional FRM plasma burn code, FRMOD is written in the FORTRAN language and is currently available on the Control Data Corporation (CDC) 7600 computer at the Magnetic Fusion Energy Computer Center (MFECC), sponsored by the US Department of Energy, in Livermore, CA. This guide assumes that the user is familiar with the system architecture and some of the utility programs available on the MFE-7600 machine, since online documentation is available for system routines through the use of the DOCUMENT utility. Users may therefore refer to it for answers to system related questions.

  10. Frances: A Tool for Understanding Computer Architecture and Assembly Language

    ERIC Educational Resources Information Center

    Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh

    2012-01-01

    Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…

  11. M3BA: A Mobile, Modular, Multimodal Biosignal Acquisition Architecture for Miniaturized EEG-NIRS-Based Hybrid BCI and Monitoring.

    PubMed

    von Lühmann, Alexander; Wabnitz, Heidrun; Sander, Tilmann; Müller, Klaus-Robert

    2017-06-01

    For the further development of the fields of telemedicine, neurotechnology, and brain-computer interfaces, advances in hybrid multimodal signal acquisition and processing technology are invaluable. Currently, there are no commonly available hybrid devices combining bioelectrical and biooptical neurophysiological measurements [here electroencephalography (EEG) and functional near-infrared spectroscopy (NIRS)]. Our objective was to design such an instrument in a miniaturized, customizable, and wireless form. We present here the design and evaluation of a mobile, modular, multimodal biosignal acquisition architecture (M3BA) based on a high-performance analog front-end optimized for biopotential acquisition, a microcontroller, and our openNIRS technology. The designed M3BA modules are very small, configurable, high-precision and low-noise modules (EEG input-referred noise at 500 SPS: 1.39 μVpp; NIRS noise equivalent power: NEP750nm = 5.92 pWpp, NEP850nm = 4.77 pWpp) with full input linearity, Bluetooth, 3-D accelerometer, and low power consumption. They support flexible user-specified biopotential reference setups and wireless body area/sensor network scenarios. Performance characterization and in-vivo experiments confirmed functionality and quality of the designed architecture. Telemedicine and assistive neurotechnology scenarios will increasingly include wearable multimodal sensors in the future. The M3BA architecture can significantly facilitate future designs for research in these and other fields that rely on customized mobile hybrid biosignal acquisition. Keywords: mobile, modular, multimodal biosignal acquisition architecture (M3BA); multimodal; near-infrared spectroscopy (NIRS); wireless body area network (WBAN); wireless body sensor network (WBSN).

  12. Tutorial: Computer architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gajski, D.D.; Milutinovic, V.M.; Siegel, H.J.

    1986-01-01

    This book presents the state-of-the-art in advanced computer architecture. It deals with the concepts underlying current architectures and covers approaches and techniques being used in the design of advanced computer systems.

  13. Outline of a novel architecture for cortical computation.

    PubMed

    Majumdar, Kaushik

    2008-03-01

    In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.

  14. Architecture Adaptive Computing Environment

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    2006-01-01

    Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple-instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
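
    The pre-computed-paths idea can be illustrated outside aCe itself: the sketch below (plain NumPy, not aCe syntax) computes the neighbor-communication pattern of a stencil update once and reuses it on every iteration.

      # Sketch of the "pre-computed path" idea in plain NumPy: the neighbor-communication
      # pattern of a stencil update is computed once and reused every iteration, instead of
      # being recomputed inside the loop. This illustrates the concept only; it is not aCe code.
      import numpy as np

      n = 1024
      grid = np.random.rand(n)

      # Pre-compute the communication pattern (periodic left/right neighbors) once.
      left  = np.roll(np.arange(n),  1)
      right = np.roll(np.arange(n), -1)

      for _ in range(100):
          # Each step reuses the pre-computed index paths; no per-step neighbor arithmetic.
          grid = 0.5 * grid + 0.25 * (grid[left] + grid[right])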

  15. A parallel-processing approach to computing for the geographic sciences

    USGS Publications Warehouse

    Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Haga, Jim; Maddox, Brian; Feller, Mark

    2001-01-01

    The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost personal computers into parallel computing environments that provide a cost-effective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting research into various areas, such as advanced computer architecture, algorithms to meet the processing needs for real-time image and data processing, the creation of custom datasets from seamless source data, rapid turn-around of products for emergency response, and support for computationally intense spatial and temporal modeling.

  16. Development of a Multi-Disciplinary Computing Environment (MDICE)

    NASA Technical Reports Server (NTRS)

    Kingsley, Gerry; Siegel, John M., Jr.; Harrand, Vincent J.; Lawrence, Charles; Luker, Joel J.

    1999-01-01

    The growing need for and importance of multi-component and multi-disciplinary engineering analysis has been understood for many years. For many applications, loose (or semi-implicit) coupling is optimal, and allows the use of various legacy codes without requiring major modifications. For this purpose, CFDRC and NASA LeRC have developed a computational environment to enable coupling between various flow analysis codes at several levels of fidelity. This has been referred to as the Visual Computing Environment (VCE), and is being successfully applied to the analysis of several aircraft engine components. Recently, CFDRC and AFRL/VAAC (WL) have extended the framework and scope of VCE to enable complex multi-disciplinary simulations. The chosen initial focus is on aeroelastic aircraft applications. The developed software is referred to as MDICE-AE, an extensible system suitable for integration of several engineering analysis disciplines. This paper describes the methodology, basic architecture, chosen software technologies, salient library modules, and the current status of and plans for MDICE. A fluid-structure interaction application is described in a separate companion paper.

  17. FPGA-Based High-Performance Embedded Systems for Adaptive Edge Computing in Cyber-Physical Systems: The ARTICo³ Framework.

    PubMed

    Rodríguez, Alfonso; Valverde, Juan; Portilla, Jorge; Otero, Andrés; Riesgo, Teresa; de la Torre, Eduardo

    2018-06-08

    Cyber-Physical Systems are experiencing a paradigm shift in which processing has been relocated to the distributed sensing layer and is no longer performed in a centralized manner. This approach, usually referred to as Edge Computing, demands the use of hardware platforms that are able to manage the steadily increasing requirements in computing performance, while keeping energy efficiency and the adaptability imposed by the interaction with the physical world. In this context, SRAM-based FPGAs and their inherent run-time reconfigurability, when coupled with smart power management strategies, are a suitable solution. However, they usually fail in user accessibility and ease of development. In this paper, an integrated framework to develop FPGA-based high-performance embedded systems for Edge Computing in Cyber-Physical Systems is presented. This framework provides a hardware-based processing architecture, an automated toolchain, and a runtime to transparently generate and manage reconfigurable systems from high-level system descriptions without additional user intervention. Moreover, it provides users with support for dynamically adapting the available computing resources to switch the working point of the architecture in a solution space defined by computing performance, energy consumption and fault tolerance. Results show that it is indeed possible to explore this solution space at run time and prove that the proposed framework is a competitive alternative to software-based edge computing platforms, being able to provide not only faster solutions, but also higher energy efficiency for computing-intensive algorithms with significant levels of data-level parallelism.
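
    The run-time exploration of the performance/energy/fault-tolerance solution space can be sketched as a simple working-point selection problem; the candidate configurations and their figures below are invented for illustration and do not reflect the ARTICo³ API or measured numbers.

      # Hypothetical sketch of run-time working-point selection in the solution space spanned
      # by computing performance, energy and fault tolerance. The candidate configurations and
      # their figures are made up; the ARTICo3 API and measured numbers are not reproduced here.
      from dataclasses import dataclass

      @dataclass
      class WorkingPoint:
          accelerators: int        # parallel hardware replicas
          redundancy: str          # "none", "DMR" or "TMR"
          throughput: float        # relative performance
          power_w: float           # estimated power draw

      CANDIDATES = [
          WorkingPoint(8, "none", 8.0, 4.0),
          WorkingPoint(4, "DMR",  4.0, 4.0),
          WorkingPoint(2, "TMR",  2.0, 3.0),
      ]

      def select(min_throughput: float, need_fault_tolerance: bool, power_budget_w: float):
          """Pick the lowest-power configuration that satisfies the current constraints."""
          feasible = [c for c in CANDIDATES
                      if c.throughput >= min_throughput
                      and c.power_w <= power_budget_w
                      and (not need_fault_tolerance or c.redundancy != "none")]
          return min(feasible, key=lambda c: c.power_w, default=None)

      print(select(min_throughput=3.0, need_fault_tolerance=True, power_budget_w=4.5))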

  18. Processor tradeoffs in distributed real-time systems

    NASA Technical Reports Server (NTRS)

    Krishna, C. M.; Shin, Kang G.; Bhandari, Inderpal S.

    1987-01-01

    The problem of the optimization of the design of real-time distributed systems is examined with reference to a class of computer architectures similar to the continuously reconfigurable multiprocessor flight control system structure, CM2FCS. Particular attention is given to the impact of processor replacement and the burn-in time on the probability of dynamic failure and mean cost. The solution is obtained numerically and interpreted in the context of real-time applications.

  19. Supporting Undergraduate Computer Architecture Students Using a Visual MIPS64 CPU Simulator

    ERIC Educational Resources Information Center

    Patti, D.; Spadaccini, A.; Palesi, M.; Fazzino, F.; Catania, V.

    2012-01-01

    The topics of computer architecture are always taught using an Assembly dialect as an example. The most commonly used textbooks in this field use the MIPS64 Instruction Set Architecture (ISA) to help students in learning the fundamentals of computer architecture because of its orthogonality and its suitability for real-world applications. This…

  20. Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques

    DTIC Science & Technology

    2013-03-01

    …schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated…

  1. Space and Ground Trades for Human Exploration and Wearable Computing

    NASA Technical Reports Server (NTRS)

    Lupisella, Mark; Donohue, John; Mandl, Dan; Ly, Vuong; Graves, Corey; Heimerdinger, Dan; Studor, George; Saiz, John; DeLaune, Paul; Clancey, William

    2006-01-01

    Human exploration of the Moon and Mars will present unique trade study challenges as ground system elements shift to planetary bodies and perhaps eventually to the bodies of human explorers in the form of wearable computing technologies. This presentation will highlight some of the key space and ground trade issues that will face the Exploration Initiative as NASA begins designing systems for the sustained human exploration of the Moon and Mars, with an emphasis on wearable computing. We will present some preliminary test results and scenarios that demonstrate how wearable computing might affect the trade space noted below. We will first present some background on wearable computing and its utility to NASA's Exploration Initiative. Next, we will discuss three broad architectural themes, some key ground and space trade issues within those themes and how they relate to wearable computing. Lastly, we will present some preliminary test results and suggest guidance for proceeding in the assessment and creation of a value-added role for wearable computing in the Exploration Initiative. The three broad ground-space architectural trade themes we will discuss are: 1. Functional Shift and Distribution: To what extent, if any, should traditional ground system functionality be shifted to, and distributed among, the Earth, Moon/Mars, and the human explorer? 2. Situational Awareness and Autonomy: How much situational awareness (e.g. environmental conditions, biometrics, etc.) and autonomy is required and desired, and where should these capabilities reside? 3. Functional Redundancy: What functions (e.g. command, control, analysis) should exist simultaneously on Earth, the Moon/Mars, and the human explorer? These three themes can serve as the axes of a three-dimensional trade space, within which architectural solutions reside. We will show how wearable computers can fit into this trade space and what the possible implications could be for the rest of the ground and space architecture(s). We intend this to be an example of explorer-centric thinking in a fully integrated explorer paradigm, where integrated explorer refers to a human explorer having instant access to all relevant data, knowledge of the environment, science models, health and safety-related events, and other tools and information via wearable computing technologies. The trade study approach will include involvement from the relevant stakeholders (Constellation Systems, CCCI, EVA Project Office, Astronaut office, Mission Operations, Space Life Sciences, etc.) to develop operations concepts (and/or operations scenarios) from which a basic high-level set of requirements could be extracted. This set of requirements could serve as a foundation (along with stakeholder buy-in) that would help define the trade space and assist in identifying candidate technologies for further study and evolution to higher-level technology readiness levels.

  2. 78 FR 18415 - Connected Vehicle Reference Implementation Architecture Workshop; Notice of Public Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-26

    …DEPARTMENT OF TRANSPORTATION: Connected Vehicle Reference Implementation Architecture Workshop… Intelligent Transportation System Joint Program Office (ITS JPO) will host a free Connected Vehicle Reference… manufacturing, developing, deploying, operating, or maintaining the connected…

  3. Brain architecture: a design for natural computation.

    PubMed

    Kaiser, Marcus

    2007-12-15

    Fifty years ago, John von Neumann compared the architecture of the brain with that of the computers he invented and which are still in use today. In those days, the organization of computers was based on concepts of brain organization. Here, we give an update on current results on the global organization of neural systems. For neural systems, we outline how the spatial and topological architecture of neuronal and cortical networks facilitates robustness against failures, fast processing and balanced network activation. Finally, we discuss mechanisms of self-organization for such architectures. After all, the organization of the brain might again inspire computer architecture.

  4. A new software-based architecture for quantum computer

    NASA Astrophysics Data System (ADS)

    Wu, Nan; Song, FangMin; Li, Xiangdong

    2010-04-01

    In this paper, we study a reliable architecture for a quantum computer and a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of quantum computing. We also try to address some key issues in detail in software-driven universal quantum computers.

  5. Unaligned instruction relocation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bertolli, Carlo; O'Brien, John K.; Sallenave, Olivier H.

    In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.

  6. Unaligned instruction relocation

    DOEpatents

    Bertolli, Carlo; O'Brien, John K.; Sallenave, Olivier H.; Sura, Zehra N.

    2018-01-23

    In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
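
    As a generic illustration of what a linking pass does with a relocation target (independent of the aligned/unaligned two-pass scheme claimed here), the sketch below patches placeholder bytes in assembled code with resolved symbol addresses; the symbol table, offsets and addresses are made-up examples.

      # Generic sketch of what a linking pass does with a relocation target: the assembler
      # leaves a placeholder at a recorded offset, and the linker patches in the resolved
      # address of the referenced symbol. The symbol table, offsets and addresses are made-up
      # examples; this is not the patented two-pass aligned/unaligned scheme itself.
      import struct

      code = bytearray(b"\x90" * 16)                              # assembled code with placeholders
      relocations = [(4, "accel_kernel"), (12, "host_buffer")]    # (offset, symbol)
      symbols = {"accel_kernel": 0x1000, "host_buffer": 0x2040}   # resolved addresses

      def link_pass(code: bytearray, relocations, symbols) -> None:
          """Patch each relocation target with the 32-bit little-endian resolved address."""
          for offset, name in relocations:
              struct.pack_into("<I", code, offset, symbols[name])

      link_pass(code, relocations, symbols)
      print(code.hex())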

  7. Development Of A Three-Dimensional Circuit Integration Technology And Computer Architecture

    NASA Astrophysics Data System (ADS)

    Etchells, R. D.; Grinberg, J.; Nudd, G. R.

    1981-12-01

    This paper is the first of a series 1,2,3 describing a range of efforts at Hughes Research Laboratories, which are collectively referred to as "Three-Dimensional Microelectronics." The technology being developed is a combination of a unique circuit fabrication/packaging technology and a novel processing architecture. The packaging technology greatly reduces the parasitic impedances associated with signal-routing in complex VLSI structures, while simultaneously allowing circuit densities orders of magnitude higher than the current state-of-the-art. When combined with the 3-D processor architecture, the resulting machine exhibits a one- to two-order of magnitude simultaneous improvement over current state-of-the-art machines in the three areas of processing speed, power consumption, and physical volume. The 3-D architecture is essentially that commonly referred to as a "cellular array", with the ultimate implementation having as many as 512 x 512 processors working in parallel. The three-dimensional nature of the assembled machine arises from the fact that the chips containing the active circuitry of the processor are stacked on top of each other. In this structure, electrical signals are passed vertically through the chips via thermomigrated aluminum feedthroughs. Signals are passed between adjacent chips by micro-interconnects. This discussion presents a broad view of the total effort, as well as a more detailed treatment of the fabrication and packaging technologies themselves. The results of performance simulations of the completed 3-D processor executing a variety of algorithms are also presented. Of particular pertinence to the interests of the focal-plane array community is the simulation of the UNICORNS nonuniformity correction algorithms as executed by the 3-D architecture.

  8. Low-power hardware implementation of movement decoding for brain computer interface with reduced-resolution discrete cosine transform.

    PubMed

    Minho Won; Albalawi, Hassan; Xin Li; Thomas, Donald E

    2014-01-01

    This paper describes a low-power hardware implementation for movement decoding of brain computer interface. Our proposed hardware design is facilitated by two novel ideas: (i) an efficient feature extraction method based on reduced-resolution discrete cosine transform (DCT), and (ii) a new hardware architecture of dual look-up table to perform discrete cosine transform without explicit multiplication. The proposed hardware implementation has been validated for movement decoding of electrocorticography (ECoG) signal by using a Xilinx FPGA Zynq-7000 board. It achieves more than 56× energy reduction over a reference design using band-pass filters for feature extraction.
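
    The idea of computing a reduced-resolution DCT without explicit multiplication can be sketched in software: quantized samples index pre-computed product tables, so each coefficient is obtained by lookups and additions only. The table layout and sizes below are illustrative and do not reproduce the paper's dual look-up-table hardware.

      # Sketch of multiplication-free, reduced-resolution DCT feature extraction: inputs are
      # quantized to 8 bits, all products value*cos(...) are pre-computed into lookup tables,
      # and only the first few DCT coefficients are kept as features. Table layout and sizes
      # are illustrative, not the dual look-up-table hardware design of the paper.
      import numpy as np

      N, K = 16, 4                     # window length, number of retained (low) coefficients
      basis = np.cos(np.pi / N * (np.arange(N) + 0.5)[None, :] * np.arange(K)[:, None])

      # Pre-computed tables: lut[k, n, v] = (v - 128) * basis[k, n] for every 8-bit sample value v.
      values = np.arange(256) - 128
      lut = basis[:, :, None] * values[None, None, :]

      def dct_features(samples_u8: np.ndarray) -> np.ndarray:
          """Return K reduced-resolution DCT coefficients using only lookups and additions."""
          idx = samples_u8.astype(int)            # quantized samples index the tables
          return np.array([lut[k, np.arange(N), idx].sum() for k in range(K)])

      x = np.random.randint(0, 256, size=N).astype(np.uint8)
      ref = basis @ (x.astype(float) - 128)       # direct (multiplication-based) DCT for comparison
      assert np.allclose(dct_features(x), ref)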

  9. Architectures for single-chip image computing

    NASA Astrophysics Data System (ADS)

    Gove, Robert J.

    1992-04-01

    This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new-generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.

  10. A modular architecture for transparent computation in recurrent neural networks.

    PubMed

    Carmantini, Giovanni S; Beim Graben, Peter; Desroches, Mathieu; Rodrigues, Serafim

    2017-01-01

    Computation is classically studied in terms of automata, formal languages and algorithms; yet, the relation between neural dynamics and symbolic representations and operations is still unclear in traditional eliminative connectionism. Therefore, we suggest a unique perspective on this central issue, to which we would like to refer as transparent connectionism, by proposing accounts of how symbolic computation can be implemented in neural substrates. In this study we first introduce a new model of dynamics on a symbolic space, the versatile shift, showing that it supports the real-time simulation of a range of automata. We then show that the Gödelization of versatile shifts defines nonlinear dynamical automata, dynamical systems evolving on a vectorial space. Finally, we present a mapping between nonlinear dynamical automata and recurrent artificial neural networks. The mapping defines an architecture characterized by its granular modularity, where data, symbolic operations and their control are not only distinguishable in activation space, but also spatially localizable in the network itself, while maintaining a distributed encoding of symbolic representations. The resulting networks simulate automata in real-time and are programmed directly, in the absence of network training. To discuss the unique characteristics of the architecture and their consequences, we present two examples: (i) the design of a Central Pattern Generator from a finite-state locomotive controller, and (ii) the creation of a network simulating a system of interactive automata that supports the parsing of garden-path sentences as investigated in psycholinguistics experiments. Copyright © 2016 Elsevier Ltd. All rights reserved.
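
    The Gödelization step alone can be shown with a small worked example, in which a dotted symbol sequence is mapped to a point of the unit square via base-N expansions; the alphabet and sequences are arbitrary, and the full versatile-shift and network construction is not reproduced.

      # Worked sketch of the Goedelization step only: a dotted symbol sequence (the part left
      # of the dot reversed, and the part right of the dot) is mapped to a point in the unit
      # square via base-N expansions. The alphabet and sequences are arbitrary examples; the
      # full versatile-shift / nonlinear-dynamical-automaton construction is not reproduced.
      ALPHABET = {"a": 0, "b": 1, "c": 2}
      N = len(ALPHABET)

      def godel(symbols: str) -> float:
          """Map a one-sided symbol sequence to a number in [0, 1) by a base-N expansion."""
          return sum(ALPHABET[s] * N ** -(i + 1) for i, s in enumerate(symbols))

      def encode(left: str, right: str) -> tuple:
          """Encode a dotted sequence  left . right  as a point (x, y) in the unit square."""
          return godel(left[::-1]), godel(right)

      # Shifting the dot one symbol to the right acts as a (piecewise) affine map on (x, y).
      x, y = encode("ab", "cab")
      x_shift, y_shift = encode("abc", "ab")
      print((x, y), (x_shift, y_shift))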

  11. Transitioning ISR architecture into the cloud

    NASA Astrophysics Data System (ADS)

    Lash, Thomas D.

    2012-06-01

    Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.

  12. Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming

    NASA Technical Reports Server (NTRS)

    Dorband, John E.; Aburdene, Maurice F.

    2002-01-01

    Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.

  13. Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.

    PubMed

    Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein

    2015-12-01

    Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies might include smartphones and wearable devices, which have attracted the attention of medical applications. These products are used less in critical medical applications because of their resource constraints and failure sensitivity. This is because, without safety considerations, small integrated hardware can endanger patients' lives. Therefore, proposing some principles is required to construct wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to the cloud. The tiers of this architecture include wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high possible fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.

  14. Accelerating Cogent Confabulation: An Exploration in the Architecture Design Space

    DTIC Science & Technology

    2008-06-01

    …spiking neural networks is proposed in reference [8]. Reference [9] investigates the architecture design of a Brain-state-in-a-box model. The…

  15. Advanced computer architecture specification for automated weld systems

    NASA Technical Reports Server (NTRS)

    Katsinis, Constantine

    1994-01-01

    This report describes the requirements for an advanced automated weld system and the associated computer architecture, and defines the overall system specification from a broad perspective. According to the requirements of welding procedures as they relate to an integrated multiaxis motion control and sensor architecture, the computer system requirements are developed based on a proven multiple-processor architecture with an expandable, distributed-memory, single global bus architecture, containing individual processors which are assigned to specific tasks that support sensor or control processes. The specified architecture is sufficiently flexible to integrate previously developed equipment, be upgradable and allow on-site modifications.

  16. Quantum Computing Architectural Design

    NASA Astrophysics Data System (ADS)

    West, Jacob; Simms, Geoffrey; Gyure, Mark

    2006-03-01

    Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.

  17. Quantum neural network-based EEG filtering for a brain-computer interface.

    PubMed

    Gandhi, Vaibhav; Prasad, Girijesh; Coyle, Damien; Behera, Laxmidhar; McGinnity, Thomas Martin

    2014-02-01

    A novel neural information processing architecture inspired by quantum mechanics and incorporating the well-known Schrodinger wave equation is proposed in this paper. The proposed architecture referred to as recurrent quantum neural network (RQNN) can characterize a nonstationary stochastic signal as time-varying wave packets. A robust unsupervised learning algorithm enables the RQNN to effectively capture the statistical behavior of the input signal and facilitates the estimation of signal embedded in noise with unknown characteristics. The results from a number of benchmark tests show that simple signals such as dc, staircase dc, and sinusoidal signals embedded within high noise can be accurately filtered and particle swarm optimization can be employed to select model parameters. The RQNN filtering procedure is applied in a two-class motor imagery-based brain-computer interface where the objective was to filter electroencephalogram (EEG) signals before feature extraction and classification to increase signal separability. A two-step inner-outer fivefold cross-validation approach is utilized to select the algorithm parameters subject-specifically for nine subjects. It is shown that the subject-specific RQNN EEG filtering significantly improves brain-computer interface performance compared to using only the raw EEG or Savitzky-Golay filtered EEG across multiple sessions.

  18. Hypercluster Parallel Processor

    NASA Technical Reports Server (NTRS)

    Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela

    1992-01-01

    Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.

  19. Three-Dimensional Terahertz Coded-Aperture Imaging Based on Single Input Multiple Output Technology.

    PubMed

    Chen, Shuo; Luo, Chenggao; Deng, Bin; Wang, Hongqiang; Cheng, Yongqiang; Zhuang, Zhaowen

    2018-01-19

    As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. In this paper, we propose a three-dimensional (3D) TCAI architecture based on single input multiple output (SIMO) technology, which can reduce the coding and sampling times sharply. The coded aperture applied in the proposed TCAI architecture loads either purposive or random phase modulation factor. In the transmitting process, the purposive phase modulation factor drives the terahertz beam to scan the divided 3D imaging cells. In the receiving process, the random phase modulation factor is adopted to modulate the terahertz wave to be spatiotemporally independent for high resolution. Considering human-scale targets, images of each 3D imaging cell are reconstructed one by one to decompose the global computational complexity, and then are synthesized together to obtain the complete high-resolution image. As for each imaging cell, the multi-resolution imaging method helps to reduce the computational burden on a large-scale reference-signal matrix. The experimental results demonstrate that the proposed architecture can achieve high-resolution imaging with much less time for 3D targets and has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
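
    The computational burden the multi-resolution scheme addresses can be seen in a generic sketch of the per-cell imaging model, in which the received signal is the reference-signal matrix applied to the scattering coefficients and each cell is reconstructed independently; the matrix sizes and random data below are placeholders, not the paper's SIMO signal model.

      # Sketch of the generic coded-aperture imaging model behind the computational-burden
      # argument: for one imaging cell, the received signal is y = S x + noise, where S is the
      # reference-signal matrix built from the (spatiotemporally independent) coded signals,
      # and x holds the scattering coefficients. The sizes and random matrices are placeholders;
      # the paper's SIMO signal model and multi-resolution scheme are not reproduced.
      import numpy as np

      rng = np.random.default_rng(0)
      n_samples, n_cells = 400, 100                  # measurements vs. grid cells in one imaging cell

      S = rng.standard_normal((n_samples, n_cells)) + 1j * rng.standard_normal((n_samples, n_cells))
      x_true = np.zeros(n_cells, dtype=complex)
      x_true[[7, 42, 73]] = [1.0, 0.6, 0.3]          # a few scatterers
      y = S @ x_true + 0.01 * rng.standard_normal(n_samples)

      # Reconstruct this cell independently (the per-cell decomposition keeps S small),
      # here with a plain least-squares solve.
      x_hat, *_ = np.linalg.lstsq(S, y, rcond=None)
      print(np.abs(x_hat).argsort()[-3:])            # indices of the strongest recovered scatterers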

  20. Reconciliation of the cloud computing model with US federal electronic health record regulations

    PubMed Central

    2011-01-01

    Cloud computing refers to subscription-based, fee-for-service utilization of computer hardware and software over the Internet. The model is gaining acceptance for business information technology (IT) applications because it allows capacity and functionality to increase on the fly without major investment in infrastructure, personnel or licensing fees. Large IT investments can be converted to a series of smaller operating expenses. Cloud architectures could potentially be superior to traditional electronic health record (EHR) designs in terms of economy, efficiency and utility. A central issue for EHR developers in the US is that these systems are constrained by federal regulatory legislation and oversight. These laws focus on security and privacy, which are well-recognized challenges for cloud computing systems in general. EHRs built with the cloud computing model can achieve acceptable privacy and security through business associate contracts with cloud providers that specify compliance requirements, performance metrics and liability sharing. PMID:21727204

  1. Reconciliation of the cloud computing model with US federal electronic health record regulations.

    PubMed

    Schweitzer, Eugene J

    2012-01-01

    Cloud computing refers to subscription-based, fee-for-service utilization of computer hardware and software over the Internet. The model is gaining acceptance for business information technology (IT) applications because it allows capacity and functionality to increase on the fly without major investment in infrastructure, personnel or licensing fees. Large IT investments can be converted to a series of smaller operating expenses. Cloud architectures could potentially be superior to traditional electronic health record (EHR) designs in terms of economy, efficiency and utility. A central issue for EHR developers in the US is that these systems are constrained by federal regulatory legislation and oversight. These laws focus on security and privacy, which are well-recognized challenges for cloud computing systems in general. EHRs built with the cloud computing model can achieve acceptable privacy and security through business associate contracts with cloud providers that specify compliance requirements, performance metrics and liability sharing.

  2. Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs

    NASA Technical Reports Server (NTRS)

    Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan

    2006-01-01

    Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefront sensors, such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as the James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures requires numerous two-dimensional Fourier transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented, with an emphasis on a 64-node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and computationally complex scientific problems.
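
    The reason the two-dimensional Fourier transform forces an all-to-all exchange can be sketched with a serial NumPy stand-in: row transforms are node-local, the transpose corresponds to the all-to-all redistribution, and the remaining transforms are again node-local.

      # Sketch of why a distributed 2-D FFT needs an all-to-all exchange: compute 1-D FFTs
      # along rows (each node holds whole rows), transpose the matrix (on a cluster this is
      # the all-to-all communication), then compute 1-D FFTs along the new rows. NumPy here
      # stands in for the DSP/FPGA nodes; no actual message passing is shown.
      import numpy as np

      image = np.random.rand(256, 256)

      step1 = np.fft.fft(image, axis=1)        # row FFTs, node-local
      step2 = step1.T                          # transpose = all-to-all redistribution
      step3 = np.fft.fft(step2, axis=1)        # FFTs along the (former) columns
      result = step3.T                         # transpose back to the original layout

      assert np.allclose(result, np.fft.fft2(image))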

  3. Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS

    NASA Astrophysics Data System (ADS)

    Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.

    2018-03-01

    Simulation of breaking waves using the Navier-Stokes equations via the moving particle semi-implicit (MPS) method over a closed domain is given. The results show that parallel computing on a multicore architecture using the OpenMP platform can reduce the computational time to almost half of the serial time. Here, a comparison of two computer architectures (AMD and Intel) is performed. The results show that the Intel architecture performs better than the AMD architecture in CPU time. However, in efficiency, the computer with the AMD architecture is slightly higher than the Intel one. For the simulation with 1512 particles, the CPU times using Intel and AMD are 12662.47 and 28282.30, respectively. Moreover, for a similar number of particles, AMD obtains an efficiency of 50.09% and Intel up to 49.42%.
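
    The speedup and efficiency figures quoted above follow from two standard formulas, sketched below with a made-up serial time and thread count (neither is stated in the record), so only the formulas themselves are illustrated.

      # Minimal sketch of the metrics used in the comparison: speedup is serial time over
      # parallel time, and parallel efficiency is speedup divided by the number of OpenMP
      # threads. The times and thread count below are made-up placeholders; only the
      # formulas, not the paper's measurements, are illustrated.
      def speedup(t_serial: float, t_parallel: float) -> float:
          return t_serial / t_parallel

      def efficiency(t_serial: float, t_parallel: float, n_threads: int) -> float:
          return speedup(t_serial, t_parallel) / n_threads

      t_serial, t_parallel, n_threads = 100.0, 52.0, 4   # hypothetical values
      print(f"speedup    = {speedup(t_serial, t_parallel):.2f}")
      print(f"efficiency = {efficiency(t_serial, t_parallel, n_threads):.2%}")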

  4. The EPOS ICT Architecture

    NASA Astrophysics Data System (ADS)

    Jeffery, Keith; Harrison, Matt; Bailo, Daniele

    2016-04-01

    The EPOS-PP Project 2010-2014 proposed an architecture and demonstrated feasibility with a prototype. Requirements based on use cases were collected and an inventory of assets (e.g. datasets, software, users, computing resources, equipment/detectors, laboratory services) (RIDE) was developed. The architecture evolved through three stages of refinement with much consultation both with the EPOS community representing EPOS users and participants in geoscience and with the overall ICT community, especially those working on research such as the RDA (Research Data Alliance) community. The architecture comprises a central ICS (Integrated Core Services) consisting of a portal and catalog, the latter providing to end-users a 'map' of all EPOS resources (datasets, software, users, computing, equipment/detectors etc.). ICS is extended to ICS-d (distributed ICS) for certain services (such as visualisation software services or Cloud computing resources) and CES (Computational Earth Science) for specific simulation or analytical processing. ICS also communicates with TCS (Thematic Core Services), which represent European-wide portals to national and local assets, resources and services in the various specific domains (e.g. seismology, volcanology, geodesy) of EPOS. The EPOS-IP project 2015-2019 started in October 2015. Two work-packages cover the ICT aspects; WP6 involves interaction with the TCS while WP7 concentrates on ICS, including interoperation with ICS-d and CES offerings: in short, the ICT architecture. Based on the experience and results of EPOS-PP, the ICT team held a pre-meeting in July 2015 and set out a project plan. The first major activity involved requirements (re-)collection with use cases and also updating the inventory of assets held by the various TCS in EPOS. The RIDE database of assets is currently being converted to CERIF (Common European Research Information Format - an EU Recommendation to Member States) to provide the basis for the EPOS-IP ICS Catalog. In parallel, the ICT team is tracking developments in ICT for relevance to EPOS-IP. In particular, the potential utilisation of e-Is (e-Infrastructures) such as GEANT (network), AARC (security), EGI (GRID computing), EUDAT (data curation), PRACE (High Performance Computing), and HELIX-Nebula / Open Science Cloud (Cloud computing) is being assessed. Similarly, relationships to other e-RIs (e-Research Infrastructures) such as ENVRI+, EXCELERATE and other ESFRI (European Strategic Forum for Research Infrastructures) projects are being developed to share experience and technology and to promote interoperability. EPOS ICT team members are also involved in VRE4EIC, a project developing a reference architecture and component software services for a Virtual Research Environment to be superimposed on EPOS-ICS. The challenge being tackled now is therefore to keep consistency and interoperability among the different modules, initiatives and actors which participate in the process of running the EPOS platform. This implies both continuous updates on the IT aspects of the mentioned initiatives and a refinement of the e-architecture designed so far. One major aspect of EPOS-IP is the ICT support for legal, financial and governance aspects of the EPOS ERIC to be initiated during EPOS-IP. This implies a sophisticated AAAI (Authentication, Authorization, Accounting Infrastructure) with consistency throughout the software, communications and data stack.

  5. Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands

    PubMed Central

    Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

    2016-01-01

    Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to perform several tests evaluating the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate whether larger networks can increase sEMG classification accuracy too. PMID:27656140
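
    As an editorial illustration of the kind of network the abstract describes, the sketch below defines a deliberately simple 1-D convolutional classifier for windowed sEMG in PyTorch. The layer sizes, the 10-channel input and the 400-sample window are illustrative assumptions; only the 50-class output follows the abstract, and none of this reproduces the authors' published configuration.

    # Hedged sketch: a deliberately simple 1-D CNN for windowed sEMG classification.
    # Channel count, window length and layer sizes are illustrative assumptions.
    import torch
    import torch.nn as nn

    class SimpleEMGNet(nn.Module):
        def __init__(self, channels=10, n_classes=50):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(channels, 32, kernel_size=5, padding=2),  # temporal convolution
                nn.ReLU(),
                nn.MaxPool1d(4),
                nn.Conv1d(32, 64, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),                            # collapse the time axis
            )
            self.classifier = nn.Linear(64, n_classes)

        def forward(self, x):                # x: (batch, channels, time)
            h = self.features(x).squeeze(-1)
            return self.classifier(h)

    if __name__ == "__main__":
        model = SimpleEMGNet()
        window = torch.randn(8, 10, 400)     # 8 windows of 400 sEMG samples
        print(model(window).shape)           # torch.Size([8, 50])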

  6. Deep Learning with Convolutional Neural Networks Applied to Electromyography Data: A Resource for the Classification of Movements for Prosthetic Hands.

    PubMed

    Atzori, Manfredo; Cognolato, Matteo; Müller, Henning

    2016-01-01

    Natural control methods based on surface electromyography (sEMG) and pattern recognition are promising for hand prosthetics. However, the control robustness offered by scientific research is still not sufficient for many real life applications, and commercial prostheses are capable of offering natural control for only a few movements. In recent years deep learning revolutionized several fields of machine learning, including computer vision and speech recognition. Our objective is to test its methods for natural control of robotic hands via sEMG using a large number of intact subjects and amputees. We tested convolutional networks for the classification of an average of 50 hand movements in 67 intact subjects and 11 transradial amputees. The simple architecture of the neural network allowed us to perform several tests evaluating the effect of pre-processing, layer architecture, data augmentation and optimization. The classification results are compared with a set of classical classification methods applied on the same datasets. The classification accuracy obtained with convolutional neural networks using the proposed architecture is higher than the average results obtained with the classical classification methods, but lower than the results obtained with the best reference methods in our tests. The results show that convolutional neural networks with a very simple architecture can produce accurate results comparable to the average classical classification methods. They show that several factors (including pre-processing, the architecture of the net and the optimization parameters) can be fundamental for the analysis of sEMG data. Larger networks can achieve higher accuracy on computer vision and object recognition tasks. This fact suggests that it may be interesting to evaluate whether larger networks can increase sEMG classification accuracy too.

  7. Monitoring WLCG with lambda-architecture: a new scalable data store and analytics platform for monitoring at petabyte scale.

    NASA Astrophysics Data System (ADS)

    Magnoni, L.; Suthakar, U.; Cordeiro, C.; Georgiou, M.; Andreeva, J.; Khan, A.; Smith, D. R.

    2015-12-01

    Monitoring the WLCG infrastructure requires the gathering and analysis of a high volume of heterogeneous data (e.g. data transfers, job monitoring, site tests) coming from different services and experiment-specific frameworks to provide a uniform and flexible interface for scientists and sites. The current architecture, where relational database systems are used to store, to process and to serve monitoring data, has limitations in coping with the foreseen increase in the volume (e.g. higher LHC luminosity) and the variety (e.g. new data-transfer protocols and new resource types, such as cloud computing) of WLCG monitoring events. This paper presents a new scalable data store and analytics platform designed by the Support for Distributed Computing (SDC) group, at the CERN IT department, which uses a variety of technologies each targeting specific aspects of large-scale distributed data processing (commonly referred to as the lambda-architecture approach). Results of data processing on Hadoop for WLCG data activities monitoring are presented, showing how the new architecture can easily analyze hundreds of millions of transfer logs in a few minutes. Moreover, a comparison of data partitioning, compression and file formats (e.g. CSV, Avro) is presented, with particular attention given to how the file structure impacts the overall MapReduce performance. In conclusion, the evolution of the current implementation, which focuses on data storage and batch processing, towards a complete lambda architecture is discussed, with consideration of candidate technology for the serving layer (e.g. Elasticsearch) and a description of a proof-of-concept implementation, based on Apache Spark and Esper, for the real-time part, which compensates for batch-processing latency and automates the detection of problems and failures.
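
    The following sketch illustrates only the batch layer of such a lambda architecture: an Apache Spark job that aggregates transfer logs by site pair. The column names (src_site, dst_site, bytes, status), the CSV layout and the HDFS paths are hypothetical and do not reflect the schema used by the CERN monitoring team.

    # Hedged sketch of a batch-layer aggregation of transfer logs with Apache Spark.
    # Field names and paths are illustrative assumptions only.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("transfer-log-batch").getOrCreate()

    logs = (spark.read
            .option("header", True)
            .option("inferSchema", True)
            .csv("hdfs:///monitoring/transfers/*.csv"))   # hypothetical input path

    summary = (logs
               .groupBy("src_site", "dst_site")
               .agg(F.count("*").alias("transfers"),
                    F.sum("bytes").alias("bytes_moved"),
                    F.avg(F.when(F.col("status") == "OK", 1).otherwise(0)).alias("success_rate")))

    # Persist the pre-computed views that a serving layer could index.
    summary.write.mode("overwrite").parquet("hdfs:///monitoring/summary/")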

  8. Design and Analysis of Architectures for Structural Health Monitoring Systems

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi; Sixto, S. L. (Technical Monitor)

    2002-01-01

    During the two-year project period, we have worked on several aspects of Health and Usage Monitoring Systems (HUMS) for structural health monitoring. In particular, we have made contributions in the following areas. 1. Reference HUMS architecture: We developed a high-level architecture for health and usage monitoring systems. The proposed reference architecture is compatible with the Generic Open Architecture (GOA) proposed as a standard for avionics systems. 2. HUMS kernel: One of the critical layers of the HUMS reference architecture is the HUMS kernel. We developed a detailed design of a kernel to implement the high-level architecture. 3. Prototype implementation of HUMS kernel: We have implemented a preliminary version of the HUMS kernel on a Unix platform. We have implemented both a centralized system version and a distributed version. 4. SCRAMNet and HUMS: SCRAMNet (Shared Common Random Access Memory Network) is a system that was found to be suitable for implementing HUMS. For this reason, we conducted a simulation study to determine its stability in handling the input data rates in HUMS. 5. Architectural specification.

  9. A synchronized computational architecture for generalized bilateral control of robot arms

    NASA Technical Reports Server (NTRS)

    Bejczy, Antal K.; Szakaly, Zoltan

    1987-01-01

    This paper describes a computational architecture for an interconnected high-speed distributed computing system for generalized bilateral control of robot arms. The key element of the architecture is the use of fully synchronized, interrupt-driven software. Since an objective of the development is to utilize the processing resources efficiently, the synchronization is done at the hardware level to reduce system software overhead. The architecture also achieves a balanced load on the communication channel. The paper also describes some architectural relations to trading or sharing manual and automatic control.

  10. Model-Based Engine Control Architecture with an Extended Kalman Filter

    NASA Technical Reports Server (NTRS)

    Csank, Jeffrey T.; Connolly, Joseph W.

    2016-01-01

    This paper discusses the design and implementation of an extended Kalman filter (EKF) for model-based engine control (MBEC). Previously proposed MBEC architectures feature an optimal tuner Kalman Filter (OTKF) to produce estimates of both unmeasured engine parameters and estimates for the health of the engine. The success of this approach relies on the accuracy of the linear model and the ability of the optimal tuner to update its tuner estimates based on only a few sensors. Advances in computer processing are making it possible to replace the piece-wise linear model, developed off-line, with an on-board nonlinear model running in real-time. This will reduce the estimation errors associated with the linearization process, and is typically referred to as an extended Kalman filter. The non-linear extended Kalman filter approach is applied to the Commercial Modular Aero-Propulsion System Simulation 40,000 (C-MAPSS40k) and compared to the previously proposed MBEC architecture. The results show that the EKF reduces the estimation error, especially during transient operation.
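
    A minimal sketch of the extended Kalman filter recursion the abstract refers to is given below. The functions f, h and the Jacobians F_jac, H_jac are placeholders for the on-board nonlinear engine model and its linearizations; they are not the C-MAPSS40k equations, and the small linear demo at the end exists only to exercise the code.

    # Hedged sketch of one EKF predict/update step for a generic nonlinear model
    # x_k = f(x_{k-1}, u_k) + w,  y_k = h(x_k) + v.  Not the MBEC/C-MAPSS40k model.
    import numpy as np

    def ekf_step(x, P, u, y, f, h, F_jac, H_jac, Q, R):
        # Predict: propagate the state through the nonlinear model.
        x_pred = f(x, u)
        F = F_jac(x, u)
        P_pred = F @ P @ F.T + Q
        # Update: correct the prediction with the sensor measurement y.
        H = H_jac(x_pred)
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        x_new = x_pred + K @ (y - h(x_pred))
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new

    if __name__ == "__main__":
        # Tiny demo with a linear model so the Jacobians are constant.
        A = np.array([[1.0, 0.1], [0.0, 1.0]])
        H = np.array([[1.0, 0.0]])
        x, P = np.zeros(2), np.eye(2)
        x, P = ekf_step(x, P, None, np.array([0.5]),
                        f=lambda x, u: A @ x, h=lambda x: H @ x,
                        F_jac=lambda x, u: A, H_jac=lambda x: H,
                        Q=0.01 * np.eye(2), R=np.array([[0.1]]))
        print(x)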

  11. Model-Based Engine Control Architecture with an Extended Kalman Filter

    NASA Technical Reports Server (NTRS)

    Csank, Jeffrey T.; Connolly, Joseph W.

    2016-01-01

    This paper discusses the design and implementation of an extended Kalman filter (EKF) for model-based engine control (MBEC). Previously proposed MBEC architectures feature an optimal tuner Kalman Filter (OTKF) to produce estimates of both unmeasured engine parameters and estimates for the health of the engine. The success of this approach relies on the accuracy of the linear model and the ability of the optimal tuner to update its tuner estimates based on only a few sensors. Advances in computer processing are making it possible to replace the piece-wise linear model, developed off-line, with an on-board nonlinear model running in real-time. This will reduce the estimation errors associated with the linearization process, and is typically referred to as an extended Kalman filter. The nonlinear extended Kalman filter approach is applied to the Commercial Modular Aero-Propulsion System Simulation 40,000 (C-MAPSS40k) and compared to the previously proposed MBEC architecture. The results show that the EKF reduces the estimation error, especially during transient operation.

  12. Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation

    NASA Technical Reports Server (NTRS)

    Stocker, John C.; Golomb, Andrew M.

    2011-01-01

    Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is at a disadvantage compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized, network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two-part model framework characterizes both the demand, using a probability distribution for each type of service request, and the enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
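
    A minimal sketch of a discrete event simulation in the same spirit is shown below: service requests arrive at random and are dispatched to the earliest-available server in a fixed pool. The exponential arrival and service rates and the pool size are illustrative assumptions, not the study's calibrated workload or resource model.

    # Hedged sketch of a tiny discrete event simulation of a fixed server pool.
    # Arrival/service rates and pool size are illustrative assumptions.
    import heapq
    import random

    def simulate(n_requests=10_000, n_servers=20, arrival_rate=5.0, service_rate=0.3):
        t, free_at, waits = 0.0, [0.0] * n_servers, []
        heapq.heapify(free_at)                      # earliest-available server first
        for _ in range(n_requests):
            t += random.expovariate(arrival_rate)   # next request arrival
            available = heapq.heappop(free_at)
            start = max(t, available)               # wait if all servers are busy
            waits.append(start - t)
            heapq.heappush(free_at, start + random.expovariate(service_rate))
        return sum(waits) / len(waits)

    if __name__ == "__main__":
        print(f"mean wait: {simulate():.2f} time units")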

  13. Improving TOGAF ADM 9.1 Migration Planning Phase by ITIL V3 Service Transition

    NASA Astrophysics Data System (ADS)

    Hanum Harani, Nisa; Akhmad Arman, Arry; Maulana Awangga, Rolly

    2018-04-01

    Planning modifications for business transformation that involve the use of technology requires a systematic transition and migration planning process, and planning the system migration activity is the most important part of it. The migration process includes complex elements such as business re-engineering, transition scheme mapping, data transformation, application development, and individual involvement with computers and trial interaction. TOGAF ADM is a framework and method for enterprise architecture implementation, and it provides guidance on architecture and migration planning. The planning includes an implementation solution, in this case an IT solution, but when the solution becomes IT operational planning, TOGAF cannot handle it. This paper presents a new framework model that details the transition process through the integration of TOGAF and ITIL. We evaluated our model in a field study at a private university.

  14. Architectural Specialization for Inter-Iteration Loop Dependence Patterns

    DTIC Science & Technology

    2015-10-01

    Architectural Specialization for Inter-Iteration Loop Dependence Patterns. Christopher Batten, Computer Systems Laboratory, School of Electrical and... [Only presentation fragments survive in this record: a "Trends in Computer Architecture" chart plotting transistors (thousands), frequency (MHz) and typical power (W) for the MIPS R2K, Intel P4 and DEC Alpha 21264, and a design-space chart of performance versus energy efficiency (tasks per Joule) contrasting simple processor designs, high-performance architectures and embedded architectures under a power constraint.]

  15. Manyscale Computing for Sensor Processing in Support of Space Situational Awareness

    NASA Astrophysics Data System (ADS)

    Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.

    2014-09-01

    Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
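
    As a rough software illustration of the mapping idea only (not the authors' manyscale framework), the sketch below partitions a flat data set into work atoms and schedules them on a pool of worker processes; the chunking heuristic, the per-atom kernel and the pool size are arbitrary assumptions.

    # Hedged sketch: partition data into "work atoms" and map them onto workers.
    # The chunking heuristic and kernel are illustrative assumptions.
    from concurrent.futures import ProcessPoolExecutor
    import math

    def process_atom(atom):
        # Placeholder per-atom kernel (e.g. one tile of a sensor frame).
        return sum(x * x for x in atom)

    def run_manyscale(data, n_workers=8, atoms_per_worker=4):
        # Size the atoms for the target hardware, then map them in parallel.
        n_atoms = n_workers * atoms_per_worker
        size = math.ceil(len(data) / n_atoms)
        atoms = [data[i:i + size] for i in range(0, len(data), size)]
        with ProcessPoolExecutor(max_workers=n_workers) as pool:
            return list(pool.map(process_atom, atoms))

    if __name__ == "__main__":
        print(sum(run_manyscale(list(range(100_000)))))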

  16. RRAM-based parallel computing architecture using k-nearest neighbor classification for pattern recognition

    NASA Astrophysics Data System (ADS)

    Jiang, Yuning; Kang, Jinfeng; Wang, Xinan

    2017-03-01

    Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
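
    The sketch below emulates the crossbar idea in ordinary software: the stored patterns form the columns of a matrix, so a single vector-matrix product returns the similarity to every stored example at once, after which k-NN voting is applied. The binarised random stand-in data and the dot-product similarity are illustrative assumptions, not the devices or encoding used in the paper.

    # Hedged software emulation of crossbar-style k-NN: one matrix-vector product
    # evaluates all stored patterns in parallel. Data and similarity are stand-ins.
    import numpy as np

    rng = np.random.default_rng(0)
    train_x = (rng.random((1000, 784)) > 0.8).astype(float)   # stand-in binary images
    train_y = rng.integers(0, 10, 1000)

    def knn_crossbar(query, k=5):
        similarities = train_x @ query           # one "read" of the whole array
        nearest = np.argsort(similarities)[-k:]  # k most similar stored patterns
        votes = np.bincount(train_y[nearest], minlength=10)
        return int(np.argmax(votes))

    if __name__ == "__main__":
        print(knn_crossbar((rng.random(784) > 0.8).astype(float)))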

  17. Cognitive Architectures and Human-Computer Interaction. Introduction to Special Issue.

    ERIC Educational Resources Information Center

    Gray, Wayne D.; Young, Richard M.; Kirschenbaum, Susan S.

    1997-01-01

    In this introduction to a special issue on cognitive architectures and human-computer interaction (HCI), editors and contributors provide a brief overview of cognitive architectures. The following four architectures represented by articles in this issue are: Soar; LICAI (linked model of comprehension-based action planning and instruction taking);…

  18. Biomimetic design processes in architecture: morphogenetic and evolutionary computational design.

    PubMed

    Menges, Achim

    2012-03-01

    Design computation has profound impact on architectural design methods. This paper explains how computational design enables the development of biomimetic design processes specific to architecture, and how they need to be significantly different from established biomimetic processes in engineering disciplines. The paper first explains the fundamental difference between computer-aided and computational design in architecture, as the understanding of this distinction is of critical importance for the research presented. Thereafter, the conceptual relation and possible transfer of principles from natural morphogenesis to design computation are introduced and the related developments of generative, feature-based, constraint-based, process-based and feedback-based computational design methods are presented. This morphogenetic design research is then related to exploratory evolutionary computation, followed by the presentation of two case studies focusing on the exemplary development of spatial envelope morphologies and urban block morphologies.

  19. Processing Cones: A Computational Structure for Image Analysis.

    DTIC Science & Technology

    1981-12-01

    A computational structure for image analysis applications, referred to as a processing cone, is described and sample algorithms are presented. A fundamental characteristic of the structure is its hierarchical organization into two-dimensional arrays of decreasing resolution. In this architecture, a prototypical function is defined on a local window of data and applied uniformly to all windows in a parallel manner. Three basic modes of processing are supported in the cone: reduction operations (upward processing), horizontal operations (processing at a single level) and projection operations (downward processing).
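
    A small sketch of the upward (reduction) mode of processing is given below: the same local window function is applied uniformly to every 2 x 2 block to build the next, lower-resolution level of the cone. The mean reduction and the power-of-two image size are illustrative choices.

    # Hedged sketch of upward (reduction) processing in a cone/pyramid structure.
    # Mean reduction over 2x2 blocks is an illustrative choice of window function.
    import numpy as np

    def build_cone(image, levels=4):
        cone = [image]
        for _ in range(levels - 1):
            a = cone[-1]
            reduced = a.reshape(a.shape[0] // 2, 2, a.shape[1] // 2, 2).mean(axis=(1, 3))
            cone.append(reduced)                 # next level, half the resolution
        return cone

    if __name__ == "__main__":
        for level in build_cone(np.random.rand(256, 256)):
            print(level.shape)                   # (256,256) (128,128) (64,64) (32,32)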

  20. Open Radio Communications Architecture Core Framework V1.1.0 Volume 1 Software Users Manual

    DTIC Science & Technology

    2005-02-01

    The software runs on a PC utilizing the KDE desktop that comes with Red Hat Linux; the default desktop for most Red Hat Linux installations is the GNOME desktop. The software is based on the Software Communications Architecture (SCA) v2.2 and was designed for a desktop computer running the Linux operating system (OS). It was developed in C++ and uses ACE/TAO for the CORBA middleware, Xerces for the XML parser, and Red Hat Linux for the operating system. The software is referred to as the Open Radio Communications Architecture Core Framework.

  1. A Lightweight Intelligent Virtual Cinematography System for Machinima Production

    DTIC Science & Technology

    2007-01-01

    A portmanteau of machine and cinema, machinima refers to the innovation of leveraging video game technology to greatly ease the creation of computer animation... The work addresses selecting camera angles to capture the action of an a priori unknown script as aesthetically appropriate cinema; there are a number of challenges therein...

  2. An International Strategy for Human Exploration of the Moon: The International Space Exploration Coordination Group (ISECG) Reference Architecture for Human Lunar Exploration

    NASA Technical Reports Server (NTRS)

    Laurini, Kathleen C.; Hufenbach, Bernhard; Junichiro, Kawaguchi; Piedboeuf, Jean-Claude; Schade, Britta; Lorenzoni, Andrea; Curtis, Jeremy; Hae-Dong, Kim

    2010-01-01

    The International Space Exploration Coordination Group (ISECG) was established in response to The Global Exploration Strategy: The Framework for Coordination developed by fourteen space agencies and released in May 2007. Several ISECG participating space agencies have been studying concepts for human exploration of the moon that allow individual and collective goals and objectives to be met. This 18 month study activity culminated with the development of the ISECG Reference Architecture for Human Lunar Exploration. The reference architecture is a series of elements delivered over time in a flexible and evolvable campaign. This paper will describe the reference architecture and how it will inform near-term and long-term programmatic planning within interested agencies. The reference architecture is intended to serve as a global point of departure conceptual architecture that enables individual agency investments in technology development and demonstration, International Space Station research and technology demonstration, terrestrial analog studies, and robotic precursor missions to contribute towards the eventual implementation of a human lunar exploration scenario which reflects the concepts and priorities established to date. It also serves to create opportunities for partnerships that will support evolution of this concept and its eventual realization. The ISECG Reference Architecture for Human Lunar Exploration (commonly referred to as the lunar gPoD) reflects the agency commitments to finding an effective balance between conducting important scientific investigations of and from the moon, as well as demonstrating and mastering the technologies and capabilities to send humans farther into the Solar System. The lunar gPoD begins with a robust robotic precursor phase that demonstrates technologies and capabilities considered important for the success of the campaign. Robotic missions will inform the human missions and buy down risks. Human exploration will start with a thorough scientific investigation of the polar region while allowing the ability to demonstrate and validate the systems needed to take humans on more ambitious lunar exploration excursions. The ISECG Reference Architecture for Human Lunar Exploration serves as a model for future cooperation and is documented in a summary report and a comprehensive document that also describes the collaborative international process that led to its development. ISECG plans to continue with architecture studies such as this to examine an open transportation architecture and other destinations, with expanded participation from ISECG agencies, as it works to inform international partnerships and advance the Global Exploration Strategy.

  3. Innovative architectures for dense multi-microprocessor computers

    NASA Technical Reports Server (NTRS)

    Donaldson, Thomas; Doty, Karl; Engle, Steven W.; Larson, Robert E.; O'Reilly, John G.

    1988-01-01

    The results of a Phase I Small Business Innovative Research (SBIR) project performed for the NASA Langley Computational Structural Mechanics Group are described. The project resulted in the identification of a family of chordal-ring interconnection architectures with excellent potential to serve as the basis for new multimicroprocessor (MMP) computers. The paper presents examples of how computational algorithms from structural mechanics can be efficiently implemented on the chordal-ring architecture.

  4. A computer architecture for intelligent machines

    NASA Technical Reports Server (NTRS)

    Lefebvre, D. R.; Saridis, G. N.

    1992-01-01

    The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.

  5. An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems

    NASA Technical Reports Server (NTRS)

    Follen, Gregory J.; Lytle, John K. (Technical Monitor)

    2002-01-01

    Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS Architecture including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms which has been the focus of NASA Ames. Additional features of the object oriented architecture that support MultiDisciplinary (MD) Coupling, computer aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.

  6. Deploying electromagnetic particle-in-cell (EM-PIC) codes on Xeon Phi accelerators boards

    NASA Astrophysics Data System (ADS)

    Fonseca, Ricardo

    2014-10-01

    The complexity of the phenomena involved in several relevant plasma physics scenarios, where highly nonlinear and kinetic processes dominate, makes purely theoretical descriptions impossible. Further understanding of these scenarios requires detailed numerical modeling, but fully relativistic particle-in-cell codes such as OSIRIS are computationally intensive. The quest towards Exaflop computer systems has led to the development of HPC systems based on add-on accelerator cards, such as GPGPUs and, more recently, the Xeon Phi accelerators that power the current number 1 system in the world. These cards, also referred to as the Intel Many Integrated Core Architecture (MIC), offer peak theoretical performances of >1 TFlop/s for general purpose calculations in a single board, and are receiving significant attention as an attractive alternative to CPUs for plasma modeling. In this work we report on our efforts towards the deployment of an EM-PIC code on a Xeon Phi architecture system. We will focus on the parallelization and vectorization strategies followed, and present a detailed evaluation of code performance in comparison with the CPU code.

  7. Distributed computing environments for future space control systems

    NASA Technical Reports Server (NTRS)

    Viallefont, Pierre

    1993-01-01

    The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.

  8. Electro-Optic Computing Architectures. Volume I

    DTIC Science & Technology

    1998-02-01

    The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures. Specifically, three multi-function interface modules were targeted for development: an Electro-Optic Interface (EOI), an Optical Interconnection Unit...

  9. A reference architecture for integrated EHR in Colombia.

    PubMed

    de la Cruz, Edgar; Lopez, Diego M; Uribe, Gustavo; Gonzalez, Carolina; Blobel, Bernd

    2011-01-01

    The implementation of national EHR infrastructures has to start with a detailed definition of the overall structure and behavior of the EHR system (the system architecture). Architectures have to be open, scalable, flexible, user accepted and user friendly, trustworthy, and based on standards including terminologies and ontologies. The GCM provides an architectural framework created with the purpose of analyzing any kind of system, including EHR system architectures. The objective of this paper is to propose a reference architecture for the implementation of an integrated EHR in Colombia, based on the current state of systems architectural models and EHR standards. The proposed EHR architecture defines a set of services (elements) and their interfaces to support the exchange of clinical documents, offering an open, scalable, flexible and semantically interoperable infrastructure. The architecture was tested in a pilot tele-consultation project in Colombia, where dental EHRs are exchanged.

  10. Image-Processing Software For A Hypercube Computer

    NASA Technical Reports Server (NTRS)

    Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.

    1992-01-01

    Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.

  11. Experimental Comparison of Two Quantum Computing Architectures

    DTIC Science & Technology

    2017-03-28

    Experimental comparison of two quantum computing architectures. Norbert M. Linke, Dmitri... ...the vast computing power a universal quantum computer could offer, several candidate systems are being explored. They have allowed experimental... ...existing systems and the role of architecture in quantum computer design. These will be crucial for the realization of more advanced future incarnations...

  12. Unified implementation of the reference architecture : concept of operations.

    DOT National Transportation Integrated Search

    2015-10-19

    This document describes the Concept of Operations (ConOps) for the Unified Implementation of the Reference Architecture, located in Southeast Michigan, which supports connected vehicle research and development. This ConOps describes the current state...

  13. A learnable parallel processing architecture towards unity of memory and computing

    NASA Astrophysics Data System (ADS)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve speed by 76.8% and reduce power dissipation by 60.3%, together with a 700-fold reduction in circuit area.

  14. A learnable parallel processing architecture towards unity of memory and computing.

    PubMed

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve speed by 76.8% and reduce power dissipation by 60.3%, together with a 700-fold reduction in circuit area.

  15. Implementation methodology for interoperable personal health devices with low-voltage low-power constraints.

    PubMed

    Martinez-Espronceda, Miguel; Martinez, Ignacio; Serrano, Luis; Led, Santiago; Trigo, Jesús Daniel; Marzo, Asier; Escayola, Javier; Garcia, José

    2011-05-01

    Traditionally, e-Health solutions were located at the point of care (PoC), while the new ubiquitous user-centered paradigm draws on standard-based personal health devices (PHDs). Such devices place strict constraints on computation and battery efficiency that encouraged the International Organization for Standardization/IEEE11073 (X73) standard for medical devices to evolve from X73PoC to X73PHD. In this context, low-voltage low-power (LV-LP) technologies meet the restrictions of X73PHD-compliant devices. Since X73PHD does not approach the software architecture, the accomplishment of an efficient design falls directly on the software developer. Therefore, computational and battery performance of such LV-LP-constrained devices can even be outperformed through an efficient X73PHD implementation design. In this context, this paper proposes a new methodology to implement X73PHD into microcontroller-based platforms with LV-LP constraints. Such implementation methodology has been developed through a patterns-based approach and applied to a number of X73PHD-compliant agents (including weighing scale, blood pressure monitor, and thermometer specializations) and microprocessor architectures (8, 16, and 32 bits) as a proof of concept. As a reference, the results obtained in the weighing scale guarantee all features of X73PHD running over a microcontroller architecture based on ARM7TDMI requiring only 168 B of RAM and 2546 B of flash memory.

  16. Human Exploration of Mars Design Reference Architecture 5.0

    NASA Technical Reports Server (NTRS)

    Drake, Bret G.

    2010-01-01

    This paper provides a summary of the Mars Design Reference Architecture 5.0 (DRA 5.0), which is the latest in a series of NASA Mars reference missions. It provides a vision of one potential approach to human Mars exploration. The reference architecture provides a common framework for future planning of systems concepts, technology development, and operational testing as well as Mars robotic missions, research that is conducted on the International Space Station, and future lunar exploration missions. This summary of the Mars DRA 5.0 provides an overview of the overall mission approach, surface strategy and exploration goals, as well as the key systems and challenges for the first three human missions to Mars.

  17. Using an object-based grid system to evaluate a newly developed EP approach to formulate SVMs as applied to the classification of organophosphate nerve agents

    NASA Astrophysics Data System (ADS)

    Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun

    2004-04-01

    This paper extends the classification approaches described in reference [1] in the following way: (1.) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming, (2.) conducting research experiments using a larger database of organophosphate nerve agents, and (3.) upgrading the architecture to an object-based grid system for evaluating the classification of EP derived SVMs. Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphates nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Finally, preliminary results using EP derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, distributed training time architectures are 50 times faster when compared to standard iterative training time methods.
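
    As a hedged illustration of the general idea of evolving SVM parameters with an evolutionary loop (not the authors' EP formulation, their sensor data or their grid deployment), the sketch below mutates (C, gamma) in log-space and keeps the fittest candidates by cross-validated accuracy on synthetic data.

    # Hedged sketch: a simple evolutionary loop tuning SVM hyperparameters.
    # The mutation scheme, data and population settings are illustrative assumptions.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, n_features=10, random_state=0)
    rng = np.random.default_rng(0)

    def fitness(params):
        C, gamma = np.exp(params)                 # work in log-space, keep values positive
        return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

    population = [rng.normal(0, 1, size=2) for _ in range(10)]
    for generation in range(15):
        offspring = [p + rng.normal(0, 0.3, size=2) for p in population]   # mutate parents
        scored = sorted(population + offspring, key=fitness, reverse=True)
        population = scored[:10]                  # keep the fittest candidates

    best = population[0]
    print("best C, gamma:", np.exp(best), "accuracy:", fitness(best))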

  18. Digital optical computers at the optoelectronic computing systems center

    NASA Technical Reports Server (NTRS)

    Jordan, Harry F.

    1991-01-01

    The Digital Optical Computing Program within the National Science Foundation Engineering Research Center for Opto-electronic Computing Systems has as its specific goal research on optical computing architectures suitable for use at the highest possible speeds. The program can be targeted toward exploiting the time domain because other programs in the Center are pursuing research on parallel optical systems, exploiting optical interconnection and optical devices and materials. Using a general purpose computing architecture as the focus, we are developing design techniques, tools and architecture for operation at the speed of light limit. Experimental work is being done with the somewhat low speed components currently available but with architectures which will scale up in speed as faster devices are developed. The design algorithms and tools developed for a general purpose, stored program computer are being applied to other systems such as optimally controlled optical communication networks.

  19. Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters

    PubMed Central

    Torres-Huitzil, Cesar

    2013-01-01

    Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires k² − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant-time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with a good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
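
    A software reference for the van Herk/Gil-Werman recurrence the hardware implements is sketched below: within each block of length k, a forward running maximum and a backward running maximum are combined so the cost per sample is independent of the kernel size. The NumPy formulation and the -inf tail padding are implementation conveniences, not the FPGA design.

    # Hedged sketch of the 1-D van Herk/Gil-Werman running maximum (window size k).
    # Tail padding with -inf handles lengths that are not a multiple of k.
    import numpy as np

    def running_max_hgw(x, k):
        n = len(x)
        pad = (-n) % k
        x = np.concatenate([x, np.full(pad, -np.inf)])
        blocks = x.reshape(-1, k)
        prefix = np.maximum.accumulate(blocks, axis=1).ravel()                       # per-block forward max
        suffix = np.maximum.accumulate(blocks[:, ::-1], axis=1)[:, ::-1].ravel()     # per-block backward max
        return np.maximum(suffix[:n - k + 1], prefix[k - 1:n])

    if __name__ == "__main__":
        print(running_max_hgw(np.array([1, 5, 2, 4, 3, 6]), 3))   # [5. 5. 4. 6.]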

  20. Cooperating reduction machines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kluge, W.E.

    1983-11-01

    This paper presents a concept and a system architecture for the concurrent execution of program expressions of a concrete reduction language based on lambda-expressions. If formulated appropriately, these expressions are well-suited for concurrent execution, following a demand-driven model of computation. In particular, recursive program expressions with nonlinear expansion may, at run time, recursively be partitioned into a hierarchy of independent subexpressions which can be reduced by a corresponding hierarchy of virtual reduction machines. This hierarchy unfolds and collapses dynamically, with virtual machines recursively assuming the role of masters that create and eventually terminate, or synchronize with, slaves. The paper also proposes a nonhierarchically organized system of reduction machines, each featuring a stack architecture, that effectively supports the allocation of virtual machines to the real machines of the system in compliance with their hierarchical order of creation and termination. 25 references.

  1. The diabolo classifier

    PubMed

    Schwenk

    1998-11-15

    We present a new classification architecture based on autoassociative neural networks that are used to learn discriminant models of each class. The proposed architecture has several interesting properties with respect to other model-based classifiers like nearest-neighbors or radial basis functions: it has a low computational complexity and uses a compact distributed representation of the models. The classifier is also well suited for the incorporation of a priori knowledge by means of a problem-specific distance measure. In particular, we will show that tangent distance (Simard, Le Cun, & Denker, 1993) can be used to achieve transformation invariance during learning and recognition. We demonstrate the application of this classifier to optical character recognition, where it has achieved state-of-the-art results on several reference databases. Relations to other models, in particular those based on principal component analysis, are also discussed.
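
    The sketch below conveys the general idea of such a classifier: one small autoassociative network per class, each trained to reconstruct only its own class, with classification by minimum reconstruction error. The scikit-learn MLPRegressor stand-in and the plain squared-error distance are simplifications; in particular, no tangent distance is used.

    # Hedged sketch: per-class autoassociative ("diabolo"-style) networks; a sample is
    # assigned to the class whose network reconstructs it best. Squared error only.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    class DiaboloClassifier:
        def __init__(self, bottleneck=16):
            self.bottleneck = bottleneck
            self.models = {}

        def fit(self, X, y):
            for label in np.unique(y):
                net = MLPRegressor(hidden_layer_sizes=(self.bottleneck,), max_iter=500)
                Xc = X[y == label]
                self.models[label] = net.fit(Xc, Xc)   # autoassociative: targets = inputs
            return self

        def predict(self, X):
            errors = {c: ((m.predict(X) - X) ** 2).sum(axis=1) for c, m in self.models.items()}
            labels = list(errors)
            return np.array(labels)[np.argmin(np.stack([errors[c] for c in labels]), axis=0)]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.random((200, 20))
        y = (X[:, 0] > 0.5).astype(int)                # toy two-class problem
        clf = DiaboloClassifier().fit(X, y)
        print("training accuracy:", (clf.predict(X) == y).mean())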

  2. Framework for a clinical information system.

    PubMed

    Van de Velde, R

    2000-01-01

    The current status of our work towards the design and implementation of a reference architecture for a Clinical Information System is presented. This architecture has been developed and implemented based on components following a strong underlying conceptual and technological model. Common Object Request Broker and n-tier technology featuring centralised and departmental clinical information systems as the back-end store for all clinical data are used. Servers located in the 'middle' tier apply the clinical (business) model and application rules to communicate with so-called 'thin client' workstations. The main characteristics are the focus on modelling and reuse of both data and business logic as there is a shift away from data and functional modelling towards object modelling. Scalability as well as adaptability to constantly changing requirements via component driven computing are the main reasons for that approach.

  3. Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis

    DTIC Science & Technology

    1998-02-01

    The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures. Specifically, three multi-function interface modules were targeted for development: an Electro-Optic Interface (EOI), an Optical Interconnection Unit...

  4. Reference architecture of application services for personal wellbeing information management.

    PubMed

    Tuomainen, Mika; Mykkänen, Juha

    2011-01-01

    Personal information management has been proposed as an important enabler for individual empowerment concerning citizens' wellbeing and health information. In the MyWellbeing project in Finland, a strictly citizen-driven concept of "Coper" and related architectural and functional guidelines have been specified. We present a reference architecture and a set of identified application services to support personal wellbeing information management. In addition, the related standards and developments are discussed.

  5. Reference Avionics Architecture for Lunar Surface Systems

    NASA Technical Reports Server (NTRS)

    Somervill, Kevin M.; Lapin, Jonathan C.; Schmidt, Oron L.

    2010-01-01

    Developing and delivering infrastructure capable of supporting long-term manned operations to the lunar surface has been a primary objective of the Constellation Program in the Exploration Systems Mission Directorate. Several concepts have been developed related to development and deployment lunar exploration vehicles and assets that provide critical functionality such as transportation, habitation, and communication, to name a few. Together, these systems perform complex safety-critical functions, largely dependent on avionics for control and behavior of system functions. These functions are implemented using interchangeable, modular avionics designed for lunar transit and lunar surface deployment. Systems are optimized towards reuse and commonality of form and interface and can be configured via software or component integration for special purpose applications. There are two core concepts in the reference avionics architecture described in this report. The first concept uses distributed, smart systems to manage complexity, simplify integration, and facilitate commonality. The second core concept is to employ extensive commonality between elements and subsystems. These two concepts are used in the context of developing reference designs for many lunar surface exploration vehicles and elements. These concepts are repeated constantly as architectural patterns in a conceptual architectural framework. This report describes the use of these architectural patterns in a reference avionics architecture for Lunar surface systems elements.

  6. An energy efficient and high speed architecture for convolution computing based on binary resistive random access memory

    NASA Astrophysics Data System (ADS)

    Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng

    2018-04-01

    In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process image data stored in the RRAM arrays. The proposed image-storage architecture offers better speed and device-consumption efficiency than the previous kernel-storage architecture. Furthermore, we improve the architecture for high-accuracy and low-power computing by utilizing binary storage and a series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel-storage approach, the newly proposed architecture shows excellent performance, including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) a speed boost of more than 67 times; 3) 71.4% energy saving.
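
    For reference, the sketch below computes in plain software what the array evaluates in parallel: a valid-mode 2-D convolution of a 28 x 28 image with ten 3 x 3 kernels. The sizes follow the abstract; the random data and the loop ordering are illustrative only.

    # Hedged software reference for the convolution the crossbar performs in parallel.
    # Sizes follow the abstract; data and loop structure are illustrative.
    import numpy as np

    def convolve_all(image, kernels):
        kh, kw = kernels.shape[1:]
        out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
        out = np.zeros((len(kernels), out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                window = image[i:i + kh, j:j + kw]
                out[:, i, j] = (kernels * window).sum(axis=(1, 2))  # all kernels at once
        return out

    if __name__ == "__main__":
        image = np.random.rand(28, 28)
        kernels = np.random.rand(10, 3, 3)
        print(convolve_all(image, kernels).shape)   # (10, 26, 26)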

  7. Parallel processing architecture for computing inverse differential kinematic equations of the PUMA arm

    NASA Technical Reports Server (NTRS)

    Hsia, T. C.; Lu, G. Z.; Han, W. H.

    1987-01-01

    In advanced robot control problems, on-line computation of the inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be implemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
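
    A small sketch of the underlying computation being parallelised is given below: solving J(q) dq = dx for the joint rates of a six-degree-of-freedom arm. The random Jacobian is a placeholder; a real controller would evaluate J from the PUMA kinematics, and the systolic-array mapping is not modelled here.

    # Hedged sketch of the inverse differential kinematic equation J(q) dq = dx.
    # The random Jacobian stands in for the PUMA-specific expressions.
    import numpy as np

    def joint_rates(jacobian, task_velocity):
        # Solve for dq without forming the inverse Jacobian explicitly.
        return np.linalg.solve(jacobian, task_velocity)

    if __name__ == "__main__":
        J = np.random.rand(6, 6)                              # placeholder 6x6 Jacobian
        dx = np.array([0.1, 0.0, -0.05, 0.0, 0.02, 0.0])      # desired Cartesian twist
        print(joint_rates(J, dx))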

  8. Pyramidal neurovision architecture for vision machines

    NASA Astrophysics Data System (ADS)

    Gupta, Madan M.; Knopf, George K.

    1993-08-01

    The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.

  9. Human Exploration of Mars Design Reference Architecture 5.0

    NASA Technical Reports Server (NTRS)

    Drake, Bret G.; Hoffman, Stephen J.; Beaty, David W.

    2009-01-01

    This paper provides a summary of the 2007 Mars Design Reference Architecture 5.0 (DRA 5.0), which is the latest in a series of NASA Mars reference missions. It provides a vision of one potential approach to human Mars exploration including how Constellation systems can be used. The reference architecture provides a common framework for future planning of systems concepts, technology development, and operational testing as well as Mars robotic missions, research that is conducted on the International Space Station, and future lunar exploration missions. This summary of the Mars DRA 5.0 provides an overview of the overall mission approach, surface strategy and exploration goals, as well as the key systems and challenges for the first three human missions to Mars.

  10. Collaborative Working Architecture for IoT-Based Applications.

    PubMed

    Mora, Higinio; Signes-Pont, María Teresa; Gil, David; Johnsson, Magnus

    2018-05-23

    The new sensing applications need enhanced computing capabilities to handle the requirements of complex and huge data processing. The Internet of Things (IoT) concept brings processing and communication features to devices. In addition, the Cloud Computing paradigm provides resources and infrastructures for performing the computations and outsourcing the work from the IoT devices. This scenario opens new opportunities for designing advanced IoT-based applications, however, there is still much research to be done to properly gear all the systems for working together. This work proposes a collaborative model and an architecture to take advantage of the available computing resources. The resulting architecture involves a novel network design with different levels which combines sensing and processing capabilities based on the Mobile Cloud Computing (MCC) paradigm. An experiment is included to demonstrate that this approach can be used in diverse real applications. The results show the flexibility of the architecture to perform complex computational tasks of advanced applications.

  11. Optimizing Engineering Tools Using Modern Ground Architectures

    DTIC Science & Technology

    2017-12-01

    ...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the... This thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of...

  12. Architecture independent environment for developing engineering software on MIMD computers

    NASA Technical Reports Server (NTRS)

    Valimohamed, Karim A.; Lopez, L. A.

    1990-01-01

    Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.

  13. Protocol Architecture Model Report

    NASA Technical Reports Server (NTRS)

    Dhas, Chris

    2000-01-01

    NASA's Glenn Research Center (GRC) defines and develops advanced technology for high priority national needs in communications technologies for application to aeronautics and space. GRC tasked Computer Networks and Software Inc. (CNS) to examine protocols and architectures for an In-Space Internet Node. CNS has developed a methodology for network reference models to support NASA's four mission areas: Earth Science, Space Science, Human Exploration and Development of Space (HEDS), and Aerospace Technology. This report applies the methodology to three space Internet-based communications scenarios for future missions. CNS has conceptualized, designed, and developed space Internet-based communications protocols and architectures for each of the independent scenarios. The scenarios are: Scenario 1: Unicast communications between a Low-Earth-Orbit (LEO) spacecraft in-space Internet node and a ground terminal Internet node via a Tracking and Data Relay Satellite (TDRS) transfer; Scenario 2: Unicast communications between a Low-Earth-Orbit (LEO) International Space Station and a ground terminal Internet node via a TDRS transfer; Scenario 3: Multicast Communications (or "Multicasting"), 1 Spacecraft to N Ground Receivers, N Ground Transmitters to 1 Ground Receiver via a Spacecraft.

  14. Trade-Off Analysis Report

    NASA Technical Reports Server (NTRS)

    Dhas, Chris

    2000-01-01

    NASA's Glenn Research Center (GRC) defines and develops advanced technology for high priority national needs in communications technologies for application to aeronautics and space. GRC tasked Computer Networks and Software Inc. (CNS) to examine protocols and architectures for an In-Space Internet Node. CNS has developed a methodology for network reference models to support NASA's four mission areas: Earth Science, Space Science, Human Exploration and Development of Space (HEDS), and Aerospace Technology. CNS previously developed a report which applied the methodology to three space Internet-based communications scenarios for future missions. CNS conceptualized, designed, and developed space Internet-based communications protocols and architectures for each of the independent scenarios. GRC selected for further analysis the scenario that involved unicast communications between a Low-Earth-Orbit (LEO) International Space Station (ISS) and a ground terminal Internet node via a Tracking and Data Relay Satellite (TDRS) transfer. This report contains a tradeoff analysis of the selected scenario. The analysis examines the performance characteristics of the various protocols and architectures. The tradeoff analysis incorporates the results of a CNS-developed analytical model that examined performance parameters.

  15. Computer vision camera with embedded FPGA processing

    NASA Astrophysics Data System (ADS)

    Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

    2000-03-01

    Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.

  16. State-of-the-art in Heterogeneous Computing

    DOE PAGES

    Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...

    2010-01-01

    Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, available software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.

  17. A computer architecture for intelligent machines

    NASA Technical Reports Server (NTRS)

    Lefebvre, D. R.; Saridis, G. N.

    1991-01-01

    The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.

  18. Theory of electronically phased coherent beam combination without a reference beam

    NASA Astrophysics Data System (ADS)

    Shay, Thomas M.

    2006-12-01

    The first theory is presented for two novel coherent beam combination architectures, the first electronic beam combination architectures that completely eliminate the need for a separate reference beam. Detailed theoretical models are developed and presented for the first time.

  19. Advanced computer architecture for large-scale real-time applications.

    DOT National Transportation Integrated Search

    1973-04-01

    Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...

  20. Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.

    ERIC Educational Resources Information Center

    Beltrametti, Monica; English, Will

    1994-01-01

    Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…

  1. ''Beauty of Wholeness and Beauty of Partiality.'' New Terms Defining the Concept of Beauty in Architecture in Terms of Sustainability and Computer Aided Design

    ERIC Educational Resources Information Center

    Farid, Ayman A.; Zaghloul, Weaam M.; Dewidar, Khaled M.

    2014-01-01

    The great shift in sustainability and computer aided design in the field of architecture caused a remarkable change in the architecture philosophy; new aspects of beauty and aesthetic values are being introduced, and traditional definitions for beauty cannot fully cover these aspects, which causes a gap between new architecture works criticism and…

  2. Programmable hardware for reconfigurable computing systems

    NASA Astrophysics Data System (ADS)

    Smith, Stephen

    1996-10-01

    In 1945 the work of J. von Neumann and H. Goldstein created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.

  3. Execution environment for intelligent real-time control systems

    NASA Technical Reports Server (NTRS)

    Sztipanovits, Janos

    1987-01-01

    Modern telerobot control technology requires the integration of symbolic and non-symbolic programming techniques, different models of parallel computations, and various programming paradigms. The Multigraph Architecture, which has been developed for the implementation of intelligent real-time control systems is described. The layered architecture includes specific computational models, integrated execution environment and various high-level tools. A special feature of the architecture is the tight coupling between the symbolic and non-symbolic computations. It supports not only a data interface, but also the integration of the control structures in a parallel computing environment.

  4. A reference web architecture and patterns for real-time visual analytics on large streaming data

    NASA Astrophysics Data System (ADS)

    Kandogan, Eser; Soroker, Danny; Rohall, Steven; Bak, Peter; van Ham, Frank; Lu, Jie; Ship, Harold-Jeffrey; Wang, Chun-Fu; Lai, Jennifer

    2013-12-01

    Monitoring and analysis of streaming data, such as social media, sensors, and news feeds, has become increasingly important for business and government. The volume and velocity of incoming data are key challenges. To effectively support monitoring and analysis, statistical and visual analytics techniques need to be seamlessly integrated; analytic techniques for a variety of data types (e.g., text, numerical) and scope (e.g., incremental, rolling-window, global) must be properly accommodated; interaction, collaboration, and coordination among several visualizations must be supported in an efficient manner; and the system should support the use of different analytics techniques in a pluggable manner. Especially in web-based environments, these requirements pose restrictions on the basic visual analytics architecture for streaming data. In this paper we report on our experience of building a reference web architecture for real-time visual analytics of streaming data, identify and discuss architectural patterns that address these challenges, and report on applying the reference architecture for real-time Twitter monitoring and analysis.

  5. Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy

    PubMed Central

    Hwang, Wen-Jyi; Cheng, Shih-Chang; Cheng, Chau-Jern

    2011-01-01

    This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipeline fashion to maximize throughput of the computation. Moreover, the number of hardware multipliers and dividers is minimized to reduce the hardware costs. The proposed architecture is used as a custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system. PMID:22163688
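
    The record describes a hardware pipeline; purely as an algorithmic illustration, the following NumPy/SciPy sketch shows one common software formulation of FFT/DCT-based least-squares phase unwrapping, in which the unwrapped phase is obtained by solving a Poisson equation built from wrapped phase differences. The function names and the DCT-based solver are assumptions for this sketch, not the paper's implementation.

      import numpy as np
      from scipy.fft import dctn, idctn

      def wrap(p):
          """Wrap values into (-pi, pi]."""
          return (p + np.pi) % (2.0 * np.pi) - np.pi

      def unwrap_lsq(psi):
          """Least-squares unwrapping of a 2-D wrapped phase map via a DCT Poisson solve."""
          M, N = psi.shape
          dx = np.zeros((M, N)); dy = np.zeros((M, N))
          dx[:, :-1] = wrap(np.diff(psi, axis=1))      # wrapped forward differences
          dy[:-1, :] = wrap(np.diff(psi, axis=0))
          rho = dx.copy(); rho[:, 1:] -= dx[:, :-1]    # divergence of the wrapped gradient
          rho += dy; rho[1:, :] -= dy[:-1, :]
          r = dctn(rho, type=2, norm='ortho')
          i = np.arange(M)[:, None]; j = np.arange(N)[None, :]
          denom = 2.0 * (np.cos(np.pi * i / M) + np.cos(np.pi * j / N) - 2.0)
          denom[0, 0] = 1.0                            # avoid 0/0 for the DC term
          phi = r / denom
          phi[0, 0] = 0.0                              # phase is recovered up to a constant
          return idctn(phi, type=2, norm='ortho')

      true_phase = np.linspace(0.0, 20.0, 128)[None, :] * np.ones((128, 1))
      recovered = unwrap_lsq(wrap(true_phase))
      print("deviation from a constant offset:", np.ptp(recovered - true_phase))  # should be small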

  6. Computer Architects.

    ERIC Educational Resources Information Center

    Betts, Janelle Lyon

    2001-01-01

    Describes a high school art assignment in which students utilize Appleworks or Claris Works to design their own house, after learning about architectural styles and how to use the computer program. States that the project develops student computer skills and increases student knowledge about architecture. (CMK)

  7. Theory of electronic phase locking of an optical array without a reference beam

    NASA Astrophysics Data System (ADS)

    Shay, Thomas M.

    2006-08-01

    The first theory is presented for two novel coherent beam combination architectures, the first electronic beam combination architectures that completely eliminate the need for a separate reference beam. Detailed theoretical models are developed and presented for the first time.

  8. Fast Multipole Methods for Three-Dimensional N-body Problems

    NASA Technical Reports Server (NTRS)

    Koumoutsakos, P.

    1995-01-01

    We are developing computational tools for the simulations of three-dimensional flows past bodies undergoing arbitrary motions. High resolution viscous vortex methods have been developed that allow for extended simulations of two-dimensional configurations such as vortex generators. Our objective is to extend this methodology to three dimensions and develop a robust computational scheme for the simulation of such flows. A fundamental issue in the use of vortex methods is the ability of employing efficiently large numbers of computational elements to resolve the large range of scales that exist in complex flows. The traditional cost of the method scales as O(N^2) as the N computational elements/particles induce velocities at each other, making the method unacceptable for simulations involving more than a few tens of thousands of particles. In the last decade fast methods have been developed that have operation counts of O(N log N) or O(N) (referred to as BH and GR respectively) depending on the details of the algorithm. These methods are based on the observation that the effect of a cluster of particles at a certain distance may be approximated by a finite series expansion. In order to exploit this observation we need to decompose the element population spatially into clusters of particles and build a hierarchy of clusters (a tree data structure) - smaller neighboring clusters combine to form a cluster of the next size up in the hierarchy and so on. This hierarchy of clusters allows one to determine efficiently when the approximation is valid. This algorithm is an N-body solver that appears in many fields of engineering and science. Some examples of its diverse use are in astrophysics, molecular dynamics, micro-magnetics, boundary element simulations of electromagnetic problems, and computer animation. More recently these N-body solvers have been implemented and applied in simulations involving vortex methods. Koumoutsakos and Leonard (1995) implemented the GR scheme in two dimensions for vector computer architectures allowing for simulations of bluff body flows using millions of particles. Winckelmans presented three-dimensional, viscous simulations of interacting vortex rings, using vortons and an implementation of a BH scheme for parallel computer architectures. Bhatt presented a vortex filament method to perform inviscid vortex ring interactions, with an alternative implementation of a BH scheme for a Connection Machine parallel computer architecture.
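
    As a toy illustration of the clustering idea sketched in this record, the Python code below builds a simple 2-D quadtree and approximates the influence of distant cells by their aggregate mass at the centre of mass (a monopole-only, Barnes-Hut-style opening criterion). It is an assumption-laden teaching sketch, not the GR or BH implementations cited above; the opening angle, the softening, and all names are invented for the example.

      import numpy as np

      class Cell:
          """A square quadtree cell: either empty, holding one body, or split into four children."""
          def __init__(self, centre, half):
              self.centre, self.half = centre, half
              self.mass, self.com = 0.0, np.zeros(2)
              self.children, self.body = None, None

      def insert(cell, pos, m):
          if cell.mass == 0.0 and cell.children is None:      # empty leaf: store the body
              cell.body, cell.mass, cell.com = (pos, m), m, pos.copy()
              return
          if cell.children is None:                           # occupied leaf: split it
              cell.children = [Cell(cell.centre + 0.5 * cell.half * np.array(o), 0.5 * cell.half)
                               for o in ((-1, -1), (-1, 1), (1, -1), (1, 1))]
              old_pos, old_m = cell.body
              cell.body = None
              _place(cell, old_pos, old_m)
          _place(cell, pos, m)
          cell.com = (cell.com * cell.mass + pos * m) / (cell.mass + m)
          cell.mass += m

      def _place(cell, pos, m):
          idx = 2 * int(pos[0] > cell.centre[0]) + int(pos[1] > cell.centre[1])
          insert(cell.children[idx], pos, m)

      def accel(cell, pos, theta=0.5, eps=1e-3):
          """Approximate acceleration at `pos`, opening only cells that look 'large' from there."""
          if cell.mass == 0.0 or (cell.body is not None and np.allclose(cell.body[0], pos)):
              return np.zeros(2)
          d = cell.com - pos
          r = np.linalg.norm(d) + eps
          if cell.children is None or (2.0 * cell.half) / r < theta:
              return cell.mass * d / r**3                     # far (or leaf) cell: use the aggregate
          return sum(accel(c, pos, theta, eps) for c in cell.children)

      rng = np.random.default_rng(0)
      bodies = rng.uniform(-1.0, 1.0, size=(2000, 2))
      root = Cell(np.zeros(2), 1.0)
      for b in bodies:
          insert(root, b, 1.0)
      print("approximate acceleration on body 0:", accel(root, bodies[0]))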

  9. Hierarchical Chunking of Sequential Memory on Neuromorphic Architecture with Reduced Synaptic Plasticity

    PubMed Central

    Li, Guoqi; Deng, Lei; Wang, Dong; Wang, Wei; Zeng, Fei; Zhang, Ziyang; Li, Huanglong; Song, Sen; Pei, Jing; Shi, Luping

    2016-01-01

    Chunking refers to a phenomenon whereby individuals group items together when performing a memory task to improve the performance of sequential memory. In this work, we build a bio-plausible hierarchical chunking of sequential memory (HCSM) model to explain why such improvement happens. We address this issue by linking hierarchical chunking with synaptic plasticity and neuromorphic engineering. We uncover that a chunking mechanism reduces the requirements of synaptic plasticity since it allows applying synapses with narrow dynamic range and low precision to perform a memory task. We validate a hardware version of the model through simulation, based on measured memristor behavior with narrow dynamic range in neuromorphic circuits, which reveals how chunking works and what role it plays in encoding sequential memory. Our work deepens the understanding of sequential memory and enables incorporating it for the investigation of the brain-inspired computing on neuromorphic architecture. PMID:28066223

  10. Distributed dynamic simulations of networked control and building performance applications.

    PubMed

    Yahiaoui, Azzedine

    2018-02-01

    The use of computer-based automation and control systems for smart sustainable buildings, often so-called Automated Buildings (ABs), has become an effective way to automatically control, optimize, and supervise a wide range of building performance applications over a network while achieving the minimum energy consumption possible, and in doing so generally refers to Building Automation and Control Systems (BACS) architecture. Instead of costly and time-consuming experiments, this paper focuses on using distributed dynamic simulations to analyze the real-time performance of network-based building control systems in ABs and improve the functions of the BACS technology. The paper also presents the development and design of a distributed dynamic simulation environment with the capability of representing the BACS architecture in simulation by run-time coupling two or more different software tools over a network. The application and capability of this new dynamic simulation environment are demonstrated by an experimental design in this paper.

  11. Distributed dynamic simulations of networked control and building performance applications

    PubMed Central

    Yahiaoui, Azzedine

    2017-01-01

    The use of computer-based automation and control systems for smart sustainable buildings, often so-called Automated Buildings (ABs), has become an effective way to automatically control, optimize, and supervise a wide range of building performance applications over a network while achieving the minimum energy consumption possible, and in doing so generally refers to Building Automation and Control Systems (BACS) architecture. Instead of costly and time-consuming experiments, this paper focuses on using distributed dynamic simulations to analyze the real-time performance of network-based building control systems in ABs and improve the functions of the BACS technology. The paper also presents the development and design of a distributed dynamic simulation environment with the capability of representing the BACS architecture in simulation by run-time coupling two or more different software tools over a network. The application and capability of this new dynamic simulation environment are demonstrated by an experimental design in this paper. PMID:29568135

  12. Evaluation of Visual Computer Simulator for Computer Architecture Education

    ERIC Educational Resources Information Center

    Imai, Yoshiro; Imai, Masatoshi; Moritoh, Yoshio

    2013-01-01

    This paper presents trial evaluation of a visual computer simulator in 2009-2011, which has been developed to play some roles of both instruction facility and learning tool simultaneously. And it illustrates an example of Computer Architecture education for University students and usage of e-Learning tool for Assembly Programming in order to…

  13. Design of a massively parallel computer using bit serial processing elements

    NASA Technical Reports Server (NTRS)

    Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing

    1995-01-01

    A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
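
    The sketch below is a software caricature, under stated assumptions, of the bit-serial SIMD idea in this record: every processing element consumes one bit of its operands per cycle while all elements execute the same instruction. A NumPy boolean array stands in for the PE array; the word width and all names are illustrative, not details of the machine described above.

      import numpy as np

      def bit_serial_add(a_bits, b_bits):
          """Add per-PE operands presented LSB-first, one bit-plane per 'cycle'."""
          n_pe, width = a_bits.shape
          carry = np.zeros(n_pe, dtype=bool)
          out = np.zeros_like(a_bits)
          for cycle in range(width):                    # one SIMD instruction per bit position
              a, b = a_bits[:, cycle], b_bits[:, cycle]
              out[:, cycle] = a ^ b ^ carry             # full-adder sum bit, all PEs at once
              carry = (a & b) | (carry & (a ^ b))       # full-adder carry bit
          return out

      def to_bits(x, width=8):
          return ((x[:, None] >> np.arange(width)) & 1).astype(bool)

      def from_bits(bits):
          return (bits.astype(int) * (1 << np.arange(bits.shape[1]))).sum(axis=1)

      a = np.array([3, 100, 27, 255], dtype=np.uint16)
      b = np.array([5, 28, 200, 1], dtype=np.uint16)
      print(from_bits(bit_serial_add(to_bits(a, 9), to_bits(b, 9))))  # elementwise sums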

  14. A heterogeneous hierarchical architecture for real-time computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skroch, D.A.; Fornaro, R.J.

    The need for high-speed data acquisition and control algorithms has prompted continued research in the area of multiprocessor systems and related programming techniques. The result presented here is a unique hardware and software architecture for high-speed real-time computer systems. The implementation of a prototype of this architecture has required the integration of architecture, operating systems and programming languages into a cohesive unit. This report describes a Heterogeneous Hierarchical Architecture for Real-Time (H² ART) and system software for program loading and interprocessor communication.

  15. Three Program Architecture for Design Optimization

    NASA Technical Reports Server (NTRS)

    Miura, Hirokazu; Olson, Lawrence E. (Technical Monitor)

    1998-01-01

    In this presentation, I would like to review the historical perspective on the program architectures used to build design optimization capabilities based on mathematical programming and other numerical search techniques. It is rather straightforward to classify the program architectures in three categories as shown above. However, the relative importance of each of the three approaches has not been static; it has changed dynamically as the capabilities of available computational resources have increased. For example, we once considered that the direct coupling architecture would never be used for practical problems, but the availability of computer systems such as multi-processors has changed that view. In this presentation, I would like to review the roles of the three architectures from historical as well as current and future perspectives. There may also be some possibility for the emergence of hybrid architectures. I hope to provide some seeds for an active discussion of where we are heading in the very dynamic environment of high-speed computing and communication.

  16. An Experiment in the Use of Computer-Based Education to Teach Energy Considerations in Architectural Design.

    ERIC Educational Resources Information Center

    Arumi, Francisco N.

    Computer programs capable of describing the thermal behavior of buildings are used to help architectural students understand environmental systems. The Numerical Simulation Laboratory at the Architectural School of the University of Texas at Austin was developed to provide the necessary software capable of simulating the energy transactions…

  17. Analysis of Android Device-Based Solutions for Fall Detection

    PubMed Central

    Casilari, Eduardo; Luque, Rafael; Morón, María-José

    2015-01-01

    Falls are a major cause of health and psychological problems as well as hospitalization costs among older adults. Thus, the investigation on automatic Fall Detection Systems (FDSs) has received special attention from the research community during the last decade. In this area, the widespread popularity, decreasing price, computing capabilities, built-in sensors and multiplicity of wireless interfaces of Android-based devices (especially smartphones) have fostered the adoption of this technology to deploy wearable and inexpensive architectures for fall detection. This paper presents a critical and thorough analysis of those existing fall detection systems that are based on Android devices. The review systematically classifies and compares the proposals of the literature taking into account different criteria such as the system architecture, the employed sensors, the detection algorithm or the response in case of a fall alarm. The study emphasizes the analysis of the evaluation methods that are employed to assess the effectiveness of the detection process. The review reveals the complete lack of a reference framework to validate and compare the proposals. In addition, the study also shows that most research works do not evaluate the actual applicability of the Android devices (with limited battery and computing resources) to fall detection solutions. PMID:26213928

  18. Analysis of Android Device-Based Solutions for Fall Detection.

    PubMed

    Casilari, Eduardo; Luque, Rafael; Morón, María-José

    2015-07-23

    Falls are a major cause of health and psychological problems as well as hospitalization costs among older adults. Thus, the investigation on automatic Fall Detection Systems (FDSs) has received special attention from the research community during the last decade. In this area, the widespread popularity, decreasing price, computing capabilities, built-in sensors and multiplicity of wireless interfaces of Android-based devices (especially smartphones) have fostered the adoption of this technology to deploy wearable and inexpensive architectures for fall detection. This paper presents a critical and thorough analysis of those existing fall detection systems that are based on Android devices. The review systematically classifies and compares the proposals of the literature taking into account different criteria such as the system architecture, the employed sensors, the detection algorithm or the response in case of a fall alarm. The study emphasizes the analysis of the evaluation methods that are employed to assess the effectiveness of the detection process. The review reveals the complete lack of a reference framework to validate and compare the proposals. In addition, the study also shows that most research works do not evaluate the actual applicability of the Android devices (with limited battery and computing resources) to fall detection solutions.

  19. Massively parallel multicanonical simulations

    NASA Astrophysics Data System (ADS)

    Gross, Jonathan; Zierenberg, Johannes; Weigel, Martin; Janke, Wolfhard

    2018-03-01

    Generalized-ensemble Monte Carlo simulations such as the multicanonical method and similar techniques are among the most efficient approaches for simulations of systems undergoing discontinuous phase transitions or with rugged free-energy landscapes. As Markov chain methods, they are inherently serial computationally. It was demonstrated recently, however, that a combination of independent simulations that communicate weight updates at variable intervals allows for the efficient utilization of parallel computational resources for multicanonical simulations. Implementing this approach for the many-thread architecture provided by current generations of graphics processing units (GPUs), we show how it can be efficiently employed with of the order of 10^4 parallel walkers and beyond, thus constituting a versatile tool for Monte Carlo simulations in the era of massively parallel computing. We provide the fully documented source code for the approach applied to the paradigmatic example of the two-dimensional Ising model as starting point and reference for practitioners in the field.
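
    The following toy Python sketch illustrates, under stated assumptions, the parallel multicanonical recursion described above: several walkers sample independently with the same weights, their histograms are pooled, and the pooled histogram drives the next weight update. A one-dimensional "energy ladder" with a made-up density of states replaces the two-dimensional Ising model, and the walkers run serially here; the GPU implementation, update schedule, and names used by the authors are not reproduced.

      import numpy as np

      rng = np.random.default_rng(1)
      E_LEVELS = 64
      log_g = 0.5 * np.arange(E_LEVELS)          # toy log density of states (grows with E)

      def walker(log_w, steps=20000):
          """One multicanonical walker: Metropolis on the energy ladder with weights w(E)."""
          hist = np.zeros(E_LEVELS)
          e = int(rng.integers(E_LEVELS))
          for _ in range(steps):
              e_new = (e + int(rng.choice((-1, 1)))) % E_LEVELS
              log_acc = (log_g[e_new] + log_w[e_new]) - (log_g[e] + log_w[e])
              if np.log(rng.random()) < min(0.0, log_acc):
                  e = e_new
              hist[e] += 1
          return hist

      log_w = np.zeros(E_LEVELS)                 # start from flat weights
      for iteration in range(10):
          histograms = [walker(log_w) for _ in range(8)]   # 8 "parallel" walkers (serial here)
          pooled = np.sum(histograms, axis=0) + 1e-12      # communicate and pool histograms
          log_w -= np.log(pooled / pooled.mean())          # weight update towards a flat histogram
      print("histogram flatness (min/max):", pooled.min() / pooled.max())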

  20. Study of sensor spectral responses and data processing algorithms and architectures for onboard feature identification

    NASA Technical Reports Server (NTRS)

    Huck, F. O.; Davis, R. E.; Fales, C. L.; Aherron, R. M.

    1982-01-01

    A computational model of the deterministic and stochastic processes involved in remote sensing is used to study spectral feature identification techniques for real-time onboard processing of data acquired with advanced earth-resources sensors. Preliminary results indicate that: Narrow spectral responses are advantageous; signal normalization improves mean-square distance (MSD) classification accuracy but tends to degrade maximum-likelihood (MLH) classification accuracy; and MSD classification of normalized signals performs better than the computationally more complex MLH classification when imaging conditions change appreciably from those conditions during which reference data were acquired. The results also indicate that autonomous categorization of TM signals into vegetation, bare land, water, snow and clouds can be accomplished with adequate reliability for many applications over a reasonably wide range of imaging conditions. However, further analysis is required to develop computationally efficient boundary approximation algorithms for such categorization.
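
    To make the comparison above concrete, the short sketch below contrasts, under stated assumptions, a mean-square-distance (nearest-mean) classifier applied to energy-normalized spectral signals with a Gaussian maximum-likelihood classifier that uses full class covariances. The synthetic two-class, four-band data and all names are inventions for the example, not the study's sensor model.

      import numpy as np

      def normalize(x):
          """Scale each spectral signal to unit energy (reduces illumination dependence)."""
          return x / np.linalg.norm(x, axis=-1, keepdims=True)

      def msd_classify(x, class_means):
          """Assign each signal to the class whose normalized mean is closest in L2."""
          d = np.linalg.norm(normalize(x)[:, None, :] - normalize(class_means)[None, :, :], axis=-1)
          return np.argmin(d, axis=1)

      def mlh_classify(x, class_means, class_covs):
          """Gaussian maximum likelihood: maximize log N(x | mu_k, Sigma_k) over classes k."""
          scores = []
          for mu, cov in zip(class_means, class_covs):
              diff = x - mu
              inv = np.linalg.inv(cov)
              _, logdet = np.linalg.slogdet(cov)
              scores.append(-0.5 * (np.einsum('ij,jk,ik->i', diff, inv, diff) + logdet))
          return np.argmax(np.stack(scores, axis=1), axis=1)

      rng = np.random.default_rng(0)
      means = np.array([[0.2, 0.4, 0.6, 0.3],     # class 0, e.g. "vegetation"
                        [0.5, 0.3, 0.2, 0.6]])    # class 1, e.g. "bare land"
      covs = np.array([np.eye(4) * 0.01, np.eye(4) * 0.02])
      samples = rng.multivariate_normal(means[0], covs[0], size=200)
      print("MSD accuracy:", (msd_classify(samples, means) == 0).mean())
      print("MLH accuracy:", (mlh_classify(samples, means, covs) == 0).mean())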

  1. Object-based media and stream-based computing

    NASA Astrophysics Data System (ADS)

    Bove, V. Michael, Jr.

    1998-03-01

    Object-based media refers to the representation of audiovisual information as a collection of objects - the result of scene-analysis algorithms - and a script describing how they are to be rendered for display. Such multimedia presentations can adapt to viewing circumstances as well as to viewer preferences and behavior, and can provide a richer link between content creator and consumer. With faster networks and processors, such ideas become applicable to live interpersonal communications as well, creating a more natural and productive alternative to traditional videoconferencing. This paper outlines examples of object-based media algorithms and applications developed by my group, and presents new hardware architectures and software methods that we have developed to meet the computational requirements of object-based and other advanced media representations. In particular we describe stream-based processing, which enables automatic run-time parallelization of multidimensional signal processing tasks even given heterogeneous computational resources.

  2. High-Speed Photonic Reservoir Computing Using a Time-Delay-Based Architecture: Million Words per Second Classification

    NASA Astrophysics Data System (ADS)

    Larger, Laurent; Baylón-Fuentes, Antonio; Martinenghi, Romain; Udaltsov, Vladimir S.; Chembo, Yanne K.; Jacquot, Maxime

    2017-01-01

    Reservoir computing, originally referred to as an echo state network or a liquid state machine, is a brain-inspired paradigm for processing temporal information. It involves learning a "read-out" interpretation for nonlinear transients developed by high-dimensional dynamics when the latter is excited by the information signal to be processed. This novel computational paradigm is derived from recurrent neural network and machine learning techniques. It has recently been implemented in photonic hardware for a dynamical system, which opens the path to ultrafast brain-inspired computing. We report on a novel implementation involving an electro-optic phase-delay dynamics designed with off-the-shelf optoelectronic telecom devices, thus providing the targeted wide bandwidth. Computational efficiency is demonstrated experimentally with speech-recognition tasks. State-of-the-art speed performances reach one million words per second, with a very low word error rate. In addition to record-speed processing, our investigations have revealed computing-efficiency improvements through yet-unexplored temporal-information-processing techniques, such as simultaneous multisample injection and pitched sampling at the read-out compared to information "write-in".
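
    As a purely software stand-in for the electro-optic hardware described above, the sketch below implements the reservoir-computing recipe in its echo-state-network form: a fixed random dynamical system produces a high-dimensional nonlinear transient from the input, and only a linear read-out is trained (ridge regression here). The toy memory task, reservoir size, and all names are assumptions for illustration.

      import numpy as np

      rng = np.random.default_rng(0)
      N, steps = 200, 2000
      u = np.sin(0.2 * np.arange(steps)) + 0.1 * rng.standard_normal(steps)   # input signal
      target = np.roll(u, 3)                     # toy task: recall the input 3 steps in the past

      W_in = rng.uniform(-0.5, 0.5, size=N)      # fixed random input weights
      W = rng.standard_normal((N, N)) / np.sqrt(N)
      W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # keep the spectral radius below 1

      x = np.zeros(N)
      states = np.zeros((steps, N))
      for t in range(steps):                     # drive the fixed nonlinear reservoir
          x = np.tanh(W @ x + W_in * u[t])
          states[t] = x

      washout, lam = 100, 1e-4                   # discard the initial transient; ridge penalty
      S, y = states[washout:], target[washout:]
      W_out = np.linalg.solve(S.T @ S + lam * np.eye(N), S.T @ y)   # train the linear read-out
      pred = S @ W_out
      print("NRMSE:", np.sqrt(np.mean((pred - y) ** 2)) / np.std(y))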

  3. An infrastructure for the integration of geoscience instruments and sensors on the Grid

    NASA Astrophysics Data System (ADS)

    Pugliese, R.; Prica, M.; Kourousias, G.; Del Linz, A.; Curri, A.

    2009-04-01

    The Grid, as a computing paradigm, has long been in the attention of both academia and industry[1]. The distributed and expandable nature of its general architecture result to scalability and more efficient utilisation of the computing infrastructures. The scientific community, including that of geosciences, often handles problems with very high requirements in data processing, transferring, and storing[2,3]. This has raised the interest on Grid technologies but these are often viewed solely as an access gateway to HPC. Suitable Grid infrastructures could provide the geoscience community with additional benefits like those of sharing, remote access and control of scientific systems. These systems can be scientific instruments, sensors, robots, cameras and any other device used in geosciences. The solution for practical, general, and feasible Grid-enabling of such devices requires non-intrusive extensions on core parts of the current Grid architecture. We propose an extended version of an architecture[4] that can serve as the solution to the problem. The solution we propose is called Grid Instrument Element (IE) [5]. It is an addition to the existing core Grid parts; the Computing Element (CE) and the Storage Element (SE) that serve the purposes that their name suggests. The IE that we will be referring to, and the related technologies have been developed in the EU project on the Deployment of Remote Instrumentation Infrastructure (DORII1). In DORII, partners of various scientific communities including those of Earthquake, Environmental science, and Experimental science, have adopted the technology of the Instrument Element in order to integrate to the Grid their devices. The Oceanographic and coastal observation and modelling Mediterranean Ocean Observing Network (OGS2), a DORII partner, is in the process of deploying the above mentioned Grid technologies on two types of observational modules: Argo profiling floats and a novel Autonomous Underwater Vehicle (AUV). In this paper i) we define the need for integration of instrumentation in the Grid, ii) we introduce the solution of the Instrument Element, iii) we demonstrate a suitable end-user web portal for accessing Grid resources, iv) we describe from the Grid-technological point of view the process of the integration to the Grid of two advanced environmental monitoring devices. References [1] M. Surridge, S. Taylor, D. De Roure, and E. Zaluska, "Experiences with GRIA—Industrial Applications on a Web Services Grid," e-Science and Grid Computing, First International Conference on e-Science and Grid Computing, 2005, pp. 98-105. [2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, "The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets," Journal of Network and Computer Applications, vol. 23, 2000, pp. 187-200. [3] B. Allcock, J. Bester, J. Bresnahan, A.L. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, "Data management and transfer in high-performance computational grid environments," Parallel Computing, vol. 28, 2002, pp. 749-771. [4] E. Frizziero, M. Gulmini, F. Lelli, G. Maron, A. Oh, S. Orlando, A. Petrucci, S. Squizzato, and S. Traldi, "Instrument Element: A New Grid component that Enables the Control of Remote Instrumentation," Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)-Volume 00, IEEE Computer Society Washington, DC, USA, 2006. [5] R. Ranon, L. De Marco, A. 
Senerchia, S. Gabrielli, L. Chittaro, R. Pugliese, L. Del Cano, F. Asnicar, and M. Prica, "A Web-based Tool for Collaborative Access to Scientific Instruments in Cyberinfrastructures." 1 The DORII project is supported by the European Commission within the 7th Framework Programme (FP7/2007-2013) under grant agreement no. RI-213110. URL: http://www.dorii.eu 2 Istituto Nazionale di Oceanografia e di Geofisica Sperimentale. URL: http://www.ogs.trieste.it

  4. Simulating Hydrologic Flow and Reactive Transport with PFLOTRAN and PETSc on Emerging Fine-Grained Parallel Computer Architectures

    NASA Astrophysics Data System (ADS)

    Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.

    2017-12-01

    As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.

  5. Concurrent extensions to the FORTRAN language for parallel programming of computational fluid dynamics algorithms

    NASA Technical Reports Server (NTRS)

    Weeks, Cindy Lou

    1986-01-01

    Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.

  6. Neuromorphic Computing, Architectures, Models, and Applications. A Beyond-CMOS Approach to Future Computing, June 29-July 1, 2016, Oak Ridge, TN

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Potok, Thomas; Schuman, Catherine; Patton, Robert

    The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshop we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.

  7. Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schuller, Ivan K.; Stevens, Rick; Pino, Robinson

    2015-10-29

    Computation in its many forms is the engine that fuels our modern civilization. Modern computation—based on the von Neumann architecture—has allowed, until now, the development of continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably—within the next 10 years—reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS-based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; and comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.

  8. A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Potok, Thomas E; Schuman, Catherine D; Young, Steven R

    Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.

  9. Playable Serious Games for Studying and Programming Computational STEM and Informatics Applications of Distributed and Parallel Computer Architectures

    ERIC Educational Resources Information Center

    Amenyo, John-Thones

    2012-01-01

    Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…

  10. Generating and executing programs for a floating point single instruction multiple data instruction set architecture

    DOEpatents

    Gschwind, Michael K

    2013-04-16

    Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.

  11. GPS Block 2R Time Standard Assembly (TSA) architecture

    NASA Technical Reports Server (NTRS)

    Baker, Anthony P.

    1990-01-01

    The underlying philosophy of the Global Positioning System (GPS) 2R Time Standard Assembly (TSA) architecture is to utilize two frequency sources, one fixed frequency reference source and one system frequency source, and to couple the system frequency source to the reference frequency source via a sample data loop. The system source is used to provide the basic clock frequency and timing for the space vehicle (SV), and it uses a voltage controlled crystal oscillator (VCXO) with high short term stability. The reference source is an atomic frequency standard (AFS) with high long term stability. The architecture can support any type of frequency standard. In the system design, rubidium, cesium, and H2 masers outputting a canonical frequency were accommodated. The architecture is software intensive. All VCXO adjustments are digital and are calculated by a processor. They are applied to the VCXO via a digital to analog converter.
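
    The sketch below is a deliberately simplified, normalized model of the sample-data loop this record describes: at each loop sample the VCXO phase is compared against the atomic reference, and a digital proportional-integral correction (the value that would be written to the DAC) steers the VCXO frequency. The gains, noise level, and first-order oscillator model are assumptions for illustration, not GPS Block 2R design values.

      import numpy as np

      rng = np.random.default_rng(0)
      kp, ki = 0.3, 0.05            # PI gains of the digital steering loop (illustrative)
      f0 = 0.2                      # normalized free-running VCXO frequency offset vs. the AFS
      phase, integrator = 1.0, 0.0  # initial phase error and integral term

      for sample in range(60):
          measured = phase + 1e-3 * rng.standard_normal()   # sampled phase comparison (noisy)
          integrator += ki * measured                       # integral term learns the offset
          correction = kp * measured + integrator           # value written to the DAC
          phase += f0 - correction                          # VCXO advances at the steered rate

      print("residual phase error:", phase)
      print("learned frequency correction:", integrator)    # should approach f0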

  12. Highly parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.; Tichy, Walter F.

    1990-01-01

    Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures are best suited to particular classes of problems. The architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.

  13. Switching from computer to microcomputer architecture education

    NASA Astrophysics Data System (ADS)

    Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore

    2010-03-01

    In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to microcomputer architecture. The authors present their strategies towards a successful crossing of boundaries between engineering disciplines. This communication aims at providing a different aspect on professional courses that are, nowadays, addressed at the expense of traditional courses.

  14. Three-Dimensional Nanobiocomputing Architectures With Neuronal Hypercells

    DTIC Science & Technology

    2007-06-01

    ... Neumann architectures, and CMOS fabrication. Novel solutions of massive parallel distributed computing and processing (pipelined due to systolic ... and processing platforms utilizing molecular hardware within an enabling organization and architecture. The design technology is based on utilizing a ... Microsystems and Nanotechnologies investigated a novel 3D3 (Hardware Software Nanotechnology) technology to design super-high performance computing ...

  15. Switching from Computer to Microcomputer Architecture Education

    ERIC Educational Resources Information Center

    Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore

    2010-01-01

    In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to…

  16. An Architecture for Cross-Cloud System Management

    NASA Astrophysics Data System (ADS)

    Dodda, Ravi Teja; Smith, Chris; van Moorsel, Aad

    The emergence of the cloud computing paradigm promises flexibility and adaptability through on-demand provisioning of compute resources. As the utilization of cloud resources extends beyond a single provider, for business as well as technical reasons, the issue of effectively managing such resources comes to the fore. Different providers expose different interfaces to their compute resources utilizing varied architectures and implementation technologies. This heterogeneity poses a significant system management problem, and can limit the extent to which the benefits of cross-cloud resource utilization can be realized. We address this problem through the definition of an architecture to facilitate the management of compute resources from different cloud providers in a homogeneous manner. This preserves the flexibility and adaptability promised by the cloud computing paradigm, whilst enabling the benefits of cross-cloud resource utilization to be realized. The practical efficacy of the architecture is demonstrated through an implementation utilizing compute resources managed through different interfaces on the Amazon Elastic Compute Cloud (EC2) service. Additionally, we provide empirical results highlighting the performance differential of these different interfaces, and discuss the impact of this performance differential on efficiency and profitability.

  17. A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.

    PubMed

    Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang

    2016-12-07

    The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
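
    Purely as an algorithmic illustration of the feature extraction summarized above, the sketch below splits each spike waveform at its peak sample and uses the area of the two portions as a two-dimensional feature vector. The synthetic spike shapes and function names are assumptions; the paper's segmented, hardware-parallel computation is not modeled.

      import numpy as np

      def spike_features(spike):
          """Return (area before the peak, area after the peak) for one spike waveform."""
          peak = int(np.argmax(np.abs(spike)))                  # peak search
          area_pre = float(np.sum(np.abs(spike[:peak + 1])))
          area_post = float(np.sum(np.abs(spike[peak:])))
          return area_pre, area_post

      t = np.linspace(0.0, 1.0, 64)
      spike_a = np.exp(-((t - 0.3) / 0.05) ** 2)                # early, narrow spike
      spike_b = 0.8 * np.exp(-((t - 0.6) / 0.12) ** 2)          # late, broad spike
      print("features A:", spike_features(spike_a))
      print("features B:", spike_features(spike_b))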

  18. Technical Reference Suite Addressing Challenges of Providing Assurance for Fault Management Architectural Design

    NASA Technical Reports Server (NTRS)

    Fitz, Rhonda; Whitman, Gerek

    2016-01-01

    Research into complexities of software systems Fault Management (FM) and how architectural design decisions affect safety, preservation of assets, and maintenance of desired system functionality has coalesced into a technical reference (TR) suite that advances the provision of safety and mission assurance. The NASA Independent Verification and Validation (IVV) Program, with Software Assurance Research Program support, extracted FM architectures across the IVV portfolio to evaluate robustness, assess visibility for validation and test, and define software assurance methods applied to the architectures and designs. This investigation spanned IVV projects with seven different primary developers, a wide range of sizes and complexities, and encompassed Deep Space Robotic, Human Spaceflight, and Earth Orbiter mission FM architectures. The initiative continues with an expansion of the TR suite to include Launch Vehicles, adding the benefit of investigating differences intrinsic to model-based FM architectures and insight into complexities of FM within an Agile software development environment, in order to improve awareness of how nontraditional processes affect FM architectural design and system health management.

  19. Generic Divide and Conquer Internet-Based Computing

    NASA Technical Reports Server (NTRS)

    Radenski, Atanas; Follen, Gregory J. (Technical Monitor)

    2001-01-01

    The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever-changing pool of lower-end internet nodes.
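
    A small Python sketch of the generic divide-and-conquer pattern with a master handing subproblems to workers. Here a multiprocessing pool stands in for the master task-pool server and the volunteer internet nodes; the problem (summing a list) and the chunk size are arbitrary placeholders, not the project's agents.

      # Toy divide-and-conquer over a worker pool; not the project's architecture.
      from multiprocessing import Pool


      def solve(problem: list[int]) -> int:
          """Divide the problem until it is trivial, then combine partial results."""
          if len(problem) <= 2:                  # trivial case: solve directly
              return sum(problem)
          mid = len(problem) // 2
          return solve(problem[:mid]) + solve(problem[mid:])   # combine


      if __name__ == "__main__":
          data = list(range(1000))
          chunks = [data[i:i + 100] for i in range(0, len(data), 100)]
          with Pool(4) as pool:                  # workers play the satellite-server role
              partials = pool.map(solve, chunks) # master distributes subproblems
          print(sum(partials))                   # master combines the results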

  20. THE COMPUTER AND THE ARCHITECTURAL PROFESSION.

    ERIC Educational Resources Information Center

    HAVILAND, DAVID S.

    THE ROLE OF ADVANCING TECHNOLOGY IN THE FIELD OF ARCHITECTURE IS DISCUSSED IN THIS REPORT. PROBLEMS IN COMMUNICATION AND THE DESIGN PROCESS ARE IDENTIFIED. ADVANTAGES AND DISADVANTAGES OF COMPUTERS ARE MENTIONED IN RELATION TO MAN AND MACHINE INTERACTION. PRESENT AND FUTURE IMPLICATIONS OF COMPUTER USAGE ARE IDENTIFIED AND DISCUSSED WITH RESPECT…

  1. The Contribution of Visualization to Learning Computer Architecture

    ERIC Educational Resources Information Center

    Yehezkel, Cecile; Ben-Ari, Mordechai; Dreyfus, Tommy

    2007-01-01

    This paper describes a visualization environment and associated learning activities designed to improve learning of computer architecture. The environment, EasyCPU, displays a model of the components of a computer and the dynamic processes involved in program execution. We present the results of a research program that analysed the contribution of…

  2. Technology advances and market forces: Their impact on high performance architectures

    NASA Technical Reports Server (NTRS)

    Best, D. R.

    1978-01-01

    Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.

  3. Avoiding and tolerating latency in large-scale next-generation shared-memory multiprocessors

    NASA Technical Reports Server (NTRS)

    Probst, David K.

    1993-01-01

    A scalable solution to the memory-latency problem is necessary to prevent the large latencies of synchronization and memory operations inherent in large-scale shared-memory multiprocessors from reducing high performance. We distinguish latency avoidance and latency tolerance. Latency is avoided when data is brought to nearby locales for future reference. Latency is tolerated when references are overlapped with other computation. Latency-avoiding locales include: processor registers, data caches used temporally, and nearby memory modules. Tolerating communication latency requires parallelism, allowing the overlap of communication and computation. Latency-tolerating techniques include: vector pipelining, data caches used spatially, prefetching in various forms, and multithreading in various forms. Relaxing the consistency model permits increased use of avoidance and tolerance techniques. Each model is a mapping from the program text to sets of partial orders on program operations; it is a convention about which temporal precedences among program operations are necessary. Information about temporal locality and parallelism constrains the use of avoidance and tolerance techniques. Suitable architectural primitives and compiler technology are required to exploit the increased freedom to reorder and overlap operations in relaxed models.

  4. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vineyard, Craig Michael; Verzi, Stephen Joseph

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  5. Computing architecture for autonomous microgrids

    DOEpatents

    Goldsmith, Steven Y.

    2015-09-29

    A computing architecture that facilitates autonomously controlling operations of a microgrid is described herein. A microgrid network includes numerous computing devices that execute intelligent agents, each of which is assigned to a particular entity (load, source, storage device, or switch) in the microgrid. The intelligent agents can execute in accordance with predefined protocols to collectively perform computations that facilitate uninterrupted control of the microgrid.

  6. Blueprint for a microwave trapped ion quantum computer.

    PubMed

    Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K

    2017-02-01

    The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.

  7. Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    NASA Technical Reports Server (NTRS)

    Carroll, Chester C.; Youngblood, John N.; Saha, Aindam

    1987-01-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.

  8. Computer architecture for efficient algorithmic executions in real-time systems: new technology for avionics systems and advanced space vehicles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carroll, C.C.; Youngblood, J.N.; Saha, A.

    1987-12-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.

  9. SU (2) lattice gauge theory simulations on Fermi GPUs

    NASA Astrophysics Data System (ADS)

    Cardoso, Nuno; Bicudo, Pedro

    2011-05-01

    In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.

  10. Modified Phasemeter for a Heterodyne Laser Interferometer

    NASA Technical Reports Server (NTRS)

    Loya, Frank M.

    2010-01-01

    Modifications have been made in the design of instruments of the type described in "Digital Averaging Phasemeter for Heterodyne Interferometry". A phasemeter of this type measures the difference between the phases of the unknown and reference heterodyne signals in a heterodyne laser interferometer. The phasemeter design lacked immunity to drift of the heterodyne frequency, was bandwidth-limited by computer bus architectures then in use, and was resolution-limited by the nature of field-programmable gate arrays (FPGAs) then available. The modifications have overcome these limitations and have afforded additional improvements in accuracy, speed, and modularity. The modifications are summarized.

  11. CRAY mini manual. Revision D

    NASA Technical Reports Server (NTRS)

    Tennille, Geoffrey M.; Howser, Lona M.

    1993-01-01

    This document briefly describes the use of the CRAY supercomputers that are an integral part of the Supercomputing Network Subsystem of the Central Scientific Computing Complex at LaRC. Features of the CRAY supercomputers are covered, including: FORTRAN, C, PASCAL, architectures of the CRAY-2 and CRAY Y-MP, the CRAY UNICOS environment, batch job submittal, debugging, performance analysis, parallel processing, utilities unique to CRAY, and documentation. The document is intended for all CRAY users as a ready reference to frequently asked questions and to more detailed information contained in the vendor manuals. It is appropriate for both the novice and the experienced user.

  12. Automatic Blocking Of QR and LU Factorizations for Locality

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yi, Q; Kennedy, K; You, H

    2004-03-26

    QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To efficiently perform these computations on modern computers, the factorization algorithms need to be blocked when operating on large matrices to effectively exploit the deep cache hierarchy prevalent in today's computer memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Though linear algebra libraries such as LAPACK provide manually blocked implementations of these algorithms, more benefit, such as automatic adaptation of different blocking strategies, can be gained by automatically generating blocked versions of the computations. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of auto-blocked versions with manually tuned versions in LAPACK, both using reference BLAS, ATLAS BLAS and native BLAS specially tuned for the underlying machine architectures.
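
    A small NumPy illustration of why blocking helps locality, using a blocked matrix product rather than the paper's QR/LU transformations: each tile is reused while it is cache-resident. The block size and matrix sizes are arbitrary; this is background only, not the dependence-hoisting optimizer described above.

      import numpy as np


      def blocked_matmul(a: np.ndarray, b: np.ndarray, bs: int = 64) -> np.ndarray:
          """Blocked matrix product operating on bs-by-bs tiles for cache reuse."""
          n, k = a.shape
          k2, m = b.shape
          assert k == k2
          c = np.zeros((n, m))
          for i in range(0, n, bs):
              for j in range(0, m, bs):
                  for p in range(0, k, bs):
                      # Each tile update touches only small, cache-sized blocks.
                      c[i:i + bs, j:j + bs] += a[i:i + bs, p:p + bs] @ b[p:p + bs, j:j + bs]
          return c


      a = np.random.rand(256, 256)
      b = np.random.rand(256, 256)
      assert np.allclose(blocked_matmul(a, b), a @ b)   # same result as unblocked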

  13. Multiprocessor architecture: Synthesis and evaluation

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1990-01-01

    Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.

  14. Insider Threat Security Reference Architecture

    DTIC Science & Technology

    2012-04-01

    The Insider Threat Security Reference Architecture (ITSRA) comprises four layers, including a Business Security Architecture layer and a Data Security Architecture layer, and is intended to help organizations improve their level of preparedness to address the insider threat.

  15. A Neural Network Architecture For Rapid Model Indexing In Computer Vision Systems

    NASA Astrophysics Data System (ADS)

    Pawlicki, Ted

    1988-03-01

    Models of objects stored in memory have been shown to be useful for guiding the processing of computer vision systems. A major consideration in such systems, however, is how stored models are initially accessed and indexed by the system. As the number of stored models increases, the time required to search memory for the correct model becomes high. Parallel distributed, connectionist, neural networks' have been shown to have appealing content addressable memory properties. This paper discusses an architecture for efficient storage and reference of model memories stored as stable patterns of activity in a parallel, distributed, connectionist, neural network. The emergent properties of content addressability and resistance to noise are exploited to perform indexing of the appropriate object centered model from image centered primitives. The system consists of three network modules each of which represent information relative to a different frame of reference. The model memory network is a large state space vector where fields in the vector correspond to ordered component objects and relative, object based spatial relationships between the component objects. The component assertion network represents evidence about the existence of object primitives in the input image. It establishes local frames of reference for object primitives relative to the image based frame of reference. The spatial relationship constraint network is an intermediate representation which enables the association between the object based and the image based frames of reference. This intermediate level represents information about possible object orderings and establishes relative spatial relationships from the image based information in the component assertion network below. It is also constrained by the lawful object orderings in the model memory network above. The system design is consistent with current psychological theories of recognition by component. It also seems to support Marr's notions of hierarchical indexing. (i.e. the specificity, adjunct, and parent indices) It supports the notion that multiple canonical views of an object may have to be stored in memory to enable its efficient identification. The use of variable fields in the state space vectors appears to keep the number of required nodes in the network down to a tractable number while imposing a semantic value on different areas of the state space. This semantic imposition supports an interface between the analogical aspects of neural networks and the propositional paradigms of symbolic processing.

  16. Feasibility study, software design, layout and simulation of a two-dimensional Fast Fourier Transform machine for use in optical array interferometry

    NASA Technical Reports Server (NTRS)

    Boriakoff, Valentin

    1994-01-01

    The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).

  17. An S N Algorithm for Modern Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, Randal Scott

    2016-08-29

    LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.

  18. Hierarchial parallel computer architecture defined by computational multidisciplinary mechanics

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Gute, Doug; Johnson, Keith

    1989-01-01

    The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosophical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, and the impact on solvers are shown in viewgraph form.

  19. Parallel computer vision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uhr, L.

    1987-01-01

    This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.

  20. Computer Architecture's Changing Role in Rebooting Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeBenedictis, Erik P.

    Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.

  1. Computer Architecture's Changing Role in Rebooting Computing

    DOE PAGES

    DeBenedictis, Erik P.

    2017-04-26

    Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.

  2. Design of a fault tolerant airborne digital computer. Volume 1: Architecture

    NASA Technical Reports Server (NTRS)

    Wensley, J. H.; Levitt, K. N.; Green, M. W.; Goldberg, J.; Neumann, P. G.

    1973-01-01

    This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable and contractible. (4) The design is to be appropriate to post-1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive.

  3. Using a software-defined computer in teaching the basics of computer architecture and operation

    NASA Astrophysics Data System (ADS)

    Kosowska, Julia; Mazur, Grzegorz

    2017-08-01

    The paper describes the concept and implementation of SDC_One software-defined computer designed for experimental and didactic purposes. Equipped with extensive hardware monitoring mechanisms, the device enables the students to monitor the computer's operation on bus transfer cycle or instruction cycle basis, providing the practical illustration of basic aspects of computer's operation. In the paper, we describe the hardware monitoring capabilities of SDC_One and some scenarios of using it in teaching the basics of computer architecture and microprocessor operation.

  4. A Serial Bus Architecture for Parallel Processing Systems

    DTIC Science & Technology

    1986-09-01

    The wider the communication path, the more pins are needed to effect the data transfer. As integrated circuits grow in computational power, more communication capacity is needed.

  5. Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance

    NASA Technical Reports Server (NTRS)

    Rushby, John

    1999-01-01

    Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.

  6. Modeling driver behavior in a cognitive architecture.

    PubMed

    Salvucci, Dario D

    2006-01-01

    This paper explores the development of a rigorous computational model of driver behavior in a cognitive architecture--a computational framework with underlying psychological theories that incorporate basic properties and limitations of the human system. Computational modeling has emerged as a powerful tool for studying the complex task of driving, allowing researchers to simulate driver behavior and explore the parameters and constraints of this behavior. An integrated driver model developed in the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture is described that focuses on the component processes of control, monitoring, and decision making in a multilane highway environment. This model accounts for the steering profiles, lateral position profiles, and gaze distributions of human drivers during lane keeping, curve negotiation, and lane changing. The model demonstrates how cognitive architectures facilitate understanding of driver behavior in the context of general human abilities and constraints and how the driving domain benefits cognitive architectures by pushing model development toward more complex, realistic tasks. The model can also serve as a core computational engine for practical applications that predict and recognize driver behavior and distraction.

  7. Advanced cloud fault tolerance system

    NASA Astrophysics Data System (ADS)

    Sumangali, K.; Benny, Niketa

    2017-11-01

    Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.

  8. The new landscape of parallel computer architecture

    NASA Astrophysics Data System (ADS)

    Shalf, John

    2007-07-01

    The past few years have seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
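
    The performance plateau the abstract refers to is commonly quantified with Amdahl's law, which limits speedup by the serial fraction of the work. The abstract does not invoke Amdahl's law explicitly; the Python calculation below is an illustrative aside with arbitrary numbers, not the paper's analysis.

      def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
          """Amdahl's law: speedup is bounded by the serial fraction of the work."""
          serial = 1.0 - parallel_fraction
          return 1.0 / (serial + parallel_fraction / cores)


      # With 5% serial work, going from 8 cores to 1024 cores gains relatively little.
      for n in (2, 4, 8, 64, 1024):
          print(n, round(amdahl_speedup(0.95, n), 1))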

  9. Comparing root architectural models

    NASA Astrophysics Data System (ADS)

    Schnepf, Andrea; Javaux, Mathieu; Vanderborght, Jan

    2017-04-01

    Plant roots play an important role in several soil processes (Gregory 2006). Root architecture development determines the sites in soil where roots provide input of carbon and energy and take up water and solutes. However, root architecture is difficult to determine experimentally when grown in opaque soil. Thus, root architectural models have been widely used and been further developed into functional-structural models that are able to simulate the fate of water and solutes in the soil-root system (Dunbabin et al. 2013). Still, a systematic comparison of the different root architectural models is missing. In this work, we focus on discrete root architecture models where roots are described by connected line segments. These models differ (a) in their model concepts, such as the description of distance between branches based on a prescribed distance (inter-nodal distance) or based on a prescribed time interval. Furthermore, these models differ (b) in the implementation of the same concept, such as the time step size, the spatial discretization along the root axes or the way stochasticity of parameters such as root growth direction, growth rate, branch spacing, branching angles are treated. Based on the example of two such different root models, the root growth module of R-SWMS and RootBox, we show the impact of these differences on simulated root architecture and aggregated information computed from this detailed simulation results, taking into account the stochastic nature of those models. References Dunbabin, V.M., Postma, J.A., Schnepf, A., Pagès, L., Javaux, M., Wu, L., Leitner, D., Chen, Y.L., Rengel, Z., Diggle, A.J. Modelling root-soil interactions using three-dimensional models of root growth, architecture and function (2013) Plant and Soil, 372 (1-2), pp. 93 - 124. Gregory (2006) Roots, rhizosphere and soil: the route to a better understanding of soil science? European Journal of Soil Science 57: 2-12.
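
    A toy Python sketch of one modelling concept the comparison touches on: a root axis grown as connected line segments, with stochastic inter-nodal (branch) spacing. All parameter values are arbitrary placeholders, and the code does not reproduce R-SWMS or RootBox.

      import random

      random.seed(1)


      def grow_axis(n_steps: int, step_len: float, mean_branch_spacing: float):
          """Elongate a single axis and place laterals at stochastic spacings."""
          depth, since_branch, branches = 0.0, 0.0, []
          for _ in range(n_steps):
              depth += step_len                       # elongate the axis downward
              since_branch += step_len
              spacing = random.gauss(mean_branch_spacing, 0.2 * mean_branch_spacing)
              if since_branch >= max(spacing, 0.1):   # place a lateral branch here
                  branches.append(depth)
                  since_branch = 0.0
          return depth, branches


      total_depth, branch_depths = grow_axis(n_steps=50, step_len=0.5,
                                             mean_branch_spacing=2.0)
      print(total_depth, branch_depths[:5])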

  10. Architecutres, Models, Algorithms, and Software Tools for Configurable Computing

    DTIC Science & Technology

    2000-03-06

    The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for configurable computing.

  11. Blueprint for a microwave trapped ion quantum computer

    PubMed Central

    Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.

    2017-01-01

    The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154

  12. New approaches to digital transformation of petrochemical production

    NASA Astrophysics Data System (ADS)

    Andieva, E. Y.; Kapelyuhovskaya, A. A.

    2017-08-01

    The newest concepts of reference architectures for digital industrial transformation are considered, and the problems of applying them to enterprises whose life cycles include oil product processing and marketing are identified. A reference architecture concept is proposed that provides a systematic representation of the fundamental changes in production management approaches based on the automation of production process control.

  13. Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course

    ERIC Educational Resources Information Center

    Arbelaitz, Olatz; Martín, José I.; Muguerza, Javier

    2015-01-01

    This paper presents an analysis of introducing active methodologies in the Computer Architecture course taught in the second year of the Computer Engineering Bachelor's degree program at the University of the Basque Country (UPV/EHU), Spain. The paper reports the experience from three academic years, 2011-2012, 2012-2013, and 2013-2014, in which…

  14. A Survey and Evaluation of Simulators Suitable for Teaching Courses in Computer Architecture and Organization

    ERIC Educational Resources Information Center

    Nikolic, B.; Radivojevic, Z.; Djordjevic, J.; Milutinovic, V.

    2009-01-01

    Courses in Computer Architecture and Organization are regularly included in Computer Engineering curricula. These courses are usually organized in such a way that students obtain not only a purely theoretical experience, but also a practical understanding of the topics lectured. This practical work is usually done in a laboratory using simulators…

  15. A Project-Based Learning Approach to Programmable Logic Design and Computer Architecture

    ERIC Educational Resources Information Center

    Kellett, C. M.

    2012-01-01

    This paper describes a course in programmable logic design and computer architecture as it is taught at the University of Newcastle, Australia. The course is designed around a major design project and has two supplemental assessment tasks that are also described. The context of the Computer Engineering degree program within which the course is…

  16. From Archi Torture to Architecture: Undergraduate Students Design and Implement Computers Using the Multimedia Logic Emulator

    ERIC Educational Resources Information Center

    Stanley, Timothy D.; Wong, Lap Kei; Prigmore, Daniel; Benson, Justin; Fishler, Nathan; Fife, Leslie; Colton, Don

    2007-01-01

    Students learn better when they both hear and do. In computer architecture courses "doing" can be difficult in small schools without hardware laboratories hosted by computer engineering, electrical engineering, or similar departments. Software solutions exist. Our success with George Mills' Multimedia Logic (MML) is the focus of this paper. MML…

  17. The Role of Sketch in Architecture Design

    NASA Astrophysics Data System (ADS)

    Li, Yanjin; Ning, Wen

    2017-06-01

    With the continuous development of computer technology, we rely more and more on the computer and pay more and more attention to the final design results, to the point that we ignore the importance of the sketch. However, the sketch is the most basic and effective way of architecture design. Based on a study of the sketches for the Tjibao Cultural Center, the paper explores the role of the sketch in architecture design.

  18. SU (2) lattice gauge theory simulations on Fermi GPUs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cardoso, Nuno, E-mail: nunocardoso@cftp.ist.utl.p; Bicudo, Pedro, E-mail: bicudo@ist.utl.p

    2011-05-10

    In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.

  19. Exploration of operator method digital optical computers for application to NASA

    NASA Technical Reports Server (NTRS)

    1990-01-01

    Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implementations. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternatives to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three-dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.

  20. Using NASA's Reference Architecture: Comparing Polar and Geostationary Data Processing Systems

    NASA Technical Reports Server (NTRS)

    Ullman, Richard; Burnett, Michael

    2013-01-01

    The JPSS and GOES-R programs are housed at NASA GSFC and jointly implemented by NASA and NOAA to NOAA requirements. NASA's role in the JPSS Ground System is to develop and deploy the system according to NOAA requirements. NASA's role in the GOES-R ground segment is to provide Systems Engineering expertise and oversight for NOAA's development and deployment of the system. NASA's Earth Science Data Systems Reference Architecture is a document developed by NASA's Earth Science Data Systems Standards Process Group that describes a NASA Earth Observing Mission Ground system as a generic abstraction. The authors work within the respective ground segment projects and are also separately contributors to the Reference Architecture document. Opinions expressed are the author's only and are not NOAA, NASA or the Ground Projects' official positions.

  1. Layered Architectures for Quantum Computers and Quantum Repeaters

    NASA Astrophysics Data System (ADS)

    Jones, Nathan C.

    This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.

  2. Neural simulations on multi-core architectures.

    PubMed

    Eichner, Hubert; Klug, Tobias; Borst, Alexander

    2009-01-01

    Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing.
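
    A minimal Python sketch of the load-balancing idea behind user-transparent parallelization: assign model pieces (cells or compartment groups) to cores so per-core cost is as even as possible. The greedy longest-processing-time heuristic and the example costs below are illustrative assumptions, not the paper's scheduler.

      import heapq


      def balance(costs: list[float], n_cores: int) -> list[list[int]]:
          """Greedy LPT assignment: largest pieces first, always to the least-loaded core."""
          heap = [(0.0, core) for core in range(n_cores)]   # (current load, core id)
          heapq.heapify(heap)
          assignment = [[] for _ in range(n_cores)]
          for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
              load, core = heapq.heappop(heap)
              assignment[core].append(idx)
              heapq.heappush(heap, (load + costs[idx], core))
          return assignment


      piece_costs = [9.0, 7.5, 6.0, 5.0, 4.5, 3.0, 2.0, 1.0]   # estimated per-piece work
      print(balance(piece_costs, n_cores=3))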

  3. Neural Simulations on Multi-Core Architectures

    PubMed Central

    Eichner, Hubert; Klug, Tobias; Borst, Alexander

    2009-01-01

    Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing. PMID:19636393

  4. Advanced flight computer. Special study

    NASA Technical Reports Server (NTRS)

    Coo, Dennis

    1995-01-01

    This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets the AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.

  5. Semi-Markov adjunction to the Computer-Aided Markov Evaluator (CAME)

    NASA Technical Reports Server (NTRS)

    Rosch, Gene; Hutchins, Monica A.; Leong, Frank J.; Babcock, Philip S., IV

    1988-01-01

    The rule-based Computer-Aided Markov Evaluator (CAME) program was expanded in its ability to incorporate the effect of fault-handling processes into the construction of a reliability model. The fault-handling processes are modeled as semi-Markov events and CAME constructs an appropriate semi-Markov model. To solve the model, the program outputs it in a form which can be directly solved with the Semi-Markov Unreliability Range Evaluator (SURE) program. As a means of evaluating the alterations made to the CAME program, the program is used to model the reliability of portions of the Integrated Airframe/Propulsion Control System Architecture (IAPSA 2) reference configuration. The reliability predictions are compared with a previous analysis. The results bear out the feasibility of utilizing CAME to generate appropriate semi-Markov models to model fault-handling processes.

  6. Advanced information processing system for advanced launch system: Avionics architecture synthesis

    NASA Technical Reports Server (NTRS)

    Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.

    1991-01-01

    The Advanced Information Processing System (AIPS) is a fault-tolerant distributed computer system architecture that was developed to meet the real time computational needs of advanced aerospace vehicles. One such vehicle is the Advanced Launch System (ALS) being developed jointly by NASA and the Department of Defense to launch heavy payloads into low earth orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An avionics architecture that utilizes the AIPS hardware and software building blocks was synthesized for ALS. The AIPS for ALS architecture synthesis process starting with the ALS mission requirements and ending with an analysis of the candidate ALS avionics architecture is described.

  7. GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Brent, G. A.

    1978-01-01

    A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed inflight information for the pilot. A processor architecture called the Team Architecture was proposed. This is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/I simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs, and results are presented at length. Also included are program input formats, outputs and listings.

  8. Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment

    NASA Technical Reports Server (NTRS)

    Duong, Tuan A.; Duong, Vu A.

    2012-01-01

    A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, the bio-inspired models of visual pathway and olfactory receptor processing are combined as processing components, to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual and odorant data.

  9. Electromagnetic Physics Models for Parallel Computing Architectures

    NASA Astrophysics Data System (ADS)

    Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.

    2016-10-01

    The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
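
    A toy NumPy illustration of the basket/vector approach the abstract describes: the same per-track operation applied to a whole array of tracks at once instead of one track at a time, which is what SIMD-friendly code needs. The "energy loss" formula and all numbers are placeholders, not a GeantV physics model.

      import numpy as np

      rng = np.random.default_rng(42)
      energies = rng.uniform(1.0, 100.0, size=10_000)     # a basket of track energies
      step_lengths = rng.uniform(0.01, 0.1, size=10_000)  # per-track step lengths

      # Scalar style: one track per loop iteration (hard to vectorize).
      loss_scalar = np.array([0.2 * e * s for e, s in zip(energies, step_lengths)])

      # Vector style: the whole basket processed at once (SIMD-friendly).
      loss_vector = 0.2 * energies * step_lengths

      assert np.allclose(loss_scalar, loss_vector)        # identical results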

  10. Computational structures for robotic computations

    NASA Technical Reports Server (NTRS)

    Lee, C. S. G.; Chang, P. R.

    1987-01-01

    The computational problem of the inverse kinematics and inverse dynamics of robot manipulators, addressed by taking advantage of parallelism and pipelining architectures, is discussed. For the computation of the inverse kinematic position solution, a maximum pipelined CORDIC architecture has been designed based on a functional decomposition of the closed-form joint equations. For the inverse dynamics computation, an efficient p-fold parallel algorithm to overcome the recurrence problem of the Newton-Euler equations of motion to achieve the time lower bound of O(log₂ n) has also been developed.
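
    The O(log₂ n) bound comes from combining n partial results pairwise in a binary-tree (recursive-doubling) fashion. The generic Python reduction below illustrates where that bound arises; it is only an illustration of the counting argument, not the paper's Newton-Euler reformulation.

      import math


      def tree_reduce(values, combine):
          """Combine n values pairwise per round: ceil(log2 n) rounds in total."""
          values = list(values)
          rounds = 0
          while len(values) > 1:
              nxt = []
              for i in range(0, len(values) - 1, 2):
                  nxt.append(combine(values[i], values[i + 1]))  # pairs done "in parallel"
              if len(values) % 2:
                  nxt.append(values[-1])                         # odd element carried over
              values = nxt
              rounds += 1
          return values[0], rounds


      total, rounds = tree_reduce(range(1, 9), lambda a, b: a + b)
      print(total, rounds, math.ceil(math.log2(8)))              # 36, 3, 3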

  11. [Design and study of parallel computing environment of Monte Carlo simulation for particle therapy planning using a public cloud-computing infrastructure].

    PubMed

    Yokohama, Noriya

    2013-07-01

    This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
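
    A minimal Python sketch of the parallelization pattern: independent Monte Carlo batches run on separate processes and their tallies are merged. Estimating pi stands in for particle-therapy dose scoring, and the process count and sample sizes are arbitrary; nothing here reproduces the report's cloud or HPC-instance setup.

      import random
      from multiprocessing import Pool


      def batch_hits(args):
          """Count random points that land inside the unit quarter circle."""
          seed, samples = args
          rng = random.Random(seed)                # independent stream per batch
          hits = 0
          for _ in range(samples):
              x, y = rng.random(), rng.random()
              if x * x + y * y <= 1.0:
                  hits += 1
          return hits


      if __name__ == "__main__":
          n_workers, per_batch = 4, 250_000
          with Pool(n_workers) as pool:
              hits = pool.map(batch_hits, [(seed, per_batch) for seed in range(n_workers)])
          print(4.0 * sum(hits) / (n_workers * per_batch))   # rough estimate of pi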

  12. 77 FR 35962 - Utilizing Rapidly Deployable Aerial Communications Architecture in Response to an Emergency

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-15

    ... Aerial Communications Architecture in Response to an Emergency AGENCY: Federal Communications Commission... deployable aerial communications architecture (DACA) in facilitating emergency response by rapidly restoring... copying during normal business hours in the FCC Reference Information Center, Portals II, 445 12th Street...

  13. An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems

    NASA Technical Reports Server (NTRS)

    Follen, Gregory J.

    2003-01-01

    Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT).

  14. Solving the Cauchy-Riemann equations on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1987-01-01

    Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations, a set of coupled first-order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit-serial processors; the FLEX/32, an MIMD machine with 20 processors; and the CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
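
    A small NumPy sketch to convey the flavor of grid relaxation. Because the Cauchy-Riemann relations (u_x = v_y, u_y = -v_x) make both components harmonic, a Jacobi sweep of Laplace's equation on one component gives a simple stand-in; the grid size, boundary data, and iteration count are arbitrary, and this is not the paper's coupled first-order scheme.

      import numpy as np

      n = 32
      u = np.zeros((n, n))
      u[0, :] = 1.0                               # fixed boundary values on one edge

      for _ in range(2000):                       # Jacobi relaxation sweeps
          # New interior value = average of the four old neighbors.
          interior = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:])
          u[1:-1, 1:-1] = interior

      print(round(float(u[n // 2, n // 2]), 3))   # relaxed value at the grid center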

  15. Application of Tessellation in Architectural Geometry Design

    NASA Astrophysics Data System (ADS)

    Chang, Wei

    2018-06-01

    Tessellation plays a significant role in architectural geometry design; it has been used widely throughout the history of architecture and in modern architectural design with the help of computer technology. Tessellation has existed since the birth of civilization. In terms of dimensions, there are two-dimensional and three-dimensional tessellations; in terms of symmetry, there are periodic and aperiodic tessellations. In addition, special types of tessellations such as the Voronoi tessellation and Delaunay triangulation are also included. Both geometry and crystallography, the latter of which is the basic theory of three-dimensional tessellations, need to be studied. Historically, tessellation was applied to skins or decorations in architecture. The development of computer technology makes tessellation more powerful, as seen in surface control, surface display, and structure design. Therefore, research on the application of tessellation in architectural geometry design is of great necessity in architecture studies.
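
    A minimal sketch of computing a Voronoi tessellation, one of the special types mentioned above, assuming scipy is available; in a design workflow the finite cell edges could seed a panelization or facade pattern.

      import numpy as np
      from scipy.spatial import Voronoi

      # Random seed points standing in for control points on a facade surface.
      rng = np.random.default_rng(0)
      points = rng.random((20, 2))

      vor = Voronoi(points)

      # Finite ridge segments (pairs of cell vertices) usable as panel edges.
      edges = [vor.vertices[ridge] for ridge in vor.ridge_vertices if -1 not in ridge]
      print(len(points), "seeds ->", len(edges), "finite cell edges")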

  16. A single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder

    NASA Technical Reports Server (NTRS)

    Hsu, I. S.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.

    1986-01-01

    A description of a single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder is presented. The architecture that leads to this single VLSI chip design makes use of the dual basis multiplication algorithm. The same architecture can be applied to design VLSI chips to compute various kinds of number theoretic transforms.
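
    A minimal software sketch of the syndrome computation itself (not the dual-basis VLSI datapath), assuming the common GF(256) primitive polynomial 0x11D and a (255, 223) code with 32 syndromes; an error-free codeword would yield all-zero syndromes.

      # GF(256) arithmetic via exponential/logarithm tables for generator alpha = 2.
      EXP = [0] * 512
      LOG = [0] * 256
      x = 1
      for i in range(255):
          EXP[i] = x
          LOG[x] = i
          x <<= 1
          if x & 0x100:
              x ^= 0x11D                      # reduce modulo the primitive polynomial
      for i in range(255, 512):
          EXP[i] = EXP[i - 255]

      def gf_mul(a, b):
          return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

      def syndromes(received, n_syn=32):
          """Evaluate the received polynomial at alpha^1 .. alpha^n_syn (Horner's rule)."""
          out = []
          for i in range(1, n_syn + 1):
              alpha_i = EXP[i]
              s = 0
              for byte in received:
                  s = gf_mul(s, alpha_i) ^ byte
              out.append(s)
          return out

      rx = [0] * 255                          # the all-zero codeword, received intact
      rx[10] ^= 0x5A                          # inject a single byte error
      print(syndromes(rx)[:4])                # nonzero syndromes reveal the error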

  17. Parallel processing for digital picture comparison

    NASA Technical Reports Server (NTRS)

    Cheng, H. D.; Kou, L. T.

    1987-01-01

    In picture processing an important problem is to identify two digital pictures of the same scene taken under different lighting conditions. This kind of problem can be found in remote sensing, satellite signal processing and related areas. The identification can be done by transforming the gray levels so that the gray level histograms of the two pictures are closely matched. The transformation problem can be solved by using the packing method. Researchers propose a VLSI architecture consisting of m x n processing elements with extensive parallel and pipelining computation capabilities to speed up the transformation with time complexity O(max(m,n)), where m and n are the numbers of gray levels of the input picture and the reference picture, respectively. Using a uniprocessor and a dynamic programming algorithm, the time complexity would be O(m^3 x n). The algorithm partition problem, as an important issue in VLSI design, is discussed. Verification of the proposed architecture is also given.
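
    A serial sketch of the gray-level transformation goal described above, implemented as standard histogram matching with numpy; the packing method and the m x n processor array of the paper are not reproduced here.

      import numpy as np

      def match_histogram(source, reference, n_levels=256):
          """Remap source gray levels so its histogram approximates the reference's."""
          src_hist = np.bincount(source.ravel(), minlength=n_levels)
          ref_hist = np.bincount(reference.ravel(), minlength=n_levels)
          src_cdf = np.cumsum(src_hist) / source.size
          ref_cdf = np.cumsum(ref_hist) / reference.size
          # For each source level, pick the reference level with the closest CDF value.
          lut = np.searchsorted(ref_cdf, src_cdf).clip(0, n_levels - 1).astype(np.uint8)
          return lut[source]

      rng = np.random.default_rng(1)
      dark = rng.integers(0, 128, (64, 64), dtype=np.uint8)     # underexposed scene
      bright = rng.integers(64, 256, (64, 64), dtype=np.uint8)  # same scene, more light
      print(match_histogram(dark, bright).mean(), bright.mean())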

  18. A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)

    NASA Technical Reports Server (NTRS)

    Carroll, Chester C.; Owen, Jeffrey E.

    1988-01-01

    A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.

  19. Parameterized Micro-benchmarking: An Auto-tuning Approach for Complex Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ma, Wenjing; Krishnamoorthy, Sriram; Agrawal, Gagan

    2012-05-15

    Auto-tuning has emerged as an important practical method for creating highly optimized implementations of key computational kernels and applications. However, the growing complexity of architectures and applications is creating new challenges for auto-tuning. Complex applications can involve a prohibitively large search space that precludes empirical auto-tuning. Similarly, architectures are becoming increasingly complicated, making it hard to model performance. In this paper, we focus on the challenge to auto-tuning presented by applications with a large number of kernels and kernel instantiations. While these kernels may share a somewhat similar pattern, they differ considerably in problem sizes and the exact computation performed. We propose and evaluate a new approach to auto-tuning which we refer to as parameterized micro-benchmarking. It is an alternative to the two existing classes of approaches to auto-tuning: analytical model-based and empirical search-based. Particularly, we argue that the former may not be able to capture all the architectural features that impact performance, whereas the latter might be too expensive for an application that has several different kernels. In our approach, different expressions in the application, different possible implementations of each expression, and the key architectural features, are used to derive a simple micro-benchmark and a small parameter space. This allows us to learn the most significant features of the architecture that can impact the choice of implementation for each kernel. We have evaluated our approach in the context of GPU implementations of tensor contraction expressions encountered in excited state calculations in quantum chemistry. We have focused on two aspects of GPUs that affect tensor contraction execution: memory access patterns and kernel consolidation. Using our parameterized micro-benchmarking approach, we obtain a speedup of up to 2x over the version that used default optimizations, but no auto-tuning. We demonstrate that observations made from micro-benchmarks match the behavior seen from real expressions. In the process, we make important observations about the memory hierarchy of two of the most recent NVIDIA GPUs, which can be used in other optimization frameworks as well.

  20. Evidence of common and separate eye and hand accumulators underlying flexible eye-hand coordination

    PubMed Central

    Jana, Sumitash; Gopal, Atul

    2016-01-01

    Eye and hand movements are initiated by anatomically separate regions in the brain, and yet these movements can be flexibly coupled and decoupled, depending on the need. The computational architecture that enables this flexible coupling of independent effectors is not understood. Here, we studied the computational architecture that enables flexible eye-hand coordination using a drift diffusion framework, which predicts that the variability of the reaction time (RT) distribution scales with its mean. We show that a common stochastic accumulator to threshold, followed by a noisy effector-dependent delay, explains eye-hand RT distributions and their correlation in a visual search task that required decision-making, while an interactive eye and hand accumulator model did not. In contrast, in an eye-hand dual task, an interactive model better predicted the observed correlations and RT distributions than a common accumulator model. Notably, these two models could only be distinguished on the basis of the variability and not the means of the predicted RT distributions. Additionally, signatures of separate initiation signals were also observed in a small fraction of trials in the visual search task, implying that these distinct computational architectures were not a manifestation of the task design per se. Taken together, our results suggest two unique computational architectures for eye-hand coordination, with task context biasing the brain toward instantiating one of the two architectures. NEW & NOTEWORTHY Previous studies on eye-hand coordination have considered mainly the means of eye and hand reaction time (RT) distributions. Here, we leverage the approximately linear relationship between the mean and standard deviation of RT distributions, as predicted by the drift-diffusion model, to propose the existence of two distinct computational architectures underlying coordinated eye-hand movements. These architectures, for the first time, provide a computational basis for the flexible coupling between eye and hand movements. PMID:27784809
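
    A minimal simulation of the drift-diffusion prediction leveraged above, assuming a single noisy accumulator rising to a fixed threshold; lowering the drift rate increases both the mean and the spread of the simulated reaction times, which is the mean-variability scaling the authors exploit.

      import numpy as np

      def ddm_rts(drift, threshold=1.0, noise=0.1, dt=0.001, n_trials=500, seed=0):
          """First-passage times of a noisy accumulator drifting toward a threshold."""
          rng = np.random.default_rng(seed)
          rts = []
          for _ in range(n_trials):
              x, t = 0.0, 0.0
              while x < threshold:
                  x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
                  t += dt
              rts.append(t)
          return np.array(rts)

      for drift in (2.0, 1.0, 0.5):            # weaker evidence -> slower responses
          rts = ddm_rts(drift)
          print(f"drift={drift}: mean={rts.mean():.3f} s, sd={rts.std():.3f} s")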

  1. Fostering Enterprise Architecture Education and Training with the Enterprise Architecture Competence Framework

    ERIC Educational Resources Information Center

    Tambouris, Efthimios; Zotou, Maria; Kalampokis, Evangelos; Tarabanis, Konstantinos

    2012-01-01

    Enterprise architecture (EA) implementation refers to a set of activities ultimately aiming to align business objectives with information technology infrastructure in an organization. EA implementation is a multidisciplinary, complicated and endless process and hence calls for adequate education and training programs that will build highly skilled…

  2. A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.

    PubMed

    Sağiroğlu, Mahmut Şamİl; Külekcİ, M Oğuzhan

    2017-11-01

    The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.

  3. Computer graphics in architecture and engineering

    NASA Technical Reports Server (NTRS)

    Greenberg, D. P.

    1975-01-01

    The present status of the application of computer graphics to the building profession or architecture and its relationship to other scientific and technical areas were discussed. It was explained that, due to the fragmented nature of architecture and building activities (in contrast to the aerospace industry), a comprehensive, economic utilization of computer graphics in this area is not practical and its true potential cannot now be realized due to the present inability of architects and structural, mechanical, and site engineers to rely on a common data base. Future emphasis will therefore have to be placed on a vertical integration of the construction process and effective use of a three-dimensional data base, rather than on waiting for any technological breakthrough in interactive computing.

  4. Innovative architectures for dense multi-microprocessor computers

    NASA Technical Reports Server (NTRS)

    Larson, Robert E.

    1989-01-01

    The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
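
    A minimal sketch of the kind of chordal ring topology studied in the project, assuming an undirected ring of n nodes with an extra chord of length w from every node; the diameter printout gives a quick feel for how chords shorten communication paths.

      from collections import deque

      def chordal_ring(n, w):
          """Adjacency sets for an n-node ring where node i also links to i +/- w (mod n)."""
          return {i: {(i - 1) % n, (i + 1) % n, (i + w) % n, (i - w) % n}
                  for i in range(n)}

      def diameter(adj):
          """Longest shortest-path length, found by BFS from every node."""
          worst = 0
          for src in adj:
              dist = {src: 0}
              queue = deque([src])
              while queue:
                  u = queue.popleft()
                  for v in adj[u]:
                      if v not in dist:
                          dist[v] = dist[u] + 1
                          queue.append(v)
              worst = max(worst, max(dist.values()))
          return worst

      print(diameter(chordal_ring(16, 1)))   # plain 16-node ring: diameter 8
      print(diameter(chordal_ring(16, 5)))   # chords of length 5 cut the diameter to 4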

  5. Fault tolerant architectures for integrated aircraft electronics systems

    NASA Technical Reports Server (NTRS)

    Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.

    1983-01-01

    Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.

  6. HyperForest: A high performance multi-processor architecture for real-time intelligent systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.

    1997-04-01

    Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.

  7. A light-stimulated synaptic device based on graphene hybrid phototransistor

    NASA Astrophysics Data System (ADS)

    Qin, Shuchao; Wang, Fengqiu; Liu, Yujie; Wan, Qing; Wang, Xinran; Xu, Yongbing; Shi, Yi; Wang, Xiaomu; Zhang, Rong

    2017-09-01

    Neuromorphic chips refer to an unconventional computing architecture that is modelled on biological brains. They are increasingly employed for processing sensory data for machine vision, context cognition, and decision making. Despite rapid advances, neuromorphic computing has remained largely an electronic technology, making it a challenge to access the superior computing features provided by photons, or to directly process vision data that has increasing importance to artificial intelligence. Here we report a novel light-stimulated synaptic device based on a graphene-carbon nanotube hybrid phototransistor. Significantly, the device can respond to optical stimuli in a highly neuron-like fashion and exhibits flexible tuning of both short- and long-term plasticity. These features combined with the spatiotemporal processability make our device a capable counterpart to today’s electrically-driven artificial synapses, with superior reconfigurable capabilities. In addition, our device allows for generic optical spike processing, which provides a foundation for more sophisticated computing. The silicon-compatible, multifunctional photosensitive synapse opens up a new opportunity for neural networks enabled by photonics and extends current neuromorphic systems in terms of system complexities and functionalities.

  8. Space station needs, attributes and architectural options: Architectural options and selection

    NASA Technical Reports Server (NTRS)

    Nelson, W. G.

    1983-01-01

    The approach, study results, and recommendations for defining and selecting space station architectural options are described. Space station system architecture is defined as the arrangement of elements (manned and unmanned on-orbit facilities, shuttle vehicles, orbital transfer vehicles, etc.), the number of these elements, their location (orbital inclination and altitude, and their functional performance capability, power, volume, crew, etc.). Architectural options are evaluated based on the degree of mission capture versus cost and required funding rate. Mission capture refers to the number of missions accommodated by the particular architecture.

  9. Topical perspective on massive threading and parallelism.

    PubMed

    Farber, Robert M

    2011-09-01

    Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive threading and massive parallelism, as in the 1.3-million-thread Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts, be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world, is how to capitalize on these new massively threaded computational architectures--especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism with some insight into the future. Published by Elsevier Inc.

  10. Scaling Watershed Models: Modern Approaches to Science Computation with MapReduce, Parallelization, and Cloud Optimization

    EPA Science Inventory

    Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...

  11. Modelling parallel programs and multiprocessor architectures with AXE

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Fineman, Charles E.

    1991-01-01

    AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.

  12. A high performance parallel computing architecture for robust image features

    NASA Astrophysics Data System (ADS)

    Zhou, Renyan; Liu, Leibo; Wei, Shaojun

    2014-03-01

    A design of a parallel architecture for image feature detection and description is proposed in this article. The major component of this architecture is a 2D cellular network composed of simple reprogrammable processors, enabling the Hessian Blob Detector and Haar Response Calculation, which are the most computing-intensive stages of the Speeded Up Robust Features (SURF) algorithm. Combining this 2D cellular network and dedicated hardware for SURF descriptors, this architecture achieves real-time image feature detection with minimal software in the host processor. A prototype FPGA implementation of the proposed architecture achieves 1318.9 GOPS of general pixel processing at a 100 MHz clock and achieves up to 118 fps in VGA (640 × 480) image feature detection. The proposed architecture is stand-alone and scalable, so it can easily be migrated to a VLSI implementation.
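
    A minimal sketch of the Haar response computation that the 2D cellular network above accelerates, assuming the standard integral-image trick; box sums of any size become four table lookups, which keeps the per-pixel work uniform enough to parallelize.

      import numpy as np

      def integral_image(img):
          """Summed-area table with a zero row/column prepended for easy indexing."""
          ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
          ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
          return ii

      def box_sum(ii, y, x, h, w):
          """Sum of img[y:y+h, x:x+w] from four integral-image lookups."""
          return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

      def haar_x_response(ii, y, x, h, w):
          """Horizontal Haar-like feature: right half minus left half of a box."""
          half = w // 2
          return box_sum(ii, y, x + half, h, half) - box_sum(ii, y, x, h, half)

      img = np.random.default_rng(2).random((480, 640))
      ii = integral_image(img)
      print(haar_x_response(ii, 100, 200, 16, 16))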

  13. The RISC (Reduced Instruction Set Computer) Architecture and Computer Performance Evaluation.

    DTIC Science & Technology

    1986-03-01

    ...time where the main emphasis of the evaluation process is put on the software. The model is intended to provide a tool for computer architects to use... program, or 3) was to be implemented in random logic more effectively than the equivalent sequence of software instructions. Both data and address... definition is the IEEE Standard 729-1983, stating computer architecture as: "the process of defining a collection of hardware and software components and...

  14. First 3 years of operation of RIACS (Research Institute for Advanced Computer Science) (1983-1985)

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1986-01-01

    The focus of the Research Institute for Advanced Computer Science (RIACS) is to explore matches between advanced computing architectures and the processes of scientific research. An architecture evaluation of the MIT static dataflow machine, specification of a graphical language for expressing distributed computations, and specification of an expert system for aiding in grid generation for two-dimensional flow problems were initiated. Research projects for 1984 and 1985 are summarized.

  15. Design and Analysis of Compact DNA Strand Displacement Circuits for Analog Computation Using Autocatalytic Amplifiers.

    PubMed

    Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John

    2018-01-19

    A main goal in DNA computing is to build DNA circuits to compute designated functions using a minimal number of DNA strands. Here, we propose a novel architecture to build compact DNA strand displacement circuits to compute a broad scope of functions in an analog fashion. A circuit by this architecture is composed of three autocatalytic amplifiers, and the amplifiers interact to perform computation. We show DNA circuits to compute functions sqrt(x), ln(x) and exp(x) for x in tunable ranges with simulation results. A key innovation in our architecture, inspired by Napier's use of logarithm transforms to compute square roots on a slide rule, is to make use of autocatalytic amplifiers to do logarithmic and exponential transforms in concentration and time. In particular, we convert from the input that is encoded by the initial concentration of the input DNA strand, to time, and then back again to the output encoded by the concentration of the output DNA strand at equilibrium. This combined use of strand-concentration and time encoding of computational values may have impact on other forms of molecular computation.

  16. Verification of Security Policy Enforcement in Enterprise Systems

    NASA Astrophysics Data System (ADS)

    Gupta, Puneet; Stoller, Scott D.

    Many security requirements for enterprise systems can be expressed in a natural way as high-level access control policies. A high-level policy may refer to abstract information resources, independent of where the information is stored; it controls both direct and indirect accesses to the information; it may refer to the context of a request, i.e., the request’s path through the system; and its enforcement point and enforcement mechanism may be unspecified. Enforcement of a high-level policy may depend on the system architecture and the configurations of a variety of security mechanisms, such as firewalls, host login permissions, file permissions, DBMS access control, and application-specific security mechanisms. This paper presents a framework in which all of these can be conveniently and formally expressed, a method to verify that a high-level policy is enforced, and an algorithm to determine a trusted computing base for each resource.

  17. Performances of multiprocessor multidisk architectures for continuous media storage

    NASA Astrophysics Data System (ADS)

    Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.

    1996-03-01

    Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.

  18. Positioning navigation and timing service applications in cyber physical systems

    NASA Astrophysics Data System (ADS)

    Qu, Yi; Wu, Xiaojing; Zeng, Lingchuan

    2017-10-01

    The positioning, navigation, and timing (PNT) architecture was discussed in detail: its history, evolution, current status, and future plans were presented; main technologies were listed; advantages and limitations of most technologies were compared; novel approaches were introduced; and future capacities were sketched. The concept of a cyber-physical system (CPS) was described and its primary features were interpreted. Then the three-layer architecture of CPS was illustrated. Next, CPS requirements on PNT services were analyzed, including requirements on position and time references, temporal-spatial error monitoring, dynamic services, real-time services, autonomous services, security services, and standard services. Finally, challenges faced by PNT applications in CPS were summarized. The conclusions are expected to facilitate PNT applications in CPS and, furthermore, to provide references for the design and implementation of both architectures.

  19. Electromagnetic physics models for parallel computing architectures

    DOE PAGES

    Amadio, G.; Ananya, A.; Apostolakis, J.; ...

    2016-11-21

    The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe the implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Finally, the results of preliminary performance evaluation and physics validation are presented as well.

  20. Architectural Implications of Cloud Computing

    DTIC Science & Technology

    2011-10-24

    Public cloud computing types include Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS), based on the type of... SaaS is a model of software deployment in which a third party... and System Solutions (RTSS) Program. Her current interests and projects are in service-oriented architecture (SOA), cloud computing, and context...

  1. Generic Software for Emulating Multiprocessor Architectures.

    DTIC Science & Technology

    1985-05-01

    MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139. Keywords: computer architecture, emulation, simulation, dataflow.

  2. Sigint Application for Polymorphous Computing Architecture (PCA): Wideband DF

    DTIC Science & Technology

    2006-08-01

    The goal of the Polymorphous Computing Architecture (PCA) program, as stated by Robert Graybill, is to develop the computing foundation for agile systems by establishing... The ubiquitous MUSIC algorithm relies upon an underlying narrowband signal model [8]. In this case, narrowband means that the signal bandwidth is less than... A wideband DF algorithm is needed to compensate for this model inadequacy. Among the various wideband DF techniques available, the coherent signal...
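
    A minimal sketch of the narrowband MUSIC direction-finding step referenced above, assuming a half-wavelength-spaced uniform linear array and simulated snapshots; the wideband coherent-signal extensions discussed in the report are not shown.

      import numpy as np

      def steering(n_sensors, theta_rad):
          """Array response of a half-wavelength-spaced uniform linear array."""
          k = np.arange(n_sensors)
          return np.exp(1j * np.pi * k * np.sin(theta_rad))

      def music_spectrum(snapshots, n_sources, scan_deg):
          """Narrowband MUSIC pseudospectrum over a grid of candidate angles."""
          R = snapshots @ snapshots.conj().T / snapshots.shape[1]   # sample covariance
          _, eigvecs = np.linalg.eigh(R)                            # ascending eigenvalues
          noise_space = eigvecs[:, : R.shape[0] - n_sources]        # noise subspace
          spec = []
          for deg in scan_deg:
              a = steering(R.shape[0], np.radians(deg))
              spec.append(1.0 / np.linalg.norm(noise_space.conj().T @ a) ** 2)
          return np.array(spec)

      rng = np.random.default_rng(3)
      n_sensors, n_snap, true_deg = 8, 200, [-20.0, 35.0]
      A = np.column_stack([steering(n_sensors, np.radians(d)) for d in true_deg])
      signals = rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))
      noise = 0.1 * (rng.standard_normal((n_sensors, n_snap)) +
                     1j * rng.standard_normal((n_sensors, n_snap)))
      snapshots = A @ signals + noise

      scan = np.arange(-90.0, 90.0, 0.5)
      spec = music_spectrum(snapshots, n_sources=2, scan_deg=scan)
      print(scan[np.argmax(spec)])   # strongest peak lands near one true direction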

  3. Exploring Gigabyte Datasets in Real Time: Architectures, Interfaces and Time-Critical Design

    NASA Technical Reports Server (NTRS)

    Bryson, Steve; Gerald-Yamasaki, Michael (Technical Monitor)

    1998-01-01

    Architectures and Interfaces: The implications of real-time interaction on software architecture design: decoupling of interaction/graphics and computation into asynchronous processes. The performance requirements of graphics and computation for interaction. Time management in such an architecture. Examples of how visualization algorithms must be modified for high performance. Brief survey of interaction techniques and design, including direct manipulation and manipulation via widgets. The talk discusses how human factors considerations drove the design and implementation of the virtual wind tunnel. Time-Critical Design: A survey of time-critical techniques for both computation and rendering. Emphasis on the assignment of a time budget to both the overall visualization environment and to each individual visualization technique in the environment. The estimation of the benefit and cost of an individual technique. Examples of the modification of visualization algorithms to allow time-critical control.

  4. Hardware architecture design of image restoration based on time-frequency domain computation

    NASA Astrophysics Data System (ADS)

    Wen, Bo; Zhang, Jing; Jiao, Zipeng

    2013-10-01

    Image restoration algorithms based on time-frequency domain computation (TFDC) are highly mature and widely applied in engineering. To enable high-speed implementation of these algorithms, a TFDC hardware architecture is proposed. First, the main module is designed by analyzing the common processing and numerical calculations. Then, to improve commonality, an iteration control module is planned for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming modules, which include the two-dimensional FFT/IFFT and the complex-number calculations. Eventually, the TFDC hardware architecture is adopted for the hardware design of a real-time image restoration system. The results show that the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithmic commonality, hardware realizability, and high efficiency.
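
    A minimal software sketch of the frequency-domain restoration such hardware accelerates, assuming a known Gaussian blur kernel and a Wiener-style regularized inverse filter; the two-dimensional FFT/IFFT pair is the module the architecture above maps into hardware.

      import numpy as np

      def gaussian_psf(shape, sigma=2.0):
          """Centered Gaussian point-spread function normalized to unit sum."""
          y, x = np.indices(shape)
          cy, cx = (shape[0] - 1) / 2.0, (shape[1] - 1) / 2.0
          psf = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
          return psf / psf.sum()

      def wiener_restore(blurred, psf, k=1e-3):
          """Divide in the frequency domain with regularization constant k."""
          H = np.fft.fft2(np.fft.ifftshift(psf))
          G = np.fft.fft2(blurred)
          F_hat = np.conj(H) * G / (np.abs(H) ** 2 + k)
          return np.real(np.fft.ifft2(F_hat))

      # Smooth test pattern, blurred and then restored with the same PSF.
      x = np.cos(np.linspace(0, 4 * np.pi, 128))
      y = np.cos(np.linspace(0, 6 * np.pi, 128))
      image = y[:, None] * x[None, :]
      psf = gaussian_psf(image.shape)
      blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(np.fft.ifftshift(psf))))
      print(np.abs(wiener_restore(blurred, psf) - image).mean())   # small residual error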

  5. Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture.

    PubMed

    Takeda, Shuntaro; Furusawa, Akira

    2017-09-22

    We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.

  6. Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture

    NASA Astrophysics Data System (ADS)

    Takeda, Shuntaro; Furusawa, Akira

    2017-09-01

    We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.

  7. High-throughput neuroimaging-genetics computational infrastructure

    PubMed Central

    Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Hobel, Sam; Vespa, Paul; Woo Moon, Seok; Van Horn, John D.; Franco, Joseph; Toga, Arthur W.

    2014-01-01

    Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Result interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at the University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating, and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure. PMID:24795619

  8. Heterogeneous computing architecture for fast detection of SNP-SNP interactions.

    PubMed

    Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros

    2014-06-25

    The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. The GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.

  9. Heterogeneous computing architecture for fast detection of SNP-SNP interactions

    PubMed Central

    2014-01-01

    Background The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. The GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
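
    A minimal single-threaded sketch of the exhaustive pairwise scan these accelerators speed up, assuming 0/1/2-coded genotypes, a binary phenotype, and a simple mutual-information score; the GPU and MIC kernels described above replace the nested pair loop.

      import numpy as np

      def pair_score(g1, g2, phenotype):
          """Mutual information between a combined two-SNP genotype and the phenotype."""
          combo = g1 * 3 + g2                      # 3 x 3 = 9 joint genotype classes
          score = 0.0
          for c in range(9):
              for p in (0, 1):
                  joint = np.mean((combo == c) & (phenotype == p))
                  if joint > 0:
                      score += joint * np.log(joint /
                                              (np.mean(combo == c) * np.mean(phenotype == p)))
          return score

      rng = np.random.default_rng(5)
      n_snps, n_samples = 50, 400
      genotypes = rng.integers(0, 3, (n_snps, n_samples))
      phenotype = rng.integers(0, 2, n_samples)

      best = max(((i, j, pair_score(genotypes[i], genotypes[j], phenotype))
                  for i in range(n_snps) for j in range(i + 1, n_snps)),
                 key=lambda t: t[2])
      print("best pair:", best[0], best[1], "score:", round(best[2], 4))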

  10. Automated Discovery of Machine-Specific Code Improvements

    DTIC Science & Technology

    1984-12-01

    ...operation of the source language. Additional analysis may reveal special features of the target architecture that may be exploited to generate efficient code. Such analysis is optional... incorporate knowledge of the source language, but do not refer to features of the target machine. These early phases are sometimes referred to as the...

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCaskey, Alexander J.

    Hybrid programming models for beyond-CMOS technologies will prove critical for integrating new computing technologies alongside our existing infrastructure. Unfortunately the software infrastructure required to enable this is lacking or not available. XACC is a programming framework for extreme-scale, post-exascale accelerator architectures that integrates alongside existing conventional applications. It is a pluggable framework for programming languages developed for next-gen computing hardware architectures like quantum and neuromorphic computing. It lets computational scientists efficiently off-load classically intractable work to attached accelerators through user-friendly Kernel definitions. XACC makes post-exascale hybrid programming approachable for domain computational scientists.

  12. Application of computational physics within Northrop

    NASA Technical Reports Server (NTRS)

    George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.

    1987-01-01

    An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analogous developments in CEM. The relationship between advances in these areas and the development of advanced (parallel) architectures and expert systems is also presented.

  13. All-memristive neuromorphic computing with level-tuned neurons

    NASA Astrophysics Data System (ADS)

    Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos

    2016-09-01

    In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amounts of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogeneous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represents a significant step towards the development of ultrahigh-density neuromorphic co-processors.

  14. All-memristive neuromorphic computing with level-tuned neurons.

    PubMed

    Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos

    2016-09-02

    In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amounts of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogeneous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represents a significant step towards the development of ultrahigh-density neuromorphic co-processors.
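
    A minimal software sketch of the neuron-and-synapse dynamic described above, assuming a leaky integrate-and-fire neuron and a simplified spike-timing-dependent plasticity rule rather than phase-change device physics; synapses whose inputs fire just before the output spike are strengthened, so the correlated input group ends up with larger weights.

      import numpy as np

      rng = np.random.default_rng(6)
      n_inputs, n_steps = 20, 5000
      tau_mem, threshold = 20.0, 1.0
      w = np.full(n_inputs, 0.08)                   # synaptic weights
      last_spike = np.full(n_inputs, -1e9)          # most recent input spike times

      v = 0.0
      for t in range(n_steps):
          spikes = np.zeros(n_inputs, dtype=bool)
          spikes[:10] = rng.random() < 0.05         # inputs 0-9 fire together (correlated)
          spikes[10:] = rng.random(10) < 0.05       # inputs 10-19 fire independently
          last_spike[spikes] = t

          v += -v / tau_mem + w @ spikes            # leaky integration of weighted input
          if v >= threshold:                        # output spike: STDP-like update
              v = 0.0
              recent = (t - last_spike) < 10.0      # inputs active just before the output
              w[recent] += 0.004                    # potentiate causal synapses
              w[~recent] -= 0.002                   # depress the rest
              np.clip(w, 0.0, 0.2, out=w)

      print("correlated:", w[:10].mean().round(3), "uncorrelated:", w[10:].mean().round(3))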

  15. What Are the Buildings Saying? A Study of First-Year Undergraduate Students' Attributions about College Campus Architecture.

    ERIC Educational Resources Information Center

    Bennett, Michael A.; Benton, Stephen L.

    2001-01-01

    Examines the attributions college students (N=301) make toward pictures of college campus buildings. Results reveal that students attributed greater likelihood of individual success to pictures depicting modern architecture than they did to those depicting traditional architecture. (Contains 28 references and 3 tables.) (Author/GCP)

  16. Implications of Multi-Core Architectures on the Development of Multiple Independent Levels of Security (MILS) Compliant Systems

    DTIC Science & Technology

    2012-10-01

    Dates covered: March 2010 to April 2012. Contents include a framework for multicore information flow analysis, a hypothetical reference architecture, and a Pentium II block diagram (Figure 2).

  17. 77 FR 187 - Federal Acquisition Regulation; Transition to the System for Award Management (SAM)

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-03

    ... architecture. Deletes reference to "business partner network" at 4.1100, Scope, which is no longer necessary...) architecture has begun. This effort will transition the Central Contractor Registration (CCR) database, the...) to the new architecture. This case provides the first step in updating the FAR for these changes, and...

  18. Specialized computer architectures for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Stevenson, D. K.

    1978-01-01

    In recent years, computational fluid dynamics has made significant progress in modelling aerodynamic phenomena. Currently, one of the major barriers to future development lies in the compute-intensive nature of the numerical formulations and the relative high cost of performing these computations on commercially available general purpose computers, a cost high with respect to dollar expenditure and/or elapsed time. Today's computing technology will support a program designed to create specialized computing facilities to be dedicated to the important problems of computational aerodynamics. One of the still unresolved questions is the organization of the computing components in such a facility. The characteristics of fluid dynamic problems which will have significant impact on the choice of computer architecture for a specialized facility are reviewed.

  19. Communication architecture for AAL. Supporting patient care by health care providers in AAL-enhanced living quarters.

    PubMed

    Nitzsche, T; Thiele, S; Häber, A; Winter, A

    2014-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Using Data from Ambient Assisted Living and Smart Homes in Electronic Health Records". Concepts of Ambient Assisted Living (AAL) support long-term health monitoring and further medical and other services for multimorbid patients with chronic diseases. Many AAL and telemedical applications exist in Germany, but synergy effects from common agreements on essential application components and standards have not been achieved. It is necessary to define a communication architecture based on common definitions of communication scenarios, application components, and communication standards. The development of a communication architecture requires several steps. To obtain a reference model for the problem area, different AAL and telemedicine projects were compared and relevant data elements were generalized. The derived reference model defines standardized communication links. As a result, the authors present an approach towards a reference architecture for AAL communication. The focus of the architecture lies on the communication layer. The necessary application components are identified, and communication based on standards and their extensions is highlighted. The exchange of patient-individual events, supported by an event classification model, as well as raw and aggregated data from the personal home area via a telemedicine center to health care providers, is possible.

  20. Achieving a Launch on Demand Capability

    NASA Astrophysics Data System (ADS)

    Greenberg, Joel S.

    2002-01-01

    The ability to place payloads [satellites] into orbit as and when required, often referred to as launch on demand, continues to be an elusive and yet largely unfulfilled goal. But what is the value of achieving launch on demand [LOD], and what metrics are appropriate? Achievement of a desired level of LOD capability must consider transportation system throughput, alternative transportation systems that comprise the transportation architecture, transportation demand, reliability and failure recovery characteristics of the alternatives, schedule guarantees, launch delays, payload integration schedules, procurement policies, and other factors. Measures of LOD capability should relate to the objective of the transportation architecture: the placement of payloads into orbit as and when required. Launch on demand capability must be defined in probabilistic terms such as the probability of not incurring a delay in excess of T when it is determined that it is necessary to place a payload into orbit. Three specific aspects of launch on demand are considered: [1] the ability to recover from adversity [i.e., a launch failure] and to keep up with the steady-state demand for placing satellites into orbit [this has been referred to as operability and resiliency], [2] the ability to respond to the requirement to launch a satellite when the need arises unexpectedly either because of an unexpected [random] on-orbit satellite failure that requires replacement or because of the sudden recognition of an unanticipated requirement, and [3] the ability to recover from adversity [i.e., a launch failure] during the placement of a constellation into orbit. The objective of this paper is to outline a formal approach for analyzing alternative transportation architectures in terms of their ability to provide an LOD capability. The economic aspect of LOD is developed by establishing a relationship between scheduling and the elimination of on-orbit spares while achieving the desired level of on-orbit availability. Results of an analysis are presented. The implications of launch on demand are addressed for each of the above three situations, and related architecture performance metrics and computer simulation models are described that may be used to evaluate the implications of architecture and policy changes in terms of LOD requirements. The models and metrics are aimed at providing answers to such questions as: How well does a specified space transportation architecture respond to satellite launch demand and changes thereto? How well does a normally functioning and apparently adequate architecture respond to unanticipated needs? What is the effect of a modification to the architecture on its ability to respond to satellite launch demand, including responding to unanticipated needs? What is the cost of the architecture [including facilities, operations, inventory, and satellites]? What is the sensitivity of overall architecture effectiveness and cost to various transportation system delays? What is the effect of adding [or eliminating] a launch vehicle or family of vehicles to [from] the architecture on its effectiveness and cost? What is the value of improving launch vehicle and satellite compatibility, and what are the effects on probability of delay statistics and cost of designing for multi-launch vehicle compatibility?

  1. Analysis of impact of general-purpose graphics processor units in supersonic flow modeling

    NASA Astrophysics Data System (ADS)

    Emelyanov, V. N.; Karpenko, A. G.; Kozelkov, A. S.; Teterina, I. V.; Volkov, K. N.; Yalozo, A. V.

    2017-06-01

    Computational methods are widely used in prediction of complex flowfields associated with off-normal situations in aerospace engineering. Modern graphics processing units (GPU) provide architectures and new programming models that make it possible to harness their large processing power and to design computational fluid dynamics (CFD) simulations at both high performance and low cost. Possibilities of the use of GPUs for the simulation of external and internal flows on unstructured meshes are discussed. The finite volume method is applied to solve three-dimensional unsteady compressible Euler and Navier-Stokes equations on unstructured meshes with high resolution numerical schemes. CUDA technology is used for the programming implementation of parallel computational algorithms. Solutions of some benchmark test cases on GPUs are reported, and the results computed are compared with experimental and computational data. Approaches to optimization of the CFD code related to the use of different types of memory are considered. Speedup of the solution on GPUs with respect to the solution on a central processing unit (CPU) is compared. Performance measurements show that the numerical schemes developed achieve a 20-50 times speedup on GPU hardware compared to the CPU reference implementation. The results obtained provide a promising perspective for designing a GPU-based software framework for applications in CFD.

  2. Efficient architecture for spike sorting in reconfigurable hardware.

    PubMed

    Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying

    2013-11-01

    This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors shares the same circuit to lower the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to avoid the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented by field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design that attains both a high classification correct rate and high-speed computation.
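
    A minimal software sketch of the two algorithms named in the record, Sanger's generalized Hebbian algorithm for feature extraction followed by fuzzy C-means clustering, is given below. The toy spike waveforms, learning rate, and cluster count are assumptions for illustration; the record's contribution is the FPGA realization, which this sketch does not model.

      import numpy as np

      def gha_fit(X, n_components=2, lr=1e-3, epochs=20, seed=0):
          """Sanger's generalized Hebbian algorithm: learns the leading principal
          components of X (rows = spike waveforms) with the online update
          dW = lr * (y x^T - LT[y y^T] W), where LT is the lower-triangular part."""
          rng = np.random.default_rng(seed)
          W = rng.normal(scale=0.1, size=(n_components, X.shape[1]))
          for _ in range(epochs):
              for x in X:
                  y = W @ x
                  W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
          return W

      def fcm(F, n_clusters=3, m=2.0, iters=50, seed=0):
          """Fuzzy C-means on feature vectors F; returns centres and memberships."""
          rng = np.random.default_rng(seed)
          U = rng.random((n_clusters, len(F)))
          U /= U.sum(axis=0)
          for _ in range(iters):
              Um = U ** m
              C = (Um @ F) / Um.sum(axis=1, keepdims=True)          # cluster centres
              d = np.linalg.norm(F[None, :, :] - C[:, None, :], axis=2) + 1e-12
              U = 1.0 / (d ** (2 / (m - 1)))
              U /= U.sum(axis=0)                                    # fuzzy memberships
          return C, U

      # Toy "spikes": three waveform templates plus noise.
      rng = np.random.default_rng(1)
      t = np.linspace(0, 1, 32)
      templates = [a * np.sin(2 * np.pi * f * t) for f, a in ((2, 1.0), (3, 0.7), (5, 1.2))]
      X = np.vstack([tpl + 0.05 * rng.normal(size=t.size)
                     for tpl in templates for _ in range(100)])
      W = gha_fit(X)                      # feature extraction (approximate PCA)
      C, U = fcm(X @ W.T)                 # clustering in the reduced feature space
      print("cluster sizes:", np.bincount(U.argmax(axis=0)))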

  3. Hybrid parallel computing architecture for multiview phase shifting

    NASA Astrophysics Data System (ADS)

    Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun

    2014-11-01

    The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability comes at very high computation costs, and the 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphic processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented in the GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained parallel computing. Experimental results verify that the developed system can perform 50 fps (frames per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the proposed technique using an NVIDIA GT560Ti graphics card rather than a sequential C implementation on a 3.4 GHz Intel Core i7 3770.
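
    The per-pixel wrapped-phase computation that dominates the cost in phase-shifting measurement can be written as a closed-form arctangent over the N shifted images, which is why it parallelizes so well. The sketch below assumes the standard N-step model I_n = A + B*cos(phi + 2*pi*n/N); it illustrates the computation only, not the authors' GPU kernels, calibration, or multiview correspondence.

      import numpy as np

      def wrapped_phase(images):
          """Wrapped phase from N equally shifted fringe images, assuming the
          standard model I_n = A + B*cos(phi + 2*pi*n/N).  The arctangent is
          evaluated independently at every pixel, so the work is embarrassingly
          parallel."""
          images = np.asarray(images, dtype=float)
          shifts = 2.0 * np.pi * np.arange(images.shape[0]) / images.shape[0]
          num = np.tensordot(np.sin(shifts), images, axes=1)
          den = np.tensordot(np.cos(shifts), images, axes=1)
          return np.arctan2(-num, den)

      # Synthetic 4-step example on a 64 x 64 "image".
      H, W, N = 64, 64, 4
      phi_true = np.linspace(-3.0, 3.0, H * W).reshape(H, W)
      imgs = [128 + 100 * np.cos(phi_true + 2 * np.pi * k / N) for k in range(N)]
      phi = wrapped_phase(imgs)
      print("max abs phase error:", np.max(np.abs(phi - phi_true)))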

  4. The science of computing - Parallel computation

    NASA Technical Reports Server (NTRS)

    Denning, P. J.

    1985-01-01

    Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concomitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.

  5. Analog Computation by DNA Strand Displacement Circuits.

    PubMed

    Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John

    2016-08-19

    DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands, respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. On the basis of these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
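
    A simple way to see how an analog addition gate can work is to model two displacement reactions that release the same output strand, so the final output concentration approaches the sum of the two input concentrations. The mass-action sketch below uses an idealized single-step reaction model with invented rate constants and concentrations; it is not the gate design or kinetics from the paper.

      def simulate_addition_gate(i1_0, i2_0, gate_0=200.0, k=1e-4, dt=0.1, t_end=2e4):
          """Idealised mass-action model of an analog addition gate.  Two
          toehold-mediated displacement reactions release the same output strand,
          I1 + G -> O + W  and  I2 + G -> O + W, so with the gate complex G in
          excess the final output concentration approaches [I1]0 + [I2]0.
          The rate constant, concentrations (nM) and one-step reaction model are
          illustrative assumptions, not the paper's gate design."""
          i1, i2, g, o = i1_0, i2_0, gate_0, 0.0
          for _ in range(int(t_end / dt)):        # explicit Euler integration
              r1 = k * i1 * g
              r2 = k * i2 * g
              i1 -= r1 * dt
              i2 -= r2 * dt
              g -= (r1 + r2) * dt
              o += (r1 + r2) * dt
          return o

      print("inputs 10 nM + 25 nM  ->  output ~ %.2f nM" % simulate_addition_gate(10.0, 25.0))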

  6. Advanced Architectures for Astrophysical Supercomputing

    NASA Astrophysics Data System (ADS)

    Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.

    2010-12-01

    Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.

  7. Verification methodology for fault-tolerant, fail-safe computers applied to maglev control computer systems. Final report, July 1991-May 1993

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lala, J.H.; Nagle, G.A.; Harper, R.E.

    1993-05-01

    The Maglev control computer system should be designed to verifiably possess high reliability and safety as well as high availability to make Maglev a dependable and attractive transportation alternative to the public. A Maglev control computer system has been designed using a design-for-validation methodology developed earlier under NASA and SDIO sponsorship for real-time aerospace applications. The present study starts by defining the maglev mission scenario and ends with the definition of a maglev control computer architecture. Key intermediate steps included definitions of functional and dependability requirements, synthesis of two candidate architectures, development of qualitative and quantitative evaluation criteria, and analytical modeling of the dependability characteristics of the two architectures. Finally, the applicability of the design-for-validation methodology was also illustrated by applying it to the German Transrapid TR07 maglev control system.

  8. Computer Security Primer: Systems Architecture, Special Ontology and Cloud Virtual Machines

    ERIC Educational Resources Information Center

    Waguespack, Leslie J.

    2014-01-01

    With the increasing proliferation of multitasking and Internet-connected devices, security has reemerged as a fundamental design concern in information systems. The shift of IS curricula toward a largely organizational perspective of security leaves little room for focus on its foundation in systems architecture, the computational underpinnings of…

  9. Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.

  10. ATCA for Machines-- Advanced Telecommunications Computing Architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larsen, R.S.; /SLAC

    2008-04-22

    The Advanced Telecommunications Computing Architecture is a new industry open standard for electronics instrument modules and shelves being evaluated for the International Linear Collider (ILC). It is the first industrial standard designed for High Availability (HA). ILC availability simulations have shown clearly that the capabilities of ATCA are needed in order to achieve acceptable integrated luminosity. The ATCA architecture looks attractive for beam instruments and detector applications as well. This paper provides an overview of ongoing R&D including application of HA principles to power electronics systems.

  11. Combining metric episodes with semantic event concepts within the Symbolic and Sub-Symbolic Robotics Intelligence Control System (SS-RICS)

    NASA Astrophysics Data System (ADS)

    Kelley, Troy D.; McGhee, S.

    2013-05-01

    This paper describes the ongoing development of a robotic control architecture that is inspired by computational cognitive architectures from the discipline of cognitive psychology. The Symbolic and Sub-Symbolic Robotics Intelligence Control System (SS-RICS) combines symbolic and sub-symbolic representations of knowledge into a unified control architecture. The new architecture leverages previous work in cognitive architectures, specifically the development of the Adaptive Character of Thought-Rational (ACT-R) and Soar. This paper details current work on learning from episodes or events. The use of episodic memory as a learning mechanism has, until recently, been largely ignored by computational cognitive architectures. This paper details work on metric level episodic memory streams and methods for translating episodes into abstract schemas. The presentation will include research on learning through novelty and self-generated feedback mechanisms for autonomous systems.

  12. Sensing and perception: Connectionist approaches to subcognitive computing

    NASA Technical Reports Server (NTRS)

    Skrrypek, J.

    1987-01-01

    New approaches to machine sensing and perception are presented. The motivation for cross-disciplinary studies of perception in terms of AI and neurosciences is suggested. The question of computing architecture granularity as related to global/local computation underlying perceptual function is considered and examples of two environments are given. Finally, examples of using one of the environments, UCLA PUNNS, to study neural architectures for visual function are presented.

  13. Computer architecture evaluation for structural dynamics computations: Project summary

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.
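
    For the third project, the flavor of a queueing analysis of a shared resource can be conveyed with the closed-form M/M/1 results below. This is only the simplest possible illustration under assumed arrival and service rates; the project's actual model of shared memory hierarchies is more detailed.

      def mm1_metrics(arrival_rate, service_rate):
          """Closed-form M/M/1 results: the simplest queueing abstraction of several
          processors (arrivals) contending for one shared memory module (server).
          The project's model of memory hierarchies is more elaborate; this only
          illustrates the kind of quantity such a model predicts."""
          if arrival_rate >= service_rate:
              raise ValueError("unstable queue: utilisation >= 1")
          rho = arrival_rate / service_rate                     # utilisation
          mean_in_system = rho / (1.0 - rho)                    # L; Little's law: L = lambda * W
          mean_response = 1.0 / (service_rate - arrival_rate)   # W
          return rho, mean_in_system, mean_response

      # Example: 8 processors, each issuing 0.1 requests per cycle, share one
      # memory bank that serves 1 request per cycle.
      rho, L, W = mm1_metrics(arrival_rate=8 * 0.1, service_rate=1.0)
      print(f"utilisation={rho:.2f}  mean requests in system={L:.2f}  mean latency={W:.2f} cycles")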

  14. The Evaluation of Rekeying Protocols Within the Hubenko Architecture as Applied to Wireless Sensor Networks

    DTIC Science & Technology

    2009-03-01

    SENSOR NETWORKS THESIS presented to the Faculty, Department of Electrical and Computer Engineering, Graduate School of Engineering and ... hierarchical, and Secure Lock within a wireless sensor network (WSN) under the Hubenko architecture. Using a Matlab computer simulation, the impact of the ... rekeying protocol should be applied given particular network parameters, such as WSN size.

  15. Topology optimization aided structural design: Interpretation, computational aspects and 3D printing.

    PubMed

    Kazakis, Georgios; Kanellopoulos, Ioannis; Sotiropoulos, Stefanos; Lagaros, Nikos D

    2017-10-01

    The construction industry has a major impact on the environment in which we spend most of our lives. Therefore, it is important that the outcome of architectural intuition performs well and complies with the design requirements. Architects usually describe as "optimal design" their choice among a rather limited set of design alternatives, dictated by their experience and intuition. However, modern design of structures requires accounting for a great number of criteria derived from multiple disciplines, often of conflicting nature. Such criteria derive from structural engineering, eco-design, bioclimatic and acoustic performance. The resulting vast number of alternatives enhances the need for computer-aided architecture in order to increase the possibility of arriving at a more preferable solution. Therefore, the incorporation of smart, automatic tools in the design process, able to further guide the designer's intuition, becomes even more indispensable. The principal aim of this study is to present possibilities to integrate automatic computational techniques related to topology optimization in the phase of intuition of civil structures as part of computer aided architectural design. In this direction, different aspects of a new computer aided architectural era related to the interpretation of the optimized designs, difficulties resulting from the increased computational effort and 3D printing capabilities are covered herein.

  16. Computer sciences

    NASA Technical Reports Server (NTRS)

    Smith, Paul H.

    1988-01-01

    The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.

  17. Low-Bandwidth and Non-Compute Intensive Remote Identification of Microbes from Raw Sequencing Reads

    PubMed Central

    Gautier, Laurent; Lund, Ole

    2013-01-01

    Cheap DNA sequencing may soon become routine not only for human genomes but also for practically anything requiring the identification of living organisms from their DNA: tracking of infectious agents, control of food products, bioreactors, or environmental samples. We propose a novel general approach to the analysis of sequencing data where a reference genome does not have to be specified. Using a distributed architecture we are able to query a remote server for hints about what the reference might be, transferring a relatively small amount of data. Our system consists of a server with known reference DNA indexed, and a client with raw sequencing reads. The client sends a sample of unidentified reads, and in return receives a list of matching references. Sequences for the references can be retrieved and used for exhaustive computation on the reads, such as alignment. To demonstrate this approach we have implemented a web server, indexing tens of thousands of publicly available genomes and genomic regions from various organisms and returning lists of matching hits from query sequencing reads. We have also implemented two clients: one running in a web browser, and one as a python script. Both are able to handle a large number of sequencing reads, even from portable devices (the browser-based client running on a tablet), perform their task within seconds, and consume an amount of bandwidth compatible with mobile broadband networks. Such client-server approaches could develop in the future, allowing a fully automated processing of sequencing data and routine instant quality check of sequencing runs from desktop sequencers. Web access is available at http://tapir.cbs.dtu.dk. The source code for a python command-line client, a server, and supplementary data are available at http://bit.ly/1aURxkc. PMID:24391826
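
    The client-side idea, sample a small number of reads and ask a remote index for reference hints, can be sketched as follows. The sampling code follows the description above; the endpoint URL and JSON payload shown are placeholders, not the published API of the tapir.cbs.dtu.dk service.

      import json
      import random
      import urllib.request

      def sample_reads(fastq_path, n_reads=100, seed=0):
          """Reservoir-sample n read sequences from a FASTQ file so that only a
          small fraction of the run ever leaves the client."""
          random.seed(seed)
          sample = []
          with open(fastq_path) as fh:
              for i, line in enumerate(fh):
                  if i % 4 != 1:                  # sequence lines are every 4th line
                      continue
                  k = i // 4                      # 0-based read index
                  if len(sample) < n_reads:
                      sample.append(line.strip())
                  elif random.random() < n_reads / (k + 1):
                      sample[random.randrange(n_reads)] = line.strip()
          return sample

      def query_reference_hints(reads, url="https://example.org/identify"):
          """POST the sampled reads and return the server's ranked reference hits.
          The URL and JSON schema are placeholders, not the published API of the
          service mentioned in the record."""
          payload = json.dumps({"reads": reads}).encode()
          req = urllib.request.Request(url, data=payload,
                                       headers={"Content-Type": "application/json"})
          with urllib.request.urlopen(req) as resp:
              return json.load(resp)

      # Usage (assumes a local FASTQ file and a compatible server):
      # hits = query_reference_hints(sample_reads("run1.fastq"))
      # print(hits[:5])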

  18. Predictor-Based Model Reference Adaptive Control

    NASA Technical Reports Server (NTRS)

    Lavretsky, Eugene; Gadient, Ross; Gregory, Irene M.

    2009-01-01

    This paper is devoted to robust, Predictor-based Model Reference Adaptive Control (PMRAC) design. The proposed adaptive system is compared with the now-classical Model Reference Adaptive Control (MRAC) architecture. Simulation examples are presented. Numerical evidence indicates that the proposed PMRAC tracking architecture has better transient characteristics than MRAC. In this paper, we presented a state-predictor based direct adaptive tracking design methodology for multi-input dynamical systems with partially known dynamics. Efficiency of the design was demonstrated using the short period dynamics of an aircraft. Formal proof of the reported PMRAC benefits constitutes future research and will be reported elsewhere.
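
    For readers unfamiliar with the baseline being compared against, the sketch below simulates a classical first-order MRAC loop with Lyapunov-based adaptation. It is the textbook MRAC scheme under assumed plant and reference-model parameters, not the predictor-based PMRAC design proposed in the paper.

      def simulate_mrac(t_end=40.0, dt=1e-3, gamma=2.0):
          """Classical first-order MRAC (the baseline scheme, not the paper's PMRAC):
              plant  x'  = a*x  + b*u      (a, b unknown; sign(b) assumed known)
              model  xm' = am*xm + bm*r    (desired closed-loop behaviour)
              law    u = kx*x + kr*r, with Lyapunov-based adaptation of kx and kr."""
          a, b = 1.0, 3.0          # "true" plant, unknown to the controller
          am, bm = -4.0, 4.0       # stable reference model
          x = xm = kx = kr = 0.0
          e = 0.0
          for i in range(int(t_end / dt)):
              t = i * dt
              r = 1.0 if (t % 20.0) < 10.0 else -1.0      # square-wave reference
              u = kx * x + kr * r
              e = x - xm                                   # tracking error
              kx += -gamma * x * e * dt                    # adaptive laws (sign(b) = +1)
              kr += -gamma * r * e * dt
              x += (a * x + b * u) * dt                    # Euler integration of plant
              xm += (am * xm + bm * r) * dt                # and of reference model
          return e, kx, kr

      e, kx, kr = simulate_mrac()
      # Matching gains for these parameters would be kx* = (am-a)/b and kr* = bm/b.
      print("final tracking error %.4f, adapted gains kx=%.2f kr=%.2f" % (e, kx, kr))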

  19. Patient identity management for secondary use of biomedical research data in a distributed computing environment.

    PubMed

    Nitzlnader, Michael; Schreier, Günter

    2014-01-01

    Dealing with data from different source domains is of increasing importance in today's large scale biomedical research endeavours. Within the European Network for Cancer research in Children and Adolescents (ENCCA) a solution to share such data for secondary use will be established. In this paper the solution arising from the aims of the ENCCA project and regulatory requirements concerning data protection and privacy is presented. Since the details of secondary biomedical dataset utilisation are often not known in advance, data protection regulations are met with an identity management concept that facilitates context-specific pseudonymisation and a way of data aggregation using a hidden reference table later on. Phonetic hashing is proposed to prevent duplicated patient registration and re-identification of patients is possible via a trusted third party only. Finally, the solution architecture allows for implementation in a distributed computing environment, including cloud-based elements.
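
    The combination of phonetic hashing for duplicate detection with keyed pseudonym generation can be sketched as below. The Soundex code, field choice, and key handling are illustrative assumptions, not the ENCCA project's actual specification; in the described architecture the key material and re-identification capability would sit with the trusted third party.

      import hashlib
      import hmac

      def soundex(name):
          """Simple Soundex phonetic code; similar-sounding names collide, which is
          what lets a registry catch duplicate registrations without storing names."""
          name = "".join(c for c in name.upper() if c.isalpha())
          if not name:
              return "0000"
          codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
                   **dict.fromkeys("DT", "3"), "L": "4",
                   **dict.fromkeys("MN", "5"), "R": "6"}
          out, prev = name[0], codes.get(name[0], "")
          for ch in name[1:]:
              code = codes.get(ch, "")
              if code and code != prev:
                  out += code
              if ch not in "HW":
                  prev = code
          return (out + "000")[:4]

      def pseudonym(first, last, birthdate, secret_key):
          """Context-specific pseudonym: keyed hash of phonetic codes plus birthdate.
          Field choice and key handling are illustrative only."""
          msg = "|".join([soundex(first), soundex(last), birthdate]).encode()
          return hmac.new(secret_key, msg, hashlib.sha256).hexdigest()[:16]

      key = b"held-by-trusted-third-party"
      print(pseudonym("Stephen", "Meyer", "2001-04-17", key))
      print(pseudonym("Steven", "Maier", "2001-04-17", key))   # same pseudonym: flagged as likely duplicate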

  1. Experience with abstract notation one

    NASA Technical Reports Server (NTRS)

    Harvey, James D.; Weaver, Alfred C.

    1990-01-01

    The development of computer science has produced a vast number of machine architectures, programming languages, and compiler technologies. The cross product of these three characteristics defines the spectrum of previous and present data representation methodologies. With regard to computer networks, the uniqueness of these methodologies presents an obstacle when disparate host environments are to be interconnected. Interoperability within a heterogeneous network relies upon the establishment of data representation commonality. The International Standards Organization (ISO) is currently developing the abstract syntax notation one standard (ASN.1) and the basic encoding rules standard (BER) that collectively address this problem. When used within the presentation layer of the open systems interconnection reference model, these two standards provide the data representation commonality required to facilitate interoperability. The details of a compiler that was built to automate the use of ASN.1 and BER are described. From this experience, insights into both standards are given and potential problems relating to this development effort are discussed.
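
    The flavor of the data-representation problem that ASN.1 and BER address can be seen in a single type: a BER INTEGER is a tag octet, length octets, and minimal two's-complement content octets. The sketch below covers only that one case and is a simplification of the standard, not output from the compiler described in the record.

      def ber_encode_integer(value):
          """Encode a Python int as a BER/DER INTEGER: tag 0x02, length octets,
          then minimal two's-complement content octets (big-endian)."""
          if value == 0:
              content = b"\x00"
          else:
              n_bytes = (value.bit_length() + 8) // 8          # room for a sign bit
              content = value.to_bytes(n_bytes, "big", signed=True)
              # Drop a redundant leading octet while the sign is still encoded.
              while len(content) > 1 and (
                      (content[0] == 0x00 and content[1] < 0x80) or
                      (content[0] == 0xFF and content[1] >= 0x80)):
                  content = content[1:]
          if len(content) < 0x80:                              # short-form length
              length_octets = bytes([len(content)])
          else:                                                # long-form length
              size = (len(content).bit_length() + 7) // 8
              length_octets = bytes([0x80 | size]) + len(content).to_bytes(size, "big")
          return b"\x02" + length_octets + content

      print(ber_encode_integer(5).hex())      # 020105
      print(ber_encode_integer(256).hex())    # 02020100
      print(ber_encode_integer(-1).hex())     # 0201ff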

  2. An investigation of potential applications of OP-SAPS: Operational sampled analog processors

    NASA Technical Reports Server (NTRS)

    Parrish, E. A.; Mcvey, E. S.

    1976-01-01

    The impact of charge-coupled device (CCD) processors on future instrumentation was investigated. The CCD devices studied process sampled analog data and are referred to as OP-SAPS - operational sampled analog processors. Preliminary studies into various architectural configurations for systems composed of OP-SAPS show that they have potential in such diverse applications as pattern recognition and automatic control. It appears probable that OP-SAPS may be used to construct computing structures which can serve as special peripherals to large-scale computer complexes used in real time flight simulation. The research was limited to the following benchmark programs: (1) face recognition, (2) voice command and control, (3) terrain classification, and (4) terrain identification. A small amount of effort was spent on examining a method by which OP-SAPS may be used to decrease the limiting ground sampling distance encountered in remote sensing from satellites.

  3. Architecture for hospital information integration

    NASA Astrophysics Data System (ADS)

    Chimiak, William J.; Janariz, Daniel L.; Martinez, Ralph

    1999-07-01

    The ongoing integration of hospital information systems (HIS) continues. Data storage systems, data networks and computers improve, data bases grow and health-care applications increase. Some computer operating systems continue to evolve and some fade. Health care delivery now depends on this computer-assisted environment. The result is that the critical harmonization of the various hospital information systems becomes increasingly difficult. The purpose of this paper is to present an architecture for HIS integration that is computer-language-neutral and computer-hardware-neutral for the informatics applications. The proposed architecture builds upon the work done at the University of Arizona on middleware, the work of the National Electrical Manufacturers Association, and the American College of Radiology. It is a fresh approach that allows applications engineers to access medical data easily and thus concentrate on the application techniques in which they are expert, without struggling with medical information syntaxes. The HIS can be modeled using a hierarchy of information sub-systems, thus facilitating its understanding. The architecture includes the resulting information model along with a strict but intuitive application programming interface, managed by CORBA. The CORBA requirement facilitates interoperability. It should also reduce software and hardware development times.

  4. FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification

    PubMed Central

    Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao

    2012-01-01

    This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640

  5. DREAMS and IMAGE: A Model and Computer Implementation for Concurrent, Life-Cycle Design of Complex Systems

    NASA Technical Reports Server (NTRS)

    Hale, Mark A.; Craig, James I.; Mistree, Farrokh; Schrage, Daniel P.

    1995-01-01

    Computing architectures are being assembled that extend concurrent engineering practices by providing more efficient execution and collaboration on distributed, heterogeneous computing networks. Built on the successes of initial architectures, requirements for a next-generation design computing infrastructure can be developed. These requirements concentrate on those needed by a designer in decision-making processes from product conception to recycling and can be categorized in two areas: design process and design information management. A designer both designs and executes design processes throughout design time to achieve better product and process capabilities while expending fewer resources. In order to accomplish this, information, or more appropriately design knowledge, needs to be adequately managed during product and process decomposition as well as recomposition. A foundation has been laid that captures these requirements in a design architecture called DREAMS (Developing Robust Engineering Analysis Models and Specifications). In addition, a computing infrastructure, called IMAGE (Intelligent Multidisciplinary Aircraft Generation Environment), is being developed that satisfies design requirements defined in DREAMS and incorporates enabling computational technologies.

  6. Gigaflop architecture, a hardware perspective

    NASA Technical Reports Server (NTRS)

    Feierbach, G. F.

    1978-01-01

    Any super computer built in the early 1980s will use components that are available by fall 1978. The architecture of such a system cannot depart radically from current super computers if the software experience painfully acquired from these computers in the 70's is to apply. Given the above constraints, 10 billion floating point operations per second (BFLOPS) are attainable and a problem memory of 512 million (64 bit) words could be supported by the technology of the time. In contrast to this, industry is likely to respond with commercially available machines with a performance of less than 150 MFLOPS. This is due to self-imposed constraints on the manufacturers to provide upward compatible architectures (same instruction set) and systems which can be sold in significant volumes. Since this computing speed is inadequate to meet the demands of computational fluid dynamics, a special processor is required. Issues which are felt to be significant in the pursuit of maximum compute capability in this special processor are discussed.

  7. Using Multimedia for Teaching Analysis in History of Modern Architecture.

    ERIC Educational Resources Information Center

    Perryman, Garry

    This paper presents a case for the development and support of a computer-based interactive multimedia program for teaching analysis in community college architecture design programs. Analysis in architecture design is an extremely important strategy for the teaching of higher-order thinking skills, which senior schools of architecture look for in…

  8. Progress in a novel architecture for high performance processing

    NASA Astrophysics Data System (ADS)

    Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin

    2018-04-01

    High performance processing (HPP) is an innovative architecture which targets high performance computing with excellent power efficiency and computing performance. It is suitable for data intensive applications like supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores, which is the first generation of HPP cores, has been taped out successfully under the Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP has made great improvements in architecture. The chip with 32 HPP cores is being developed under the TSMC 16 nm field effect transistor (FFC) technology process and is planned for commercial use. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).

  9. Multiple Embedded Processors for Fault-Tolerant Computing

    NASA Technical Reports Server (NTRS)

    Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

    2005-01-01

    A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
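
    The difference between the two-processor prototype (detection only) and a configuration with enough redundancy to correct errors can be shown with two small functions. The three-way majority vote is included only for contrast as a standard correction scheme; it is not necessarily the four-processor, two-comparator arrangement planned by the authors.

      def duplex_compare(result_a, result_b):
          """Two-processor scheme from the prototype: a comparator can only detect
          a disagreement (for example one caused by a single-event upset), not say
          which copy is wrong, so the computation must be retried or flagged."""
          return result_a if result_a == result_b else None   # None = fault detected

      def majority_vote(result_a, result_b, result_c):
          """Three-way voting, one standard way more redundancy buys correction:
          a single faulty copy is outvoted, so the error is masked as well as
          detected."""
          if result_a == result_b or result_a == result_c:
              return result_a
          if result_b == result_c:
              return result_b
          raise RuntimeError("no majority: multiple faults")

      print(duplex_compare(42, 42))        # 42
      print(duplex_compare(42, 43))        # None -> retry
      print(majority_vote(42, 43, 42))     # 42, single upset masked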

  10. A static data flow simulation study at Ames Research Center

    NASA Technical Reports Server (NTRS)

    Barszcz, Eric; Howard, Lauri S.

    1987-01-01

    Demands in computational power, particularly in the area of computational fluid dynamics (CFD), led NASA Ames Research Center to study advanced computer architectures. One architecture being studied is the static data flow architecture based on research done by Jack B. Dennis at MIT. To improve understanding of this architecture, a static data flow simulator, written in Pascal, has been implemented for use on a Cray X-MP/48. A matrix multiply and a two-dimensional fast Fourier transform (FFT), two algorithms used in CFD work at Ames, have been run on the simulator. Execution times can vary by a factor of more than 2 depending on the partitioning method used to assign instructions to processing elements. Service time for matching tokens has proved to be a major bottleneck. Loop control and array address calculation overhead can double the execution time. The best sustained MFLOPS rates were less than 50% of the maximum capability of the machine.

  11. Strategies for concurrent processing of complex algorithms in data driven architectures

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.

    1988-01-01

    The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.

  12. Computer-implemented security evaluation methods, security evaluation systems, and articles of manufacture

    DOEpatents

    Muller, George; Perkins, Casey J.; Lancaster, Mary J.; MacDonald, Douglas G.; Clements, Samuel L.; Hutton, William J.; Patrick, Scott W.; Key, Bradley Robert

    2015-07-28

    Computer-implemented security evaluation methods, security evaluation systems, and articles of manufacture are described. According to one aspect, a computer-implemented security evaluation method includes accessing information regarding a physical architecture and a cyber architecture of a facility, building a model of the facility comprising a plurality of physical areas of the physical architecture, a plurality of cyber areas of the cyber architecture, and a plurality of pathways between the physical areas and the cyber areas, identifying a target within the facility, executing the model a plurality of times to simulate a plurality of attacks against the target by an adversary traversing at least one of the areas in the physical domain and at least one of the areas in the cyber domain, and using results of the executing, providing information regarding a security risk of the facility with respect to the target.
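
    The modeling idea, areas in physical and cyber domains joined by pathways that an adversary may traverse, lends itself to a small Monte Carlo sketch. The facility graph, traversal probabilities, and traversal policy below are invented for illustration and are not taken from the patent.

      import random

      # Toy facility model: nodes are physical ("P:") or cyber ("C:") areas; each
      # directed pathway carries an assumed probability that an adversary crosses
      # it without being stopped.
      PATHWAYS = {
          "P:perimeter":   [("P:lobby", 0.7), ("C:guest_wifi", 0.5)],
          "P:lobby":       [("P:server_room", 0.2)],
          "C:guest_wifi":  [("C:office_lan", 0.4)],
          "C:office_lan":  [("C:historian", 0.3), ("P:server_room", 0.1)],
          "C:historian":   [("C:plc_target", 0.25)],
          "P:server_room": [("C:plc_target", 0.6)],
      }

      def single_attack(node, target, rng, visited=frozenset()):
          """One simulated attack: depth-first traversal in which each pathway is
          crossed with its assumed success probability."""
          if node == target:
              return True
          visited = visited | {node}
          for nxt, p in PATHWAYS.get(node, []):
              if nxt not in visited and rng.random() < p:
                  if single_attack(nxt, target, rng, visited):
                      return True
          return False

      def estimate_risk(start, target, trials=100_000, seed=0):
          """Estimate the probability that a simulated attack reaches the target,
          the kind of security-risk output the described method produces."""
          rng = random.Random(seed)
          return sum(single_attack(start, target, rng) for _ in range(trials)) / trials

      print("P(adversary reaches C:plc_target) ~", estimate_risk("P:perimeter", "C:plc_target"))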

  13. The NASA Integrated Information Technology Architecture

    NASA Technical Reports Server (NTRS)

    Baldridge, Tim

    1997-01-01

    This document defines an Information Technology Architecture for the National Aeronautics and Space Administration (NASA), where Information Technology (IT) refers to the hardware, software, standards, protocols and processes that enable the creation, manipulation, storage, organization and sharing of information. An architecture provides an itemization and definition of these IT structures, a view of the relationship of the structures to each other and, most importantly, an accessible view of the whole. It is a fundamental assumption of this document that a useful, interoperable and affordable IT environment is key to the execution of the core NASA scientific and project competencies and business practices. This Architecture represents the highest level system design and guideline for NASA IT related activities and has been created on the authority of the NASA Chief Information Officer (CIO) and will be maintained under the auspices of that office. It addresses all aspects of general purpose, research, administrative and scientific computing and networking throughout the NASA Agency and is applicable to all NASA administrative offices, projects, field centers and remote sites. Through the establishment of five Objectives and six Principles this Architecture provides a blueprint for all NASA IT service providers: civil service, contractor and outsourcer. The most significant of the Objectives and Principles are the commitment to customer-driven IT implementations and the commitment to a simpler, cost-efficient, standards-based, modular IT infrastructure. In order to ensure that the Architecture is presented and defined in the context of the mission, project and business goals of NASA, this Architecture consists of four layers in which each subsequent layer builds on the previous layer. They are: 1) the Business Architecture: the operational functions of the business, or Enterprise, 2) the Systems Architecture: the specific Enterprise activities within the context of IT systems, 3) the Technical Architecture: a common, vendor-independent framework for design, integration and implementation of IT systems and 4) the Product Architecture: vendor-specific IT solutions. The Systems Architecture is effectively a description of the end-user "requirements". Generalized end-user requirements are discussed and subsequently organized into specific mission and project functions. The Technical Architecture depicts the framework, and relationship, of the specific IT components that enable the end-user functionality as described in the Systems Architecture. The primary components as described in the Technical Architecture are: 1) Applications: Basic Client Component, Object Creation Applications, Collaborative Applications, Object Analysis Applications, 2) Services: Messaging, Information Broker, Collaboration, Distributed Processing, and 3) Infrastructure: Network, Security, Directory, Certificate Management, Enterprise Management and File System. This Architecture also provides specific Implementation Recommendations, the most significant of which is the recognition of IT as core to NASA activities and defines a plan, which is aligned with the NASA strategic planning processes, for keeping the Architecture alive and useful.

  14. Enterprise application architecture development based on DoDAF and TOGAF

    NASA Astrophysics Data System (ADS)

    Tao, Zhi-Gang; Luo, Yun-Feng; Chen, Chang-Xin; Wang, Ming-Zhe; Ni, Feng

    2017-05-01

    For the purpose of supporting the design and analysis of enterprise application architecture, here, we report a tailored enterprise application architecture description framework and its corresponding design method. The presented framework can effectively support service-oriented architecting and cloud computing by creating the metadata model based on architecture content framework (ACF), DoDAF metamodel (DM2) and Cloud Computing Modelling Notation (CCMN). The framework also makes an effort to extend and improve the mapping between The Open Group Architecture Framework (TOGAF) application architectural inputs/outputs, deliverables and Department of Defence Architecture Framework (DoDAF)-described models. The roadmap of 52 DoDAF-described models is constructed by creating the metamodels of these described models and analysing the constraint relationship among metamodels. By combining the tailored framework and the roadmap, this article proposes a service-oriented enterprise application architecture development process. Finally, a case study is presented to illustrate the results of implementing the tailored framework in the Southern Base Management Support and Information Platform construction project using the development process proposed by the paper.

  15. Selecting a Benchmark Suite to Profile High-Performance Computing (HPC) Machines

    DTIC Science & Technology

    2014-11-01

    architectures. Machines now contain central processing units (CPUs), graphics processing units (GPUs), and many integrated core (MIC) architecture all ... evaluate the feasibility and applicability of a new architecture just released to the market. Researchers are often unsure how available resources will ... architectures. Having a suite of programs running on different architectures, such as GPUs, MICs, and CPUs, adds complexity and technical challenges

  16. Development of an unmanned maritime system reference architecture

    NASA Astrophysics Data System (ADS)

    Duarte, Christiane N.; Cramer, Megan A.; Stack, Jason R.

    2014-06-01

    The concept of operations (CONOPS) for unmanned maritime systems (UMS) continues to envision systems that are multi-mission, re-configurable and capable of acceptable performance over a wide range of environmental and contextual variability. Key enablers for these concepts of operation are an autonomy module which can execute different mission directives and a mission payload consisting of re-configurable sensor or effector suites. This level of modularity in mission payloads enables affordability, flexibility (i.e., more capability with future platforms) and scalability (i.e., force multiplication). The modularity in autonomy facilitates rapid technology integration, prototyping, testing and leveraging of state-of-the-art advances in autonomy research. Capability drivers imply a requirement to maintain an open architecture design for both research and acquisition programs. As the maritime platforms become more stable in their design (e.g. unmanned surface vehicles, unmanned underwater vehicles) future developments are able to focus on more capable sensors and more robust autonomy algorithms. To respond to Fleet needs, given an evolving threat, programs will want to interchange the latest sensor or a new and improved algorithm in a cost effective and efficient manner. In order to make this possible, the programs need a reference architecture that will define for technology providers where their piece fits and how to successfully integrate. With these concerns in mind, the US Navy established the Unmanned Maritime Systems Reference Architecture (UMS-RA) Working Group in August 2011. This group consists of Department of Defense and industry participants working the problem of defining reference architecture for autonomous operations of maritime systems. This paper summarizes its efforts to date.

  17. Xyce parallel electronic simulator : reference guide.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

    2011-05-01

    This document is a reference guide to the Xyce Parallel Electronic Simulator, and is a companion document to the Xyce Users Guide. The focus of this document is to list, to the extent possible, device parameters, solver options, parser options, and other usage details of Xyce. This document is not intended to be a tutorial. Users who are new to circuit simulation are better served by the Xyce Users Guide. The Xyce Parallel Electronic Simulator has been written to support, in a rigorous manner, the simulation needs of the Sandia National Laboratories electrical designers. It is targeted specifically to run on large-scale parallel computing platforms but also runs well on a variety of architectures including single processor workstations. It also aims to support a variety of devices and models specific to Sandia needs. This document is intended to complement the Xyce Users Guide. It contains comprehensive, detailed information about a number of topics pertinent to the usage of Xyce. Included in this document is a netlist reference for the input-file commands and elements supported within Xyce; a command line reference, which describes the available command line arguments for Xyce; and quick-references for users of other circuit codes, such as Orcad's PSpice and Sandia's ChileSPICE.

  18. Special purpose parallel computer architecture for real-time control and simulation in robotic applications

    NASA Technical Reports Server (NTRS)

    Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)

    1993-01-01

    This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.

  19. Web-Based Architecture to Enable Compute-Intensive CAD Tools and Multi-user Synchronization in Teleradiology

    NASA Astrophysics Data System (ADS)

    Mehta, Neville; Kompalli, Suryaprakash; Chaudhary, Vipin

    Teleradiology is the electronic transmission of radiological patient images, such as x-rays, CT, or MR across multiple locations. The goal could be interpretation, consultation, or medical records keeping. Information technology solutions have enabled electronic records and their associated benefits are evident in health care today. However, salient aspects of collaborative interfaces, and computer assisted diagnostic (CAD) tools are yet to be integrated into workflow designs. The Computer Assisted Diagnostics and Interventions (CADI) group at the University at Buffalo has developed an architecture that facilitates web-enabled use of CAD tools, along with the novel concept of synchronized collaboration. The architecture can support multiple teleradiology applications and case studies are presented here.

  20. Design of a modular digital computer system, CDRL no. D001, final design plan

    NASA Technical Reports Server (NTRS)

    Easton, R. A.

    1975-01-01

    The engineering breadboard implementation for the CDRL no. D001 modular digital computer system developed during design of the logic system was documented. This effort followed the architecture study completed and documented previously, and was intended to verify the concepts of a fault tolerant, automatically reconfigurable, modular version of the computer system conceived during the architecture study. The system has a microprogrammed 32 bit word length, general register architecture and an instruction set consisting of a subset of the IBM System 360 instruction set plus additional fault tolerance firmware. The following areas were covered: breadboard packaging, central control element, central processing element, memory, input/output processor, and maintenance/status panel and electronics.

  1. Performance evaluation of throughput computing workloads using multi-core processors and graphics processors

    NASA Astrophysics Data System (ADS)

    Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.

    2017-11-01

    The current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphics processors have become commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications and big data have created huge demand for data processing activities, and such throughput-intensive applications inherently contain data-level parallelism, which is well suited to SIMD-architecture-based GPUs. This paper reviews the architectural aspects of multi/many core processors and graphics processors. Different case studies are taken to compare the performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.

  2. Probabilistic Ontology Architecture for a Terrorist Identification Decision Support System

    DTIC Science & Technology

    2014-06-01

    in real-world problems requires probabilistic ontologies, which integrate the inferential reasoning power of probabilistic representations with the first-order expressivity of ontologies. The Reference Architecture for ... ontology, terrorism, inferential reasoning, architecture ... Whether by nature or design, the personas of terrorists are

  3. An Architecture Combining IMS-LD and Web Services for Flexible Data-Transfer in CSCL

    ERIC Educational Resources Information Center

    Magnisalis, Ioannis; Demetriadis, Stavros

    2017-01-01

    This article presents evaluation data regarding the MAPIS3 architecture which is proposed as a solution for the data-transfer among various tools to promote flexible collaborative learning designs. We describe the problem that this architecture deals with as "tool orchestration" in collaborative learning settings. This term refers to a…

  4. Creating a New Architecture for the Learning College

    ERIC Educational Resources Information Center

    O'Banion, Terry

    2007-01-01

    The publication of "A Nation at Risk" in 1983 triggered a series of major reform efforts in education that are still evolving. As part of the reform efforts, leaders began to refer to a Learning Revolution that would "place learning first by overhauling the traditional architecture of education." The old architecture--time-bound, place-bound,…

  5. SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.

    PubMed

    Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi

    2018-01-01

    The Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Array (FPGA) devices to provide high performance execution and the flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using special-purpose Processing Elements (PEs) for computing SNNs, and by analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor the SNN's activity. Our contribution intends to provide a tool that allows SNNs to be prototyped faster than on CPU/GPU architectures but significantly more cheaply than fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
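
    The per-neuron, per-time-step update that such a platform parallelizes can be illustrated with a plain leaky integrate-and-fire model. SNAVA's neuron and synapse models are programmable, so the LIF dynamics, weights, and parameters below are just one assumed example, executed here in software rather than on FPGAs.

      import numpy as np

      def simulate_lif(weights, input_current, steps=200, dt=1.0,
                       tau=20.0, v_thresh=1.0, v_reset=0.0):
          """Time-stepped leaky integrate-and-fire network: each step, every neuron
          integrates its external drive plus weighted spikes from the previous
          step and fires when the membrane potential crosses threshold.  This
          per-neuron update is the kind of kernel a multi-FPGA SNN platform maps
          onto its processing elements; LIF is just one assumed model choice."""
          n = weights.shape[0]
          v = np.zeros(n)
          spikes = np.zeros(n, dtype=bool)
          raster = []
          for _ in range(steps):
              syn = weights @ spikes                 # synaptic input from last step
              v += dt / tau * (-v + input_current + syn)
              spikes = v >= v_thresh
              v[spikes] = v_reset                    # reset the neurons that fired
              raster.append(spikes.copy())
          return np.array(raster)

      rng = np.random.default_rng(0)
      n = 50
      W = 0.4 * (rng.random((n, n)) < 0.1)           # sparse random excitatory weights
      I = rng.uniform(0.9, 1.3, size=n)              # constant drive per neuron
      raster = simulate_lif(W, I)
      print("total spikes:", int(raster.sum()), "over", raster.shape[0], "time steps")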

  6. Design and Analysis of a Neuromemristive Reservoir Computing Architecture for Biosignal Processing

    PubMed Central

    Kudithipudi, Dhireesha; Saleh, Qutaiba; Merkel, Cory; Thesing, James; Wysocki, Bryant

    2016-01-01

    Reservoir computing (RC) is gaining traction in several signal processing domains, owing to its non-linear stateful computation, spatiotemporal encoding, and reduced training complexity over recurrent neural networks (RNNs). Previous studies have shown the effectiveness of software-based RCs for a wide spectrum of applications. A parallel body of work indicates that realizing RNN architectures using custom integrated circuits and reconfigurable hardware platforms yields significant improvements in power and latency. In this research, we propose a neuromemristive RC architecture, with doubly twisted toroidal structure, that is validated for biosignal processing applications. We exploit the device mismatch to implement the random weight distributions within the reservoir and propose mixed-signal subthreshold circuits for energy efficiency. A comprehensive analysis is performed to compare the efficiency of the neuromemristive RC architecture in both digital(reconfigurable) and subthreshold mixed-signal realizations. Both Electroencephalogram (EEG) and Electromyogram (EMG) biosignal benchmarks are used for validating the RC designs. The proposed RC architecture demonstrated an accuracy of 90 and 84% for epileptic seizure detection and EMG prosthetic finger control, respectively. PMID:26869876
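
    The training-cost advantage of reservoir computing, where only a linear readout is fitted while the recurrent reservoir stays fixed, can be seen in a small software echo state network. The sketch below uses assumed sizes and a toy signal-prediction task; it does not model the memristive, mixed-signal hardware or the biosignal benchmarks studied in the paper.

      import numpy as np

      def run_reservoir(inputs, n_res=100, spectral_radius=0.9, seed=0):
          """Drive a fixed random recurrent reservoir (echo-state style) with a 1-D
          input sequence and return the state trajectory.  Only the linear readout
          is trained afterwards, which is the reduced-training-complexity property
          of reservoir computing."""
          rng = np.random.default_rng(seed)
          W_in = rng.uniform(-0.5, 0.5, size=n_res)
          W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
          W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))   # stability scaling
          x = np.zeros(n_res)
          states = np.empty((len(inputs), n_res))
          for t, u in enumerate(inputs):
              x = np.tanh(W @ x + W_in * u)
              states[t] = x
          return states

      # Toy task: predict u(t+1) from the reservoir state driven by u(t).
      t = np.arange(2000)
      u = np.sin(0.2 * t) * np.sin(0.031 * t)
      X, y = run_reservoir(u[:-1]), u[1:]
      ridge = 1e-6
      W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)   # linear readout
      nrmse = np.sqrt(np.mean((X @ W_out - y) ** 2)) / np.std(y)
      print("one-step-ahead training NRMSE: %.3f" % nrmse)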

  7. New paradigms in internal architecture design and freeform fabrication of tissue engineering porous scaffolds.

    PubMed

    Yoo, Dongjin

    2012-07-01

    Advanced additive manufacture (AM) techniques are now being developed to fabricate scaffolds with controlled internal pore architectures in the field of tissue engineering. In general, these techniques use a hybrid method which combines computer-aided design (CAD) with computer-aided manufacturing (CAM) tools to design and fabricate complicated three-dimensional (3D) scaffold models. The mathematical descriptions of micro-architectures along with the macro-structures of the 3D scaffold models are limited by current CAD technologies as well as by the difficulty of transferring the designed digital models to standard formats for fabrication. To overcome these difficulties, we have developed an efficient internal pore architecture design system based on triply periodic minimal surface (TPMS) unit cell libraries and associated computational methods to assemble TPMS unit cells into an entire scaffold model. In addition, we have developed a process planning technique based on TPMS internal architecture pattern of unit cells to generate tool paths for freeform fabrication of tissue engineering porous scaffolds. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
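
    The basic TPMS idea can be illustrated by voxelizing one common unit cell, the gyroid, from its implicit approximation and reading off the porosity. The resolution, iso-offset, and the choice of the gyroid itself are illustrative assumptions; the paper's design system assembles libraries of such unit cells into full scaffold models and generates the fabrication tool paths from them.

      import numpy as np

      def gyroid_scaffold(resolution=64, cells=2.0, iso=0.3):
          """Voxelise a scaffold built from the gyroid, one common TPMS unit cell,
          using its implicit approximation sin(x)cos(y) + sin(y)cos(z) + sin(z)cos(x).
          Voxels with |f| <= iso form the solid walls; the iso offset controls wall
          thickness and hence porosity.  Resolution, cell count and offset are
          illustrative choices."""
          s = np.linspace(0.0, 2.0 * np.pi * cells, resolution)
          x, y, z = np.meshgrid(s, s, s, indexing="ij")
          f = np.sin(x) * np.cos(y) + np.sin(y) * np.cos(z) + np.sin(z) * np.cos(x)
          return np.abs(f) <= iso

      solid = gyroid_scaffold()
      print("voxel grid:", solid.shape, " porosity: %.2f" % (1.0 - solid.mean()))
      # Slicing this boolean volume layer by layer is one way to drive an AM tool path.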

  8. Embedded Data Processor and Portable Computer Technology testbeds

    NASA Technical Reports Server (NTRS)

    Alena, Richard; Liu, Yuan-Kwei; Goforth, Andre; Fernquist, Alan R.

    1993-01-01

    Attention is given to current activities in the Embedded Data Processor and Portable Computer Technology testbed configurations that are part of the Advanced Data Systems Architectures Testbed at the Information Sciences Division at NASA Ames Research Center. The Embedded Data Processor Testbed evaluates advanced microprocessors for potential use in mission and payload applications within the Space Station Freedom Program. The Portable Computer Technology (PCT) Testbed integrates and demonstrates advanced portable computing devices and data system architectures. The PCT Testbed uses both commercial and custom-developed devices to demonstrate the feasibility of functional expansion and networking for portable computers in flight missions.

  9. A supportive architecture for CFD-based design optimisation

    NASA Astrophysics Data System (ADS)

    Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong

    2014-03-01

    Multi-disciplinary design optimisation (MDO) is one of the critical methodologies for the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics poses a special challenge because of its extremely intensive computation. The rapid development of computational fluid dynamics (CFD) techniques has led to a rise in their application in various fields. Especially for the exterior design of vehicles, CFD has become one of the three main design tools, alongside analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating CFD analysis into an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which usually has high dimensionality and multiple objectives and constraints. It is therefore desirable to have an integrated architecture for CFD-based design optimisation. However, our review of existing works has found that very few researchers have studied assistive tools to facilitate CFD-based design optimisation. In this paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing techniques and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or at the data level to fully utilise the capabilities of the different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, case studies on the aerodynamic shape design of a hypersonic cruising vehicle are provided; the results show that the proposed architecture and the developed algorithms perform successfully and efficiently on a design optimisation with over 200 design variables.
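
    Because the abstract describes the multi-layer architecture only at a high level, the following Python sketch is purely illustrative of the overall loop shape: an optimiser farms out expensive objective evaluations in parallel, with a placeholder cfd_evaluate function standing in for a call into a real CFD toolset. All names, bounds and population settings here are assumptions, not details taken from the paper.

    import random
    from concurrent.futures import ProcessPoolExecutor

    N_VARS = 10           # placeholder; the paper's case study uses over 200 design variables
    BOUNDS = (-1.0, 1.0)  # hypothetical normalized design-variable bounds

    def cfd_evaluate(design):
        """Placeholder objective standing in for a full CFD analysis (e.g. drag)."""
        return sum(x * x for x in design)  # stand-in cost; a real system would invoke a CFD solver

    def random_design():
        return [random.uniform(*BOUNDS) for _ in range(N_VARS)]

    def optimize(generations=20, population=16, workers=4):
        pop = [random_design() for _ in range(population)]
        best = None
        with ProcessPoolExecutor(max_workers=workers) as pool:  # data-level parallel evaluations
            for _ in range(generations):
                costs = list(pool.map(cfd_evaluate, pop))
                ranked = sorted(zip(costs, pop))
                best = ranked[0]
                parents = [d for _, d in ranked[: population // 2]]
                # Simple mutation-based reproduction as a stand-in for the paper's algorithms.
                pop = [[x + random.gauss(0.0, 0.1) for x in random.choice(parents)]
                       for _ in range(population)]
        return best

    if __name__ == "__main__":
        cost, design = optimize()
        print("best cost:", cost)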

  10. Prospective Architectures for Onboard vs Cloud-Based Decision Making for Unmanned Aerial Systems

    NASA Technical Reports Server (NTRS)

    Sankararaman, Shankar; Teubert, Christopher

    2017-01-01

    This paper investigates prospective architectures for decision-making in unmanned aerial systems. When these unmanned vehicles operate in urban environments, there are several sources of uncertainty that affect their behavior, and decision-making algorithms need to be robust to these different sources of uncertainty. It is important to account for the several risk factors that affect the flight of these unmanned systems and to facilitate decision-making that takes these risk factors into consideration. In addition, there are several technical challenges related to autonomous flight of unmanned aerial systems; these challenges include sensing, obstacle detection, path planning and navigation, trajectory generation and selection, etc. Many of these activities require significant computational power, and in many situations all of these activities need to be performed in real time. In order to efficiently integrate these activities, it is important to develop a systematic architecture that can facilitate real-time decision-making. Four prospective architectures are discussed in this paper; on one end of the spectrum, the first architecture considers all activities/computations being performed onboard the vehicle, whereas on the other end of the spectrum, the fourth and final architecture considers all activities/computations being performed in the cloud, using a new service known as Prognostics as a Service that is being developed at NASA Ames Research Center. The four different architectures are compared, their advantages and disadvantages are explained, and conclusions are presented.

  11. CMOL/CMOS hardware architectures and performance/price for Bayesian memory - The building block of intelligent systems

    NASA Astrophysics Data System (ADS)

    Zaveri, Mazad Shaheriar

    The semiconductor/computer industry has been following Moore's law for several decades and has reaped the benefits in speed and density of the resultant scaling. Transistor density has reached almost one billion per chip, and transistor delays are in picoseconds. However, scaling has slowed down, and the semiconductor industry is now facing several challenges. Hybrid CMOS/nano technologies, such as CMOL, are considered as an interim solution to some of the challenges. Another potential architectural solution includes specialized architectures for applications/models in the intelligent computing domain, one aspect of which includes abstract computational models inspired from the neuro/cognitive sciences. Consequently, in this dissertation, we focus on the hardware implementations of Bayesian Memory (BM), which is a (Bayesian) Biologically Inspired Computational Model (BICM). This model is a simplified version of George and Hawkins' model of the visual cortex, which includes an inference framework based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the BM. This particular methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom hardware architectures using both traditional CMOS and hybrid nanotechnology - CMOL, and investigating the baseline performance/price of these architectures. The results suggest that CMOL is a promising candidate for implementing a BM. Such implementations can utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm² (TPM) obtained for CMOL based architectures is 32 to 40 times better than the TPM for a CMOS based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a PC implementation. We later use this methodology to investigate the hardware implementation of a cortex-scale spiking neural system, which is an approximate neural equivalent of a BICM-based cortex-scale system. The results of this investigation also suggest that CMOL is a promising candidate to implement such large-scale neuromorphic systems. In general, the assessment of such hypothetical baseline hardware architectures provides the prospects for building large-scale (mammalian cortex-scale) implementations of neuromorphic/Bayesian/intelligent systems using state-of-the-art and beyond state-of-the-art silicon structures.

  12. Super and parallel computers and their impact on civil engineering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamat, M.P.

    1986-01-01

    This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.

  13. p88110: A Graphical Simulator for Computer Architecture and Organization Courses

    ERIC Educational Resources Information Center

    Garcia, M. I.; Rodriguez, S.; Perez, A.; Garcia, A.

    2009-01-01

    Studying fundamental Computer Architecture and Organization topics requires a significant amount of practical work if students are to acquire a good grasp of the theoretical concepts presented in classroom lectures or textbooks. The use of simulators is commonly adopted in order to reach this objective. However, as most of the available…

  14. Component architecture in drug discovery informatics.

    PubMed

    Smith, Peter M

    2002-05-01

    This paper reviews the characteristics of a new model of computing that has been spurred on by the Internet, known as Netcentric computing. Developments in this model led to distributed component architectures, which, although not new ideas, are now realizable with modern tools such as Enterprise Java. The application of this approach to scientific computing, particularly in pharmaceutical discovery research, is discussed and highlighted by a particular case involving the management of biological assay data.

  15. FPGA-based architecture for motion recovering in real-time

    NASA Astrophysics Data System (ADS)

    Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar

    2002-03-01

    A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher-level goals in different applications. Motion estimation algorithms pose a significant computational load for sequential processors, limiting their use in practical applications. In this work we propose a hardware architecture for real-time motion estimation based on FPGA technology. The technique used for motion estimation is optical flow, owing to its accuracy and the density of the velocity estimates; however, other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates approaching gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility and to cover different applications. Some results will be presented, and the real-time performance will be discussed and analyzed. The architecture is prototyped on an FPGA board with a Virtex device interfaced to a digital imager.
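
    The record does not say which optical flow variant the FPGA pipeline implements, so the sketch below is only a generic gradient-based (Lucas-Kanade style) estimate of a single displacement for one image window, written in NumPy to expose the underlying least-squares step rather than to mirror the pipelined hardware.

    import numpy as np

    def lucas_kanade_window(prev, curr):
        """Single (u, v) displacement for one image window, from the least-squares
        solution of Ix*u + Iy*v + It = 0 over all pixels in the window."""
        prev = np.asarray(prev, dtype=float)
        curr = np.asarray(curr, dtype=float)
        Iy, Ix = np.gradient(prev)          # spatial gradients (rows = y, cols = x)
        It = curr - prev                    # temporal difference
        A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
        b = It.ravel()
        # Normal equations; pinv guards against ill-conditioned (textureless) windows.
        u, v = -np.linalg.pinv(A.T @ A) @ (A.T @ b)
        return u, v

    if __name__ == "__main__":
        y, x = np.mgrid[0:64, 0:64].astype(float)
        frame = np.sin(x / 6.0) + np.cos(y / 8.0)     # smooth synthetic texture
        shifted = np.roll(frame, shift=1, axis=1)     # content moves one pixel in +x
        # Crop away the wrap-around seam introduced by np.roll before estimating.
        print(lucas_kanade_window(frame[4:-4, 4:-4], shifted[4:-4, 4:-4]))  # roughly (1.0, 0.0)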

  16. A fast image registration approach of neural activities in light-sheet fluorescence microscopy images

    NASA Astrophysics Data System (ADS)

    Meng, Hui; Hui, Hui; Hu, Chaoen; Yang, Xin; Tian, Jie

    2017-03-01

    The ability to image neural activity quickly and at single-neuron resolution makes light-sheet fluorescence microscopy (LSFM) a powerful imaging technique for functional neural connection applications. State-of-the-art LSFM imaging systems can record the neuronal activity of the entire brain of a small animal, such as zebrafish or C. elegans, at single-neuron resolution. However, stimulated and spontaneous movements of the animal brain result in inconsistent neuron positions during the recording process, and registering the acquired large-scale images with conventional methods is time-consuming. In this work, we address the problem of fast registration of neural positions in stacks of LSFM images, which is necessary to register brain structures and activities. To achieve fast registration of neural activities, we present a rigid registration architecture implemented on a graphics processing unit (GPU). In this approach, the image stacks were preprocessed on the GPU by mean stretching to reduce the computational effort. The current image stack was registered to the previous image stack, which was taken as the reference. A fast Fourier transform (FFT) algorithm was used to calculate the shift of the image stack. The calculations for image registration were performed in different threads, while the preparation functionality was refactored and called only once by the master thread. We implemented our registration algorithm on an NVIDIA Quadro K4200 GPU under the Compute Unified Device Architecture (CUDA) programming environment. The experimental results showed that the registration computation can be completed within 550 ms for a full high-resolution brain image. Our approach also has potential for other dynamic image registrations in biomedical applications.
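
    The abstract states only that an FFT is used to compute the shift between stacks; phase correlation is a common way to do this, and the NumPy sketch below illustrates the idea on the CPU. The function name and the normalisation epsilon are choices made here for illustration, and the paper's CUDA implementation may differ in detail.

    import numpy as np

    def phase_correlation_shift(reference, moving, eps=1e-12):
        """Integer (row, col, ...) translation of `moving` relative to `reference`,
        from the peak of the inverse FFT of the normalized cross-power spectrum."""
        F_ref = np.fft.fftn(np.asarray(reference, dtype=float))
        F_mov = np.fft.fftn(np.asarray(moving, dtype=float))
        cross_power = F_mov * np.conj(F_ref)
        cross_power /= np.abs(cross_power) + eps      # keep phase information only
        correlation = np.fft.ifftn(cross_power).real
        peak = np.unravel_index(np.argmax(correlation), correlation.shape)
        # Fold peaks in the upper half of each axis back to negative shifts.
        return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape))

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        ref = rng.random((128, 128))
        mov = np.roll(ref, shift=(5, -3), axis=(0, 1))
        print(phase_correlation_shift(ref, mov))      # expected: (5, -3)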

  17. Computation and brain processes, with special reference to neuroendocrine systems.

    PubMed

    Toni, Roberto; Spaletta, Giulia; Casa, Claudia Della; Ravera, Simone; Sandri, Giorgio

    2007-01-01

    The development of neural networks and brain automata has made neuroscientists aware that the performance limits of these brain-like devices lie, at least in part, in their computational power. The computational basis of a standard cybernetic design, in fact, refers to that of a discrete and finite state machine or Turing Machine (TM). In contrast, it has been suggested that a number of human cerebral activities, from feedback controls up to mental processes, rely on a mixture of both finitary, digital-like and infinitary, continuous-like procedures. Therefore, the central nervous system (CNS) of man would exploit a form of computation going beyond that of a TM. This "non-conventional" computation has been called hybrid computation. Some basic structures for hybrid brain computation are believed to be the brain computational maps, in which both Turing-like (digital) computation and continuous (analog) forms of calculus might occur. The cerebral cortex and brain stem appear to be primary candidates for this processing. However, neuroendocrine structures like the hypothalamus are also believed to exhibit hybrid computational processes, and might give rise to computational maps. Current theories on neural activity, including wiring and volume transmission, neuronal group selection and dynamic evolving models of brain automata, lend support to the existence of natural hybrid computation, stressing a cooperation between discrete and continuous forms of communication in the CNS. In addition, the recent advent of neuromorphic chips, like those used to restore activity in damaged retina and visual cortex, suggests that the assumption of a discrete-continuum polarity in designing biocompatible neural circuitries is crucial for their ensuing performance. In these bionic structures, in fact, a correspondence exists between the original anatomical architecture and the synthetic wiring of the chip, resulting in a correspondence between natural and cybernetic neural activity. Thus, chip "form" provides a continuum essential to chip "function". We conclude that it is reasonable to predict the existence of hybrid computational processes in the course of many human brain integrating activities, urging the development of cybernetic approaches based on this modelling for adequate reproduction of a variety of cerebral performances.

  18. Parallel, stochastic measurement of molecular surface area.

    PubMed

    Juba, Derek; Varshney, Amitabh

    2008-08-01

    Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
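
    As a hedged, serial illustration of the stochastic idea only (not the authors' GPU code), the sketch below estimates the exposed area of a union of spheres by sampling points on each sphere and discarding those buried inside neighbouring spheres, in the spirit of Shrake-Rupley style sampling; adding a probe radius to each atomic radius would give a solvent-accessible variant.

    import numpy as np

    def sampled_surface_area(centers, radii, n_samples=2000, seed=0):
        """Stochastic estimate of the exposed surface area of a union of spheres
        (atoms). For each atom, random points are drawn on its sphere and the
        fraction not buried inside any other sphere scales its 4*pi*r^2 area."""
        rng = np.random.default_rng(seed)
        centers = np.asarray(centers, dtype=float)
        radii = np.asarray(radii, dtype=float)
        total = 0.0
        for i, (c, r) in enumerate(zip(centers, radii)):
            # Uniform directions on the unit sphere via normalized Gaussians.
            dirs = rng.normal(size=(n_samples, 3))
            dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
            points = c + r * dirs
            others = np.delete(np.arange(len(radii)), i)
            # A sample is buried if it lies strictly inside any other sphere.
            d2 = np.sum((points[:, None, :] - centers[others][None, :, :]) ** 2, axis=2)
            buried = np.any(d2 < radii[others][None, :] ** 2, axis=1)
            total += (1.0 - buried.mean()) * 4.0 * np.pi * r ** 2
        return total

    if __name__ == "__main__":
        # Two well-separated unit spheres: expected area is about 2 * 4*pi = 25.13.
        print(sampled_surface_area([[0, 0, 0], [10, 0, 0]], [1.0, 1.0]))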

  19. Selecting an Architecture for a Safety-Critical Distributed Computer System with Power, Weight and Cost Considerations

    NASA Technical Reports Server (NTRS)

    Torres-Pomales, Wilfredo

    2014-01-01

    This report presents an example of the application of multi-criteria decision analysis to the selection of an architecture for a safety-critical distributed computer system. The design problem includes constraints on minimum system availability and integrity, and the decision is based on the optimal balance of power, weight and cost. The analysis process includes the generation of alternative architectures, evaluation of the individual decision criteria, and the selection of an alternative based on overall value. In the example presented here, iterative application of the quantitative evaluation process made it possible to deliberately generate an alternative architecture that is superior to all others regardless of the relative importance of cost.
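
    As a rough illustration of the kind of evaluation described above, the sketch below scores three entirely hypothetical architecture alternatives with a simple weighted-sum value model. The criteria values, weights and alternative names are invented, and the report's own analysis may well use a different value function.

    # Hypothetical alternatives scored on power (W), weight (kg), and cost (k$);
    # lower is better for each, so raw values are normalized and inverted before weighting.
    alternatives = {
        "triplex_voting":     {"power": 120.0, "weight": 18.0, "cost": 950.0},
        "dual_self_checking": {"power": 95.0,  "weight": 14.0, "cost": 1100.0},
        "quad_redundant":     {"power": 160.0, "weight": 24.0, "cost": 1250.0},
    }
    weights = {"power": 0.3, "weight": 0.3, "cost": 0.4}   # relative importance, sums to 1

    def overall_value(alts, w):
        """Weighted-sum value: normalize each criterion to [0, 1], where 1 is best."""
        scores = {}
        for name, crit in alts.items():
            score = 0.0
            for c, wc in w.items():
                values = [a[c] for a in alts.values()]
                lo, hi = min(values), max(values)
                normalized = 1.0 if hi == lo else (hi - crit[c]) / (hi - lo)
                score += wc * normalized
            scores[name] = score
        return scores

    if __name__ == "__main__":
        ranking = sorted(overall_value(alternatives, weights).items(),
                         key=lambda kv: kv[1], reverse=True)
        for name, s in ranking:
            print(f"{name}: {s:.3f}")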

  20. Fault tolerant architectures for integrated aircraft electronics systems, task 2

    NASA Technical Reports Server (NTRS)

    Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.

    1984-01-01

    The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made, is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems, is described, and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.

  1. Experimental Demonstration of a Self-organized Architecture for Emerging Grid Computing Applications on OBS Testbed

    NASA Astrophysics Data System (ADS)

    Liu, Lei; Hong, Xiaobin; Wu, Jian; Lin, Jintong

    As Grid computing continues to gain popularity in the industry and research community, it also attracts more attention at the consumer level. The large number of users and the high frequency of job requests in the consumer market make this setting challenging. Clearly, current Client/Server (C/S)-based architectures will become unfeasible for supporting large-scale Grid applications due to their poor scalability and poor fault tolerance. In this paper, based on our previous works [1, 2], a novel self-organized architecture that realizes a highly scalable and flexible platform for Grids is proposed. Experimental results show that this architecture is suitable and efficient for consumer-oriented Grids.

  2. A cognitive computational model inspired by the immune system response.

    PubMed

    Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim

    2014-01-01

    The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that oscillates between the healthy and unhealthy states. Modeling the immune system is an enormous challenge; this paper therefore introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, in order to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have always stood out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of the main theoretical immunological perspectives (classic, cognitive, and danger theory) as related to computer science terminology. The paper presents a descriptive model of the immune system, to figure out the nature of the response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, while also illustrating our proposed architecture perspective.

  3. A Cognitive Computational Model Inspired by the Immune System Response

    PubMed Central

    Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim

    2014-01-01

    The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that oscillates between the healthy and unhealthy states. Modeling the immune system is an enormous challenge; this paper therefore introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, in order to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have always stood out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of the main theoretical immunological perspectives (classic, cognitive, and danger theory) as related to computer science terminology. The paper presents a descriptive model of the immune system, to figure out the nature of the response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, while also illustrating our proposed architecture perspective. PMID:25003131

  4. Generic Divide and Conquer Internet-Based Computing

    NASA Technical Reports Server (NTRS)

    Follen, Gregory J. (Technical Monitor); Radenski, Atanas

    2003-01-01

    The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer-to-Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve a better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing a dual-service architecture for P2P high-performance divide-and-conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide-and-conquer computations. The service engine is intended to provide free, useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor. Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize the idle processor time of lower-end Internet nodes. Our project is focused on a generic divide-and-conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever-changing pool of lower-end Internet nodes.
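
    The record describes the divide-and-conquer support only in architectural terms, so purely as an illustration of the programming pattern such a computing engine generalises, the sketch below shows a generic driver parameterised by four user-supplied callables, with local worker processes standing in for the pool of contributed Internet nodes; none of the names here are the project's API.

    from concurrent.futures import ProcessPoolExecutor

    def solve(problem, is_trivial, solve_trivial, divide, combine,
              depth=0, max_parallel_depth=1):
        """Generic divide-and-conquer driver: the four callables define the
        application, and subproblems near the root are shipped to worker
        processes (standing in here for remote contributor nodes)."""
        if is_trivial(problem):
            return solve_trivial(problem)
        subproblems = divide(problem)
        args = (is_trivial, solve_trivial, divide, combine, depth + 1, max_parallel_depth)
        if depth < max_parallel_depth:
            with ProcessPoolExecutor() as pool:
                futures = [pool.submit(solve, sp, *args) for sp in subproblems]
                results = [f.result() for f in futures]
        else:
            results = [solve(sp, *args) for sp in subproblems]
        return combine(results)

    # Module-level helpers so they can be pickled and sent to worker processes.
    def is_small(chunk):
        return len(chunk) <= 64

    def halve(chunk):
        mid = len(chunk) // 2
        return [chunk[:mid], chunk[mid:]]

    if __name__ == "__main__":
        numbers = list(range(1000))
        total = solve(numbers, is_small, sum, halve, sum)
        print(total, sum(numbers))    # both print 499500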

  5. Data Compression for Maskless Lithography Systems: Architecture, Algorithms and Implementation

    DTIC Science & Technology

    2008-05-19

    Data Compression for Maskless Lithography Systems: Architecture, Algorithms and Implementation. Vito Dai, Electrical Engineering and Computer Sciences, 2008.

  6. Strategic Mobility 21. Service Oriented Architecture (SOA) Reference Model - Global Transportation Management System Architecture

    DTIC Science & Technology

    2009-10-07

    The Strategic Mobility 21 (SM21) program is currently in the process of developing the Joint ... Platform (BPP), which enables the ability to rapidly compose new business processes and expand the core TMS feature-set to adapt to the challenges ... (Contract N00014-06-C-0060).

  7. Traffic information computing platform for big data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying; Zheng, Xibin

    The big data environment creates the data conditions for improving the quality of traffic information services. The target of this article is to construct a traffic information computing platform for the big data environment. Through an in-depth analysis of the characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee traffic safety and efficient operation, and enables more intelligent and personalized traffic information services for traffic information users.

  8. A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

    NASA Astrophysics Data System (ADS)

    Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

    2016-05-01

    In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
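
    As a schematic of the parallelism pattern only (with made-up data, not a working transport solver), the sketch below assigns independent tracks to threads dynamically and keeps the inner segment update vectorised across a wide energy-group dimension, loosely following the flat-source MOC segment update described in the MOC literature.

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    N_GROUPS = 64          # wide, SIMD-friendly inner dimension (energy groups)
    rng = np.random.default_rng(0)

    def sweep_track(track):
        """Process one characteristic track. The loop over segments is sequential,
        but each segment update is vectorised across all energy groups at once."""
        segment_lengths, sigma_t, source = track
        flux = np.zeros(N_GROUPS)
        psi = np.full(N_GROUPS, 1.0)                     # incoming angular flux (arbitrary)
        for length in segment_lengths:
            attenuation = np.exp(-sigma_t * length)      # vectorised over groups
            delta = (psi - source / sigma_t) * (1.0 - attenuation)
            flux += delta
            psi -= delta
        return flux

    def make_track(n_segments=200):
        return (rng.random(n_segments) * 0.5,
                rng.random(N_GROUPS) + 0.5,
                rng.random(N_GROUPS))

    if __name__ == "__main__":
        tracks = [make_track() for _ in range(1000)]
        # Dynamic assignment of independent tracks to threads (task-based parallelism);
        # NumPy releases the GIL inside many of its vectorised kernels.
        with ThreadPoolExecutor(max_workers=8) as pool:
            total_flux = sum(pool.map(sweep_track, tracks))
        print(total_flux.shape)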

  9. GPU-accelerated adjoint algorithmic differentiation

    NASA Astrophysics Data System (ADS)

    Gremse, Felix; Höfter, Andreas; Razik, Lukas; Kiessling, Fabian; Naumann, Uwe

    2016-03-01

    Many scientific problems such as classifier training or medical image reconstruction can be expressed as minimization of differentiable real-valued cost functions and solved with iterative gradient-based methods. Adjoint algorithmic differentiation (AAD) enables automated computation of gradients of such cost functions implemented as computer programs. To backpropagate adjoint derivatives, excessive memory is potentially required to store the intermediate partial derivatives on a dedicated data structure, referred to as the "tape". Parallelization is difficult because threads need to synchronize their accesses during taping and backpropagation. This situation is aggravated for many-core architectures, such as Graphics Processing Units (GPUs), because of the large number of light-weight threads and the limited memory size in general as well as per thread. We show how these limitations can be mediated if the cost function is expressed using GPU-accelerated vector and matrix operations which are recognized as intrinsic functions by our AAD software. We compare this approach with naive and vectorized implementations for CPUs. We use four increasingly complex cost functions to evaluate the performance with respect to memory consumption and gradient computation times. Using vectorization, CPU and GPU memory consumption could be substantially reduced compared to the naive reference implementation, in some cases even by an order of complexity. The vectorization allowed usage of optimized parallel libraries during forward and reverse passes which resulted in high speedups for the vectorized CPU version compared to the naive reference implementation. The GPU version achieved an additional speedup of 7.5 ± 4.4, showing that the processing power of GPUs can be utilized for AAD using this concept. Furthermore, we show how this software can be systematically extended for more complex problems such as nonlinear absorption reconstruction for fluorescence-mediated tomography.

  10. GPU-Accelerated Adjoint Algorithmic Differentiation.

    PubMed

    Gremse, Felix; Höfter, Andreas; Razik, Lukas; Kiessling, Fabian; Naumann, Uwe

    2016-03-01

    Many scientific problems such as classifier training or medical image reconstruction can be expressed as minimization of differentiable real-valued cost functions and solved with iterative gradient-based methods. Adjoint algorithmic differentiation (AAD) enables automated computation of gradients of such cost functions implemented as computer programs. To backpropagate adjoint derivatives, excessive memory is potentially required to store the intermediate partial derivatives on a dedicated data structure, referred to as the "tape". Parallelization is difficult because threads need to synchronize their accesses during taping and backpropagation. This situation is aggravated for many-core architectures, such as Graphics Processing Units (GPUs), because of the large number of light-weight threads and the limited memory size in general as well as per thread. We show how these limitations can be mediated if the cost function is expressed using GPU-accelerated vector and matrix operations which are recognized as intrinsic functions by our AAD software. We compare this approach with naive and vectorized implementations for CPUs. We use four increasingly complex cost functions to evaluate the performance with respect to memory consumption and gradient computation times. Using vectorization, CPU and GPU memory consumption could be substantially reduced compared to the naive reference implementation, in some cases even by an order of complexity. The vectorization allowed usage of optimized parallel libraries during forward and reverse passes which resulted in high speedups for the vectorized CPU version compared to the naive reference implementation. The GPU version achieved an additional speedup of 7.5 ± 4.4, showing that the processing power of GPUs can be utilized for AAD using this concept. Furthermore, we show how this software can be systematically extended for more complex problems such as nonlinear absorption reconstruction for fluorescence-mediated tomography.

  11. GPU-Accelerated Adjoint Algorithmic Differentiation

    PubMed Central

    Gremse, Felix; Höfter, Andreas; Razik, Lukas; Kiessling, Fabian; Naumann, Uwe

    2015-01-01

    Many scientific problems such as classifier training or medical image reconstruction can be expressed as minimization of differentiable real-valued cost functions and solved with iterative gradient-based methods. Adjoint algorithmic differentiation (AAD) enables automated computation of gradients of such cost functions implemented as computer programs. To backpropagate adjoint derivatives, excessive memory is potentially required to store the intermediate partial derivatives on a dedicated data structure, referred to as the “tape”. Parallelization is difficult because threads need to synchronize their accesses during taping and backpropagation. This situation is aggravated for many-core architectures, such as Graphics Processing Units (GPUs), because of the large number of light-weight threads and the limited memory size in general as well as per thread. We show how these limitations can be mediated if the cost function is expressed using GPU-accelerated vector and matrix operations which are recognized as intrinsic functions by our AAD software. We compare this approach with naive and vectorized implementations for CPUs. We use four increasingly complex cost functions to evaluate the performance with respect to memory consumption and gradient computation times. Using vectorization, CPU and GPU memory consumption could be substantially reduced compared to the naive reference implementation, in some cases even by an order of complexity. The vectorization allowed usage of optimized parallel libraries during forward and reverse passes which resulted in high speedups for the vectorized CPU version compared to the naive reference implementation. The GPU version achieved an additional speedup of 7.5 ± 4.4, showing that the processing power of GPUs can be utilized for AAD using this concept. Furthermore, we show how this software can be systematically extended for more complex problems such as nonlinear absorption reconstruction for fluorescence-mediated tomography. PMID:26941443
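
    To make the "tape" idea concrete, here is a toy, scalar, pure-Python reverse-mode sketch; it has nothing to do with the authors' GPU-accelerated AAD software beyond illustrating how recorded local derivatives are swept backwards through the computation graph, and the class and method names are invented.

    import math

    class Var:
        """Scalar node for a minimal tape-based reverse-mode AD sketch."""
        def __init__(self, value, parents=()):
            self.value = value
            self.parents = parents      # list of (parent_node, local_derivative)
            self.grad = 0.0

        def __add__(self, other):
            other = other if isinstance(other, Var) else Var(other)
            return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

        def __mul__(self, other):
            other = other if isinstance(other, Var) else Var(other)
            return Var(self.value * other.value,
                       [(self, other.value), (other, self.value)])

        def exp(self):
            v = math.exp(self.value)
            return Var(v, [(self, v)])

        def backward(self):
            """Backpropagate adjoints through the recorded graph (the 'tape')."""
            order, seen = [], set()
            def topo(node):
                if id(node) not in seen:
                    seen.add(id(node))
                    for parent, _ in node.parents:
                        topo(parent)
                    order.append(node)
            topo(self)
            self.grad = 1.0
            for node in reversed(order):
                for parent, local in node.parents:
                    parent.grad += node.grad * local

    if __name__ == "__main__":
        x, y = Var(2.0), Var(3.0)
        f = x * y + x.exp()              # f(x, y) = x*y + e^x
        f.backward()
        print(f.value, x.grad, y.grad)   # df/dx = y + e^x, df/dy = x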

  12. Dynamic array processing for computationally intensive expert systems in CLIPS

    NASA Technical Reports Server (NTRS)

    Athavale, N. N.; Ragade, R. K.; Fenske, T. E.; Cassaro, M. A.

    1990-01-01

    This paper puts forth an architecture for implementing a loop over an advanced array data structure in CLIPS. An attempt is made to use multi-field variables in such an architecture to process a set of data during the decision-making cycle. Current limitations of expert system shells are also briefly discussed in this paper. The resulting architecture is designed to circumvent the current limitations set by the expert system shell and also by the operating environment. Such advanced data structures are needed for tightly coupling symbolic and numeric computation modules.

  13. A model for architectural comparison

    NASA Astrophysics Data System (ADS)

    Ho, Sam; Snyder, Larry

    1988-04-01

    Recently, architectures for sequential computers became a topic of much discussion and controversy. At the center of this storm is the Reduced Instruction Set Computer, or RISC, first described at Berkeley in 1980. While the merits of the RISC architecture cannot be ignored, its opponents have tried to do just that, while its proponents have expanded and frequently exaggerated them. This state of affairs has persisted to this day. No attempt is made to settle this controversy, since indeed there is likely no one answer. A qualitative framework is provided for a rational discussion of the issues.

  14. Dual-scale topology optoelectronic processor.

    PubMed

    Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H

    1991-12-15

    The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.

  15. Recent advances in approximation concepts for optimum structural design

    NASA Technical Reports Server (NTRS)

    Barthelemy, Jean-Francois M.; Haftka, Raphael T.

    1991-01-01

    The basic approximation concepts used in structural optimization are reviewed. Some of the most recent developments in that area since the introduction of the concept in the mid-seventies are discussed. The paper distinguishes between local, medium-range, and global approximations; it covers functions approximations and problem approximations. It shows that, although the lack of comparative data established on reference test cases prevents an accurate assessment, there have been significant improvements. The largest number of developments have been in the areas of local function approximations and use of intermediate variable and response quantities. It also appears that some new methodologies are emerging which could greatly benefit from the introduction of new computer architecture.

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sierra Thermal /Fluid Team

    The SIERRA Low Mach Module: Fuego along with the SIERRA Participating Media Radiation Module: Syrinx, henceforth referred to as Fuego and Syrinx, respectively, are the key elements of the ASCI fire environment simulation project. The fire environment simulation project is directed at characterizing both open large-scale pool fires and building enclosure fires. Fuego represents the turbulent, buoyantly-driven incompressible flow, heat transfer, mass transfer, combustion, soot, and absorption coefficient model portion of the simulation software. Syrinx represents the participating-media thermal radiation mechanics. This project is an integral part of the SIERRA multi-mechanics software development project. Fuego depends heavily upon the core architecture developments provided by SIERRA for massively parallel computing, solution adaptivity, and mechanics coupling on unstructured grids.

  17. Combining Self-Explaining with Computer Architecture Diagrams to Enhance the Learning of Assembly Language Programming

    ERIC Educational Resources Information Center

    Hung, Y.-C.

    2012-01-01

    This paper investigates the impact of combining self explaining (SE) with computer architecture diagrams to help novice students learn assembly language programming. Pre- and post-test scores for the experimental and control groups were compared and subjected to covariance (ANCOVA) statistical analysis. Results indicate that the SE-plus-diagram…

  18. A Model for Minimizing Numeric Function Generator Complexity and Delay

    DTIC Science & Technology

    2007-12-01

    Numeric function generators (NFGs) allow computation of difficult mathematical functions in less time and with less hardware than commonly employed methods. They compute piecewise ... Programmable Gate Arrays (FPGAs). The algorithms and estimation techniques apply to various NFG architectures and mathematical functions. This ... thesis compares hardware utilization and propagation delay for various NFG architectures, mathematical functions, word widths, and segmentation methods.

  19. Usage of Thin-Client/Server Architecture in Computer Aided Education

    ERIC Educational Resources Information Center

    Cimen, Caghan; Kavurucu, Yusuf; Aydin, Halit

    2014-01-01

    With the advances of technology, thin-client/server architecture has become popular in multi-user/single network environments. Thin-client is a user terminal in which the user can login to a domain and run programs by connecting to a remote server. Recent developments in network and hardware technologies (cloud computing, virtualization, etc.)…

  20. Optical Computing Based on Neuronal Models

    DTIC Science & Technology

    1988-05-01

    walking, and cognition are far too complex for existing sequential digital computers. Therefore new architectures, hardware, and algorithms modeled ... collective behavior, and iterative processing into optical processing and artificial neurodynamical systems. Another intriguing promise of neural nets is ... with architectures, implementations, and programming; and material research is called for. Our future research in neurodynamics will continue to

  1. Using SPEEDES to simulate the blue gene interconnect network

    NASA Technical Reports Server (NTRS)

    Springer, P.; Upchurch, E.

    2003-01-01

    JPL and the Center for Advanced Computer Architecture (CACR) are conducting application and simulation analyses of BG/L in order to establish a range of effectiveness for the Blue Gene/L MPP architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real-world ASCI application execution.

  2. The Use of Metaphors as a Parametric Design Teaching Model: A Case Study

    ERIC Educational Resources Information Center

    Agirbas, Asli

    2018-01-01

    Teaching methodologies for parametric design are being researched all over the world, since there is a growing demand for computer programming logic and its fabrication process in architectural education. The computer programming courses in architectural education are usually done in a very short period of time, and so students have no chance to…

  3. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  4. Service-Oriented Architecture for NVO and TeraGrid Computing

    NASA Technical Reports Server (NTRS)

    Jacob, Joseph; Miller, Craig; Williams, Roy; Steenberg, Conrad; Graham, Matthew

    2008-01-01

    The National Virtual Observatory (NVO) Extensible Secure Scalable Service Infrastructure (NESSSI) is a Web service architecture and software framework that enables Web-based astronomical data publishing and processing on grid computers such as the National Science Foundation's TeraGrid. Characteristics of this architecture include the following: (1) Services are created, managed, and upgraded by their developers, who are trusted users of computing platforms on which the services are deployed. (2) Service jobs can be initiated by means of Java or Python client programs run on a command line or with Web portals. (3) Access is granted within a graduated security scheme in which the size of a job that can be initiated depends on the level of authentication of the user.

  5. A multitasking finite state architecture for computer control of an electric powertrain

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burba, J.C.

    1984-01-01

    Finite state techniques provide a common design language between the control engineer and the computer engineer for event driven computer control systems. They simplify communication and provide a highly maintainable control system understandable by both. This paper describes the development of a control system for an electric vehicle powertrain utilizing finite state concepts. The basics of finite state automata are provided as a framework to discuss a unique multitasking software architecture developed for this application. The architecture employs conventional time-sliced techniques with task scheduling controlled by a finite state machine representation of the control strategy of the powertrain. The complexities of excitation variable sampling in this environment are also considered.
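
    To make the table-driven idea concrete, the sketch below pairs a small finite state machine with a per-state task list of the kind a time-sliced executive might consult on each cycle; the states, events and task names are invented for illustration and are not taken from the paper.

    # Hypothetical transition table and per-state task lists for a powertrain FSM.
    TRANSITIONS = {
        ("idle", "key_on"): "precharge",
        ("precharge", "bus_ready"): "drive",
        ("drive", "fault_detected"): "shutdown",
        ("drive", "key_off"): "shutdown",
        ("shutdown", "reset"): "idle",
    }

    TASKS_BY_STATE = {
        "idle": ["monitor_battery"],
        "precharge": ["monitor_battery", "close_contactors"],
        "drive": ["sample_throttle", "run_torque_loop", "monitor_battery"],
        "shutdown": ["open_contactors", "log_fault"],
    }

    class PowertrainFSM:
        def __init__(self, state="idle"):
            self.state = state

        def handle(self, event):
            """Move to the next state if the (state, event) pair is defined."""
            self.state = TRANSITIONS.get((self.state, event), self.state)
            return self.state

        def scheduled_tasks(self):
            """Tasks the time-sliced executive should run in the current state."""
            return TASKS_BY_STATE[self.state]

    if __name__ == "__main__":
        fsm = PowertrainFSM()
        for event in ["key_on", "bus_ready", "fault_detected", "reset"]:
            state = fsm.handle(event)
            print(f"{event:>16} -> {state:<10} tasks: {fsm.scheduled_tasks()}")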

  6. Architecture and data processing alternatives for Tse computer. Volume 1: Tse logic design concepts and the development of image processing machine architectures

    NASA Technical Reports Server (NTRS)

    Rickard, D. A.; Bodenheimer, R. E.

    1976-01-01

    Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.

  7. Network architecture test-beds as platforms for ubiquitous computing.

    PubMed

    Roscoe, Timothy

    2008-10-28

    Distributed systems research, and in particular ubiquitous computing, has traditionally assumed the Internet as a basic underlying communications substrate. Recently, however, the networking research community has come to question the fundamental design or 'architecture' of the Internet. This has been led by two observations: first, that the Internet as it stands is now almost impossible to evolve to support new functionality; and second, that modern applications of all kinds now use the Internet rather differently, and frequently implement their own 'overlay' networks above it to work around its perceived deficiencies. In this paper, I discuss recent academic projects to allow disruptive change to the Internet architecture, and also outline a radically different view of networking for ubiquitous computing that such proposals might facilitate.

  8. FPGA-based real-time phase measuring profilometry algorithm design and implementation

    NASA Astrophysics Data System (ADS)

    Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng

    2016-11-01

    Phase measuring profilometry (PMP) has been widely used in many fields, such as computer-aided verification (CAV) and flexible manufacturing systems (FMS). High frame-rate (HFR) real-time vision-based feedback control will be a common demand in the near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limits the efficiency of data processing. FPGAs have the advantages of a pipelined architecture and parallel execution, which makes them well suited to the PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of the hardware architecture include rectification, phase calculation, phase shifting, and stereo matching. Experiments verified the performance of this method, and the factors that may influence the computation accuracy were analyzed.
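
    The phase-calculation stage of PMP is commonly an N-step phase-shifting formula; as one illustrative variant (not necessarily the one implemented in the paper's FPGA pipeline), the four-step case with shifts of 0, pi/2, pi and 3pi/2 is sketched below in NumPy, followed by standard unwrapping of the resulting wrapped phase.

    import numpy as np

    def four_step_phase(i1, i2, i3, i4):
        """Wrapped phase from four fringe images captured with phase shifts of
        0, pi/2, pi and 3*pi/2: phi = arctan2(I4 - I2, I1 - I3)."""
        i1, i2, i3, i4 = (np.asarray(i, dtype=float) for i in (i1, i2, i3, i4))
        return np.arctan2(i4 - i2, i1 - i3)

    if __name__ == "__main__":
        # Synthetic fringes with a known linear phase ramp along each row.
        x = np.linspace(0.0, 4.0 * np.pi, 256)
        phase = np.tile(x, (64, 1))
        images = [np.cos(phase + k * np.pi / 2.0) for k in range(4)]
        wrapped = four_step_phase(*images)          # values in (-pi, pi]
        unwrapped = np.unwrap(wrapped, axis=1)      # remove 2*pi jumps row-wise
        print(np.allclose(unwrapped, phase))        # True: the ramp is recovered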

  9. An MPI + $X$ implementation of contact global search using Kokkos

    DOE PAGES

    Hansen, Glen A.; Xavier, Patrick G.; Mish, Sam P.; ...

    2015-10-05

    This paper describes an approach that seeks to parallelize the spatial search associated with computational contact mechanics. In contact mechanics, the purpose of the spatial search is to find “nearest neighbors,” which is the prelude to an imprinting search that resolves the interactions between the external surfaces of contacting bodies. In particular, we are interested in the contact global search portion of the spatial search associated with this operation on domain-decomposition-based meshes. Specifically, we describe an implementation that combines standard domain-decomposition-based MPI-parallel spatial search with thread-level parallelism (MPI-X) available on advanced computer architectures (those with GPU coprocessors). Our goal is to demonstrate the efficacy of the MPI-X paradigm in the overall contact search. Standard MPI-parallel implementations typically use a domain decomposition of the external surfaces of bodies within the domain in an attempt to efficiently distribute computational work. This decomposition may or may not be the same as the volume decomposition associated with the host physics. The parallel contact global search phase is then employed to find and distribute surface entities (nodes and faces) that are needed to compute contact constraints between entities owned by different MPI ranks without further inter-rank communication. Key steps of the contact global search include computing bounding boxes, building surface entity (node and face) search trees and finding and distributing entities required to complete on-rank (local) spatial searches. To enable source-code portability and performance across a variety of different computer architectures, we implemented the algorithm using the Kokkos hardware abstraction library. While we targeted development towards machines with a GPU accelerator per MPI rank, we also report performance results for OpenMP with a conventional multi-core compute node per rank. Results here demonstrate a 47% decrease in the time spent within the global search algorithm, comparing the reference ACME algorithm with the GPU implementation, on an 18M face problem using four MPI ranks. While further work remains to maximize performance on the GPU, this result illustrates the potential of the proposed implementation.

  10. Hardware accelerated high performance neutron transport computation based on AGENT methodology

    NASA Astrophysics Data System (ADS)

    Xiao, Shanjie

    The spatial heterogeneity of next-generation Gen-IV nuclear reactor core designs brings challenges to neutron transport analysis. The Arbitrary Geometry Neutron Transport (AGENT) code is a three-dimensional neutron transport analysis code being developed at the Laboratory for Neutronics and Geometry Computation (NEGE) at Purdue University. It can accurately describe spatial heterogeneity in a hierarchical structure through the R-function solid modeler. The previous version of AGENT coupled the 2D transport MOC solver and the 1D diffusion NEM solver to solve the three-dimensional Boltzmann transport equation. In this research, the 2D/1D coupling methodology was expanded to couple two transport solvers, the radial 2D MOC solver and the axial 1D MOC solver, for better accuracy. The expansion was benchmarked with the widely applied C5G7 benchmark models and two fast breeder reactor models, and showed good agreement with the reference Monte Carlo results. In practice, accurate neutron transport analysis for a full reactor core is still time-consuming, which limits its application. Therefore, another part of my research is focused on designing specific hardware based on reconfigurable computing techniques in order to accelerate AGENT computations. This is the first time that an application of this type has been used in reactor physics and neutron transport for reactor design. The most time-consuming part of the AGENT algorithm was identified, and the architecture of the AGENT acceleration system was designed based on this analysis. Through parallel computation on the specially designed, highly efficient architecture, the FPGA-based acceleration design achieves high performance at a much lower working frequency than CPUs. Full design simulations show that the acceleration design would be able to speed up large-scale AGENT computations by about 20 times. The high-performance AGENT acceleration system will drastically shorten the computation time for 3D full-core neutron transport analysis, making the AGENT methodology unique and advantageous, and thus opening the possibility of extending the application range of neutron transport analysis in both industrial engineering and academic research.

  11. CisLunar Habitat Internal Architecture Design Criteria

    NASA Technical Reports Server (NTRS)

    Jones, R.; Kennedy, K.; Howard, R.; Whitmore, M.; Martin, C.; Garate, J.

    2017-01-01

    BACKGROUND: In preparation for human exploration to Mars, there is a need to define the development and test program that will validate deep space operations and systems. In that context, a Proving Grounds CisLunar habitat spacecraft is being defined as the next step towards this goal. This spacecraft will operate differently from the ISS or other spacecraft in human history. The performance envelope of this spacecraft (mass, volume, power, specifications, etc.) is being defined by the Future Capabilities Study Team. This team has recognized the need for a human-centered approach for the internal architecture of this spacecraft and has commissioned a CisLunar Phase-1 Habitat Internal Architecture Study Team to develop a NASA reference configuration, providing the Agency with a "smart buyer" approach for future acquisition. THE CISLUNAR HABITAT INTERNAL ARCHITECTURE STUDY: Overall, the CisLunar Habitat Internal Architecture study will address the most significant questions and risks in the current CisLunar architecture, habitation, and operations concept development. This effort is achieved through definition of design criteria, evaluation criteria and process, design of the CisLunar Habitat Phase-1 internal architecture, and the development and fabrication of internal architecture concepts combined with rigorous and methodical Human-in-the-Loop (HITL) evaluations and testing of the conceptual innovations in a controlled test environment. The vision of the CisLunar Habitat Internal Architecture Study is to design, build, and test a CisLunar Phase-1 Habitat Internal Architecture that will be used for habitation (e.g. habitability and human factors) evaluations. The evaluations will mature CisLunar habitat evaluation tools, guidelines, and standards, and will interface with other projects such as the Advanced Exploration Systems (AES) Program integrated Power, Avionics, Software (iPAS), and Logistics for integrated human-in-the-loop testing. The mission of the CisLunar Habitat Internal Architecture Study is to become a forcing function to establish a common understanding of CisLunar Phase-1 Habitation Internal Architecture design criteria, processes, and tools. The scope of the CisLunar Habitat Internal Architecture study is to design, develop, demonstrate, and evaluate a Phase-1 CisLunar Habitat common module internal architecture based on design criteria agreed to by NASA, the International Partners, and Commercial Exploration teams. This task is to define the CisLunar Phase-1 Internal Architecture Government Reference Design, assist NASA in becoming a "smart buyer" for Phase-1 Habitat Concepts, and ultimately to derive standards and requirements from the Internal Architecture Design Process. The first step was to define a Habitat Internal Architecture Design Criteria and create a structured philosophy to be used by design teams as a filter by which critical aspects of consideration would be identified for the purpose of organizing and utilizing interior spaces. With design criteria in place, the team will develop a series of iterative internal architecture concept designs which will be assessed by means of an evaluation criteria and process. These assessments will successively drive and refine the design, leading to the combination and down-selection of design concepts. A single refined reference design configuration will be developed into in a medium-to-high fidelity mockup. 
A multi-day human-in-the-loop mission test will fully evaluate the reference design and validate its configuration. Lessons learned from the design and evaluation will enable the team to identify appropriate standards for Phase-1 CisLunar Habitat Internal Architecture and will enable NASA to develop derived requirements in support of maturing CisLunar Habitation capabilities. This paper will describe the criteria definition process, workshop event, and resulting CisLunar Phase-1 Habitat Internal Architecture Design Criteria.

  12. The Jupyter/IPython architecture: a unified view of computational research, from interactive exploration to communication and publication.

    NASA Astrophysics Data System (ADS)

    Ragan-Kelley, M.; Perez, F.; Granger, B.; Kluyver, T.; Ivanov, P.; Frederic, J.; Bussonnier, M.

    2014-12-01

    IPython has provided terminal-based tools for interactive computing in Python since 2001. The notebook document format and multi-process architecture introduced in 2011 have expanded the applicable scope of IPython into teaching, presenting, and sharing computational work, in addition to interactive exploration. The new architecture also allows users to work in any language, with implementations in Python, R, Julia, Haskell, and several other languages. The language-agnostic parts of IPython have been renamed to Jupyter, to better capture the notion that a cross-language design can encapsulate commonalities present in computational research regardless of the programming language being used. This architecture offers components like the web-based Notebook interface, which supports rich documents that combine code and computational results with text narratives, mathematics, images, video, and any media that a modern browser can display. This interface can be used not only in research, but also for publication and education, as notebooks can be converted to a variety of output formats, including HTML and PDF. Recent developments in the Jupyter project include a multi-user environment for hosting notebooks for a class or research group, live collaboration on notebooks via Google Docs, and better support for languages other than Python.
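    As a concrete illustration of the notebook-as-document idea described above, the sketch below builds a small notebook programmatically with the nbformat library and notes how it could then be converted for publication. The file name and cell contents are illustrative assumptions, not part of the record above.

```python
import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

# Assemble a notebook that mixes narrative and executable code in one document.
nb = new_notebook()
nb.cells.append(new_markdown_cell("# A tiny reproducible analysis"))
nb.cells.append(new_code_cell("result = sum(range(10))\nresult"))
nbformat.write(nb, "example.ipynb")

# For publication, the same file can be executed and converted, e.g.:
#   jupyter nbconvert --to html --execute example.ipynb
```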

  13. OS friendly microprocessor architecture: Hardware level computer security

    NASA Astrophysics Data System (ADS)

    Jungwirth, Patrick; La Fratta, Patrick

    2016-05-01

    We present an introduction to the patented OS Friendly Microprocessor Architecture (OSFA) and hardware-level computer security. Conventional microprocessors have not tried to balance hardware performance and OS performance at the same time. Conventional microprocessors have depended on the Operating System for computer security and information assurance. The goal of the OS Friendly Architecture is to provide a high performance and secure microprocessor and OS system. We are interested in having cyber security, information technology (IT), and SCADA control professionals review the hardware-level security features. The OS Friendly Architecture is a switched set of cache memory banks in a pipeline configuration. For light-weight threads, the memory pipeline configuration provides near instantaneous context switching times. The pipelining and parallelism provided by the cache memory pipeline allow background cache read and write operations while the microprocessor's execution pipeline is running instructions. The cache bank selection controllers provide arbitration to prevent the memory pipeline and the microprocessor's execution pipeline from accessing the same cache bank at the same time. This separation allows cache memory pages to transfer to and from level 1 (L1) caching while the microprocessor pipeline is executing instructions. Computer security operations are implemented in hardware. By extending Unix file permission bits to each cache memory bank and memory address, the OSFA provides hardware-level computer security.
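    The central security idea above, permission bits attached to each cache memory bank and checked on every access, can be pictured with a short software analogy. The sketch below is only an illustration of the concept: the class and bank names are invented, and in the OSFA the checks are performed by hardware, not by Python code.

```python
from enum import Flag, auto

class Perm(Flag):
    R = auto()  # read
    W = auto()  # write
    X = auto()  # execute

class CacheBank:
    """Toy model of a cache bank that carries its own permission bits."""
    def __init__(self, perms):
        self.perms = perms
        self.lines = {}

    def access(self, addr, kind):
        # Every access is checked against the bank's permission bits.
        if kind not in self.perms:
            raise PermissionError(f"{kind} denied at address {addr:#x}")
        return self.lines.get(addr, 0)

code_bank = CacheBank(Perm.R | Perm.X)  # executable, never writable
data_bank = CacheBank(Perm.R | Perm.W)  # writable, never executable

data_bank.access(0x1000, Perm.W)        # allowed
try:
    code_bank.access(0x2000, Perm.W)    # attempted write into code: rejected
except PermissionError as err:
    print(err)
```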

  14. Developing a New Framework for Integration and Teaching of Computer Aided Architectural Design (CAAD) in Nigerian Schools of Architecture

    ERIC Educational Resources Information Center

    Uwakonye, Obioha; Alagbe, Oluwole; Oluwatayo, Adedapo; Alagbe, Taiye; Alalade, Gbenga

    2015-01-01

    As a result of globalization of digital technology, intellectual discourse on what constitutes the basic body of architectural knowledge to be imparted to future professionals has been on the increase. This digital revolution has brought to the fore the need to review the already overloaded architectural education curriculum of Nigerian schools of…

  15. Understanding Portability of a High-Level Programming Model on Contemporary Heterogeneous Architectures

    DOE PAGES

    Sabne, Amit J.; Sakdhnagool, Putt; Lee, Seyong; ...

    2015-07-13

    Accelerator-based heterogeneous computing is gaining momentum in the high-performance computing arena. However, the increased complexity of heterogeneous architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle this problem. Although the abstraction provided by OpenACC offers productivity, it raises questions concerning both functional and performance portability. In this article, the authors propose HeteroIR, a high-level, architecture-independent intermediate representation, to map high-level programming models, such as OpenACC, to heterogeneous architectures. They present a compiler approach that translates OpenACC programs into HeteroIR and accelerator kernels to obtain OpenACC functional portability. They then evaluate the performance portability obtained by OpenACC with their approach on 12 OpenACC programs on Nvidia CUDA, AMD GCN, and Intel Xeon Phi architectures. They study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.

  16. The future of computing--new architectures and new technologies.

    PubMed

    Warren, P

    2004-02-01

    All modern computers are designed using the 'von Neumann' architecture and built using silicon transistor technology. Both architecture and technology have been remarkably successful. Yet there are a range of problems for which this conventional architecture is not particularly well adapted, and new architectures are being proposed to solve these problems, in particular based on insight from nature. Transistor technology has enjoyed 50 years of continuing progress. However, the laws of physics dictate that within a relatively short time period this progress will come to an end. New technologies, based on molecular and biological sciences as well as quantum physics, are vying to replace silicon, or at least coexist with it and extend its capability. The paper describes these novel architectures and technologies, places them in the context of the kinds of problems they might help to solve, and predicts their possible manner and time of adoption. Finally it describes some key questions and research problems associated with their use.

  17. Scalable service architecture for providing strong service guarantees

    NASA Astrophysics Data System (ADS)

    Christin, Nicolas; Liebeherr, Joerg

    2002-07-01

    For the past decade, a lot of Internet research has been devoted to providing different levels of service to applications. Initial proposals for service differentiation provided strong service guarantees, with strict bounds on delays, loss rates, and throughput, but required high overhead in terms of computational complexity and memory, both of which raise scalability concerns. Recently, the interest has shifted to service architectures with low overhead. However, these newer service architectures only provide weak service guarantees, which do not always address the needs of applications. In this paper, we describe a service architecture that supports strong service guarantees, can be implemented with low computational complexity, and requires maintaining only a small amount of state information. A key mechanism of the proposed service architecture is that it addresses scheduling and buffer management in a single algorithm. The presented architecture offers no solution for controlling the amount of traffic that enters the network. Instead, we plan on exploiting feedback mechanisms of TCP congestion control algorithms for the purpose of regulating the traffic entering the network.

  18. GPU-computing in econophysics and statistical physics

    NASA Astrophysics Data System (ADS)

    Preis, T.

    2011-03-01

    A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction into the field of GPU computing and includes examples. In particular, computationally expensive analyses employed in a financial market context are coded on a graphics card architecture, which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.
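    For orientation, the Ising model mentioned above is commonly simulated with Metropolis single-spin updates. The NumPy sketch below is a minimal CPU reference of that computation (lattice size, temperature, and sweep count are illustrative); a GPU port of the kind described in the article typically reorganizes it into checkerboard updates so that non-interacting sites can be flipped in parallel.

```python
import numpy as np

def metropolis_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2-D Ising lattice with periodic boundaries (J = 1)."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        neighbors = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                     + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * spins[i, j] * neighbors  # energy change if spin (i, j) is flipped
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(32, 32))
for _ in range(200):
    metropolis_sweep(spins, beta=0.44, rng=rng)  # beta near the critical point
print("magnetization per site:", spins.mean())
```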

  19. Accelerating Astronomy & Astrophysics in the New Era of Parallel Computing: GPUs, Phi and Cloud Computing

    NASA Astrophysics Data System (ADS)

    Ford, Eric B.; Dindar, Saleh; Peters, Jorg

    2015-08-01

    The realism of astrophysical simulations and statistical analyses of astronomical data are set by the available computational resources. Thus, astronomers and astrophysicists are constantly pushing the limits of computational capabilities. For decades, astronomers benefited from massive improvements in computational power that were driven primarily by increasing clock speeds and required relatively little attention to details of the computational hardware. For nearly a decade, increases in computational capabilities have come primarily from increasing the degree of parallelism, rather than increasing clock speeds. Further increases in computational capabilities will likely be led by many-core architectures such as Graphics Processing Units (GPUs) and Intel Xeon Phi. Successfully harnessing these new architectures requires significantly more understanding of the hardware architecture, cache hierarchy, compiler capabilities, and network characteristics. I will provide an astronomer's overview of the opportunities and challenges provided by modern many-core architectures and elastic cloud computing. The primary goal is to help an astronomical audience understand what types of problems are likely to yield speed-ups of more than an order of magnitude and which problems are unlikely to parallelize sufficiently efficiently to be worth the development time and/or costs. I will draw on my experience leading a team in developing the Swarm-NG library for parallel integration of large ensembles of small n-body systems on GPUs, as well as several smaller software projects. I will share lessons learned from collaborating with computer scientists, including both technical and soft skills. Finally, I will discuss the challenges of training the next generation of astronomers to be proficient in this new era of high-performance computing, drawing on experience teaching a graduate class on High-Performance Scientific Computing for Astrophysics and organizing a 2014 advanced summer school on Bayesian Computing for Astronomical Data Analysis with support of the Penn State Center for Astrostatistics and Institute for CyberScience.

  20. Power Efficient Hardware Architecture of SHA-1 Algorithm for Trusted Mobile Computing

    NASA Astrophysics Data System (ADS)

    Kim, Mooseop; Ryou, Jaecheol

    The Trusted Mobile Platform (TMP) is developed and promoted by the Trusted Computing Group (TCG), which is an industry standard body to enhance the security of the mobile computing environment. The built-in SHA-1 engine in TMP is one of the most important circuit blocks and contributes to the performance of the whole platform because it is used as a key primitive supporting platform integrity and command authentication. Mobile platforms have very stringent limitations with respect to available power, physical circuit area, and cost. Therefore, special architecture and design methods for a low power SHA-1 circuit are required. In this paper, we present a novel and efficient hardware architecture of a low power SHA-1 design for TMP. Our low power SHA-1 hardware can compute a 512-bit data block using fewer than 7,000 gates and consumes about 1.1 mA on a 0.25 μm CMOS process.
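    For reference, the per-block computation that such a SHA-1 engine realizes in hardware is the standard 80-round compression function. The Python sketch below shows that algorithm, checked against hashlib for a single padded block; it illustrates what the circuit computes, not the paper's low-power implementation.

```python
import hashlib
import struct

def _rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_compress(state, block):
    """Process one 512-bit block with the 80-round SHA-1 compression function."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        a, b, c, d, e = ((_rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF,
                         a, _rotl(b, 30), c, d)
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))

# One fully padded block for the message b"abc" (0x80 pad plus 64-bit bit length).
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
iv = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
digest = b"".join(h.to_bytes(4, "big") for h in sha1_compress(iv, block))
assert digest == hashlib.sha1(b"abc").digest()
```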

  1. An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1996-01-01

    We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  2. Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms

    NASA Technical Reports Server (NTRS)

    Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.

    1997-01-01

    We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.

  3. Performance Analysis of Distributed Object-Oriented Applications

    NASA Technical Reports Server (NTRS)

    Schoeffler, James D.

    1998-01-01

    The purpose of this research was to evaluate the efficiency of a distributed simulation architecture which creates individual modules which are made self-scheduling through the use of a message-based communication system used for requesting input data from another module which is the source of that data. To make the architecture as general as possible, the message-based communication architecture was implemented using standard remote object architectures (Common Object Request Broker Architecture (CORBA) and/or Distributed Component Object Model (DCOM)). A series of experiments were run in which different systems are distributed in a variety of ways across multiple computers and the performance evaluated. The experiments were duplicated in each case so that the overhead due to message communication and data transmission can be separated from the time required to actually perform the computational update of a module each iteration. The software used to distribute the modules across multiple computers was developed in the first year of the current grant and was modified considerably to add a message-based communication scheme supported by the DCOM distributed object architecture. The resulting performance was analyzed using a model created during the first year of this grant which predicts the overhead due to CORBA and DCOM remote procedure calls and includes the effects of data passed to and from the remote objects. A report covering the distributed simulation software and the results of the performance experiments has been submitted separately. The above report also discusses possible future work to apply the methodology to dynamically distribute the simulation modules so as to minimize overall computation time.

  4. Parallel algorithms for mapping pipelined and parallel computations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.

    1988-01-01

    Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm³) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm²) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
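    The flavor of the mapping problem can be seen in a stripped-down form: assign a chain of m module weights to n processors as contiguous blocks so that the busiest processor is as lightly loaded as possible. The sketch below solves only that simplified version, using the standard binary-search-plus-greedy bottleneck partitioning technique; it ignores communication costs, is not the algorithm of the paper, and the weights are made up.

```python
def min_bottleneck_partition(weights, n_procs):
    """Smallest achievable bottleneck load when a chain of module weights is
    split into at most n_procs contiguous blocks (integer weights assumed)."""
    def feasible(cap):
        blocks, load = 1, 0
        for w in weights:
            if w > cap:
                return False
            if load + w > cap:           # start a new block on the next processor
                blocks, load = blocks + 1, w
            else:
                load += w
        return blocks <= n_procs

    lo, hi = max(weights), sum(weights)
    while lo < hi:                       # binary search on the bottleneck value
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

print(min_bottleneck_partition([4, 2, 7, 1, 5, 3], n_procs=3))  # -> 8
```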

  5. Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

    NASA Technical Reports Server (NTRS)

    Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob

    2003-01-01

    The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the performance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.

  6. Programming for 1.6 Millon cores: Early experiences with IBM's BG/Q SMP architecture

    NASA Astrophysics Data System (ADS)

    Glosli, James

    2013-03-01

    With the stall in clock cycle improvements a decade ago, the drive for computational performance has continued along a path of increasing core counts on a processor. The multi-core evolution has been expressed in both a symmetric multiprocessor (SMP) architecture and a CPU/GPU architecture. Debates rage in the high performance computing (HPC) community over which architecture best serves HPC. In this talk I will not attempt to resolve that debate but perhaps fuel it. I will discuss the experience of exploiting Sequoia, a 98,304-node IBM Blue Gene/Q SMP at Lawrence Livermore National Laboratory. The advantages and challenges of leveraging the computational power of BG/Q will be detailed through the discussion of two applications. The first application is a Molecular Dynamics code called ddcMD. This is a code developed over the last decade at LLNL and ported to BG/Q. The second application is a cardiac modeling code called Cardioid. This is a code that was recently designed and developed at LLNL to exploit the fine scale parallelism of BG/Q's SMP architecture. Through the lens of these efforts, I'll illustrate the need to rethink how we express and implement our computational approaches. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

  7. A Survey on Next-generation Power Grid Data Architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    You, Shutang; Zhu, Dr. Lin; Liu, Yong

    2015-01-01

    The operation and control of power grids will increasingly rely on data. A high-speed, reliable, flexible and secure data architecture is the prerequisite of the next-generation power grid. This paper summarizes the challenges in collecting and utilizing power grid data, and then provides a reference data architecture for future power grids. Based on the data architecture deployment, related research on data architecture is reviewed and summarized in several categories including data measurement/actuation, data transmission, the data service layer, and data utilization, as well as two cross-cutting issues, interoperability and cyber security. Research gaps and future work are also presented.

  8. Parallel compression/decompression-based datapath architecture for multibeam mask writers

    NASA Astrophysics Data System (ADS)

    Chaudhary, Narendra; Savari, Serap A.

    2017-06-01

    Multibeam electron beam systems will be used in the future for mask writing and for complementary lithography. The major challenges of the multibeam systems are in meeting throughput requirements and in handling the large data volumes associated with writing grayscale data on the wafer. In terms of future communications and computational requirements, Amdahl's law suggests that a simple increase of computation power and parallelism may not be a sustainable solution. We propose a parallel data compression algorithm to exploit the sparsity of mask data and a grayscale video-like representation of data. To improve the communication and computational efficiency of these systems at the write time, we propose an alternate datapath architecture partly motivated by multibeam direct-write lithography and partly motivated by the circuit testing literature, where parallel decompression reduces clock cycles. We explain a deflection plate architecture inspired by NuFlare Technology's multibeam mask writing system and how our datapath architecture can be easily added to it to improve performance.
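    Mask exposure data tend to contain long runs of identical (often zero) dose values, which is what makes compression attractive. The toy run-length encoder below illustrates that sparsity argument only; it is not the parallel compression scheme proposed in the paper, and the example row is invented.

```python
def rle_encode(row):
    """Run-length encode one row of grayscale dose values as (value, count) pairs."""
    runs = []
    value, count = row[0], 1
    for v in row[1:]:
        if v == value:
            count += 1
        else:
            runs.append((value, count))
            value, count = v, 1
    runs.append((value, count))
    return runs

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

row = [0] * 500 + [7, 7, 3] + [0] * 300   # a mostly empty stripe of the mask
runs = rle_encode(row)
assert rle_decode(runs) == row
print(f"{len(row)} pixels compressed to {len(runs)} runs")
```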

  9. Parallel compression/decompression-based datapath architecture for multibeam mask writers

    NASA Astrophysics Data System (ADS)

    Chaudhary, Narendra; Savari, Serap A.

    2017-10-01

    Multibeam electron beam systems will be used in the future for mask writing and for complementary lithography. The major challenges of the multibeam systems are in meeting throughput requirements and in handling the large data volumes associated with writing grayscale data on the wafer. In terms of future communications and computational requirements, Amdahl's law suggests that a simple increase of computation power and parallelism may not be a sustainable solution. We propose a parallel data compression algorithm to exploit the sparsity of mask data and a grayscale video-like representation of data. To improve the communication and computational efficiency of these systems at the write time, we propose an alternate datapath architecture partly motivated by multibeam direct-write lithography and partly motivated by the circuit testing literature, where parallel decompression reduces clock cycles. We explain a deflection plate architecture inspired by NuFlare Technology's multibeam mask writing system and how our datapath architecture can be easily added to it to improve performance.

  10. Real-time simulation of large-scale neural architectures for visual features computation based on GPU.

    PubMed

    Chessa, Manuela; Bianchi, Valentina; Zampetti, Massimo; Sabatini, Silvio P; Solari, Fabio

    2012-01-01

    The intrinsic parallelism of visual neural architectures based on distributed hierarchical layers is well suited to be implemented on the multi-core architectures of modern graphics cards. The design strategies that allow us to optimally take advantage of such parallelism, in order to efficiently map on GPU the hierarchy of layers and the canonical neural computations, are proposed. Specifically, the advantages of a cortical map-like representation of the data are exploited. Moreover, a GPU implementation of a novel neural architecture for the computation of binocular disparity from stereo image pairs, based on populations of binocular energy neurons, is presented. The implemented neural model achieves good performances in terms of reliability of the disparity estimates and a near real-time execution speed, thus demonstrating the effectiveness of the devised design strategies. The proposed approach is valid in general, since the neural building blocks we implemented are a common basis for the modeling of visual neural functionalities.

  11. Virtual Business Operating Environment in the Cloud: Conceptual Architecture and Challenges

    NASA Astrophysics Data System (ADS)

    Nezhad, Hamid R. Motahari; Stephenson, Bryan; Singhal, Sharad; Castellanos, Malu

    Advances in service oriented architecture (SOA) have brought us close to the once imaginary vision of establishing and running a virtual business, a business in which most or all of its business functions are outsourced to online services. Cloud computing offers a realization of SOA in which IT resources are offered as services that are more affordable, flexible and attractive to businesses. In this paper, we briefly study advances in cloud computing, and discuss the benefits of using cloud services for businesses and trade-offs that they have to consider. We then present 1) a layered architecture for the virtual business, and 2) a conceptual architecture for a virtual business operating environment. We discuss the opportunities and research challenges that are ahead of us in realizing the technical components of this conceptual architecture. We conclude by giving the outlook and impact of cloud services on both large and small businesses.

  12. Strategies for concurrent processing of complex algorithms in data driven architectures

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.

    1987-01-01

    The results of ongoing research directed at developing a graph theoretical model for describing data and control flow associated with the execution of large-grained algorithms in a spatially distributed computer environment are presented. This model is identified by the acronym ATAMM (Algorithm/Architecture Mapping Model). The purpose of such a model is to provide a basis for establishing rules for relating an algorithm to its execution in a multiprocessor environment. Specifications derived from the model lead directly to the description of a data flow architecture which is a consequence of the inherent behavior of the data and control flow described by the model. The purpose of the ATAMM based architecture is to optimize computational concurrency in the multiprocessor environment and to provide an analytical basis for performance evaluation. The ATAMM model and architecture specifications are demonstrated on a prototype system for concept validation.
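    The data-flow idea behind such a model, that a computational node fires as soon as all of its input data are available, can be illustrated with a very small executor. The graph, node names, and functions below are invented for illustration and are not drawn from the ATAMM specification.

```python
# Toy data-flow executor: a node fires once every one of its inputs has a value.
graph = {
    "A": {"inputs": [], "fn": lambda: 3},
    "B": {"inputs": [], "fn": lambda: 4},
    "C": {"inputs": ["A", "B"], "fn": lambda a, b: a * b},
    "D": {"inputs": ["C"], "fn": lambda c: c + 1},
}

values = {}
ready = [name for name, spec in graph.items() if not spec["inputs"]]
while ready:
    node = ready.pop()
    spec = graph[node]
    values[node] = spec["fn"](*(values[i] for i in spec["inputs"]))
    for name, s in graph.items():   # newly enabled nodes become ready to fire
        if name not in values and name not in ready and all(i in values for i in s["inputs"]):
            ready.append(name)

print(values)  # C = A * B = 12, D = C + 1 = 13
```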

  13. A "Language Lab" for Architectural Design.

    ERIC Educational Resources Information Center

    Mackenzie, Arch; And Others

    This paper discusses a "language lab" strategy in which traditional studio learning may be supplemented by language lessons using computer graphics techniques to teach architectural grammar, a body of elements and principles that govern the design of buildings belonging to a particular architectural theory or style. Two methods of…

  14. 75 FR 34004 - State Cemetery Grants

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-16

    ... architectural design codes that apply to grant applicants, we decided to update those references in a separate.... 39.63 Architectural design standards. Subpart C--Operation and Maintenance Projects Grant... acquisition, design and planning, earth moving, landscaping, construction, and provision of initial operating...

  15. Geocomputation over Hybrid Computer Architecture and Systems: Prior Works and On-going Initiatives at UARK

    NASA Astrophysics Data System (ADS)

    Shi, X.

    2015-12-01

    As NSF indicated - "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and geosciences. With the exponential growth of geodata, the challenge of scalable and high performance computing for big data analytics becomes urgent because many research activities are constrained by software or tools that cannot even complete the computation process. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may not be processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and the Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions to employ massive parallelism and hardware resources to achieve scalability and high performance for data intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in a massively parallel computing environment to achieve the capability for scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise to achieve scalability and high performance by exploiting task and data levels of parallelism that are not supported by conventional computing systems. Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data, as demonstrated by our prior work, while the potential of such advanced infrastructure remains unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.

  16. Efficient Graph Based Assembly of Short-Read Sequences on Hybrid Core Architecture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sczyrba, Alex; Pratap, Abhishek; Canon, Shane

    2011-03-22

    Advanced architectures can deliver dramatically increased throughput for genomics and proteomics applications, reducing time-to-completion in some cases from days to minutes. One such architecture, hybrid-core computing, marries a traditional x86 environment with a reconfigurable coprocessor based on field programmable gate array (FPGA) technology. In addition to higher throughput, increased performance can fundamentally improve research quality by allowing more accurate, previously impractical approaches. We will discuss the approach used by Convey's de Bruijn graph constructor for short-read, de novo assembly. Bioinformatics applications that have random access patterns to large memory spaces, such as graph-based algorithms, experience memory performance limitations on cache-based x86 servers. Convey's highly parallel memory subsystem allows application-specific logic to simultaneously access 8192 individual words in memory, significantly increasing effective memory bandwidth over cache-based memory systems. Many algorithms, such as Velvet and other de Bruijn graph-based, short-read, de novo assemblers, can greatly benefit from this type of memory architecture. Furthermore, small data type operations (four nucleotides can be represented in two bits) make more efficient use of logic gates than the data types dictated by conventional programming models. JGI is comparing the performance of Convey's graph constructor and Velvet on both synthetic and real data. We will present preliminary results on memory usage and run time metrics for various data sets with different sizes, from small microbial and fungal genomes to a very large cow rumen metagenome. For genomes with references, we will also present assembly quality comparisons between the two assemblers.
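    To make the data structures concrete, the sketch below builds a tiny de Bruijn graph from short reads and shows the two-bit nucleotide packing mentioned above. The reads and k value are invented, and this is a textbook construction, not the Convey or Velvet implementation.

```python
from collections import defaultdict

ENCODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}  # two bits per base

def pack_kmer(kmer):
    """Pack a k-mer into an integer using two bits per nucleotide."""
    code = 0
    for base in kmer:
        code = (code << 2) | ENCODE[base]
    return code

def build_de_bruijn(reads, k):
    """Edges connect the (k-1)-mer prefix of each k-mer to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph

reads = ["ACGTACGA", "CGTACGAT"]
graph = build_de_bruijn(reads, k=4)
for node in sorted(graph):
    print(f"{node} (packed {pack_kmer(node):06b}) -> {sorted(graph[node])}")
```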

  17. PNNLs Data Intensive Computing research battles Homeland Security threats

    ScienceCinema

    David Thurman; Joe Kielman; Katherine Wolf; David Atkinson

    2018-05-11

    The Pacific Northwest National Laboratory's (PNNL's) approach to data intensive computing (DIC) is focused on three key research areas: hybrid hardware architecture, software architectures, and analytic algorithms. Advancements in these areas will help to address, and solve, DIC issues associated with capturing, managing, analyzing and understanding, in near real time, data at volumes and rates that push the frontiers of current technologies.

  18. Design and Training of Limited-Interconnect Architectures

    DTIC Science & Technology

    1991-07-16

    and signal processing. Neuromorphic (brain-like) models allow an alternative for achieving real-time operation for such tasks, while having a...compact and robust architecture. Neuromorphic models consist of interconnections of simple computational nodes. In this approach, each node computes a...operational performance. II. Research Objectives The research objectives were: 1. Development of on-chip local training rules specifically designed for

  19. Pipelined CPU Design with FPGA in Teaching Computer Architecture

    ERIC Educational Resources Information Center

    Lee, Jong Hyuk; Lee, Seung Eun; Yu, Heon Chang; Suh, Taeweon

    2012-01-01

    This paper presents a pipelined CPU design project with a field programmable gate array (FPGA) system in a computer architecture course. The class project is a five-stage pipelined 32-bit MIPS design with experiments on the Altera DE2 board. For proper scheduling, milestones were set every one or two weeks to help students complete the project on…

  20. PNNL pushing scientific discovery through data intensive computing breakthroughs

    ScienceCinema

    Deborah Gracio; David Koppenaal; Ruby Leung

    2018-05-18

    The Pacific Northwest National Laboratory's approach to data intensive computing (DIC) is focused on three key research areas: hybrid hardware architectures, software architectures, and analytic algorithms. Advancements in these areas will help to address, and solve, DIC issues associated with capturing, managing, analyzing and understanding, in near real time, data at volumes and rates that push the frontiers of current technologies.

  1. Fault tolerant computer control for a Maglev transportation system

    NASA Technical Reports Server (NTRS)

    Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George

    1994-01-01

    Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.

  2. Quantum error correction in crossbar architectures

    NASA Astrophysics Data System (ADS)

    Helsen, Jonas; Steudtner, Mark; Veldhorst, Menno; Wehner, Stephanie

    2018-07-01

    A central challenge for the scaling of quantum computing systems is the need to control all qubits in the system without a large overhead. A solution for this problem in classical computing comes in the form of so-called crossbar architectures. Recently we made a proposal for a large-scale quantum processor (Li et al arXiv:1711.03807 (2017)) to be implemented in silicon quantum dots. This system features a crossbar control architecture which limits parallel single-qubit control, but allows the scheme to overcome control scaling issues that form a major hurdle to large-scale quantum computing systems. In this work, we develop a language that makes it possible to easily map quantum circuits to crossbar systems, taking into account their architecture and control limitations. Using this language we show how to map well known quantum error correction codes such as the planar surface and color codes in this limited control setting with only a small overhead in time. We analyze the logical error behavior of this surface code mapping for estimated experimental parameters of the crossbar system and conclude that logical error suppression to a level useful for real quantum computation is feasible.

  3. Robust Software Architecture for Robots

    NASA Technical Reports Server (NTRS)

    Aghazanian, Hrand; Baumgartner, Eric; Garrett, Michael

    2009-01-01

    Robust Real-Time Reconfigurable Robotics Software Architecture (R4SA) is the name of both a software architecture and software that embodies the architecture. The architecture was conceived in the spirit of current practice in designing modular, hard real-time aerospace systems. The architecture facilitates the integration of new sensory, motor, and control software modules into the software of a given robotic system. R4SA was developed for initial application aboard exploratory mobile robots on Mars, but is adaptable to terrestrial robotic systems, real-time embedded computing systems in general, and robotic toys.

  4. Parallel computing works

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  5. MINDS: Architecture & Design

    DTIC Science & Technology

    2006-07-14

    MINDS: Architecture & Design. Technical Report TR 06-022, Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN. Varun Chandola, Eric Eilertson, Levent Ertoz, Gyorgy Simon, and Vipin... Report date: 14 July 2006.

  6. Federal Emergency Management Information System (FEMIS) System Administration Guide for FEMIS Version 1.4.6

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arp, J.A.; Bower, J.C.; Burnett, R.A.

    The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are connected via a local area network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication, data distribution, and notification functionality necessary to operate FEMIS in a networked, client/server environment.

  7. Federal Emergency Management Information System (FEMIS), Installation Guide for FEMIS 1.4.6

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Arp, J.A.; Burnett, R.A.; Carter, R.J.

    The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are connected via a local area network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication, data distribution, and notification functionality necessary to operate FEMIS in a networked, client/server environment.

  8. Federal Emergency Management Information System (FEMIS) Data Management Guide for FEMIS Version 1.4.6

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Angel, L.K.; Bower, J.C.; Burnett, R.A.

    1999-06-29

    The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are connected via a local area network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication, data distribution, and notification functionality necessary to operate FEMIS in a networked, client/server environment.

  9. VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures

    DOE PAGES

    Moreland, Kenneth; Sewell, Christopher; Usher, William; ...

    2016-05-09

    Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architectures.

  10. VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures

    DOE PAGES

    Moreland, Kenneth; Sewell, Christopher; Usher, William; ...

    2016-05-09

    Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. However, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architectures.

  11. The Architecture of Information at Plateau Beaubourg

    ERIC Educational Resources Information Center

    Branda, Ewan Edward

    2012-01-01

    During the course of the 1960s, computers and information networks made their appearance in the public imagination. To architects on the cusp of architecture's postmodern turn, information technology offered new forms, metaphors, and techniques by which modern architecture's technological and utopian basis could be reasserted. Yet by the end of…

  12. 75 FR 2433 - Special Conditions: Boeing Model 747-8/-8F Airplanes, Systems and Data Networks Security...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-15

    ... design features associated with the architecture and connectivity capabilities of the airplane's computer... novel or unusual design features: digital systems architecture composed of several connected networks. The architecture and network configuration may be used for, or interfaced with, a diverse set of...

  13. Integrated Cognitive-neuroscience Architectures for Understanding Sensemaking (ICArUS): Phase 1 Test and Evaluation Development Guide

    DTIC Science & Technology

    2014-11-01

    Integrated Cognitive-neuroscience Architectures for Understanding Sensemaking (ICArUS): Phase 1 Test and Evaluation Development Guide. Craig... The Integrated Cognitive-neuroscience Architectures for Understanding Sensemaking (ICArUS) Program aimed to build computational cognitive

  14. Multi-level Hierarchical Poly Tree computer architectures

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Gute, Doug

    1990-01-01

    Based on the concept of hierarchical substructuring, this paper develops an optimal multi-level Hierarchical Poly Tree (HPT) parallel computer architecture scheme which is applicable to the solution of finite element and difference simulations. Emphasis is given to minimizing computational effort, in-core/out-of-core memory requirements, and the data transfer between processors. In addition, a simplified communications network that reduces the number of I/O channels between processors is presented. HPT configurations that yield optimal superlinearities are also demonstrated. Moreover, to generalize the scope of applicability, special attention is given to developing: (1) multi-level reduction trees which provide an orderly/optimal procedure by which model densification/simplification can be achieved, as well as (2) methodologies enabling processor grading that yields architectures with varying types of multi-level granularity.

  15. Benchmarking high performance computing architectures with CMS’ skeleton framework

    NASA Astrophysics Data System (ADS)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1&2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  16. FFT Computation with Systolic Arrays, A New Architecture

    NASA Technical Reports Server (NTRS)

    Boriakoff, Valentin

    1994-01-01

    The use of the Cooley-Tukey algorithm for computing the 1-d FFT lends itself to a particular matrix factorization which suggests direct implementation by linearly-connected systolic arrays. Here we present a new systolic architecture that embodies this algorithm. This implementation requires a smaller number of processors and a smaller number of memory cells than other recent implementations, as well as having all the advantages of systolic arrays. For the implementation of the decimation-in-frequency case, word-serial data input allows continuous real-time operation without the need of a serial-to-parallel conversion device. No control or data stream switching is necessary. Computer simulation of this architecture was done in the context of a 1024 point DFT with a fixed point processor, and CMOS processor implementation has started.
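    For reference, the decimation-in-frequency factorization that the systolic array embodies can be written compactly in software. The recursive sketch below (plain Python, power-of-two length assumed, checked against a direct DFT) shows the butterfly-then-recurse structure of the algorithm, not the systolic hardware mapping.

```python
import cmath

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = [x[i] + x[i + half] for i in range(half)]                     # feeds even outputs
    bottom = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)  # twiddles, odd outputs
              for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(top)
    out[1::2] = fft_dif(bottom)
    return out

x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
direct = [sum(v * cmath.exp(-2j * cmath.pi * k * i / len(x)) for i, v in enumerate(x))
          for k in range(len(x))]
assert all(abs(a - b) < 1e-9 for a, b in zip(fft_dif(x), direct))
```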

  17. NETRA: A parallel architecture for integrated vision systems. 1: Architecture and organization

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok N.; Patel, Janak H.; Ahuja, Narendra

    1989-01-01

    Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is considered to be a system that uses vision algorithms from all levels of processing for a high level application (such as object recognition). A model of computation is presented for parallel processing for an IVS. Using the model, desired features and capabilities of a parallel architecture suitable for IVSs are derived. Then a multiprocessor architecture (called NETRA) is presented. This architecture is highly flexible without the use of complex interconnection schemes. The topology of NETRA is recursively defined and hence is easily scalable from small to large systems. Homogeneity of NETRA permits fault tolerance and graceful degradation under faults. It is a recursively defined tree-type hierarchical architecture where each of the leaf nodes consists of a cluster of processors connected with a programmable crossbar with selective broadcast capability to provide for desired flexibility. A qualitative evaluation of NETRA is presented. Then general schemes are described to map parallel algorithms onto NETRA. Algorithms are classified according to their communication requirements for parallel processing. An extensive analysis of inter-cluster communication strategies in NETRA is presented, and parameters affecting performance of parallel algorithms when mapped on NETRA are discussed. Finally, a methodology to evaluate performance of algorithms on NETRA is described.

  18. Optimum Guidance Law and Information Management for a Large Number of Formation Flying Spacecrafts

    NASA Astrophysics Data System (ADS)

    Tsuda, Yuichi; Nakasuka, Shinichi

    In recent years, formation flying has been recognized as one of the most important technologies for deep space and orbital missions that involve multiple spacecraft operations. A formation flying mission improves simultaneous observability over a wide area, as well as the redundancy and reconfigurability of the system, using relatively small and low-cost spacecraft compared with a conventional single-spacecraft mission. From the viewpoint of guidance and control, realizing a formation flying mission usually requires tight maintenance and control of the relative distances, speeds, and orientations between the member satellites. This paper studies a practical architecture for formation flight missions focusing mainly on guidance and control, and describes a new guidance algorithm for changing and keeping the relative positions and speeds of the satellites in formation. The resulting algorithm is suitable for onboard processing and gives the optimum impulsive trajectory for satellites flying closely around a certain reference orbit, which can be elliptic, parabolic, or hyperbolic. Based on this guidance algorithm, this study introduces an information management methodology among the member spacecraft which is suitable for a large formation flight architecture. Routing and multicast communication based on wireless local area network technology are introduced. Some mathematical analyses and computer simulations will be shown in the presentation to demonstrate the feasibility of the proposed formation flight architecture, especially when a very large number of satellites join the formation.
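    For context, relative motion about a circular reference orbit has the well-known closed-form Clohessy-Wiltshire (Hill) solution sketched below; the paper's guidance law covers more general conic reference orbits, so this is only a simplified point of comparison, with an illustrative 90-minute orbit and a made-up initial state.

```python
import math

def cw_propagate(state, n, t):
    """Clohessy-Wiltshire propagation of a relative state (x, y, z, vx, vy, vz)
    about a circular reference orbit with mean motion n.
    Axes: x radial, y along-track, z cross-track."""
    x0, y0, z0, vx0, vy0, vz0 = state
    s, c = math.sin(n * t), math.cos(n * t)
    x = (4 - 3 * c) * x0 + (s / n) * vx0 + (2 / n) * (1 - c) * vy0
    y = 6 * (s - n * t) * x0 + y0 - (2 / n) * (1 - c) * vx0 + ((4 * s - 3 * n * t) / n) * vy0
    z = c * z0 + (s / n) * vz0
    vx = 3 * n * s * x0 + c * vx0 + 2 * s * vy0
    vy = 6 * n * (c - 1) * x0 - 2 * s * vx0 + (4 * c - 3) * vy0
    vz = -n * s * z0 + c * vz0
    return (x, y, z, vx, vy, vz)

n = 2 * math.pi / 5400.0                              # roughly a 90-minute orbit
state = (100.0, 0.0, 0.0, 0.0, -2 * n * 100.0, 0.0)   # drift-free relative ellipse
print(cw_propagate(state, n, t=2700.0))
```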

  19. An integrated decision-making framework for transportation architectures: Application to aviation systems design

    NASA Astrophysics Data System (ADS)

    Lewe, Jung-Ho

    The National Transportation System (NTS) is undoubtedly a complex system-of-systems: a collection of diverse 'things' that evolve over time, organized at multiple levels, to achieve a range of possibly conflicting objectives, and never quite behaving as planned. The purpose of this research is to develop a virtual transportation architecture for the ultimate goal of formulating an integrated decision-making framework. The foundational endeavor begins with creating an abstraction of the NTS with the belief that a holistic frame of reference is required to properly study such a multi-disciplinary, trans-domain system. The culmination of the effort produces the Transportation Architecture Field (TAF) as a mental model of the NTS, in which the relationships between four basic entity groups are identified and articulated. This entity-centric abstraction framework underpins the construction of a virtual NTS couched in the form of an agent-based model. The transportation consumers and the service providers are identified as adaptive agents that apply a set of preprogrammed behavioral rules to achieve their respective goals. The transportation infrastructure and multitude of exogenous entities (disruptors and drivers) in the whole system can also be represented without resorting to an extremely complicated structure. The outcome is a flexible, scalable, computational model that allows for examination of numerous scenarios which involve the cascade of interrelated effects of aviation technology, infrastructure, and socioeconomic changes throughout the entire system.

  20. Earth Science Computational Architecture for Multi-disciplinary Investigations

    NASA Astrophysics Data System (ADS)

    Parker, J. W.; Blom, R.; Gurrola, E.; Katz, D.; Lyzenga, G.; Norton, C.

    2005-12-01

    Understanding the processes underlying Earth's deformation and mass transport requires a non-traditional, integrated, interdisciplinary approach dependent on multiple space- and ground-based data sets, modeling, and computational tools. Currently, details of geophysical data acquisition, analysis, and modeling largely limit research to discipline domain experts. Interdisciplinary research requires a new computational architecture that is optimized to perform complex data processing of multiple solid Earth science data types in a user-friendly environment. A web-based computational framework is being developed and integrated with applications for automatic interferometric radar processing, models for high-resolution deformation & gravity, forward models of viscoelastic mass loading over short wavelengths & complex time histories, forward-inverse codes for characterizing surface loading-response over time scales of days to tens of thousands of years, and inversion of combined space magnetic & gravity fields to constrain deep crustal and mantle properties. This framework combines an adaptation of the QuakeSim distributed services methodology with the Pyre framework for multiphysics development. The system uses a three-tier architecture, with a middle-tier server that manages user projects, available resources, and security. This ensures scalability to very large networks of collaborators. Users log into a web page and have a personal project area, persistently maintained between connections, for each application. Upon selection of an application and host from a list of available entities, inputs may be uploaded or constructed from web forms and available data archives, including gravity, GPS and imaging radar data. The user is notified of job completion and directed to results posted via URLs. Interdisciplinary work is supported through easy availability of all applications via common browsers, application tutorials and reference guides, and worked examples with visual responses. At the platform level, multi-physics application development and workflow are available in the enriched environment of the Pyre framework. Advantages of combining separate expert domains include: multiple application components efficiently interact through Python shared libraries, investigators may nimbly swap models and try new parameter values, and a rich array of common tools is inherent in the Pyre system. The first four specific investigations to use this framework are: Gulf Coast subsidence: understanding of partitioning between compaction, subsidence and growth faulting; gravity & deformation of a layered spherical Earth model due to large earthquakes; the rift setting of Lake Vostok, Antarctica; and global ice mass changes.

  1. The Nebula Standard Computer Architecture,

    DTIC Science & Technology

    good target for high level languages, the designers also adopted a visibility approach in architecture design that provides more freedom for the hardware implementor while still maintaining software portability. (Author)

  2. A set-theoretic model reference adaptive control architecture for disturbance rejection and uncertainty suppression with strict performance guarantees

    NASA Astrophysics Data System (ADS)

    Arabi, Ehsan; Gruenwald, Benjamin C.; Yucelen, Tansel; Nguyen, Nhan T.

    2018-05-01

    Research in adaptive control algorithms for safety-critical applications is primarily motivated by the fact that these algorithms have the capability to suppress the effects of adverse conditions resulting from exogenous disturbances, imperfect dynamical system modelling, degraded modes of operation, and changes in system dynamics. Although government and industry agree on the potential of these algorithms in providing safety and reducing vehicle development costs, a major issue is the inability to achieve a-priori, user-defined performance guarantees with adaptive control algorithms. In this paper, a new model reference adaptive control architecture for uncertain dynamical systems is presented to address disturbance rejection and uncertainty suppression. The proposed framework is predicated on a set-theoretic adaptive controller construction using generalised restricted potential functions. The key feature of this framework is that it allows the system error between the state of an uncertain dynamical system and the state of a reference model, which captures a desired closed-loop system performance, to remain less than an a-priori, user-defined worst-case performance bound, and hence it has the capability to enforce strict performance guarantees. Examples are provided to demonstrate the efficacy of the proposed set-theoretic model reference adaptive control architecture.
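
    As a schematic illustration only (not the authors' specific construction), the quantities involved in such a set-theoretic model reference adaptive control problem can be written as

      \dot{x}(t) = A x(t) + B\bigl(u(t) + \Delta(x(t))\bigr), \qquad
      \dot{x}_{\mathrm{r}}(t) = A_{\mathrm{r}} x_{\mathrm{r}}(t) + B_{\mathrm{r}} c(t),

      e(t) \triangleq x(t) - x_{\mathrm{r}}(t), \qquad \|e(t)\| < \epsilon \quad \forall\, t \ge 0,

    where Delta(.) is the system uncertainty, c(t) is the bounded command, x_r is the reference-model state, and epsilon is the a-priori, user-defined worst-case bound that the generalized restricted potential functions in the adaptation law are designed to enforce.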

  3. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    NASA Astrophysics Data System (ADS)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
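
    The kernel summation below is a minimal, illustrative NumPy sketch of the density evaluation that dominates SPH cost (it is not the authors' code; a production solver would use neighbor or cell lists and OpenMP, CUDA or MIC-specific memory layouts rather than an all-pairs evaluation). The kernel choice, particle count and smoothing length are arbitrary choices for the example.

      import numpy as np

      def w_cubic_spline(r, h):
          # Standard 3D cubic-spline SPH kernel (support radius 2h).
          q = r / h
          sigma = 1.0 / (np.pi * h**3)
          w = np.zeros_like(q)
          m1 = q < 1.0
          m2 = (q >= 1.0) & (q < 2.0)
          w[m1] = 1.0 - 1.5 * q[m1]**2 + 0.75 * q[m1]**3
          w[m2] = 0.25 * (2.0 - q[m2])**3
          return sigma * w

      def density_summation(pos, mass, h):
          # Naive all-pairs density estimate rho_i = sum_j m_j W(|r_i - r_j|, h).
          # Positions are stored as a contiguous (N, 3) array, the layout that
          # maps naturally onto coalesced GPU / vectorized CPU access.
          diff = pos[:, None, :] - pos[None, :, :]          # (N, N, 3)
          r = np.linalg.norm(diff, axis=-1)                 # (N, N)
          return (mass[None, :] * w_cubic_spline(r, h)).sum(axis=1)

      if __name__ == "__main__":
          rng = np.random.default_rng(0)
          pos = rng.random((500, 3))
          mass = np.full(500, 1.0 / 500)
          rho = density_summation(pos, mass, h=0.1)
          print(rho.mean())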

  4. GAMUT: GPU accelerated microRNA analysis to uncover target genes through CUDA-miRanda

    PubMed Central

    2014-01-01

    Background Non-coding sequences such as microRNAs have important roles in disease processes. Computational microRNA target identification (CMTI) is becoming increasingly important since traditional experimental methods for target identification pose many difficulties. These methods are time-consuming, costly, and often need guidance from computational methods to narrow down candidate genes anyway. However, most CMTI methods are computationally demanding, since they need to handle not only several million query microRNA and reference RNA pairs, but also several million nucleotide comparisons within each given pair. Thus, the need to perform microRNA identification at such large scale has increased the demand for parallel computing. Methods Although most CMTI programs (e.g., the miRanda algorithm) are based on a modified Smith-Waterman (SW) algorithm, the existing parallel SW implementations (e.g., CUDASW++ 2.0/3.0, SWIPE) are unable to meet this demand in CMTI tasks. We present CUDA-miRanda, a fast microRNA target identification algorithm that takes advantage of massively parallel computing on Graphics Processing Units (GPU) using NVIDIA's Compute Unified Device Architecture (CUDA). CUDA-miRanda specifically focuses on the local alignment of short (i.e., ≤ 32 nucleotides) sequences against longer reference sequences (e.g., 20K nucleotides). Moreover, the proposed algorithm is able to report multiple alignments (up to 191 top scores) and the corresponding traceback sequences for any given (query sequence, reference sequence) pair. Results Speeds over 5.36 Giga Cell Updates Per Second (GCUPs) are achieved on a server with 4 NVIDIA Tesla M2090 GPUs. Compared to the original miRanda algorithm, which is evaluated on an Intel Xeon E5620@2.4 GHz CPU, the experimental results show up to 166 times performance gains in terms of execution time. In addition, we have verified that the exact same targets were predicted in both CUDA-miRanda and the original miRanda implementations through multiple test datasets. Conclusions We offer a GPU-based alternative to high performance compute (HPC) that can be developed locally at a relatively small cost. The community of GPU developers in the biomedical research community, particularly for genome analysis, is still growing. With increasing shared resources, this community will be able to advance CMTI in a very significant manner. Our source code is available at https://sourceforge.net/projects/cudamiranda/. PMID:25077821
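
    For reference, the dynamic-programming core that such GPU implementations parallelize is the Smith-Waterman recurrence, shown here with a simple linear gap penalty g for brevity (the actual miRanda scoring uses its own weighting and gap model):

      H_{i,j} = \max\bigl\{\, 0,\; H_{i-1,j-1} + s(q_i, r_j),\; H_{i-1,j} - g,\; H_{i,j-1} - g \,\bigr\},
      \qquad H_{i,0} = H_{0,j} = 0,

    where s(.,.) is the substitution score between query base q_i and reference base r_j. Cells on the same anti-diagonal are mutually independent, which is what maps the computation onto thousands of CUDA threads.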

  5. GPU and APU computations of Finite Time Lyapunov Exponent fields

    NASA Astrophysics Data System (ADS)

    Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros

    2012-03-01

    We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
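
    For context, the quantity being resampled and differentiated is the standard FTLE field: with phi denoting the flow map obtained by advecting tracer particles over the interval [t_0, t_0 + T], the exponent at a point x is

      \sigma_{T}(x) = \frac{1}{|T|}\,
      \ln \sqrt{\lambda_{\max}\!\left( \left(\nabla \phi_{t_0}^{\,t_0+T}(x)\right)^{\!\top} \nabla \phi_{t_0}^{\,t_0+T}(x) \right)},

    so each evaluation requires dense neighbouring trajectories to form the flow-map gradient, which is why the resampling step is bound by memory bandwidth.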

  6. A model of head-related transfer functions based on a state-space analysis

    NASA Astrophysics Data System (ADS)

    Adams, Norman Herkamp

    This dissertation develops and validates a novel state-space method for binaural auditory display. Binaural displays seek to immerse a listener in a 3D virtual auditory scene with a pair of headphones. The challenge for any binaural display is to compute the two signals to supply to the headphones. The present work considers a general framework capable of synthesizing a wide variety of auditory scenes. The framework models collections of head-related transfer functions (HRTFs) simultaneously. This framework improves the flexibility of contemporary displays, but it also compounds the steep computational cost of the display. The cost is reduced dramatically by formulating the collection of HRTFs in the state-space and employing order-reduction techniques to design efficient approximants. Order-reduction techniques based on the Hankel-operator are found to yield accurate low-cost approximants. However, the inter-aural time difference (ITD) of the HRTFs degrades the time-domain response of the approximants. Fortunately, this problem can be circumvented by employing a state-space architecture that allows the ITD to be modeled outside of the state-space. Accordingly, three state-space architectures are considered. Overall, a multiple-input, single-output (MISO) architecture yields the best compromise between performance and flexibility. The state-space approximants are evaluated both empirically and psychoacoustically. An array of truncated FIR filters is used as a pragmatic reference system for comparison. For a fixed cost bound, the state-space systems yield lower approximation error than FIR arrays for D>10, where D is the number of directions in the HRTF collection. A series of headphone listening tests are also performed to validate the state-space approach, and to estimate the minimum order N of indiscriminable approximants. For D = 50, the state-space systems yield order thresholds less than half those of the FIR arrays. Depending upon the stimulus uncertainty, a minimum state-space order of 7≤N≤23 appears to be adequate. In conclusion, the proposed state-space method enables a more flexible and immersive binaural display with low computational cost.
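
    As a generic sketch of the modeling choice (the symbols here are standard, not taken from the dissertation), a collection of D HRTFs feeding one ear can be folded into a single discrete-time MISO state-space system

      x[n+1] = A\,x[n] + B\,u[n], \qquad y[n] = C\,x[n] + D_{0}\,u[n],

    where u[n] stacks the D direction inputs, y[n] is the signal for one ear, and the state dimension N obtained from Hankel-norm order reduction governs the per-sample cost (roughly N^2 + N D multiplies) in place of the D x L cost of D length-L FIR filters, with the ITD applied outside the state-space model.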

  7. Spatial reference frames of visual, vestibular, and multimodal heading signals in the dorsal subdivision of the medial superior temporal area.

    PubMed

    Fetsch, Christopher R; Wang, Sentao; Gu, Yong; Deangelis, Gregory C; Angelaki, Dora E

    2007-01-17

    Heading perception is a complex task that generally requires the integration of visual and vestibular cues. This sensory integration is complicated by the fact that these two modalities encode motion in distinct spatial reference frames (visual, eye-centered; vestibular, head-centered). Visual and vestibular heading signals converge in the primate dorsal subdivision of the medial superior temporal area (MSTd), a region thought to contribute to heading perception, but the reference frames of these signals remain unknown. We measured the heading tuning of MSTd neurons by presenting optic flow (visual condition), inertial motion (vestibular condition), or a congruent combination of both cues (combined condition). Static eye position was varied from trial to trial to determine the reference frame of tuning (eye-centered, head-centered, or intermediate). We found that tuning for optic flow was predominantly eye-centered, whereas tuning for inertial motion was intermediate but closer to head-centered. Reference frames in the two unimodal conditions were rarely matched in single neurons and uncorrelated across the population. Notably, reference frames in the combined condition varied as a function of the relative strength and spatial congruency of visual and vestibular tuning. This represents the first investigation of spatial reference frames in a naturalistic, multimodal condition in which cues may be integrated to improve perceptual performance. Our results compare favorably with the predictions of a recent neural network model that uses a recurrent architecture to perform optimal cue integration, suggesting that the brain could use a similar computational strategy to integrate sensory signals expressed in distinct frames of reference.

  8. Benchmarking hardware architecture candidates for the NFIRAOS real-time controller

    NASA Astrophysics Data System (ADS)

    Smith, Malcolm; Kerley, Dan; Herriot, Glen; Véran, Jean-Pierre

    2014-07-01

    As a part of the trade study for the Narrow Field Infrared Adaptive Optics System, the adaptive optics system for the Thirty Meter Telescope, we investigated the feasibility of performing real-time control computation using a Linux operating system and Intel Xeon E5 CPUs. We also investigated a Xeon Phi based architecture which allows higher levels of parallelism. This paper summarizes both the CPU based real-time controller architecture and the Xeon Phi based RTC. The Intel Xeon E5 CPU solution meets the requirements and performs the computation for one AO cycle in an average of 767 microseconds. The Xeon Phi solution did not meet the 1200 microsecond time requirement and also suffered from unpredictable execution times. More detailed benchmark results are reported for both architectures.

  9. Optical systolic solutions of linear algebraic equations

    NASA Technical Reports Server (NTRS)

    Neuman, C. P.; Casasent, D.

    1984-01-01

    The philosophy and data encoding possible in the systolic array optical processor (SAOP) were reviewed. The multitude of linear algebraic operations achievable on this architecture is examined. These operations include such linear algebraic algorithms as: matrix decomposition, direct and indirect solutions, implicit and explicit methods for partial differential equations, eigenvalue and eigenvector calculations, and singular value decomposition. This architecture can be utilized to realize general techniques for solving matrix linear and nonlinear algebraic equations, least mean square error solutions, FIR filters, and nested-loop algorithms for control engineering applications. The data flow and pipelining of operations, design of parallel algorithms and flexible architectures, application of these architectures to computationally intensive physical problems, error source modeling of optical processors, and matching of the computational needs of practical engineering problems to the capabilities of optical processors are emphasized.

  10. Laboratory for Computer Science Progress Report 19, 1 July 1981-30 June 1982.

    DTIC Science & Technology

    1984-05-01

    Contents excerpts: Multiprocessor Architectures; TRIX Operating System; VLSI Tools; Systematic Program Development (Introduction, Specification...). The report describes work exploring distributed operating systems and the architecture of powerful single-user computers interconnected by communication networks. ... In particular, we expect to experiment with languages, operating systems, and applications that establish the feasibility of distributed

  11. T and D-Bench--Innovative Combined Support for Education and Research in Computer Architecture and Embedded Systems

    ERIC Educational Resources Information Center

    Soares, S. N.; Wagner, F. R.

    2011-01-01

    Teaching and Design Workbench (T&D-Bench) is a framework aimed at education and research in the areas of computer architecture and embedded systems. It includes a set of features not found in other educational environments. This set of features is the result of an original combination of design requirements for T&D-Bench: that the…

  12. High-Speed Systolic Array Testbed.

    DTIC Science & Technology

    1987-10-01

    applications since the concept was introduced by H.T. Kung in 1978. This highly parallel architecture of nearest neighbor data communication and...must be addressed. For instance, should bit-serial or bit-parallel computation be utilized? Does the dynamic range of the candidate applications or...numerical stability of the algorithms used require computations in fixed-point and integer format or the architecturally more complex and slower floating

  13. Distributed chemical computing using ChemStar: an open source java remote method invocation architecture applied to large scale molecular data from PubChem.

    PubMed

    Karthikeyan, M; Krishnan, S; Pandey, Anil Kumar; Bender, Andreas; Tropsha, Alexander

    2008-04-01

    We present the application of a Java remote method invocation (RMI) based open source architecture to distributed chemical computing. This architecture was previously employed for distributed data harvesting of chemical information from the Internet via the Google application programming interface (API; ChemXtreme). Due to its open source character and its flexibility, the underlying server/client framework can be quickly adapted to virtually every computational task that can be parallelized. Here, we present the server/client communication framework as well as an application to distributed computing of chemical properties on a large scale (currently the size of PubChem; about 18 million compounds), using both the Marvin toolkit and the open source JOELib package. As an application, for this set of compounds, the agreement of log P and TPSA between the packages was compared. Outliers were found to be mostly non-druglike compounds, and the differences could usually be explained by differences in the underlying algorithms. ChemStar is the first open source distributed chemical computing environment built on Java RMI, which is also easily adaptable to user demands due to its "plug-in architecture". The complete source code as well as the calculated properties, along with links to PubChem resources, are available on the Internet via a graphical user interface at http://moltable.ncl.res.in/chemstar/.

  14. How far away is plug 'n' play? Assessing the near-term potential of sonification and auditory display

    NASA Technical Reports Server (NTRS)

    Bargar, Robin

    1995-01-01

    The commercial music industry offers a broad range of plug 'n' play hardware and software scaled to music professionals and scaled to a broad consumer market. The principles of sound synthesis utilized in these products are relevant to application in virtual environments (VE). However, the closed architectures used in commercial music synthesizers are prohibitive to low-level control during real-time rendering, and the algorithms and sounds themselves are not standardized from product to product. To bring sound into VE requires a new generation of open architectures designed for human-controlled performance from interfaces embedded in immersive environments. This presentation addresses the state of the sonic arts in scientific computing and VE, analyzes research challenges facing sound computation, and offers suggestions regarding tools we might expect to become available during the next few years. A list of classes of audio functionality in VE includes sonification -- the use of sound to represent data from numerical models; 3D auditory display (spatialization and localization, also called externalization); navigation cues for positional orientation and for finding items or regions inside large spaces; voice recognition for controlling the computer; external communications between users in different spaces; and feedback to the user concerning his own actions or the state of the application interface. To effectively convey this considerable variety of signals, we apply principles of acoustic design to ensure the messages are neither confusing nor competing. We approach the design of auditory experience through a comprehensive structure for messages, and message interplay we refer to as an Automated Sound Environment. Our research addresses real-time sound synthesis, real-time signal processing and localization, interactive control of high-dimensional systems, and synchronization of sound and graphics.

  15. On-Board Software Reference Architecture for Payloads

    NASA Astrophysics Data System (ADS)

    Bos, Victor; Rugina, Ana; Trcka, Adam

    2016-08-01

    The goal of the On-board Software Reference Architecture for Payloads (OSRA-P) is to identify an architecture for payload software that harmonizes the payload domain, enables more reuse of common/generic payload software across different payloads and missions, and eases the integration of the payloads with the platform. To investigate the payload domain, recent and current payload instruments of European space missions have been analyzed. This led to a Payload Catalogue describing 12 payload instruments as well as a Capability Matrix listing specific characteristics of each payload. In addition, a functional decomposition of payload software was prepared, which contains functionalities typically found in payload systems. The definition of OSRA-P was evaluated by case studies and a dedicated OSRA-P workshop to gather feedback from the payload community.

  16. Mapping SOA Artefacts onto an Enterprise Reference Architecture Framework

    NASA Astrophysics Data System (ADS)

    Noran, Ovidiu

    Currently, there is still no common agreement on the definition of Service-Oriented Architecture (SOA), or on the types and meaning of the artefacts involved in the creation and maintenance of an SOA. Furthermore, the shift in SOA's image from an infrastructure solution to a business-wide change project may have promoted a perception that SOA is a parallel initiative, a competitor and perhaps a successor of enterprise architecture (EA). This chapter attempts to map several typical SOA artefacts onto an enterprise reference framework commonly used in EA. This is done in order to show that the EA framework can express and structure most of the SOA artefacts and that, therefore, a framework for SOA could in fact be derived from an EA framework, with the ensuing SOA-EA integration benefits.

  17. A real-time control system for the control of suspended interferometers based on hybrid computing techniques

    NASA Astrophysics Data System (ADS)

    Acernese, Fausto; Barone, Fabrizio; De Rosa, Rosario; Eleuteri, Antonio; Milano, Leopoldo; Pardi, Silvio; Ricciardi, Iolanda; Russo, Guido

    2004-09-01

    One of the main requirements of a digital system for the control of interferometric detectors of gravitational waves is computing power, which is a direct consequence of the increasing complexity of the digital algorithms necessary for control signal generation. For this specific task, many specialized non-standard real-time architectures have been developed, often very expensive and difficult to upgrade. On the other hand, such computing power is generally fully available for off-line applications on standard PC-based systems. Therefore, a possible and obvious solution may be provided by the integration of the real-time and off-line architectures, resulting in a hybrid control system architecture based on standard, available components, combining the advantages of the precise data synchronization provided by real-time systems with the large computing power available on PC-based systems. Such integration may be provided by implementing the link between the two different architectures over the standard Ethernet network, whose data transfer speed has increased greatly in recent years, using the TCP/IP, UDP and raw Ethernet protocols. In this paper we describe the architecture of a hybrid Ethernet-based real-time control system prototype we implemented in Napoli, discussing its characteristics and performance. Finally, we discuss a possible application to the real-time control of a suspended mass of the mode cleaner of the 3m prototype optical interferometer for gravitational wave detection (IDGW-3P) operational in Napoli.

  18. A Nonlinear Dynamic Inversion Predictor-Based Model Reference Adaptive Controller for a Generic Transport Model

    NASA Technical Reports Server (NTRS)

    Campbell, Stefan F.; Kaneshige, John T.

    2010-01-01

    Presented here is a Predictor-Based Model Reference Adaptive Control (PMRAC) architecture for a generic transport aircraft. At its core, this architecture features a three-axis, non-linear, dynamic-inversion controller. Command inputs for this baseline controller are provided by pilot roll-rate, pitch-rate, and sideslip commands. This paper first presents the baseline controller thoroughly, followed by a description of the PMRAC adaptive augmentation to this control system. Results are presented via a full-scale, nonlinear simulation of NASA's Generic Transport Model (GTM).

  19. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units

    PubMed Central

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-01-01

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web-based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud- and GPU-based computing provides an efficient real-time target recognition and tracking approach as compared to methods where the workflow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions. PMID:28208684

  20. A Real-Time High Performance Computation Architecture for Multiple Moving Target Tracking Based on Wide-Area Motion Imagery via Cloud and Graphic Processing Units.

    PubMed

    Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik

    2017-02-12

    This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web-based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud- and GPU-based computing provides an efficient real-time target recognition and tracking approach as compared to methods where the workflow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions.

  1. Unit cell-based computer-aided manufacturing system for tissue engineering.

    PubMed

    Kang, Hyun-Wook; Park, Jeong Hun; Kang, Tae-Yun; Seol, Young-Joon; Cho, Dong-Woo

    2012-03-01

    Scaffolds play an important role in the regeneration of artificial tissues or organs. A scaffold is a porous structure with a micro-scale inner architecture in the range of several to several hundreds of micrometers. Therefore, computer-aided construction of scaffolds should provide sophisticated functionality for porous structure design and a tool path generation strategy that can achieve micro-scale architecture. In this study, a new unit cell-based computer-aided manufacturing (CAM) system was developed for the automated design and fabrication of a porous structure with micro-scale inner architecture that can be applied to composite tissue regeneration. The CAM system was developed by first defining a data structure for the computing process of a unit cell representing a single pore structure. Next, an algorithm and software were developed and applied to construct porous structures with a single or multiple pore design using solid freeform fabrication technology and a 3D tooth/spine computer-aided design model. We showed that this system is quite feasible for the design and fabrication of a scaffold for tissue engineering.

  2. Architecture and Initial Development of a Digital Library Platform for Computable Knowledge Objects for Health.

    PubMed

    Flynn, Allen J; Bahulekar, Namita; Boisvert, Peter; Lagoze, Carl; Meng, George; Rampton, James; Friedman, Charles P

    2017-01-01

    Throughout the world, biomedical knowledge is routinely generated and shared through primary and secondary scientific publications. However, there is too much latency between publication of knowledge and its routine use in practice. To address this latency, what is actionable in scientific publications can be encoded to make it computable. We have created a purpose-built digital library platform to hold, manage, and share actionable, computable knowledge for health called the Knowledge Grid Library. Here we present it with its system architecture.

  3. Concepts and Relations in Neurally Inspired In Situ Concept-Based Computing

    PubMed Central

    van der Velde, Frank

    2016-01-01

    In situ concept-based computing is based on the notion that conceptual representations in the human brain are “in situ.” In this way, they are grounded in perception and action. Examples are neuronal assemblies, whose connection structures develop over time and are distributed over different brain areas. In situ concept representations cannot be copied or duplicated because that will disrupt their connection structure, and thus the meaning of these concepts. Higher-level cognitive processes, as found in language and reasoning, can be performed with in situ concepts by embedding them in specialized neurally inspired “blackboards.” The interactions between the in situ concepts and the blackboards form the basis for in situ concept computing architectures. In these architectures, memory (concepts) and processing are interwoven, in contrast with the separation between memory and processing found in Von Neumann architectures. Because the further development of Von Neumann computing (more, faster, yet power-limited) is questionable, in situ concept computing might be an alternative for concept-based computing. In situ concept computing will be illustrated with a recently developed BABI reasoning task. Neurorobotics can play an important role in the development of in situ concept computing because of the development of in situ concept representations derived in scenarios as needed for reasoning tasks. Neurorobotics would also benefit from power-limited and in situ concept computing. PMID:27242504

  4. Concepts and Relations in Neurally Inspired In Situ Concept-Based Computing.

    PubMed

    van der Velde, Frank

    2016-01-01

    In situ concept-based computing is based on the notion that conceptual representations in the human brain are "in situ." In this way, they are grounded in perception and action. Examples are neuronal assemblies, whose connection structures develop over time and are distributed over different brain areas. In situ concept representations cannot be copied or duplicated because that will disrupt their connection structure, and thus the meaning of these concepts. Higher-level cognitive processes, as found in language and reasoning, can be performed with in situ concepts by embedding them in specialized neurally inspired "blackboards." The interactions between the in situ concepts and the blackboards form the basis for in situ concept computing architectures. In these architectures, memory (concepts) and processing are interwoven, in contrast with the separation between memory and processing found in Von Neumann architectures. Because the further development of Von Neumann computing (more, faster, yet power-limited) is questionable, in situ concept computing might be an alternative for concept-based computing. In situ concept computing will be illustrated with a recently developed BABI reasoning task. Neurorobotics can play an important role in the development of in situ concept computing because of the development of in situ concept representations derived in scenarios as needed for reasoning tasks. Neurorobotics would also benefit from power-limited and in situ concept computing.

  5. Heterogeneous real-time computing in radio astronomy

    NASA Astrophysics Data System (ADS)

    Ford, John M.; Demorest, Paul; Ransom, Scott

    2010-07-01

    Modern computer architectures suited for general-purpose computing are often not the best choice for either I/O-bound or compute-bound problems. Sometimes the best choice is not to choose a single architecture, but to take advantage of the best characteristics of different computer architectures to solve your problems. This paper examines the tradeoffs between using computer systems based on the ubiquitous X86 Central Processing Units (CPUs), Field Programmable Gate Array (FPGA) based signal processors, and Graphical Processing Units (GPUs). We will show how a heterogeneous system can be produced that blends the best of each of these technologies into a real-time signal processing system. FPGAs tightly coupled to analog-to-digital converters connect the instrument to the telescope and supply the first level of computing to the system. These FPGAs are coupled to other FPGAs to continue to provide highly efficient processing power. Data is then packaged up and shipped over fast networks to a cluster of general-purpose computers equipped with GPUs, which are used for floating-point-intensive computation. Finally, the data is handled by the CPU and written to disk, or further processed. Each of the elements in the system has been chosen for its specific characteristics and the role it can play in creating a system that does the most for the least, in terms of power, space, and money.

  6. Algorithms and software used in selecting structure of machine-training cluster based on neurocomputers

    NASA Astrophysics Data System (ADS)

    Romanchuk, V. A.; Lukashenko, V. V.

    2018-05-01

    A technique for the operation of a control system for a computing cluster based on neurocomputers is proposed. Particular attention is paid to the method of choosing the structure of the computing cluster, because existing methods are not effective for the specialized hardware base: neurocomputers, which are highly parallel computing devices with an architecture different from the von Neumann architecture. An algorithm developed for choosing the computational structure of a cloud cluster is described, starting from the direction of data transfer in the flow control graph of the program and its adjacency matrix.

  7. SharP

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Venkata, Manjunath Gorentla; Aderholdt, William F

    The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend in system architecture is expected to continue into the exascale era. Along with hierarchical-heterogeneous memory, such systems typically have a high-performance network and a compute accelerator. This system architecture is effective not only for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecture supports the convergence of Big-Compute and Big-Data, the programming models and software layers have yet to evolve to support either hierarchical-heterogeneous memory systems or this convergence. This work presents a programming abstraction to address the problem. The programming abstraction is implemented as a software library and runs on pre-exascale and exascale systems supporting current and emerging system architectures. Using distributed data structures as a central concept, it provides (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications.

  8. Importance of balanced architectures in the design of high-performance imaging systems

    NASA Astrophysics Data System (ADS)

    Sgro, Joseph A.; Stanton, Paul C.

    1999-03-01

    Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high-performance general-purpose microprocessor technology have produced an abundance of low-cost components suitable for use in high-performance computing systems. A common pitfall in the design of high-performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The characteristic symptom of this problem is the failure of system performance to scale as more processors are added. The problem is exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high-performance external memory interfaces makes it practical to design high-performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.

  9. Cognitive Computing for Security.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Debenedictis, Erik; Rothganger, Fredrick; Aimone, James Bradley

    Final report for Cognitive Computing for Security LDRD 165613. It reports on the development of a hybrid general-purpose/neuromorphic computer architecture, with an emphasis on potential implementation with memristors.

  10. Computer Technology: State of the Art.

    ERIC Educational Resources Information Center

    Withington, Frederic G.

    1981-01-01

    Describes the nature of modern general-purpose computer systems, including hardware, semiconductor electronics, microprocessors, computer architecture, input output technology, and system control programs. Seven suggested readings are cited. (FM)

  11. Multidisciplinary Environments: A History of Engineering Framework Development

    NASA Technical Reports Server (NTRS)

    Padula, Sharon L.; Gillian, Ronnie E.

    2006-01-01

    This paper traces the history of engineering frameworks and their use by Multidisciplinary Design Optimization (MDO) practitioners. The approach is to reference papers that have been presented at one of the ten previous Multidisciplinary Analysis and Optimization (MA&O) conferences. By limiting the search to MA&O papers, the authors can (1) identify the key ideas that led to general purpose MDO frameworks and (2) uncover roadblocks that delayed the development of these ideas. The authors make no attempt to assign credit for revolutionary ideas or to assign blame for missed opportunities. Rather, the goal is to trace the various threads of computer architecture and software framework research and to observe how these threads contributed to the commercial framework products available today.

  12. Benchmarking high performance computing architectures with CMS’ skeleton framework

    DOE PAGES

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-11-23

    Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high-throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1 & 2, Theta, and Mira. Because of this, we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  13. A novel strategy for load balancing of distributed medical applications.

    PubMed

    Logeswaran, Rajasvaran; Chen, Li-Choo

    2012-04-01

    Current trends in medicine, specifically in the electronic handling of medical applications ranging from digital imaging, paperless hospital administration and electronic medical records, to telemedicine and computer-aided diagnosis, create a burden on the network. Distributed Service Architectures, such as the Intelligent Network (IN), Telecommunication Information Networking Architecture (TINA) and Open Service Access (OSA), are able to meet this new challenge. Distribution enables computational tasks to be spread among multiple processors; hence, performance is an important issue. This paper proposes a novel approach to load balancing, the Random Sender Initiated Algorithm, for distribution of tasks among several nodes sharing the same computational object (CO) instances in Distributed Service Architectures. Simulations illustrate that the proposed algorithm produces better network performance than the benchmark load balancing algorithms, the Random Node Selection Algorithm and the Shortest Queue Algorithm, especially under medium and heavily loaded conditions.
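
    As a rough, generic illustration of a sender-initiated policy with random peer selection (this is not the paper's Random Sender Initiated Algorithm, whose details are not given in the abstract; the threshold, node count and arrival pattern below are arbitrary assumptions):

      import random
      from collections import deque

      class Node:
          """A node that off-loads incoming tasks to a random peer when its
          local queue exceeds a threshold (generic sender-initiated policy)."""
          def __init__(self, name, threshold=5):
              self.name = name
              self.queue = deque()
              self.threshold = threshold

          def submit(self, task, peers):
              if len(self.queue) > self.threshold and peers:
                  random.choice(peers).queue.append(task)   # off-load to a random peer
              else:
                  self.queue.append(task)

      nodes = [Node(f"n{i}") for i in range(4)]
      for t in range(100):
          # Skewed arrivals: half of all tasks hit node n0 directly.
          target = nodes[0] if t % 2 == 0 else random.choice(nodes)
          target.submit(f"task-{t}", peers=[n for n in nodes if n is not target])

      print({n.name: len(n.queue) for n in nodes})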

  14. Benchmarking high performance computing architectures with CMS’ skeleton framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    Here, in 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high-throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1 & 2, Theta, and Mira. Because of this, we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  15. Cloud Computing for Mission Design and Operations

    NASA Technical Reports Server (NTRS)

    Arrieta, Juan; Attiyah, Amy; Beswick, Robert; Gerasimantos, Dimitrios

    2012-01-01

    The space mission design and operations community already recognizes the value of cloud computing and virtualization. However, natural and valid concerns, like security, privacy, up-time, and vendor lock-in, have prevented a more widespread and expedited adoption into official workflows. In the interest of alleviating these concerns, we propose a series of guidelines for internally deploying a resource-oriented hub of data and algorithms. These guidelines provide a roadmap for implementing an architecture inspired by the cloud computing model: associative, elastic, semantical, interconnected, and adaptive. The architecture can be summarized as exposing data and algorithms as resource-oriented Web services, coordinated via messaging, and running on virtual machines; it is simple, and based on widely adopted standards, protocols, and tools. The architecture may help reduce common sources of complexity intrinsic to data-driven, collaborative interactions and, most importantly, it may provide the means for teams and agencies to evaluate the cloud computing model in their specific context, with minimal infrastructure changes, and before committing to a specific cloud services provider.

  16. Scalable quantum computer architecture with coupled donor-quantum dot qubits

    DOEpatents

    Schenkel, Thomas; Lo, Cheuk Chi; Weis, Christoph; Lyon, Stephen; Tyryshkin, Alexei; Bokor, Jeffrey

    2014-08-26

    A quantum bit computing architecture includes a plurality of single spin memory donor atoms embedded in a semiconductor layer, a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, wherein a first voltage applied across at least one pair of the aligned quantum dot and donor atom controls a donor-quantum dot coupling. A method of performing quantum computing in a scalable architecture quantum computing apparatus includes arranging a pattern of single spin memory donor atoms in a semiconductor layer, forming a plurality of quantum dots arranged with the semiconductor layer and aligned with the donor atoms, applying a first voltage across at least one aligned pair of a quantum dot and donor atom to control a donor-quantum dot coupling, and applying a second voltage between one or more quantum dots to control a Heisenberg exchange J coupling between quantum dots and to cause transport of a single spin polarized electron between quantum dots.

  17. Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fijany, A.; Milman, M.; Redding, D.

    1994-12-31

    In this paper, massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. The authors have already shown that the computation of a near-optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling-rate requirement, the implementation of this control algorithm poses a computationally challenging problem, since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other fast Poisson solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
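
    The Fast Invariant Imbedding algorithm itself is not reproduced here, but the problem it targets can be illustrated with a conventional fast Poisson solver: a sine-transform diagonalization of the 5-point discrete Laplacian on a regular grid with homogeneous Dirichlet boundaries. This is a NumPy/SciPy sketch under those assumptions; the grid size and test function are arbitrary.

      import numpy as np
      from scipy.fft import dstn, idstn

      def poisson_dirichlet(f, h):
          """Solve -Laplace(u) = f on a regular grid with homogeneous Dirichlet
          boundary conditions via fast sine-transform diagonalization."""
          n, m = f.shape
          k = np.arange(1, n + 1)
          l = np.arange(1, m + 1)
          lam_x = (2.0 - 2.0 * np.cos(np.pi * k / (n + 1))) / h**2
          lam_y = (2.0 - 2.0 * np.cos(np.pi * l / (m + 1))) / h**2
          f_hat = dstn(f, type=1, norm="ortho")
          u_hat = f_hat / (lam_x[:, None] + lam_y[None, :])
          return idstn(u_hat, type=1, norm="ortho")

      if __name__ == "__main__":
          n = 127
          h = 1.0 / (n + 1)
          x = np.linspace(h, 1 - h, n)
          X, Y = np.meshgrid(x, x, indexing="ij")
          u_exact = np.sin(np.pi * X) * np.sin(2 * np.pi * Y)
          f = (np.pi**2 + (2 * np.pi)**2) * u_exact      # -Laplace(u_exact)
          u = poisson_dirichlet(f, h)
          print(np.max(np.abs(u - u_exact)))             # small discretization error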

  18. The role of architecture and ontology for interoperability.

    PubMed

    Blobel, Bernd; González, Carolina; Oemig, Frank; Lopéz, Diego; Nykänen, Pirkko; Ruotsalainen, Pekka

    2010-01-01

    Turning from organization-centric to process-controlled or even personalized approaches, advanced healthcare settings have to meet special interoperability challenges. eHealth and pHealth solutions must assure interoperability between actors cooperating to achieve common business objectives. The interoperability chain thereby includes not only individually tailored technical systems, but also sensors and actuators. For enabling the corresponding pervasive computing and even autonomic computing, individualized systems have to be based on an architecture framework covering many domains, scientifically managed by specialized disciplines using their specific ontologies in a formalized way. Therefore, interoperability has to advance from a communication protocol to an architecture-centric approach mastering ontology coordination challenges.

  19. Decision-theoretic saliency: computational principles, biological plausibility, and implications for neurophysiology and psychophysics.

    PubMed

    Gao, Dashan; Vasconcelos, Nuno

    2009-01-01

    A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
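
    A minimal NumPy sketch of that "standard architecture of V1" cascade (linear filtering, rectification, divisive normalization, spatial pooling) is given below; the Gabor parameters, pooling window and normalization constant are illustrative choices, not the paper's optimal detector.

      import numpy as np
      from scipy.signal import convolve2d

      def gabor(size=15, sigma=3.0, theta=0.0, lam=6.0):
          # Simple odd-symmetric Gabor filter used as the linear front end.
          r = np.arange(size) - size // 2
          X, Y = np.meshgrid(r, r)
          xr = X * np.cos(theta) + Y * np.sin(theta)
          return np.exp(-(X**2 + Y**2) / (2 * sigma**2)) * np.sin(2 * np.pi * xr / lam)

      def saliency_cascade(img, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4), eps=1e-3):
          # 1) linear filtering, 2) half-wave rectification,
          # 3) divisive normalization across channels, 4) local spatial pooling.
          responses = np.stack([convolve2d(img, gabor(theta=t), mode="same")
                                for t in thetas])
          rect = np.maximum(responses, 0.0)                      # rectification
          norm = rect / (eps + rect.sum(axis=0, keepdims=True))  # divisive normalization
          pool = np.ones((7, 7)) / 49.0
          pooled = np.stack([convolve2d(c, pool, mode="same") for c in norm])
          return pooled.max(axis=0)                              # simple saliency map

      if __name__ == "__main__":
          rng = np.random.default_rng(1)
          img = rng.standard_normal((64, 64))
          img[28:36, 28:36] += 3.0                               # crude "pop-out" patch
          print(saliency_cascade(img).shape)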

  20. GPU computing of compressible flow problems by a meshless method with space-filling curves

    NASA Astrophysics Data System (ADS)

    Ma, Z. H.; Wang, H.; Pu, S. H.

    2014-04-01

    A graphic processing unit (GPU) implementation of a meshless method for solving compressible flow problems is presented in this paper. Least-square fit is used to discretize the spatial derivatives of Euler equations and an upwind scheme is applied to estimate the flux terms. The compute unified device architecture (CUDA) C programming model is employed to efficiently and flexibly port the meshless solver from CPU to GPU. Considering the data locality of randomly distributed points, space-filling curves are adopted to re-number the points in order to improve the memory performance. Detailed evaluations are firstly carried out to assess the accuracy and conservation property of the underlying numerical method. Then the GPU accelerated flow solver is used to solve external steady flows over aerodynamic configurations. Representative results are validated through extensive comparisons with the experimental, finite volume or other available reference solutions. Performance analysis reveals that the running time cost of simulations is significantly reduced while impressive (more than an order of magnitude) speedups are achieved.
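
    As an illustration of the renumbering idea (a sketch only, not the paper's implementation, and the abstract does not specify which space-filling curve is used), the snippet below sorts 2-D scattered points by Morton (Z-order) keys so that spatially nearby points tend to end up nearby in memory:

      import numpy as np

      def part1by1(v):
          # Spread the low 16 bits of each integer so they occupy even bit positions.
          x = v.astype(np.uint64) & 0xFFFF
          x = (x | (x << 8)) & 0x00FF00FF
          x = (x | (x << 4)) & 0x0F0F0F0F
          x = (x | (x << 2)) & 0x33333333
          x = (x | (x << 1)) & 0x55555555
          return x

      def morton_order(points, bits=16):
          # Quantize 2-D points onto a [0, 2**bits) grid and sort them by their
          # interleaved Morton (Z-order) keys for better memory locality.
          pmin = points.min(axis=0)
          extent = np.maximum(points.max(axis=0) - pmin, 1e-12)
          q = ((points - pmin) / extent * (2**bits - 1)).astype(np.uint64)
          keys = (part1by1(q[:, 0]) << 1) | part1by1(q[:, 1])
          return np.argsort(keys)

      if __name__ == "__main__":
          rng = np.random.default_rng(2)
          pts = rng.random((1000, 2))
          order = morton_order(pts)
          pts_reordered = pts[order]   # improved locality for neighbor-based kernels
          print(order[:10])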

  1. Integration of Libration Point Orbit Dynamics into a Universal 3-D Autonomous Formation Flying Algorithm

    NASA Technical Reports Server (NTRS)

    Folta, David; Bauer, Frank H. (Technical Monitor)

    2001-01-01

    The autonomous formation flying control algorithm developed by the Goddard Space Flight Center (GSFC) for the New Millennium Program (NMP) Earth Observing-1 (EO-1) mission is investigated for applicability to libration point orbit formations. In the EO-1 formation-flying algorithm, control is accomplished via linearization about a reference transfer orbit with a state transition matrix (STM) computed from state inputs. The effect of libration point orbit dynamics on this algorithm architecture is explored via computation of STMs using the flight-proven code, a monodromy matrix developed from an N-body model of a libration orbit, and a standard STM developed from the gravitational and Coriolis effects as measured at the libration point. A comparison of formation flying Delta-Vs calculated from these methods is made to a standard linear quadratic regulator (LQR) method. The universal 3-D approach is optimal in the sense that it can be accommodated as an open-loop or closed-loop control using only state information.
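
    As a generic sketch of how the STM enters such a formation-control computation (standard two-impulse targeting, not necessarily the exact EO-1 control law), partition the STM over the maneuver interval as

      \Phi(t_f, t_0) =
      \begin{pmatrix} \Phi_{rr} & \Phi_{rv} \\ \Phi_{vr} & \Phi_{vv} \end{pmatrix},
      \qquad
      v_0^{+} = \Phi_{rv}^{-1}\left( r_f^{d} - \Phi_{rr}\, r_0 \right),

      \Delta v_1 = v_0^{+} - v_0^{-}, \qquad
      \Delta v_2 = v_f^{d} - \left( \Phi_{vr}\, r_0 + \Phi_{vv}\, v_0^{+} \right),

    where (r_0, v_0^-) is the relative state before the first burn and (r_f^d, v_f^d) is the desired relative state at the final time; the quality of the resulting Delta-Vs then hinges on how well the chosen STM (flight code, monodromy matrix, or libration-point linearization) captures the local dynamics.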

  2. Architecture of a spatial data service system for statistical analysis and visualization of regional climate changes

    NASA Astrophysics Data System (ADS)

    Titov, A. G.; Okladnikov, I. G.; Gordov, E. P.

    2017-11-01

    The use of large geospatial datasets in climate change studies requires the development of a set of Spatial Data Infrastructure (SDI) elements, including geoprocessing and cartographical visualization web services. This paper presents the architecture of a geospatial OGC web service system as an integral part of the general architecture of a virtual research environment (VRE) for statistical processing and visualization of meteorological and climatic data. The architecture is a set of interconnected standalone SDI nodes with corresponding data storage systems. Each node runs specialized software, such as a geoportal, cartographical web services (WMS/WFS), a metadata catalog, and a MySQL database of technical metadata describing the geospatial datasets available for the node. It also contains geospatial data processing services (WPS) based on a modular computing backend realizing statistical processing functionality and thus providing analysis of large datasets, with results available for visualization and export to files in standard formats (XML, binary, etc.). Some cartographical web services have been developed in a prototype of the system to provide capabilities to work with raster and vector geospatial data based on OGC web services. The distributed architecture presented allows easy addition of new nodes, computing and data storage systems, and provides a solid computational infrastructure for regional climate change studies based on modern Web and GIS technologies.

  3. An object-oriented software approach for a distributed human tracking motion system

    NASA Astrophysics Data System (ADS)

    Micucci, Daniela L.

    2003-06-01

    Tracking is a composite job involving the co-operation of autonomous activities which exploit a complex information model and rely on a distributed architecture. Both information and activities must be classified and related in several dimensions: abstraction levels (what is modelled and how information is processed); topology (where the modelled entities are); time (when entities exist); strategy (why something happens); responsibilities (who is in charge of processing the information). A proper object-oriented analysis and design approach leads to a modular architecture where information about conceptual entities is modelled at each abstraction level via classes and intra-level associations, whereas inter-level associations between classes model the abstraction process. Both information and computation are partitioned according to level-specific topological models. They are also placed in a temporal framework modelled by suitable abstractions. Domain-specific strategies control the execution of the computations. Computational components perform both intra-level processing and inter-level information conversion. The paper overviews the phases of the analysis and design process, presents the major concepts at each abstraction level, and shows how the resulting design turns into a modular, flexible and adaptive architecture. Finally, the paper sketches how the conceptual architecture can be deployed into a concrete distributed architecture by relying on an experimental framework.
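
    A schematic sketch of how level-specific entities and an inter-level abstraction component might look in such an object-oriented design; the class names and the association logic below are our own, purely illustrative:

    ```python
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class Detection:            # lower abstraction level: raw sensor-space entity
        sensor_id: str
        x: float
        y: float
        t: float

    @dataclass
    class Track:                # higher abstraction level: conceptual entity over time
        track_id: int
        positions: List[Tuple[float, float, float]]  # world-space trajectory

    class DetectionToTrackAbstractor:
        """Inter-level component: converts sensor-level detections into track-level information."""
        def __init__(self):
            self._next_id = 0
            self.tracks: List[Track] = []

        def abstract(self, detections: List[Detection]) -> List[Track]:
            # Placeholder association logic: one new track per detection.
            for d in detections:
                self.tracks.append(Track(self._next_id, [(d.x, d.y, d.t)]))
                self._next_id += 1
            return self.tracks
    ```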

  4. The Cloud2SM Project

    NASA Astrophysics Data System (ADS)

    Crinière, Antoine; Dumoulin, Jean; Mevel, Laurent; Andrade-Barosso, Guillermo; Simonin, Matthieu

    2015-04-01

    Over the past decades, the monitoring of civil engineering structures has become a major field of research and development in the domains of modelling and integrated instrumentation. This growing interest can be attributed in part to the need to control the aging of such structures and, on the other hand, to the need to optimize maintenance costs. From this standpoint, the Cloud2SM project (Cloud architecture design for Structural Monitoring with in-line Sensors and Models tasking) was launched to develop a robust information system able to support the long-term monitoring of civil engineering structures as well as to interface various sensors and data. The specificity of this architecture is that it is based on the notion of data processing through physical or statistical models; data processing, whether material or mathematical, can thus be seen as a resource of the main architecture. The project can be divided into several items: - The sensors and their measurement processes: these provide data to the main architecture and can embed storage or computational resources. Depending on onboard capacity and the amount of data generated, heavy and light sensors can be distinguished. - The storage resources: based on the cloud concept, these store at least two types of data, raw data and processed data. - The computational resources: this item includes embedded "pseudo real time" resources such as a dedicated computer cluster or other computational resources. - The models: used for the conversion of raw data into meaningful data. These resources inform the system of their needs and can be seen as independent blocks of the system. - The user interface: this item can be divided into various HMIs to support maintenance operations on the sensors or to push information to the user. - The demonstrators: the structures themselves. This project follows previous research work initiated in the European project ISTIMES [1]. It includes the infrared thermal monitoring of civil engineering structures [2-3] and/or the vibration monitoring of such structures [4-5]. The chosen architecture is based on the OGC standard in order to ensure interoperability between the various measurement systems; this concept is extended to the notion of physical models. A final main objective of this project is to explore the feasibility and reliability of deploying mathematical models and processing large amounts of data using the GPGPU capacity of a dedicated computational cluster, while studying the OGC standardization of these technical concepts. References [1] M. Proto et al., "Transport Infrastructure Surveillance and Monitoring by Electromagnetic Sensing: the ISTIMES Project", Sensors, 10(12), 10620-10639, doi:10.3390/s101210620, December 2010. [2] J. Dumoulin, A. Crinière, R. Averty, "Detection and thermal characterization of the inner structure of the 'Musmeci' bridge deck by infrared thermography monitoring", Journal of Geophysics and Engineering, Volume 10, Number 2, 17 pages, November 2013, IOP Science, doi:10.1088/1742-2132/10/6/064003. [3] J. Dumoulin and V. Boucher, "Infrared thermography system for transport infrastructures survey with inline local atmospheric parameter measurements and offline model for radiation attenuation evaluations", J. Appl. Remote Sens., 8(1), 084978 (2014), doi:10.1117/1.JRS.8.084978. [4] V. Le Cam, M. Doehler, M. Le Pen, L. Mevel, "Embedded modal analysis algorithms on the smart wireless sensor platform PEGASE", in Proc. 9th International Workshop on Structural Health Monitoring, Stanford, CA, USA, 2013. [5] M. Zghal, L. Mevel, P. Del Moral, "Modal parameter estimation using interacting Kalman filter", Mechanical Systems and Signal Processing, 2014.

  5. HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation

    NASA Technical Reports Server (NTRS)

    Sterling, Thomas; Bergman, Larry

    2000-01-01

    Computational Aero Sciences and other numerically intensive computation disciplines demand computing throughputs substantially greater than the Teraflops-scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that, in combination with sufficient resolution and advanced adaptive techniques, may force performance requirements towards Petaflops. This will be especially true for compute-intensive models such as Navier-Stokes, or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithm techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA-led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops-scale computing in the 2004/5 timeframe. The Hybrid-Technology MultiThreaded (HTMT) parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption. The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) at one percent of the power required by conventional semiconductor logic. Wave Division Multiplexing optical communications can approach a peak per-fiber bandwidth of 1 Tbps, and the new Data Vortex network topology employing this technology can connect tens of thousands of ports, providing a bi-section bandwidth on the order of a Petabyte per second with latencies well below 100 nanoseconds, even under heavy loads. Processor-in-Memory (PIM) technology combines logic and memory on the same chip, exposing the internal bandwidth of the memory row buffers at low latency. And holographic photorefractive storage technologies provide high-density memory with access times a thousand times faster than conventional disk technologies. Together these technologies enable a new class of shared memory system architecture with a peak performance in the range of a Petaflops but size and power requirements comparable to today's largest Teraflops-scale systems. To achieve high sustained performance, HTMT combines an advanced multithreading processor architecture with a memory-driven coarse-grained latency management strategy called "percolation", yielding high efficiency while reducing much of the parallel programming burden. This paper presents the basic system architecture characteristics made possible through this series of advanced technologies and then gives a detailed description of the new percolation approach to runtime latency management.
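
    The percolation idea can be illustrated with a toy sketch (our own, not HTMT code): a staging stage moves each task's operands into fast local storage before the task is placed on the ready queue, so compute workers never block on long-latency memory.

    ```python
    import queue, threading

    staging_q, ready_q = queue.Queue(), queue.Queue()
    fast_store = {}

    def stager(slow_memory):
        """Percolation stage: prefetch operands from slow memory, then mark the task ready."""
        while True:
            task = staging_q.get()
            if task is None:
                ready_q.put(None)
                return
            name, operands, fn = task
            fast_store[name] = [slow_memory[k] for k in operands]   # long-latency fetch happens here
            ready_q.put((name, fn))

    def worker():
        """Compute stage: runs only tasks whose operands are already local."""
        while True:
            item = ready_q.get()
            if item is None:
                return
            name, fn = item
            print(name, "->", fn(*fast_store.pop(name)))

    slow_memory = {"a": 2, "b": 3}
    threading.Thread(target=stager, args=(slow_memory,), daemon=True).start()
    w = threading.Thread(target=worker); w.start()
    staging_q.put(("add", ("a", "b"), lambda x, y: x + y))
    staging_q.put(None)
    w.join()
    ```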

  6. Architecture and the Information Revolution.

    ERIC Educational Resources Information Center

    Driscoll, Porter; And Others

    1982-01-01

    Traces how technological changes affect the architecture of the workplace. Traces these effects from the industrial revolution up through the computer revolution. Offers suggested designs for the computerized office of today and tomorrow. (JM)

  7. Spaceborne Processor Array

    NASA Technical Reports Server (NTRS)

    Chow, Edward T.; Schatzel, Donald V.; Whitaker, William D.; Sterling, Thomas

    2008-01-01

    A Spaceborne Processor Array in Multifunctional Structure (SPAMS) can lower the total mass of the electronic and structural overhead of spacecraft, resulting in reduced launch costs, while increasing the science return through dynamic onboard computing. SPAMS integrates the multifunctional structure (MFS) and the Gilgamesh Memory, Intelligence, and Network Device (MIND) multi-core in-memory computer architecture into a single-system super-architecture. This transforms every inch of a spacecraft into a sharable, interconnected, smart computing element to increase computing performance while simultaneously reducing mass. The MIND in-memory architecture provides a foundation for high-performance, low-power, and fault-tolerant computing. The MIND chip has an internal structure that includes memory, processing, and communication functionality. The Gilgamesh is a scalable system comprising multiple MIND chips interconnected to operate as a single, tightly coupled, parallel computer. The array of MIND components shares a global, virtual name space for program variables and tasks that are allocated at run time to the distributed physical memory and processing resources. Individual processor-memory nodes can be activated or powered down at run time to provide active power management and to configure around faults. A SPAMS system comprises a distributed Gilgamesh array built into MFS, interfaces into instrument and communication subsystems, a mass storage interface, and a radiation-hardened flight computer.
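
    The run-time reconfiguration described above can be pictured with a toy illustration (ours, not the Gilgamesh runtime; class and task names are hypothetical): tasks bound to a node that is powered down or flagged as faulty are reallocated to the remaining active processor-memory nodes.

    ```python
    class NodeArray:
        def __init__(self, n_nodes):
            self.active = set(range(n_nodes))
            self.tasks = {n: [] for n in range(n_nodes)}

        def assign(self, task, node=None):
            # Fall back to the least-loaded active node if no valid node is given.
            node = node if node in self.active else min(self.active, key=lambda n: len(self.tasks[n]))
            self.tasks[node].append(task)

        def power_down(self, node):
            """Deactivate a node (fault or power management) and migrate its tasks."""
            self.active.discard(node)
            for task in self.tasks.pop(node, []):
                self.assign(task)

    array = NodeArray(4)
    for t in ("fft", "filter", "compress", "downlink"):
        array.assign(t)
    array.power_down(2)   # node 2 fails; its work moves to the least-loaded active node
    ```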

  8. Determining physiological cross-sectional area of extensor carpi radialis longus and brevis as a whole and by regions using 3D computer muscle models created from digitized fiber bundle data.

    PubMed

    Ravichandiran, Kajeandra; Ravichandiran, Mayoorendra; Oliver, Michele L; Singh, Karan S; McKee, Nancy H; Agur, Anne M R

    2009-09-01

    Architectural parameters and physiological cross-sectional area (PCSA) are important determinants of muscle function. Extensor carpi radialis longus (ECRL) and brevis (ECRB) are used in muscle transfers; however, their regional architectural differences have not been investigated. The aim of this study is to develop computational algorithms to quantify and compare architectural parameters (fiber bundle length, pennation angle, and volume) and PCSA of ECRL and ECRB. Fiber bundles distributed throughout the volume of ECRL (75 ± 20) and ECRB (110 ± 30) were digitized in eight formalin-embalmed cadaveric specimens. The digitized data were reconstructed in Autodesk Maya with computational algorithms implemented in Python. The mean PCSA and fiber bundle length were significantly different between ECRL and ECRB (p ≤ 0.05). Superficial ECRL had significantly longer fiber bundle length than the deep region, whereas the PCSA of superficial ECRB was significantly larger than that of the deep region. The regional quantification of architectural parameters and PCSA provides a framework for the exploration of partial tendon transfers of ECRL and ECRB.
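
    The PCSA computation itself follows the standard relation PCSA = V · cos(θ) / FBL. A minimal sketch, with purely hypothetical values (the study's own algorithms operate on thousands of digitized fiber bundle points):

    ```python
    import math

    def pcsa(muscle_volume_cm3, mean_fbl_cm, mean_pennation_deg):
        """Physiological cross-sectional area (cm^2): PCSA = V * cos(theta) / FBL."""
        return muscle_volume_cm3 * math.cos(math.radians(mean_pennation_deg)) / mean_fbl_cm

    # Hypothetical example: a 20 cm^3 muscle region with a 6 cm mean fiber bundle
    # length and a 10 degree mean pennation angle.
    print(round(pcsa(20.0, 6.0, 10.0), 2), "cm^2")   # ~3.28 cm^2
    ```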

  9. Advanced information processing system: The Army fault tolerant architecture conceptual study. Volume 1: Army fault tolerant architecture overview

    NASA Technical Reports Server (NTRS)

    Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.

    1992-01-01

    Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.

  10. Report from the MPP Working Group to the NASA Associate Administrator for Space Science and Applications

    NASA Technical Reports Server (NTRS)

    Fischer, James R.; Grosch, Chester; Mcanulty, Michael; Odonnell, John; Storey, Owen

    1987-01-01

    NASA's Office of Space Science and Applications (OSSA) gave a select group of scientists the opportunity to test and implement their computational algorithms on the Massively Parallel Processor (MPP) located at Goddard Space Flight Center, beginning in late 1985. One year later, the Working Group presented its report, which addressed the following: algorithms, programming languages, architecture, programming environments, the relationship to theory, and performance measurement. The findings point to a number of demonstrated computational techniques for which the MPP architecture is ideally suited. For example, besides executing much faster on the MPP than on conventional computers, systolic VLSI simulation (where distances are short), lattice simulation, neural network simulation, and image problems were found to be easier to program on the MPP's architecture than on a CYBER 205 or even a VAX. The report also makes technical recommendations covering all aspects of MPP use, and recommendations concerning the future of the MPP and machines based on similar architectures, expansion of the Working Group, and study of the role of future parallel processors for space station, EOS, and the Great Observatories era.

  11. A performance analysis of advanced I/O architectures for PC-based network file servers

    NASA Astrophysics Data System (ADS)

    Huynh, K. D.; Khoshgoftaar, T. M.

    1994-12-01

    In the personal computing and workstation environments, more and more I/O adapters are becoming complete functional subsystems that are intelligent enough to handle I/O operations on their own without much intervention from the host processor. The IBM Subsystem Control Block (SCB) architecture has been defined to enhance the potential of these intelligent adapters by defining services and conventions that deliver command information and data to and from the adapters. In recent years, a new storage architecture, the Redundant Array of Independent Disks (RAID), has been quickly gaining acceptance in the world of computing. In this paper, we discuss critical system design issues that are important to the performance of a network file server. We then present a performance analysis of the SCB architecture and disk array technology in typical network file server environments based on personal computers (PCs). One of the key issues investigated in this paper is whether a disk array can outperform a group of disks (of the same type, data capacity, and cost) operating independently, not in parallel as in a disk array.
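
    The striping-versus-independent-disks question can be framed with a back-of-the-envelope model (our own illustration, not the paper's analysis, and with made-up disk parameters): striping spreads one request's transfer over k spindles but pays the positioning overhead on every spindle, while independent disks serve k requests concurrently.

    ```python
    # Toy request-throughput comparison for a k-disk stripe versus k independent disks.
    def service_time(transfer_kb, seek_ms=12.0, rotate_ms=7.0, xfer_kb_per_ms=2.0):
        return seek_ms + rotate_ms + transfer_kb / xfer_kb_per_ms

    def throughput_requests_per_s(transfer_kb, k=4):
        striped = 1000.0 / service_time(transfer_kb / k)       # one request occupies all k disks
        independent = k * 1000.0 / service_time(transfer_kb)   # k requests proceed in parallel
        return striped, independent

    # In this toy model striping mainly lowers per-request latency, while independent
    # disks sustain more concurrent small requests.
    for kb in (8, 64, 1024):
        s, i = throughput_requests_per_s(kb)
        print(f"{kb:5d} KB  striped: {s:6.1f} req/s  independent: {i:6.1f} req/s")
    ```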

  12. Integrating the Apache Big Data Stack with HPC for Big Data

    NASA Astrophysics Data System (ADS)

    Fox, G. C.; Qiu, J.; Jha, S.

    2014-12-01

    There is perhaps a broad consensus as to the important issues in practical parallel computing as applied to large-scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data-intensive computing, even though commercial clouds devote far more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data-intensive applications and to deduce the needed runtimes and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC and ABDS, the Apache Big Data Stack, which is widely used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL, the Scalable Parallel Interoperable Data Analytics Library, built on system and data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.

  13. Implementation and analysis of a Navier-Stokes algorithm on parallel computers

    NASA Technical Reports Server (NTRS)

    Fatoohi, Raad A.; Grosch, Chester E.

    1988-01-01

    The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit-serial processors; the Flex/32, an MIMD machine with 20 processors; and the Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
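
    The "simple performance models" referred to are of the kind exemplified below; Hockney's r_inf / n_half vector timing model is a classic example (our illustration with made-up parameters, not the paper's specific models):

    ```python
    # Hockney-style vector performance model: t(n) = (n + n_half) / r_inf, where r_inf is
    # the asymptotic rate (elements/s) and n_half the vector length giving half that rate.
    def time_vector_op(n, r_inf=100e6, n_half=50):
        return (n + n_half) / r_inf

    def effective_rate(n, **kw):
        return n / time_vector_op(n, **kw)

    # Short vectors are dominated by startup; long vectors approach the asymptotic rate.
    for n in (10, 100, 1000, 100000):
        print(f"n={n:7d}  rate = {effective_rate(n) / 1e6:6.1f} Melem/s")
    ```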

  14. Selected Reference Books of 1998-99.

    ERIC Educational Resources Information Center

    McIlvaine, Eileen

    1999-01-01

    Presents an annotated bibliography of selected reference books published between 1998 and 1999 under subject headings for biography, journalism, mythology, languages and literature, architecture and city planning, political science, economics, history, and new editions and supplements. (LRW)

  15. Constrained Multipoint Aerodynamic Shape Optimization Using an Adjoint Formulation and Parallel Computers

    NASA Technical Reports Server (NTRS)

    Reuther, James; Jameson, Antony; Alonso, Juan Jose; Rimlinger, Mark J.; Saunders, David

    1997-01-01

    An aerodynamic shape optimization method that treats the design of complex aircraft configurations subject to high fidelity computational fluid dynamics (CFD), geometric constraints and multiple design points is described. The design process is greatly accelerated through the use of both control theory and distributed memory computer architectures. Control theory is employed to derive the adjoint differential equations whose solution allows for the evaluation of design gradient information at a fraction of the computational cost required by previous design methods. The resulting problem is implemented on parallel distributed memory architectures using a domain decomposition approach, an optimized communication schedule, and the MPI (Message Passing Interface) standard for portability and efficiency. The final result achieves very rapid aerodynamic design based on a higher order CFD method. In order to facilitate the integration of these high fidelity CFD approaches into future multi-disciplinary optimization (MDO) applications, new methods must be developed which are capable of simultaneously addressing complex geometries, multiple objective functions, and geometric design constraints. In our earlier studies, we coupled the adjoint based design formulations with unconstrained optimization algorithms and showed that the approach was effective for the aerodynamic design of airfoils, wings, wing-bodies, and complex aircraft configurations. In many of the results presented in these earlier works, geometric constraints were satisfied either by a projection into feasible space or by posing the design space parameterization such that it automatically satisfied constraints. Furthermore, with the exception of reference 9, where the second author initially explored the use of multipoint design in conjunction with adjoint formulations, our earlier works have focused on single point design efforts. Here we demonstrate that the same methodology may be extended to treat complete configuration designs subject to multiple design points and geometric constraints. Examples are presented for both transonic and supersonic configurations ranging from wing alone designs to complex configuration designs involving wing, fuselage, nacelles and pylons.
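
    Schematically, the multipoint extension forms a weighted sum of objectives over the design points, with the adjoint supplying each point's gradient at roughly the cost of one additional flow solve. A minimal sketch with placeholder flow and adjoint solvers (hypothetical, not the authors' code; constraints are omitted):

    ```python
    import numpy as np

    def flow_solve(x, mach):          # placeholder for the CFD analysis at one design point
        return np.sum((x - mach) ** 2)

    def adjoint_gradient(x, mach):    # placeholder for the adjoint-based gradient at that point
        return 2.0 * (x - mach)

    design_points = [(0.78, 0.4), (0.82, 0.4), (0.84, 0.2)]   # (Mach, weight), hypothetical

    x = np.zeros(5)                   # shape design variables
    for it in range(50):
        # Composite gradient: one adjoint solve per design point per iteration.
        grad = sum(w * adjoint_gradient(x, m) for m, w in design_points)
        x -= 0.1 * grad               # simple steepest-descent update
    composite = sum(w * flow_solve(x, m) for m, w in design_points)
    ```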

  16. Analysis of Disaster Preparedness Planning Measures in DoD Computer Facilities

    DTIC Science & Technology

    1993-09-01

    Only OCR fragments of this report are available in this record. The recoverable table-of-contents entries cover computer disaster recovery (including PC and LAN lessons learned), distributed architectures, and backups; a quoted fragment notes "...expense, but no client problems" (Leeke, 1993, p. 8).

  17. Computer Architecture for Energy Efficient SFQ

    DTIC Science & Technology

    2014-08-27

    Only fragments of the report abstract are available in this record. The work, an ARO-sponsored project at IBM Research (T.J. Watson Research Laboratory, Yorktown Heights, NY), identified and modeled an energy-efficient SFQ-based computer architecture, IBM Windsor Blue (WB). The basic building block of WB is a "tile" comprising a 64-bit arithmetic logic unit.

  18. Proactive Fault Tolerance Using Preemptive Migration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Engelmann, Christian; Vallee, Geoffroy R; Naughton, III, Thomas J

    2009-01-01

    Proactive fault tolerance (FT) in high-performance computing is a concept that prevents compute node failures from impacting running parallel applications by preemptively migrating application parts away from nodes that are about to fail. This paper provides a foundation for proactive FT by defining its architecture and classifying implementation options. This paper further relates prior work to the presented architecture and classification, and discusses the challenges ahead for needed supporting technologies.
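
    A minimal sketch of the preemptive-migration idea (our own illustration, not the paper's framework; node names and the simple temperature threshold used as a failure predictor are hypothetical):

    ```python
    # A health monitor flags nodes whose sensor readings predict failure, and the
    # application parts running there are migrated to healthy spares before the failure.
    THRESHOLD_C = 85.0

    def nodes_about_to_fail(health_readings):
        return [n for n, temp in health_readings.items() if temp > THRESHOLD_C]

    def migrate(assignments, suspect, spares):
        for rank, node in assignments.items():
            if node == suspect and spares:
                assignments[rank] = spares.pop()
        return assignments

    assignments = {0: "node1", 1: "node2", 2: "node2", 3: "node3"}
    health = {"node1": 62.0, "node2": 91.5, "node3": 70.1}
    spares = ["node4", "node5"]
    for suspect in nodes_about_to_fail(health):
        assignments = migrate(assignments, suspect, spares)
    print(assignments)   # ranks 1 and 2 move off node2 before it fails
    ```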

  19. Implementation of Virtualization Oriented Architecture: A Healthcare Industry Case Study

    NASA Astrophysics Data System (ADS)

    Rao, G. Subrahmanya Vrk; Parthasarathi, Jinka; Karthik, Sundararaman; Rao, Gvn Appa; Ganesan, Suresh

    This paper presents a Virtualization Oriented Architecture (VOA) and an implementation of VOA for Hridaya, a telemedicine initiative. A Hadoop compute cloud was established at our labs, jobs requiring massive computing capability, such as ECG signal analysis, were submitted, and the resulting study is presented in this paper. VOA takes advantage of inexpensive community PCs and provides added advantages such as fault tolerance, scalability, performance, and high availability.

  20. Optimization of knowledge sharing through multi-forum using cloud computing architecture

    NASA Astrophysics Data System (ADS)

    Madapusi Vasudevan, Sriram; Sankaran, Srivatsan; Muthuswamy, Shanmugasundaram; Ram, N. Sankar

    2011-12-01

    Knowledge sharing is typically done through various forums, each requiring a separate login in its own browser instance. Here a single multi-forum knowledge sharing concept is introduced that requires only one login session, allowing the user to connect to multiple forums and display the data in a single browser window. A few optimization techniques based on a cloud computing architecture are also introduced to speed up access time.
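
    A minimal sketch of the single-session, multi-forum aggregation idea (the endpoints, fields, and time-based cache used as the "optimization" are hypothetical, not the paper's implementation):

    ```python
    import time

    FORUMS = ["https://forum-a.example/api/posts", "https://forum-b.example/api/posts"]
    _cache = {}

    def fetch_posts(url, session_token):
        # Placeholder for an HTTP GET made with the shared session token.
        return [{"forum": url, "title": f"post from {url}", "time": time.time()}]

    def aggregated_view(session_token, ttl_s=60):
        """Reuse one authenticated session across forums and merge posts into a single view."""
        now = time.time()
        posts = []
        for url in FORUMS:
            cached = _cache.get(url)
            if cached is None or now - cached[0] > ttl_s:
                cached = (now, fetch_posts(url, session_token))
                _cache[url] = cached          # small cache reduces repeated access latency
            posts.extend(cached[1])
        return sorted(posts, key=lambda p: p["time"], reverse=True)

    print(len(aggregated_view("single-login-token")))   # 2 posts, one per forum
    ```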
