2010-06-01
DATES COVEREDAPR 2009 – JAN 2010 (From - To) APR 2009 – JAN 2010 4. TITLE AND SUBTITLE EMERGING NEUROMORPHIC COMPUTING ARCHITECTURES AND ENABLING...14. ABSTRACT The highly cross-disciplinary emerging field of neuromorphic computing architectures for cognitive information processing applications...belief systems, software, computer engineering, etc. In our effort to develop cognitive systems atop a neuromorphic computing architecture, we explored
Transitioning ISR architecture into the cloud
NASA Astrophysics Data System (ADS)
Lash, Thomas D.
2012-06-01
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas; Schuman, Catherine; Patton, Robert
The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshopmore » we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.« less
Architecture of a prehospital emergency patient care report system (PEPRS).
Majeed, Raphael W; Stöhr, Mark R; Röhrig, Rainer
2013-01-01
In recent years, prehospital emergency care adapted to the technology shift towards tablet computers and mobile computing. In particular, electronic patient care report (e-PCR) systems gained considerable attention and adoption in prehospital emergency medicine [1]. On the other hand, hospital information systems are already widely adopted. Yet, there is no universal solution for integrating prehospital emergency reports into electronic medical records of hospital information systems. Previous projects either relied on proprietary viewing workstations or examined and transferred only data for specific diseases (e.g. stroke patients[2]). Using requirements engineering and a three step software engineering approach, this project presents a generic architecture for integrating prehospital emergency care reports into hospital information systems. Aim of this project is to describe a generic architecture which can be used to implement data transfer and integration of pre hospital emergency care reports to hospital information systems. In summary, the prototype was able to integrate data in a standardized manner. The devised methods can be used design generic software for prehospital to hospital data integration.
Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.
Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein
2015-12-01
Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies might include Smartphones and wearable devices, which have attracted the attention of medical applications. These products are used less in critical medical applications because of their resource constraint and failure sensitivity. This is due to the fact that without safety considerations, small-integrated hardware will endanger patients' lives. Therefore, proposing some principals is required to construct wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to cloud. The tiers of this architecture include wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high possible fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.
Programmable hardware for reconfigurable computing systems
NASA Astrophysics Data System (ADS)
Smith, Stephen
1996-10-01
In 1945 the work of J. von Neumann and H. Goldstein created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
Neural simulations on multi-core architectures.
Eichner, Hubert; Klug, Tobias; Borst, Alexander
2009-01-01
Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing.
Neural Simulations on Multi-Core Architectures
Eichner, Hubert; Klug, Tobias; Borst, Alexander
2009-01-01
Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing. PMID:19636393
Three Program Architecture for Design Optimization
NASA Technical Reports Server (NTRS)
Miura, Hirokazu; Olson, Lawrence E. (Technical Monitor)
1998-01-01
In this presentation, I would like to review historical perspective on the program architecture used to build design optimization capabilities based on mathematical programming and other numerical search techniques. It is rather straightforward to classify the program architecture in three categories as shown above. However, the relative importance of each of the three approaches has not been static, instead dynamically changing as the capabilities of available computational resource increases. For example, we considered that the direct coupling architecture would never be used for practical problems, but availability of such computer systems as multi-processor. In this presentation, I would like to review the roles of three architecture from historical as well as current and future perspective. There may also be some possibility for emergence of hybrid architecture. I hope to provide some seeds for active discussion where we are heading to in the very dynamic environment for high speed computing and communication.
Electromagnetic Physics Models for Parallel Computing Architectures
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Electromagnetic physics models for parallel computing architectures
Amadio, G.; Ananya, A.; Apostolakis, J.; ...
2016-11-21
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part ofmore » the GeantV project. Finally, the results of preliminary performance evaluation and physics validation are presented as well.« less
NASA Astrophysics Data System (ADS)
Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.
2017-12-01
As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.
An Architecture for Cross-Cloud System Management
NASA Astrophysics Data System (ADS)
Dodda, Ravi Teja; Smith, Chris; van Moorsel, Aad
The emergence of the cloud computing paradigm promises flexibility and adaptability through on-demand provisioning of compute resources. As the utilization of cloud resources extends beyond a single provider, for business as well as technical reasons, the issue of effectively managing such resources comes to the fore. Different providers expose different interfaces to their compute resources utilizing varied architectures and implementation technologies. This heterogeneity poses a significant system management problem, and can limit the extent to which the benefits of cross-cloud resource utilization can be realized. We address this problem through the definition of an architecture to facilitate the management of compute resources from different cloud providers in an homogenous manner. This preserves the flexibility and adaptability promised by the cloud computing paradigm, whilst enabling the benefits of cross-cloud resource utilization to be realized. The practical efficacy of the architecture is demonstrated through an implementation utilizing compute resources managed through different interfaces on the Amazon Elastic Compute Cloud (EC2) service. Additionally, we provide empirical results highlighting the performance differential of these different interfaces, and discuss the impact of this performance differential on efficiency and profitability.
NASA Astrophysics Data System (ADS)
Liu, Lei; Hong, Xiaobin; Wu, Jian; Lin, Jintong
As Grid computing continues to gain popularity in the industry and research community, it also attracts more attention from the customer level. The large number of users and high frequency of job requests in the consumer market make it challenging. Clearly, all the current Client/Server(C/S)-based architecture will become unfeasible for supporting large-scale Grid applications due to its poor scalability and poor fault-tolerance. In this paper, based on our previous works [1, 2], a novel self-organized architecture to realize a highly scalable and flexible platform for Grids is proposed. Experimental results show that this architecture is suitable and efficient for consumer-oriented Grids.
Bioinspired architecture approach for a one-billion transistor smart CMOS camera chip
NASA Astrophysics Data System (ADS)
Fey, Dietmar; Komann, Marcus
2007-05-01
In the paper we present a massively parallel VLSI architecture for future smart CMOS camera chips with up to one billion transistors. To exploit efficiently the potential offered by future micro- or nanoelectronic devices traditional on central structures oriented parallel architectures based on MIMD or SIMD approaches will fail. They require too long and too many global interconnects for the distribution of code or the access to common memory. On the other hand nature developed self-organising and emergent principles to manage successfully complex structures based on lots of interacting simple elements. Therefore we developed a new as Marching Pixels denoted emergent computing paradigm based on a mixture of bio-inspired computing models like cellular automaton and artificial ants. In the paper we present different Marching Pixels algorithms and the corresponding VLSI array architecture. A detailed synthesis result for a 0.18 μm CMOS process shows that a 256×256 pixel image is processed in less than 10 ms assuming a moderate 100 MHz clock rate for the processor array. Future higher integration densities and a 3D chip stacking technology will allow the integration and processing of Mega pixels within the same time since our architecture is fully scalable.
Modeling driver behavior in a cognitive architecture.
Salvucci, Dario D
2006-01-01
This paper explores the development of a rigorous computational model of driver behavior in a cognitive architecture--a computational framework with underlying psychological theories that incorporate basic properties and limitations of the human system. Computational modeling has emerged as a powerful tool for studying the complex task of driving, allowing researchers to simulate driver behavior and explore the parameters and constraints of this behavior. An integrated driver model developed in the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture is described that focuses on the component processes of control, monitoring, and decision making in a multilane highway environment. This model accounts for the steering profiles, lateral position profiles, and gaze distributions of human drivers during lane keeping, curve negotiation, and lane changing. The model demonstrates how cognitive architectures facilitate understanding of driver behavior in the context of general human abilities and constraints and how the driving domain benefits cognitive architectures by pushing model development toward more complex, realistic tasks. The model can also serve as a core computational engine for practical applications that predict and recognize driver behavior and distraction.
Pohjonen, Hanna; Ross, Peeter; Blickman, Johan G; Kamman, Richard
2007-01-01
Emerging technologies are transforming the workflows in healthcare enterprises. Computing grids and handheld mobile/wireless devices are providing clinicians with enterprise-wide access to all patient data and analysis tools on a pervasive basis. In this paper, emerging technologies are presented that provide computing grids and streaming-based access to image and data management functions, and system architectures that enable pervasive computing on a cost-effective basis. Finally, the implications of such technologies are investigated regarding the positive impacts on clinical workflows.
Parallel, stochastic measurement of molecular surface area.
Juba, Derek; Varshney, Amitabh
2008-08-01
Biochemists often wish to compute surface areas of proteins. A variety of algorithms have been developed for this task, but they are designed for traditional single-processor architectures. The current trend in computer hardware is towards increasingly parallel architectures for which these algorithms are not well suited. We describe a parallel, stochastic algorithm for molecular surface area computation that maps well to the emerging multi-core architectures. Our algorithm is also progressive, providing a rough estimate of surface area immediately and refining this estimate as time goes on. Furthermore, the algorithm generates points on the molecular surface which can be used for point-based rendering. We demonstrate a GPU implementation of our algorithm and show that it compares favorably with several existing molecular surface computation programs, giving fast estimates of the molecular surface area with good accuracy.
Client/Server Architecture Promises Radical Changes.
ERIC Educational Resources Information Center
Freeman, Grey; York, Jerry
1991-01-01
This article discusses the emergence of the client/server paradigm for the delivery of computer applications, its emergence in response to the proliferation of microcomputers and local area networks, the applicability of the model in academic institutions, and its implications for college campus information technology organizations. (Author/DB)
Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuller, Ivan K.; Stevens, Rick; Pino, Robinson
2015-10-29
Computation in its many forms is the engine that fuels our modern civilization. Modern computation—based on the von Neumann architecture—has allowed, until now, the development of continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably—within the next 10 years—reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS basedmore » technology? If so, what are the basic research challenges for materials sicence and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: The main differences between neuromorphic and conventional computing as related to: signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning New neuromorphic architectures needed to: produce lower energy consumption, potential novel nanostructured materials, and enhanced computation Device and materials properties needed to implement functions such as: hysteresis, stability, and fault tolerance Comparisons of different implementations: spin torque, memristors, resistive switching, phase change, and optical schemes for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.« less
Design and Delivery of Multiple Server-Side Computer Languages Course
ERIC Educational Resources Information Center
Wang, Shouhong; Wang, Hai
2011-01-01
Given the emergence of service-oriented architecture, IS students need to be knowledgeable of multiple server-side computer programming languages to be able to meet the needs of the job market. This paper outlines the pedagogy of an innovative course of multiple server-side computer languages for the undergraduate IS majors. The paper discusses…
A Grid Infrastructure for Supporting Space-based Science Operations
NASA Technical Reports Server (NTRS)
Bradford, Robert N.; Redman, Sandra H.; McNair, Ann R. (Technical Monitor)
2002-01-01
Emerging technologies for computational grid infrastructures have the potential for revolutionizing the way computers are used in all aspects of our lives. Computational grids are currently being implemented to provide a large-scale, dynamic, and secure research and engineering environments based on standards and next-generation reusable software, enabling greater science and engineering productivity through shared resources and distributed computing for less cost than traditional architectures. Combined with the emerging technologies of high-performance networks, grids provide researchers, scientists and engineers the first real opportunity for an effective distributed collaborative environment with access to resources such as computational and storage systems, instruments, and software tools and services for the most computationally challenging applications.
An Architectural Experience for Interface Design
ERIC Educational Resources Information Center
Gong, Susan P.
2016-01-01
The problem of human-computer interface design was brought to the foreground with the emergence of the personal computer, the increasing complexity of electronic systems, and the need to accommodate the human operator in these systems. With each new technological generation discovering the interface design problems of its own technologies, initial…
Cheong, Chang Heon; Lee, Seonhye
2018-01-01
The prevention of airborne infections in emergency departments is a very important issue. This study investigated the effects of architectural features on airborne pathogen dispersion in emergency departments by using a CFD (computational fluid dynamics) simulation tool. The study included three architectural features as the major variables: increased ventilation rate, inlet and outlet diffuser positions, and partitions between beds. The most effective method for preventing pathogen dispersion and reducing the pathogen concentration was found to be increasing the ventilation rate. Installing partitions between the beds and changing the ventilation system’s inlet and outlet diffuser positions contributed only minimally to reducing the concentration of airborne pathogens. PMID:29534043
Cheong, Chang Heon; Lee, Seonhye
2018-03-13
The prevention of airborne infections in emergency departments is a very important issue. This study investigated the effects of architectural features on airborne pathogen dispersion in emergency departments by using a CFD (computational fluid dynamics) simulation tool. The study included three architectural features as the major variables: increased ventilation rate, inlet and outlet diffuser positions, and partitions between beds. The most effective method for preventing pathogen dispersion and reducing the pathogen concentration was found to be increasing the ventilation rate. Installing partitions between the beds and changing the ventilation system's inlet and outlet diffuser positions contributed only minimally to reducing the concentration of airborne pathogens.
An integrative architecture for a sensor-supported trust management system.
Trček, Denis
2012-01-01
Trust plays a key role not only in e-worlds and emerging pervasive computing environments, but also already for millennia in human societies. Trust management solutions that have being around now for some fifteen years were primarily developed for the above mentioned cyber environments and they are typically focused on artificial agents, sensors, etc. However, this paper presents extensions of a new methodology together with architecture for trust management support that is focused on humans and human-like agents. With this methodology and architecture sensors play a crucial role. The architecture presents an already deployable tool for multi and interdisciplinary research in various areas where humans are involved. It provides new ways to obtain an insight into dynamics and evolution of such structures, not only in pervasive computing environments, but also in other important areas like management and decision making support.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brun, B.
1997-07-01
Computer technology has improved tremendously during the last years with larger media capacity, memory and more computational power. Visual computing with high-performance graphic interface and desktop computational power have changed the way engineers accomplish everyday tasks, development and safety studies analysis. The emergence of parallel computing will permit simulation over a larger domain. In addition, new development methods, languages and tools have appeared in the last several years.
A smart sensor architecture based on emergent computation in an array of outer-totalistic cells
NASA Astrophysics Data System (ADS)
Dogaru, Radu; Dogaru, Ioana; Glesner, Manfred
2005-06-01
A novel smart-sensor architecture is proposed, capable to segment and recognize characters in a monochrome image. It is capable to provide a list of ASCII codes representing the recognized characters from the monochrome visual field. It can operate as a blind's aid or for industrial applications. A bio-inspired cellular model with simple linear neurons was found the best to perform the nontrivial task of cropping isolated compact objects such as handwritten digits or characters. By attaching a simple outer-totalistic cell to each pixel sensor, emergent computation in the resulting cellular automata lattice provides a straightforward and compact solution to the otherwise computationally intensive problem of character segmentation. A simple and robust recognition algorithm is built in a compact sequential controller accessing the array of cells so that the integrated device can provide directly a list of codes of the recognized characters. Preliminary simulation tests indicate good performance and robustness to various distortions of the visual field.
An Integrative Architecture for a Sensor-Supported Trust Management System
Trček, Denis
2012-01-01
Trust plays a key role not only in e-worlds and emerging pervasive computing environments, but also already for millennia in human societies. Trust management solutions that have being around now for some fifteen years were primarily developed for the above mentioned cyber environments and they are typically focused on artificial agents, sensors, etc. However, this paper presents extensions of a new methodology together with architecture for trust management support that is focused on humans and human-like agents. With this methodology and architecture sensors play a crucial role. The architecture presents an already deployable tool for multi and interdisciplinary research in various areas where humans are involved. It provides new ways to obtain an insight into dynamics and evolution of such structures, not only in pervasive computing environments, but also in other important areas like management and decision making support. PMID:23112628
Challenges & Roadmap for Beyond CMOS Computing Simulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rodrigues, Arun F.; Frank, Michael P.
Simulating HPC systems is a difficult task and the emergence of “Beyond CMOS” architectures and execution models will increase that difficulty. This document presents a “tutorial” on some of the simulation challenges faced by conventional and non-conventional architectures (Section 1) and goals and requirements for simulating Beyond CMOS systems (Section 2). These provide background for proposed short- and long-term roadmaps for simulation efforts at Sandia (Sections 3 and 4). Additionally, a brief explanation of a proof-of-concept integration of a Beyond CMOS architectural simulator is presented (Section 2.3).
Fault tolerant computer control for a Maglev transportation system
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George
1994-01-01
Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.
PIMS: Memristor-Based Processing-in-Memory-and-Storage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cook, Jeanine
Continued progress in computing has augmented the quest for higher performance with a new quest for higher energy efficiency. This has led to the re-emergence of Processing-In-Memory (PIM) ar- chitectures that offer higher density and performance with some boost in energy efficiency. Past PIM work either integrated a standard CPU with a conventional DRAM to improve the CPU- memory link, or used a bit-level processor with Single Instruction Multiple Data (SIMD) control, but neither matched the energy consumption of the memory to the computation. We originally proposed to develop a new architecture derived from PIM that more effectively addressed energymore » efficiency for high performance scientific, data analytics, and neuromorphic applications. We also originally planned to implement a von Neumann architecture with arithmetic/logic units (ALUs) that matched the power consumption of an advanced storage array to maximize energy efficiency. Implementing this architecture in storage was our original idea, since by augmenting storage (in- stead of memory), the system could address both in-memory computation and applications that accessed larger data sets directly from storage, hence Processing-in-Memory-and-Storage (PIMS). However, as our research matured, we discovered several things that changed our original direc- tion, the most important being that a PIM that implements a standard von Neumann-type archi- tecture results in significant energy efficiency improvement, but only about a O(10) performance improvement. In addition to this, the emergence of new memory technologies moved us to propos- ing a non-von Neumann architecture, called Superstrider, implemented not in storage, but in a new DRAM technology called High Bandwidth Memory (HBM). HBM is a stacked DRAM tech- nology that includes a logic layer where an architecture such as Superstrider could potentially be implemented.« less
ERIC Educational Resources Information Center
Buechley, Leah, Ed.; Peppler, Kylie, Ed.; Eisenberg, Michael, Ed.; Yasmin, Kafai, Ed.
2013-01-01
"Textile Messages" focuses on the emerging field of electronic textiles, or e-textiles--computers that can be soft, colorful, approachable, and beautiful. E-textiles are articles of clothing, home furnishings, or architectures that include embedded computational and electronic elements. This book introduces a collection of tools that…
A Survey Of Architectural Approaches for Managing Embedded DRAM and Non-volatile On-chip Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh; Vetter, Jeffrey S; Li, Dong
Recent trends of CMOS scaling and increasing number of on-chip cores have led to a large increase in the size of on-chip caches. Since SRAM has low density and consumes large amount of leakage power, its use in designing on-chip caches has become more challenging. To address this issue, researchers are exploring the use of several emerging memory technologies, such as embedded DRAM, spin transfer torque RAM, resistive RAM, phase change RAM and domain wall memory. In this paper, we survey the architectural approaches proposed for designing memory systems and, specifically, caches with these emerging memory technologies. To highlight theirmore » similarities and differences, we present a classification of these technologies and architectural approaches based on their key characteristics. We also briefly summarize the challenges in using these technologies for architecting caches. We believe that this survey will help the readers gain insights into the emerging memory device technologies, and their potential use in designing future computing systems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venkata, Manjunath Gorentla; Aderholdt, William F
The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend of system architecture in extreme-scale systems is expected to continue into the exascale era. along with hierarchical-heterogeneous memory, the system typically has a high-performing network ad a compute accelerator. This system architecture is not only effective for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecturemore » supports the convergence of the Big-Compute and Big-Data, the programming models and software layer have yet to evolve to support either hierarchical-heterogeneous memory systems or the convergence. A programming abstraction to address this problem. The programming abstraction is implemented as a software library and runs on pre-exascale and exascale systems supporting current and emerging system architecture. Using distributed data-structures as a central concept, it provides (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications.« less
Cloud computing for energy management in smart grid - an application survey
NASA Astrophysics Data System (ADS)
Naveen, P.; Kiing Ing, Wong; Kobina Danquah, Michael; Sidhu, Amandeep S.; Abu-Siada, Ahmed
2016-03-01
The smart grid is the emerging energy system wherein the application of information technology, tools and techniques that make the grid run more efficiently. It possesses demand response capacity to help balance electrical consumption with supply. The challenges and opportunities of emerging and future smart grids can be addressed by cloud computing. To focus on these requirements, we provide an in-depth survey on different cloud computing applications for energy management in the smart grid architecture. In this survey, we present an outline of the current state of research on smart grid development. We also propose a model of cloud based economic power dispatch for smart grid.
A PACS archive architecture supported on cloud services.
Silva, Luís A Bastião; Costa, Carlos; Oliveira, José Luis
2012-05-01
Diagnostic imaging procedures have continuously increased over the last decade and this trend may continue in coming years, creating a great impact on storage and retrieval capabilities of current PACS. Moreover, many smaller centers do not have financial resources or requirements that justify the acquisition of a traditional infrastructure. Alternative solutions, such as cloud computing, may help address this emerging need. A tremendous amount of ubiquitous computational power, such as that provided by Google and Amazon, are used every day as a normal commodity. Taking advantage of this new paradigm, an architecture for a Cloud-based PACS archive that provides data privacy, integrity, and availability is proposed. The solution is independent from the cloud provider and the core modules were successfully instantiated in examples of two cloud computing providers. Operational metrics for several medical imaging modalities were tabulated and compared for Google Storage, Amazon S3, and LAN PACS. A PACS-as-a-Service archive that provides storage of medical studies using the Cloud was developed. The results show that the solution is robust and that it is possible to store, query, and retrieve all desired studies in a similar way as in a local PACS approach. Cloud computing is an emerging solution that promises high scalability of infrastructures, software, and applications, according to a "pay-as-you-go" business model. The presented architecture uses the cloud to setup medical data repositories and can have a significant impact on healthcare institutions by reducing IT infrastructures.
High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations
NASA Technical Reports Server (NTRS)
Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.
2003-01-01
Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.
NASA Astrophysics Data System (ADS)
Evans, J. D.; Hao, W.; Chettri, S. R.
2014-12-01
Disaster risk management has grown to rely on earth observations, multi-source data analysis, numerical modeling, and interagency information sharing. The practice and outcomes of disaster risk management will likely undergo further change as several emerging earth science technologies come of age: mobile devices; location-based services; ubiquitous sensors; drones; small satellites; satellite direct readout; Big Data analytics; cloud computing; Web services for predictive modeling, semantic reconciliation, and collaboration; and many others. Integrating these new technologies well requires developing and adapting them to meet current needs; but also rethinking current practice to draw on new capabilities to reach additional objectives. This requires a holistic view of the disaster risk management enterprise and of the analytical or operational capabilities afforded by these technologies. One helpful tool for this assessment, the GEOSS Architecture for the Use of Remote Sensing Products in Disaster Management and Risk Assessment (Evans & Moe, 2013), considers all phases of the disaster risk management lifecycle for a comprehensive set of natural hazard types, and outlines common clusters of activities and their use of information and computation resources. We are using these architectural views, together with insights from current practice, to highlight effective, interrelated roles for emerging earth science technologies in disaster risk management. These roles may be helpful in creating roadmaps for research and development investment at national and international levels.
32 bit digital optical computer - A hardware update
NASA Technical Reports Server (NTRS)
Guilfoyle, Peter S.; Carter, James A., III; Stone, Richard V.; Pape, Dennis R.
1990-01-01
Such state-of-the-art devices as multielement linear laser diode arrays, multichannel acoustooptic modulators, optical relays, and avalanche photodiode arrays, are presently applied to the implementation of a 32-bit supercomputer's general-purpose optical central processing architecture. Shannon's theorem, Morozov's control operator method (in conjunction with combinatorial arithmetic), and DeMorgan's law have been used to design an architecture whose 100 MHz clock renders it fully competitive with emerging planar-semiconductor technology. Attention is given to the architecture's multichannel Bragg cells, thermal design and RF crosstalk considerations, and the first and second anamorphic relay legs.
Integrated Distributed Directory Service for KSC
NASA Technical Reports Server (NTRS)
Ghansah, Isaac
1997-01-01
This paper describes an integrated distributed directory services (DDS) architecture as a fundamental component of KSC distributed computing systems. Specifically, an architecture for an integrated directory service based on DNS and X.500/LDAP has been suggested. The architecture supports using DNS in its traditional role as a name service and X.500 for other services. Specific designs were made in the integration of X.500 DDS for Public Key Certificates, Kerberos Security Services, Network-wide Login, Electronic Mail, WWW URLS, Servers, and other diverse network objects. Issues involved in incorporating the emerging Microsoft Active Directory Service MADS in KSC's X.500 were discussed.
Stamatakis, Alexandros; Ott, Michael
2008-12-27
The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on 'gappy' multi-gene alignments. By 'gappy' we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAXML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.
A Web of Things-Based Emerging Sensor Network Architecture for Smart Control Systems.
Khan, Murad; Silva, Bhagya Nathali; Han, Kijun
2017-02-09
The Web of Things (WoT) plays an important role in the representation of the objects connected to the Internet of Things in a more transparent and effective way. Thus, it enables seamless and ubiquitous web communication between users and the smart things. Considering the importance of WoT, we propose a WoT-based emerging sensor network (WoT-ESN), which collects data from sensors, routes sensor data to the web, and integrate smart things into the web employing a representational state transfer (REST) architecture. A smart home scenario is introduced to evaluate the proposed WoT-ESN architecture. The smart home scenario is tested through computer simulation of the energy consumption of various household appliances, device discovery, and response time performance. The simulation results show that the proposed scheme significantly optimizes the energy consumption of the household appliances and the response time of the appliances.
A Web of Things-Based Emerging Sensor Network Architecture for Smart Control Systems
Khan, Murad; Silva, Bhagya Nathali; Han, Kijun
2017-01-01
The Web of Things (WoT) plays an important role in the representation of the objects connected to the Internet of Things in a more transparent and effective way. Thus, it enables seamless and ubiquitous web communication between users and the smart things. Considering the importance of WoT, we propose a WoT-based emerging sensor network (WoT-ESN), which collects data from sensors, routes sensor data to the web, and integrate smart things into the web employing a representational state transfer (REST) architecture. A smart home scenario is introduced to evaluate the proposed WoT-ESN architecture. The smart home scenario is tested through computer simulation of the energy consumption of various household appliances, device discovery, and response time performance. The simulation results show that the proposed scheme significantly optimizes the energy consumption of the household appliances and the response time of the appliances. PMID:28208787
Efficient parallel implementation of active appearance model fitting algorithm on GPU.
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
High-precision arithmetic in mathematical physics
Bailey, David H.; Borwein, Jonathan M.
2015-05-12
For many scientific calculations, particularly those involving empirical data, IEEE 32-bit floating-point arithmetic produces results of sufficient accuracy, while for other applications IEEE 64-bit floating-point is more appropriate. But for some very demanding applications, even higher levels of precision are often required. Furthermore, this article discusses the challenge of high-precision computation, in the context of mathematical physics, and highlights what facilities are required to support future computation, in light of emerging developments in computer architecture.
Evolution of a designless nanoparticle network into reconfigurable Boolean logic
NASA Astrophysics Data System (ADS)
Bose, S. K.; Lawrence, C. P.; Liu, Z.; Makarenko, K. S.; van Damme, R. M. J.; Broersma, H. J.; van der Wiel, W. G.
2015-12-01
Natural computers exploit the emergent properties and massive parallelism of interconnected networks of locally active components. Evolution has resulted in systems that compute quickly and that use energy efficiently, utilizing whatever physical properties are exploitable. Man-made computers, on the other hand, are based on circuits of functional units that follow given design rules. Hence, potentially exploitable physical processes, such as capacitive crosstalk, to solve a problem are left out. Until now, designless nanoscale networks of inanimate matter that exhibit robust computational functionality had not been realized. Here we artificially evolve the electrical properties of a disordered nanomaterials system (by optimizing the values of control voltages using a genetic algorithm) to perform computational tasks reconfigurably. We exploit the rich behaviour that emerges from interconnected metal nanoparticles, which act as strongly nonlinear single-electron transistors, and find that this nanoscale architecture can be configured in situ into any Boolean logic gate. This universal, reconfigurable gate would require about ten transistors in a conventional circuit. Our system meets the criteria for the physical realization of (cellular) neural networks: universality (arbitrary Boolean functions), compactness, robustness and evolvability, which implies scalability to perform more advanced tasks. Our evolutionary approach works around device-to-device variations and the accompanying uncertainties in performance. Moreover, it bears a great potential for more energy-efficient computation, and for solving problems that are very hard to tackle in conventional architectures.
Neuromorphic Kalman filter implementation in IBM’s TrueNorth
NASA Astrophysics Data System (ADS)
Carney, R.; Bouchard, K.; Calafiura, P.; Clark, D.; Donofrio, D.; Garcia-Sciveres, M.; Livezey, J.
2017-10-01
Following the advent of a post-Moore’s law field of computation, novel architectures continue to emerge. With composite, multi-million connection neuromorphic chips like IBM’s TrueNorth, neural engineering has now become a feasible technology in this novel computing paradigm. High Energy Physics experiments are continuously exploring new methods of computation and data handling, including neuromorphic, to support the growing challenges of the field and be prepared for future commodity computing trends. This work details the first instance of a Kalman filter implementation in IBM’s neuromorphic architecture, TrueNorth, for both parallel and serial spike trains. The implementation is tested on multiple simulated systems and its performance is evaluated with respect to an equivalent non-spiking Kalman filter. The limits of the implementation are explored whilst varying the size of weight and threshold registers, the number of spikes used to encode a state, size of neuron block for spatial encoding, and neuron potential reset schemes.
Vectorization, threading, and cache-blocking considerations for hydrocodes on emerging architectures
Fung, J.; Aulwes, R. T.; Bement, M. T.; ...
2015-07-14
This work reports on considerations for improving computational performance in preparation for current and expected changes to computer architecture. The algorithms studied will include increasingly complex prototypes for radiation hydrodynamics codes, such as gradient routines and diffusion matrix assembly (e.g., in [1-6]). The meshes considered for the algorithms are structured or unstructured meshes. The considerations applied for performance improvements are meant to be general in terms of architecture (not specifically graphical processing unit (GPUs) or multi-core machines, for example) and include techniques for vectorization, threading, tiling, and cache blocking. Out of a survey of optimization techniques on applications such asmore » diffusion and hydrodynamics, we make general recommendations with a view toward making these techniques conceptually accessible to the applications code developer. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.« less
Perspectives in numerical astrophysics:
NASA Astrophysics Data System (ADS)
Reverdy, V.
2016-12-01
In this discussion paper, we investigate the current and future status of numerical astrophysics and highlight key questions concerning the transition to the exascale era. We first discuss the fact that one of the main motivation behind high performance simulations should not be the reproduction of observational or experimental data, but the understanding of the emergence of complexity from fundamental laws. This motivation is put into perspective regarding the quest for more computational power and we argue that extra computational resources can be used to gain in abstraction. Then, the readiness level of present-day simulation codes in regard to upcoming exascale architecture is examined and two major challenges are raised concerning both the central role of data movement for performances and the growing complexity of codes. Software architecture is finally presented as a key component to make the most of upcoming architectures while solving original physics problems.
NASA Astrophysics Data System (ADS)
Vodenicarevic, D.; Locatelli, N.; Mizrahi, A.; Friedman, J. S.; Vincent, A. F.; Romera, M.; Fukushima, A.; Yakushiji, K.; Kubota, H.; Yuasa, S.; Tiwari, S.; Grollier, J.; Querlioz, D.
2017-11-01
Low-energy random number generation is critical for many emerging computing schemes proposed to complement or replace von Neumann architectures. However, current random number generators are always associated with an energy cost that is prohibitive for these computing schemes. We introduce random number bit generation based on specific nanodevices: superparamagnetic tunnel junctions. We experimentally demonstrate high-quality random bit generation that represents an orders-of-magnitude improvement in energy efficiency over current solutions. We show that the random generation speed improves with nanodevice scaling, and we investigate the impact of temperature, magnetic field, and cross talk. Finally, we show how alternative computing schemes can be implemented using superparamagentic tunnel junctions as random number generators. These results open the way for fabricating efficient hardware computing devices leveraging stochasticity, and they highlight an alternative use for emerging nanodevices.
Petascale Many Body Methods for Complex Correlated Systems
NASA Astrophysics Data System (ADS)
Pruschke, Thomas
2012-02-01
Correlated systems constitute an important class of materials in modern condensed matter physics. Correlation among electrons are at the heart of all ordering phenomena and many intriguing novel aspects, such as quantum phase transitions or topological insulators, observed in a variety of compounds. Yet, theoretically describing these phenomena is still a formidable task, even if one restricts the models used to the smallest possible set of degrees of freedom. Here, modern computer architectures play an essential role, and the joint effort to devise efficient algorithms and implement them on state-of-the art hardware has become an extremely active field in condensed-matter research. To tackle this task single-handed is quite obviously not possible. The NSF-OISE funded PIRE collaboration ``Graduate Education and Research in Petascale Many Body Methods for Complex Correlated Systems'' is a successful initiative to bring together leading experts around the world to form a virtual international organization for addressing these emerging challenges and educate the next generation of computational condensed matter physicists. The collaboration includes research groups developing novel theoretical tools to reliably and systematically study correlated solids, experts in efficient computational algorithms needed to solve the emerging equations, and those able to use modern heterogeneous computer architectures to make then working tools for the growing community.
A global distributed storage architecture
NASA Technical Reports Server (NTRS)
Lionikis, Nemo M.; Shields, Michael F.
1996-01-01
NSA architects and planners have come to realize that to gain the maximum benefit from, and keep pace with, emerging technologies, we must move to a radically different computing architecture. The compute complex of the future will be a distributed heterogeneous environment, where, to a much greater extent than today, network-based services are invoked to obtain resources. Among the rewards of implementing the services-based view are that it insulates the user from much of the complexity of our multi-platform, networked, computer and storage environment and hides its diverse underlying implementation details. In this paper, we will describe one of the fundamental services being built in our envisioned infrastructure; a global, distributed archive with near-real-time access characteristics. Our approach for adapting mass storage services to this infrastructure will become clear as the service is discussed.
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.; ...
2011-03-02
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broadmore » range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.« less
A Nanotechnology-Ready Computing Scheme based on a Weakly Coupled Oscillator Network
NASA Astrophysics Data System (ADS)
Vodenicarevic, Damir; Locatelli, Nicolas; Abreu Araujo, Flavio; Grollier, Julie; Querlioz, Damien
2017-03-01
With conventional transistor technologies reaching their limits, alternative computing schemes based on novel technologies are currently gaining considerable interest. Notably, promising computing approaches have proposed to leverage the complex dynamics emerging in networks of coupled oscillators based on nanotechnologies. The physical implementation of such architectures remains a true challenge, however, as most proposed ideas are not robust to nanotechnology devices’ non-idealities. In this work, we propose and investigate the implementation of an oscillator-based architecture, which can be used to carry out pattern recognition tasks, and which is tailored to the specificities of nanotechnologies. This scheme relies on a weak coupling between oscillators, and does not require a fine tuning of the coupling values. After evaluating its reliability under the severe constraints associated to nanotechnologies, we explore the scalability of such an architecture, suggesting its potential to realize pattern recognition tasks using limited resources. We show that it is robust to issues like noise, variability and oscillator non-linearity. Defining network optimization design rules, we show that nano-oscillator networks could be used for efficient cognitive processing.
A Nanotechnology-Ready Computing Scheme based on a Weakly Coupled Oscillator Network.
Vodenicarevic, Damir; Locatelli, Nicolas; Abreu Araujo, Flavio; Grollier, Julie; Querlioz, Damien
2017-03-21
With conventional transistor technologies reaching their limits, alternative computing schemes based on novel technologies are currently gaining considerable interest. Notably, promising computing approaches have proposed to leverage the complex dynamics emerging in networks of coupled oscillators based on nanotechnologies. The physical implementation of such architectures remains a true challenge, however, as most proposed ideas are not robust to nanotechnology devices' non-idealities. In this work, we propose and investigate the implementation of an oscillator-based architecture, which can be used to carry out pattern recognition tasks, and which is tailored to the specificities of nanotechnologies. This scheme relies on a weak coupling between oscillators, and does not require a fine tuning of the coupling values. After evaluating its reliability under the severe constraints associated to nanotechnologies, we explore the scalability of such an architecture, suggesting its potential to realize pattern recognition tasks using limited resources. We show that it is robust to issues like noise, variability and oscillator non-linearity. Defining network optimization design rules, we show that nano-oscillator networks could be used for efficient cognitive processing.
A Nanotechnology-Ready Computing Scheme based on a Weakly Coupled Oscillator Network
Vodenicarevic, Damir; Locatelli, Nicolas; Abreu Araujo, Flavio; Grollier, Julie; Querlioz, Damien
2017-01-01
With conventional transistor technologies reaching their limits, alternative computing schemes based on novel technologies are currently gaining considerable interest. Notably, promising computing approaches have proposed to leverage the complex dynamics emerging in networks of coupled oscillators based on nanotechnologies. The physical implementation of such architectures remains a true challenge, however, as most proposed ideas are not robust to nanotechnology devices’ non-idealities. In this work, we propose and investigate the implementation of an oscillator-based architecture, which can be used to carry out pattern recognition tasks, and which is tailored to the specificities of nanotechnologies. This scheme relies on a weak coupling between oscillators, and does not require a fine tuning of the coupling values. After evaluating its reliability under the severe constraints associated to nanotechnologies, we explore the scalability of such an architecture, suggesting its potential to realize pattern recognition tasks using limited resources. We show that it is robust to issues like noise, variability and oscillator non-linearity. Defining network optimization design rules, we show that nano-oscillator networks could be used for efficient cognitive processing. PMID:28322262
NASA Astrophysics Data System (ADS)
Shi, X.
2015-12-01
As NSF indicated - "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and geosciences. With the exponential growth of geodata, the challenge of scalable and high performance computing for big data analytics become urgent because many research activities are constrained by the inability of software or tool that even could not complete the computation process. Heterogeneous geodata integration and analytics obviously magnify the complexity and operational time frame. Many large-scale geospatial problems may be not processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) Architecture and Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions to employ massive parallelism and hardware resources to achieve scalability and high performance for data intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environment to achieve the capability for scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high-performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise to achieve scalability and high performance by exploiting task and data levels of parallelism that are not supported by the conventional computing systems. Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data as proved by our prior works, while the potential of such advanced infrastructure remains unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
Universal Design: Implications for Computing Education
ERIC Educational Resources Information Center
Burgstahler, Sheryl
2011-01-01
Universal design (UD), a concept that grew from the field of architecture, has recently emerged as a paradigm for designing instructional methods, curriculum, and assessments that are welcoming and accessible to students with a wide range of characteristics, including those related to race, ethnicity, native language, gender, age, and disability.…
PIC codes for plasma accelerators on emerging computer architectures (GPUS, Multicore/Manycore CPUS)
NASA Astrophysics Data System (ADS)
Vincenti, Henri
2016-03-01
The advent of exascale computers will enable 3D simulations of a new laser-plasma interaction regimes that were previously out of reach of current Petasale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potentialities of these new computing architectures. Indeed, achieving Exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy consuming part of an algorithm future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute nodes (''fat nodes'') that will have a reduced clock speed to allow for efficient cooling. To compensate for frequency decrease, CPU machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPU's also have a reduced clock speed per core and can process Multiple Instructions on Multiple Datas (MIMD). At the software level Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for Multicore/Manycore CPU) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high performance skeleton PIC code PICSAR to both achieve good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the Pseudo-sepctral quasi-cylindrical code FBPIC on GPUs using the Numba python compiler.
Multimedia And Internetworking Architecture Infrastructure On Interactive E-Learning System
NASA Astrophysics Data System (ADS)
Indah, K. A. T.; Sukarata, G.
2018-01-01
Interactive e-learning is a distance learning method that involves information technology, electronic system or computer as one means of learning system used for teaching and learning process that is implemented without having face to face directly between teacher and student. A strong dependence on emerging technologies greatly influences the way in which the architecture is designed to produce a powerful interactive e-learning network. In this paper analyzed an architecture model where learning can be done interactively, involving many participants (N-way synchronized distance learning) using video conferencing technology. Also used broadband internet network as well as multicast techniques as a troubleshooting method for bandwidth usage can be efficient.
NASA Astrophysics Data System (ADS)
Neff, John A.
1989-12-01
Experiments originating from Gestalt psychology have shown that representing information in a symbolic form provides a more effective means to understanding. Computer scientists have been struggling for the last two decades to determine how best to create, manipulate, and store collections of symbolic structures. In the past, much of this struggling led to software innovations because that was the path of least resistance. For example, the development of heuristics for organizing the searching through knowledge bases was much less expensive than building massively parallel machines that could search in parallel. That is now beginning to change with the emergence of parallel architectures which are showing the potential for handling symbolic structures. This paper will review the relationships between symbolic computing and parallel computing architectures, and will identify opportunities for optics to significantly impact the performance of such computing machines. Although neural networks are an exciting subset of massively parallel computing structures, this paper will not touch on this area since it is receiving a great deal of attention in the literature. That is, the concepts presented herein do not consider the distributed representation of knowledge.
NASA Astrophysics Data System (ADS)
Ho, Wan Ching; Dautenhahn, Kerstin; Nehaniv, Chrystopher
2008-03-01
In this paper, we discuss the concept of autobiographic agent and how memory may extend an agent's temporal horizon and increase its adaptability. These concepts are applied to an implementation of a scenario where agents are interacting in a complex virtual artificial life environment. We present computational memory architectures for autobiographic virtual agents that enable agents to retrieve meaningful information from their dynamic memories which increases their adaptation and survival in the environment. The design of the memory architectures, the agents, and the virtual environment are described in detail. Next, a series of experimental studies and their results are presented which show the adaptive advantage of autobiographic memory, i.e. from remembering significant experiences. Also, in a multi-agent scenario where agents can communicate via stories based on their autobiographic memory, it is found that new adaptive behaviours can emerge from an individual's reinterpretation of experiences received from other agents whereby higher communication frequency yields better group performance. An interface is described that visualises the memory contents of an agent. From an observer perspective, the agents' behaviours can be understood as individually structured, and temporally grounded, and, with the communication of experience, can be seen to rely on emergent mixed narrative reconstructions combining the experiences of several agents. This research leads to insights into how bottom-up story-telling and autobiographic reconstruction in autonomous, adaptive agents allow temporally grounded behaviour to emerge. The article concludes with a discussion of possible implications of this research direction for future autobiographic, narrative agents.
High-performance computing with quantum processing units
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...
2017-03-01
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for func- tional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well asmore » a loose integration that as- sumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.« less
High-performance computing with quantum processing units
DOE Office of Scientific and Technical Information (OSTI.GOV)
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for func- tional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well asmore » a loose integration that as- sumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.« less
Dependency graph for code analysis on emerging architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shashkov, Mikhail Jurievich; Lipnikov, Konstantin
Direct acyclic dependency (DAG) graph is becoming the standard for modern multi-physics codes.The ideal DAG is the true block-scheme of a multi-physics code. Therefore, it is the convenient object for insitu analysis of the cost of computations and algorithmic bottlenecks related to statistical frequent data motion and dymanical machine state.
Towards Energy-Performance Trade-off Analysis of Parallel Applications
ERIC Educational Resources Information Center
Korthikanti, Vijay Anand Reddy
2011-01-01
Energy consumption by computer systems has emerged as an important concern, both at the level of individual devices (limited battery capacity in mobile systems) and at the societal level (the production of Green House Gases). In parallel architectures, applications may be executed on a variable number of cores and these cores may operate at…
Integration of Wireless Technologies in Smart University Campus Environment: Framework Architecture
ERIC Educational Resources Information Center
Khamayseh, Yaser; Mardini, Wail; Aljawarneh, Shadi; Yassein, Muneer Bani
2015-01-01
In this paper, the authors are particularly interested in enhancing the education process by integrating new tools to the teaching environments. This enhancement is part of an emerging concept, called smart campus. Smart University Campus will come up with a new ubiquitous computing and communication field and change people's lives radically by…
FPGA-Based Laboratory Assignments for NoC-Based Manycore Systems
ERIC Educational Resources Information Center
Ttofis, C.; Theocharides, T.; Michael, M. K.
2012-01-01
Manycore systems have emerged as being one of the dominant architectural trends in next-generation computer systems. These highly parallel systems are expected to be interconnected via packet-based networks-on-chip (NoC). The complexity of such systems poses novel and exciting challenges in academia, as teaching their design requires the students…
NASA Technical Reports Server (NTRS)
Bhasin, Kul; Hayden, Jeffrey L.
2005-01-01
For human and robotic exploration missions in the Vision for Exploration, roadmaps are needed for capability development and investments based on advanced technology developments. A roadmap development process was undertaken for the needed communications, and networking capabilities and technologies for the future human and robotics missions. The underlying processes are derived from work carried out during development of the future space communications architecture, an d NASA's Space Architect Office (SAO) defined formats and structures for accumulating data. Interrelationships were established among emerging requirements, the capability analysis and technology status, and performance data. After developing an architectural communications and networking framework structured around the assumed needs for human and robotic exploration, in the vicinity of Earth, Moon, along the path to Mars, and in the vicinity of Mars, information was gathered from expert participants. This information was used to identify the capabilities expected from the new infrastructure and the technological gaps in the way of obtaining them. We define realistic, long-term space communication architectures based on emerging needs and translate the needs into interfaces, functions, and computer processing that will be required. In developing our roadmapping process, we defined requirements for achieving end-to-end activities that will be carried out by future NASA human and robotic missions. This paper describes: 10 the architectural framework developed for analysis; 2) our approach to gathering and analyzing data from NASA, industry, and academia; 3) an outline of the technology research to be done, including milestones for technology research and demonstrations with timelines; and 4) the technology roadmaps themselves.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Runnels, Scott Robert; Bachrach, Harrison Ian; Carlson, Nils
The two primary purposes of LANL’s Computational Physics Student Summer Workshop are (1) To educate graduate and exceptional undergraduate students in the challenges and applications of computational physics of interest to LANL, and (2) Entice their interest toward those challenges. Computational physics is emerging as a discipline in its own right, combining expertise in mathematics, physics, and computer science. The mathematical aspects focus on numerical methods for solving equations on the computer as well as developing test problems with analytical solutions. The physics aspects are very broad, ranging from low-temperature material modeling to extremely high temperature plasma physics, radiation transportmore » and neutron transport. The computer science issues are concerned with matching numerical algorithms to emerging architectures and maintaining the quality of extremely large codes built to perform multi-physics calculations. Although graduate programs associated with computational physics are emerging, it is apparent that the pool of U.S. citizens in this multi-disciplinary field is relatively small and is typically not focused on the aspects that are of primary interest to LANL. Furthermore, more structured foundations for LANL interaction with universities in computational physics is needed; historically interactions rely heavily on individuals’ personalities and personal contacts. Thus a tertiary purpose of the Summer Workshop is to build an educational network of LANL researchers, university professors, and emerging students to advance the field and LANL’s involvement in it.« less
Reprogrammable logic in memristive crossbar for in-memory computing
NASA Astrophysics Data System (ADS)
Cheng, Long; Zhang, Mei-Yun; Li, Yi; Zhou, Ya-Xiong; Wang, Zhuo-Rui; Hu, Si-Yu; Long, Shi-Bing; Liu, Ming; Miao, Xiang-Shui
2017-12-01
Memristive stateful logic has emerged as a promising next-generation in-memory computing paradigm to address escalating computing-performance pressures in traditional von Neumann architecture. Here, we present a nonvolatile reprogrammable logic method that can process data between different rows and columns in a memristive crossbar array based on material implication (IMP) logic. Arbitrary Boolean logic can be executed with a reprogrammable cell containing four memristors in a crossbar array. In the fabricated Ti/HfO2/W memristive array, some fundamental functions, such as universal NAND logic and data transfer, were experimentally implemented. Moreover, using eight memristors in a 2 × 4 array, a one-bit full adder was theoretically designed and verified by simulation to exhibit the feasibility of our method to accomplish complex computing tasks. In addition, some critical logic-related performances were further discussed, such as the flexibility of data processing, cascading problem and bit error rate. Such a method could be a step forward in developing IMP-based memristive nonvolatile logic for large-scale in-memory computing architecture.
Sengupta, Abhronil; Shim, Yong; Roy, Kaushik
2016-12-01
Non-Boolean computing based on emerging post-CMOS technologies can potentially pave the way for low-power neural computing platforms. However, existing work on such emerging neuromorphic architectures have either focused on solely mimicking the neuron, or the synapse functionality. While memristive devices have been proposed to emulate biological synapses, spintronic devices have proved to be efficient at performing the thresholding operation of the neuron at ultra-low currents. In this work, we propose an All-Spin Artificial Neural Network where a single spintronic device acts as the basic building block of the system. The device offers a direct mapping to synapse and neuron functionalities in the brain while inter-layer network communication is accomplished via CMOS transistors. To the best of our knowledge, this is the first demonstration of a neural architecture where a single nanoelectronic device is able to mimic both neurons and synapses. The ultra-low voltage operation of low resistance magneto-metallic neurons enables the low-voltage operation of the array of spintronic synapses, thereby leading to ultra-low power neural architectures. Device-level simulations, calibrated to experimental results, was used to drive the circuit and system level simulations of the neural network for a standard pattern recognition problem. Simulation studies indicate energy savings by ∼ 100× in comparison to a corresponding digital/analog CMOS neuron implementation.
Distributed virtual environment for emergency medical training
NASA Astrophysics Data System (ADS)
Stytz, Martin R.; Banks, Sheila B.; Garcia, Brian W.; Godsell-Stytz, Gayl M.
1997-07-01
In many professions where individuals must work in a team in a high stress environment to accomplish a time-critical task, individual and team performance can benefit from joint training using distributed virtual environments (DVEs). One professional field that lacks but needs a high-fidelity team training environment is the field of emergency medicine. Currently, emergency department (ED) medical personnel train by using words to create a metal picture of a situation for the physician and staff, who then cooperate to solve the problems portrayed by the word picture. The need in emergency medicine for realistic virtual team training is critical because ED staff typically encounter rarely occurring but life threatening situations only once in their careers and because ED teams currently have no realistic environment in which to practice their team skills. The resulting lack of experience and teamwork makes diagnosis and treatment more difficult. Virtual environment based training has the potential to redress these shortfalls. The objective of our research is to develop a state-of-the-art virtual environment for emergency medicine team training. The virtual emergency room (VER) allows ED physicians and medical staff to realistically prepare for emergency medical situations by performing triage, diagnosis, and treatment on virtual patients within an environment that provides them with the tools they require and the team environment they need to realistically perform these three tasks. There are several issues that must be addressed before this vision is realized. The key issues deal with distribution of computations; the doctor and staff interface to the virtual patient and ED equipment; the accurate simulation of individual patient organs' response to injury, medication, and treatment; and an accurate modeling of the symptoms and appearance of the patient while maintaining a real-time interaction capability. Our ongoing work addresses all of these issues. In this paper we report on our prototype VER system and its distributed system architecture for an emergency department distributed virtual environment for emergency medical staff training. The virtual environment enables emergency department physicians and staff to develop their diagnostic and treatment skills using the virtual tools they need to perform diagnostic and treatment tasks. Virtual human imagery, and real-time virtual human response are used to create the virtual patient and present a scenario. Patient vital signs are available to the emergency department team as they manage the virtual case. The work reported here consists of the system architectures we developed for the distributed components of the virtual emergency room. The architectures we describe consist of the network level architecture as well as the software architecture for each actor within the virtual emergency room. We describe the role of distributed interactive simulation and other enabling technologies within the virtual emergency room project.
Cournède, Paul-Henry; Mathieu, Amélie; Houllier, François; Barthélémy, Daniel; de Reffye, Philippe
2008-01-01
Background and Aims The dynamical system of plant growth GREENLAB was originally developed for individual plants, without explicitly taking into account interplant competition for light. Inspired by the competition models developed in the context of forest science for mono-specific stands, we propose to adapt the method of crown projection onto the x–y plane to GREENLAB, in order to study the effects of density on resource acquisition and on architectural development. Methods The empirical production equation of GREENLAB is extrapolated to stands by computing the exposed photosynthetic foliage area of each plant. The computation is based on the combination of Poisson models of leaf distribution for all the neighbouring plants whose crown projection surfaces overlap. To study the effects of density on architectural development, we link the proposed competition model to the model of interaction between functional growth and structural development introduced by Mathieu (2006, PhD Thesis, Ecole Centrale de Paris, France). Key Results and Conclusions The model is applied to mono-specific field crops and forest stands. For high-density crops at full cover, the model is shown to be equivalent to the classical equation of field crop production ( Howell and Musick, 1985, in Les besoins en eau des cultures; Paris: INRA Editions). However, our method is more accurate at the early stages of growth (before cover) or in the case of intermediate densities. It may potentially account for local effects, such as uneven spacing, variation in the time of plant emergence or variation in seed biomass. The application of the model to trees illustrates the expression of plant plasticity in response to competition for light. Density strongly impacts on tree architectural development through interactions with the source–sink balances during growth. The effects of density on tree height and radial growth that are commonly observed in real stands appear as emerging properties of the model. PMID:18037666
Cournède, Paul-Henry; Mathieu, Amélie; Houllier, François; Barthélémy, Daniel; de Reffye, Philippe
2008-05-01
The dynamical system of plant growth GREENLAB was originally developed for individual plants, without explicitly taking into account interplant competition for light. Inspired by the competition models developed in the context of forest science for mono-specific stands, we propose to adapt the method of crown projection onto the x-y plane to GREENLAB, in order to study the effects of density on resource acquisition and on architectural development. The empirical production equation of GREENLAB is extrapolated to stands by computing the exposed photosynthetic foliage area of each plant. The computation is based on the combination of Poisson models of leaf distribution for all the neighbouring plants whose crown projection surfaces overlap. To study the effects of density on architectural development, we link the proposed competition model to the model of interaction between functional growth and structural development introduced by Mathieu (2006, PhD Thesis, Ecole Centrale de Paris, France). The model is applied to mono-specific field crops and forest stands. For high-density crops at full cover, the model is shown to be equivalent to the classical equation of field crop production (Howell and Musick, 1985, in Les besoins en eau des cultures; Paris: INRA Editions). However, our method is more accurate at the early stages of growth (before cover) or in the case of intermediate densities. It may potentially account for local effects, such as uneven spacing, variation in the time of plant emergence or variation in seed biomass. The application of the model to trees illustrates the expression of plant plasticity in response to competition for light. Density strongly impacts on tree architectural development through interactions with the source-sink balances during growth. The effects of density on tree height and radial growth that are commonly observed in real stands appear as emerging properties of the model.
Parallel algorithm for computation of second-order sequential best rotations
NASA Astrophysics Data System (ADS)
Redif, Soydan; Kasap, Server
2013-12-01
Algorithms for computing an approximate polynomial matrix eigenvalue decomposition of para-Hermitian systems have emerged as a powerful, generic signal processing tool. A technique that has shown much success in this regard is the sequential best rotation (SBR2) algorithm. Proposed is a scheme for parallelising SBR2 with a view to exploiting the modern architectural features and inherent parallelism of field-programmable gate array (FPGA) technology. Experiments show that the proposed scheme can achieve low execution times while requiring minimal FPGA resources.
Genten: Software for Generalized Tensor Decompositions v. 1.0.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phipps, Eric T.; Kolda, Tamara G.; Dunlavy, Daniel
Tensors, or multidimensional arrays, are a powerful mathematical means of describing multiway data. This software provides computational means for decomposing or approximating a given tensor in terms of smaller tensors of lower dimension, focusing on decomposition of large, sparse tensors. These techniques have applications in many scientific areas, including signal processing, linear algebra, computer vision, numerical analysis, data mining, graph analysis, neuroscience and more. The software is designed to take advantage of parallelism present emerging computer architectures such has multi-core CPUs, many-core accelerators such as the Intel Xeon Phi, and computation-oriented GPUs to enable efficient processing of large tensors.
ActiveTutor: Towards More Adaptive Features in an E-Learning Framework
ERIC Educational Resources Information Center
Fournier, Jean-Pierre; Sansonnet, Jean-Paul
2008-01-01
Purpose: This paper aims to sketch the emerging notion of auto-adaptive software when applied to e-learning software. Design/methodology/approach: The study and the implementation of the auto-adaptive architecture are based on the operational framework "ActiveTutor" that is used for teaching the topic of computer science programming in first-grade…
A review of emerging non-volatile memory (NVM) technologies and applications
NASA Astrophysics Data System (ADS)
Chen, An
2016-11-01
This paper will review emerging non-volatile memory (NVM) technologies, with the focus on phase change memory (PCM), spin-transfer-torque random-access-memory (STTRAM), resistive random-access-memory (RRAM), and ferroelectric field-effect-transistor (FeFET) memory. These promising NVM devices are evaluated in terms of their advantages, challenges, and applications. Their performance is compared based on reported parameters of major industrial test chips. Memory selector devices and cell structures are discussed. Changing market trends toward low power (e.g., mobile, IoT) and data-centric applications create opportunities for emerging NVMs. High-performance and low-cost emerging NVMs may simplify memory hierarchy, introduce non-volatility in logic gates and circuits, reduce system power, and enable novel architectures. Storage-class memory (SCM) based on high-density NVMs could fill the performance and density gap between memory and storage. Some unique characteristics of emerging NVMs can be utilized for novel applications beyond the memory space, e.g., neuromorphic computing, hardware security, etc. In the beyond-CMOS era, emerging NVMs have the potential to fulfill more important functions and enable more efficient, intelligent, and secure computing systems.
A Cloud Robotics Based Service for Managing RPAS in Emergency, Rescue and Hazardous Scenarios
NASA Astrophysics Data System (ADS)
Silvagni, Mario; Chiaberge, Marcello; Sanguedolce, Claudio; Dara, Gianluca
2016-04-01
Cloud robotics and cloud services are revolutionizing not only the ICT world but also the robotics industry, giving robots more computing capabilities, storage and connection bandwidth while opening new scenarios that blend the physical to the digital world. In this vision, new IT architectures are required to manage robots, retrieve data from them and create services to interact with users. Among all the robots this work is mainly focused on flying robots, better known as drones, UAV (Unmanned Aerial Vehicle) or RPAS (Remotely Piloted Aircraft Systems). The cloud robotics approach shifts the concept of having a single local "intelligence" for every single UAV, as a unique device that carries out onboard all the computation and storage processes, to a more powerful "centralized brain" located in the cloud. This breakthrough opens new scenarios where UAVs are agents, relying on remote servers for most of their computational load and data storage, creating a network of devices where they can share knowledge and information. Many applications, using UAVs, are growing as interesting and suitable devices for environment monitoring. Many services can be build fetching data from UAVs, such as telemetry, video streaming, pictures or sensors data; once. These services, part of the IT architecture, can be accessed via web by other devices or shared with other UAVs. As test cases of the proposed architecture, two examples are reported. In the first one a search and rescue or emergency management, where UAVs are required for monitoring intervention, is shown. In case of emergency or aggression, the user requests the emergency service from the IT architecture, providing GPS coordinates and an identification number. The IT architecture uses a UAV (choosing among the available one according to distance, service status, etc.) to reach him/her for monitoring and support operations. In the meantime, an officer will use the service to see the current position of the UAV, its telemetry and video streaming from its camera. Data are stored for further use and documentation and can be shared to all the involved personal or services. The second case refer to imaging survey. An investigation area is selected using a map or a set of coordinates by a user that can be on the field on in a management facility. The cloud system elaborate this data and automatically compute a flight plan that consider the survey data requirements (i.e: picture ground resolution, overlapping) but also several environment constraints (i.e: no fly zones, possible hazardous areas, known obstacles, etc). Once the flight plan is loaded in the selected UAV the mission starts. During the mission, if a suitable data network coverage is available, the UAV transmit acquired images (typically low quality image to limit bandwidth) and shooting pose in order to perform a preliminary check during the mission and minimize failing in survey; if not, all data are uploaded asynchronously after the mission. The cloud servers perform all the tasks related to image processing (mosaic, ortho-photo, geo-referencing, 3D models) and data management.
Emotions are emergent processes: they require a dynamic computational architecture
Scherer, Klaus R.
2009-01-01
Emotion is a cultural and psychobiological adaptation mechanism which allows each individual to react flexibly and dynamically to environmental contingencies. From this claim flows a description of the elements theoretically needed to construct a virtual agent with the ability to display human-like emotions and to respond appropriately to human emotional expression. This article offers a brief survey of the desirable features of emotion theories that make them ideal blueprints for agent models. In particular, the component process model of emotion is described, a theory which postulates emotion-antecedent appraisal on different levels of processing that drive response system patterning predictions. In conclusion, investing seriously in emergent computational modelling of emotion using a nonlinear dynamic systems approach is suggested. PMID:19884141
DOE Office of Scientific and Technical Information (OSTI.GOV)
Runnels, Scott Robert; Caldwell, Wendy; Brown, Barton Jed
The two primary purposes of LANL’s Computational Physics Student Summer Workshop are (1) To educate graduate and exceptional undergraduate students in the challenges and applications of computational physics of interest to LANL, and (2) Entice their interest toward those challenges. Computational physics is emerging as a discipline in its own right, combining expertise in mathematics, physics, and computer science. The mathematical aspects focus on numerical methods for solving equations on the computer as well as developing test problems with analytical solutions. The physics aspects are very broad, ranging from low-temperature material modeling to extremely high temperature plasma physics, radiation transportmore » and neutron transport. The computer science issues are concerned with matching numerical algorithms to emerging architectures and maintaining the quality of extremely large codes built to perform multi-physics calculations. Although graduate programs associated with computational physics are emerging, it is apparent that the pool of U.S. citizens in this multi-disciplinary field is relatively small and is typically not focused on the aspects that are of primary interest to LANL. Furthermore, more structured foundations for LANL interaction with universities in computational physics is needed; historically interactions rely heavily on individuals’ personalities and personal contacts. Thus a tertiary purpose of the Summer Workshop is to build an educational network of LANL researchers, university professors, and emerging students to advance the field and LANL’s involvement in it. This report includes both the background for the program and the reports from the students.« less
Borresen, Jon; Lynch, Stephen
2012-01-01
In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principle components, however, with the development of the transistor, such oscillator based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory.
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2007-01-01
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientificmore » study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
Digital quantum simulators in a scalable architecture of hybrid spin-photon qubits
Chiesa, Alessandro; Santini, Paolo; Gerace, Dario; Raftery, James; Houck, Andrew A.; Carretta, Stefano
2015-01-01
Resolving quantum many-body problems represents one of the greatest challenges in physics and physical chemistry, due to the prohibitively large computational resources that would be required by using classical computers. A solution has been foreseen by directly simulating the time evolution through sequences of quantum gates applied to arrays of qubits, i.e. by implementing a digital quantum simulator. Superconducting circuits and resonators are emerging as an extremely promising platform for quantum computation architectures, but a digital quantum simulator proposal that is straightforwardly scalable, universal, and realizable with state-of-the-art technology is presently lacking. Here we propose a viable scheme to implement a universal quantum simulator with hybrid spin-photon qubits in an array of superconducting resonators, which is intrinsically scalable and allows for local control. As representative examples we consider the transverse-field Ising model, a spin-1 Hamiltonian, and the two-dimensional Hubbard model and we numerically simulate the scheme by including the main sources of decoherence. PMID:26563516
Open Systems Architecture for Command, Control and Communications
1991-07-01
CONTENTS SECTION TITLE PAGE I. EXECUTIVE SUMMARY 5 II. TERMS OF REFERENCE 7 III. PANEL MEMBERSHIP 9 IV. INTRODUCTION 11 V. INDUSTRIAL REVOLUTION 19 VI...INTRODUCTION 18 19 V. INDUSTRIAL REVOLUTION 20 21 Initial manifestations of computer and communications standards emerged in the early seventies, largely...SYSTEMS INDUSTRIAL REVOLUTION Application Presentation Session Transport Internet Data Link Physical Application Presentation Session Transport
Crowd Simulation Incorporating Agent Psychological Models, Roles and Communication
2005-01-01
system (PMFserv) that implements human behavior models from a range of ability, stress, emotion , decision theoretic and motivation sources. An...autonomous agents, human behavior models, culture and emotions 1. Introduction There are many applications of computer animation and simulation where...We describe a new architecture to integrate a psychological model into a crowd simulation system in order to obtain believable emergent behaviors
ERIC Educational Resources Information Center
Stamas, Paul J.
2013-01-01
This case study research followed the two-year transition of a medium-sized manufacturing firm towards a service-oriented enterprise. A service-oriented enterprise is an emerging architecture of the firm that leverages the paradigm of services computing to integrate the capabilities of the firm with the complementary competencies of business…
Computers in Academic Architecture Libraries.
ERIC Educational Resources Information Center
Willis, Alfred; And Others
1992-01-01
Computers are widely used in architectural research and teaching in U.S. schools of architecture. A survey of libraries serving these schools sought information on the emphasis placed on computers by the architectural curriculum, accessibility of computers to library staff, and accessibility of computers to library patrons. Survey results and…
LVFS: A Big Data File Storage Bridge for the HPC Community
NASA Astrophysics Data System (ADS)
Golpayegani, N.; Halem, M.; Mauoka, E.; Fonseca, L. F.
2015-12-01
Merging Big Data capabilities into High Performance Computing architecture starts at the file storage level. Heterogeneous storage systems are emerging which offer enhanced features for dealing with Big Data such as the IBM GPFS storage system's integration into Hadoop Map-Reduce. Taking advantage of these capabilities requires file storage systems to be adaptive and accommodate these new storage technologies. We present the extension of the Lightweight Virtual File System (LVFS) currently running as the production system for the MODIS Level 1 and Atmosphere Archive and Distribution System (LAADS) to incorporate a flexible plugin architecture which allows easy integration of new HPC hardware and/or software storage technologies without disrupting workflows, system architectures and only minimal impact on existing tools. We consider two essential aspects provided by the LVFS plugin architecture needed for the future HPC community. First, it allows for the seamless integration of new and emerging hardware technologies which are significantly different than existing technologies such as Segate's Kinetic disks and Intel's 3DXPoint non-volatile storage. Second is the transparent and instantaneous conversion between new software technologies and various file formats. With most current storage system a switch in file format would require costly reprocessing and nearly doubling of storage requirements. We will install LVFS on UMBC's IBM iDataPlex cluster with a heterogeneous storage architecture utilizing local, remote, and Seagate Kinetic storage as a case study. LVFS merges different kinds of storage architectures to show users a uniform layout and, therefore, prevent any disruption in workflows, architecture design, or tool usage. We will show how LVFS will convert HDF data produced by applying machine learning algorithms to Xco2 Level 2 data from the OCO-2 satellite to produce CO2 surface fluxes into GeoTIFF for visualization.
Using Computing and Data Grids for Large-Scale Science and Engineering
NASA Technical Reports Server (NTRS)
Johnston, William E.
2001-01-01
We use the term "Grid" to refer to a software system that provides uniform and location independent access to geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. These emerging data and computing Grids promise to provide a highly capable and scalable environment for addressing large-scale science problems. We describe the requirements for science Grids, the resulting services and architecture of NASA's Information Power Grid (IPG) and DOE's Science Grid, and some of the scaling issues that have come up in their implementation.
Mesoscopic modelling and simulation of soft matter.
Schiller, Ulf D; Krüger, Timm; Henrich, Oliver
2017-12-20
The deformability of soft condensed matter often requires modelling of hydrodynamical aspects to gain quantitative understanding. This, however, requires specialised methods that can resolve the multiscale nature of soft matter systems. We review a number of the most popular simulation methods that have emerged, such as Langevin dynamics, dissipative particle dynamics, multi-particle collision dynamics, sometimes also referred to as stochastic rotation dynamics, and the lattice-Boltzmann method. We conclude this review with a short glance at current compute architectures for high-performance computing and community codes for soft matter simulation.
Picture archiving and computing systems: the key to enterprise digital imaging.
Krohn, Richard
2002-09-01
The utopian view of the electronic medical record includes the digital transformation of all aspects of patient information. Historically, imagery from the radiology, cardiology, ophthalmology, and pathology departments, as well as the emergency room, has been a morass of paper, film, and other media, isolated within each department's system architecture. In answer to this dilemma, picture archiving and computing systems have become the focal point of efforts to create a single platform for the collection, storage, and distribution of clinical imagery throughout the health care enterprise.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmalz, Mark S
2011-07-24
Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation {und G}more » for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping G {yields} {und G}, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphic Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.« less
A Survey Of Techniques for Managing and Leveraging Caches in GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2014-09-01
Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as unique architecture of GPU, rise of CPU–GPU heterogeneous computing, etc., demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges ofmore » cache management in GPUs. The aim of this paper is to provide the readers insights into cache management techniques for GPUs and motivate them to propose even better techniques for leveraging the full potential of caches in the GPUs of tomorrow.« less
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-15
... Aerial Communications Architecture in Response to an Emergency AGENCY: Federal Communications Commission... deployable aerial communications architecture (DACA) in facilitating emergency response by rapidly restoring... copying during normal business hours in the FCC Reference Information Center, Portals II, 445 12th Street...
Borresen, Jon; Lynch, Stephen
2012-01-01
In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principle components, however, with the development of the transistor, such oscillator based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory. PMID:23173034
HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation
NASA Technical Reports Server (NTRS)
Sterling, Thomas; Bergman, Larry
2000-01-01
Computational Aero Sciences and other numeric intensive computation disciplines demand computing throughputs substantially greater than the Teraflops scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that in combination with sufficient resolution and advanced adaptive techniques may force performance requirements towards Petaflops. This will be especially true for compute intensive models such as Navier-Stokes are or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithms techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops scale computing in the 2004/5 timeframe. The Hybrid-Technology, MultiThreaded parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption. The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) and one percent of the power required by convention semiconductor logic. Wave Division Multiplexing optical communications can approach a peak per fiber bandwidth of 1 Tbps and the new Data Vortex network topology employing this technology can connect tens of thousands of ports providing a bi-section bandwidth on the order of a Petabyte per second with latencies well below 100 nanoseconds, even under heavy loads. Processor-in-Memory (PIM) technology combines logic and memory on the same chip exposing the internal bandwidth of the memory row buffers at low latency. And holographic storage photorefractive storage technologies provide high-density memory with access a thousand times faster than conventional disk technologies. Together these technologies enable a new class of shared memory system architecture with a peak performance in the range of a Petaflops but size and power requirements comparable to today's largest Teraflops scale systems. To achieve high-sustained performance, HTMT combines an advanced multithreading processor architecture with a memory-driven coarse-grained latency management strategy called "percolation", yielding high efficiency while reducing the much of the parallel programming burden. This paper will present the basic system architecture characteristics made possible through this series of advanced technologies and then give a detailed description of the new percolation approach to runtime latency management.
Memristor-Based Synapse Design and Training Scheme for Neuromorphic Computing Architecture
2012-06-01
system level built upon the conventional Von Neumann computer architecture [2][3]. Developing the neuromorphic architecture at chip level by...SCHEME FOR NEUROMORPHIC COMPUTING ARCHITECTURE 5a. CONTRACT NUMBER FA8750-11-2-0046 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 62788F 6...creation of memristor-based neuromorphic computing architecture. Rather than the existing crossbar-based neuron network designs, we focus on memristor
Framework for architecture-independent run-time reconfigurable applications
NASA Astrophysics Data System (ADS)
Lehn, David I.; Hudson, Rhett D.; Athanas, Peter M.
2000-10-01
Configurable Computing Machines (CCMs) have emerged as a technology with the computational benefits of custom ASICs as well as the flexibility and reconfigurability of general-purpose microprocessors. Significant effort from the research community has focused on techniques to move this reconfigurability from a rapid application development tool to a run-time tool. This requires the ability to change the hardware design while the application is executing and is known as Run-Time Reconfiguration (RTR). Widespread acceptance of run-time reconfigurable custom computing depends upon the existence of high-level automated design tools. Such tools must reduce the designers effort to port applications between different platforms as the architecture, hardware, and software evolves. A Java implementation of a high-level application framework, called Janus, is presented here. In this environment, developers create Java classes that describe the structural behavior of an application. The framework allows hardware and software modules to be freely mixed and interchanged. A compilation phase of the development process analyzes the structure of the application and adapts it to the target platform. Janus is capable of structuring the run-time behavior of an application to take advantage of the memory and computational resources available.
NASA Astrophysics Data System (ADS)
Pruhs, Kirk
A particularly important emergent technology is heterogeneous processors (or cores), which many computer architects believe will be the dominant architectural design in the future. The main advantage of a heterogeneous architecture, relative to an architecture of identical processors, is that it allows for the inclusion of processors whose design is specialized for particular types of jobs, and for jobs to be assigned to a processor best suited for that job. Most notably, it is envisioned that these heterogeneous architectures will consist of a small number of high-power high-performance processors for critical jobs, and a larger number of lower-power lower-performance processors for less critical jobs. Naturally, the lower-power processors would be more energy efficient in terms of the computation performed per unit of energy expended, and would generate less heat per unit of computation. For a given area and power budget, heterogeneous designs can give significantly better performance for standard workloads. Moreover, even processors that were designed to be homogeneous, are increasingly likely to be heterogeneous at run time: the dominant underlying cause is the increasing variability in the fabrication process as the feature size is scaled down (although run time faults will also play a role). Since manufacturing yields would be unacceptably low if every processor/core was required to be perfect, and since there would be significant performance loss from derating the entire chip to the functioning of the least functional processor (which is what would be required in order to attain processor homogeneity), some processor heterogeneity seems inevitable in chips with many processors/cores.
An evolutionary algorithm that constructs recurrent neural networks.
Angeline, P J; Saunders, G M; Pollack, J B
1994-01-01
Standard methods for simultaneously inducing the structure and weights of recurrent neural networks limit every task to an assumed class of architectures. Such a simplification is necessary since the interactions between network structure and function are not well understood. Evolutionary computations, which include genetic algorithms and evolutionary programming, are population-based search methods that have shown promise in many similarly complex tasks. This paper argues that genetic algorithms are inappropriate for network acquisition and describes an evolutionary program, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks. GNARL's empirical acquisition method allows for the emergence of complex behaviors and topologies that are potentially excluded by the artificial architectural constraints imposed in standard network induction methods.
Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Oliker, Leonid; Vuduc, Richard
2008-10-16
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore specific-optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) - one of the most heavily used kernels in scientific computing - across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD quad-core, AMD dual-core, and Intel quad-core designs, the heterogeneous STI Cell, as well as one ofmore » the first scientific studies of the highly multithreaded Sun Victoria Falls (a Niagara2 SMP). We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural trade-offs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.« less
NASA Astrophysics Data System (ADS)
Cao, Jian; Li, Qi; Cheng, Jicheng
2005-10-01
This paper discusses the concept, key technologies and main application of Spatial Services Grid. The technologies of Grid computing and Webservice is playing a revolutionary role in studying the spatial information services. The concept of the SSG (Spatial Services Grid) is put forward based on the SIG (Spatial Information Grid) and OGSA (open grid service architecture). Firstly, the grid computing is reviewed and the key technologies of SIG and their main applications are reviewed. Secondly, the grid computing and three kinds of SIG (in broad sense)--SDG (spatial data grid), SIG (spatial information grid) and SSG (spatial services grid) and their relationships are proposed. Thirdly, the key technologies of the SSG (spatial services grid) is put forward. Finally, three representative applications of SSG (spatial services grid) are discussed. The first application is urban location based services gird, which is a typical spatial services grid and can be constructed on OGSA (Open Grid Services Architecture) and digital city platform. The second application is region sustainable development grid which is the key to the urban development. The third application is Region disaster and emergency management services grid.
A Programming Framework for Scientific Applications on CPU-GPU Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Owens, John
2013-03-24
At a high level, my research interests center around designing, programming, and evaluating computer systems that use new approaches to solve interesting problems. The rapid change of technology allows a variety of different architectural approaches to computationally difficult problems, and a constantly shifting set of constraints and trends makes the solutions to these problems both challenging and interesting. One of the most important recent trends in computing has been a move to commodity parallel architectures. This sea change is motivated by the industry’s inability to continue to profitably increase performance on a single processor and instead to move to multiplemore » parallel processors. In the period of review, my most significant work has been leading a research group looking at the use of the graphics processing unit (GPU) as a general-purpose processor. GPUs can potentially deliver superior performance on a broad range of problems than their CPU counterparts, but effectively mapping complex applications to a parallel programming model with an emerging programming environment is a significant and important research problem.« less
Boolean and brain-inspired computing using spin-transfer torque devices
NASA Astrophysics Data System (ADS)
Fan, Deliang
Several completely new approaches (such as spintronic, carbon nanotube, graphene, TFETs, etc.) to information processing and data storage technologies are emerging to address the time frame beyond current Complementary Metal-Oxide-Semiconductor (CMOS) roadmap. The high speed magnetization switching of a nano-magnet due to current induced spin-transfer torque (STT) have been demonstrated in recent experiments. Such STT devices can be explored in compact, low power memory and logic design. In order to truly leverage STT devices based computing, researchers require a re-think of circuit, architecture, and computing model, since the STT devices are unlikely to be drop-in replacements for CMOS. The potential of STT devices based computing will be best realized by considering new computing models that are inherently suited to the characteristics of STT devices, and new applications that are enabled by their unique capabilities, thereby attaining performance that CMOS cannot achieve. The goal of this research is to conduct synergistic exploration in architecture, circuit and device levels for Boolean and brain-inspired computing using nanoscale STT devices. Specifically, we first show that the non-volatile STT devices can be used in designing configurable Boolean logic blocks. We propose a spin-memristor threshold logic (SMTL) gate design, where memristive cross-bar array is used to perform current mode summation of binary inputs and the low power current mode spintronic threshold device carries out the energy efficient threshold operation. Next, for brain-inspired computing, we have exploited different spin-transfer torque device structures that can implement the hard-limiting and soft-limiting artificial neuron transfer functions respectively. We apply such STT based neuron (or 'spin-neuron') in various neural network architectures, such as hierarchical temporal memory and feed-forward neural network, for performing "human-like" cognitive computing, which show more than two orders of lower energy consumption compared to state of the art CMOS implementation. Finally, we show the dynamics of injection locked Spin Hall Effect Spin-Torque Oscillator (SHE-STO) cluster can be exploited as a robust multi-dimensional distance metric for associative computing, image/ video analysis, etc. Our simulation results show that the proposed system architecture with injection locked SHE-STOs and the associated CMOS interface circuits can be suitable for robust and energy efficient associative computing and pattern matching.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arp, J.A.; Bower, J.C.; Burnett, R.A.
The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are corrected via a local areamore » network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication data distribution and notification functionality necessary to operate FEMIS in a networked, client/server environment.« less
Federal Emergency Management Information System (FEMIS), Installation Guide for FEMIS 1.4.6
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arp, J.A.; Burnett, R.A.; Carter, R.J.
The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are corrected via a local areamore » network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication data distribution and notification functionality necessary to operate FEMIS in a networked, client/server environment.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Angel, L.K.; Bower, J.C.; Burnett, R.A.
1999-06-29
The Federal Emergency Management Information System (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the U.S. Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are corrected via a local areamore » network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication data distribution and notification functionality necessary to operate FEMIS in a networked, client/server environment.« less
Behavioral Reference Model for Pervasive Healthcare Systems.
Tahmasbi, Arezoo; Adabi, Sahar; Rezaee, Ali
2016-12-01
The emergence of mobile healthcare systems is an important outcome of application of pervasive computing concepts for medical care purposes. These systems provide the facilities and infrastructure required for automatic and ubiquitous sharing of medical information. Healthcare systems have a dynamic structure and configuration, therefore having an architecture is essential for future development of these systems. The need for increased response rate, problem limited storage, accelerated processing and etc. the tendency toward creating a new generation of healthcare system architecture highlight the need for further focus on cloud-based solutions for transfer data and data processing challenges. Integrity and reliability of healthcare systems are of critical importance, as even the slightest error may put the patients' lives in danger; therefore acquiring a behavioral model for these systems and developing the tools required to model their behaviors are of significant importance. The high-level designs may contain some flaws, therefor the system must be fully examined for different scenarios and conditions. This paper presents a software architecture for development of healthcare systems based on pervasive computing concepts, and then models the behavior of described system. A set of solutions are then proposed to improve the design's qualitative characteristics including, availability, interoperability and performance.
Developing a Distributed Computing Architecture at Arizona State University.
ERIC Educational Resources Information Center
Armann, Neil; And Others
1994-01-01
Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…
Phipps, Eric T.; D'Elia, Marta; Edwards, Harold C.; ...
2017-04-18
In this study, quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in anmore » embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Madduri, Kamesh; Im, Eun-Jin; Ibrahim, Khaled Z.
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broadmore » range of emerging multicore designs, including the recently-released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.« less
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A sofware package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
Parallel computing for probabilistic fatigue analysis
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.
1993-01-01
This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism to keep large number of processors busy and to treat problems with the large memory requirements encountered in practice. We also conclude that distributed-memory architecture is preferable to shared-memory for achieving large scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.
OFMspert: An architecture for an operator's associate that evolves to an intelligent tutor
NASA Technical Reports Server (NTRS)
Mitchell, Christine M.
1991-01-01
With the emergence of new technology for both human-computer interaction and knowledge-based systems, a range of opportunities exist which enhance the effectiveness and efficiency of controllers of high-risk engineering systems. The design of an architecture for an operator's associate is described. This associate is a stand-alone model-based system designed to interact with operators of complex dynamic systems, such as airplanes, manned space systems, and satellite ground control systems in ways comparable to that of a human assistant. The operator function model expert system (OFMspert) architecture and the design and empirical validation of OFMspert's understanding component are described. The design and validation of OFMspert's interactive and control components are also described. A description of current work in which OFMspert provides the foundation in the development of an intelligent tutor that evolves to an assistant, as operator expertise evolves from novice to expert, is provided.
Emerging Security Mechanisms for Medical Cyber Physical Systems.
Kocabas, Ovunc; Soyata, Tolga; Aktas, Mehmet K
2016-01-01
The following decade will witness a surge in remote health-monitoring systems that are based on body-worn monitoring devices. These Medical Cyber Physical Systems (MCPS) will be capable of transmitting the acquired data to a private or public cloud for storage and processing. Machine learning algorithms running in the cloud and processing this data can provide decision support to healthcare professionals. There is no doubt that the security and privacy of the medical data is one of the most important concerns in designing an MCPS. In this paper, we depict the general architecture of an MCPS consisting of four layers: data acquisition, data aggregation, cloud processing, and action. Due to the differences in hardware and communication capabilities of each layer, different encryption schemes must be used to guarantee data privacy within that layer. We survey conventional and emerging encryption schemes based on their ability to provide secure storage, data sharing, and secure computation. Our detailed experimental evaluation of each scheme shows that while the emerging encryption schemes enable exciting new features such as secure sharing and secure computation, they introduce several orders-of-magnitude computational and storage overhead. We conclude our paper by outlining future research directions to improve the usability of the emerging encryption schemes in an MCPS.
Frances: A Tool for Understanding Computer Architecture and Assembly Language
ERIC Educational Resources Information Center
Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh
2012-01-01
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…
Tutorial: Computer architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gajski, D.D.; Milutinovic, V.M.; Siegel, H.J.
1986-01-01
This book presents the state-of-the-art in advanced computer architecture. It deals with the concepts underlying current architectures and covers approaches and techniques being used in the design of advanced computer systems.
Outline of a novel architecture for cortical computation.
Majumdar, Kaushik
2008-03-01
In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.
Opportunities for nonvolatile memory systems in extreme-scale high-performance computing
Vetter, Jeffrey S.; Mittal, Sparsh
2015-01-12
For extreme-scale high-performance computing systems, system-wide power consumption has been identified as one of the key constraints moving forward, where DRAM main memory systems account for about 30 to 50 percent of a node's overall power consumption. As the benefits of device scaling for DRAM memory slow, it will become increasingly difficult to keep memory capacities balanced with increasing computational rates offered by next-generation processors. However, several emerging memory technologies related to nonvolatile memory (NVM) devices are being investigated as an alternative for DRAM. Moving forward, NVM devices could offer solutions for HPC architectures. Researchers are investigating how to integratemore » these emerging technologies into future extreme-scale HPC systems and how to expose these capabilities in the software stack and applications. In addition, current results show several of these strategies could offer high-bandwidth I/O, larger main memory capacities, persistent data structures, and new approaches for application resilience and output postprocessing, such as transaction-based incremental checkpointing and in situ visualization, respectively.« less
Low-power, transparent optical network interface for high bandwidth off-chip interconnects.
Liboiron-Ladouceur, Odile; Wang, Howard; Garg, Ajay S; Bergman, Keren
2009-04-13
The recent emergence of multicore architectures and chip multiprocessors (CMPs) has accelerated the bandwidth requirements in high-performance processors for both on-chip and off-chip interconnects. For next generation computing clusters, the delivery of scalable power efficient off-chip communications to each compute node has emerged as a key bottleneck to realizing the full computational performance of these systems. The power dissipation is dominated by the off-chip interface and the necessity to drive high-speed signals over long distances. We present a scalable photonic network interface approach that fully exploits the bandwidth capacity offered by optical interconnects while offering significant power savings over traditional E/O and O/E approaches. The power-efficient interface optically aggregates electronic serial data streams into a multiple WDM channel packet structure at time-of-flight latencies. We demonstrate a scalable optical network interface with 70% improvement in power efficiency for a complete end-to-end PCI Express data transfer.
A parallel-processing approach to computing for the geographic sciences
Crane, Michael; Steinwand, Dan; Beckmann, Tim; Krpan, Greg; Haga, Jim; Maddox, Brian; Feller, Mark
2001-01-01
The overarching goal of this project is to build a spatially distributed infrastructure for information science research by forming a team of information science researchers and providing them with similar hardware and software tools to perform collaborative research. Four geographically distributed Centers of the U.S. Geological Survey (USGS) are developing their own clusters of low-cost personal computers into parallel computing environments that provide a costeffective way for the USGS to increase participation in the high-performance computing community. Referred to as Beowulf clusters, these hybrid systems provide the robust computing power required for conducting research into various areas, such as advanced computer architecture, algorithms to meet the processing needs for real-time image and data processing, the creation of custom datasets from seamless source data, rapid turn-around of products for emergency response, and support for computationally intense spatial and temporal modeling.
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple- instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
Wilk, S; Michalowski, W; O'Sullivan, D; Farion, K; Sayyad-Shirabad, J; Kuziemsky, C; Kukawka, B
2013-01-01
The purpose of this study was to create a task-based support architecture for developing clinical decision support systems (CDSSs) that assist physicians in making decisions at the point-of-care in the emergency department (ED). The backbone of the proposed architecture was established by a task-based emergency workflow model for a patient-physician encounter. The architecture was designed according to an agent-oriented paradigm. Specifically, we used the O-MaSE (Organization-based Multi-agent System Engineering) method that allows for iterative translation of functional requirements into architectural components (e.g., agents). The agent-oriented paradigm was extended with ontology-driven design to implement ontological models representing knowledge required by specific agents to operate. The task-based architecture allows for the creation of a CDSS that is aligned with the task-based emergency workflow model. It facilitates decoupling of executable components (agents) from embedded domain knowledge (ontological models), thus supporting their interoperability, sharing, and reuse. The generic architecture was implemented as a pilot system, MET3-AE--a CDSS to help with the management of pediatric asthma exacerbation in the ED. The system was evaluated in a hospital ED. The architecture allows for the creation of a CDSS that integrates support for all tasks from the task-based emergency workflow model, and interacts with hospital information systems. Proposed architecture also allows for reusing and sharing system components and knowledge across disease-specific CDSSs.
The Dynamic Brain: From Spiking Neurons to Neural Masses and Cortical Fields
Deco, Gustavo; Jirsa, Viktor K.; Robinson, Peter A.; Breakspear, Michael; Friston, Karl
2008-01-01
The cortex is a complex system, characterized by its dynamics and architecture, which underlie many functions such as action, perception, learning, language, and cognition. Its structural architecture has been studied for more than a hundred years; however, its dynamics have been addressed much less thoroughly. In this paper, we review and integrate, in a unifying framework, a variety of computational approaches that have been used to characterize the dynamics of the cortex, as evidenced at different levels of measurement. Computational models at different space–time scales help us understand the fundamental mechanisms that underpin neural processes and relate these processes to neuroscience data. Modeling at the single neuron level is necessary because this is the level at which information is exchanged between the computing elements of the brain; the neurons. Mesoscopic models tell us how neural elements interact to yield emergent behavior at the level of microcolumns and cortical columns. Macroscopic models can inform us about whole brain dynamics and interactions between large-scale neural systems such as cortical regions, the thalamus, and brain stem. Each level of description relates uniquely to neuroscience data, from single-unit recordings, through local field potentials to functional magnetic resonance imaging (fMRI), electroencephalogram (EEG), and magnetoencephalogram (MEG). Models of the cortex can establish which types of large-scale neuronal networks can perform computations and characterize their emergent properties. Mean-field and related formulations of dynamics also play an essential and complementary role as forward models that can be inverted given empirical data. This makes dynamic models critical in integrating theory and experiments. We argue that elaborating principled and informed models is a prerequisite for grounding empirical neuroscience in a cogent theoretical framework, commensurate with the achievements in the physical sciences. PMID:18769680
Future computing platforms for science in a power constrained era
Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; ...
2015-12-23
Power consumption will be a key constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics (HEP). This makes performance-per-watt a crucial metric for selecting cost-efficient computing solutions. For this paper, we have done a wide survey of current and emerging architectures becoming available on the market including x86-64 variants, ARMv7 32-bit, ARMv8 64-bit, Many-Core and GPU solutions, as well as newer System-on-Chip (SoC) solutions. We compare performance and energy efficiency using an evolving set of standardized HEP-related benchmarks and power measurement techniques we have been developing. In conclusion, we evaluate the potentialmore » for use of such computing solutions in the context of DHTC systems, such as the Worldwide LHC Computing Grid (WLCG).« less
Accelerating artificial intelligence with reconfigurable computing
NASA Astrophysics Data System (ADS)
Cieszewski, Radoslaw
Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Such a field, where there is many different algorithms which can be accelerated, is an artificial intelligence. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.
NASA Astrophysics Data System (ADS)
Berres, A.; Karthik, R.; Nugent, P.; Sorokine, A.; Myers, A.; Pang, H.
2017-12-01
Building an integrated data infrastructure that can meet the needs of a sustainable energy-water resource management requires a robust data management and geovisual analytics platform, capable of cross-domain scientific discovery and knowledge generation. Such a platform can facilitate the investigation of diverse complex research and policy questions for emerging priorities in Energy-Water Nexus (EWN) science areas. Using advanced data analytics, machine learning techniques, multi-dimensional statistical tools, and interactive geovisualization components, such a multi-layered federated platform is being developed, the Energy-Water Nexus Knowledge Discovery Framework (EWN-KDF). This platform utilizes several enterprise-grade software design concepts and standards such as extensible service-oriented architecture, open standard protocols, event-driven programming model, enterprise service bus, and adaptive user interfaces to provide a strategic value to the integrative computational and data infrastructure. EWN-KDF is built on the Compute and Data Environment for Science (CADES) environment in Oak Ridge National Laboratory (ORNL).
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which lead to more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, need to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems, such as task distribution, node communication, system verification, are discussed.
Architecture of Allosteric Materials and Edge Modes
NASA Astrophysics Data System (ADS)
Yan, Le; Ravasio, Riccardo; Brito, Carolina; Wyart, Matthieu
Allostery, a long-range elasticity-mediated interaction, remains the biggest mystery decades after its discovery in proteins. We introduce a numerical scheme to evolve functional materials that can accomplish a specified mechanical task. In this scheme, the number of solutions, their spatial architectures and the correlations among them can be computed. As an example, we consider an ``allosteric'' task, which requires the material to respond specifically to a stimulus at a distant active site. We find that functioning materials evolve a less-constrained trumpet-shaped region connecting the stimulus and active sites and that the amplitude of the elastic response varies non-monotonically along the trumpet. As previously shown for some proteins, we find that correlations appearing during evolution alone are sufficient to identify key aspects of this design. Finally, we show that the success of this architecture stems from the emergence of soft edge modes recently found to appear near the surface of marginally connected materials. Overall, our in silico evolution experiment offers a new window to study the relationship between structure, function, and correlations emerging during evolution. L.Y. was supported in part by the National Science Foundation under Grant No. NSF PHY11-25915. M.W. thanks the Swiss National Science Foundation for support under Grant No. 200021-165509 and the Simons Foundation Grant (#454953 Matthieu Wyart).
Supporting Undergraduate Computer Architecture Students Using a Visual MIPS64 CPU Simulator
ERIC Educational Resources Information Center
Patti, D.; Spadaccini, A.; Palesi, M.; Fazzino, F.; Catania, V.
2012-01-01
The topics of computer architecture are always taught using an Assembly dialect as an example. The most commonly used textbooks in this field use the MIPS64 Instruction Set Architecture (ISA) to help students in learning the fundamentals of computer architecture because of its orthogonality and its suitability for real-world applications. This…
Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques
2013-03-01
MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES AND CIRCUIT TECHNIQUES POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY...TECHNICAL REPORT 3. DATES COVERED (From - To) OCT 2010 – OCT 2012 4. TITLE AND SUBTITLE MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES...schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated
High-performance multiprocessor architecture for a 3-D lattice gas model
NASA Technical Reports Server (NTRS)
Lee, F.; Flynn, M.; Morf, M.
1991-01-01
The lattice gas method has recently emerged as a promising discrete particle simulation method in areas such as fluid dynamics. We present a very high-performance scalable multiprocessor architecture, called ALGE, proposed for the simulation of a realistic 3-D lattice gas model, Henon's 24-bit FCHC isometric model. Each of these VLSI processors is as powerful as a CRAY-2 for this application. ALGE is scalable in the sense that it achieves linear speedup for both fixed and increasing problem sizes with more processors. The core computation of a lattice gas model consists of many repetitions of two alternating phases: particle collision and propagation. Functional decomposition by symmetry group and virtual move are the respective keys to efficient implementation of collision and propagation.
Exascale computing and what it means for shock physics
NASA Astrophysics Data System (ADS)
Germann, Timothy
2015-06-01
The U.S. Department of Energy is preparing to launch an Exascale Computing Initiative, to address the myriad challenges required to deploy and effectively utilize an exascale-class supercomputer (i.e., one capable of performing 1018 operations per second) in the 2023 timeframe. Since physical (power dissipation) requirements limit clock rates to at most a few GHz, this will necessitate the coordination of on the order of a billion concurrent operations, requiring sophisticated system and application software, and underlying mathematical algorithms, that may differ radically from traditional approaches. Even at the smaller workstation or cluster level of computation, the massive concurrency and heterogeneity within each processor will impact computational scientists. Through the multi-institutional, multi-disciplinary Exascale Co-design Center for Materials in Extreme Environments (ExMatEx), we have initiated an early and deep collaboration between domain (computational materials) scientists, applied mathematicians, computer scientists, and hardware architects, in order to establish the relationships between algorithms, software stacks, and architectures needed to enable exascale-ready materials science application codes within the next decade. In my talk, I will discuss these challenges, and what it will mean for exascale-era electronic structure, molecular dynamics, and engineering-scale simulations of shock-compressed condensed matter. In particular, we anticipate that the emerging hierarchical, heterogeneous architectures can be exploited to achieve higher physical fidelity simulations using adaptive physics refinement. This work is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research.
RE-PLAN: An Extensible Software Architecture to Facilitate Disaster Response Planning
O’Neill, Martin; Mikler, Armin R.; Indrakanti, Saratchandra; Tiwari, Chetan; Jimenez, Tamara
2014-01-01
Computational tools are needed to make data-driven disaster mitigation planning accessible to planners and policymakers without the need for programming or GIS expertise. To address this problem, we have created modules to facilitate quantitative analyses pertinent to a variety of different disaster scenarios. These modules, which comprise the REsponse PLan ANalyzer (RE-PLAN) framework, may be used to create tools for specific disaster scenarios that allow planners to harness large amounts of disparate data and execute computational models through a point-and-click interface. Bio-E, a user-friendly tool built using this framework, was designed to develop and analyze the feasibility of ad hoc clinics for treating populations following a biological emergency event. In this article, the design and implementation of the RE-PLAN framework are described, and the functionality of the modules used in the Bio-E biological emergency mitigation tool are demonstrated. PMID:25419503
Brain architecture: a design for natural computation.
Kaiser, Marcus
2007-12-15
Fifty years ago, John von Neumann compared the architecture of the brain with that of the computers he invented and which are still in use today. In those days, the organization of computers was based on concepts of brain organization. Here, we give an update on current results on the global organization of neural systems. For neural systems, we outline how the spatial and topological architecture of neuronal and cortical networks facilitates robustness against failures, fast processing and balanced network activation. Finally, we discuss mechanisms of self-organization for such architectures. After all, the organization of the brain might again inspire computer architecture.
A Distributed Laboratory for Event-Driven Coastal Prediction and Hazard Planning
NASA Astrophysics Data System (ADS)
Bogden, P.; Allen, G.; MacLaren, J.; Creager, G. J.; Flournoy, L.; Sheng, Y. P.; Graber, H.; Graves, S.; Conover, H.; Luettich, R.; Perrie, W.; Ramakrishnan, L.; Reed, D. A.; Wang, H. V.
2006-12-01
The 2005 Atlantic hurricane season was the most active in recorded history. Collectively, 2005 hurricanes caused more than 2,280 deaths and record damages of over 100 billion dollars. Of the storms that made landfall, Dennis, Emily, Katrina, Rita, and Wilma caused most of the destruction. Accurate predictions of storm-driven surge, wave height, and inundation can save lives and help keep recovery costs down, provided the information gets to emergency response managers in time. The information must be available well in advance of landfall so that responders can weigh the costs of unnecessary evacuation against the costs of inadequate preparation. The SURA Coastal Ocean Observing and Prediction (SCOOP) Program is a multi-institution collaboration implementing a modular, distributed service-oriented architecture for real time prediction and visualization of the impacts of extreme atmospheric events. The modular infrastructure enables real-time prediction of multi- scale, multi-model, dynamic, data-driven applications. SURA institutions are working together to create a virtual and distributed laboratory integrating coastal models, simulation data, and observations with computational resources and high speed networks. The loosely coupled architecture allows teams of computer and coastal scientists at multiple institutions to innovate complex system components that are interconnected with relatively stable interfaces. The operational system standardizes at the interface level to enable substantial innovation by complementary communities of coastal and computer scientists. This architectural philosophy solves a long-standing problem associated with the transition from research to operations. The SCOOP Program thereby implements a prototype laboratory consistent with the vision of a national, multi-agency initiative called the Integrated Ocean Observing System (IOOS). Several service- oriented components of the SCOOP enterprise architecture have already been designed and implemented, including data archive and transport services, metadata registry and retrieval (catalog), resource management, and portal interfaces. SCOOP partners are integrating these at the service level and implementing reconfigurable workflows for several kinds of user scenarios, and are working with resource providers to prototype new policies and technologies for on-demand computing.
The path toward HEP High Performance Computing
NASA Astrophysics Data System (ADS)
Apostolakis, John; Brun, René; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro
2014-06-01
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yield limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a "High Performance" implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 project. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a highperformance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speedup the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from the recent technology evolution in computing.
A new software-based architecture for quantum computer
NASA Astrophysics Data System (ADS)
Wu, Nan; Song, FangMin; Li, Xiangdong
2010-04-01
In this paper, we study a reliable architecture of a quantum computer and a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of the quantum computing. We also try to address some key issues in detail in the software-driven universal quantum computers.
On implementation of DCTCP on three-tier and fat-tree data center network topologies.
Zafar, Saima; Bashir, Abeer; Chaudhry, Shafique Ahmad
2016-01-01
A data center is a facility for housing computational and storage systems interconnected through a communication network called data center network (DCN). Due to a tremendous growth in the computational power, storage capacity and the number of inter-connected servers, the DCN faces challenges concerning efficiency, reliability and scalability. Although transmission control protocol (TCP) is a time-tested transport protocol in the Internet, DCN challenges such as inadequate buffer space in switches and bandwidth limitations have prompted the researchers to propose techniques to improve TCP performance or design new transport protocols for DCN. Data center TCP (DCTCP) emerge as one of the most promising solutions in this domain which employs the explicit congestion notification feature of TCP to enhance the TCP congestion control algorithm. While DCTCP has been analyzed for two-tier tree-based DCN topology for traffic between servers in the same rack which is common in cloud applications, it remains oblivious to the traffic patterns common in university and private enterprise networks which traverse the complete network interconnect spanning upper tier layers. We also recognize that DCTCP performance cannot remain unaffected by the underlying DCN architecture hence there is a need to test and compare DCTCP performance when implemented over diverse DCN architectures. Some of the most notable DCN architectures are the legacy three-tier, fat-tree, BCube, DCell, VL2, and CamCube. In this research, we simulate the two switch-centric DCN architectures; the widely deployed legacy three-tier architecture and the promising fat-tree architecture using network simulator and analyze the performance of DCTCP in terms of throughput and delay for realistic traffic patterns. We also examine how DCTCP prevents incast and outcast congestion when realistic DCN traffic patterns are employed in above mentioned topologies. Our results show that the underlying DCN architecture significantly impacts DCTCP performance. We find that DCTCP gives optimal performance in fat-tree topology and is most suitable for large networks.
Oubbati, Mohamed; Kord, Bahram; Koprinkova-Hristova, Petia; Palm, Günther
2014-04-01
The new tendency of artificial intelligence suggests that intelligence must be seen as a result of the interaction between brains, bodies and environments. This view implies that designing sophisticated behaviour requires a primary focus on how agents are functionally coupled to their environments. Under this perspective, we present early results with the application of reservoir computing as an efficient tool to understand how behaviour emerges from interaction. Specifically, we present reservoir computing models, that are inspired by imitation learning designs, to extract the essential components of behaviour that results from agent-environment interaction dynamics. Experimental results using a mobile robot are reported to validate the learning architectures.
NASA Astrophysics Data System (ADS)
Oubbati, Mohamed; Kord, Bahram; Koprinkova-Hristova, Petia; Palm, Günther
2014-04-01
The new tendency of artificial intelligence suggests that intelligence must be seen as a result of the interaction between brains, bodies and environments. This view implies that designing sophisticated behaviour requires a primary focus on how agents are functionally coupled to their environments. Under this perspective, we present early results with the application of reservoir computing as an efficient tool to understand how behaviour emerges from interaction. Specifically, we present reservoir computing models, that are inspired by imitation learning designs, to extract the essential components of behaviour that results from agent-environment interaction dynamics. Experimental results using a mobile robot are reported to validate the learning architectures.
Metal oxide resistive random access memory based synaptic devices for brain-inspired computing
NASA Astrophysics Data System (ADS)
Gao, Bin; Kang, Jinfeng; Zhou, Zheng; Chen, Zhe; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan
2016-04-01
The traditional Boolean computing paradigm based on the von Neumann architecture is facing great challenges for future information technology applications such as big data, the Internet of Things (IoT), and wearable devices, due to the limited processing capability issues such as binary data storage and computing, non-parallel data processing, and the buses requirement between memory units and logic units. The brain-inspired neuromorphic computing paradigm is believed to be one of the promising solutions for realizing more complex functions with a lower cost. To perform such brain-inspired computing with a low cost and low power consumption, novel devices for use as electronic synapses are needed. Metal oxide resistive random access memory (ReRAM) devices have emerged as the leading candidate for electronic synapses. This paper comprehensively addresses the recent work on the design and optimization of metal oxide ReRAM-based synaptic devices. A performance enhancement methodology and optimized operation scheme to achieve analog resistive switching and low-energy training behavior are provided. A three-dimensional vertical synapse network architecture is proposed for high-density integration and low-cost fabrication. The impacts of the ReRAM synaptic device features on the performances of neuromorphic systems are also discussed on the basis of a constructed neuromorphic visual system with a pattern recognition function. Possible solutions to achieve the high recognition accuracy and efficiency of neuromorphic systems are presented.
Architectures for single-chip image computing
NASA Astrophysics Data System (ADS)
Gove, Robert J.
1992-04-01
This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new-generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krueger, Jens; Micikevicius, Paulius; Williams, Samuel
Reverse Time Migration (RTM) is one of the main approaches in the seismic processing industry for imaging the subsurface structure of the Earth. While RTM provides qualitative advantages over its predecessors, it has a high computational cost warranting implementation on HPC architectures. We focus on three progressively more complex kernels extracted from RTM: for isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) media. In this work, we examine performance optimization of forward wave modeling, which describes the computational kernels used in RTM, on emerging multi- and manycore processors and introduce a novel common subexpression elimination optimization formore » TTI kernels. We compare attained performance and energy efficiency in both the single-node and distributed memory environments in order to satisfy industry’s demands for fidelity, performance, and energy efficiency. Moreover, we discuss the interplay between architecture (chip and system) and optimizations (both on-node computation) highlighting the importance of NUMA-aware approaches to MPI communication. Ultimately, our results show we can improve CPU energy efficiency by more than 10× on Magny Cours nodes while acceleration via multiple GPUs can surpass the energy-efficient Intel Sandy Bridge by as much as 3.6×.« less
High performance cellular level agent-based simulation with FLAME for the GPU.
Richmond, Paul; Walker, Dawn; Coakley, Simon; Romano, Daniela
2010-05-01
Driven by the availability of experimental data and ability to simulate a biological scale which is of immediate interest, the cellular scale is fast emerging as an ideal candidate for middle-out modelling. As with 'bottom-up' simulation approaches, cellular level simulations demand a high degree of computational power, which in large-scale simulations can only be achieved through parallel computing. The flexible large-scale agent modelling environment (FLAME) is a template driven framework for agent-based modelling (ABM) on parallel architectures ideally suited to the simulation of cellular systems. It is available for both high performance computing clusters (www.flame.ac.uk) and GPU hardware (www.flamegpu.com) and uses a formal specification technique that acts as a universal modelling format. This not only creates an abstraction from the underlying hardware architectures, but avoids the steep learning curve associated with programming them. In benchmarking tests and simulations of advanced cellular systems, FLAME GPU has reported massive improvement in performance over more traditional ABM frameworks. This allows the time spent in the development and testing stages of modelling to be drastically reduced and creates the possibility of real-time visualisation for simple visual face-validation.
A Stochastic Spiking Neural Network for Virtual Screening.
Morro, A; Canals, V; Oliver, A; Alomar, M L; Galan-Prado, F; Ballester, P J; Rossello, J L
2018-04-01
Virtual screening (VS) has become a key computational tool in early drug design and screening performance is of high relevance due to the large volume of data that must be processed to identify molecules with the sought activity-related pattern. At the same time, the hardware implementations of spiking neural networks (SNNs) arise as an emerging computing technique that can be applied to parallelize processes that normally present a high cost in terms of computing time and power. Consequently, SNN represents an attractive alternative to perform time-consuming processing tasks, such as VS. In this brief, we present a smart stochastic spiking neural architecture that implements the ultrafast shape recognition (USR) algorithm achieving two order of magnitude of speed improvement with respect to USR software implementations. The neural system is implemented in hardware using field-programmable gate arrays allowing a highly parallelized USR implementation. The results show that, due to the high parallelization of the system, millions of compounds can be checked in reasonable times. From these results, we can state that the proposed architecture arises as a feasible methodology to efficiently enhance time-consuming data-mining processes such as 3-D molecular similarity search.
Integrated, Not Isolated: Defining Typological Proximity in an Integrated Multilingual Architecture
Putnam, Michael T.; Carlson, Matthew; Reitter, David
2018-01-01
On the surface, bi- and multilingualism would seem to be an ideal context for exploring questions of typological proximity. The obvious intuition is that the more closely related two languages are, the easier it should be to implement the two languages in one mind. This is the starting point adopted here, but we immediately run into the difficulty that the overwhelming majority of cognitive, computational, and linguistic research on bi- and multilingualism exhibits a monolingual bias (i.e., where monolingual grammars are used as the standard of comparison for outputs from bilingual grammars). The primary questions so far have focused on how bilinguals balance and switch between their two languages, but our perspective on typology leads us to consider the nature of bi- and multi-lingual systems as a whole. Following an initial proposal from Hsin (2014), we conjecture that bilingual grammars are neither isolated, nor (completely) conjoined with one another in the bilingual mind, but rather exist as integrated source grammars that are further mitigated by a common, combined grammar (Cook, 2016; Goldrick et al., 2016a,b; Putnam and Klosinski, 2017). Here we conceive such a combined grammar in a parallel, distributed, and gradient architecture implemented in a shared vector-space model that employs compression through routinization and dimensionality reduction. We discuss the emergence of such representations and their function in the minds of bilinguals. This architecture aims to be consistent with empirical results on bilingual cognition and memory representations in computational cognitive architectures. PMID:29354079
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
NASA Astrophysics Data System (ADS)
Marcer, Peter J.; Rowlands, Peter
2013-09-01
The principal criteria Cn (n = 1 to 23) and grammatical production rules are set out of a universal computational rewrite language spelling out a semantic description of an emergent, self-organizing architecture for the cosmos. These language productions already predicate: (1) Einstein's conservation law of energy, momentum and mass and, subsequently, (2) with respect to gauge invariant relativistic space time (both Lorentz special & Einstein general); (3) Standard Model elementary particle physics; (4) the periodic table of the elements & chemical valence; and (5) the molecular biological basis of the DNA / RNA genetic code; so enabling the Cybernetic Machine specialist Groups Mission Statement premise;** (6) that natural semantic language thinking at the higher level of the self-organized emergent chemical molecular complexity of the human brain (only surpassed by that of the cosmos itself!) would be realized (7) by this same universal semantic language via (8) an architecture of a conscious human brain/mind and self which, it predicates consists of its neural / glia and microtubule substrates respectively, so as to endow it with; (9) the intelligent semantic capability to be able to specify, symbolize, spell out and understand the cosmos that conceived it; and (10) provide a quantum physical explanation of consciousness and of how (11) the dichotomy between first person subjectivity and third person objectivity or `hard problem' is resolved.
Advanced computer architecture specification for automated weld systems
NASA Technical Reports Server (NTRS)
Katsinis, Constantine
1994-01-01
This report describes the requirements for an advanced automated weld system and the associated computer architecture, and defines the overall system specification from a broad perspective. According to the requirements of welding procedures as they relate to an integrated multiaxis motion control and sensor architecture, the computer system requirements are developed based on a proven multiple-processor architecture with an expandable, distributed-memory, single global bus architecture, containing individual processors which are assigned to specific tasks that support sensor or control processes. The specified architecture is sufficiently flexible to integrate previously developed equipment, be upgradable and allow on-site modifications.
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
Recursive computer architecture for VLSI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treleaven, P.C.; Hopkins, R.P.
1982-01-01
A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is being restricted to simple, identical microcomputers each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems are termed fifth generation computers by the Japanese. 30 references.
Hypercluster Parallel Processor
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela
1992-01-01
Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
2017-01-01
The continuous technological advances in favor of mHealth represent a key factor in the improvement of medical emergency services. This systematic review presents the identification, study, and classification of the most up-to-date approaches surrounding the deployment of architectures for mHealth. Our review includes 25 articles obtained from databases such as IEEE Xplore, Scopus, SpringerLink, ScienceDirect, and SAGE. This review focused on studies addressing mHealth systems for outdoor emergency situations. In 60% of the articles, the deployment architecture relied in the connective infrastructure associated with emergent technologies such as cloud services, distributed services, Internet-of-things, machine-to-machine, vehicular ad hoc network, and service-oriented architecture. In 40% of the literature review, the deployment architecture for mHealth considered traditional connective infrastructure. Only 20% of the studies implemented an energy consumption protocol to extend system lifetime. We concluded that there is a need for more integrated solutions specifically for outdoor scenarios. Energy consumption protocols are needed to be implemented and evaluated. Emergent connective technologies are redefining the information management and overcome traditional technologies. PMID:29075430
Gonzalez, Enrique; Peña, Raul; Avila, Alfonso; Vargas-Rosales, Cesar; Munoz-Rodriguez, David
2017-01-01
The continuous technological advances in favor of mHealth represent a key factor in the improvement of medical emergency services. This systematic review presents the identification, study, and classification of the most up-to-date approaches surrounding the deployment of architectures for mHealth. Our review includes 25 articles obtained from databases such as IEEE Xplore, Scopus, SpringerLink, ScienceDirect, and SAGE. This review focused on studies addressing mHealth systems for outdoor emergency situations. In 60% of the articles, the deployment architecture relied in the connective infrastructure associated with emergent technologies such as cloud services, distributed services, Internet-of-things, machine-to-machine, vehicular ad hoc network, and service-oriented architecture. In 40% of the literature review, the deployment architecture for mHealth considered traditional connective infrastructure. Only 20% of the studies implemented an energy consumption protocol to extend system lifetime. We concluded that there is a need for more integrated solutions specifically for outdoor scenarios. Energy consumption protocols are needed to be implemented and evaluated. Emergent connective technologies are redefining the information management and overcome traditional technologies.
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2 D FFTs
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan
2006-01-01
Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefi-ont sensors such as optical design simplicity and stability. However, the image-based approach is computational intensive, and therefore, specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as James Webb Space Telescope (JWST), Terrestial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures require numerous two-dimensional Fourier Transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64 Node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
Design and development of a mobile system for supporting emergency triage.
Michalowski, W; Slowinski, R; Wilk, S; Farion, K J; Pike, J; Rubin, S
2005-01-01
Our objective was to design and develop a mobile clinical decision support system for emergency triage of different acute pain presentations. The system should interact with existing hospital information systems, run on mobile computing devices (handheld computers) and be suitable for operation in weak-connectivity conditions (with unstable connections between mobile clients and a server). The MET (Mobile Emergency Triage) system was designed following an extended client-server architecture. The client component, responsible for triage decision support, is built as a knowledge-based system, with domain ontology separated from generic problem solving methods and used for the automatic creation of a user interface. The MET system is well suited for operation in the Emergency Department of a hospital. The system's external interactions are managed by the server, while the MET clients, running on handheld computers are used by clinicians for collecting clinical data and supporting triage at the bedside. The functionality of the MET client is distributed into specialized modules, responsible for triaging specific types of acute pain presentations. The modules are stored on the server, and on request they can be transferred and executed on the mobile clients. The modular design provides for easy extension of the system's functionality. A clinical trial of the MET system validated the appropriateness of the system's design, and proved the usefulness and acceptance of the system in clinical practice. The MET system captures the necessary hospital data, allows for entry of patient information, and provides triage support. By operating on handheld computers, it fits into the regular emergency department workflow without introducing any hindrances or disruptions. It supports triage anytime and anywhere, directly at the point of care, and also can be used as an electronic patient chart, facilitating structured data collection.
Emergency healthcare process automation using mobile computing and cloud services.
Poulymenopoulou, M; Malamateniou, F; Vassilacopoulos, G
2012-10-01
Emergency care is basically concerned with the provision of pre-hospital and in-hospital medical and/or paramedical services and it typically involves a wide variety of interdependent and distributed activities that can be interconnected to form emergency care processes within and between Emergency Medical Service (EMS) agencies and hospitals. Hence, in developing an information system for emergency care processes, it is essential to support individual process activities and to satisfy collaboration and coordination needs by providing readily access to patient and operational information regardless of location and time. Filling this information gap by enabling the provision of the right information, to the right people, at the right time fosters new challenges, including the specification of a common information format, the interoperability among heterogeneous institutional information systems or the development of new, ubiquitous trans-institutional systems. This paper is concerned with the development of an integrated computer support to emergency care processes by evolving and cross-linking institutional healthcare systems. To this end, an integrated EMS cloud-based architecture has been developed that allows authorized users to access emergency case information in standardized document form, as proposed by the Integrating the Healthcare Enterprise (IHE) profile, uses the Organization for the Advancement of Structured Information Standards (OASIS) standard Emergency Data Exchange Language (EDXL) Hospital Availability Exchange (HAVE) for exchanging operational data with hospitals and incorporates an intelligent module that supports triaging and selecting the most appropriate ambulances and hospitals for each case.
Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS
NASA Astrophysics Data System (ADS)
Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.
2018-03-01
Simulation of breaking waves by using Navier-Stokes equation via moving particle semi-implicit method (MPS) over close domain is given. The results show the parallel computing on multicore architecture using OpenMP platform can reduce the computational time almost half of the serial time. Here, the comparison using two computer architectures (AMD and Intel) are performed. The results using Intel architecture is shown better than AMD architecture in CPU time. However, in efficiency, the computer with AMD architecture gives slightly higher than the Intel. For the simulation by 1512 number of particles, the CPU time using Intel and AMD are 12662.47 and 28282.30 respectively. Moreover, the efficiency using similar number of particles, AMD obtains 50.09 % and Intel up to 49.42 %.
A synchronized computational architecture for generalized bilateral control of robot arms
NASA Technical Reports Server (NTRS)
Bejczy, Antal K.; Szakaly, Zoltan
1987-01-01
This paper describes a computational architecture for an interconnected high speed distributed computing system for generalized bilateral control of robot arms. The key method of the architecture is the use of fully synchronized, interrupt driven software. Since an objective of the development is to utilize the processing resources efficiently, the synchronization is done in the hardware level to reduce system software overhead. The architecture also achieves a balaced load on the communication channel. The paper also describes some architectural relations to trading or sharing manual and automatic control.
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Stocker, John C.; Golomb, Andrew M.
2011-01-01
Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
Analytical Performance Modeling and Validation of Intel’s Xeon Phi Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chunduri, Sudheer; Balaprakash, Prasanna; Morozov, Vitali
Modeling the performance of scientific applications on emerging hardware plays a central role in achieving extreme-scale computing goals. Analytical models that capture the interaction between applications and hardware characteristics are attractive because even a reasonably accurate model can be useful for performance tuning before the hardware is made available. In this paper, we develop a hardware model for Intel’s second-generation Xeon Phi architecture code-named Knights Landing (KNL) for the SKOPE framework. We validate the KNL hardware model by projecting the performance of mini-benchmarks and application kernels. The results show that our KNL model can project the performance with prediction errorsmore » of 10% to 20%. The hardware model also provides informative recommendations for code transformations and tuning.« less
Architectural Specialization for Inter-Iteration Loop Dependence Patterns
2015-10-01
Architectural Specialization for Inter-Iteration Loop Dependence Patterns Christopher Batten Computer Systems Laboratory School of Electrical and...Trends in Computer Architecture Transistors (Thousands) Frequency (MHz) Typical Power (W) MIPS R2K Intel P4 DEC Alpha 21264 Data collected by M...T as ks p er Jo ule ) Simple Processor Design Power Constraint High-Performance Architectures Embedded Architectures Design Performance
Manyscale Computing for Sensor Processing in Support of Space Situational Awareness
NASA Astrophysics Data System (ADS)
Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.
2014-09-01
Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
Wei, Xiaohui; Sun, Bingyi; Cui, Jiaxu; Xu, Gaochao
2016-01-01
As a result of the greatly increased use of mobile devices, the disadvantages of portable devices have gradually begun to emerge. To solve these problems, the use of mobile cloud computing assisted by cloud data centers has been proposed. However, cloud data centers are always very far from the mobile requesters. In this paper, we propose an improved multi-objective local mobile cloud model: Compounded Local Mobile Cloud Architecture with Dynamic Priority Queues (LMCpri). This new architecture could briefly store jobs that arrive simultaneously at the cloudlet in different priority positions according to the result of auction processing, and then execute partitioning tasks on capable helpers. In the Scheduling Module, NSGA-II is employed as the scheduling algorithm to shorten processing time and decrease requester cost relative to PSO and sequential scheduling. The simulation results show that the number of iteration times that is defined to 30 is the best choice of the system. In addition, comparing with LMCque, LMCpri is able to effectively accommodate a requester who would like his job to be executed in advance and shorten execution time. Finally, we make a comparing experiment between LMCpri and cloud assisting architecture, and the results reveal that LMCpri presents a better performance advantage than cloud assisting architecture.
Wei, Xiaohui; Sun, Bingyi; Cui, Jiaxu; Xu, Gaochao
2016-01-01
As a result of the greatly increased use of mobile devices, the disadvantages of portable devices have gradually begun to emerge. To solve these problems, the use of mobile cloud computing assisted by cloud data centers has been proposed. However, cloud data centers are always very far from the mobile requesters. In this paper, we propose an improved multi-objective local mobile cloud model: Compounded Local Mobile Cloud Architecture with Dynamic Priority Queues (LMCpri). This new architecture could briefly store jobs that arrive simultaneously at the cloudlet in different priority positions according to the result of auction processing, and then execute partitioning tasks on capable helpers. In the Scheduling Module, NSGA-II is employed as the scheduling algorithm to shorten processing time and decrease requester cost relative to PSO and sequential scheduling. The simulation results show that the number of iteration times that is defined to 30 is the best choice of the system. In addition, comparing with LMCque, LMCpri is able to effectively accommodate a requester who would like his job to be executed in advance and shorten execution time. Finally, we make a comparing experiment between LMCpri and cloud assisting architecture, and the results reveal that LMCpri presents a better performance advantage than cloud assisting architecture. PMID:27419854
NASA Astrophysics Data System (ADS)
Jiang, Yuning; Kang, Jinfeng; Wang, Xinan
2017-03-01
Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
Cognitive Architectures and Human-Computer Interaction. Introduction to Special Issue.
ERIC Educational Resources Information Center
Gray, Wayne D.; Young, Richard M.; Kirschenbaum, Susan S.
1997-01-01
In this introduction to a special issue on cognitive architectures and human-computer interaction (HCI), editors and contributors provide a brief overview of cognitive architectures. The following four architectures represented by articles in this issue are: Soar; LICAI (linked model of comprehension-based action planning and instruction taking);…
Ab Initio Reactive Computer Aided Molecular Design
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martínez, Todd J.
Few would dispute that theoretical chemistry tools can now provide keen insights into chemical phenomena. Yet the holy grail of efficient and reliable prediction of complex reactivity has remained elusive. Fortunately, recent advances in electronic structure theory based on the concepts of both element- and rank-sparsity, coupled with the emergence of new highly parallel computer architectures, have led to a significant increase in the time and length scales which can be simulated using first principles molecular dynamics. This then opens the possibility of new discovery-based approaches to chemical reactivity, such as the recently proposed ab initio nanoreactor. Here, we arguemore » that due to these and other recent advances, the holy grail of computational discovery for complex chemical reactivity is rapidly coming within our reach.« less
Ab Initio Reactive Computer Aided Molecular Design
Martínez, Todd J.
2017-03-21
Few would dispute that theoretical chemistry tools can now provide keen insights into chemical phenomena. Yet the holy grail of efficient and reliable prediction of complex reactivity has remained elusive. Fortunately, recent advances in electronic structure theory based on the concepts of both element- and rank-sparsity, coupled with the emergence of new highly parallel computer architectures, have led to a significant increase in the time and length scales which can be simulated using first principles molecular dynamics. This then opens the possibility of new discovery-based approaches to chemical reactivity, such as the recently proposed ab initio nanoreactor. Here, we arguemore » that due to these and other recent advances, the holy grail of computational discovery for complex chemical reactivity is rapidly coming within our reach.« less
Biomimetic design processes in architecture: morphogenetic and evolutionary computational design.
Menges, Achim
2012-03-01
Design computation has profound impact on architectural design methods. This paper explains how computational design enables the development of biomimetic design processes specific to architecture, and how they need to be significantly different from established biomimetic processes in engineering disciplines. The paper first explains the fundamental difference between computer-aided and computational design in architecture, as the understanding of this distinction is of critical importance for the research presented. Thereafter, the conceptual relation and possible transfer of principles from natural morphogenesis to design computation are introduced and the related developments of generative, feature-based, constraint-based, process-based and feedback-based computational design methods are presented. This morphogenetic design research is then related to exploratory evolutionary computation, followed by the presentation of two case studies focusing on the exemplary development of spatial envelope morphologies and urban block morphologies.
How architecture wins technology wars.
Morris, C R; Ferguson, C H
1993-01-01
Signs of revolutionary transformation in the global computer industry are everywhere. A roll call of the major industry players reads like a waiting list in the emergency room. The usual explanations for the industry's turmoil are at best inadequate. Scale, friendly government policies, manufacturing capabilities, a strong position in desktop markets, excellent software, top design skills--none of these is sufficient, either by itself or in combination, to ensure competitive success in information technology. A new paradigm is required to explain patterns of success and failure. Simply stated, success flows to the company that manages to establish proprietary architectural control over a broad, fast-moving, competitive space. Architectural strategies have become crucial to information technology because of the astonishing rate of improvement in microprocessors and other semiconductor components. Since no single vendor can keep pace with the outpouring of cheap, powerful, mass-produced components, customers insist on stitching together their own local systems solutions. Architectures impose order on the system and make the interconnections possible. The architectural controller is the company that controls the standard by which the entire information package is assembled. Microsoft's Windows is an excellent example of this. Because of the popularity of Windows, companies like Lotus must conform their software to its parameters in order to compete for market share. In the 1990s, proprietary architectural control is not only possible but indispensable to competitive success. What's more, it has broader implications for organizational structure: architectural competition is giving rise to a new form of business organization.
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
1998-08-01
An estimated 85% of the installed base of software is a custom application with a production quantity of one. In practice, almost 100% of military software systems are custom software. Paradoxically, the marginal costs of producing additional units are near zero. So why hasn`t the software market, a market with high design costs and low productions costs evolved like other similar custom widget industries, such as automobiles and hardware chips? The military software industry seems immune to market pressures that have motivated a multilevel supply chain structure in other widget industries: design cost recovery, improve quality through specialization, and enablemore » rapid assembly from purchased components. The primary goal of the ComponentWare Consortium (CWC) technology plan was to overcome barriers to building and deploying mission-critical information systems by using verified, reusable software components (Component Ware). The adoption of the ComponentWare infrastructure is predicated upon a critical mass of the leading platform vendors` inevitable adoption of adopting emerging, object-based, distributed computing frameworks--initially CORBA and COM/OLE. The long-range goal of this work is to build and deploy military systems from verified reusable architectures. The promise of component-based applications is to enable developers to snap together new applications by mixing and matching prefabricated software components. A key result of this effort is the concept of reusable software architectures. A second important contribution is the notion that a software architecture is something that can be captured in a formal language and reused across multiple applications. The formalization and reuse of software architectures provide major cost and schedule improvements. The Unified Modeling Language (UML) is fast becoming the industry standard for object-oriented analysis and design notation for object-based systems. However, the lack of a standard real-time distributed object operating system, lack of a standard Computer-Aided Software Environment (CASE) tool notation and lack of a standard CASE tool repository has limited the realization of component software. The approach to fulfilling this need is the software component factory innovation. The factory approach takes advantage of emerging standards such as UML, CORBA, Java and the Internet. The key technical innovation of the software component factory is the ability to assemble and test new system configurations as well as assemble new tools on demand from existing tools and architecture design repositories.« less
The flight telerobotic servicer: From functional architecture to computer architecture
NASA Technical Reports Server (NTRS)
Lumia, Ronald; Fiala, John
1989-01-01
After a brief tutorial on the NASA/National Bureau of Standards Standard Reference Model for Telerobot Control System Architecture (NASREM) functional architecture, the approach to its implementation is shown. First, interfaces must be defined which are capable of supporting the known algorithms. This is illustrated by considering the interfaces required for the SERVO level of the NASREM functional architecture. After interface definition, the specific computer architecture for the implementation must be determined. This choice is obviously technology dependent. An example illustrating one possible mapping of the NASREM functional architecture to a particular set of computers which implements it is shown. The result of choosing the NASREM functional architecture is that it provides a technology independent paradigm which can be mapped into a technology dependent implementation capable of evolving with technology in the laboratory and in space.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Donaldson, Thomas; Doty, Karl; Engle, Steven W.; Larson, Robert E.; O'Reilly, John G.
1988-01-01
The results of a Phase I Small Business Innovative Research (SBIR) project performed for the NASA Langley Computational Structural Mechanics Group are described. The project resulted in the identification of a family of chordal-ring interconnection architectures with excellent potential to serve as the basis for new multimicroprocessor (MMP) computers. The paper presents examples of how computational algorithms from structural mechanics can be efficiently implemented on the chordal-ring architecture.
A Semantic Grid Oriented to E-Tourism
NASA Astrophysics Data System (ADS)
Zhang, Xiao Ming
With increasing complexity of tourism business models and tasks, there is a clear need of the next generation e-Tourism infrastructure to support flexible automation, integration, computation, storage, and collaboration. Currently several enabling technologies such as semantic Web, Web service, agent and grid computing have been applied in the different e-Tourism applications, however there is no a unified framework to be able to integrate all of them. So this paper presents a promising e-Tourism framework based on emerging semantic grid, in which a number of key design issues are discussed including architecture, ontologies structure, semantic reconciliation, service and resource discovery, role based authorization and intelligent agent. The paper finally provides the implementation of the framework.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1992-01-01
The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.; Lytle, John K. (Technical Monitor)
2002-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS Architecture including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms which has been the focus of NASA Ames. Additional features of the object oriented architecture that support MultiDisciplinary (MD) Coupling, computer aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.
Sharma, Harshita; Zerbe, Norman; Klempert, Iris; Hellwich, Olaf; Hufnagl, Peter
2017-11-01
Deep learning using convolutional neural networks is an actively emerging field in histological image analysis. This study explores deep learning methods for computer-aided classification in H&E stained histopathological whole slide images of gastric carcinoma. An introductory convolutional neural network architecture is proposed for two computerized applications, namely, cancer classification based on immunohistochemical response and necrosis detection based on the existence of tumor necrosis in the tissue. Classification performance of the developed deep learning approach is quantitatively compared with traditional image analysis methods in digital histopathology requiring prior computation of handcrafted features, such as statistical measures using gray level co-occurrence matrix, Gabor filter-bank responses, LBP histograms, gray histograms, HSV histograms and RGB histograms, followed by random forest machine learning. Additionally, the widely known AlexNet deep convolutional framework is comparatively analyzed for the corresponding classification problems. The proposed convolutional neural network architecture reports favorable results, with an overall classification accuracy of 0.6990 for cancer classification and 0.8144 for necrosis detection. Copyright © 2017 Elsevier Ltd. All rights reserved.
FPGA-accelerated algorithm for the regular expression matching system
NASA Astrophysics Data System (ADS)
Russek, P.; Wiatr, K.
2015-01-01
This article describes an algorithm to support a regular expressions matching system. The goal was to achieve an attractive performance system with low energy consumption. The basic idea of the algorithm comes from a concept of the Bloom filter. It starts from the extraction of static sub-strings for strings of regular expressions. The algorithm is devised to gain from its decomposition into parts which are intended to be executed by custom hardware and the central processing unit (CPU). The pipelined custom processor architecture is proposed and a software algorithm explained accordingly. The software part of the algorithm was coded in C and runs on a processor from the ARM family. The hardware architecture was described in VHDL and implemented in field programmable gate array (FPGA). The performance results and required resources of the above experiments are given. An example of target application for the presented solution is computer and network security systems. The idea was tested on nearly 100,000 body-based viruses from the ClamAV virus database. The solution is intended for the emerging technology of clusters of low-energy computing nodes.
A compound memristive synapse model for statistical learning through STDP in spiking neural networks
Bill, Johannes; Legenstein, Robert
2014-01-01
Memristors have recently emerged as promising circuit elements to mimic the function of biological synapses in neuromorphic computing. The fabrication of reliable nanoscale memristive synapses, that feature continuous conductance changes based on the timing of pre- and postsynaptic spikes, has however turned out to be challenging. In this article, we propose an alternative approach, the compound memristive synapse, that circumvents this problem by the use of memristors with binary memristive states. A compound memristive synapse employs multiple bistable memristors in parallel to jointly form one synapse, thereby providing a spectrum of synaptic efficacies. We investigate the computational implications of synaptic plasticity in the compound synapse by integrating the recently observed phenomenon of stochastic filament formation into an abstract model of stochastic switching. Using this abstract model, we first show how standard pulsing schemes give rise to spike-timing dependent plasticity (STDP) with a stabilizing weight dependence in compound synapses. In a next step, we study unsupervised learning with compound synapses in networks of spiking neurons organized in a winner-take-all architecture. Our theoretical analysis reveals that compound-synapse STDP implements generalized Expectation-Maximization in the spiking network. Specifically, the emergent synapse configuration represents the most salient features of the input distribution in a Mixture-of-Gaussians generative model. Furthermore, the network's spike response to spiking input streams approximates a well-defined Bayesian posterior distribution. We show in computer simulations how such networks learn to represent high-dimensional distributions over images of handwritten digits with high fidelity even in presence of substantial device variations and under severe noise conditions. Therefore, the compound memristive synapse may provide a synaptic design principle for future neuromorphic architectures. PMID:25565943
Distributed computing environments for future space control systems
NASA Technical Reports Server (NTRS)
Viallefont, Pierre
1993-01-01
The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.
A Metascalable Computing Framework for Large Spatiotemporal-Scale Atomistic Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nomura, K; Seymour, R; Wang, W
2009-02-17
A metascalable (or 'design once, scale on new architectures') parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials based on spatiotemporal data locality principles, which is expected to scale on emerging multipetaflops architectures. The framework consists of: (1) an embedded divide-and-conquer (EDC) algorithmic framework based on spatial locality to design linear-scaling algorithms for high complexity problems; (2) a space-time-ensemble parallel (STEP) approach based on temporal locality to predict long-time dynamics, while introducing multiple parallelization axes; and (3) a tunable hierarchical cellular decomposition (HCD) parallelization framework to map these O(N) algorithms onto a multicore cluster based onmore » hybrid implementation combining message passing and critical section-free multithreading. The EDC-STEP-HCD framework exposes maximal concurrency and data locality, thereby achieving: (1) inter-node parallel efficiency well over 0.95 for 218 billion-atom molecular-dynamics and 1.68 trillion electronic-degrees-of-freedom quantum-mechanical simulations on 212,992 IBM BlueGene/L processors (superscalability); (2) high intra-node, multithreading parallel efficiency (nanoscalability); and (3) nearly perfect time/ensemble parallel efficiency (eon-scalability). The spatiotemporal scale covered by MD simulation on a sustained petaflops computer per day (i.e. petaflops {center_dot} day of computing) is estimated as NT = 2.14 (e.g. N = 2.14 million atoms for T = 1 microseconds).« less
Electro-Optic Computing Architectures. Volume I
1998-02-01
The objective of the Electro - Optic Computing Architecture (EOCA) program was to develop multi-function electro - optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro - optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro - Optic Interface (EOI), an Optical Interconnection Unit (OW
Towards a New Assessment of Urban Areas from Local to Global Scales
NASA Astrophysics Data System (ADS)
Bhaduri, B. L.; Roy Chowdhury, P. K.; McKee, J.; Weaver, J.; Bright, E.; Weber, E.
2015-12-01
Since early 2000s, starting with NASA MODIS, satellite based remote sensing has facilitated collection of imagery with medium spatial resolution but high temporal resolution (daily). This trend continues with an increasing number of sensors and data products. Increasing spatial and temporal resolutions of remotely sensed data archives, from both public and commercial sources, have significantly enhanced the quality of mapping and change data products. However, even with automation of such analysis on evolving computing platforms, rates of data processing have been suboptimal largely because of the ever-increasing pixel to processor ratio coupled with limitations of the computing architectures. Novel approaches utilizing spatiotemporal data mining techniques and computational architectures have emerged that demonstrates the potential for sustained and geographically scalable landscape monitoring to be operational. We exemplify this challenge with two broad research initiatives on High Performance Geocomputation at Oak Ridge National Laboratory: (a) mapping global settlement distribution; (b) developing national critical infrastructure databases. Our present effort, on large GPU based architectures, to exploit high resolution (1m or less) satellite and airborne imagery for extracting settlements at global scale is yielding understanding of human settlement patterns and urban areas at unprecedented resolution. Comparison of such urban land cover database, with existing national and global land cover products, at various geographic scales in selected parts of the world is revealing intriguing patterns and insights for urban assessment. Early results, from the USA, Taiwan, and Egypt, indicate closer agreements (5-10%) in urban area assessments among databases at larger, aggregated geographic extents. However, spatial variability at local scales could be significantly different (over 50% disagreement).
Approximate Computing Techniques for Iterative Graph Algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panyala, Ajay R.; Subasi, Omer; Halappanavar, Mahantesh
Approximate computing enables processing of large-scale graphs by trading off quality for performance. Approximate computing techniques have become critical not only due to the emergence of parallel architectures but also the availability of large scale datasets enabling data-driven discovery. Using two prototypical graph algorithms, PageRank and community detection, we present several approximate computing heuristics to scale the performance with minimal loss of accuracy. We present several heuristics including loop perforation, data caching, incomplete graph coloring and synchronization, and evaluate their efficiency. We demonstrate performance improvements of up to 83% for PageRank and up to 450x for community detection, with lowmore » impact of accuracy for both the algorithms. We expect the proposed approximate techniques will enable scalable graph analytics on data of importance to several applications in science and their subsequent adoption to scale similar graph algorithms.« less
Accelerating Climate Simulations Through Hybrid Computing
NASA Technical Reports Server (NTRS)
Zhou, Shujia; Sinno, Scott; Cruz, Carlos; Purcell, Mark
2009-01-01
Unconventional multi-core processors (e.g., IBM Cell B/E and NYIDIDA GPU) have emerged as accelerators in climate simulation. However, climate models typically run on parallel computers with conventional processors (e.g., Intel and AMD) using MPI. Connecting accelerators to this architecture efficiently and easily becomes a critical issue. When using MPI for connection, we identified two challenges: (1) identical MPI implementation is required in both systems, and; (2) existing MPI code must be modified to accommodate the accelerators. In response, we have extended and deployed IBM Dynamic Application Virtualization (DAV) in a hybrid computing prototype system (one blade with two Intel quad-core processors, two IBM QS22 Cell blades, connected with Infiniband), allowing for seamlessly offloading compute-intensive functions to remote, heterogeneous accelerators in a scalable, load-balanced manner. Currently, a climate solar radiation model running with multiple MPI processes has been offloaded to multiple Cell blades with approx.10% network overhead.
The CEOS Global Observation Strategy for Disaster Risk Management: An Enterprise Architect's View
NASA Astrophysics Data System (ADS)
Moe, K.; Evans, J. D.; Frye, S.
2013-12-01
The Committee on Earth Observation Satellites (CEOS) Working Group on Information Systems and Services (WGISS), on behalf of the Global Earth Observation System of Systems (GEOSS), is defining an enterprise architecture (known as GA.4.D) for the use of satellite observations in international disaster management. This architecture defines the scope and structure of the disaster management enterprise (based on disaster types and phases); its processes (expressed via use cases / system functions); and its core values (in particular, free and open data sharing via standard interfaces). The architecture also details how a disaster management enterprise describes, obtains, and handles earth observations and data products for decision-support; and how it draws on distributed computational services for streamlined operational capability. We have begun to apply this architecture to a new CEOS initiative, the Global Observation Strategy for Disaster Risk Management (DRM). CEOS is defining this Strategy based on the outcomes of three pilot projects focused on seismic hazards, volcanoes, and floods. These pilots offer a unique opportunity to characterize and assess the impacts (benefits / costs) of the GA.4.D architecture in practice. In particular, the DRM Floods Pilot is applying satellite-based optical and radar data to flood mitigation, warning, and response, including monitoring and modeling at regional to global scales. It is focused on serving user needs and building local institutional / technical capacity in the Caribbean, Southern Africa, and Southeast Asia. In the context of these CEOS DRM Pilots, we are characterizing where and how the GA.4D architecture helps participants to: - Understand the scope and nature of hazard events quickly and accurately - Assure timely delivery of observations into analysis, modeling, and decision-making - Streamline user access to products - Lower barriers to entry for users or suppliers - Streamline or focus field operations in disaster reduction - Reduce redundancies and gaps in inter-organizational systems - Assist in planning / managing / prioritizing information and computing resources - Adapt computational resources to new technologies or evolving user needs - Sustain capability for the long term Insights from this exercise are helping us to abstract best practices applicable to other contexts, disaster types, and disaster phases, whereby local communities can improve their use of satellite data for greater preparedness. This effort is also helping to assess the likely impacts and roles of emerging technologies (such as cloud computing, "Big Data" analysis, location-based services, crowdsourcing, semantic services, small satellites, drones, direct broadcast, or model webs) in future disaster management activities.
Image-Processing Software For A Hypercube Computer
NASA Technical Reports Server (NTRS)
Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.
1992-01-01
Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.
Experimental Comparison of Two Quantum Computing Architectures
2017-03-28
IN A U G U RA L A RT IC LE CO M PU TE R SC IE N CE S Experimental comparison of two quantum computing architectures Norbert M. Linkea,b,1, Dmitri...the vast computing power a universal quantumcomputer could offer, several candidate systems are being explored. They have allowed experimental ...existing systems and the role of architecture in quantum computer design . These will be crucial for the realization of more advanced future incarna
NASA Astrophysics Data System (ADS)
Carlton, David Bryan
The exponential improvements in speed, energy efficiency, and cost that the computer industry has relied on for growth during the last 50 years are in danger of ending within the decade. These improvements all have relied on scaling the size of the silicon-based transistor that is at the heart of every modern CPU down to smaller and smaller length scales. However, as the size of the transistor reaches scales that are measured in the number of atoms that make it up, it is clear that this scaling cannot continue forever. As a result of this, there has been a great deal of research effort directed at the search for the next device that will continue to power the growth of the computer industry. However, due to the billions of dollars of investment that conventional silicon transistors have received over the years, it is unlikely that a technology will emerge that will be able to beat it outright in every performance category. More likely, different devices will possess advantages over conventional transistors for certain applications and uses. One of these emerging computing platforms is nanomagnetic logic (NML). NML-based circuits process information by manipulating the magnetization states of single-domain nanomagnets coupled to their nearest neighbors through magnetic dipole interactions. The state variable is magnetization direction and computations can take place without passing an electric current. This makes them extremely attractive as a replacement for conventional transistor-based computing architectures for certain ultra-low power applications. In most work to date, nanomagnetic logic circuits have used an external magnetic clocking field to reset the system between computations. The clocking field is then subsequently removed very slowly relative to the magnetization dynamics, guiding the nanomagnetic logic circuit adiabatically into its magnetic ground state. In this dissertation, I will discuss the dynamics behind this process and show that it is greatly influenced by thermal fluctuations. The magnetic ground state containing the answer to the computation is reached by a stochastic process very similar to the thermal annealing of crystalline materials. We will discuss how these dynamics affect the expected reliability, speed, and energy dissipation of NML systems operating under these conditions. Next I will show how a slight change in the properties of the nanomagnets that make up a NML circuit can completely alter the dynamics by which computations take place. The addition of biaxial anisotropy to the magnetic energy landscape creates a metastable state along the hard axis of the nanomagnet. This metastability can be used to remove the stochastic nature of the computation and has large implications for reliability, speed, and energy dissipation which will all be discussed. The changes to NML operation by the addition of biaxial anisotropy introduce new challenges to realizing a commercially viable logic architecture. In the final chapter, I will discuss these challenges and talk about the architectural changes that are necessary to make a working NML circuit based on nanomagnets with biaxial anisotropy.
A Programmable Five Qubit Quantum Computer Using Trapped Atomic Ions
NASA Astrophysics Data System (ADS)
Debnath, Shantanu
Quantum computers can solve certain problems more efficiently compared to conventional classical methods. In the endeavor to build a quantum computer, several competing platforms have emerged that can implement certain quantum algorithms using a few qubits. However, the demonstrations so far have been done usually by tailoring the hardware to meet the requirements of a particular algorithm implemented for a limited number of instances. Although such proof of principal implementations are important to verify the working of algorithms on a physical system, they further need to have the potential to serve as a general purpose quantum computer allowing the flexibility required for running multiple algorithms and be scaled up to host more qubits. Here we demonstrate a small programmable quantum computer based on five trapped atomic ions each of which serves as a qubit. By optically resolving each ion we can individually address them in order to perform a complete set of single-qubit and fully connected two-qubit quantum gates and alsoperform efficient individual qubit measurements. We implement a computation architecture that accepts an algorithm from a user interface in the form of a standard logic gate sequence and decomposes it into fundamental quantum operations that are native to the hardware using a set of compilation instructions that are defined within the software. These operations are then effected through a pattern of laser pulses that perform coherent rotations on targeted qubits in the chain. The architecture implemented in the experiment therefore gives us unprecedented flexibility in the programming of any quantum algorithm while staying blind to the underlying hardware. As a demonstration we implement the Deutsch-Jozsa and Bernstein-Vazirani algorithms on the five-qubit processor and achieve average success rates of 95 and 90 percent, respectively. We also implement a five-qubit coherent quantum Fourier transform and examine its performance in the period finding and phase estimation protocol. We find fidelities of 84 and 62 percent, respectively. While maintaining the same computation architecture the system can be scaled to more ions using resources that scale favorably (O(N. 2)) with the numberof qubits N.
A learnable parallel processing architecture towards unity of memory and computing
NASA Astrophysics Data System (ADS)
Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.
2015-08-01
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
A learnable parallel processing architecture towards unity of memory and computing.
Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J
2015-08-14
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
Tier-2 Optimisation for Computational Density/Diversity and Big Data
NASA Astrophysics Data System (ADS)
Fay, R. B.; Bland, J.
2014-06-01
As the number of cores on chip continues to trend upwards and new CPU architectures emerge, increasing CPU density and diversity presents multiple challenges to site administrators. These include scheduling for massively multi-core systems (potentially including Graphical Processing Units (GPU), integrated and dedicated) and Many Integrated Core (MIC)) to ensure a balanced throughput of jobs while preserving overall cluster throughput, as well as the increasing complexity of developing for these heterogeneous platforms, and the challenge in managing this more complex mix of resources. In addition, meeting data demands as both dataset sizes increase and as the rate of demand scales with increased computational power requires additional performance from the associated storage elements. In this report, we evaluate one emerging technology, Solid State Drive (SSD) caching for RAID controllers, with consideration to its potential to assist in meeting evolving demand. We also briefly consider the broader developing trends outlined above in order to identify issues that may develop and assess what actions should be taken in the immediate term to address those.
A Secure Cloud-Assisted Wireless Body Area Network in Mobile Emergency Medical Care System.
Li, Chun-Ta; Lee, Cheng-Chi; Weng, Chi-Yao
2016-05-01
Recent advances in medical treatment and emergency applications, the need of integrating wireless body area network (WBAN) with cloud computing can be motivated by providing useful and real time information about patients' health state to the doctors and emergency staffs. WBAN is a set of body sensors carried by the patient to collect and transmit numerous health items to medical clouds via wireless and public communication channels. Therefore, a cloud-assisted WBAN facilitates response in case of emergency which can save patients' lives. Since the patient's data is sensitive and private, it is important to provide strong security and protection on the patient's medical data over public and insecure communication channels. In this paper, we address the challenge of participant authentication in mobile emergency medical care systems for patients supervision and propose a secure cloud-assisted architecture for accessing and monitoring health items collected by WBAN. For ensuring a high level of security and providing a mutual authentication property, chaotic maps based authentication and key agreement mechanisms are designed according to the concept of Diffie-Hellman key exchange, which depends on the CMBDLP and CMBDHP problems. Security and performance analyses show how the proposed system guaranteed the patient privacy and the system confidentiality of sensitive medical data while preserving the low computation property in medical treatment and remote medical monitoring.
Utilizing data grid architecture for the backup and recovery of clinical image data.
Liu, Brent J; Zhou, M Z; Documet, J
2005-01-01
Grid Computing represents the latest and most exciting technology to evolve from the familiar realm of parallel, peer-to-peer and client-server models. However, there has been limited investigation into the impact of this emerging technology in medical imaging and informatics. In particular, PACS technology, an established clinical image repository system, while having matured significantly during the past ten years, still remains weak in the area of clinical image data backup. Current solutions are expensive or time consuming and the technology is far from foolproof. Many large-scale PACS archive systems still encounter downtime for hours or days, which has the critical effect of crippling daily clinical operations. In this paper, a review of current backup solutions will be presented along with a brief introduction to grid technology. Finally, research and development utilizing the grid architecture for the recovery of clinical image data, in particular, PACS image data, will be presented. The focus of this paper is centered on applying a grid computing architecture to a DICOM environment since DICOM has become the standard for clinical image data and PACS utilizes this standard. A federation of PACS can be created allowing a failed PACS archive to recover its image data from others in the federation in a seamless fashion. The design reflects the five-layer architecture of grid computing: Fabric, Resource, Connectivity, Collective, and Application Layers. The testbed Data Grid is composed of one research laboratory and two clinical sites. The Globus 3.0 Toolkit (Co-developed by the Argonne National Laboratory and Information Sciences Institute, USC) for developing the core and user level middleware is utilized to achieve grid connectivity. The successful implementation and evaluation of utilizing data grid architecture for clinical PACS data backup and recovery will provide an understanding of the methodology for using Data Grid in clinical image data backup for PACS, as well as establishment of benchmarks for performance from future grid technology improvements. In addition, the testbed can serve as a road map for expanded research into large enterprise and federation level data grids to guarantee CA (Continuous Availability, 99.999% up time) in a variety of medical data archiving, retrieval, and distribution scenarios.
Digital optical computers at the optoelectronic computing systems center
NASA Technical Reports Server (NTRS)
Jordan, Harry F.
1991-01-01
The Digital Optical Computing Program within the National Science Foundation Engineering Research Center for Opto-electronic Computing Systems has as its specific goal research on optical computing architectures suitable for use at the highest possible speeds. The program can be targeted toward exploiting the time domain because other programs in the Center are pursuing research on parallel optical systems, exploiting optical interconnection and optical devices and materials. Using a general purpose computing architecture as the focus, we are developing design techniques, tools and architecture for operation at the speed of light limit. Experimental work is being done with the somewhat low speed components currently available but with architectures which will scale up in speed as faster devices are developed. The design algorithms and tools developed for a general purpose, stored program computer are being applied to other systems such as optimally controlled optical communication networks.
RNA nanotechnology for computer design and in vivo computation
Qiu, Meikang; Khisamutdinov, Emil; Zhao, Zhengyi; Pan, Cheryl; Choi, Jeong-Woo; Leontis, Neocles B.; Guo, Peixuan
2013-01-01
Molecular-scale computing has been explored since 1989 owing to the foreseeable limitation of Moore's law for silicon-based computation devices. With the potential of massive parallelism, low energy consumption and capability of working in vivo, molecular-scale computing promises a new computational paradigm. Inspired by the concepts from the electronic computer, DNA computing has realized basic Boolean functions and has progressed into multi-layered circuits. Recently, RNA nanotechnology has emerged as an alternative approach. Owing to the newly discovered thermodynamic stability of a special RNA motif (Shu et al. 2011 Nat. Nanotechnol. 6, 658–667 (doi:10.1038/nnano.2011.105)), RNA nanoparticles are emerging as another promising medium for nanodevice and nanomedicine as well as molecular-scale computing. Like DNA, RNA sequences can be designed to form desired secondary structures in a straightforward manner, but RNA is structurally more versatile and more thermodynamically stable owing to its non-canonical base-pairing, tertiary interactions and base-stacking property. A 90-nucleotide RNA can exhibit 490 nanostructures, and its loops and tertiary architecture can serve as a mounting dovetail that eliminates the need for external linking dowels. Its enzymatic and fluorogenic activity creates diversity in computational design. Varieties of small RNA can work cooperatively, synergistically or antagonistically to carry out computational logic circuits. The riboswitch and enzymatic ribozyme activities and its special in vivo attributes offer a great potential for in vivo computation. Unique features in transcription, termination, self-assembly, self-processing and acid resistance enable in vivo production of RNA nanoparticles that harbour various regulators for intracellular manipulation. With all these advantages, RNA computation is promising, but it is still in its infancy. Many challenges still exist. Collaborations between RNA nanotechnologists and computer scientists are necessary to advance this nascent technology. PMID:24000362
RNA nanotechnology for computer design and in vivo computation.
Qiu, Meikang; Khisamutdinov, Emil; Zhao, Zhengyi; Pan, Cheryl; Choi, Jeong-Woo; Leontis, Neocles B; Guo, Peixuan
2013-10-13
Molecular-scale computing has been explored since 1989 owing to the foreseeable limitation of Moore's law for silicon-based computation devices. With the potential of massive parallelism, low energy consumption and capability of working in vivo, molecular-scale computing promises a new computational paradigm. Inspired by the concepts from the electronic computer, DNA computing has realized basic Boolean functions and has progressed into multi-layered circuits. Recently, RNA nanotechnology has emerged as an alternative approach. Owing to the newly discovered thermodynamic stability of a special RNA motif (Shu et al. 2011 Nat. Nanotechnol. 6, 658-667 (doi:10.1038/nnano.2011.105)), RNA nanoparticles are emerging as another promising medium for nanodevice and nanomedicine as well as molecular-scale computing. Like DNA, RNA sequences can be designed to form desired secondary structures in a straightforward manner, but RNA is structurally more versatile and more thermodynamically stable owing to its non-canonical base-pairing, tertiary interactions and base-stacking property. A 90-nucleotide RNA can exhibit 4⁹⁰ nanostructures, and its loops and tertiary architecture can serve as a mounting dovetail that eliminates the need for external linking dowels. Its enzymatic and fluorogenic activity creates diversity in computational design. Varieties of small RNA can work cooperatively, synergistically or antagonistically to carry out computational logic circuits. The riboswitch and enzymatic ribozyme activities and its special in vivo attributes offer a great potential for in vivo computation. Unique features in transcription, termination, self-assembly, self-processing and acid resistance enable in vivo production of RNA nanoparticles that harbour various regulators for intracellular manipulation. With all these advantages, RNA computation is promising, but it is still in its infancy. Many challenges still exist. Collaborations between RNA nanotechnologists and computer scientists are necessary to advance this nascent technology.
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires of k 2 − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis
1998-02-01
The objective of the Electro - Optic Computing Architecture (EOCA) program was to develop multi-function electro - optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro - optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro - Optic Interface (EOI), an Optical Interconnection Unit
Lin, Shih-Sung; Hung, Min-Hsiung; Tsai, Chang-Lung; Chou, Li-Ping
2012-12-01
The study aims to provide an ease-of-use approach for senior patients to utilize remote healthcare systems. An ease-of-use remote healthcare system (RHS) architecture using RFID (Radio Frequency Identification) and networking technologies is developed. Specifically, the codes in RFID tags are used for authenticating the patients' ID to secure and ease the login process. The patient needs only to take one action, i.e. placing a RFID tag onto the reader, to automatically login and start the RHS and then acquire automatic medical services. An ease-of-use emergency monitoring and reporting mechanism is developed as well to monitor and protect the safety of the senior patients who have to be left alone at home. By just pressing a single button, the RHS can automatically report the patient's emergency information to the clinic side so that the responsible medical personnel can take proper urgent actions for the patient. Besides, Web services technology is used to build the Internet communication scheme of the RHS so that the interoperability and data transmission security between the home server and the clinical server can be enhanced. A prototype RHS is constructed to validate the effectiveness of our designs. Testing results show that the proposed RHS architecture possesses the characteristics of ease to use, simplicity to operate, promptness in login, and no need to preserve identity information. The proposed RHS architecture can effectively increase the willingness of senior patients who act slowly or are unfamiliar with computer operations to use the RHS. The research results can be used as an add-on for developing future remote healthcare systems.
Testolin, Alberto; De Filippo De Grazia, Michele; Zorzi, Marco
2017-01-01
The recent "deep learning revolution" in artificial neural networks had strong impact and widespread deployment for engineering applications, but the use of deep learning for neurocomputational modeling has been so far limited. In this article we argue that unsupervised deep learning represents an important step forward for improving neurocomputational models of perception and cognition, because it emphasizes the role of generative learning as opposed to discriminative (supervised) learning. As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations. We compare different network architectures commonly used as building blocks for unsupervised deep learning by systematically testing the type of receptive fields and gain modulation developed by the hidden neurons. In particular, we compare Restricted Boltzmann Machines (RBMs), which are stochastic, generative networks with bidirectional connections trained using contrastive divergence, with autoencoders, which are deterministic networks trained using error backpropagation. For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation. The unsupervised models are then compared with supervised, feed-forward networks that learn an explicit mapping between different spatial reference frames. Our simulations show that both architectural and learning constraints strongly influenced the emergent coding of visual space in terms of distribution of tuning functions at the level of single neurons. Unsupervised models, and particularly RBMs, were found to more closely adhere to neurophysiological data from single-cell recordings in the primate parietal cortex. These results provide new insights into how basic properties of artificial neural networks might be relevant for modeling neural information processing in biological systems.
Testolin, Alberto; De Filippo De Grazia, Michele; Zorzi, Marco
2017-01-01
The recent “deep learning revolution” in artificial neural networks had strong impact and widespread deployment for engineering applications, but the use of deep learning for neurocomputational modeling has been so far limited. In this article we argue that unsupervised deep learning represents an important step forward for improving neurocomputational models of perception and cognition, because it emphasizes the role of generative learning as opposed to discriminative (supervised) learning. As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations. We compare different network architectures commonly used as building blocks for unsupervised deep learning by systematically testing the type of receptive fields and gain modulation developed by the hidden neurons. In particular, we compare Restricted Boltzmann Machines (RBMs), which are stochastic, generative networks with bidirectional connections trained using contrastive divergence, with autoencoders, which are deterministic networks trained using error backpropagation. For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation. The unsupervised models are then compared with supervised, feed-forward networks that learn an explicit mapping between different spatial reference frames. Our simulations show that both architectural and learning constraints strongly influenced the emergent coding of visual space in terms of distribution of tuning functions at the level of single neurons. Unsupervised models, and particularly RBMs, were found to more closely adhere to neurophysiological data from single-cell recordings in the primate parietal cortex. These results provide new insights into how basic properties of artificial neural networks might be relevant for modeling neural information processing in biological systems. PMID:28377709
NASA Astrophysics Data System (ADS)
Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng
2018-04-01
In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture shows performances of better speed-device consumption efficiency compared with the previous kernel storage architecture. Further we improve the architecture for a high accuracy and low power computing by utilizing the binary storage and the series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows excellent performances including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) more than 67 times speed boost; 3) 71.4% energy saving.
NASA Technical Reports Server (NTRS)
Hsia, T. C.; Lu, G. Z.; Han, W. H.
1987-01-01
In advanced robot control problems, on-line computation of inverse Jacobian solution is frequently required. Parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be inplemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
Computational sciences in the upstream oil and gas industry
Halsey, Thomas C.
2016-01-01
The predominant technical challenge of the upstream oil and gas industry has always been the fundamental uncertainty of the subsurface from which it produces hydrocarbon fluids. The subsurface can be detected remotely by, for example, seismic waves, or it can be penetrated and studied in the extremely limited vicinity of wells. Inevitably, a great deal of uncertainty remains. Computational sciences have been a key avenue to reduce and manage this uncertainty. In this review, we discuss at a relatively non-technical level the current state of three applications of computational sciences in the industry. The first of these is seismic imaging, which is currently being revolutionized by the emergence of full wavefield inversion, enabled by algorithmic advances and petascale computing. The second is reservoir simulation, also being advanced through the use of modern highly parallel computing architectures. Finally, we comment on the role of data analytics in the upstream industry. This article is part of the themed issue ‘Energy and the subsurface’. PMID:27597785
On the Impact of Execution Models: A Case Study in Computational Chemistry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chavarría-Miranda, Daniel; Halappanavar, Mahantesh; Krishnamoorthy, Sriram
2015-05-25
Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore the impact of execution models on the utilization of an HPC system. We demonstrate a 50 percent improvement in performance by using work stealing relative to a more traditional static scheduling approach. We also use a novel semi-matching technique for load balancingmore » that has comparable performance to a traditional hypergraph-based partitioning implementation, which is computationally expensive. Using this study, we found that execution model design choices and assumptions can limit critical optimizations such as global, dynamic load balancing and finding the correct balance between available work units and different system and runtime overheads. With the emergence of multi- and many-core architectures and the consequent growth in the complexity of HPC platforms, we believe that these lessons will be beneficial to researchers tuning diverse applications on modern HPC platforms, especially on emerging dynamic platforms with energy-induced performance variability.« less
Coordinated traffic incident management using the I-Net embedded sensor architecture
NASA Astrophysics Data System (ADS)
Dudziak, Martin J.
1999-01-01
The I-Net intelligent embedded sensor architecture enables the reconfigurable construction of wide-area remote sensing and data collection networks employing diverse processing and data acquisition modules communicating over thin- server/thin-client protocols. Adaptive initially for operation using mobile remotely-piloted vehicle platforms such as small helicopter robots such as the Hornet and Ascend-I, the I-Net architecture lends itself to a critical problem in the management of both spontaneous and planned traffic congestion and rerouting over major interstate thoroughfares such as the I-95 Corridor. Pre-programmed flight plans and ad hoc operator-assisted navigation of the lightweight helicopter, using an auto-pilot and gyroscopic stabilization augmentation units, allows daytime or nighttime over-the-horizon flights of the unit to collect and transmit real-time video imagery that may be stored or transmitted to other locations. With on-board GPS and ground-based pattern recognition capabilities to augment the standard video collection process, this approach enables traffic management and emergency response teams to plan and assist real-time in the adjustment of traffic flows in high- density or congested areas or during dangerous road conditions such as during ice, snow, and hurricane storms. The I-Net architecture allows for integration of land-based and roadside sensors within a comprehensive automated traffic management system with communications to and form an airborne or other platform to devices in the network other than human-operated desktop computers, thereby allowing more rapid assimilation and response for critical data. Experiments have been conducted using several modified platforms and standard video and still photographic equipment. Current research and development is focused upon modification of the modular instrumentation units in order to accommodate faster loading and reloading of equipment onto the RPV, extension of the I-Net architecture to enable RPV-to-RPV signaling and control, and refinement of safety and emergency mechanisms to handle RPV mechanical failure during flight.
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
Computer-aided acquisition and logistics support (CALS): Concept of Operations for Depot Maintenance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bourgeois, N.C.; Greer, D.K.
1993-04-01
This CALS Concept of Operations for Depot Maintenance provides the foundation strategy and the near term tactical plan for CALS implementation in the depot maintenance environment. The user requirements enumerated and the overarching architecture outlined serve as the primary framework for implementation planning. The seamless integration of depot maintenance business processes and supporting information systems with the emerging global CALS environment will be critical to the efficient realization of depot user's information requirements, and as, such will be a fundamental theme in depot implementations.
Murakoshi, Kazushi; Mizuno, Junya
2004-11-01
In order to rapidly follow unexpected environmental changes, we propose a parameter control method in reinforcement learning that changes each of learning parameters in appropriate directions. We determine each appropriate direction on the basis of relationships between behaviors and neuromodulators by considering an emergency as a key word. Computer experiments show that the agents using our proposed method could rapidly respond to unexpected environmental changes, not depending on either two reinforcement learning algorithms (Q-learning and actor-critic (AC) architecture) or two learning problems (discontinuous and continuous state-action problems).
Adaptation of XMM-Newton SAS to GRID and VO architectures via web
NASA Astrophysics Data System (ADS)
Ibarra, A.; de La Calle, I.; Gabriel, C.; Salgado, J.; Osuna, P.
2008-10-01
The XMM-Newton Scientific Analysis Software (SAS) is a robust software that has allowed users to produce good scientific results since the beginning of the mission. This has been possible given the SAS capability to evolve with the advent of new technologies and adapt to the needs of the scientific community. The prototype of the Remote Interface for Science Analysis (RISA) presented here, is one such example, which provides remote analysis of XMM-Newton data with access to all the existing SAS functionality, while making use of GRID computing technology. This new technology has recently emerged within the astrophysical community to tackle the ever lasting problem of computer power for the reduction of large amounts of data.
Collaborative Working Architecture for IoT-Based Applications.
Mora, Higinio; Signes-Pont, María Teresa; Gil, David; Johnsson, Magnus
2018-05-23
The new sensing applications need enhanced computing capabilities to handle the requirements of complex and huge data processing. The Internet of Things (IoT) concept brings processing and communication features to devices. In addition, the Cloud Computing paradigm provides resources and infrastructures for performing the computations and outsourcing the work from the IoT devices. This scenario opens new opportunities for designing advanced IoT-based applications, however, there is still much research to be done to properly gear all the systems for working together. This work proposes a collaborative model and an architecture to take advantage of the available computing resources. The resulting architecture involves a novel network design with different levels which combines sensing and processing capabilities based on the Mobile Cloud Computing (MCC) paradigm. An experiment is included to demonstrate that this approach can be used in diverse real applications. The results show the flexibility of the architecture to perform complex computational tasks of advanced applications.
Optimizing Engineering Tools Using Modern Ground Architectures
2017-12-01
Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
Heterogeneous compute in computer vision: OpenCL in OpenCV
NASA Astrophysics Data System (ADS)
Gasparakis, Harris
2014-02-01
We explore the relevance of Heterogeneous System Architecture (HSA) in Computer Vision, both as a long term vision, and as a near term emerging reality via the recently ratified OpenCL 2.0 Khronos standard. After a brief review of OpenCL 1.2 and 2.0, including HSA features such as Shared Virtual Memory (SVM) and platform atomics, we identify what genres of Computer Vision workloads stand to benefit by leveraging those features, and we suggest a new mental framework that replaces GPU compute with hybrid HSA APU compute. As a case in point, we discuss, in some detail, popular object recognition algorithms (part-based models), emphasizing the interplay and concurrent collaboration between the GPU and CPU. We conclude by describing how OpenCL has been incorporated in OpenCV, a popular open source computer vision library, emphasizing recent work on the Transparent API, to appear in OpenCV 3.0, which unifies the native CPU and OpenCL execution paths under a single API, allowing the same code to execute either on CPU or on a OpenCL enabled device, without even recompiling.
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
Soft Multifunctional Composites and Emulsions with Liquid Metals.
Kazem, Navid; Hellebrekers, Tess; Majidi, Carmel
2017-07-01
Binary mixtures of liquid metal (LM) or low-melting-point alloy (LMPA) in an elastomeric or fluidic carrier medium can exhibit unique combinations of electrical, thermal, and mechanical properties. This emerging class of soft multifunctional composites have potential applications in wearable computing, bio-inspired robotics, and shape-programmable architectures. The dispersion phase can range from dilute droplets to connected networks that support electrical conductivity. In contrast to deterministically patterned LM microfluidics, LMPA- and LM-embedded elastomer (LMEE) composites are statistically homogenous and exhibit effective bulk properties. Eutectic Ga-In (EGaIn) and Ga-In-Sn (Galinstan) alloys are typically used due to their high conductivity, low viscosity, negligible nontoxicity, and ability to wet to nonmetallic materials. Because they are liquid-phase, these alloys can alter the electrical and thermal properties of the composite while preserving the mechanics of the surrounding medium. For composites with LMPA inclusions (e.g., Field's metal, Pb-based solder), mechanical rigidity can be actively tuned with external heating or electrical activation. This progress report, reviews recent experimental and theoretical studies of this emerging class of soft material architectures and identifies current technical challenges and opportunities for further advancement. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
State-of-the-art in Heterogeneous Computing
Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...
2010-01-01
Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, availablemore » software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.« less
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1991-01-01
The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
Advanced computer architecture for large-scale real-time applications.
DOT National Transportation Integrated Search
1973-04-01
Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...
Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.
ERIC Educational Resources Information Center
Beltrametti, Monica; English, Will
1994-01-01
Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…
ERIC Educational Resources Information Center
Farid, Ayman A.; Zaghloul, Weaam M.; Dewidar, Khaled M.
2014-01-01
The great shift in sustainability and computer aided design in the field of architecture caused a remarkable change in the architecture philosophy, new aspects of beauty and aesthetic values are being introduced, and traditional definitions for beauty cannot fully cover this aspects, which causes a gap between; new architecture works criticism and…
Execution environment for intelligent real-time control systems
NASA Technical Reports Server (NTRS)
Sztipanovits, Janos
1987-01-01
Modern telerobot control technology requires the integration of symbolic and non-symbolic programming techniques, different models of parallel computations, and various programming paradigms. The Multigraph Architecture, which has been developed for the implementation of intelligent real-time control systems is described. The layered architecture includes specific computational models, integrated execution environment and various high-level tools. A special feature of the architecture is the tight coupling between the symbolic and non-symbolic computations. It supports not only a data interface, but also the integration of the control structures in a parallel computing environment.
Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy
Hwang, Wen-Jyi; Cheng, Shih-Chang; Cheng, Chau-Jern
2011-01-01
This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipeline fashion to maximize throughput of the computation. Moreover, the number of hardware multipliers and dividers are minimized to reduce the hardware costs. The proposed architecture is used as a custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system. PMID:22163688
Feasibility of Using Distributed Wireless Mesh Networks for Medical Emergency Response
Braunstein, Brian; Trimble, Troy; Mishra, Rajesh; Manoj, B. S.; Rao, Ramesh; Lenert, Leslie
2006-01-01
Achieving reliable, efficient data communications networks at a disaster site is a difficult task. Network paradigms, such as Wireless Mesh Network (WMN) architectures, form one exemplar for providing high-bandwidth, scalable data communication for medical emergency response activity. WMNs are created by self-organized wireless nodes that use multi-hop wireless relaying for data transfer. In this paper, we describe our experience using a mesh network architecture we developed for homeland security and medical emergency applications. We briefly discuss the architecture and present the traffic behavioral observations made by a client-server medical emergency application tested during a large-scale homeland security drill. We present our traffic measurements, describe lessons learned, and offer functional requirements (based on field testing) for practical 802.11 mesh medical emergency response networks. With certain caveats, the results suggest that 802.11 mesh networks are feasible and scalable systems for field communications in disaster settings. PMID:17238308
Performance/price estimates for cortex-scale hardware: a design space exploration.
Zaveri, Mazad S; Hammerstrom, Dan
2011-04-01
In this paper, we revisit the concept of virtualization. Virtualization is useful for understanding and investigating the performance/price and other trade-offs related to the hardware design space. Moreover, it is perhaps the most important aspect of a hardware design space exploration. Such a design space exploration is a necessary part of the study of hardware architectures for large-scale computational models for intelligent computing, including AI, Bayesian, bio-inspired and neural models. A methodical exploration is needed to identify potentially interesting regions in the design space, and to assess the relative performance/price points of these implementations. As an example, in this paper we investigate the performance/price of (digital and mixed-signal) CMOS and hypothetical CMOL (nanogrid) technology based hardware implementations of human cortex-scale spiking neural systems. Through this analysis, and the resulting performance/price points, we demonstrate, in general, the importance of virtualization, and of doing these kinds of design space explorations. The specific results suggest that hybrid nanotechnology such as CMOL is a promising candidate to implement very large-scale spiking neural systems, providing a more efficient utilization of the density and storage benefits of emerging nano-scale technologies. In general, we believe that the study of such hypothetical designs/architectures will guide the neuromorphic hardware community towards building large-scale systems, and help guide research trends in intelligent computing, and computer engineering. Copyright © 2010 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Betts, Janelle Lyon
2001-01-01
Describes a high school art assignment in which students utilize Appleworks or Claris Works to design their own house, after learning about architectural styles and how to use the computer program. States that the project develops student computer skills and increases student knowledge about architecture. (CMK)
A compact physical model for the simulation of pNML-based architectures
NASA Astrophysics Data System (ADS)
Turvani, G.; Riente, F.; Plozner, E.; Schmitt-Landsiedel, D.; Breitkreutz-v. Gamm, S.
2017-05-01
Among emerging technologies, perpendicular Nanomagnetic Logic (pNML) seems to be very promising because of its capability of combining logic and memory onto the same device, scalability, 3D-integration and low power consumption. Recently, Full Adder (FA) structures clocked by a global magnetic field have been experimentally demonstrated and detailed characterizations of the switching process governing the domain wall (DW) nucleation probability Pnuc and time tnuc have been performed. However, the design of pNML architectures represent a crucial point in the study of this technology; this can have a remarkable impact on the reliability of pNML structures. Here, we present a compact model developed in VHDL which enables to simulate complex pNML architectures while keeping into account critical physical parameters. Therefore, such parameters have been extracted from the experiments, fitted by the corresponding physical equations and encapsulated into the proposed model. Within this, magnetic structures are decomposed into a few basic elements (nucleation centers, nanowires, inverters etc.) represented by the according physical description. To validate the model, we redesigned a FA and compared our simulation results to the experiment. With this compact model of pNML devices we have envisioned a new methodology which makes it possible to simulate and test the physical behavior of complex architectures with very low computational costs.
Fully parallel write/read in resistive synaptic array for accelerating on-chip learning
NASA Astrophysics Data System (ADS)
Gao, Ligang; Wang, I.-Ting; Chen, Pai-Yu; Vrudhula, Sarma; Seo, Jae-sun; Cao, Yu; Hou, Tuo-Hung; Yu, Shimeng
2015-11-01
A neuro-inspired computing paradigm beyond the von Neumann architecture is emerging and it generally takes advantage of massive parallelism and is aimed at complex tasks that involve intelligence and learning. The cross-point array architecture with synaptic devices has been proposed for on-chip implementation of the weighted sum and weight update in the learning algorithms. In this work, forming-free, silicon-process-compatible Ta/TaO x /TiO2/Ti synaptic devices are fabricated, in which >200 levels of conductance states could be continuously tuned by identical programming pulses. In order to demonstrate the advantages of parallelism of the cross-point array architecture, a novel fully parallel write scheme is designed and experimentally demonstrated in a small-scale crossbar array to accelerate the weight update in the training process, at a speed that is independent of the array size. Compared to the conventional row-by-row write scheme, it achieves >30× speed-up and >30× improvement in energy efficiency as projected in a large-scale array. If realistic synaptic device characteristics such as device variations are taken into an array-level simulation, the proposed array architecture is able to achieve ∼95% recognition accuracy of MNIST handwritten digits, which is close to the accuracy achieved by software using the ideal sparse coding algorithm.
Software-defined optical network for metro-scale geographically distributed data centers.
Samadi, Payman; Wen, Ke; Xu, Junjie; Bergman, Keren
2016-05-30
The emergence of cloud computing and big data has rapidly increased the deployment of small and mid-sized data centers. Enterprises and cloud providers require an agile network among these data centers to empower application reliability and flexible scalability. We present a software-defined inter data center network to enable on-demand scale out of data centers on a metro-scale optical network. The architecture consists of a combined space/wavelength switching platform and a Software-Defined Networking (SDN) control plane equipped with a wavelength and routing assignment module. It enables establishing transparent and bandwidth-selective connections from L2/L3 switches, on-demand. The architecture is evaluated in a testbed consisting of 3 data centers, 5-25 km apart. We successfully demonstrated end-to-end bulk data transfer and Virtual Machine (VM) migrations across data centers with less than 100 ms connection setup time and close to full link capacity utilization.
Synthesis of InSb Nanowire Architectures - Building Blocks for Majorana Devices
NASA Astrophysics Data System (ADS)
Car, Diana
Breakthroughs in material development are playing a major role in the emerging field of topological quantum computation with Majorana Zero Modes (MZMs). Due to the strong spin-orbit interaction and large Landé g-factor InSb nanowires are one of the most promising one dimensional material systems in which to detect MZMs. The next generation of Majorana experiments should move beyond zero-mode detection and demonstrate the non-Abelian nature of MZMs by braiding. To achieve this goal advanced material platforms are needed: low-disorder, single-crystalline, planar networks of nanowires with high spin-orbit energy. In this talk I will discuss the formation and electronic properties of InSb nanowire networks. The bottom-up synthesis method we have developed is generic and can be employed to synthesize interconnected nanowire architectures of group III-V, II-VI and IV materials as long as they grow along a <111>direction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arumugam, Kamesh
Efficient parallel implementations of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. This requires - exploiting the data parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of these applications employ irregular algorithms which exhibit data-dependent control-ow and irregular memory accesses. Furthermore,more » these applications are often iterative with dependency between steps, and thus making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particles beam dynamics is one such application where the distribution of work and memory access pattern at each time step is irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications have been focused around optimizing the irregular, data-dependent memory accesses and control-ow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-ow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps. In this dissertation, we present novel machine learning based optimization techniques to address the parallel implementation challenges of such irregular applications on different HPC architectures. In particular, we use supervised learning to predict the computation structure and use it to address the control-ow and memory access irregularities in the parallel implementation of such applications on GPUs, Xeon Phis, and heterogeneous architectures composed of multi-core CPUs with GPUs or Xeon Phis. We use numerical simulation of charged particles beam dynamics simulation as a motivating example throughout the dissertation to present our new approach, though they should be equally applicable to a wide range of irregular applications. The machine learning approach presented here use predictive analytics and forecasting techniques to adaptively model and track the irregular memory access pattern at each time step of the simulation to anticipate the future memory access pattern. Access pattern forecasts can then be used to formulate optimization decisions during application execution which improves the performance of the application at a future time step based on the observations from earlier time steps. In heterogeneous architectures, forecasts can also be used to improve the memory performance and resource utilization of all the processing units to deliver a good aggregate performance. We used these optimization techniques and anticipation strategy to design a cache-aware, memory efficient parallel algorithm to address the irregularities in the parallel implementation of charged particles beam dynamics simulation on different HPC architectures. Experimental result using a diverse mix of HPC architectures shows that our approach in using anticipation strategy is effective in maximizing data reuse, ensuring workload balance, minimizing branch and memory divergence, and in improving resource utilization.« less
Hauenstein, Logan; Gao, Tia; Sze, Tsz Wo; Crawford, David; Alm, Alex; White, David
2006-01-01
Real-time information communication presents a persistent challenge to the emergency response community. During a medical emergency, various first response disciplines including Emergency Medical Service (EMS), Fire, and Police, and multiple health service facilities including hospitals, auxiliary care centers and public health departments using disparate information technology systems must coordinate their efforts by sharing real-time information. This paper describes a service-oriented architecture (SOA) that uses shared data models of emergency incidents to support the exchange of data between heterogeneous systems. This architecture is employed in the Advanced Health and Disaster Aid Network (AID-N) system, a testbed investigating information technologies to improve interoperation among multiple emergency response organizations in the Washington DC Metropolitan region. This architecture allows us to enable real-time data communication between three deployed systems: 1) a pre-hospital patient care reporting software system used on all ambulances in Arlington County, Virginia (MICHAELS), 2) a syndromic surveillance system used by public health departments in the Washington area (ESSENCE), and 3) a hazardous material reference software system (WISER) developed by the National Library Medicine. Additionally, we have extended our system to communicate with three new data sources: 1) wireless automated vital sign sensors worn by patients, 2) web portals for admitting hospitals, and 3) PDAs used by first responders at emergency scenes to input data (SIRP).
The Emerging Chinese Institutional Architecture in Higher Education
ERIC Educational Resources Information Center
Lo, William Yat Wai
2014-01-01
The article reviews the global landscape of higher education with the anticipation of an emerging Chinese institutional architecture in Asia-Pacific higher education. It starts with a theoretical framework for analyzing the functionalities of values and institutions in international higher education by adopting Joseph Nye's concept of soft power.…
Evaluation of Visual Computer Simulator for Computer Architecture Education
ERIC Educational Resources Information Center
Imai, Yoshiro; Imai, Masatoshi; Moritoh, Yoshio
2013-01-01
This paper presents trial evaluation of a visual computer simulator in 2009-2011, which has been developed to play some roles of both instruction facility and learning tool simultaneously. And it illustrates an example of Computer Architecture education for University students and usage of e-Learning tool for Assembly Programming in order to…
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
A heterogeneous hierarchical architecture for real-time computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skroch, D.A.; Fornaro, R.J.
The need for high-speed data acquisition and control algorithms has prompted continued research in the area of multiprocessor systems and related programming techniques. The result presented here is a unique hardware and software architecture for high-speed real-time computer systems. The implementation of a prototype of this architecture has required the integration of architecture, operating systems and programming languages into a cohesive unit. This report describes a Heterogeneous Hierarchial Architecture for Real-Time (H{sup 2} ART) and system software for program loading and interprocessor communication.
The graphical brain: Belief propagation and active inference
Friston, Karl J.; Parr, Thomas; de Vries, Bert
2018-01-01
This paper considers functional integration in the brain from a computational perspective. We ask what sort of neuronal message passing is mandated by active inference—and what implications this has for context-sensitive connectivity at microscopic and macroscopic levels. In particular, we formulate neuronal processing as belief propagation under deep generative models. Crucially, these models can entertain both discrete and continuous states, leading to distinct schemes for belief updating that play out on the same (neuronal) architecture. Technically, we use Forney (normal) factor graphs to elucidate the requisite message passing in terms of its form and scheduling. To accommodate mixed generative models (of discrete and continuous states), one also has to consider link nodes or factors that enable discrete and continuous representations to talk to each other. When mapping the implicit computational architecture onto neuronal connectivity, several interesting features emerge. For example, Bayesian model averaging and comparison, which link discrete and continuous states, may be implemented in thalamocortical loops. These and other considerations speak to a computational connectome that is inherently state dependent and self-organizing in ways that yield to a principled (variational) account. We conclude with simulations of reading that illustrate the implicit neuronal message passing, with a special focus on how discrete (semantic) representations inform, and are informed by, continuous (visual) sampling of the sensorium. Author Summary This paper considers functional integration in the brain from a computational perspective. We ask what sort of neuronal message passing is mandated by active inference—and what implications this has for context-sensitive connectivity at microscopic and macroscopic levels. In particular, we formulate neuronal processing as belief propagation under deep generative models that can entertain both discrete and continuous states. This leads to distinct schemes for belief updating that play out on the same (neuronal) architecture. Technically, we use Forney (normal) factor graphs to characterize the requisite message passing, and link this formal characterization to canonical microcircuits and extrinsic connectivity in the brain. PMID:29417960
Optimization of a Lattice Boltzmann Computation on State-of-the-Art Multicore Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Carter, Jonathan; Oliker, Leonid
2009-04-10
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of search-based performance optimizations, popular in linear algebra and FFT libraries, to application-specific computational kernels. Our work applies this strategy to a lattice Boltzmann application (LBMHD) that historically has made poor use of scalar microprocessors due to its complex data structures and memory access patterns. We explore one of the broadest sets of multicore architectures in the HPC literature, including the Intel Xeon E5345 (Clovertown), AMD Opteron 2214 (Santa Rosa), AMD Opteron 2356 (Barcelona), Sun T5140 T2+ (Victoria Falls), as well asmore » a QS20 IBM Cell Blade. Rather than hand-tuning LBMHD for each system, we develop a code generator that allows us to identify a highly optimized version for each platform, while amortizing the human programming effort. Results show that our auto-tuned LBMHD application achieves up to a 15x improvement compared with the original code at a given concurrency. Additionally, we present detailed analysis of each optimization, which reveal surprising hardware bottlenecks and software challenges for future multicore systems and applications.« less
Dynamic Distribution and Layouting of Model-Based User Interfaces in Smart Environments
NASA Astrophysics Data System (ADS)
Roscher, Dirk; Lehmann, Grzegorz; Schwartze, Veit; Blumendorf, Marco; Albayrak, Sahin
The developments in computer technology in the last decade change the ways of computer utilization. The emerging smart environments make it possible to build ubiquitous applications that assist users during their everyday life, at any time, in any context. But the variety of contexts-of-use (user, platform and environment) makes the development of such ubiquitous applications for smart environments and especially its user interfaces a challenging and time-consuming task. We propose a model-based approach, which allows adapting the user interface at runtime to numerous (also unknown) contexts-of-use. Based on a user interface modelling language, defining the fundamentals and constraints of the user interface, a runtime architecture exploits the description to adapt the user interface to the current context-of-use. The architecture provides automatic distribution and layout algorithms for adapting the applications also to contexts unforeseen at design time. Designers do not specify predefined adaptations for each specific situation, but adaptation constraints and guidelines. Furthermore, users are provided with a meta user interface to influence the adaptations according to their needs. A smart home energy management system serves as running example to illustrate the approach.
ERIC Educational Resources Information Center
Arumi, Francisco N.
Computer programs capable of describing the thermal behavior of buildings are used to help architectural students understand environmental systems. The Numerical Simulation Laboratory at the Architectural School of the University of Texas at Austin was developed to provide the necessary software capable of simulating the energy transactions…
NASA Astrophysics Data System (ADS)
Armstrong, Michael James
Increases in power demands and changes in the design practices of overall equipment manufacturers has led to a new paradigm in vehicle systems definition. The development of unique power systems architectures is of increasing importance to overall platform feasibility and must be pursued early in the aircraft design process. Many vehicle systems architecture trades must be conducted concurrent to platform definition. With an increased complexity introduced during conceptual design, accurate predictions of unit level sizing requirements must be made. Architecture specific emergent requirements must be identified which arise due to the complex integrated effect of unit behaviors. Off-nominal operating scenarios present sizing critical requirements to the aircraft vehicle systems. These requirements are architecture specific and emergent. Standard heuristically defined failure mitigation is sufficient for sizing traditional and evolutionary architectures. However, architecture concepts which vary significantly in terms of structure and composition require that unique failure mitigation strategies be defined for accurate estimations of unit level requirements. Identifying of these off-nominal emergent operational requirements require extensions to traditional safety and reliability tools and the systematic identification of optimal performance degradation strategies. Discrete operational constraints posed by traditional Functional Hazard Assessment (FHA) are replaced by continuous relationships between function loss and operational hazard. These relationships pose the objective function for hazard minimization. Load shedding optimization is performed for all statistically significant failures by varying the allocation of functional capability throughout the vehicle systems architecture. Expressing hazards, and thereby, reliability requirements as continuous relationships with the magnitude and duration of functional failure requires augmentations to the traditional means for system safety assessment (SSA). The traditional two state and discrete system reliability assessment proves insufficient. Reliability is, therefore, handled in an analog fashion: as a function of magnitude of failure and failure duration. A series of metrics are introduced which characterize system performance in terms of analog hazard probabilities. These include analog and cumulative system and functional risk, hazard correlation, and extensions to the traditional component importance metrics. Continuous FHA, load shedding optimization, and analog SSA constitute the SONOMA process (Systematic Off-Nominal Requirements Analysis). Analog system safety metrics inform both architecture optimization (changes in unit level capability and reliability) and architecture augmentation (changes in architecture structure and composition). This process was applied for two vehicle systems concepts (conventional and 'more-electric') in terms of loss/hazard relationships with varying degrees of fidelity. Application of this process shows that the traditional assumptions regarding the structure of the function loss vs. hazard relationship apply undue design bias to functions and components during exploratory design. This bias is illustrated in terms of inaccurate estimations of the system and function level risk and unit level importance. It was also shown that off-nominal emergent requirements must be defined specific to each architecture concept. Quantitative comparisons of architecture specific off-nominal performance were obtained which provide evidence to the need for accurate definition of load shedding strategies during architecture exploratory design. Formally expressing performance degradation strategies in terms of the minimization of a continuous hazard space enhances the system architects ability to accurately predict sizing critical emergent requirements concurrent to architecture definition. Furthermore, the methods and frameworks generated here provide a structured and flexible means for eliciting these architecture specific requirements during the performance of architecture trades.
Toward an Integration of Deep Learning and Neuroscience
Marblestone, Adam H.; Wayne, Greg; Kording, Konrad P.
2016-01-01
Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain's specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses. PMID:27683554
NASA Technical Reports Server (NTRS)
Weeks, Cindy Lou
1986-01-01
Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barhen, Jacob; Imam, Neena
2007-01-01
Revolutionary computing technologies are defined in terms of technological breakthroughs, which leapfrog over near-term projected advances in conventional hardware and software to produce paradigm shifts in computational science. For underwater threat source localization using information provided by a dynamical sensor network, one of the most promising computational advances builds upon the emergence of digital optical-core devices. In this article, we present initial results of sensor network calculations that focus on the concept of signal wavefront time-difference-of-arrival (TDOA). The corresponding algorithms are implemented on the EnLight processing platform recently introduced by Lenslet Laboratories. This tera-scale digital optical core processor is optimizedmore » for array operations, which it performs in a fixed-point-arithmetic architecture. Our results (i) illustrate the ability to reach the required accuracy in the TDOA computation, and (ii) demonstrate that a considerable speed-up can be achieved when using the EnLight 64a prototype processor as compared to a dual Intel XeonTM processor.« less
Biomanufacturing: a US-China National Science Foundation-sponsored workshop.
Sun, Wei; Yan, Yongnian; Lin, Feng; Spector, Myron
2006-05-01
A recent US-China National Science Foundation-sponsored workshop on biomanufacturing reviewed the state-of-the-art of an array of new technologies for producing scaffolds for tissue engineering, providing precision multi-scale control of material, architecture, and cells. One broad category of such techniques has been termed solid freeform fabrication. The techniques in this category include: stereolithography, selected laser sintering, single- and multiple-nozzle deposition and fused deposition modeling, and three-dimensional printing. The precise and repetitive placement of material and cells in a three-dimensional construct at the micrometer length scale demands computer control. These novel computer-controlled scaffold production techniques, when coupled with computer-based imaging and structural modeling methods for the production of the templates for the scaffolds, define an emerging field of computer-aided tissue engineering. In formulating the questions that remain to be answered and discussing the knowledge required to further advance the field, the Workshop provided a basis for recommendations for future work.
Face Recognition in Humans and Machines
NASA Astrophysics Data System (ADS)
O'Toole, Alice; Tistarelli, Massimo
The study of human face recognition by psychologists and neuroscientists has run parallel to the development of automatic face recognition technologies by computer scientists and engineers. In both cases, there are analogous steps of data acquisition, image processing, and the formation of representations that can support the complex and diverse tasks we accomplish with faces. These processes can be understood and compared in the context of their neural and computational implementations. In this chapter, we present the essential elements of face recognition by humans and machines, taking a perspective that spans psychological, neural, and computational approaches. From the human side, we overview the methods and techniques used in the neurobiology of face recognition, the underlying neural architecture of the system, the role of visual attention, and the nature of the representations that emerges. From the computational side, we discuss face recognition technologies and the strategies they use to overcome challenges to robust operation over viewing parameters. Finally, we conclude the chapter with a look at some recent studies that compare human and machine performances at face recognition.
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas E; Schuman, Catherine D; Young, Steven R
Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determinemore » network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.« less
ERIC Educational Resources Information Center
Amenyo, John-Thones
2012-01-01
Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
Switching from computer to microcomputer architecture education
NASA Astrophysics Data System (ADS)
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-03-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to microcomputer architecture. The authors present their strategies towards a successful crossing of boundaries between engineering disciplines. This communication aims at providing a different aspect on professional courses that are, nowadays, addressed at the expense of traditional courses.
Three-Dimensional Nanobiocomputing Architectures With Neuronal Hypercells
2007-06-01
Neumann architectures, and CMOS fabrication. Novel solutions of massive parallel distributed computing and processing (pipelined due to systolic... and processing platforms utilizing molecular hardware within an enabling organization and architecture. The design technology is based on utilizing a...Microsystems and Nanotechnologies investigated a novel 3D3 (Hardware Software Nanotechnology) technology to design super-high performance computing
Spin-neurons: A possible path to energy-efficient neuromorphic computers
NASA Astrophysics Data System (ADS)
Sharad, Mrigank; Fan, Deliang; Roy, Kaushik
2013-12-01
Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing-devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match with the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and "thresholding" operation of an artificial neuron with high energy-efficiency. Comparison with CMOS-based analog circuit-model of a neuron shows that "spin-neurons" (spin based circuit model of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of future.
Spin-neurons: A possible path to energy-efficient neuromorphic computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharad, Mrigank; Fan, Deliang; Roy, Kaushik
Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing-devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match with the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices.more » Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and “thresholding” operation of an artificial neuron with high energy-efficiency. Comparison with CMOS-based analog circuit-model of a neuron shows that “spin-neurons” (spin based circuit model of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of future.« less
Switching from Computer to Microcomputer Architecture Education
ERIC Educational Resources Information Center
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-01-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to…
A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.
Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang
2016-12-07
The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
Epilepsy analytic system with cloud computing.
Shen, Chia-Ping; Zhou, Weizhi; Lin, Feng-Seng; Sung, Hsiao-Ya; Lam, Yan-Yu; Chen, Wei; Lin, Jeng-Wei; Pan, Ming-Kai; Chiu, Ming-Jang; Lai, Feipei
2013-01-01
Biomedical data analytic system has played an important role in doing the clinical diagnosis for several decades. Today, it is an emerging research area of analyzing these big data to make decision support for physicians. This paper presents a parallelized web-based tool with cloud computing service architecture to analyze the epilepsy. There are many modern analytic functions which are wavelet transform, genetic algorithm (GA), and support vector machine (SVM) cascaded in the system. To demonstrate the effectiveness of the system, it has been verified by two kinds of electroencephalography (EEG) data, which are short term EEG and long term EEG. The results reveal that our approach achieves the total classification accuracy higher than 90%. In addition, the entire training time accelerate about 4.66 times and prediction time is also meet requirements in real time.
A Multi Agent Based Approach for Prehospital Emergency Management.
Safdari, Reza; Shoshtarian Malak, Jaleh; Mohammadzadeh, Niloofar; Danesh Shahraki, Azimeh
2017-07-01
To demonstrate an architecture to automate the prehospital emergency process to categorize the specialized care according to the situation at the right time for reducing the patient mortality and morbidity. Prehospital emergency process were analyzed using existing prehospital management systems, frameworks and the extracted process were modeled using sequence diagram in Rational Rose software. System main agents were identified and modeled via component diagram, considering the main system actors and by logically dividing business functionalities, finally the conceptual architecture for prehospital emergency management was proposed. The proposed architecture was simulated using Anylogic simulation software. Anylogic Agent Model, State Chart and Process Model were used to model the system. Multi agent systems (MAS) had a great success in distributed, complex and dynamic problem solving environments, and utilizing autonomous agents provides intelligent decision making capabilities. The proposed architecture presents prehospital management operations. The main identified agents are: EMS Center, Ambulance, Traffic Station, Healthcare Provider, Patient, Consultation Center, National Medical Record System and quality of service monitoring agent. In a critical condition like prehospital emergency we are coping with sophisticated processes like ambulance navigation health care provider and service assignment, consultation, recalling patients past medical history through a centralized EHR system and monitoring healthcare quality in a real-time manner. The main advantage of our work has been the multi agent system utilization. Our Future work will include proposed architecture implementation and evaluation of its impact on patient quality care improvement.
A Multi Agent Based Approach for Prehospital Emergency Management
Safdari, Reza; Shoshtarian Malak, Jaleh; Mohammadzadeh, Niloofar; Danesh Shahraki, Azimeh
2017-01-01
Objective: To demonstrate an architecture to automate the prehospital emergency process to categorize the specialized care according to the situation at the right time for reducing the patient mortality and morbidity. Methods: Prehospital emergency process were analyzed using existing prehospital management systems, frameworks and the extracted process were modeled using sequence diagram in Rational Rose software. System main agents were identified and modeled via component diagram, considering the main system actors and by logically dividing business functionalities, finally the conceptual architecture for prehospital emergency management was proposed. The proposed architecture was simulated using Anylogic simulation software. Anylogic Agent Model, State Chart and Process Model were used to model the system. Results: Multi agent systems (MAS) had a great success in distributed, complex and dynamic problem solving environments, and utilizing autonomous agents provides intelligent decision making capabilities. The proposed architecture presents prehospital management operations. The main identified agents are: EMS Center, Ambulance, Traffic Station, Healthcare Provider, Patient, Consultation Center, National Medical Record System and quality of service monitoring agent. Conclusion: In a critical condition like prehospital emergency we are coping with sophisticated processes like ambulance navigation health care provider and service assignment, consultation, recalling patients past medical history through a centralized EHR system and monitoring healthcare quality in a real-time manner. The main advantage of our work has been the multi agent system utilization. Our Future work will include proposed architecture implementation and evaluation of its impact on patient quality care improvement. PMID:28795061
Architecture for Cognitive Networking within NASA's Future Space Communications Infrastructure
NASA Technical Reports Server (NTRS)
Clark, Gilbert; Eddy, Wesley M.; Johnson, Sandra K.; Barnes, James; Brooks, David
2016-01-01
Future space mission concepts and designs pose many networking challenges for command, telemetry, and science data applications with diverse end-to-end data delivery needs. For future end-to-end architecture designs, a key challenge is meeting expected application quality of service requirements for multiple simultaneous mission data flows with options to use diverse onboard local data buses, commercial ground networks, and multiple satellite relay constellations in LEO, GEO, MEO, or even deep space relay links. Effectively utilizing a complex network topology requires orchestration and direction that spans the many discrete, individually addressable computer systems, which cause them to act in concert to achieve the overall network goals. The system must be intelligent enough to not only function under nominal conditions, but also adapt to unexpected situations, and reorganize or adapt to perform roles not originally intended for the system or explicitly programmed. This paper describes an architecture enabling the development and deployment of cognitive networking capabilities into the envisioned future NASA space communications infrastructure. We begin by discussing the need for increased automation, including inter-system discovery and collaboration. This discussion frames the requirements for an architecture supporting cognitive networking for future missions and relays, including both existing endpoint-based networking models and emerging information-centric models. From this basis, we discuss progress on a proof-of-concept implementation of this architecture, and results of implementation and initial testing of a cognitive networking on-orbit application on the SCaN Testbed attached to the International Space Station.
Architecture for Cognitive Networking within NASAs Future Space Communications Infrastructure
NASA Technical Reports Server (NTRS)
Clark, Gilbert J., III; Eddy, Wesley M.; Johnson, Sandra K.; Barnes, James; Brooks, David
2016-01-01
Future space mission concepts and designs pose many networking challenges for command, telemetry, and science data applications with diverse end-to-end data delivery needs. For future end-to-end architecture designs, a key challenge is meeting expected application quality of service requirements for multiple simultaneous mission data flows with options to use diverse onboard local data buses, commercial ground networks, and multiple satellite relay constellations in LEO, MEO, GEO, or even deep space relay links. Effectively utilizing a complex network topology requires orchestration and direction that spans the many discrete, individually addressable computer systems, which cause them to act in concert to achieve the overall network goals. The system must be intelligent enough to not only function under nominal conditions, but also adapt to unexpected situations, and reorganize or adapt to perform roles not originally intended for the system or explicitly programmed. This paper describes architecture features of cognitive networking within the future NASA space communications infrastructure, and interacting with the legacy systems and infrastructure in the meantime. The paper begins by discussing the need for increased automation, including inter-system collaboration. This discussion motivates the features of an architecture including cognitive networking for future missions and relays, interoperating with both existing endpoint-based networking models and emerging information-centric models. From this basis, we discuss progress on a proof-of-concept implementation of this architecture as a cognitive networking on-orbit application on the SCaN Testbed attached to the International Space Station.
Architecture of next-generation information management systems for digital radiology enterprises
NASA Astrophysics Data System (ADS)
Wong, Stephen T. C.; Wang, Huili; Shen, Weimin; Schmidt, Joachim; Chen, George; Dolan, Tom
2000-05-01
Few information systems today offer a clear and flexible means to define and manage the automated part of radiology processes. None of them provide a coherent and scalable architecture that can easily cope with heterogeneity and inevitable local adaptation of applications. Most importantly, they often lack a model that can integrate clinical and administrative information to aid better decisions in managing resources, optimizing operations, and improving productivity. Digital radiology enterprises require cost-effective solutions to deliver information to the right person in the right place and at the right time. We propose a new architecture of image information management systems for digital radiology enterprises. Such a system is based on the emerging technologies in workflow management, distributed object computing, and Java and Web techniques, as well as Philips' domain knowledge in radiology operations. Our design adapts the approach of '4+1' architectural view. In this new architecture, PACS and RIS will become one while the user interaction can be automated by customized workflow process. Clinical service applications are implemented as active components. They can be reasonably substituted by applications of local adaptations and can be multiplied for fault tolerance and load balancing. Furthermore, it will provide powerful query and statistical functions for managing resources and improving productivity in real time. This work will lead to a new direction of image information management in the next millennium. We will illustrate the innovative design with implemented examples of a working prototype.
Building Paradigms: Major Transformations in School Architecture (1798-2009)
ERIC Educational Resources Information Center
Gislason, Neil
2009-01-01
This article provides an historical overview of significant trends in school architecture from 1798 to the present. I divide the history of school architecture into two major phases. The first period falls between 1798 and 1921: the modern graded classroom emerged as a standard architectural feature during this period. The second period, which…
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Radenski, Atanas; Follen, Gregory J. (Technical Monitor)
2001-01-01
The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high -performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high -performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
THE COMPUTER AND THE ARCHITECTURAL PROFESSION.
ERIC Educational Resources Information Center
HAVILAND, DAVID S.
THE ROLE OF ADVANCING TECHNOLOGY IN THE FIELD OF ARCHITECTURE IS DISCUSSED IN THIS REPORT. PROBLEMS IN COMMUNICATION AND THE DESIGN PROCESS ARE IDENTIFIED. ADVANTAGES AND DISADVANTAGES OF COMPUTERS ARE MENTIONED IN RELATION TO MAN AND MACHINE INTERACTION. PRESENT AND FUTURE IMPLICATIONS OF COMPUTER USAGE ARE IDENTIFIED AND DISCUSSED WITH RESPECT…
The Contribution of Visualization to Learning Computer Architecture
ERIC Educational Resources Information Center
Yehezkel, Cecile; Ben-Ari, Mordechai; Dreyfus, Tommy
2007-01-01
This paper describes a visualization environment and associated learning activities designed to improve learning of computer architecture. The environment, EasyCPU, displays a model of the components of a computer and the dynamic processes involved in program execution. We present the results of a research program that analysed the contribution of…
Technology advances and market forces: Their impact on high performance architectures
NASA Technical Reports Server (NTRS)
Best, D. R.
1978-01-01
Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vineyard, Craig Michael; Verzi, Stephen Joseph
As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired re- source allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilizemore » memory.« less
Computing architecture for autonomous microgrids
Goldsmith, Steven Y.
2015-09-29
A computing architecture that facilitates autonomously controlling operations of a microgrid is described herein. A microgrid network includes numerous computing devices that execute intelligent agents, each of which is assigned to a particular entity (load, source, storage device, or switch) in the microgrid. The intelligent agents can execute in accordance with predefined protocols to collectively perform computations that facilitate uninterrupted control of the .
Blueprint for a microwave trapped ion quantum computer.
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K
2017-02-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
Awareware: Narrowcasting Attributes for Selective Attention, Privacy, and Multipresence
NASA Astrophysics Data System (ADS)
Cohen, Michael; Newton Fernando, Owen Noel
The domain of cscw, computer-supported collaborative work, and DSC, distributed synchronous collaboration, spans real-time interactive multiuser systems, shared information spaces, and applications for teleexistence and artificial reality, including collaborative virtual environments ( cves) (Benford et al., 2001). As presence awareness systems emerge, it is important to develop appropriate interfaces and architectures for managing multimodal multiuser systems. Especially in consideration of the persistent connectivity enabled by affordable networked communication, shared distributed environments require generalized control of media streams, techniques to control source → sink transmissions in synchronous groupware, including teleconferences and chatspaces, online role-playing games, and virtual concerts.
A UIMA wrapper for the NCBO annotator.
Roeder, Christophe; Jonquet, Clement; Shah, Nigam H; Baumgartner, William A; Verspoor, Karin; Hunter, Lawrence
2010-07-15
The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator-an ontology-based annotation service-to make it available as a component in UIMA workflows. This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows.
PACS for Bhutan: a cost effective open source architecture for emerging countries.
Ratib, Osman; Roduit, Nicolas; Nidup, Dechen; De Geer, Gerard; Rosset, Antoine; Geissbuhler, Antoine
2016-10-01
This paper reports the design and implementation of an innovative and cost-effective imaging management infrastructure suitable for radiology centres in emerging countries. It was implemented in the main referring hospital of Bhutan equipped with a CT, an MRI, digital radiology, and a suite of several ultrasound units. They lacked the necessary informatics infrastructure for image archiving and interpretation and needed a system for distribution of images to clinical wards. The solution developed for this project combines several open source software platforms in a robust and versatile archiving and communication system connected to analysis workstations equipped with a FDA-certified version of the highly popular Open-Source software. The whole system was implemented on standard off-the-shelf hardware. The system was installed in three days, and training of the radiologists as well as the technical and IT staff was provided onsite to ensure full ownership of the system by the local team. Radiologists were rapidly capable of reading and interpreting studies on the diagnostic workstations, which had a significant benefit on their workflow and ability to perform diagnostic tasks more efficiently. Furthermore, images were also made available to several clinical units on standard desktop computers through a web-based viewer. • Open source imaging informatics platforms can provide cost-effective alternatives for PACS • Robust and cost-effective open architecture can provide adequate solutions for emerging countries • Imaging informatics is often lacking in hospitals equipped with digital modalities.
An emergentist perspective on the origin of number sense
2018-01-01
The finding that human infants and many other animal species are sensitive to numerical quantity has been widely interpreted as evidence for evolved, biologically determined numerical capacities across unrelated species, thereby supporting a ‘nativist’ stance on the origin of number sense. Here, we tackle this issue within the ‘emergentist’ perspective provided by artificial neural network models, and we build on computer simulations to discuss two different approaches to think about the innateness of number sense. The first, illustrated by artificial life simulations, shows that numerical abilities can be supported by domain-specific representations emerging from evolutionary pressure. The second assumes that numerical representations need not be genetically pre-determined but can emerge from the interplay between innate architectural constraints and domain-general learning mechanisms, instantiated in deep learning simulations. We show that deep neural networks endowed with basic visuospatial processing exhibit a remarkable performance in numerosity discrimination before any experience-dependent learning, whereas unsupervised sensory experience with visual sets leads to subsequent improvement of number acuity and reduces the influence of continuous visual cues. The emergent neuronal code for numbers in the model includes both numerosity-sensitive (summation coding) and numerosity-selective response profiles, closely mirroring those found in monkey intraparietal neurons. We conclude that a form of innatism based on architectural and learning biases is a fruitful approach to understanding the origin and development of number sense. This article is part of a discussion meeting issue ‘The origins of numerical abilities'. PMID:29292348
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Youngblood, John N.; Saha, Aindam
1987-01-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, C.C.; Youngblood, J.N.; Saha, A.
1987-12-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processingmore » elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.« less
SU (2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
NASA Astrophysics Data System (ADS)
Mahdavinejad, M.; Bitaab, N.
2017-08-01
Search for high-performance architecture and dreams of future architecture resulted in attempts towards meeting energy efficient architecture and planning in different aspects. Recent trends as a mean to meet future legacy in architecture are based on the idea of innovative technologies for resource efficient buildings, performative design, bio-inspired technologies etc. while there are meaningful differences between architecture of developed and developing countries. Significance of issue might be understood when the emerging cities are found interested in Dubaization and other related booming development doctrines. This paper is to analyze the level of developing countries’ success to achieve smart-eco buildings’ goals and objectives. Emerging cities of West of Asia are selected as case studies of the paper. The results of the paper show that the concept of high-performance architecture and smart-eco buildings are different in developing countries in comparison with developed countries. The paper is to mention five essential issues in order to improve future architecture of developing countries: 1- Integrated Strategies for Energy Efficiency, 2- Contextual Solutions, 3- Embedded and Initial Energy Assessment, 4- Staff and Occupancy Wellbeing, 5- Life-Cycle Monitoring.
Energy efficient hybrid computing systems using spin devices
NASA Astrophysics Data System (ADS)
Sharad, Mrigank
Emerging spin-devices like magnetic tunnel junctions (MTJ's), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy-efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. Analog characteristic of spin current facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at ultra-low terminal voltage of ˜20mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin based neurons can be integrated with CMOS and other emerging devices leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve `mixed-mode' processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and currentmode on-chip global interconnects. Simulation results for these applications based on device-circuit co-simulation framework predict more than ˜100x improvement in computation energy as compared to state of the art CMOS design, for optimal spin-device parameters.
Richmond, Paul; Buesing, Lars; Giugliano, Michele; Vasilaki, Eleni
2011-05-04
High performance computing on the Graphics Processing Unit (GPU) is an emerging field driven by the promise of high computational power at a low cost. However, GPU programming is a non-trivial task and moreover architectural limitations raise the question of whether investing effort in this direction may be worthwhile. In this work, we use GPU programming to simulate a two-layer network of Integrate-and-Fire neurons with varying degrees of recurrent connectivity and investigate its ability to learn a simplified navigation task using a policy-gradient learning rule stemming from Reinforcement Learning. The purpose of this paper is twofold. First, we want to support the use of GPUs in the field of Computational Neuroscience. Second, using GPU computing power, we investigate the conditions under which the said architecture and learning rule demonstrate best performance. Our work indicates that networks featuring strong Mexican-Hat-shaped recurrent connections in the top layer, where decision making is governed by the formation of a stable activity bump in the neural population (a "non-democratic" mechanism), achieve mediocre learning results at best. In absence of recurrent connections, where all neurons "vote" independently ("democratic") for a decision via population vector readout, the task is generally learned better and more robustly. Our study would have been extremely difficult on a desktop computer without the use of GPU programming. We present the routines developed for this purpose and show that a speed improvement of 5x up to 42x is provided versus optimised Python code. The higher speed is achieved when we exploit the parallelism of the GPU in the search of learning parameters. This suggests that efficient GPU programming can significantly reduce the time needed for simulating networks of spiking neurons, particularly when multiple parameter configurations are investigated.
Multiprocessor architecture: Synthesis and evaluation
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1990-01-01
Multiprocessor computed architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application specific, architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty is analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.
Emergent Aerospace Designs Using Negotiating Autonomous Agents
NASA Technical Reports Server (NTRS)
Deshmukh, Abhijit; Middelkoop, Timothy; Krothapalli, Anjaneyulu; Smith, Charles
2000-01-01
This paper presents a distributed design methodology where designs emerge as a result of the negotiations between different stake holders in the process, such as cost, performance, reliability, etc. The proposed methodology uses autonomous agents to represent design decision makers. Each agent influences specific design parameters in order to maximize their utility. Since the design parameters depend on the aggregate demand of all the agents in the system, design agents need to negotiate with others in the market economy in order to reach an acceptable utility value. This paper addresses several interesting research issues related to distributed design architectures. First, we present a flexible framework which facilitates decomposition of the design problem. Second, we present overview of a market mechanism for generating acceptable design configurations. Finally, we integrate learning mechanisms in the design process to reduce the computational overhead.
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
An S N Algorithm for Modern Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
2016-08-29
LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
A MoTe2-based light-emitting diode and photodetector for silicon photonic integrated circuits.
Bie, Ya-Qing; Grosso, Gabriele; Heuck, Mikkel; Furchi, Marco M; Cao, Yuan; Zheng, Jiabao; Bunandar, Darius; Navarro-Moratalla, Efren; Zhou, Lin; Efetov, Dmitri K; Taniguchi, Takashi; Watanabe, Kenji; Kong, Jing; Englund, Dirk; Jarillo-Herrero, Pablo
2017-12-01
One of the current challenges in photonics is developing high-speed, power-efficient, chip-integrated optical communications devices to address the interconnects bottleneck in high-speed computing systems. Silicon photonics has emerged as a leading architecture, in part because of the promise that many components, such as waveguides, couplers, interferometers and modulators, could be directly integrated on silicon-based processors. However, light sources and photodetectors present ongoing challenges. Common approaches for light sources include one or few off-chip or wafer-bonded lasers based on III-V materials, but recent system architecture studies show advantages for the use of many directly modulated light sources positioned at the transmitter location. The most advanced photodetectors in the silicon photonic process are based on germanium, but this requires additional germanium growth, which increases the system cost. The emerging two-dimensional transition-metal dichalcogenides (TMDs) offer a path for optical interconnect components that can be integrated with silicon photonics and complementary metal-oxide-semiconductors (CMOS) processing by back-end-of-the-line steps. Here, we demonstrate a silicon waveguide-integrated light source and photodetector based on a p-n junction of bilayer MoTe 2 , a TMD semiconductor with an infrared bandgap. This state-of-the-art fabrication technology provides new opportunities for integrated optoelectronic systems.
A MoTe2-based light-emitting diode and photodetector for silicon photonic integrated circuits
NASA Astrophysics Data System (ADS)
Bie, Ya-Qing; Grosso, Gabriele; Heuck, Mikkel; Furchi, Marco M.; Cao, Yuan; Zheng, Jiabao; Bunandar, Darius; Navarro-Moratalla, Efren; Zhou, Lin; Efetov, Dmitri K.; Taniguchi, Takashi; Watanabe, Kenji; Kong, Jing; Englund, Dirk; Jarillo-Herrero, Pablo
2017-12-01
One of the current challenges in photonics is developing high-speed, power-efficient, chip-integrated optical communications devices to address the interconnects bottleneck in high-speed computing systems. Silicon photonics has emerged as a leading architecture, in part because of the promise that many components, such as waveguides, couplers, interferometers and modulators, could be directly integrated on silicon-based processors. However, light sources and photodetectors present ongoing challenges. Common approaches for light sources include one or few off-chip or wafer-bonded lasers based on III-V materials, but recent system architecture studies show advantages for the use of many directly modulated light sources positioned at the transmitter location. The most advanced photodetectors in the silicon photonic process are based on germanium, but this requires additional germanium growth, which increases the system cost. The emerging two-dimensional transition-metal dichalcogenides (TMDs) offer a path for optical interconnect components that can be integrated with silicon photonics and complementary metal-oxide-semiconductors (CMOS) processing by back-end-of-the-line steps. Here, we demonstrate a silicon waveguide-integrated light source and photodetector based on a p-n junction of bilayer MoTe2, a TMD semiconductor with an infrared bandgap. This state-of-the-art fabrication technology provides new opportunities for integrated optoelectronic systems.
Hierarchial parallel computer architecture defined by computational multidisciplinary mechanics
NASA Technical Reports Server (NTRS)
Padovan, Joe; Gute, Doug; Johnson, Keith
1989-01-01
The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosphical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, the impact on solvers are shown in viewgraph form.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uhr, L.
1987-01-01
This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
Computer Architecture's Changing Role in Rebooting Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik P.
In this paper, Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Computer Architecture's Changing Role in Rebooting Computing
DeBenedictis, Erik P.
2017-04-26
In this paper, Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Design of a fault tolerant airborne digital computer. Volume 1: Architecture
NASA Technical Reports Server (NTRS)
Wensley, J. H.; Levitt, K. N.; Green, M. W.; Goldberg, J.; Neumann, P. G.
1973-01-01
This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable. contractible, and (4) The design is to appropriate to post 1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive.
Using a software-defined computer in teaching the basics of computer architecture and operation
NASA Astrophysics Data System (ADS)
Kosowska, Julia; Mazur, Grzegorz
2017-08-01
The paper describes the concept and implementation of SDC_One software-defined computer designed for experimental and didactic purposes. Equipped with extensive hardware monitoring mechanisms, the device enables the students to monitor the computer's operation on bus transfer cycle or instruction cycle basis, providing the practical illustration of basic aspects of computer's operation. In the paper, we describe the hardware monitoring capabilities of SDC_One and some scenarios of using it in teaching the basics of computer architecture and microprocessor operation.
A Serial Bus Architecture for Parallel Processing Systems
1986-09-01
pins are needed to effect the data transfer. As Integrated Circuits grow in computational power, more communication capacity is needed, pushing...chip. The wider the communication path the more pins are needed to effect the data transfer. As Integrated Circuits grow in computational power, more...13 2. A Suitable Architecture Sought 14 II. OPTIMUM ARCHITECTURE OF LARGE INTEGRATED A. PARTIONING SILICON FOR MAXIMUM 1? 1. Transistor
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance
NASA Technical Reports Server (NTRS)
Rushby, John
1999-01-01
Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.
Advanced cloud fault tolerance system
NASA Astrophysics Data System (ADS)
Sumangali, K.; Benny, Niketa
2017-11-01
Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.
Heavy Lift Vehicle (HLV) Avionics Flight Computing Architecture Study
NASA Technical Reports Server (NTRS)
Hodson, Robert F.; Chen, Yuan; Morgan, Dwayne R.; Butler, A. Marc; Sdhuh, Joseph M.; Petelle, Jennifer K.; Gwaltney, David A.; Coe, Lisa D.; Koelbl, Terry G.; Nguyen, Hai D.
2011-01-01
A NASA multi-Center study team was assembled from LaRC, MSFC, KSC, JSC and WFF to examine potential flight computing architectures for a Heavy Lift Vehicle (HLV) to better understand avionics drivers. The study examined Design Reference Missions (DRMs) and vehicle requirements that could impact the vehicles avionics. The study considered multiple self-checking and voting architectural variants and examined reliability, fault-tolerance, mass, power, and redundancy management impacts. Furthermore, a goal of the study was to develop the skills and tools needed to rapidly assess additional architectures should requirements or assumptions change.
The new landscape of parallel computer architecture
NASA Astrophysics Data System (ADS)
Shalf, John
2007-07-01
The past few years has seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
Single-trial EEG RSVP classification using convolutional neural networks
NASA Astrophysics Data System (ADS)
Shamwell, Jared; Lee, Hyungtae; Kwon, Heesung; Marathe, Amar R.; Lawhern, Vernon; Nothwang, William
2016-05-01
Traditionally, Brain-Computer Interfaces (BCI) have been explored as a means to return function to paralyzed or otherwise debilitated individuals. An emerging use for BCIs is in human-autonomy sensor fusion where physiological data from healthy subjects is combined with machine-generated information to enhance the capabilities of artificial systems. While human-autonomy fusion of physiological data and computer vision have been shown to improve classification during visual search tasks, to date these approaches have relied on separately trained classification models for each modality. We aim to improve human-autonomy classification performance by developing a single framework that builds codependent models of human electroencephalograph (EEG) and image data to generate fused target estimates. As a first step, we developed a novel convolutional neural network (CNN) architecture and applied it to EEG recordings of subjects classifying target and non-target image presentations during a rapid serial visual presentation (RSVP) image triage task. The low signal-to-noise ratio (SNR) of EEG inherently limits the accuracy of single-trial classification and when combined with the high dimensionality of EEG recordings, extremely large training sets are needed to prevent overfitting and achieve accurate classification from raw EEG data. This paper explores a new deep CNN architecture for generalized multi-class, single-trial EEG classification across subjects. We compare classification performance from the generalized CNN architecture trained across all subjects to the individualized XDAWN, HDCA, and CSP neural classifiers which are trained and tested on single subjects. Preliminary results show that our CNN meets and slightly exceeds the performance of the other classifiers despite being trained across subjects.
Non-Linear Pattern Formation in Bone Growth and Architecture
Salmon, Phil
2014-01-01
The three-dimensional morphology of bone arises through adaptation to its required engineering performance. Genetically and adaptively bone travels along a complex spatiotemporal trajectory to acquire optimal architecture. On a cellular, micro-anatomical scale, what mechanisms coordinate the activity of osteoblasts and osteoclasts to produce complex and efficient bone architectures? One mechanism is examined here – chaotic non-linear pattern formation (NPF) – which underlies in a unifying way natural structures as disparate as trabecular bone, swarms of birds flying, island formation, fluid turbulence, and others. At the heart of NPF is the fact that simple rules operating between interacting elements, and Turing-like interaction between global and local signals, lead to complex and structured patterns. The study of “group intelligence” exhibited by swarming birds or shoaling fish has led to an embodiment of NPF called “particle swarm optimization” (PSO). This theoretical model could be applicable to the behavior of osteoblasts, osteoclasts, and osteocytes, seeing them operating “socially” in response simultaneously to both global and local signals (endocrine, cytokine, mechanical), resulting in their clustered activity at formation and resorption sites. This represents problem-solving by social intelligence, and could potentially add further realism to in silico computer simulation of bone modeling. What insights has NPF provided to bone biology? One example concerns the genetic disorder juvenile Pagets disease or idiopathic hyperphosphatasia, where the anomalous parallel trabecular architecture characteristic of this pathology is consistent with an NPF paradigm by analogy with known experimental NPF systems. Here, coupling or “feedback” between osteoblasts and osteoclasts is the critical element. This NPF paradigm implies a profound link between bone regulation and its architecture: in bone the architecture is the regulation. The former is the emergent consequence of the latter. PMID:25653638
Non-linear pattern formation in bone growth and architecture.
Salmon, Phil
2014-01-01
The three-dimensional morphology of bone arises through adaptation to its required engineering performance. Genetically and adaptively bone travels along a complex spatiotemporal trajectory to acquire optimal architecture. On a cellular, micro-anatomical scale, what mechanisms coordinate the activity of osteoblasts and osteoclasts to produce complex and efficient bone architectures? One mechanism is examined here - chaotic non-linear pattern formation (NPF) - which underlies in a unifying way natural structures as disparate as trabecular bone, swarms of birds flying, island formation, fluid turbulence, and others. At the heart of NPF is the fact that simple rules operating between interacting elements, and Turing-like interaction between global and local signals, lead to complex and structured patterns. The study of "group intelligence" exhibited by swarming birds or shoaling fish has led to an embodiment of NPF called "particle swarm optimization" (PSO). This theoretical model could be applicable to the behavior of osteoblasts, osteoclasts, and osteocytes, seeing them operating "socially" in response simultaneously to both global and local signals (endocrine, cytokine, mechanical), resulting in their clustered activity at formation and resorption sites. This represents problem-solving by social intelligence, and could potentially add further realism to in silico computer simulation of bone modeling. What insights has NPF provided to bone biology? One example concerns the genetic disorder juvenile Pagets disease or idiopathic hyperphosphatasia, where the anomalous parallel trabecular architecture characteristic of this pathology is consistent with an NPF paradigm by analogy with known experimental NPF systems. Here, coupling or "feedback" between osteoblasts and osteoclasts is the critical element. This NPF paradigm implies a profound link between bone regulation and its architecture: in bone the architecture is the regulation. The former is the emergent consequence of the latter.
Architecutres, Models, Algorithms, and Software Tools for Configurable Computing
2000-03-06
and J.G. Nash. The gated interconnection network for dynamic programming. Plenum, 1988 . [18] Ju wook Jang, Heonchul Park, and Viktor K. Prasanna. A ...Sep. 1997. [2] C. Ebeling, D. C. Cronquist , P. Franklin and C. Fisher, "RaPiD - A configurable computing architecture for compute-intensive...ABSTRACT (Maximum 200 words) The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for
Blueprint for a microwave trapped ion quantum computer
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.
2017-01-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course
ERIC Educational Resources Information Center
Arbelaitz, Olatz; José I. Martín; Muguerza, Javier
2015-01-01
This paper presents an analysis of introducing active methodologies in the Computer Architecture course taught in the second year of the Computer Engineering Bachelor's degree program at the University of the Basque Country (UPV/EHU), Spain. The paper reports the experience from three academic years, 2011-2012, 2012-2013, and 2013-2014, in which…
ERIC Educational Resources Information Center
Nikolic, B.; Radivojevic, Z.; Djordjevic, J.; Milutinovic, V.
2009-01-01
Courses in Computer Architecture and Organization are regularly included in Computer Engineering curricula. These courses are usually organized in such a way that students obtain not only a purely theoretical experience, but also a practical understanding of the topics lectured. This practical work is usually done in a laboratory using simulators…
A Project-Based Learning Approach to Programmable Logic Design and Computer Architecture
ERIC Educational Resources Information Center
Kellett, C. M.
2012-01-01
This paper describes a course in programmable logic design and computer architecture as it is taught at the University of Newcastle, Australia. The course is designed around a major design project and has two supplemental assessment tasks that are also described. The context of the Computer Engineering degree program within which the course is…
ERIC Educational Resources Information Center
Stanley, Timothy D.; Wong, Lap Kei; Prigmore, Daniel; Benson, Justin; Fishler, Nathan; Fife, Leslie; Colton, Don
2007-01-01
Students learn better when they both hear and do. In computer architecture courses "doing" can be difficult in small schools without hardware laboratories hosted by computer engineering, electrical engineering, or similar departments. Software solutions exist. Our success with George Mills' Multimedia Logic (MML) is the focus of this paper. MML…
Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A
2012-01-01
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
Ayres, Daniel L.; Darling, Aaron; Zwickl, Derrick J.; Beerli, Peter; Holder, Mark T.; Lewis, Paul O.; Huelsenbeck, John P.; Ronquist, Fredrik; Swofford, David L.; Cummings, Michael P.; Rambaut, Andrew; Suchard, Marc A.
2012-01-01
Abstract Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software. PMID:21963610
The Role of Sketch in Architecture Design
NASA Astrophysics Data System (ADS)
Li, Yanjin; Ning, Wen
2017-06-01
With the continuous development of computer technology, we rely more and more on the computer and pay more and more attention to the final design results, so that we ignore the importance of the sketch. However, the sketch is the most basic and effective way of architecture design. Based on the study of the sketch of Tjibao Cultural Center of sketch, the paper explores the role of sketch in architecture design .
SU (2) lattice gauge theory simulations on Fermi GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardoso, Nuno, E-mail: nunocardoso@cftp.ist.utl.p; Bicudo, Pedro, E-mail: bicudo@ist.utl.p
2011-05-10
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes formore » the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200x the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2x slower) than single precision computations.« less
NASA Technical Reports Server (NTRS)
Azarbar, Bahman
1990-01-01
Existing and actively planned mobile satellite systems are competing for a viable share of the spectrum allocated by the International Telecommunications Union (ITU) to the satellite based mobile services in the 1.5/1.6 GHz range. The limited amount of spectrum available worldwide and the sheer number of existing and planned mobile satellite systems dictate the adoption of an architecture which will maximize sharing possibilities. A viable sharing architecture must recognize the operational needs and limitations of the existing systems. Furthermore, recognizing the right of access of the future systems as they will emerge in time, the adopted architecture must allow for additional growth and be amenable to orderly introduction of future systems. An attempt to devise such a sharing architecture is described. A specific example of the application of the basic concept to the existing and planned mobile satellite systems is also discussed.
Exploration of operator method digital optical computers for application to NASA
NASA Technical Reports Server (NTRS)
1990-01-01
Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implements. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternative to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.
A Lightweight Protocol for Secure Video Streaming
Morkevicius, Nerijus; Bagdonas, Kazimieras
2018-01-01
The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing “Fog Node-End Device” layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard. PMID:29757988
A Lightweight Protocol for Secure Video Streaming.
Venčkauskas, Algimantas; Morkevicius, Nerijus; Bagdonas, Kazimieras; Damaševičius, Robertas; Maskeliūnas, Rytis
2018-05-14
The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing "Fog Node-End Device" layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard.
Layered Architectures for Quantum Computers and Quantum Repeaters
NASA Astrophysics Data System (ADS)
Jones, Nathan C.
This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.
Advanced flight computer. Special study
NASA Technical Reports Server (NTRS)
Coo, Dennis
1995-01-01
This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets that AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.
Advanced information processing system for advanced launch system: Avionics architecture synthesis
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.
1991-01-01
The Advanced Information Processing System (AIPS) is a fault-tolerant distributed computer system architecture that was developed to meet the real time computational needs of advanced aerospace vehicles. One such vehicle is the Advanced Launch System (ALS) being developed jointly by NASA and the Department of Defense to launch heavy payloads into low earth orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An avionics architecture that utilizes the AIPS hardware and software building blocks was synthesized for ALS. The AIPS for ALS architecture synthesis process starting with the ALS mission requirements and ending with an analysis of the candidate ALS avionics architecture is described.
GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis
NASA Technical Reports Server (NTRS)
Brent, G. A.
1978-01-01
A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed inflight information for the pilot. A processor architecture called the Team Architecture was proposed. This is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/1 simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs and results at length are presented. Also included are program input formats, outputs and listings.
Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Duong, Vu A.
2012-01-01
A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, the bio-inspired models of visual pathway and olfactory receptor processing are combined as processing components, to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual data and odorant.
Computational structures for robotic computations
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chang, P. R.
1987-01-01
The computational problem of inverse kinematics and inverse dynamics of robot manipulators by taking advantage of parallelism and pipelining architectures is discussed. For the computation of inverse kinematic position solution, a maximum pipelined CORDIC architecture has been designed based on a functional decomposition of the closed-form joint equations. For the inverse dynamics computation, an efficient p-fold parallel algorithm to overcome the recurrence problem of the Newton-Euler equations of motion to achieve the time lower bound of O(log sub 2 n) has also been developed.
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Mind and Language Architecture
Logan, Robert K
2010-01-01
A distinction is made between the brain and the mind. The architecture of the mind and language is then described within a neo-dualistic framework. A model for the origin of language based on emergence theory is presented. The complexity of hominid existence due to tool making, the control of fire and the social cooperation that fire required gave rise to a new level of order in mental activity and triggered the simultaneous emergence of language and conceptual thought. The mind is shown to have emerged as a bifurcation of the brain with the emergence of language. The role of language in the evolution of human culture is also described. PMID:20922045
A Risk Management Architecture for Emergency Integrated Aircraft Control
NASA Technical Reports Server (NTRS)
McGlynn, Gregory E.; Litt, Jonathan S.; Lemon, Kimberly A.; Csank, Jeffrey T.
2011-01-01
Enhanced engine operation--operation that is beyond normal limits--has the potential to improve the adaptability and safety of aircraft in emergency situations. Intelligent use of enhanced engine operation to improve the handling qualities of the aircraft requires sophisticated risk estimation techniques and a risk management system that spans the flight and propulsion controllers. In this paper, an architecture that weighs the risks of the emergency and of possible engine performance enhancements to reduce overall risk to the aircraft is described. Two examples of emergency situations are presented to demonstrate the interaction between the flight and propulsion controllers to facilitate the enhanced operation.
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.
2003-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT).
Solving the Cauchy-Riemann equations on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations; a set of coupled first order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, and SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
Application of Tessellation in Architectural Geometry Design
NASA Astrophysics Data System (ADS)
Chang, Wei
2018-06-01
Tessellation plays a significant role in architectural geometry design, which is widely used both through history of architecture and in modern architectural design with the help of computer technology. Tessellation has been found since the birth of civilization. In terms of dimensions, there are two- dimensional tessellations and three-dimensional tessellations; in terms of symmetry, there are periodic tessellations and aperiodic tessellations. Besides, some special types of tessellations such as Voronoi Tessellation and Delaunay Triangles are also included. Both Geometry and Crystallography, the latter of which is the basic theory of three-dimensional tessellations, need to be studied. In history, tessellation was applied into skins or decorations in architecture. The development of Computer technology enables tessellation to be more powerful, as seen in surface control, surface display and structure design, etc. Therefore, research on the application of tessellation in architectural geometry design is of great necessity in architecture studies.
Modular open RF architecture: extending VICTORY to RF systems
NASA Astrophysics Data System (ADS)
Melber, Adam; Dirner, Jason; Johnson, Michael
2015-05-01
Radio frequency products spanning multiple functions have become increasingly critical to the warfighter. Military use of the electromagnetic spectrum now includes communications, electronic warfare (EW), intelligence, and mission command systems. Due to the urgent needs of counterinsurgency operations, various quick reaction capabilities (QRCs) have been fielded to enhance warfighter capability. Although these QRCs were highly successfully in their respective missions, they were designed independently resulting in significant challenges when integrated on a common platform. This paper discusses how the Modular Open RF Architecture (MORA) addresses these challenges by defining an open architecture for multifunction missions that decomposes monolithic radio systems into high-level components with welldefined functions and interfaces. The functional decomposition maximizes hardware sharing while minimizing added complexity and cost due to modularization. MORA achieves significant size, weight and power (SWaP) savings by allowing hardware such as power amplifiers and antennas to be shared across systems. By separating signal conditioning from the processing that implements the actual radio application, MORA exposes previously inaccessible architecture points, providing system integrators with the flexibility to insert third-party capabilities to address technical challenges and emerging requirements. MORA leverages the Vehicular Integration for Command, Control, Communication, Computers, Intelligence, Surveillance, and Reconnaissance (C4ISR)/EW Interoperability (VICTORY) framework. This paper concludes by discussing how MORA, VICTORY and other standards such as OpenVPX are being leveraged by the U.S. Army Research, Development, and Engineering Command (RDECOM) Communications Electronics Research, Development, and Engineering Center (CERDEC) to define a converged architecture enabling rapid technology insertion, interoperability and reduced SWaP.
Wearable Internet of Things - from human activity tracking to clinical integration.
Kumari, Poonam; Lopez-Benitez, Miguel; Gyu Myoung Lee; Tae-Seong Kim; Minhas, Atul S
2017-07-01
Wearable devices for human activity tracking have been emerging rapidly. Most of them are capable of sending health statistics to smartphones, smartwatches or smart bands. However, they only provide the data for individual analysis and their data is not integrated into clinical practice. Leveraging on the Internet of Things (IoT), edge and cloud computing technologies, we propose an architecture which is capable of providing cloud based clinical services using human activity data. Such services could supplement the shortage of staff in primary healthcare centers thereby reducing the burden on healthcare service providers. The enormous amount of data created from such services could also be utilized for planning future therapies by studying recovery cycles of existing patients. We provide a prototype based on our architecture and discuss its salient features. We also provide use cases of our system in personalized and home based healthcare services. We propose an International Telecommunication Union based standardization (ITU-T) for our design and discuss future directions in wearable IoT.
Component-based integration of chemistry and optimization software.
Kenny, Joseph P; Benson, Steven J; Alexeev, Yuri; Sarich, Jason; Janssen, Curtis L; McInnes, Lois Curfman; Krishnan, Manojkumar; Nieplocha, Jarek; Jurrus, Elizabeth; Fahlstrom, Carl; Windus, Theresa L
2004-11-15
Typical scientific software designs make rigid assumptions regarding programming language and data structures, frustrating software interoperability and scientific collaboration. Component-based software engineering is an emerging approach to managing the increasing complexity of scientific software. Component technology facilitates code interoperability and reuse. Through the adoption of methodology and tools developed by the Common Component Architecture Forum, we have developed a component architecture for molecular structure optimization. Using the NWChem and Massively Parallel Quantum Chemistry packages, we have produced chemistry components that provide capacity for energy and energy derivative evaluation. We have constructed geometry optimization applications by integrating the Toolkit for Advanced Optimization, Portable Extensible Toolkit for Scientific Computation, and Global Arrays packages, which provide optimization and linear algebra capabilities. We present a brief overview of the component development process and a description of abstract interfaces for chemical optimizations. The components conforming to these abstract interfaces allow the construction of applications using different chemistry and mathematics packages interchangeably. Initial numerical results for the component software demonstrate good performance, and highlight potential research enabled by this platform.
A Conceptual Framework for Pharmacodynamic Genome-wide Association Studies in Pharmacogenomics
Wu, Rongling; Tong, Chunfa; Wang, Zhong; Mauger, David; Tantisira, Kelan; Szefler, Stanley J.; Chinchilli, Vernon M.; Israel, Elliot
2013-01-01
Summary Genome-wide association studies (GWAS) have emerged as a powerful tool to identify loci that affect drug response or susceptibility to adverse drug reactions. However, current GWAS based on a simple analysis of associations between genotype and phenotype ignores the biochemical reactions of drug response, thus limiting the scope of inference about its genetic architecture. To facilitate the inference of GWAS in pharmacogenomics, we sought to undertake the mathematical integration of the pharmacodynamic process of drug reactions through computational models. By estimating and testing the genetic control of pharmacodynamic and pharmacokinetic parameters, this mechanistic approach does not only enhance the biological and clinical relevance of significant genetic associations, but also improve the statistical power and robustness of gene detection. This report discusses the general principle and development of pharmacodynamics-based GWAS, highlights the practical use of this approach in addressing various pharmacogenomic problems, and suggests that this approach will be an important method to study the genetic architecture of drug responses or reactions. PMID:21920452
Image-based tracking: a new emerging standard
NASA Astrophysics Data System (ADS)
Antonisse, Jim; Randall, Scott
2012-06-01
Automated moving object detection and tracking are increasingly viewed as solutions to the enormous data volumes resulting from emerging wide-area persistent surveillance systems. In a previous paper we described a Motion Imagery Standards Board (MISB) initiative to help address this problem: the specification of a micro-architecture for the automatic extraction of motion indicators and tracks. This paper reports on the development of an extended specification of the plug-and-play tracking micro-architecture, on its status as an emerging standard across DoD, the Intelligence Community, and NATO.
A single VLSI chip for computing syndromes in the (225, 223) Reed-Solomon decoder
NASA Technical Reports Server (NTRS)
Hsu, I. S.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.
1986-01-01
A description of a single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder is presented. The architecture that leads to this single VLSI chip design makes use of the dual basis multiplication algorithm. The same architecture can be applied to design VLSI chips to compute various kinds of number theoretic transforms.
Parameterized Micro-benchmarking: An Auto-tuning Approach for Complex Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ma, Wenjing; Krishnamoorthy, Sriram; Agrawal, Gagan
2012-05-15
Auto-tuning has emerged as an important practical method for creating highly optimized implementations of key computational kernels and applications. However, the growing complexity of architectures and applications is creating new challenges for auto-tuning. Complex applications can involve a prohibitively large search space that precludes empirical auto-tuning. Similarly, architectures are becoming increasingly complicated, making it hard to model performance. In this paper, we focus on the challenge to auto-tuning presented by applications with a large number of kernels and kernel instantiations. While these kernels may share a somewhat similar pattern, they differ considerably in problem sizes and the exact computation performed.more » We propose and evaluate a new approach to auto-tuning which we refer to as parameterized micro-benchmarking. It is an alternative to the two existing classes of approaches to auto-tuning: analytical model-based and empirical search-based. Particularly, we argue that the former may not be able to capture all the architectural features that impact performance, whereas the latter might be too expensive for an application that has several different kernels. In our approach, different expressions in the application, different possible implementations of each expression, and the key architectural features, are used to derive a simple micro-benchmark and a small parameter space. This allows us to learn the most significant features of the architecture that can impact the choice of implementation for each kernel. We have evaluated our approach in the context of GPU implementations of tensor contraction expressions encountered in excited state calculations in quantum chemistry. We have focused on two aspects of GPUs that affect tensor contraction execution: memory access patterns and kernel consolidation. Using our parameterized micro-benchmarking approach, we obtain a speedup of up to 2 over the version that used default optimizations, but no auto-tuning. We demonstrate that observations made from microbenchmarks match the behavior seen from real expressions. In the process, we make important observations about the memory hierarchy of two of the most recent NVIDIA GPUs, which can be used in other optimization frameworks as well.« less
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digitial computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACLS constructs. The execution times for all ACLS constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
Performance Evaluation of Communication Software Systems for Distributed Computing
NASA Technical Reports Server (NTRS)
Fatoohi, Rod
1996-01-01
In recent years there has been an increasing interest in object-oriented distributed computing since it is better quipped to deal with complex systems while providing extensibility, maintainability, and reusability. At the same time, several new high-speed network technologies have emerged for local and wide area networks. However, the performance of networking software is not improving as fast as the networking hardware and the workstation microprocessors. This paper gives an overview and evaluates the performance of the Common Object Request Broker Architecture (CORBA) standard in a distributed computing environment at NASA Ames Research Center. The environment consists of two testbeds of SGI workstations connected by four networks: Ethernet, FDDI, HiPPI, and ATM. The performance results for three communication software systems are presented, analyzed and compared. These systems are: BSD socket programming interface, IONA's Orbix, an implementation of the CORBA specification, and the PVM message passing library. The results show that high-level communication interfaces, such as CORBA and PVM, can achieve reasonable performance under certain conditions.
Evidence of common and separate eye and hand accumulators underlying flexible eye-hand coordination
Jana, Sumitash; Gopal, Atul
2016-01-01
Eye and hand movements are initiated by anatomically separate regions in the brain, and yet these movements can be flexibly coupled and decoupled, depending on the need. The computational architecture that enables this flexible coupling of independent effectors is not understood. Here, we studied the computational architecture that enables flexible eye-hand coordination using a drift diffusion framework, which predicts that the variability of the reaction time (RT) distribution scales with its mean. We show that a common stochastic accumulator to threshold, followed by a noisy effector-dependent delay, explains eye-hand RT distributions and their correlation in a visual search task that required decision-making, while an interactive eye and hand accumulator model did not. In contrast, in an eye-hand dual task, an interactive model better predicted the observed correlations and RT distributions than a common accumulator model. Notably, these two models could only be distinguished on the basis of the variability and not the means of the predicted RT distributions. Additionally, signatures of separate initiation signals were also observed in a small fraction of trials in the visual search task, implying that these distinct computational architectures were not a manifestation of the task design per se. Taken together, our results suggest two unique computational architectures for eye-hand coordination, with task context biasing the brain toward instantiating one of the two architectures. NEW & NOTEWORTHY Previous studies on eye-hand coordination have considered mainly the means of eye and hand reaction time (RT) distributions. Here, we leverage the approximately linear relationship between the mean and standard deviation of RT distributions, as predicted by the drift-diffusion model, to propose the existence of two distinct computational architectures underlying coordinated eye-hand movements. These architectures, for the first time, provide a computational basis for the flexible coupling between eye and hand movements. PMID:27784809
Architectural Analysis of Dynamically Reconfigurable Systems
NASA Technical Reports Server (NTRS)
Lindvall, Mikael; Godfrey, Sally; Ackermann, Chris; Ray, Arnab; Yonkwa, Lyly
2010-01-01
oTpics include: the problem (increased flexibility of architectural styles decrease analyzability, behavior emerges and varies depending on the configuration, does the resulting system run according to the intended design, and architectural decisions can impede or facilitate testing); top down approach to architecture analysis, detection of defects and deviations, and architecture and its testability; currently targeted projects GMSEC and CFS; analyzing software architectures; analyzing runtime events; actual architecture recognition; GMPUB in Dynamic SAVE; sample output from new approach; taking message timing delays into account; CFS examples of architecture and testability; some recommendations for improved testablity; and CFS examples of abstract interfaces and testability; CFS example of opening some internal details.
Alor-Hernández, Giner; Sánchez-Cervantes, José Luis; Juárez-Martínez, Ulises; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Aguilar-Laserre, Alberto
2012-03-01
Emergency healthcare is one of the emerging application domains for information services, which requires highly multimodal information services. The time of consuming pre-hospital emergency process is critical. Therefore, the minimization of required time for providing primary care and consultation to patients is one of the crucial factors when trying to improve the healthcare delivery in emergency situations. In this sense, dynamic location of medical entities is a complex process that needs time and it can be critical when a person requires medical attention. This work presents a multimodal location-based system for locating and assigning medical entities called ITOHealth. ITOHealth provides a multimodal middleware-oriented integrated architecture using a service-oriented architecture in order to provide information of medical entities in mobile devices and web browsers with enriched interfaces providing multimodality support. ITOHealth's multimodality is based on the use of Microsoft Agent Characters, the integration of natural language voice to the characters, and multi-language and multi-characters support providing an advantage for users with visual impairments.
A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.
Sağiroğlu, Mahmut Şamİl; Külekcİ, M Oğuzhan
2017-11-01
The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
Jupiter Europa Orbiter Architecture Definition Process
NASA Technical Reports Server (NTRS)
Rasmussen, Robert; Shishko, Robert
2011-01-01
The proposed Jupiter Europa Orbiter mission, planned for launch in 2020, is using a new architectural process and framework tool to drive its model-based systems engineering effort. The process focuses on getting the architecture right before writing requirements and developing a point design. A new architecture framework tool provides for the structured entry and retrieval of architecture artifacts based on an emerging architecture meta-model. This paper describes the relationships among these artifacts and how they are used in the systems engineering effort. Some early lessons learned are discussed.
FRR: fair remote retrieval of outsourced private medical records in electronic health networks.
Wang, Huaqun; Wu, Qianhong; Qin, Bo; Domingo-Ferrer, Josep
2014-08-01
Cloud computing is emerging as the next-generation IT architecture. However, cloud computing also raises security and privacy concerns since the users have no physical control over the outsourced data. This paper focuses on fairly retrieving encrypted private medical records outsourced to remote untrusted cloud servers in the case of medical accidents and disputes. Our goal is to enable an independent committee to fairly recover the original private medical records so that medical investigation can be carried out in a convincing way. We achieve this goal with a fair remote retrieval (FRR) model in which either t investigation committee members cooperatively retrieve the original medical data or none of them can get any information on the medical records. We realize the first FRR scheme by exploiting fair multi-member key exchange and homomorphic privately verifiable tags. Based on the standard computational Diffie-Hellman (CDH) assumption, our scheme is provably secure in the random oracle model (ROM). A detailed performance analysis and experimental results show that our scheme is efficient in terms of communication and computation. Copyright © 2014 Elsevier Inc. All rights reserved.
Computer graphics in architecture and engineering
NASA Technical Reports Server (NTRS)
Greenberg, D. P.
1975-01-01
The present status of the application of computer graphics to the building profession or architecture and its relationship to other scientific and technical areas were discussed. It was explained that, due to the fragmented nature of architecture and building activities (in contrast to the aerospace industry), a comprehensive, economic utilization of computer graphics in this area is not practical and its true potential cannot now be realized due to the present inability of architects and structural, mechanical, and site engineers to rely on a common data base. Future emphasis will therefore have to be placed on a vertical integration of the construction process and effective use of a three-dimensional data base, rather than on waiting for any technological breakthrough in interactive computing.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Larson, Robert E.
1989-01-01
The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
Fault tolerant architectures for integrated aircraft electronics systems
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1983-01-01
Work into possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.
A biopolymer transistor: electrical amplification by microtubules.
Priel, Avner; Ramos, Arnolt J; Tuszynski, Jack A; Cantiello, Horacio F
2006-06-15
Microtubules (MTs) are important cytoskeletal structures engaged in a number of specific cellular activities, including vesicular traffic, cell cyto-architecture and motility, cell division, and information processing within neuronal processes. MTs have also been implicated in higher neuronal functions, including memory and the emergence of "consciousness". How MTs handle and process electrical information, however, is heretofore unknown. Here we show new electrodynamic properties of MTs. Isolated, taxol-stabilized MTs behave as biomolecular transistors capable of amplifying electrical information. Electrical amplification by MTs can lead to the enhancement of dynamic information, and processivity in neurons can be conceptualized as an "ionic-based" transistor, which may affect, among other known functions, neuronal computational capabilities.
A UIMA wrapper for the NCBO annotator
Roeder, Christophe; Jonquet, Clement; Shah, Nigam H.; Baumgartner, William A.; Verspoor, Karin; Hunter, Lawrence
2010-01-01
Summary: The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator—an ontology-based annotation service—to make it available as a component in UIMA workflows. Availability: This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows. Contact: chris.roeder@ucdenver.edu PMID:20505005
Federal Emergency Management Information System (FEMIS) system administration guide, version 1.4.5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arp, J.A.; Burnett, R.A.; Carter, R.J.
The Federal Emergency Management Information Systems (FEMIS) is an emergency management planning and response tool that was developed by the Pacific Northwest National Laboratory (PNNL) under the direction of the US Army Chemical Biological Defense Command. The FEMIS System Administration Guide provides information necessary for the system administrator to maintain the FEMIS system. The FEMIS system is designed for a single Chemical Stockpile Emergency Preparedness Program (CSEPP) site that has multiple Emergency Operations Centers (EOCs). Each EOC has personal computers (PCs) that emergency planners and operations personnel use to do their jobs. These PCs are connected via a local areamore » network (LAN) to servers that provide EOC-wide services. Each EOC is interconnected to other EOCs via a Wide Area Network (WAN). Thus, FEMIS is an integrated software product that resides on client/server computer architecture. The main body of FEMIS software, referred to as the FEMIS Application Software, resides on the PC client(s) and is directly accessible to emergency management personnel. The remainder of the FEMIS software, referred to as the FEMIS Support Software, resides on the UNIX server. The Support Software provides the communication, data distribution, and notification functionality necessary to operate FEMIS in a networked, client/server environment. The UNIX server provides an Oracle relational database management system (RDBMS) services, ARC/INFO GIS (optional) capabilities, and basic file management services. PNNL developed utilities that reside on the server include the Notification Service, the Command Service that executes the evacuation model, and AutoRecovery. To operate FEMIS, the Application Software must have access to a site specific FEMIS emergency management database. Data that pertains to an individual EOC`s jurisdiction is stored on the EOC`s local server. Information that needs to be accessible to all EOCs is automatically distributed by the FEMIS database to the other EOCs at the site.« less
HyperForest: A high performance multi-processor architecture for real-time intelligent systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.
1997-04-01
Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expendability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doorsmore » for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE`s own production and disassembly activities.« less
Early experiences in developing and managing the neuroscience gateway.
Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas T
2015-02-01
The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway.
Early experiences in developing and managing the neuroscience gateway
Sivagnanam, Subhashini; Majumdar, Amit; Yoshimoto, Kenneth; Astakhov, Vadim; Bandrowski, Anita; Martone, MaryAnn; Carnevale, Nicholas. T.
2015-01-01
SUMMARY The last few decades have seen the emergence of computational neuroscience as a mature field where researchers are interested in modeling complex and large neuronal systems and require access to high performance computing machines and associated cyber infrastructure to manage computational workflow and data. The neuronal simulation tools, used in this research field, are also implemented for parallel computers and suitable for high performance computing machines. But using these tools on complex high performance computing machines remains a challenge because of issues with acquiring computer time on these machines located at national supercomputer centers, dealing with complex user interface of these machines, dealing with data management and retrieval. The Neuroscience Gateway is being developed to alleviate and/or hide these barriers to entry for computational neuroscientists. It hides or eliminates, from the point of view of the users, all the administrative and technical barriers and makes parallel neuronal simulation tools easily available and accessible on complex high performance computing machines. It handles the running of jobs and data management and retrieval. This paper shares the early experiences in bringing up this gateway and describes the software architecture it is based on, how it is implemented, and how users can use this for computational neuroscience research using high performance computing at the back end. We also look at parallel scaling of some publicly available neuronal models and analyze the recent usage data of the neuroscience gateway. PMID:26523124
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated to a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC computer which is used for high level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event- address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
Topical perspective on massive threading and parallelism.
Farber, Robert M
2011-09-01
Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three-orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive-threading and massive-parallelism such as the 1.3 million threads Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world is how to capitalize on these new massively threaded computational architectures--especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelize with some insight into the future. Published by Elsevier Inc.
Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
Modelling parallel programs and multiprocessor architectures with AXE
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Fineman, Charles E.
1991-01-01
AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate for parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior is described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
A high performance parallel computing architecture for robust image features
NASA Astrophysics Data System (ADS)
Zhou, Renyan; Liu, Leibo; Wei, Shaojun
2014-03-01
A design of parallel architecture for image feature detection and description is proposed in this article. The major component of this architecture is a 2D cellular network composed of simple reprogrammable processors, enabling the Hessian Blob Detector and Haar Response Calculation, which are the most computing-intensive stage of the Speeded Up Robust Features (SURF) algorithm. Combining this 2D cellular network and dedicated hardware for SURF descriptors, this architecture achieves real-time image feature detection with minimal software in the host processor. A prototype FPGA implementation of the proposed architecture achieves 1318.9 GOPS general pixel processing @ 100 MHz clock and achieves up to 118 fps in VGA (640 × 480) image feature detection. The proposed architecture is stand-alone and scalable so it is easy to be migrated into VLSI implementation.
The RISC (Reduced Instruction Set Computer) Architecture and Computer Performance Evaluation.
1986-03-01
time where the main emphasis of the evaluation process is put on the software . The model is intended to provide a tool for computer architects to use...program, or 3) Was to be implemented in random logic more effec- tively than the equivalent sequence of software instructions. Both data and address...definition is the IEEE standard 729-1983 stating Computer Architecture as: " The process of defining a collection of hardware and software components and
First 3 years of operation of RIACS (Research Institute for Advanced Computer Science) (1983-1985)
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
The focus of the Research Institute for Advanced Computer Science (RIACS) is to explore matches between advanced computing architectures and the processes of scientific research. An architecture evaluation of the MIT static dataflow machine, specification of a graphical language for expressing distributed computations, and specification of an expert system for aiding in grid generation for two-dimensional flow problems was initiated. Research projects for 1984 and 1985 are summarized.
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2018-01-19
A main goal in DNA computing is to build DNA circuits to compute designated functions using a minimal number of DNA strands. Here, we propose a novel architecture to build compact DNA strand displacement circuits to compute a broad scope of functions in an analog fashion. A circuit by this architecture is composed of three autocatalytic amplifiers, and the amplifiers interact to perform computation. We show DNA circuits to compute functions sqrt(x), ln(x) and exp(x) for x in tunable ranges with simulation results. A key innovation in our architecture, inspired by Napier's use of logarithm transforms to compute square roots on a slide rule, is to make use of autocatalytic amplifiers to do logarithmic and exponential transforms in concentration and time. In particular, we convert from the input that is encoded by the initial concentration of the input DNA strand, to time, and then back again to the output encoded by the concentration of the output DNA strand at equilibrium. This combined use of strand-concentration and time encoding of computational values may have impact on other forms of molecular computation.
Performances of multiprocessor multidisk architectures for continuous media storage
NASA Astrophysics Data System (ADS)
Gennart, Benoit A.; Messerli, Vincent; Hersch, Roger D.
1996-03-01
Multimedia interfaces increase the need for large image databases, capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture which relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes through bottleneck performance evaluation and simulation the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: the compute-bound frame resizing by resampling and the data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400Mbytes/s) and that an architecture with addressable local memories located closely to their respective processors could partially remove this bottleneck. The point- to-point architecture is scalable and able to sustain high throughputs for simultaneous compute- bound and data-bound operations.
Architectural Implications of Cloud Computing
2011-10-24
Public Cloud Infrastructure-as-a- Service (IaaS) Software -as-a- Service ( SaaS ) Cloud Computing Types Platform-as-a- Service (PaaS) Based on Type of...Twitter #SEIVirtualForum © 2011 Carnegie Mellon University Software -as-a- Service ( SaaS ) Model of software deployment in which a third-party...and System Solutions (RTSS) Program. Her current interests and projects are in service -oriented architecture (SOA), cloud computing, and context
Generic Software for Emulating Multiprocessor Architectures.
1985-05-01
RD-A157 662 GENERIC SOFTWARE FOR EMULATING MULTIPROCESSOR 1/2 AlRCHITECTURES(J) MASSACHUSETTS INST OF TECH CAMBRIDGE U LRS LAB FOR COMPUTER SCIENCE R...AREA & WORK UNIT NUMBERS MIT Laboratory for Computer Science 545 Technology Square Cambridge, MA 02139 ____________ I I. CONTROLLING OFFICE NAME AND...aide If neceeasy end Identify by block number) Computer architecture, emulation, simulation, dataf low 20. ABSTRACT (Continue an reverse slde It
Sigint Application for Polymorphous Computing Architecture (PCA): Wideband DF
2006-08-01
Polymorphous Computing Architecture (PCA) program as stated by Robert Graybill is to Develop the computing foundation for agile systems by establishing...ubiquitous MUSIC algorithm rely upon an underlying narrowband signal model [8]. In this case, narrowband means that the signal bandwidth is less than...a wideband DF algorithm is needed to compensate for this model inadequacy. Among the various wideband DF techniques available, the coherent signal
Exploring Gigabyte Datasets in Real Time: Architectures, Interfaces and Time-Critical Design
NASA Technical Reports Server (NTRS)
Bryson, Steve; Gerald-Yamasaki, Michael (Technical Monitor)
1998-01-01
Architectures and Interfaces: The implications of real-time interaction on software architecture design: decoupling of interaction/graphics and computation into asynchronous processes. The performance requirements of graphics and computation for interaction. Time management in such an architecture. Examples of how visualization algorithms must be modified for high performance. Brief survey of interaction techniques and design, including direct manipulation and manipulation via widgets. talk discusses how human factors considerations drove the design and implementation of the virtual wind tunnel. Time-Critical Design: A survey of time-critical techniques for both computation and rendering. Emphasis on the assignment of a time budget to both the overall visualization environment and to each individual visualization technique in the environment. The estimation of the benefit and cost of an individual technique. Examples of the modification of visualization algorithms to allow time-critical control.
Hardware architecture design of image restoration based on time-frequency domain computation
NASA Astrophysics Data System (ADS)
Wen, Bo; Zhang, Jing; Jiao, Zipeng
2013-10-01
The image restoration algorithms based on time-frequency domain computation is high maturity and applied widely in engineering. To solve the high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. Firstly, the main module is designed, by analyzing the common processing and numerical calculation. Then, to improve the commonality, the iteration control module is planed for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming module, which include two-dimensional FFT/IFFT and the plural calculation. Eventually, the TFDC hardware architecture is adopted for hardware design of real-time image restoration system. The result proves that, the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability and high efficiency.
Takeda, Shuntaro; Furusawa, Akira
2017-09-22
We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.
NASA Astrophysics Data System (ADS)
Takeda, Shuntaro; Furusawa, Akira
2017-09-01
We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.
a Real-Time GIS Platform for High Sour Gas Leakage Simulation, Evaluation and Visualization
NASA Astrophysics Data System (ADS)
Li, M.; Liu, H.; Yang, C.
2015-07-01
The development of high-sulfur gas fields, also known as sour gas field, is faced with a series of safety control and emergency management problems. The GIS-based emergency response system is placed high expectations under the consideration of high pressure, high content, complex terrain and highly density population in Sichuan Basin, southwest China. The most researches on high hydrogen sulphide gas dispersion simulation and evaluation are used for environmental impact assessment (EIA) or emergency preparedness planning. This paper introduces a real-time GIS platform for high-sulfur gas emergency response. Combining with real-time data from the leak detection systems and the meteorological monitoring stations, GIS platform provides the functions of simulating, evaluating and displaying of the different spatial-temporal toxic gas distribution patterns and evaluation results. This paper firstly proposes the architecture of Emergency Response/Management System, secondly explains EPA's Gaussian dispersion model CALPUFF simulation workflow under high complex terrain and real-time data, thirdly explains the emergency workflow and spatial analysis functions of computing the accident influencing areas, population and the optimal evacuation routes. Finally, a well blow scenarios is used for verify the system. The study shows that GIS platform which integrates the real-time data and CALPUFF models will be one of the essential operational platforms for high-sulfur gas fields emergency management.
Learning in the "Real" World: Encounters with Radical Architectures (1960s-1970s)
ERIC Educational Resources Information Center
Doucet, Isabelle
2017-01-01
Throughout the 1960s and 1970s architectural education saw to the emergence of radical attempts to reconnect pedagogy with "the real world" and to forge greater social responsibility in architecture. From this epoch of important political, social, and environmental action, this article discusses three "encounters" between…
Design and construction principles in nature and architecture.
Knippers, Jan; Speck, Thomas
2012-03-01
This paper will focus on how the emerging scientific discipline of biomimetics can bring new insights into the field of architecture. An analysis of both architectural and biological methodologies will show important aspects connecting these two. The foundation of this paper is a case study of convertible structures based on elastic plant movements.
A Comprehensive Approach to Emergency Planning
ERIC Educational Resources Information Center
Worsley, Tracy L.; Beckering, Don
2007-01-01
It is essential that the traditional emergency management structure be used as a framework for higher education emergency planning. The four phases of emergency management should be reflected in the architecture of all planning efforts. These include "preparedness," "response," "mitigation," and "recovery."…
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.
Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros
2014-06-25
The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand the new MIC architecture, albeit lacking in performance reduces the programming effort and makes it up with a more general architecture suitable for a wider range of problems.
Heterogeneous computing architecture for fast detection of SNP-SNP interactions
2014-01-01
Background The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand the new MIC architecture, albeit lacking in performance reduces the programming effort and makes it up with a more general architecture suitable for a wider range of problems. PMID:24964802
Kadkhodapour, J; Montazerian, H; Darabi, A Ch; Anaraki, A P; Ahmadi, S M; Zadpoor, A A; Schmauder, S
2015-10-01
Since the advent of additive manufacturing techniques, regular porous biomaterials have emerged as promising candidates for tissue engineering scaffolds owing to their controllable pore architecture and feasibility in producing scaffolds from a variety of biomaterials. The architecture of scaffolds could be designed to achieve similar mechanical properties as in the host bone tissue, thereby avoiding issues such as stress shielding in bone replacement procedure. In this paper, the deformation and failure mechanisms of porous titanium (Ti6Al4V) biomaterials manufactured by selective laser melting from two different types of repeating unit cells, namely cubic and diamond lattice structures, with four different porosities are studied. The mechanical behavior of the above-mentioned porous biomaterials was studied using finite element models. The computational results were compared with the experimental findings from a previous study of ours. The Johnson-Cook plasticity and damage model was implemented in the finite element models to simulate the failure of the additively manufactured scaffolds under compression. The computationally predicted stress-strain curves were compared with the experimental ones. The computational models incorporating the Johnson-Cook damage model could predict the plateau stress and maximum stress at the first peak with less than 18% error. Moreover, the computationally predicted deformation modes were in good agreement with the results of scaling law analysis. A layer-by-layer failure mechanism was found for the stretch-dominated structures, i.e. structures made from the cubic unit cell, while the failure of the bending-dominated structures, i.e. structures made from the diamond unit cells, was accompanied by the shearing bands of 45°. Copyright © 2015 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCaskey, Alexander J.
Hybrid programming models for beyond-CMOS technologies will prove critical for integrating new computing technologies alongside our existing infrastructure. Unfortunately the software infrastructure required to enable this is lacking or not available. XACC is a programming framework for extreme-scale, post-exascale accelerator architectures that integrates alongside existing conventional applications. It is a pluggable framework for programming languages developed for next-gen computing hardware architectures like quantum and neuromorphic computing. It lets computational scientists efficiently off-load classically intractable work to attached accelerators through user-friendly Kernel definitions. XACC makes post-exascale hybrid programming approachable for domain computational scientists.
Application of computational physics within Northrop
NASA Technical Reports Server (NTRS)
George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.
1987-01-01
An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analagous developments in CEM. The relationship between advances in these areas and the development of advances (parallel) architectures and expert systems is also presented.
A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU
NASA Astrophysics Data System (ADS)
Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha
2018-03-01
Since Graphic Processing Unit (GPU) has a strong ability of floating-point computation and memory bandwidth for data parallelism, it has been widely used in the areas of common computing such as molecular dynamics (MD), computational fluid dynamics (CFD) and so on. The emergence of compute unified device architecture (CUDA), which reduces the complexity of compiling program, brings the great opportunities to CFD. There are three different modes for parallel solution of NS equations: parallel solver based on CPU, parallel solver based on GPU and heterogeneous parallel solver based on collaborating CPU and GPU. As we can see, GPUs are relatively rich in compute capacity but poor in memory capacity and the CPUs do the opposite. We need to make full use of the GPUs and CPUs, so a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver’s computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experiment results, which demonstrate that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow, it decreases for turbulent flow, but it still can reach more than 20. What’s more, the speedup increases as the grid size becomes larger.
All-memristive neuromorphic computing with level-tuned neurons
NASA Astrophysics Data System (ADS)
Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos
2016-09-01
In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amount of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogenous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represent a significant step towards the development of ultrahigh-density neuromorphic co-processors.
All-memristive neuromorphic computing with level-tuned neurons.
Pantazi, Angeliki; Woźniak, Stanisław; Tuma, Tomas; Eleftheriou, Evangelos
2016-09-02
In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amount of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogenous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represent a significant step towards the development of ultrahigh-density neuromorphic co-processors.
DOT National Transportation Integrated Search
2007-11-01
The purpose of this document is to provide an Architecture Analysis : for the Next Generation 911 (NG911) System (or system : of systems). The U.S. Department of Transportation (USDOT) : understands that access to emergency services...
Specialized computer architectures for computational aerodynamics
NASA Technical Reports Server (NTRS)
Stevenson, D. K.
1978-01-01
In recent years, computational fluid dynamics has made significant progress in modelling aerodynamic phenomena. Currently, one of the major barriers to future development lies in the compute-intensive nature of the numerical formulations and the relative high cost of performing these computations on commercially available general purpose computers, a cost high with respect to dollar expenditure and/or elapsed time. Today's computing technology will support a program designed to create specialized computing facilities to be dedicated to the important problems of computational aerodynamics. One of the still unresolved questions is the organization of the computing components in such a facility. The characteristics of fluid dynamic problems which will have significant impact on the choice of computer architecture for a specialized facility are reviewed.
Peer-to-peer Monte Carlo simulation of photon migration in topical applications of biomedical optics
NASA Astrophysics Data System (ADS)
Doronin, Alexander; Meglinski, Igor
2012-09-01
In the framework of further development of the unified approach of photon migration in complex turbid media, such as biological tissues we present a peer-to-peer (P2P) Monte Carlo (MC) code. The object-oriented programming is used for generalization of MC model for multipurpose use in various applications of biomedical optics. The online user interface providing multiuser access is developed using modern web technologies, such as Microsoft Silverlight, ASP.NET. The emerging P2P network utilizing computers with different types of compute unified device architecture-capable graphics processing units (GPUs) is applied for acceleration and to overcome the limitations, imposed by multiuser access in the online MC computational tool. The developed P2P MC was validated by comparing the results of simulation of diffuse reflectance and fluence rate distribution for semi-infinite scattering medium with known analytical results, results of adding-doubling method, and with other GPU-based MC techniques developed in the past. The best speedup of processing multiuser requests in a range of 4 to 35 s was achieved using single-precision computing, and the double-precision computing for floating-point arithmetic operations provides higher accuracy.
Doronin, Alexander; Meglinski, Igor
2012-09-01
In the framework of further development of the unified approach of photon migration in complex turbid media, such as biological tissues we present a peer-to-peer (P2P) Monte Carlo (MC) code. The object-oriented programming is used for generalization of MC model for multipurpose use in various applications of biomedical optics. The online user interface providing multiuser access is developed using modern web technologies, such as Microsoft Silverlight, ASP.NET. The emerging P2P network utilizing computers with different types of compute unified device architecture-capable graphics processing units (GPUs) is applied for acceleration and to overcome the limitations, imposed by multiuser access in the online MC computational tool. The developed P2P MC was validated by comparing the results of simulation of diffuse reflectance and fluence rate distribution for semi-infinite scattering medium with known analytical results, results of adding-doubling method, and with other GPU-based MC techniques developed in the past. The best speedup of processing multiuser requests in a range of 4 to 35 s was achieved using single-precision computing, and the double-precision computing for floating-point arithmetic operations provides higher accuracy.
Efficient architecture for spike sorting in reconfigurable hardware.
Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying
2013-11-01
This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors share the same circuit for lowering the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to evade the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented by field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design for attaining high classification correct rate and high speed computation.
Hybrid parallel computing architecture for multiview phase shifting
NASA Astrophysics Data System (ADS)
Zhong, Kai; Li, Zhongwei; Zhou, Xiaohui; Shi, Yusheng; Wang, Congjun
2014-11-01
The multiview phase-shifting method shows its powerful capability in achieving high resolution three-dimensional (3-D) shape measurement. Unfortunately, this ability results in very high computation costs and 3-D computations have to be processed offline. To realize real-time 3-D shape measurement, a hybrid parallel computing architecture is proposed for multiview phase shifting. In this architecture, the central processing unit can co-operate with the graphic processing unit (GPU) to achieve hybrid parallel computing. The high computation cost procedures, including lens distortion rectification, phase computation, correspondence, and 3-D reconstruction, are implemented in GPU, and a three-layer kernel function model is designed to simultaneously realize coarse-grained and fine-grained paralleling computing. Experimental results verify that the developed system can perform 50 fps (frame per second) real-time 3-D measurement with 260 K 3-D points per frame. A speedup of up to 180 times is obtained for the performance of the proposed technique using a NVIDIA GT560Ti graphics card rather than a sequential C in a 3.4 GHZ Inter Core i7 3770.
The science of computing - Parallel computation
NASA Technical Reports Server (NTRS)
Denning, P. J.
1985-01-01
Although parallel computation architectures have been known for computers since the 1920s, it was only in the 1970s that microelectronic components technologies advanced to the point where it became feasible to incorporate multiple processors in one machine. Concommitantly, the development of algorithms for parallel processing also lagged due to hardware limitations. The speed of computing with solid-state chips is limited by gate switching delays. The physical limit implies that a 1 Gflop operational speed is the maximum for sequential processors. A computer recently introduced features a 'hypercube' architecture with 128 processors connected in networks at 5, 6 or 7 points per grid, depending on the design choice. Its computing speed rivals that of supercomputers, but at a fraction of the cost. The added speed with less hardware is due to parallel processing, which utilizes algorithms representing different parts of an equation that can be broken into simpler statements and processed simultaneously. Present, highly developed computer languages like FORTRAN, PASCAL, COBOL, etc., rely on sequential instructions. Thus, increased emphasis will now be directed at parallel processing algorithms to exploit the new architectures.
National IVHS Architecture Development Strategy
DOT National Transportation Integrated Search
1994-01-27
NATIONAL INFORMATION AND CONTROL SYSTEMS ARE EMERGING THAT REQUIRE SYSTEM ARCHITECTURES FOR DEPLOYMENT ACROSS THE NATION, E.G., AIR TRAFFIC CONTROL SYSTEMS, MILITARY COMMAND AND CONTROL SYSTEMS, AND OTHER NATIONAL INFORMATION SYSTEMS. THE REQUIRED CH...
Analog Computation by DNA Strand Displacement Circuits.
Song, Tianqi; Garg, Sudhanshu; Mokhtar, Reem; Bui, Hieu; Reif, John
2016-08-19
DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands, respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. On the basis of these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
Advanced Architectures for Astrophysical Supercomputing
NASA Astrophysics Data System (ADS)
Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.
2010-12-01
Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lala, J.H.; Nagle, G.A.; Harper, R.E.
1993-05-01
The Maglev control computer system should be designed to verifiably possess high reliability and safety as well as high availability to make Maglev a dependable and attractive transportation alternative to the public. A Maglev control computer system has been designed using a design-for-validation methodology developed earlier under NASA and SDIO sponsorship for real-time aerospace applications. The present study starts by defining the maglev mission scenario and ends with the definition of a maglev control computer architecture. Key intermediate steps included definitions of functional and dependability requirements, synthesis of two candidate architectures, development of qualitative and quantitative evaluation criteria, and analyticalmore » modeling of the dependability characteristics of the two architectures. Finally, the applicability of the design-for-validation methodology was also illustrated by applying it to the German Transrapid TR07 maglev control system.« less
NASA Technical Reports Server (NTRS)
Kayatin, Matthew J.; Perry, Jay L.
2017-01-01
Traditional gas-phase trace contaminant control adsorption process flow is constrained as required to maintain high contaminant single-pass adsorption efficiency. Specifically, the bed superficial velocity is controlled to limit the adsorption mass-transfer zone length relative to the physical adsorption bed; this is aided by traditional high-aspect ratio bed design. Through operation in this manner, most contaminants, including those with relatively high potential energy are readily adsorbed. A consequence of this operational approach, however, is a limited available operational flow margin. By considering a paradigm shift in adsorption architecture design and operations, in which flows of high superficial velocity are treated by low-aspect ratio sorbent beds, the range of well-adsorbed contaminants becomes limited, but the process flow is increased such that contaminant leaks or emerging contaminants of interest may be effectively controlled. To this end, the high velocity, low aspect ratio (HVLA) adsorption process architecture was demonstrated against a trace contaminant load representative of the International Space Station atmosphere. Two HVLA concept packaging designs (linear flow and radial flow) were tested. The performance of each design was evaluated and compared against computer simulation. Utilizing the HVLA process, long and sustained control of heavy organic contaminants was demonstrated.
Fast notification architecture for wireless sensor networks
NASA Astrophysics Data System (ADS)
Lee, Dong-Hahk
2013-03-01
In an emergency, since it is vital to transmit the message to the users immediately after analysing the data to prevent disaster, this article presents the deployment of a fast notification architecture for a wireless sensor network. The sensor nodes of the proposed architecture can monitor an emergency situation periodically and transmit the sensing data, immediately to the sink node. We decide on the grade of fire situation according to the decision rule using the sensing values of temperature, CO, smoke density and temperature increasing rate. On the other hand, to estimate the grade of air pollution, the sensing data, such as dust, formaldehyde, NO2, CO2, is applied to the given knowledge model. Since the sink node in the architecture has a ZigBee interface, it can transmit the alert messages in real time according to analysed results received from the host server to the terminals equipped with a SIM card-type ZigBee module. Also, the host server notifies the situation to the registered users who have cellular phone through short message service server of the cellular network. Thus, the proposed architecture can adapt an emergency situation dynamically compared to the conventional architecture using video processing. In the testbed, after generating air pollution and fire data, the terminal receives the message in less than 3 s. In the test results, this system can also be applied to buildings and public areas where many people gather together, to prevent unexpected disasters in urban settings.
Computer Security Primer: Systems Architecture, Special Ontology and Cloud Virtual Machines
ERIC Educational Resources Information Center
Waguespack, Leslie J.
2014-01-01
With the increasing proliferation of multitasking and Internet-connected devices, security has reemerged as a fundamental design concern in information systems. The shift of IS curricula toward a largely organizational perspective of security leaves little room for focus on its foundation in systems architecture, the computational underpinnings of…
Emergence of a Common Modeling Architecture for Earth System Science (Invited)
NASA Astrophysics Data System (ADS)
Deluca, C.
2010-12-01
Common modeling architecture can be viewed as a natural outcome of common modeling infrastructure. The development of model utility and coupling packages (ESMF, MCT, OpenMI, etc.) over the last decade represents the realization of a community vision for common model infrastructure. The adoption of these packages has led to increased technical communication among modeling centers and newly coupled modeling systems. However, adoption has also exposed aspects of interoperability that must be addressed before easy exchange of model components among different groups can be achieved. These aspects include common physical architecture (how a model is divided into components) and model metadata and usage conventions. The National Unified Operational Prediction Capability (NUOPC), an operational weather prediction consortium, is collaborating with weather and climate researchers to define a common model architecture that encompasses these advanced aspects of interoperability and looks to future needs. The nature and structure of the emergent common modeling architecture will be discussed along with its implications for future model development.
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Choudhary, Alok Nidhi
1989-01-01
Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform for a high level application (e.g., object recognition). An IVS normally involves algorithms from low level, intermediate level, and high level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
ATCA for Machines-- Advanced Telecommunications Computing Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, R.S.; /SLAC
2008-04-22
The Advanced Telecommunications Computing Architecture is a new industry open standard for electronics instrument modules and shelves being evaluated for the International Linear Collider (ILC). It is the first industrial standard designed for High Availability (HA). ILC availability simulations have shown clearly that the capabilities of ATCA are needed in order to achieve acceptable integrated luminosity. The ATCA architecture looks attractive for beam instruments and detector applications as well. This paper provides an overview of ongoing R&D including application of HA principles to power electronics systems.
Design and synthesis of the superionic conductor Na10SnP2S12
NASA Astrophysics Data System (ADS)
Richards, William D.; Tsujimura, Tomoyuki; Miara, Lincoln J.; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand
2016-03-01
Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm-1 rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity.
Integration of hybrid wireless networks in cloud services oriented enterprise information systems
NASA Astrophysics Data System (ADS)
Li, Shancang; Xu, Lida; Wang, Xinheng; Wang, Jue
2012-05-01
This article presents a hybrid wireless network integration scheme in cloud services-based enterprise information systems (EISs). With the emerging hybrid wireless networks and cloud computing technologies, it is necessary to develop a scheme that can seamlessly integrate these new technologies into existing EISs. By combining the hybrid wireless networks and computing in EIS, a new framework is proposed, which includes frontend layer, middle layer and backend layers connected to IP EISs. Based on a collaborative architecture, cloud services management framework and process diagram are presented. As a key feature, the proposed approach integrates access control functionalities within the hybrid framework that provide users with filtered views on available cloud services based on cloud service access requirements and user security credentials. In future work, we will implement the proposed framework over SwanMesh platform by integrating the UPnP standard into an enterprise information system.
Basic Requirements for Systems Software Research and Development
NASA Technical Reports Server (NTRS)
Kuszmaul, Chris; Nitzberg, Bill
1996-01-01
Our success over the past ten years evaluating and developing advanced computing technologies has been due to a simple research and development (R/D) model. Our model has three phases: (a) evaluating the state-of-the-art, (b) identifying problems and creating innovations, and (c) developing solutions, improving the state- of-the-art. This cycle has four basic requirements: a large production testbed with real users, a diverse collection of state-of-the-art hardware, facilities for evalua- tion of emerging technologies and development of innovations, and control over system management on these testbeds. Future research will be irrelevant and future products will not work if any of these requirements is eliminated. In order to retain our effectiveness, the numerical aerospace simulator (NAS) must replace out-of-date production testbeds in as timely a fashion as possible, and cannot afford to ignore innovative designs such as new distributed shared memory machines, clustered commodity-based computers, and multi-threaded architectures.
Design and synthesis of the superionic conductor Na10SnP2S12.
Richards, William D; Tsujimura, Tomoyuki; Miara, Lincoln J; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand
2016-03-17
Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm(-1) rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity.
NASA Astrophysics Data System (ADS)
Kelley, Troy D.; McGhee, S.
2013-05-01
This paper describes the ongoing development of a robotic control architecture that inspired by computational cognitive architectures from the discipline of cognitive psychology. The Symbolic and Sub-Symbolic Robotics Intelligence Control System (SS-RICS) combines symbolic and sub-symbolic representations of knowledge into a unified control architecture. The new architecture leverages previous work in cognitive architectures, specifically the development of the Adaptive Character of Thought-Rational (ACT-R) and Soar. This paper details current work on learning from episodes or events. The use of episodic memory as a learning mechanism has, until recently, been largely ignored by computational cognitive architectures. This paper details work on metric level episodic memory streams and methods for translating episodes into abstract schemas. The presentation will include research on learning through novelty and self generated feedback mechanisms for autonomous systems.
Sensing and perception: Connectionist approaches to subcognitive computing
NASA Technical Reports Server (NTRS)
Skrrypek, J.
1987-01-01
New approaches to machine sensing and perception are presented. The motivation for crossdisciplinary studies of perception in terms of AI and neurosciences is suggested. The question of computing architecture granularity as related to global/local computation underlying perceptual function is considered and examples of two environments are given. Finally, the examples of using one of the environments, UCLA PUNNS, to study neural architectures for visual function are presented.
Computer architecture evaluation for structural dynamics computations: Project summary
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1989-01-01
The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.
2009-03-01
SENSOR NETWORKS THESIS Presented to the Faculty Department of Electrical and Computer Engineering Graduate School of Engineering and...hierarchical, and Secure Lock within a wireless sensor network (WSN) under the Hubenko architecture. Using a Matlab computer simulation, the impact of the...rekeying protocol should be applied given particular network parameters, such as WSN size. 10 1.3 Experimental Approach A computer simulation in
Kazakis, Georgios; Kanellopoulos, Ioannis; Sotiropoulos, Stefanos; Lagaros, Nikos D
2017-10-01
Construction industry has a major impact on the environment that we spend most of our life. Therefore, it is important that the outcome of architectural intuition performs well and complies with the design requirements. Architects usually describe as "optimal design" their choice among a rather limited set of design alternatives, dictated by their experience and intuition. However, modern design of structures requires accounting for a great number of criteria derived from multiple disciplines, often of conflicting nature. Such criteria derived from structural engineering, eco-design, bioclimatic and acoustic performance. The resulting vast number of alternatives enhances the need for computer-aided architecture in order to increase the possibility of arriving at a more preferable solution. Therefore, the incorporation of smart, automatic tools in the design process, able to further guide designer's intuition becomes even more indispensable. The principal aim of this study is to present possibilities to integrate automatic computational techniques related to topology optimization in the phase of intuition of civil structures as part of computer aided architectural design. In this direction, different aspects of a new computer aided architectural era related to the interpretation of the optimized designs, difficulties resulted from the increased computational effort and 3D printing capabilities are covered here in.
WebGIS based community services architecture by griddization managements and crowdsourcing services
NASA Astrophysics Data System (ADS)
Wang, Haiyin; Wan, Jianhua; Zeng, Zhe; Zhou, Shengchuan
2016-11-01
Along with the fast economic development of cities, rapid urbanization, population surge, in China, the social community service mechanisms need to be rationalized and the policy standards need to be unified, which results in various types of conflicts and challenges for community services of government. Based on the WebGIS technology, the article provides a community service architecture by gridding management and crowdsourcing service. The WEBGIS service architecture includes two parts: the cloud part and the mobile part. The cloud part refers to community service centres, which can instantaneously response the emergency, visualize the scene of the emergency, and analyse the data from the emergency. The mobile part refers to the mobile terminal, which can call the centre, report the event, collect data and verify the feedback. This WebGIS based community service systems for Huangdao District of Qingdao, were awarded the “2015’ national innovation of social governance case of typical cases”.
McLaughlin, David; Shapley, Robert; Shelley, Michael
2003-01-01
A large-scale computational model of a local patch of input layer 4 [Formula: see text] of the primary visual cortex (V1) of the macaque monkey, together with a coarse-grained reduction of the model, are used to understand potential effects of cortical architecture upon neuronal performance. Both the large-scale point neuron model and its asymptotic reduction are described. The work focuses upon orientation preference and selectivity, and upon the spatial distribution of neuronal responses across the cortical layer. Emphasis is given to the role of cortical architecture (the geometry of synaptic connectivity, of the ordered and disordered structure of input feature maps, and of their interplay) as mechanisms underlying cortical responses within the model. Specifically: (i) Distinct characteristics of model neuronal responses (firing rates and orientation selectivity) as they depend upon the neuron's location within the cortical layer relative to the pinwheel centers of the map of orientation preference; (ii) A time independent (DC) elevation in cortico-cortical conductances within the model, in contrast to a "push-pull" antagonism between excitation and inhibition; (iii) The use of asymptotic analysis to unveil mechanisms which underly these performances of the model; (iv) A discussion of emerging experimental data. The work illustrates that large-scale scientific computation--coupled together with analytical reduction, mathematical analysis, and experimental data, can provide significant understanding and intuition about the possible mechanisms of cortical response. It also illustrates that the idealization which is a necessary part of theoretical modeling can outline in sharp relief the consequences of differing alternative interpretations and mechanisms--with final arbiter being a body of experimental evidence whose measurements address the consequences of these analyses.
NASA Astrophysics Data System (ADS)
Tyson, Eric J.; Buckley, James; Franklin, Mark A.; Chamberlain, Roger D.
2008-10-01
The imaging atmospheric Cherenkov technique for high-energy gamma-ray astronomy is emerging as an important new technique for studying the high energy universe. Current experiments have data rates of ≈20TB/year and duty cycles of about 10%. In the future, more sensitive experiments may produce up to 1000 TB/year. The data analysis task for these experiments requires keeping up with this data rate in close to real-time. Such data analysis is a classic example of a streaming application with very high performance requirements. This class of application often benefits greatly from the use of non-traditional approaches for computation including using special purpose hardware (FPGAs and ASICs), or sophisticated parallel processing techniques. However, designing, debugging, and deploying to these architectures is difficult and thus they are not widely used by the astrophysics community. This paper presents the Auto-Pipe design toolset that has been developed to address many of the difficulties in taking advantage of complex streaming computer architectures for such applications. Auto-Pipe incorporates a high-level coordination language, functional and performance simulation tools, and the ability to deploy applications to sophisticated architectures. Using the Auto-Pipe toolset, we have implemented the front-end portion of an imaging Cherenkov data analysis application, suitable for real-time or offline analysis. The application operates on data from the VERITAS experiment, and shows how Auto-Pipe can greatly ease performance optimization and application deployment of a wide variety of platforms. We demonstrate a performance improvement over a traditional software approach of 32x using an FPGA solution and 3.6x using a multiprocessor based solution.
NASA Technical Reports Server (NTRS)
Smith, Paul H.
1988-01-01
The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.
Architecture for hospital information integration
NASA Astrophysics Data System (ADS)
Chimiak, William J.; Janariz, Daniel L.; Martinez, Ralph
1999-07-01
The ongoing integration of hospital information systems (HIS) continues. Data storage systems, data networks and computers improve, data bases grow and health-care applications increase. Some computer operating systems continue to evolve and some fade. Health care delivery now depends on this computer-assisted environment. The result is the critical harmonization of the various hospital information systems becomes increasingly difficult. The purpose of this paper is to present an architecture for HIS integration that is computer-language-neutral and computer- hardware-neutral for the informatics applications. The proposed architecture builds upon the work done at the University of Arizona on middleware, the work of the National Electrical Manufacturers Association, and the American College of Radiology. It is a fresh approach to allowing applications engineers to access medical data easily and thus concentrates on the application techniques in which they are expert without struggling with medical information syntaxes. The HIS can be modeled using a hierarchy of information sub-systems thus facilitating its understanding. The architecture includes the resulting information model along with a strict but intuitive application programming interface, managed by CORBA. The CORBA requirement facilitates interoperability. It should also reduce software and hardware development times.
FPGA Implementation of Generalized Hebbian Algorithm for Texture Classification
Lin, Shiow-Jyu; Hwang, Wen-Jyi; Lee, Wei-Hao
2012-01-01
This paper presents a novel hardware architecture for principal component analysis. The architecture is based on the Generalized Hebbian Algorithm (GHA) because of its simplicity and effectiveness. The architecture is separated into three portions: the weight vector updating unit, the principal computation unit and the memory unit. In the weight vector updating unit, the computation of different synaptic weight vectors shares the same circuit for reducing the area costs. To show the effectiveness of the circuit, a texture classification system based on the proposed architecture is physically implemented by Field Programmable Gate Array (FPGA). It is embedded in a System-On-Programmable-Chip (SOPC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient design for attaining both high speed performance and low area costs. PMID:22778640
NASA Technical Reports Server (NTRS)
Hale, Mark A.; Craig, James I.; Mistree, Farrokh; Schrage, Daniel P.
1995-01-01
Computing architectures are being assembled that extend concurrent engineering practices by providing more efficient execution and collaboration on distributed, heterogeneous computing networks. Built on the successes of initial architectures, requirements for a next-generation design computing infrastructure can be developed. These requirements concentrate on those needed by a designer in decision-making processes from product conception to recycling and can be categorized in two areas: design process and design information management. A designer both designs and executes design processes throughout design time to achieve better product and process capabilities while expanding fewer resources. In order to accomplish this, information, or more appropriately design knowledge, needs to be adequately managed during product and process decomposition as well as recomposition. A foundation has been laid that captures these requirements in a design architecture called DREAMS (Developing Robust Engineering Analysis Models and Specifications). In addition, a computing infrastructure, called IMAGE (Intelligent Multidisciplinary Aircraft Generation Environment), is being developed that satisfies design requirements defined in DREAMS and incorporates enabling computational technologies.
Gigaflop architecture, a hardware perspective
NASA Technical Reports Server (NTRS)
Feierbach, G. F.
1978-01-01
Any super computer built in the early 1980s will use components that are available by fall 1978. The architecture of such a system cannot depart radically from current super computers if the software experience painfully acquired from these computers in the 70's is to apply. Given the above constraints, 10 billion floating point operations per second (BFLOPS) are attainable and a problem memory of 512 million (64 bit) words could be supported by the technology of the time. In contrast to this, industry is likely to respond with commercially available machines with a performance of less than 150 MFLOPS. This is due to self-imposed constraints on the manufacturers to provide upward compatible architectures (same instruction set) and systems which can be sold in significant volumes. Since this computing speed is inadequate to meet the demands of computational fluid dynamics, a special processor is required. Issues which are felt to be significant in the pursuit of maximum compute capability in this special processor are discussed.
Using Multimedia for Teaching Analysis in History of Modern Architecture.
ERIC Educational Resources Information Center
Perryman, Garry
This paper presents a case for the development and support of a computer-based interactive multimedia program for teaching analysis in community college architecture design programs. Analysis in architecture design is an extremely important strategy for the teaching of higher-order thinking skills, which senior schools of architecture look for in…
Progress in a novel architecture for high performance processing
NASA Astrophysics Data System (ADS)
Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin
2018-04-01
The high performance processing (HPP) is an innovative architecture which targets on high performance computing with excellent power efficiency and computing performance. It is suitable for data intensive applications like supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores which is the first generation of HPP cores has been taped out successfully under Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP has made great improvement in architecture. The chip with 32 HPP cores is being developed under TSMC 16 nm field effect transistor (FFC) technology process and is planed to use commercially. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).
Multiple Embedded Processors for Fault-Tolerant Computing
NASA Technical Reports Server (NTRS)
Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy
2005-01-01
A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
A static data flow simulation study at Ames Research Center
NASA Technical Reports Server (NTRS)
Barszcz, Eric; Howard, Lauri S.
1987-01-01
Demands in computational power, particularly in the area of computational fluid dynamics (CFD), led NASA Ames Research Center to study advanced computer architectures. One architecture being studied is the static data flow architecture based on research done by Jack B. Dennis at MIT. To improve understanding of this architecture, a static data flow simulator, written in Pascal, has been implemented for use on a Cray X-MP/48. A matrix multiply and a two-dimensional fast Fourier transform (FFT), two algorithms used in CFD work at Ames, have been run on the simulator. Execution times can vary by a factor of more than 2 depending on the partitioning method used to assign instructions to processing elements. Service time for matching tokens has proved to be a major bottleneck. Loop control and array address calculation overhead can double the execution time. The best sustained MFLOPS rates were less than 50% of the maximum capability of the machine.
Strategies for concurrent processing of complex algorithms in data driven architectures
NASA Technical Reports Server (NTRS)
Stoughton, John W.; Mielke, Roland R.
1988-01-01
The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor having local program memory, and which communicates with a common global data memory. A new graph theoretic model called ATAMM which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
Muller, George; Perkins, Casey J.; Lancaster, Mary J.; MacDonald, Douglas G.; Clements, Samuel L.; Hutton, William J.; Patrick, Scott W.; Key, Bradley Robert
2015-07-28
Computer-implemented security evaluation methods, security evaluation systems, and articles of manufacture are described. According to one aspect, a computer-implemented security evaluation method includes accessing information regarding a physical architecture and a cyber architecture of a facility, building a model of the facility comprising a plurality of physical areas of the physical architecture, a plurality of cyber areas of the cyber architecture, and a plurality of pathways between the physical areas and the cyber areas, identifying a target within the facility, executing the model a plurality of times to simulate a plurality of attacks against the target by an adversary traversing at least one of the areas in the physical domain and at least one of the areas in the cyber domain, and using results of the executing, providing information regarding a security risk of the facility with respect to the target.
Enterprise application architecture development based on DoDAF and TOGAF
NASA Astrophysics Data System (ADS)
Tao, Zhi-Gang; Luo, Yun-Feng; Chen, Chang-Xin; Wang, Ming-Zhe; Ni, Feng
2017-05-01
For the purpose of supporting the design and analysis of enterprise application architecture, here, we report a tailored enterprise application architecture description framework and its corresponding design method. The presented framework can effectively support service-oriented architecting and cloud computing by creating the metadata model based on architecture content framework (ACF), DoDAF metamodel (DM2) and Cloud Computing Modelling Notation (CCMN). The framework also makes an effort to extend and improve the mapping between The Open Group Architecture Framework (TOGAF) application architectural inputs/outputs, deliverables and Department of Defence Architecture Framework (DoDAF)-described models. The roadmap of 52 DoDAF-described models is constructed by creating the metamodels of these described models and analysing the constraint relationship among metamodels. By combining the tailored framework and the roadmap, this article proposes a service-oriented enterprise application architecture development process. Finally, a case study is presented to illustrate the results of implementing the tailored framework in the Southern Base Management Support and Information Platform construction project using the development process proposed by the paper.
Selecting a Benchmark Suite to Profile High-Performance Computing (HPC) Machines
2014-11-01
architectures. Machines now contain central processing units (CPUs), graphics processing units (GPUs), and many integrated core ( MIC ) architecture all...evaluate the feasibility and applicability of a new architecture just released to the market . Researchers are often unsure how available resources will...architectures. Having a suite of programs running on different architectures, such as GPUs, MICs , and CPUs, adds complexity and technical challenges
NASA Technical Reports Server (NTRS)
Fijany, Amir (Inventor); Bejczy, Antal K. (Inventor)
1993-01-01
This is a real-time robotic controller and simulator which is a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each SIMD processor being a SIMD parallel processor capable of exploiting fine grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.
NASA Astrophysics Data System (ADS)
Mehta, Neville; Kompalli, Suryaprakash; Chaudhary, Vipin
Teleradiology is the electronic transmission of radiological patient images, such as x-rays, CT, or MR across multiple locations. The goal could be interpretation, consultation, or medical records keeping. Information technology solutions have enabled electronic records and their associated benefits are evident in health care today. However, salient aspects of collaborative interfaces, and computer assisted diagnostic (CAD) tools are yet to be integrated into workflow designs. The Computer Assisted Diagnostics and Interventions (CADI) group at the University at Buffalo has developed an architecture that facilitates web-enabled use of CAD tools, along with the novel concept of synchronized collaboration. The architecture can support multiple teleradiology applications and case studies are presented here.
Design of a modular digital computer system, CDRL no. D001, final design plan
NASA Technical Reports Server (NTRS)
Easton, R. A.
1975-01-01
The engineering breadboard implementation for the CDRL no. D001 modular digital computer system developed during design of the logic system was documented. This effort followed the architecture study completed and documented previously, and was intended to verify the concepts of a fault tolerant, automatically reconfigurable, modular version of the computer system conceived during the architecture study. The system has a microprogrammed 32 bit word length, general register architecture and an instruction set consisting of a subset of the IBM System 360 instruction set plus additional fault tolerance firmware. The following areas were covered: breadboard packaging, central control element, central processing element, memory, input/output processor, and maintenance/status panel and electronics.
NASA Astrophysics Data System (ADS)
Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.
2017-11-01
Current trend in processor manufacturing focuses on multi-core architectures rather than increasing the clock speed for performance improvement. Graphic processors have become as commodity hardware for providing fast co-processing in computer systems. Developments in IoT, social networking web applications, big data created huge demand for data processing activities and such kind of throughput intensive applications inherently contains data level parallelism which is more suited for SIMD architecture based GPU. This paper reviews the architectural aspects of multi/many core processors and graphics processors. Different case studies are taken to compare performance of throughput computing applications using shared memory programming in OpenMP and CUDA API based programming.
Real-time multimedia communications in medical emergency - the CONCERTO project solution.
Martini, Maria G; Iacobelli, Lorenzo; Bergeron, Cyril; Hewage, Chaminda T; Panza, Gianmarco; Piri, Esa; Vehkapera, Janne; Amon, Peter; Mazzotti, Matteo; Savino, Ketty; Bokor, Laszlo
2015-01-01
The management of medical emergency, in particular cardiac emergency, requests prompt intervention and the possibility to communicate in real time from the emergency area / ambulance to the hospital as much diagnostic information as possible about the patient. This would enable a prompt emergency diagnosis and operation and the possibility to prepare the appropriate actions in the suitable hospital department. To address this scenario, the CONCERTO European project proposed a wireless communication system based on a novel cross-layer architecture, including the integration of building blocks for medical media content fusion, delivery and access. This paper describes the proposed system architecture, outlining the developed components and mechanisms, and the evaluation of the proposed system, carried out in a hospital with the support of medical staff. The technical results and the feedback received highlight the impact of the CONCERTO approach in the healthcare domain, in particular in enabling a prompt and reliable diagnosis in challenging medical emergency scenarios.
Examining the architecture of cellular computing through a comparative study with a computer
Wang, Degeng; Gribskov, Michael
2005-01-01
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software–hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's ‘hardware’ equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the ‘bandwidth’ of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed. PMID:16849179
Examining the architecture of cellular computing through a comparative study with a computer.
Wang, Degeng; Gribskov, Michael
2005-06-22
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.
SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.
Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi
2018-01-01
Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Arrays (FPGAs) devices to provide high performance execution and flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using a special-purpose Processing Elements (PEs) for computing SNNs, and analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor SNN's activity. Our contribution intends to provide a tool that allows to prototype SNNs faster than on CPU/GPU architectures but significantly cheaper than fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Design and Analysis of a Neuromemristive Reservoir Computing Architecture for Biosignal Processing
Kudithipudi, Dhireesha; Saleh, Qutaiba; Merkel, Cory; Thesing, James; Wysocki, Bryant
2016-01-01
Reservoir computing (RC) is gaining traction in several signal processing domains, owing to its non-linear stateful computation, spatiotemporal encoding, and reduced training complexity over recurrent neural networks (RNNs). Previous studies have shown the effectiveness of software-based RCs for a wide spectrum of applications. A parallel body of work indicates that realizing RNN architectures using custom integrated circuits and reconfigurable hardware platforms yields significant improvements in power and latency. In this research, we propose a neuromemristive RC architecture, with doubly twisted toroidal structure, that is validated for biosignal processing applications. We exploit the device mismatch to implement the random weight distributions within the reservoir and propose mixed-signal subthreshold circuits for energy efficiency. A comprehensive analysis is performed to compare the efficiency of the neuromemristive RC architecture in both digital(reconfigurable) and subthreshold mixed-signal realizations. Both Electroencephalogram (EEG) and Electromyogram (EMG) biosignal benchmarks are used for validating the RC designs. The proposed RC architecture demonstrated an accuracy of 90 and 84% for epileptic seizure detection and EMG prosthetic finger control, respectively. PMID:26869876
Yoo, Dongjin
2012-07-01
Advanced additive manufacture (AM) techniques are now being developed to fabricate scaffolds with controlled internal pore architectures in the field of tissue engineering. In general, these techniques use a hybrid method which combines computer-aided design (CAD) with computer-aided manufacturing (CAM) tools to design and fabricate complicated three-dimensional (3D) scaffold models. The mathematical descriptions of micro-architectures along with the macro-structures of the 3D scaffold models are limited by current CAD technologies as well as by the difficulty of transferring the designed digital models to standard formats for fabrication. To overcome these difficulties, we have developed an efficient internal pore architecture design system based on triply periodic minimal surface (TPMS) unit cell libraries and associated computational methods to assemble TPMS unit cells into an entire scaffold model. In addition, we have developed a process planning technique based on TPMS internal architecture pattern of unit cells to generate tool paths for freeform fabrication of tissue engineering porous scaffolds. Copyright © 2012 IPEM. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Vilotte, J.-P.; Atkinson, M.; Michelini, A.; Igel, H.; van Eck, T.
2012-04-01
Increasingly dense seismic and geodetic networks are continuously transmitting a growing wealth of data from around the world. The multi-use of these data leaded the seismological community to pioneer globally distributed open-access data infrastructures, standard services and formats, e.g., the Federation of Digital Seismic Networks (FDSN) and the European Integrated Data Archives (EIDA). Our ability to acquire observational data outpaces our ability to manage, analyze and model them. Research in seismology is today facing a fundamental paradigm shift. Enabling advanced data-intensive analysis and modeling applications challenges conventional storage, computation and communication models and requires a new holistic approach. It is instrumental to exploit the cornucopia of data, and to guarantee optimal operation and design of the high-cost monitoring facilities. The strategy of VERCE is driven by the needs of the seismological data-intensive applications in data analysis and modeling. It aims to provide a comprehensive architecture and framework adapted to the scale and the diversity of those applications, and integrating the data infrastructures with Grid, Cloud and HPC infrastructures. It will allow prototyping solutions for new use cases as they emerge within the European Plate Observatory Systems (EPOS), the ESFRI initiative of the solid Earth community. Computational seismology, and information management, is increasingly revolving around massive amounts of data that stem from: (1) the flood of data from the observational systems; (2) the flood of data from large-scale simulations and inversions; (3) the ability to economically store petabytes of data online; (4) the evolving Internet and Data-aware computing capabilities. As data-intensive applications are rapidly increasing in scale and complexity, they require additional services-oriented architectures offering a virtualization-based flexibility for complex and re-usable workflows. Scientific information management poses computer science challenges: acquisition, organization, query and visualization tasks scale almost linearly with the data volumes. Commonly used FTP-GREP metaphor allows today to scan gigabyte-sized datasets but will not work for scanning terabyte-sized continuous waveform datasets. New data analysis and modeling methods, exploiting the signal coherence within dense network arrays, are nonlinear. Pair-algorithms on N points scale as N2. Wave form inversion and stochastic simulations raise computing and data handling challenges These applications are unfeasible for tera-scale datasets without new parallel algorithms that use near-linear processing, storage and bandwidth, and that can exploit new computing paradigms enabled by the intersection of several technologies (HPC, parallel scalable database crawler, data-aware HPC). This issues will be discussed based on a number of core pilot data-intensive applications and use cases retained in VERCE. This core applications are related to: (1) data processing and data analysis methods based on correlation techniques; (2) cpu-intensive applications such as large-scale simulation of synthetic waveforms in complex earth systems, and full waveform inversion and tomography. We shall analyze their workflow and data flow, and their requirements for a new service-oriented architecture and a data-aware platform with services and tools. Finally, we will outline the importance of a new collaborative environment between seismology and computer science, together with the need for the emergence and the recognition of 'research technologists' mastering the evolving data-aware technologies and the data-intensive research goals in seismology.
Embedded Data Processor and Portable Computer Technology testbeds
NASA Technical Reports Server (NTRS)
Alena, Richard; Liu, Yuan-Kwei; Goforth, Andre; Fernquist, Alan R.
1993-01-01
Attention is given to current activities in the Embedded Data Processor and Portable Computer Technology testbed configurations that are part of the Advanced Data Systems Architectures Testbed at the Information Sciences Division at NASA Ames Research Center. The Embedded Data Processor Testbed evaluates advanced microprocessors for potential use in mission and payload applications within the Space Station Freedom Program. The Portable Computer Technology (PCT) Testbed integrates and demonstrates advanced portable computing devices and data system architectures. The PCT Testbed uses both commercial and custom-developed devices to demonstrate the feasibility of functional expansion and networking for portable computers in flight missions.
A supportive architecture for CFD-based design optimisation
NASA Astrophysics Data System (ADS)
Li, Ni; Su, Zeya; Bi, Zhuming; Tian, Chao; Ren, Zhiming; Gong, Guanghong
2014-03-01
Multi-disciplinary design optimisation (MDO) is one of critical methodologies to the implementation of enterprise systems (ES). MDO requiring the analysis of fluid dynamics raises a special challenge due to its extremely intensive computation. The rapid development of computational fluid dynamic (CFD) technique has caused a rise of its applications in various fields. Especially for the exterior designs of vehicles, CFD has become one of the three main design tools comparable to analytical approaches and wind tunnel experiments. CFD-based design optimisation is an effective way to achieve the desired performance under the given constraints. However, due to the complexity of CFD, integrating with CFD analysis in an intelligent optimisation algorithm is not straightforward. It is a challenge to solve a CFD-based design problem, which is usually with high dimensions, and multiple objectives and constraints. It is desirable to have an integrated architecture for CFD-based design optimisation. However, our review on existing works has found that very few researchers have studied on the assistive tools to facilitate CFD-based design optimisation. In the paper, a multi-layer architecture and a general procedure are proposed to integrate different CFD toolsets with intelligent optimisation algorithms, parallel computing technique and other techniques for efficient computation. In the proposed architecture, the integration is performed either at the code level or data level to fully utilise the capabilities of different assistive tools. Two intelligent algorithms are developed and embedded with parallel computing. These algorithms, together with the supportive architecture, lay a solid foundation for various applications of CFD-based design optimisation. To illustrate the effectiveness of the proposed architecture and algorithms, the case studies on aerodynamic shape design of a hypersonic cruising vehicle are provided, and the result has shown that the proposed architecture and developed algorithms have performed successfully and efficiently in dealing with the design optimisation with over 200 design variables.
Prospective Architectures for Onboard vs Cloud-Based Decision Making for Unmanned Aerial Systems
NASA Technical Reports Server (NTRS)
Sankararaman, Shankar; Teubert, Christopher
2017-01-01
This paper investigates propsective architectures for decision-making in unmanned aerial systems. When these unmanned vehicles operate in urban environments, there are several sources of uncertainty that affect their behavior, and decision-making algorithms need to be robust to account for these different sources of uncertainty. It is important to account for several risk-factors that affect the flight of these unmanned systems, and facilitate decision-making by taking into consideration these various risk-factors. In addition, there are several technical challenges related to autonomous flight of unmanned aerial systems; these challenges include sensing, obstacle detection, path planning and navigation, trajectory generation and selection, etc. Many of these activities require significant computational power and in many situations, all of these activities need to be performed in real-time. In order to efficiently integrate these activities, it is important to develop a systematic architecture that can facilitate real-time decision-making. Four prospective architectures are discussed in this paper; on one end of the spectrum, the first architecture considers all activities/computations being performed onboard the vehicle whereas on the other end of the spectrum, the fourth and final architecture considers all activities/computations being performed in the cloud, using a new service known as Prognostics as a Service that is being developed at NASA Ames Research Center. The four different architectures are compared, their advantages and disadvantages are explained and conclusions are presented.
NASA Astrophysics Data System (ADS)
Zaveri, Mazad Shaheriar
The semiconductor/computer industry has been following Moore's law for several decades and has reaped the benefits in speed and density of the resultant scaling. Transistor density has reached almost one billion per chip, and transistor delays are in picoseconds. However, scaling has slowed down, and the semiconductor industry is now facing several challenges. Hybrid CMOS/nano technologies, such as CMOL, are considered as an interim solution to some of the challenges. Another potential architectural solution includes specialized architectures for applications/models in the intelligent computing domain, one aspect of which includes abstract computational models inspired from the neuro/cognitive sciences. Consequently in this dissertation, we focus on the hardware implementations of Bayesian Memory (BM), which is a (Bayesian) Biologically Inspired Computational Model (BICM). This model is a simplified version of George and Hawkins' model of the visual cortex, which includes an inference framework based on Judea Pearl's belief propagation. We then present a "hardware design space exploration" methodology for implementing and analyzing the (digital and mixed-signal) hardware for the BM. This particular methodology involves: analyzing the computational/operational cost and the related micro-architecture, exploring candidate hardware components, proposing various custom hardware architectures using both traditional CMOS and hybrid nanotechnology - CMOL, and investigating the baseline performance/price of these architectures. The results suggest that CMOL is a promising candidate for implementing a BM. Such implementations can utilize the very high density storage/computation benefits of these new nano-scale technologies much more efficiently; for example, the throughput per 858 mm2 (TPM) obtained for CMOL based architectures is 32 to 40 times better than the TPM for a CMOS based multiprocessor/multi-FPGA system, and almost 2000 times better than the TPM for a PC implementation. We later use this methodology to investigate the hardware implementations of cortex-scale spiking neural system, which is an approximate neural equivalent of BICM based cortex-scale system. The results of this investigation also suggest that CMOL is a promising candidate to implement such large-scale neuromorphic systems. In general, the assessment of such hypothetical baseline hardware architectures provides the prospects for building large-scale (mammalian cortex-scale) implementations of neuromorphic/Bayesian/intelligent systems using state-of-the-art and beyond state-of-the-art silicon structures.
NASA Astrophysics Data System (ADS)
Lhamon, Michael Earl
A pattern recognition system which uses complex correlation filter banks requires proportionally more computational effort than single-real valued filters. This introduces increased computation burden but also introduces a higher level of parallelism, that common computing platforms fail to identify. As a result, we consider algorithm mapping to both optical and digital processors. For digital implementation, we develop computationally efficient pattern recognition algorithms, referred to as, vector inner product operators that require less computational effort than traditional fast Fourier methods. These algorithms do not need correlation and they map readily onto parallel digital architectures, which imply new architectures for optical processors. These filters exploit circulant-symmetric matrix structures of the training set data representing a variety of distortions. By using the same mathematical basis as with the vector inner product operations, we are able to extend the capabilities of more traditional correlation filtering to what we refer to as "Super Images". These "Super Images" are used to morphologically transform a complicated input scene into a predetermined dot pattern. The orientation of the dot pattern is related to the rotational distortion of the object of interest. The optical implementation of "Super Images" yields feature reduction necessary for using other techniques, such as artificial neural networks. We propose a parallel digital signal processor architecture based on specific pattern recognition algorithms but general enough to be applicable to other similar problems. Such an architecture is classified as a data flow architecture. Instead of mapping an algorithm to an architecture, we propose mapping the DSP architecture to a class of pattern recognition algorithms. Today's optical processing systems have difficulties implementing full complex filter structures. Typically, optical systems (like the 4f correlators) are limited to phase-only implementation with lower detection performance than full complex electronic systems. Our study includes pseudo-random pixel encoding techniques for approximating full complex filtering. Optical filter bank implementation is possible and they have the advantage of time averaging the entire filter bank at real time rates. Time-averaged optical filtering is computational comparable to billions of digital operations-per-second. For this reason, we believe future trends in high speed pattern recognition will involve hybrid architectures of both optical and DSP elements.
Super and parallel computers and their impact on civil engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamat, M.P.
1986-01-01
This book presents the papers given at a conference on the use of supercomputers in civil engineering. Topics considered at the conference included solving nonlinear equations on a hypercube, a custom architectured parallel processing system, distributed data processing, algorithms, computer architecture, parallel processing, vector processing, computerized simulation, and cost benefit analysis.
p88110: A Graphical Simulator for Computer Architecture and Organization Courses
ERIC Educational Resources Information Center
Garcia, M. I.; Rodriguez, S.; Perez, A.; Garcia, A.
2009-01-01
Studying fundamental Computer Architecture and Organization topics requires a significant amount of practical work if students are to acquire a good grasp of the theoretical concepts presented in classroom lectures or textbooks. The use of simulators is commonly adopted in order to reach this objective. However, as most of the available…
Component architecture in drug discovery informatics.
Smith, Peter M
2002-05-01
This paper reviews the characteristics of a new model of computing that has been spurred on by the Internet, known as Netcentric computing. Developments in this model led to distributed component architectures, which, although not new ideas, are now realizable with modern tools such as Enterprise Java. The application of this approach to scientific computing, particularly in pharmaceutical discovery research, is discussed and highlighted by a particular case involving the management of biological assay data.
FPGA-based architecture for motion recovering in real-time
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel; Maya-Rueda, Selene E.; Torres-Huitzil, Cesar
2002-03-01
A key problem in the computer vision field is the measurement of object motion in a scene. The main goal is to compute an approximation of the 3D motion from the analysis of an image sequence. Once computed, this information can be used as a basis to reach higher level goals in different applications. Motion estimation algorithms pose a significant computational load for the sequential processors limiting its use in practical applications. In this work we propose a hardware architecture for motion estimation in real time based on FPGA technology. The technique used for motion estimation is Optical Flow due to its accuracy, and the density of velocity estimation, however other techniques are being explored. The architecture is composed of parallel modules working in a pipeline scheme to reach high throughput rates near gigaflops. The modules are organized in a regular structure to provide a high degree of flexibility to cover different applications. Some results will be presented and the real-time performance will be discussed and analyzed. The architecture is prototyped in an FPGA board with a Virtex device interfaced to a digital imager.
2015-05-01
Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web-Based and Mobile Devices Walt Scacchi and Thomas...2015 to 00-00-2015 4. TITLE AND SUBTITLE Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web-Based and...architecture (OA) software systems Emerging challenges in achieving Better Buying Power (BBP) via OA software systems for Web- based and Mobile devices
NASA Technical Reports Server (NTRS)
Torres-Pomales, Wilfredo
2014-01-01
This report presents an example of the application of multi-criteria decision analysis to the selection of an architecture for a safety-critical distributed computer system. The design problem includes constraints on minimum system availability and integrity, and the decision is based on the optimal balance of power, weight and cost. The analysis process includes the generation of alternative architectures, evaluation of individual decision criteria, and the selection of an alternative based on overall value. In this example presented here, iterative application of the quantitative evaluation process made it possible to deliberately generate an alternative architecture that is superior to all others regardless of the relative importance of cost.
Fault tolerant architectures for integrated aircraft electronics systems, task 2
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1984-01-01
The architectural basis for an advanced fault tolerant on-board computer to succeed the current generation of fault tolerant computers is examined. The network error tolerant system architecture is studied with particular attention to intercluster configurations and communication protocols, and to refined reliability estimates. The diagnosis of faults, so that appropriate choices for reconfiguration can be made is discussed. The analysis relates particularly to the recognition of transient faults in a system with tasks at many levels of priority. The demand driven data-flow architecture, which appears to have possible application in fault tolerant systems is described and work investigating the feasibility of automatic generation of aircraft flight control programs from abstract specifications is reported.
A cognitive computational model inspired by the immune system response.
Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim
2014-01-01
The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that is oscillating between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have been always standing out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of immune system, to figure out the nature of response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by the natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, meanwhile illustrating our proposed architecture perspective.
A Cognitive Computational Model Inspired by the Immune System Response
Abdo Abd Al-Hady, Mohamed; Badr, Amr Ahmed; Mostafa, Mostafa Abd Al-Azim
2014-01-01
The immune system has a cognitive ability to differentiate between healthy and unhealthy cells. The immune system response (ISR) is stimulated by a disorder in the temporary fuzzy state that is oscillating between the healthy and unhealthy states. However, modeling the immune system is an enormous challenge; the paper introduces an extensive summary of how the immune system response functions, as an overview of a complex topic, to present the immune system as a cognitive intelligent agent. The homogeneity and perfection of the natural immune system have been always standing out as the sought-after model we attempted to imitate while building our proposed model of cognitive architecture. The paper divides the ISR into four logical phases: setting a computational architectural diagram for each phase, proceeding from functional perspectives (input, process, and output), and their consequences. The proposed architecture components are defined by matching biological operations with computational functions and hence with the framework of the paper. On the other hand, the architecture focuses on the interoperability of main theoretical immunological perspectives (classic, cognitive, and danger theory), as related to computer science terminologies. The paper presents a descriptive model of immune system, to figure out the nature of response, deemed to be intrinsic for building a hybrid computational model based on a cognitive intelligent agent perspective and inspired by the natural biology. To that end, this paper highlights the ISR phases as applied to a case study on hepatitis C virus, meanwhile illustrating our proposed architecture perspective. PMID:25003131
Howes, Andrew; Lewis, Richard L; Vera, Alonso
2009-10-01
The authors assume that individuals adapt rationally to a utility function given constraints imposed by their cognitive architecture and the local task environment. This assumption underlies a new approach to modeling and understanding cognition-cognitively bounded rational analysis-that sharpens the predictive acuity of general, integrated theories of cognition and action. Such theories provide the necessary computational means to explain the flexible nature of human behavior but in doing so introduce extreme degrees of freedom in accounting for data. The new approach narrows the space of predicted behaviors through analysis of the payoff achieved by alternative strategies, rather than through fitting strategies and theoretical parameters to data. It extends and complements established approaches, including computational cognitive architectures, rational analysis, optimal motor control, bounded rationality, and signal detection theory. The authors illustrate the approach with a reanalysis of an existing account of psychological refractory period (PRP) dual-task performance and the development and analysis of a new theory of ordered dual-task responses. These analyses yield several novel results, including a new understanding of the role of strategic variation in existing accounts of PRP and the first predictive, quantitative account showing how the details of ordered dual-task phenomena emerge from the rational control of a cognitive system subject to the combined constraints of internal variance, motor interference, and a response selection bottleneck.
The Spin Torque Lego - from spin torque nano-devices to advanced computing architectures
NASA Astrophysics Data System (ADS)
Grollier, Julie
2013-03-01
Spin transfer torque (STT), predicted in 1996, and first observed around 2000, brought spintronic devices to the realm of active elements. A whole class of new devices, based on the combined effects of STT for writing and Giant Magneto-Resistance or Tunnel Magneto-Resistance for reading has emerged. The second generation of MRAMs, based on spin torque writing : the STT-RAM, is under industrial development and should be out on the market in three years. But spin torque devices are not limited to binary memories. We will rapidly present how the spin torque effect also allows to implement non-linear nano-oscillators, spin-wave emitters, controlled stochastic devices and microwave nano-detectors. What is extremely interesting is that all these functionalities can be obtained using the same materials, the exact same stack, simply by changing the device geometry and its bias conditions. So these different devices can be seen as Lego bricks, each brick with its own functionality. During this talk, I will show how spin torque can be engineered to build new bricks, such as the Spintronic Memristor, an artificial magnetic nano-synapse. I will then give hints on how to assemble these bricks in order to build novel types of computing architectures, with a special focus on neuromorphic circuits. Financial support by the European Research Council Starting Grant NanoBrain (ERC 2010 Stg 259068) is acknowledged.
PCM-Based Durable Write Cache for Fast Disk I/O
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhuo; Wang, Bin; Carpenter, Patrick
2012-01-01
Flash based solid-state devices (FSSDs) have been adopted within the memory hierarchy to improve the performance of hard disk drive (HDD) based storage system. However, with the fast development of storage-class memories, new storage technologies with better performance and higher write endurance than FSSDs are emerging, e.g., phase-change memory (PCM). Understanding how to leverage these state-of-the-art storage technologies for modern computing systems is important to solve challenging data intensive computing problems. In this paper, we propose to leverage PCM for a hybrid PCM-HDD storage architecture. We identify the limitations of traditional LRU caching algorithms for PCM-based caches, and develop amore » novel hash-based write caching scheme called HALO to improve random write performance of hard disks. To address the limited durability of PCM devices and solve the degraded spatial locality in traditional wear-leveling techniques, we further propose novel PCM management algorithms that provide effective wear-leveling while maximizing access parallelism. We have evaluated this PCM-based hybrid storage architecture using applications with a diverse set of I/O access patterns. Our experimental results demonstrate that the HALO caching scheme leads to an average reduction of 36.8% in execution time compared to the LRU caching scheme, and that the SFC wear leveling extends the lifetime of PCM by a factor of 21.6.« less
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Follen, Gregory J. (Technical Monitor); Radenski, Atanas
2003-01-01
The growth of Internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of Peer to Peer (P2P) software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this project is to achieve better understanding of the transition to Internet-based high-performance computing and to develop solutions for some of the technical challenges of this transition. In particular, we are interested in creating long-term motivation for end users to provide their idle processor time to support computationally intensive tasks. We believe that a practical P2P architecture should provide useful service to both clients with high-performance computing needs and contributors of lower-end computing resources. To achieve this, we are designing dual -service architecture for P2P high-performance divide-and conquer computing; we are also experimenting with a prototype implementation. Our proposed architecture incorporates a master server, utilizes dual satellite servers, and operates on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. A dual satellite server comprises a high-performance computing engine and a lower-end contributor service engine. The computing engine provides generic support for divide and conquer computations. The service engine is intended to provide free useful HTTP-based services to contributors of lower-end computing resources. Our proposed architecture is complementary to and accessible from computational grids, such as Globus, Legion, and Condor. Grids provide remote access to existing higher-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end Internet nodes. Our project is focused on a generic divide and conquer paradigm and on mobile applications of this paradigm that can operate on a loose and ever changing pool of lower-end Internet nodes.
Data Compression for Maskless Lithography Systems: Architecture, Algorithms and Implementation
2008-05-19
Data Compression for Maskless Lithography Systems: Architecture, Algorithms and Implementation Vito Dai Electrical Engineering and Computer Sciences...servers or to redistribute to lists, requires prior specific permission. Data Compression for Maskless Lithography Systems: Architecture, Algorithms and...for Maskless Lithography Systems: Architecture, Algorithms and Implementation Copyright 2008 by Vito Dai 1 Abstract Data Compression for Maskless
Bill, Johannes; Buesing, Lars; Habenschuss, Stefan; Nessler, Bernhard; Maass, Wolfgang; Legenstein, Robert
2015-01-01
During the last decade, Bayesian probability theory has emerged as a framework in cognitive science and neuroscience for describing perception, reasoning and learning of mammals. However, our understanding of how probabilistic computations could be organized in the brain, and how the observed connectivity structure of cortical microcircuits supports these calculations, is rudimentary at best. In this study, we investigate statistical inference and self-organized learning in a spatially extended spiking network model, that accommodates both local competitive and large-scale associative aspects of neural information processing, under a unified Bayesian account. Specifically, we show how the spiking dynamics of a recurrent network with lateral excitation and local inhibition in response to distributed spiking input, can be understood as sampling from a variational posterior distribution of a well-defined implicit probabilistic model. This interpretation further permits a rigorous analytical treatment of experience-dependent plasticity on the network level. Using machine learning theory, we derive update rules for neuron and synapse parameters which equate with Hebbian synaptic and homeostatic intrinsic plasticity rules in a neural implementation. In computer simulations, we demonstrate that the interplay of these plasticity rules leads to the emergence of probabilistic local experts that form distributed assemblies of similarly tuned cells communicating through lateral excitatory connections. The resulting sparse distributed spike code of a well-adapted network carries compressed information on salient input features combined with prior experience on correlations among them. Our theory predicts that the emergence of such efficient representations benefits from network architectures in which the range of local inhibition matches the spatial extent of pyramidal cells that share common afferent input. PMID:26284370
Autonomic Computing for Spacecraft Ground Systems
NASA Technical Reports Server (NTRS)
Li, Zhenping; Savkli, Cetin; Jones, Lori
2007-01-01
Autonomic computing for spacecraft ground systems increases the system reliability and reduces the cost of spacecraft operations and software maintenance. In this paper, we present an autonomic computing solution for spacecraft ground systems at NASA Goddard Space Flight Center (GSFC), which consists of an open standard for a message oriented architecture referred to as the GMSEC architecture (Goddard Mission Services Evolution Center), and an autonomic computing tool, the Criteria Action Table (CAT). This solution has been used in many upgraded ground systems for NASA 's missions, and provides a framework for developing solutions with higher autonomic maturity.
Traffic information computing platform for big data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn
Big data environment create data conditions for improving the quality of traffic information service. The target of this article is to construct a traffic information computing platform for big data environment. Through in-depth analysis the connotation and technology characteristics of big data and traffic information service, a distributed traffic atomic information computing platform architecture is proposed. Under the big data environment, this type of traffic atomic information computing architecture helps to guarantee the traffic safety and efficient operation, more intelligent and personalized traffic information service can be used for the traffic information users.
NASA Astrophysics Data System (ADS)
Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.
2016-05-01
In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
Developing the Next Generation of Science Data System Engineers
NASA Technical Reports Server (NTRS)
Moses, John F.; Behnke, Jeanne; Durachka, Christopher D.
2016-01-01
At Goddard, engineers and scientists with a range of experience in science data systems are needed to employ new technologies and develop advances in capabilities for supporting new Earth and Space science research. Engineers with extensive experience in science data, software engineering and computer-information architectures are needed to lead and perform these activities. The increasing types and complexity of instrument data and emerging computer technologies coupled with the current shortage of computer engineers with backgrounds in science has led the need to develop a career path for science data systems engineers and architects.The current career path, in which undergraduate students studying various disciplines such as Computer Engineering or Physical Scientist, generally begins with serving on a development team in any of the disciplines where they can work in depth on existing Goddard data systems or serve with a specific NASA science team. There they begin to understand the data, infuse technologies, and begin to know the architectures of science data systems. From here the typical career involves peermentoring, on-the-job training or graduate level studies in analytics, computational science and applied science and mathematics. At the most senior level, engineers become subject matter experts and system architect experts, leading discipline-specific data centers and large software development projects. They are recognized as a subject matter expert in a science domain, they have project management expertise, lead standards efforts and lead international projects. A long career development remains necessary not only because of the breadth of knowledge required across physical sciences and engineering disciplines, but also because of the diversity of instrument data being developed today both by NASA and international partner agencies and because multidiscipline science and practitioner communities expect to have access to all types of observational data.This paper describes an approach to defining career-path guidance for college-bound high school and undergraduate engineering students, junior and senior engineers from various disciplines.
Developing the Next Generation of Science Data System Engineers
NASA Astrophysics Data System (ADS)
Moses, J. F.; Durachka, C. D.; Behnke, J.
2015-12-01
At Goddard, engineers and scientists with a range of experience in science data systems are needed to employ new technologies and develop advances in capabilities for supporting new Earth and Space science research. Engineers with extensive experience in science data, software engineering and computer-information architectures are needed to lead and perform these activities. The increasing types and complexity of instrument data and emerging computer technologies coupled with the current shortage of computer engineers with backgrounds in science has led the need to develop a career path for science data systems engineers and architects. The current career path, in which undergraduate students studying various disciplines such as Computer Engineering or Physical Scientist, generally begins with serving on a development team in any of the disciplines where they can work in depth on existing Goddard data systems or serve with a specific NASA science team. There they begin to understand the data, infuse technologies, and begin to know the architectures of science data systems. From here the typical career involves peer mentoring, on-the-job training or graduate level studies in analytics, computational science and applied science and mathematics. At the most senior level, engineers become subject matter experts and system architect experts, leading discipline-specific data centers and large software development projects. They are recognized as a subject matter expert in a science domain, they have project management expertise, lead standards efforts and lead international projects. A long career development remains necessary not only because of the breath of knowledge required across physical sciences and engineering disciplines, but also because of the diversity of instrument data being developed today both by NASA and international partner agencies and because multi-discipline science and practitioner communities expect to have access to all types of observational data. This paper describes an approach to defining career-path guidance for college-bound high school and undergraduate engineering students, junior and senior engineers from various disciplines.
Richmond, Paul; Buesing, Lars; Giugliano, Michele; Vasilaki, Eleni
2011-01-01
High performance computing on the Graphics Processing Unit (GPU) is an emerging field driven by the promise of high computational power at a low cost. However, GPU programming is a non-trivial task and moreover architectural limitations raise the question of whether investing effort in this direction may be worthwhile. In this work, we use GPU programming to simulate a two-layer network of Integrate-and-Fire neurons with varying degrees of recurrent connectivity and investigate its ability to learn a simplified navigation task using a policy-gradient learning rule stemming from Reinforcement Learning. The purpose of this paper is twofold. First, we want to support the use of GPUs in the field of Computational Neuroscience. Second, using GPU computing power, we investigate the conditions under which the said architecture and learning rule demonstrate best performance. Our work indicates that networks featuring strong Mexican-Hat-shaped recurrent connections in the top layer, where decision making is governed by the formation of a stable activity bump in the neural population (a “non-democratic” mechanism), achieve mediocre learning results at best. In absence of recurrent connections, where all neurons “vote” independently (“democratic”) for a decision via population vector readout, the task is generally learned better and more robustly. Our study would have been extremely difficult on a desktop computer without the use of GPU programming. We present the routines developed for this purpose and show that a speed improvement of 5x up to 42x is provided versus optimised Python code. The higher speed is achieved when we exploit the parallelism of the GPU in the search of learning parameters. This suggests that efficient GPU programming can significantly reduce the time needed for simulating networks of spiking neurons, particularly when multiple parameter configurations are investigated. PMID:21572529
Guest editorial. Integrated healthcare information systems.
Li, Ling; Ge, Ri-Li; Zhou, Shang-Ming; Valerdi, Ricardo
2012-07-01
The use of integrated information systems for healthcare has been started more than a decade ago. In recent years, rapid advances in information integration methods have spurred tremendous growth in the use of integrated information systems in healthcare delivery. Various techniques have been used for probing such integrated systems. These techniques include service-oriented architecture (SOA), EAI, workflow management, grid computing, and others. Many applications require a combination of these techniques, which gives rise to the emergence of enterprise systems in healthcare. Development of the techniques originated from different disciplines has the potential to significantly improve the performance of enterprise systems in healthcare. This editorial paper briefly introduces the enterprise systems in the perspective of healthcare informatics.
Recent advances in approximation concepts for optimum structural design
NASA Technical Reports Server (NTRS)
Barthelemy, Jean-Francois M.; Haftka, Raphael T.
1991-01-01
The basic approximation concepts used in structural optimization are reviewed. Some of the most recent developments in that area since the introduction of the concept in the mid-seventies are discussed. The paper distinguishes between local, medium-range, and global approximations; it covers functions approximations and problem approximations. It shows that, although the lack of comparative data established on reference test cases prevents an accurate assessment, there have been significant improvements. The largest number of developments have been in the areas of local function approximations and use of intermediate variable and response quantities. It also appears that some new methodologies are emerging which could greatly benefit from the introduction of new computer architecture.
Kinetic signature of fractal-like filament networks formed by orientational linear epitaxy.
Hwang, Wonmuk; Eryilmaz, Esma
2014-07-11
We study a broad class of epitaxial assembly of filament networks on lattice surfaces. Over time, a scale-free behavior emerges with a 2.5-3 power-law exponent in filament length distribution. Partitioning between the power-law and exponential behaviors in a network can be used to find the stage and kinetic parameters of the assembly process. To analyze real-world networks, we develop a computer program that measures the network architecture in experimental images. Application to triaxial networks of collagen fibrils shows quantitative agreement with our model. Our unifying approach can be used for characterizing and controlling the network formation that is observed across biological and nonbiological systems.
NASA Astrophysics Data System (ADS)
Panov, Yu. D.; Moskvin, A. S.; Rybakov, F. N.; Borisov, A. B.
2016-12-01
We made use of a special algorithm for compute unified device architecture for NVIDIA graphics cards, a nonlinear conjugate-gradient method to minimize energy functional, and Monte-Carlo technique to directly observe the forming of the ground state configuration for the 2D hard-core bosons by lowering the temperature and its evolution with deviation away from half-filling. The novel technique allowed us to examine earlier implications and uncover novel features of the phase transitions, in particular, look upon the nucleation of the odd domain structure, emergence of filamentary superfluidity nucleated at the antiphase domain walls of the charge-ordered phase, and nucleation and evolution of different topological structures.
Dynamic array processing for computationally intensive expert systems in CLIPS
NASA Technical Reports Server (NTRS)
Athavale, N. N.; Ragade, R. K.; Fenske, T. E.; Cassaro, M. A.
1990-01-01
This paper puts forth an architecture for implementing a loop for advanced data structure of arrays in CLIPS. An attempt is made to use multi-field variables in such an architecture to process a set of data during the decision making cycle. Also, current limitations on the expert system shells are discussed in brief in this paper. The resulting architecture is designed to circumvent the current limitations set by the expert system shell and also by the operating environment. Such advanced data structures are needed for tightly coupling symbolic and numeric computation modules.
A model for architectural comparison
NASA Astrophysics Data System (ADS)
Ho, Sam; Snyder, Larry
1988-04-01
Recently, architectures for sequential computers became a topic of much discussion and controversy. At the center of this storm is the Reduced Instruction Set Computer, or RISC, first described at Berkeley in 1980. While the merits of the RISC architecture cannot be ignored, its opponents have tried to do just that, while its proponents have expanded and frequently exaggerated them. This state of affairs has persisted to this day. No attempt is made to settle this controversy, since indeed there is likely no one answer. A qualitative framework is provided for a rational discussion of the issues.
Dual-scale topology optoelectronic processor.
Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H
1991-12-15
The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.
An Object-Oriented Network-Centric Software Architecture for Physical Computing
NASA Astrophysics Data System (ADS)
Palmer, Richard
1997-08-01
Recent developments in object-oriented computer languages and infrastructure such as the Internet, Web browsers, and the like provide an opportunity to define a more productive computational environment for scientific programming that is based more closely on the underlying mathematics describing physics than traditional programming languages such as FORTRAN or C++. In this talk I describe an object-oriented software architecture for representing physical problems that includes classes for such common mathematical objects as geometry, boundary conditions, partial differential and integral equations, discretization and numerical solution methods, etc. In practice, a scientific program written using this architecture looks remarkably like the mathematics used to understand the problem, is typically an order of magnitude smaller than traditional FORTRAN or C++ codes, and hence easier to understand, debug, describe, etc. All objects in this architecture are ``network-enabled,'' which means that components of a software solution to a physical problem can be transparently loaded from anywhere on the Internet or other global network. The architecture is expressed as an ``API,'' or application programmers interface specification, with reference embeddings in Java, Python, and C++. A C++ class library for an early version of this API has been implemented for machines ranging from PC's to the IBM SP2, meaning that phidentical codes run on all architectures.
SecureCore Security Architecture: Authority Mode and Emergency Management
2007-10-16
can shield first responders from social vultures (e.g., “ambulance chasers”) or malicious parties who could intentionally interfere with emergency...hierarchical design Communications Management: network communication Process Management...and Emergency Management 1 I. Introduction During many crises, first- responder access to sensitive, restricted emergency information is
Computational Models and Emergent Properties of Respiratory Neural Networks
Lindsey, Bruce G.; Rybak, Ilya A.; Smith, Jeffrey C.
2012-01-01
Computational models of the neural control system for breathing in mammals provide a theoretical and computational framework bringing together experimental data obtained from different animal preparations under various experimental conditions. Many of these models were developed in parallel and iteratively with experimental studies and provided predictions guiding new experiments. This data-driven modeling approach has advanced our understanding of respiratory network architecture and neural mechanisms underlying generation of the respiratory rhythm and pattern, including their functional reorganization under different physiological conditions. Models reviewed here vary in neurobiological details and computational complexity and span multiple spatiotemporal scales of respiratory control mechanisms. Recent models describe interacting populations of respiratory neurons spatially distributed within the Bötzinger and pre-Bötzinger complexes and rostral ventrolateral medulla that contain core circuits of the respiratory central pattern generator (CPG). Network interactions within these circuits along with intrinsic rhythmogenic properties of neurons form a hierarchy of multiple rhythm generation mechanisms. The functional expression of these mechanisms is controlled by input drives from other brainstem components, including the retrotrapezoid nucleus and pons, which regulate the dynamic behavior of the core circuitry. The emerging view is that the brainstem respiratory network has rhythmogenic capabilities at multiple levels of circuit organization. This allows flexible, state-dependent expression of different neural pattern-generation mechanisms under various physiological conditions, enabling a wide repertoire of respiratory behaviors. Some models consider control of the respiratory CPG by pulmonary feedback and network reconfiguration during defensive behaviors such as cough. Future directions in modeling of the respiratory CPG are considered. PMID:23687564
ERIC Educational Resources Information Center
Hung, Y.-C.
2012-01-01
This paper investigates the impact of combining self explaining (SE) with computer architecture diagrams to help novice students learn assembly language programming. Pre- and post-test scores for the experimental and control groups were compared and subjected to covariance (ANCOVA) statistical analysis. Results indicate that the SE-plus-diagram…
MIT CSAIL and Lincoln Laboratory Task Force Report
2016-08-01
projects have been very diverse, spanning several areas of CSAIL concentration, including robotics, big data analytics , wireless communications...spanning several areas of CSAIL concentration, including robotics, big data analytics , wireless communications, computing architectures and...to machine learning systems and algorithms, such as recommender systems, and “Big Data ” analytics . Advanced computing architectures broadly refer to
A Model for Minimizing Numeric Function Generator Complexity and Delay
2007-12-01
allow computation of difficult mathematical functions in less time and with less hardware than commonly employed methods. They compute piecewise...Programmable Gate Arrays (FPGAs). The algorithms and estimation techniques apply to various NFG architectures and mathematical functions. This...thesis compares hardware utilization and propagation delay for various NFG architectures, mathematical functions, word widths, and segmentation methods
Usage of Thin-Client/Server Architecture in Computer Aided Education
ERIC Educational Resources Information Center
Cimen, Caghan; Kavurucu, Yusuf; Aydin, Halit
2014-01-01
With the advances of technology, thin-client/server architecture has become popular in multi-user/single network environments. Thin-client is a user terminal in which the user can login to a domain and run programs by connecting to a remote server. Recent developments in network and hardware technologies (cloud computing, virtualization, etc.)…
Optical Computing Based on Neuronal Models
1988-05-01
walking, and cognition are far too complex for existing sequential digital computers. Therefore new architectures, hardware, and algorithms modeled...collective behavior, and iterative processing into optical processing and artificial neurodynamical systems. Another intriguing promise of neural nets is...with architectures, implementations, and programming; and material research s -7- called for. Our future research in neurodynamics will continue to
Using SPEEDES to simulate the blue gene interconnect network
NASA Technical Reports Server (NTRS)
Springer, P.; Upchurch, E.
2003-01-01
JPL and the Center for Advanced Computer Architecture (CACR) is conducting application and simulation analyses of BG/L in order to establish a range of effectiveness for the Blue Gene/L MPP architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real world ASCI application execution.
The Use of Metaphors as a Parametric Design Teaching Model: A Case Study
ERIC Educational Resources Information Center
Agirbas, Asli
2018-01-01
Teaching methodologies for parametric design are being researched all over the world, since there is a growing demand for computer programming logic and its fabrication process in architectural education. The computer programming courses in architectural education are usually done in a very short period of time, and so students have no chance to…
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.
1992-01-01
An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.
Service-Oriented Architecture for NVO and TeraGrid Computing
NASA Technical Reports Server (NTRS)
Jacob, Joseph; Miller, Craig; Williams, Roy; Steenberg, Conrad; Graham, Matthew
2008-01-01
The National Virtual Observatory (NVO) Extensible Secure Scalable Service Infrastructure (NESSSI) is a Web service architecture and software framework that enables Web-based astronomical data publishing and processing on grid computers such as the National Science Foundation's TeraGrid. Characteristics of this architecture include the following: (1) Services are created, managed, and upgraded by their developers, who are trusted users of computing platforms on which the services are deployed. (2) Service jobs can be initiated by means of Java or Python client programs run on a command line or with Web portals. (3) Access is granted within a graduated security scheme in which the size of a job that can be initiated depends on the level of authentication of the user.
A multitasking finite state architecture for computer control of an electric powertrain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burba, J.C.
1984-01-01
Finite state techniques provide a common design language between the control engineer and the computer engineer for event driven computer control systems. They simplify communication and provide a highly maintainable control system understandable by both. This paper describes the development of a control system for an electric vehicle powertrain utilizing finite state concepts. The basics of finite state automata are provided as a framework to discuss a unique multitasking software architecture developed for this application. The architecture employs conventional time-sliced techniques with task scheduling controlled by a finite state machine representation of the control strategy of the powertrain. The complexitiesmore » of excitation variable sampling in this environment are also considered.« less
NASA Technical Reports Server (NTRS)
Rickard, D. A.; Bodenheimer, R. E.
1976-01-01
Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.
Network architecture test-beds as platforms for ubiquitous computing.
Roscoe, Timothy
2008-10-28
Distributed systems research, and in particular ubiquitous computing, has traditionally assumed the Internet as a basic underlying communications substrate. Recently, however, the networking research community has come to question the fundamental design or 'architecture' of the Internet. This has been led by two observations: first, that the Internet as it stands is now almost impossible to evolve to support new functionality; and second, that modern applications of all kinds now use the Internet rather differently, and frequently implement their own 'overlay' networks above it to work around its perceived deficiencies. In this paper, I discuss recent academic projects to allow disruptive change to the Internet architecture, and also outline a radically different view of networking for ubiquitous computing that such proposals might facilitate.
FPGA-based real-time phase measuring profilometry algorithm design and implementation
NASA Astrophysics Data System (ADS)
Zhan, Guomin; Tang, Hongwei; Zhong, Kai; Li, Zhongwei; Shi, Yusheng
2016-11-01
Phase measuring profilometry (PMP) has been widely used in many fields, like Computer Aided Verification (CAV), Flexible Manufacturing System (FMS) et al. High frame-rate (HFR) real-time vision-based feedback control will be a common demands in near future. However, the instruction time delay in the computer caused by numerous repetitive operations greatly limit the efficiency of data processing. FPGA has the advantages of pipeline architecture and parallel execution, and it fit for handling PMP algorithm. In this paper, we design a fully pipelined hardware architecture for PMP. The functions of hardware architecture includes rectification, phase calculation, phase shifting, and stereo matching. The experiment verified the performance of this method, and the factors that may influence the computation accuracy was analyzed.
NASA Astrophysics Data System (ADS)
Ragan-Kelley, M.; Perez, F.; Granger, B.; Kluyver, T.; Ivanov, P.; Frederic, J.; Bussonnier, M.
2014-12-01
IPython has provided terminal-based tools for interactive computing in Python since 2001. The notebook document format and multi-process architecture introduced in 2011 have expanded the applicable scope of IPython into teaching, presenting, and sharing computational work, in addition to interactive exploration. The new architecture also allows users to work in any language, with implementations in Python, R, Julia, Haskell, and several other languages. The language agnostic parts of IPython have been renamed to Jupyter, to better capture the notion that a cross-language design can encapsulate commonalities present in computational research regardless of the programming language being used. This architecture offers components like the web-based Notebook interface, that supports rich documents that combine code and computational results with text narratives, mathematics, images, video and any media that a modern browser can display. This interface can be used not only in research, but also for publication and education, as notebooks can be converted to a variety of output formats, including HTML and PDF. Recent developments in the Jupyter project include a multi-user environment for hosting notebooks for a class or research group, a live collaboration notebook via Google Docs, and better support for languages other than Python.
OS friendly microprocessor architecture: Hardware level computer security
NASA Astrophysics Data System (ADS)
Jungwirth, Patrick; La Fratta, Patrick
2016-05-01
We present an introduction to the patented OS Friendly Microprocessor Architecture (OSFA) and hardware level computer security. Conventional microprocessors have not tried to balance hardware performance and OS performance at the same time. Conventional microprocessors have depended on the Operating System for computer security and information assurance. The goal of the OS Friendly Architecture is to provide a high performance and secure microprocessor and OS system. We are interested in cyber security, information technology (IT), and SCADA control professionals reviewing the hardware level security features. The OS Friendly Architecture is a switched set of cache memory banks in a pipeline configuration. For light-weight threads, the memory pipeline configuration provides near instantaneous context switching times. The pipelining and parallelism provided by the cache memory pipeline provides for background cache read and write operations while the microprocessor's execution pipeline is running instructions. The cache bank selection controllers provide arbitration to prevent the memory pipeline and microprocessor's execution pipeline from accessing the same cache bank at the same time. This separation allows the cache memory pages to transfer to and from level 1 (L1) caching while the microprocessor pipeline is executing instructions. Computer security operations are implemented in hardware. By extending Unix file permissions bits to each cache memory bank and memory address, the OSFA provides hardware level computer security.
Internet Architecture: Lessons Learned and Looking Forward
2006-12-01
Internet Architecture: Lessons Learned and Looking Forward Geoffrey G. Xie Department of Computer Science Naval Postgraduate School April 2006... Internet architecture. Report Documentation Page Form ApprovedOMB No. 0704-0188 Public reporting burden for the collection of information is...readers are referred there for more information about a specific protocol or concept. 2. Origin of Internet Architecture The Internet is easily
ERIC Educational Resources Information Center
Uwakonye, Obioha; Alagbe, Oluwole; Oluwatayo, Adedapo; Alagbe, Taiye; Alalade, Gbenga
2015-01-01
As a result of globalization of digital technology, intellectual discourse on what constitutes the basic body of architectural knowledge to be imparted to future professionals has been on the increase. This digital revolution has brought to the fore the need to review the already overloaded architectural education curriculum of Nigerian schools of…
Sabne, Amit J.; Sakdhnagool, Putt; Lee, Seyong; ...
2015-07-13
Accelerator-based heterogeneous computing is gaining momentum in the high-performance computing arena. However, the increased complexity of heterogeneous architectures demands more generic, high-level programming models. OpenACC is one such attempt to tackle this problem. Although the abstraction provided by OpenACC offers productivity, it raises questions concerning both functional and performance portability. In this article, the authors propose HeteroIR, a high-level, architecture-independent intermediate representation, to map high-level programming models, such as OpenACC, to heterogeneous architectures. They present a compiler approach that translates OpenACC programs into HeteroIR and accelerator kernels to obtain OpenACC functional portability. They then evaluate the performance portability obtained bymore » OpenACC with their approach on 12 OpenACC programs on Nvidia CUDA, AMD GCN, and Intel Xeon Phi architectures. They study the effects of various compiler optimizations and OpenACC program settings on these architectures to provide insights into the achieved performance portability.« less
The future of computing--new architectures and new technologies.
Warren, P
2004-02-01
All modern computers are designed using the 'von Neumann' architecture and built using silicon transistor technology. Both architecture and technology have been remarkably successful. Yet there are a range of problems for which this conventional architecture is not particularly well adapted, and new architectures are being proposed to solve these problems, in particular based on insight from nature. Transistor technology has enjoyed 50 years of continuing progress. However, the laws of physics dictate that within a relatively short time period this progress will come to an end. New technologies, based on molecular and biological sciences as well as quantum physics, are vying to replace silicon, or at least coexist with it and extend its capability. The paper describes these novel architectures and technologies, places them in the context of the kinds of problems they might help to solve, and predicts their possible manner and time of adoption. Finally it describes some key questions and research problems associated with their use.
Scalable service architecture for providing strong service guarantees
NASA Astrophysics Data System (ADS)
Christin, Nicolas; Liebeherr, Joerg
2002-07-01
For the past decade, a lot of Internet research has been devoted to providing different levels of service to applications. Initial proposals for service differentiation provided strong service guarantees, with strict bounds on delays, loss rates, and throughput, but required high overhead in terms of computational complexity and memory, both of which raise scalability concerns. Recently, the interest has shifted to service architectures with low overhead. However, these newer service architectures only provide weak service guarantees, which do not always address the needs of applications. In this paper, we describe a service architecture that supports strong service guarantees, can be implemented with low computational complexity, and only requires to maintain little state information. A key mechanism of the proposed service architecture is that it addresses scheduling and buffer management in a single algorithm. The presented architecture offers no solution for controlling the amount of traffic that enters the network. Instead, we plan on exploiting feedback mechanisms of TCP congestion control algorithms for the purpose of regulating the traffic entering the network.
GPU-computing in econophysics and statistical physics
NASA Astrophysics Data System (ADS)
Preis, T.
2011-03-01
A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction into the field of GPU computing and includes examples. In particular computationally expensive analyses employed in financial market context are coded on a graphics card architecture which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.
NASA Astrophysics Data System (ADS)
Ford, Eric B.; Dindar, Saleh; Peters, Jorg
2015-08-01
The realism of astrophysical simulations and statistical analyses of astronomical data are set by the available computational resources. Thus, astronomers and astrophysicists are constantly pushing the limits of computational capabilities. For decades, astronomers benefited from massive improvements in computational power that were driven primarily by increasing clock speeds and required relatively little attention to details of the computational hardware. For nearly a decade, increases in computational capabilities have come primarily from increasing the degree of parallelism, rather than increasing clock speeds. Further increases in computational capabilities will likely be led by many-core architectures such as Graphical Processing Units (GPUs) and Intel Xeon Phi. Successfully harnessing these new architectures, requires significantly more understanding of the hardware architecture, cache hierarchy, compiler capabilities and network network characteristics.I will provide an astronomer's overview of the opportunities and challenges provided by modern many-core architectures and elastic cloud computing. The primary goal is to help an astronomical audience understand what types of problems are likely to yield more than order of magnitude speed-ups and which problems are unlikely to parallelize sufficiently efficiently to be worth the development time and/or costs.I will draw on my experience leading a team in developing the Swarm-NG library for parallel integration of large ensembles of small n-body systems on GPUs, as well as several smaller software projects. I will share lessons learned from collaborating with computer scientists, including both technical and soft skills. Finally, I will discuss the challenges of training the next generation of astronomers to be proficient in this new era of high-performance computing, drawing on experience teaching a graduate class on High-Performance Scientific Computing for Astrophysics and organizing a 2014 advanced summer school on Bayesian Computing for Astronomical Data Analysis with support of the Penn State Center for Astrostatistics and Institute for CyberScience.
Power Efficient Hardware Architecture of SHA-1 Algorithm for Trusted Mobile Computing
NASA Astrophysics Data System (ADS)
Kim, Mooseop; Ryou, Jaecheol
The Trusted Mobile Platform (TMP) is developed and promoted by the Trusted Computing Group (TCG), which is an industry standard body to enhance the security of the mobile computing environment. The built-in SHA-1 engine in TMP is one of the most important circuit blocks and contributes the performance of the whole platform because it is used as key primitives supporting platform integrity and command authentication. Mobile platforms have very stringent limitations with respect to available power, physical circuit area, and cost. Therefore special architecture and design methods for low power SHA-1 circuit are required. In this paper, we present a novel and efficient hardware architecture of low power SHA-1 design for TMP. Our low power SHA-1 hardware can compute 512-bit data block using less than 7,000 gates and has a power consumption about 1.1 mA on a 0.25μm CMOS process.
An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations
NASA Technical Reports Server (NTRS)
Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.
1996-01-01
We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.