Optimizing Engineering Tools Using Modern Ground Architectures
2017-12-01
Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
Transitioning ISR architecture into the cloud
NASA Astrophysics Data System (ADS)
Lash, Thomas D.
2012-06-01
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR) intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
Analysis of Defenses Against Code Reuse Attacks on Modern and New Architectures
2015-09-01
soundness or completeness. An incomplete analysis will produce extra edges in the CFG that might allow an attacker to slip through. An unsound analysis...Analysis of Defenses Against Code Reuse Attacks on Modern and New Architectures by Isaac Noah Evans Submitted to the Department of Electrical...Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer
NASA Astrophysics Data System (ADS)
Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; Kalinkin, Alexander A.
2017-02-01
Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates a more incremental approach and is the culmination of several modernization efforts on the legacy code MFIX, an open-source computational fluid dynamics code that has evolved over several decades, is widely used for multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, 'bottom-up' and 'top-down', are illustrated. Preliminary results show up to 8.5x improvement at the selected kernel level with the first approach and up to 50% improvement in total simulated time with the second, for the demonstration cases and target HPC systems employed.
Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; ...
2017-03-20
Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates a more incremental approach and is the culmination of several modernization efforts on the legacy code MFIX, an open-source computational fluid dynamics code that has evolved over several decades, is widely used for multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, ‘bottom-up’ and ‘top-down’, are illustrated. Here, preliminary results show up to 8.5x improvement at the selected kernel level with the first approach and up to 50% improvement in total simulated time with the second, for the demonstration cases and target HPC systems employed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Computer Technology: State of the Art.
ERIC Educational Resources Information Center
Withington, Frederic G.
1981-01-01
Describes the nature of modern general-purpose computer systems, including hardware, semiconductor electronics, microprocessors, computer architecture, input output technology, and system control programs. Seven suggested readings are cited. (FM)
Execution environment for intelligent real-time control systems
NASA Technical Reports Server (NTRS)
Sztipanovits, Janos
1987-01-01
Modern telerobot control technology requires the integration of symbolic and non-symbolic programming techniques, different models of parallel computation, and various programming paradigms. The Multigraph Architecture, which has been developed for the implementation of intelligent real-time control systems, is described. The layered architecture includes specific computational models, an integrated execution environment, and various high-level tools. A special feature of the architecture is the tight coupling between the symbolic and non-symbolic computations. It supports not only a data interface, but also the integration of the control structures in a parallel computing environment.
Using Multimedia for Teaching Analysis in History of Modern Architecture.
ERIC Educational Resources Information Center
Perryman, Garry
This paper presents a case for the development and support of a computer-based interactive multimedia program for teaching analysis in community college architecture design programs. Analysis in architecture design is an extremely important strategy for the teaching of higher-order thinking skills, which senior schools of architecture look for in…
Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuller, Ivan K.; Stevens, Rick; Pino, Robinson
2015-10-29
Computation in its many forms is the engine that fuels our modern civilization. Modern computation, based on the von Neumann architecture, has until now allowed continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably, within the next 10 years, reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS-based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: (1) the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; (2) new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; (3) device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; and (4) comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.
The Architecture of Information at Plateau Beaubourg
ERIC Educational Resources Information Center
Branda, Ewan Edward
2012-01-01
During the course of the 1960s, computers and information networks made their appearance in the public imagination. To architects on the cusp of architecture's postmodern turn, information technology offered new forms, metaphors, and techniques by which modern architecture's technological and utopian basis could be reasserted. Yet by the end of…
An S N Algorithm for Modern Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Randal Scott
2016-08-29
LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
Component architecture in drug discovery informatics.
Smith, Peter M
2002-05-01
This paper reviews the characteristics of a new model of computing that has been spurred on by the Internet, known as Netcentric computing. Developments in this model led to distributed component architectures, which, although not new ideas, are now realizable with modern tools such as Enterprise Java. The application of this approach to scientific computing, particularly in pharmaceutical discovery research, is discussed and highlighted by a particular case involving the management of biological assay data.
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.
Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros
2014-06-25
The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphics Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, the MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their use resulted in execution times an order of magnitude shorter than those of the single-threaded CPU implementation. The GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
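To give a sense of scale for the exhaustive search described in this abstract, the number of unordered SNP pairs grows quadratically with the number of SNPs. The following is a minimal Python sketch under stated assumptions: `score_pair` is a hypothetical placeholder score (fraction of matching genotype codes), not the scoring method used in SNPsyn, and the function names are illustrative only.

```python
from itertools import combinations
from math import comb


def count_pairs(n_snps):
    """Number of unordered SNP pairs an exhaustive scan must evaluate."""
    return comb(n_snps, 2)


def score_pair(a, b):
    """Hypothetical placeholder: fraction of samples whose genotype codes match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def exhaustive_scan(genotypes):
    """Score every unordered pair; genotypes maps SNP name -> list of genotype codes."""
    return {
        (s1, s2): score_pair(genotypes[s1], genotypes[s2])
        for s1, s2 in combinations(sorted(genotypes), 2)
    }
```

With one million SNPs, `count_pairs(1_000_000)` gives 499,999,500,000 pairs. Each pair can be scored independently of the others, which is why the problem maps well onto massively parallel hardware.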
Heterogeneous computing architecture for fast detection of SNP-SNP interactions
2014-01-01
Background: The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphics Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, the MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results: We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their use resulted in execution times an order of magnitude shorter than those of the single-threaded CPU implementation. The GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions: General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
Modern Computational Techniques for the HMMER Sequence Analysis
2013-01-01
This paper focuses on the latest research and critical reviews on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMM). We present a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The data- and compute-intensive nature of sequence analysis makes it an attractive target for optimization and parallelization using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
A learnable parallel processing architecture towards unity of memory and computing
NASA Astrophysics Data System (ADS)
Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.
2015-08-01
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
A learnable parallel processing architecture towards unity of memory and computing.
Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J
2015-08-14
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
Partitioning in Avionics Architectures: Requirements, Mechanisms, and Assurance
NASA Technical Reports Server (NTRS)
Rushby, John
1999-01-01
Automated aircraft control has traditionally been divided into distinct "functions" that are implemented separately (e.g., autopilot, autothrottle, flight management); each function has its own fault-tolerant computer system, and dependencies among different functions are generally limited to the exchange of sensor and control data. A by-product of this "federated" architecture is that faults are strongly contained within the computer system of the function where they occur and cannot readily propagate to affect the operation of other functions. More modern avionics architectures contemplate supporting multiple functions on a single, shared, fault-tolerant computer system where natural fault containment boundaries are less sharply defined. Partitioning uses appropriate hardware and software mechanisms to restore strong fault containment to such integrated architectures. This report examines the requirements for partitioning, mechanisms for their realization, and issues in providing assurance for partitioning. Because partitioning shares some concerns with computer security, security models are reviewed and compared with the concerns of partitioning.
Design and Decorative Art in Shaping of Architectural Environment Image
NASA Astrophysics Data System (ADS)
Shabalina, N. M.
2017-11-01
The relevance of the topic is determined by the dynamic development of the promising branch, i.e. the architectural environment design, which requires, on the one hand, consideration of the morphology and typology of this art form, on the other hand, the specificity of the architectural environment artistic image. The intensive development of innovative computer technologies and materials in modern engineering, improvement of the information communications forms in their totality has led to the application of new methods in design and construction which, in their turn, have required the development of additional methods for content and context analysis in the integrated assessment of socially significant architectural environments. In the modern culture, correlative processes are steadily developing leading us to a new understanding of the interaction of architecture, decorative art and design. Their rapprochement at the morphological level has been noted which makes it possible to reveal a specific method of synthesis and similarity. The architecture of postmodern styles differs in its bionic form becoming an interactive part of the society and approaching its structural qualities with painting, sculpture, and design. In the modern world, these processes acquire multi-valued semantic nuances, expand the importance of associativity and dynamic processuality in the perception of environmental objects and demand the development of new approaches to the assessment of the architectural design environment. Within the framework of the universal paradigm of modern times the concept of the world develops as a set of systems that live according to the self-organization laws.
Applications of an architecture design and assessment system (ADAS)
NASA Technical Reports Server (NTRS)
Gray, F. Gail; Debrunner, Linda S.; White, Tennis S.
1988-01-01
A new Architecture Design and Assessment System (ADAS) tool package is introduced, and a range of possible applications is illustrated. ADAS was used to evaluate the performance of an advanced fault-tolerant computer architecture in a modern flight control application. Bottlenecks were identified and possible solutions suggested. The tool was also used to inject faults into the architecture and evaluate the synchronization algorithm, and improvements are suggested. Finally, ADAS was used as a front end research tool to aid in the design of reconfiguration algorithms in a distributed array architecture.
Application of Tessellation in Architectural Geometry Design
NASA Astrophysics Data System (ADS)
Chang, Wei
2018-06-01
Tessellation plays a significant role in architectural geometry design and has been widely used both throughout the history of architecture and, with the help of computer technology, in modern architectural design. Tessellations have been in use since the birth of civilization. In terms of dimensions, there are two-dimensional tessellations and three-dimensional tessellations; in terms of symmetry, there are periodic tessellations and aperiodic tessellations. Some special types of tessellations, such as Voronoi tessellations and Delaunay triangulations, are also included. Both geometry and crystallography, the latter of which provides the basic theory of three-dimensional tessellations, need to be studied. Historically, tessellation was applied to skins or decorations in architecture. The development of computer technology makes tessellation more powerful, as seen in surface control, surface display, structure design, etc. Therefore, research on the application of tessellation in architectural geometry design is of great necessity in architecture studies.
NASA Astrophysics Data System (ADS)
Bird, Robert; Nystrom, David; Albright, Brian
2017-10-01
The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable performance, and cache-oblivious algorithms, the problem remains largely unsolved. In this work we demonstrate how a focus on platform-agnostic modern code development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic-based vectorisation with compiler-generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU-capable OpenMP variant of VPIC. Finally, we include lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
Network architecture test-beds as platforms for ubiquitous computing.
Roscoe, Timothy
2008-10-28
Distributed systems research, and in particular ubiquitous computing, has traditionally assumed the Internet as a basic underlying communications substrate. Recently, however, the networking research community has come to question the fundamental design or 'architecture' of the Internet. This has been led by two observations: first, that the Internet as it stands is now almost impossible to evolve to support new functionality; and second, that modern applications of all kinds now use the Internet rather differently, and frequently implement their own 'overlay' networks above it to work around its perceived deficiencies. In this paper, I discuss recent academic projects to allow disruptive change to the Internet architecture, and also outline a radically different view of networking for ubiquitous computing that such proposals might facilitate.
NASA Astrophysics Data System (ADS)
Schulthess, Thomas C.
2013-03-01
The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.
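As a quick check on what a thousand-fold improvement per decade implies year over year, assuming a constant annual growth factor r (an assumption made here for illustration, not a claim from the abstract):

```latex
r^{10} = 1000 \quad\Longrightarrow\quad r = 1000^{1/10} = 10^{0.3} \approx 1.995
```

Under that assumption, sustained application performance roughly doubles every year.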
NASA Astrophysics Data System (ADS)
Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.
2016-05-01
In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
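The task-based parallelism described in this abstract, where independent tracks are handed to threads dynamically, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `sweep_track` is a hypothetical stand-in for a single track sweep (it just sums length times source over segments) and does not reproduce the authors' MOC kernel or their proxy applications.

```python
from concurrent.futures import ThreadPoolExecutor


def sweep_track(track):
    """Hypothetical stand-in for one track sweep: sum of length * source per segment."""
    return sum(length * source for length, source in track)


def sweep_all(tracks, n_threads=4):
    """Run one task per track; idle workers pull the next pending track from the
    executor's shared queue, so long and short tracks balance across threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # map() returns results in the same order as the input tracks.
        return list(pool.map(sweep_track, tracks))
```

In CPython the global interpreter lock prevents this from speeding up pure-Python arithmetic, so the sketch only illustrates the scheduling pattern. The loop over segments inside `sweep_track` corresponds to the inner loop that a compiled implementation would vectorize with SIMD instructions.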
Petascale Many Body Methods for Complex Correlated Systems
NASA Astrophysics Data System (ADS)
Pruschke, Thomas
2012-02-01
Correlated systems constitute an important class of materials in modern condensed matter physics. Correlations among electrons are at the heart of all ordering phenomena and many intriguing novel aspects, such as quantum phase transitions or topological insulators, observed in a variety of compounds. Yet, theoretically describing these phenomena is still a formidable task, even if one restricts the models used to the smallest possible set of degrees of freedom. Here, modern computer architectures play an essential role, and the joint effort to devise efficient algorithms and implement them on state-of-the-art hardware has become an extremely active field in condensed-matter research. To tackle this task single-handed is quite obviously not possible. The NSF-OISE funded PIRE collaboration "Graduate Education and Research in Petascale Many Body Methods for Complex Correlated Systems" is a successful initiative to bring together leading experts around the world to form a virtual international organization for addressing these emerging challenges and educating the next generation of computational condensed matter physicists. The collaboration includes research groups developing novel theoretical tools to reliably and systematically study correlated solids, experts in efficient computational algorithms needed to solve the emerging equations, and those able to use modern heterogeneous computer architectures to make them working tools for the growing community.
Borresen, Jon; Lynch, Stephen
2012-01-01
In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principal components; however, with the development of the transistor, such oscillator-based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory.
NASA Astrophysics Data System (ADS)
Hadade, Ioan; di Mare, Luca
2016-08-01
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assert their efficiency for each distinct architecture. We report significant speedups for single thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
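The Roofline model mentioned in this abstract bounds attainable performance by the lower of the compute peak and the product of memory bandwidth and arithmetic intensity. A minimal Python sketch follows; the function names are illustrative, and the numbers in the usage note are made-up values chosen for easy arithmetic, not measurements from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Attainable GFLOP/s: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def kernel_efficiency(measured_gflops, peak_gflops, bandwidth_gbs, flops_per_byte):
    """Fraction of the Roofline bound that a measured kernel actually reaches."""
    return measured_gflops / roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte)
```

For a hypothetical device with a 500 GFLOP/s peak and 50 GB/s of memory bandwidth, a kernel performing 0.5 flops per byte is memory bound at 25 GFLOP/s, so a measured 20 GFLOP/s corresponds to 80% of the bound. A kernel at 20 flops per byte on the same device would instead be limited by the 500 GFLOP/s compute roof.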
A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture
NASA Astrophysics Data System (ADS)
Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.
2017-12-01
A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective to study the evolution of statistical representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.
YASS: A System Simulator for Operating System and Computer Architecture Teaching and Learning
ERIC Educational Resources Information Center
Mustafa, Besim
2013-01-01
A highly interactive, integrated and multi-level simulator has been developed specifically to support both the teachers and the learners of modern computer technologies at undergraduate level. The simulator provides a highly visual and user configurable environment with many pedagogical features aimed at facilitating deep understanding of concepts…
Geospace simulations using modern accelerator processor technology
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D. J.
2009-12-01
OpenGGCM (Open Geospace General Circulation Model) is a well-established numerical code simulating the Earth's space environment. The most computing intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is currently limited by computational constraints on grid resolution. OpenGGCM has been ported to make use of the added computational power of modern accelerator based processor architectures, in particular the Cell processor. The Cell architecture is a novel inhomogeneous multicore architecture capable of achieving up to 230 GFLOPS on a single chip. The University of New Hampshire recently acquired a PowerXCell 8i based computing cluster, and here we will report initial performance results of OpenGGCM. Realizing the high theoretical performance of the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallelization approach: On the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the SIMD FPUs in each SPE. Memory management needs to be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We use a modern technique, automatic code generation, which shields the application programmer from having to deal with all of the implementation details just described, keeping the code much more easily maintainable. Our preliminary results indicate excellent performance, a speed-up of a factor of 30 compared to the unoptimized version.
Mark 4A antenna control system data handling architecture study
NASA Technical Reports Server (NTRS)
Briggs, H. C.; Eldred, D. B.
1991-01-01
A high-level review was conducted to provide an analysis of the existing architecture used to handle data and implement control algorithms for NASA's Deep Space Network (DSN) antennas and to make system-level recommendations for improving this architecture so that the DSN antennas can support the ever-tightening requirements of the next decade and beyond. It was found that the existing system is seriously overloaded, with processor utilization approaching 100 percent. A number of factors contribute to this overloading, including dated hardware, inefficient software, and a message-passing strategy that depends on serial connections between machines. At the same time, the system has shortcomings and idiosyncrasies that require extensive human intervention. A custom operating system kernel and an obscure programming language exacerbate the problems and should be modernized. A new architecture is presented that addresses these and other issues. Key features of the new architecture include a simplified message passing hierarchy that utilizes a high-speed local area network, redesign of particular processing function algorithms, consolidation of functions, and implementation of the architecture in modern hardware and software using mainstream computer languages and operating systems. The system would also allow incremental hardware improvements as better and faster hardware for such systems becomes available, and costs could potentially be low enough that redundancy would be provided economically. Such a system could support DSN requirements for the foreseeable future, though thorough consideration must be given to hard computational requirements, porting existing software functionality to the new system, and issues of fault tolerance and recovery.
Borresen, Jon; Lynch, Stephen
2012-01-01
In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principal components; however, with the development of the transistor, such oscillator-based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory. PMID:23173034
The future of computing--new architectures and new technologies.
Warren, P
2004-02-01
All modern computers are designed using the 'von Neumann' architecture and built using silicon transistor technology. Both architecture and technology have been remarkably successful. Yet there are a range of problems for which this conventional architecture is not particularly well adapted, and new architectures are being proposed to solve these problems, in particular based on insight from nature. Transistor technology has enjoyed 50 years of continuing progress. However, the laws of physics dictate that within a relatively short time period this progress will come to an end. New technologies, based on molecular and biological sciences as well as quantum physics, are vying to replace silicon, or at least coexist with it and extend its capability. The paper describes these novel architectures and technologies, places them in the context of the kinds of problems they might help to solve, and predicts their possible manner and time of adoption. Finally it describes some key questions and research problems associated with their use.
Integrating Computer Architectures into the Design of High-Performance Controllers
NASA Technical Reports Server (NTRS)
Jacklin, Stephen A.; Leyland, Jane A.; Warmbrodt, William
1986-01-01
Modern control systems must typically perform real-time identification and control, as well as coordinate a host of other activities related to user interaction, on-line graphics, and file management. This paper discusses five global design considerations that are useful to integrate array processor, multimicroprocessor, and host computer system architecture into versatile, high-speed controllers. Such controllers are capable of very high control throughput, and can maintain constant interaction with the non-real-time or user environment. As an application example, the architecture of a high-speed, closed-loop controller used to actively control helicopter vibration will be briefly discussed. Although this system has been designed for use as the controller for real-time rotorcraft dynamics and control studies in a wind-tunnel environment, the control architecture can generally be applied to a wide range of automatic control applications.
A modern approach to storing of 3D geometry of objects in machine engineering industry
NASA Astrophysics Data System (ADS)
Sokolova, E. A.; Aslanov, G. A.; Sokolov, A. A.
2017-02-01
3D graphics is a kind of computer graphics which has absorbed a lot from the vector and raster computer graphics. It is used in interior design projects, architectural projects, advertising, while creating educational computer programs, movies, visual images of parts and products in engineering, etc. 3D computer graphics allows one to create 3D scenes along with simulation of light conditions and setting up standpoints.
SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.
Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi
2018-01-01
The Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Array (FPGA) devices to provide high performance execution and flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using special-purpose Processing Elements (PEs) for computing SNNs, and analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor the SNN's activity. Our contribution intends to provide a tool that allows SNNs to be prototyped faster than on CPU/GPU architectures but significantly more cheaply than by fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Irregular Applications: Architectures & Algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feo, John T.; Villa, Oreste; Tumeo, Antonino
Irregular applications are characterized by irregular data structures, control and communication patterns. Novel irregular high performance applications which deal with large data sets and require have recently appeared. Unfortunately, current high performance systems and software infrastructures execute irregular algorithms poorly. Only coordinated efforts by end users, area specialists and computer scientists that consider both the architecture and the software stack may be able to provide solutions to the challenges of modern irregular applications.
Chessa, Manuela; Bianchi, Valentina; Zampetti, Massimo; Sabatini, Silvio P; Solari, Fabio
2012-01-01
The intrinsic parallelism of visual neural architectures based on distributed hierarchical layers is well suited to be implemented on the multi-core architectures of modern graphics cards. The design strategies that allow us to optimally take advantage of such parallelism, in order to efficiently map on GPU the hierarchy of layers and the canonical neural computations, are proposed. Specifically, the advantages of a cortical map-like representation of the data are exploited. Moreover, a GPU implementation of a novel neural architecture for the computation of binocular disparity from stereo image pairs, based on populations of binocular energy neurons, is presented. The implemented neural model achieves good performances in terms of reliability of the disparity estimates and a near real-time execution speed, thus demonstrating the effectiveness of the devised design strategies. The proposed approach is valid in general, since the neural building blocks we implemented are a common basis for the modeling of visual neural functionalities.
Manyscale Computing for Sensor Processing in Support of Space Situational Awareness
NASA Astrophysics Data System (ADS)
Schmalz, M.; Chapman, W.; Hayden, E.; Sahni, S.; Ranka, S.
2014-09-01
Increasing image and signal data burden associated with sensor data processing in support of space situational awareness implies continuing computational throughput growth beyond the petascale regime. In addition to growing applications data burden and diversity, the breadth, diversity and scalability of high performance computing architectures and their various organizations challenge the development of a single, unifying, practicable model of parallel computation. Therefore, models for scalable parallel processing have exploited architectural and structural idiosyncrasies, yielding potential misapplications when legacy programs are ported among such architectures. In response to this challenge, we have developed a concise, efficient computational paradigm and software called Manyscale Computing to facilitate efficient mapping of annotated application codes to heterogeneous parallel architectures. Our theory, algorithms, software, and experimental results support partitioning and scheduling of application codes for envisioned parallel architectures, in terms of work atoms that are mapped (for example) to threads or thread blocks on computational hardware. Because of the rigor, completeness, conciseness, and layered design of our manyscale approach, application-to-architecture mapping is feasible and scalable for architectures at petascales, exascales, and above. Further, our methodology is simple, relying primarily on a small set of primitive mapping operations and support routines that are readily implemented on modern parallel processors such as graphics processing units (GPUs) and hybrid multi-processors (HMPs). In this paper, we overview the opportunities and challenges of manyscale computing for image and signal processing in support of space situational awareness applications. We discuss applications in terms of a layered hardware architecture (laboratory > supercomputer > rack > processor > component hierarchy). 
Demonstration applications include performance analysis and results in terms of execution time as well as storage, power, and energy consumption for bus-connected and/or networked architectures. The feasibility of the manyscale paradigm is demonstrated by addressing four principal challenges: (1) architectural/structural diversity, parallelism, and locality, (2) masking of I/O and memory latencies, (3) scalability of design as well as implementation, and (4) efficient representation/expression of parallel applications. Examples will demonstrate how manyscale computing helps solve these challenges efficiently on real-world computing systems.
Gauss Elimination: Workhorse of Linear Algebra.
1995-08-05
linear algebra computation for solving systems, computing determinants and determining the rank of matrix. All of these are discussed in varying contexts. These include different arithmetic or algebraic setting such as integer arithmetic or polynomial rings as well as conventional real (floating-point) arithmetic. These have effects on both accuracy and complexity analyses of the algorithm. These, too, are covered here. The impact of modern parallel computer architecture on GE is also
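As an illustration of the record above, the following is a minimal sketch of Gaussian elimination used to compute a determinant and a rank. It is not taken from the report: it assumes a square input matrix and uses Python's `fractions.Fraction` so that the arithmetic is exact, standing in for the integer or rational settings the abstract contrasts with floating-point arithmetic. The function name `gauss_det_rank` is hypothetical.

```python
from fractions import Fraction


def gauss_det_rank(matrix):
    """Return (determinant, rank) of a square matrix via Gaussian elimination.

    Entries are converted to Fraction, so results are exact (no rounding).
    """
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    row = 0  # index of the next pivot row; equals the rank found so far
    for col in range(n):
        # Find a row at or below `row` with a nonzero entry in this column.
        pivot = next((r for r in range(row, n) if a[r][col] != 0), None)
        if pivot is None:
            det = Fraction(0)  # no pivot in this column: matrix is singular
            continue
        if pivot != row:
            a[row], a[pivot] = a[pivot], a[row]
            det = -det  # each row swap flips the sign of the determinant
        det *= a[row][col]
        # Eliminate the entries below the pivot.
        for r in range(row + 1, n):
            factor = a[r][col] / a[row][col]
            for c in range(col, n):
                a[r][c] -= factor * a[row][c]
        row += 1
    return det, row


print(gauss_det_rank([[0, 1, 2], [1, 0, 3], [4, -3, 8]]))
print(gauss_det_rank([[1, 2], [2, 4]]))
```

With floating-point entries the same elimination would normally choose the largest available pivot (partial pivoting) to limit rounding error; exact arithmetic avoids rounding but the size of the fractions can grow, which is one way the choice of arithmetic affects both accuracy and complexity.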
AHaH computing-from metastable switches to attractors to machine learning.
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures-all key capabilities of biological nervous systems and modern machine learning algorithms with real world application.
NASA Technical Reports Server (NTRS)
Jacklin, S. A.; Leyland, J. A.; Warmbrodt, W.
1985-01-01
Modern control systems must typically perform real-time identification and control, as well as coordinate a host of other activities related to user interaction, online graphics, and file management. This paper discusses five global design considerations which are useful to integrate array processor, multimicroprocessor, and host computer system architectures into versatile, high-speed controllers. Such controllers are capable of very high control throughput, and can maintain constant interaction with the nonreal-time or user environment. As an application example, the architecture of a high-speed, closed-loop controller used to actively control helicopter vibration is briefly discussed. Although this system has been designed for use as the controller for real-time rotorcraft dynamics and control studies in a wind tunnel environment, the controller architecture can generally be applied to a wide range of automatic control applications.
Creating executable architectures using Visual Simulation Objects (VSO)
NASA Astrophysics Data System (ADS)
Woodring, John W.; Comiskey, John B.; Petrov, Orlin M.; Woodring, Brian L.
2005-05-01
Investigations have been performed to identify a methodology for creating executable models of architectures and simulations of architecture that lead to an understanding of their dynamic properties. Colored Petri Nets (CPNs) are used to describe architecture because of their strong mathematical foundations, the existence of techniques for their verification and graph theory's well-established history of success in modern science. CPNs have been extended to interoperate with legacy simulations via a High Level Architecture (HLA) compliant interface. It has also been demonstrated that an architecture created as a CPN can be integrated with Department of Defense Architecture Framework products to ensure consistency between static and dynamic descriptions. A computer-aided tool, Visual Simulation Objects (VSO), which aids analysts in specifying, composing and executing architectures, has been developed to verify the methodology and as a prototype commercial product.
A Survey of Architectural Techniques For Improving Cache Power Efficiency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
Modern processors are using increasingly large on-chip caches. Also, with each CMOS technology generation, there has been a significant increase in their leakage energy consumption. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers have proposed several techniques for improving energy efficiency of cache architectures. This paper surveys recent architectural techniques for improving cache power efficiency and also presents a classification of these techniques based on their characteristics. For providing an application perspective, this paper also reviews several real-world processor chips that employ cache energy saving techniques. The aim of this survey is to enable engineers and researchers to get insights into the techniques for improving cache power efficiency and motivate them to invent novel solutions for enabling low-power operation of caches.
Kazakis, Georgios; Kanellopoulos, Ioannis; Sotiropoulos, Stefanos; Lagaros, Nikos D
2017-10-01
The construction industry has a major impact on the environment in which we spend most of our lives. Therefore, it is important that the outcome of architectural intuition performs well and complies with the design requirements. Architects usually describe as "optimal design" their choice among a rather limited set of design alternatives, dictated by their experience and intuition. However, modern design of structures requires accounting for a great number of criteria derived from multiple disciplines, often of conflicting nature. Such criteria are derived from structural engineering, eco-design, bioclimatic and acoustic performance. The resulting vast number of alternatives enhances the need for computer-aided architecture in order to increase the possibility of arriving at a more preferable solution. Therefore, the incorporation of smart, automatic tools in the design process, able to further guide the designer's intuition, becomes even more indispensable. The principal aim of this study is to present possibilities to integrate automatic computational techniques related to topology optimization in the phase of intuition of civil structures as part of computer aided architectural design. In this direction, different aspects of a new computer aided architectural era related to the interpretation of the optimized designs, difficulties resulting from the increased computational effort and 3D printing capabilities are covered herein.
AHaH Computing–From Metastable Switches to Attractors to Machine Learning
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures–all key capabilities of biological nervous systems and modern machine learning algorithms with real world application. PMID:24520315
Peculiarities of Natural Technology Application in Architecture
NASA Astrophysics Data System (ADS)
Umorina, Z.
2017-11-01
Technical advancement of the modern world has made it possible to create unique artificial objects based on the natural technology principle. New engineering and design types, such as computational design, additive manufacturing, materials engineering, synthetic biology, etc. allow us to enter a new level of interaction between a human being and nature. This influences the formation of a new world view in the sphere of architecture and leads to the development of new methods and styles [1,2].
Enabling GEODSS for Space Situational Awareness (SSA)
NASA Astrophysics Data System (ADS)
Wootton, S.
2016-09-01
The Ground-Based Electro-Optical Deep Space Surveillance (GEODSS) System has been in operation since the mid-1980's. While GEODSS has been the Space Surveillance Network's (SSN's) workhorse in terms of deep space surveillance, it has not undergone a significant modernization since the 1990's. This means GEODSS continues to operate under a mostly obsolete, legacy data processing baseline. The System Program Office (SPO) responsible for GEODSS, SMC/SYGO, has a number of advanced Space Situational Awareness (SSA)-related efforts in progress, in the form of innovative optical capabilities, data processing algorithms, and hardware upgrades. Each of these efforts is in various stages of evaluation and acquisition. These advanced capabilities rely upon a modern computing environment in which to integrate, but GEODSS does not have one—yet. The SPO is also executing a Service Life Extension Program (SLEP) to modernize the various subsystems within GEODSS, along with a parallel effort to implement a complete, modern software re-architecture. The goal is to use a modern, service-based architecture to provide expedient integration as well as easier and more sustainable expansion. This presentation will describe these modernization efforts in more detail and discuss how adopting such modern paradigms and practices will help ensure the GEODSS system remains relevant and sustainable far beyond 2027.
NASA Technical Reports Server (NTRS)
Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.
1992-01-01
Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.
Integrating the Apache Big Data Stack with HPC for Big Data
NASA Astrophysics Data System (ADS)
Fox, G. C.; Qiu, J.; Jha, S.
2014-12-01
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development. However, the same is not so true for data intensive computing, even though commercial clouds devote much more resources to data analytics than supercomputers devote to simulations. We look at a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures. We suggest a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks and use these to identify a few key classes of hardware/software architectures. Our analysis builds on combining HPC and ABDS, the Apache big data software stack that is well used in modern cloud computing. Initial results on clouds and HPC systems are encouraging. We propose the development of SPIDAL (Scalable Parallel Interoperable Data Analytics Library), built on system and data abstractions suggested by the HPC-ABDS architecture. We discuss how it can be used in several application areas including Polar Science.
Evaluation of the Conservation of Modern Architectural Heritage through Ankara’s Public Buildings
NASA Astrophysics Data System (ADS)
Turgut Gültekin, Nevin
2017-10-01
This paper evaluates the approach to the field of modern architecture in Turkey through the public buildings of Ankara. Although the conservation of modern architecture as cultural heritage has been accepted, to a limited degree, within related frameworks and disciplines, and within theory, the inconsistency in preservation legislations have been evaluated critically. The scope of conservation is limited to the state of being old and historical, thereby rendering modern architecture not worth conserving. This is valid for many countries, just like it is for Turkey. Despite various local interpretations of the mode of modern architecture that foresees mono-typing, the connotations of “culture” and the state of being a “product of the past,” of the 20th century, are denied. The expanding and transforming characteristic of immovable cultural heritage is disregarded. As such, modern architecture in Turkey remains inadequately analyzed and documented within the framework of cultural heritage. The conservation of buildings dating back to the 20th century remains within the preference of the related Ministry. As the criteria for this preference is not determined, some public buildings that exemplify modern architecture are rapidly lost despite their being of the same style and period with other buildings designated for conservation. The threat of being torn down or destroyed due to aging functionally and physically renders the preservation of modern architecture products within the framework of cultural heritage, as well as the updating of the legal context according to new parameters, urgent and necessary. The sustenance of public buildings, which are not only products of modern architecture but also sources of the history of the city and architecture, and therefore the history of the Republic in Turkey and the modernization process, gains even more significance through its impact on the urban identity of the capital, Ankara. 
To this end, this paper focuses on the city of Ankara for its case study on the present status of sustaining modern architectural heritage.
NASA Astrophysics Data System (ADS)
Ragan-Kelley, M.; Perez, F.; Granger, B.; Kluyver, T.; Ivanov, P.; Frederic, J.; Bussonnier, M.
2014-12-01
IPython has provided terminal-based tools for interactive computing in Python since 2001. The notebook document format and multi-process architecture introduced in 2011 have expanded the applicable scope of IPython into teaching, presenting, and sharing computational work, in addition to interactive exploration. The new architecture also allows users to work in any language, with implementations in Python, R, Julia, Haskell, and several other languages. The language agnostic parts of IPython have been renamed to Jupyter, to better capture the notion that a cross-language design can encapsulate commonalities present in computational research regardless of the programming language being used. This architecture offers components like the web-based Notebook interface, that supports rich documents that combine code and computational results with text narratives, mathematics, images, video and any media that a modern browser can display. This interface can be used not only in research, but also for publication and education, as notebooks can be converted to a variety of output formats, including HTML and PDF. Recent developments in the Jupyter project include a multi-user environment for hosting notebooks for a class or research group, a live collaboration notebook via Google Docs, and better support for languages other than Python.
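The notebook document described above is, on disk, a JSON file. The following minimal sketch builds one by hand using only the Python standard library; it assumes the version 4 notebook layout (top-level `cells`, `metadata`, `nbformat`, `nbformat_minor`), and the helper names `make_notebook`, `markdown_cell`, and `code_cell` are hypothetical, not part of any Jupyter API. It is an illustration of the idea, not a substitute for the project's own tooling.

```python
import json


def markdown_cell(text):
    """Build a narrative (markdown) cell as a plain dictionary."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source, stdout=""):
    """Build a code cell; captured stdout, if any, is stored as a stream output."""
    outputs = []
    if stdout:
        outputs.append({"output_type": "stream", "name": "stdout", "text": stdout})
    return {
        "cell_type": "code",
        "execution_count": None,  # None means the cell has not been executed
        "metadata": {},
        "outputs": outputs,
        "source": source,
    }


def make_notebook(cells):
    """Wrap cells in the top-level notebook structure (assumed v4 layout)."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook([
    markdown_cell("# Example\nText narrative next to code."),
    code_cell("print(1 + 1)", stdout="2\n"),
])

# Round-trip through JSON, as a file written to disk would be.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
```

Because the document is plain JSON, it is independent of the language in which the code cells are written, which is the property the abstract attributes to the language-agnostic parts of the design.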
A performance model for GPUs with caches
Dao, Thanh Tuan; Kim, Jungwon; Seo, Sangmin; ...
2014-06-24
To exploit the abundant computational power of the world's fastest supercomputers, an even workload distribution to the typically heterogeneous compute devices is necessary. While relatively accurate performance models exist for conventional CPUs, accurate performance estimation models for modern GPUs do not exist. This paper presents two accurate models for modern GPUs: a sampling-based linear model, and a model based on machine-learning (ML) techniques which improves the accuracy of the linear model and is applicable to modern GPUs with and without caches. We first construct the sampling-based linear model to predict the runtime of an arbitrary OpenCL kernel. Based on an analysis of NVIDIA GPUs' scheduling policies we determine the earliest sampling points that allow an accurate estimation. The linear model cannot capture well the significant effects that memory coalescing or caching as implemented in modern GPUs have on performance. We therefore propose a model based on ML techniques that takes several compiler-generated statistics about the kernel as well as the GPU's hardware performance counters as additional inputs to obtain a more accurate runtime performance estimation for modern GPUs. We demonstrate the effectiveness and broad applicability of the model by applying it to three different NVIDIA GPU architectures and one AMD GPU architecture. On an extensive set of OpenCL benchmarks, on average, the proposed model estimates the runtime performance with less than 7 percent error for a second-generation GTX 280 with no on-chip caches and less than 5 percent for the Fermi-based GTX 580 with hardware caches. On the Kepler-based GTX 680, the linear model has an error of less than 10 percent. On an AMD GPU architecture, Radeon HD 6970, the model estimates the runtime with an error rate of 8 percent. As a result, the proposed technique outperforms existing models by a factor of 5 to 6 in terms of accuracy.
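The idea behind a sampling-based linear model can be sketched in a few lines: time a kernel on a few small portions of the work, fit a straight line, and extrapolate to the full problem size. The sketch below is not the authors' model; the function names and the synthetic timings are hypothetical, and an ordinary least-squares fit is assumed for the linear step.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_runtime(sample_sizes, sample_times, full_size):
    """Extrapolate the runtime at full_size from a few timed samples."""
    slope, intercept = fit_line(sample_sizes, sample_times)
    return slope * full_size + intercept


# Hypothetical timings (ms) for the first 64, 128 and 256 work-groups.
# They were generated from 0.0125 * x + 0.5, so the fit is exact here.
sizes = [64, 128, 256]
times = [1.3, 2.1, 3.7]

estimate = predict_runtime(sizes, times, full_size=4096)
```

A purely linear fit like this has no term for coalescing or cache effects, which is the limitation the abstract gives as the motivation for the ML-based model.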
High performance flight computer developed for deep space applications
NASA Technical Reports Server (NTRS)
Bunker, Robert L.
1993-01-01
The development of an advanced space flight computer for real-time embedded deep space applications, which embodies the lessons learned on Galileo and modern computer technology, is described. The requirements are listed and the design implementation that meets those requirements is described. The development of SPACE-16 (Spaceborne Advanced Computing Engine) (where 16 designates the databus width) was initiated to support the MM2 (Mariner Mark 2) project. The computer is based on a radiation-hardened emulation of a modern 32-bit microprocessor and its family of support devices, including a high-performance floating-point accelerator. Additional custom devices, which include a coprocessor to improve input/output capabilities, a memory interface chip, and an additional support chip that provides management of all fault-tolerant features, are described. Detailed supporting analyses and rationale which justify specific design and architectural decisions are provided. The six chip types were designed and fabricated. Testing and evaluation of a brassboard was initiated.
WIS Implementation Study Report. Volume 2. Resumes.
1983-10-01
WIS modernization that major attention be paid to interface definition and design, system integration and test, and configuration management of the...Estimates -- Computer Corporation of America -- 155 Test Processing Systems -- Newburyport Computer Associates, Inc. -- 183 Cluster II Papers -- Standards...enhancements of the SPL/I compiler system, development of test systems for the verification of SDEX/M and the timing and architecture of the AN/UYK-20 and
NASA STI Program Coordinating Council Eleventh Meeting: NASA STI Modernization Plan
NASA Technical Reports Server (NTRS)
1993-01-01
The theme of this NASA Scientific and Technical Information Program Coordinating Council Meeting was the modernization of the STI Program. Topics covered included the activities of the Engineering Review Board in the creation of the Infrastructure Upgrade Plan, the progress of the RECON Replacement Project, the use and status of Electronic SCAN (Selected Current Aerospace Notices), the Machine Translation Project, multimedia, electronic document interchange, the NASA Access Mechanism, computer network upgrades, and standards in the architectural effort.
Performance of GeantV EM Physics Models
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2017-10-01
The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Song, Shuaiwen; Fu, Haohuan
2014-08-16
Support Vector Machine (SVM) has been widely used in data-mining and Big Data applications as modern commercial databases start to attach an increasing importance to the analytic capabilities. In recent years, SVM was adapted to the field of High Performance Computing for power/performance prediction, auto-tuning, and runtime scheduling. However, even at the risk of losing prediction accuracy due to insufficient runtime information, researchers can only afford to apply offline model training to avoid significant runtime training overhead. To address the challenges above, we designed and implemented MICSVM, a highly efficient parallel SVM for x86-based multi-core and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi coprocessor (MIC).
NASA Astrophysics Data System (ADS)
Titov, A. G.; Okladnikov, I. G.; Gordov, E. P.
2017-11-01
The use of large geospatial datasets in climate change studies requires the development of a set of Spatial Data Infrastructure (SDI) elements, including geoprocessing and cartographical visualization web services. This paper presents the architecture of a geospatial OGC web service system as an integral part of a virtual research environment (VRE) general architecture for statistical processing and visualization of meteorological and climatic data. The architecture is a set of interconnected standalone SDI nodes with corresponding data storage systems. Each node runs a specialized software, such as a geoportal, cartographical web services (WMS/WFS), a metadata catalog, and a MySQL database of technical metadata describing geospatial datasets available for the node. It also contains geospatial data processing services (WPS) based on a modular computing backend realizing statistical processing functionality and, thus, providing analysis of large datasets with the results of visualization and export into files of standard formats (XML, binary, etc.). Some cartographical web services have been developed in a system’s prototype to provide capabilities to work with raster and vector geospatial data based on OGC web services. The distributed architecture presented allows easy addition of new nodes, computing and data storage systems, and provides a solid computational infrastructure for regional climate change studies based on modern Web and GIS technologies.
The path toward HEP High Performance Computing
NASA Astrophysics Data System (ADS)
Apostolakis, John; Brun, René; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro
2014-06-01
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a "High Performance" implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the ROOT and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single-threaded version, together with sub-optimal handling of event processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grained parallel approach.
The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from the recent technology evolution in computing.
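Scheduling vectors of particles onto several workers, as described above, can be sketched as follows. This is an illustration under stated assumptions, not the framework from the abstract: tracks are grouped into fixed-size "baskets" by a hypothetical `volume` key, `transport` is a stand-in for a real propagation step, and a standard-library thread pool plays the role of the computing resources.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def make_baskets(tracks, basket_size):
    """Group tracks that share a volume into baskets of at most basket_size."""
    by_volume = defaultdict(list)
    for track in tracks:
        by_volume[track["volume"]].append(track)

    baskets = []
    for group in by_volume.values():
        for start in range(0, len(group), basket_size):
            baskets.append(group[start:start + basket_size])
    return baskets


def transport(basket):
    """Stand-in for a propagation step: advance every track in the basket."""
    return [{**track, "x": track["x"] + 1.0} for track in basket]


# Ten hypothetical tracks spread over three volumes.
tracks = [{"id": i, "volume": i % 3, "x": 0.0} for i in range(10)]
baskets = make_baskets(tracks, basket_size=4)

# Each basket is an independent unit of work handed to whichever worker is free.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(transport, baskets))

moved = [track for basket in results for track in basket]
```

Processing a basket as a unit is what gives a vectorised kernel contiguous, similar work; the thread pool here only demonstrates the dispatch pattern and does not itself provide SIMD execution.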
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detection and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experimental results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
Heterogeneous real-time computing in radio astronomy
NASA Astrophysics Data System (ADS)
Ford, John M.; Demorest, Paul; Ransom, Scott
2010-07-01
Modern computer architectures suited for general purpose computing are often not the best choice for either I/O-bound or compute-bound problems. Sometimes the best choice is not to choose a single architecture, but to take advantage of the best characteristics of different computer architectures to solve your problems. This paper examines the tradeoffs between using computer systems based on the ubiquitous X86 Central Processing Units (CPU's), Field Programmable Gate Array (FPGA) based signal processors, and Graphical Processing Units (GPU's). We will show how a heterogeneous system can be produced that blends the best of each of these technologies into a real-time signal processing system. FPGA's tightly coupled to analog-to-digital converters connect the instrument to the telescope and supply the first level of computing to the system. These FPGA's are coupled to other FPGA's to continue to provide highly efficient processing power. Data is then packaged up and shipped over fast networks to a cluster of general purpose computers equipped with GPU's, which are used for floating-point intensive computation. Finally, the data is handled by the CPU and written to disk, or further processed. Each of the elements in the system has been chosen for its specific characteristics and the role it can play in creating a system that does the most for the least, in terms of power, space, and money.
Dependency graph for code analysis on emerging architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shashkov, Mikhail Jurievich; Lipnikov, Konstantin
The directed acyclic graph (DAG) of dependencies is becoming the standard for modern multi-physics codes. The ideal DAG is the true block scheme of a multi-physics code. Therefore, it is a convenient object for in situ analysis of the cost of computations and of algorithmic bottlenecks related to statistically frequent data motion and dynamical machine state.
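A dependency DAG lends itself to simple cost analysis. The minimal sketch below uses Python's standard `graphlib` module (Python 3.9+) to order tasks and compute the cost of the longest dependency chain; the task names and per-task costs are hypothetical placeholders and do not come from the record above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical example).
deps = {
    "mesh": set(),
    "hydro": {"mesh"},
    "radiation": {"mesh"},
    "coupling": {"hydro", "radiation"},
    "output": {"coupling"},
}

# Hypothetical cost of each task, in arbitrary time units.
cost = {"mesh": 1.0, "hydro": 4.0, "radiation": 2.5, "coupling": 0.5, "output": 0.2}

# A valid execution order: every task appears after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())

# Earliest finish time of each task, assuming unlimited parallel resources.
finish = {}
for task in order:
    start = max((finish[d] for d in deps[task]), default=0.0)
    finish[task] = start + cost[task]

# The longest chain bounds the runtime no matter how many workers are available.
critical_path_cost = max(finish.values())
```

In this toy graph the chain mesh, hydro, coupling, output dominates, so speeding up "radiation" would not shorten the run; that is the kind of bottleneck a DAG view makes visible.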
Early Period of Modern Architecture in Turkey - A Case Study of Eskisehir
NASA Astrophysics Data System (ADS)
Karasozen, Rana
2017-10-01
Modern architecture in the Western World bore fruit at the beginning of the 20th Century in consequence of the process of modernity and the search for the proper architecture for it. In Turkey, it first took shape towards the end of the 1920s. The main reason for this nonsynchronous development was the inadequacy of enlightenment and industrial revolution during the Ottoman Empire and the lack of an intellectual infrastructure which provides the basis of modernity. However, the Ottoman Westernization occurring in the 19th century constituted the foundations of the modernity of the Republic founded in 1923. The earliest modern architectural designs in Turkey were first practised by European architects after the foundation of the Republic and were internalised and practised extensively by native architects afterwards. The early modern architecture of Turkey, named “1930s Modernism”, continued until the beginning of World War II. This period fell between the first and second nationalist architecture movements. The early modern architecture period of Turkey was a period in which high-quality designs were made. It was practised and internalised not only in big cities such as Ankara and Istanbul, but also in the medium and small cities of the country. This situation was not just about a formal exception but about the internalisation of modernity by the society. Eskisehir is one of the most important pioneering cities of the Republic period in terms of industrial and educational developments. The earliest modern buildings were built as public buildings by the state and non-citizen architects in the inadequate conditions of the country in terms of economy and professional people. The earliest modern houses of the city designed by these architects were the prototypes for the later practices which offered the citizens a new lifestyle. The modern houses were symbols of prestige and status for the owners and the dwellers.
The features of early modern buildings of Eskisehir as a medium-size city of Turkey will be examined in this study within the scope of the early modern architectural period of Turkey.
CUDA Optimization Strategies for Compute- and Memory-Bound Neuroimaging Algorithms
Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W.
2011-01-01
As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance is optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. PMID:21159404
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saad, Tony; Sutherland, James C.
2016-05-04
To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment, an effort that spans an interdisciplinary team of engineers and computer scientists.
NASA Astrophysics Data System (ADS)
Yarovyi, Andrii A.; Timchenko, Leonid I.; Kozhemiako, Volodymyr P.; Kokriatskaia, Nataliya I.; Hamdi, Rami R.; Savchuk, Tamara O.; Kulyk, Oleksandr O.; Surtel, Wojciech; Amirgaliyev, Yedilkhan; Kashaganova, Gulzhan
2017-08-01
The paper deals with the problem of insufficient productivity of existing computing tools for large image processing, which do not meet the modern requirements posed by resource-intensive computing tasks of laser beam profiling. The research concentrated on one of the profiling problems, namely, real-time processing of spot images of the laser beam profile. Development of a theory of parallel-hierarchic transformation made it possible to produce models for high-performance parallel-hierarchical processes, as well as algorithms and software for their implementation based on a GPU-oriented architecture using GPGPU technologies. The analyzed performance of the suggested computerized tools for processing and classification of laser beam profile images enables real-time processing of dynamic images of various sizes.
A simple modern correctness condition for a space-based high-performance multiprocessor
NASA Technical Reports Server (NTRS)
Probst, David K.; Li, Hon F.
1992-01-01
A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.
NASA Astrophysics Data System (ADS)
Kashansky, Vladislav V.; Kaftannikov, Igor L.
2018-02-01
Modern numerical modeling experiments and data analytics problems in various fields of science and technology impose a wide variety of demanding requirements on distributed computing systems. Many scientific computing projects sometimes exceed the limits of the available resource pool, requiring extra scalability and sustainability. In this paper we share our experience and findings on combining the power of SLURM, BOINC and GlusterFS as a software system for scientific computing. In particular, we suggest a complete architecture and highlight important aspects of systems integration.
Digital Waveguide Architectures for Virtual Musical Instruments
NASA Astrophysics Data System (ADS)
Smith, Julius O.
Digital sound synthesis has become a standard staple of modern music studios, videogames, personal computers, and hand-held devices. As processing power has increased over the years, sound synthesis implementations have evolved from dedicated chip sets, to single-chip solutions, and ultimately to software implementations within processors used primarily for other tasks (such as for graphics or general purpose computing). With the cost of implementation dropping closer and closer to zero, there is increasing room for higher quality algorithms.
Elements of a modern turbomachinery design system
NASA Astrophysics Data System (ADS)
Jennions, Ian K.
1994-05-01
The aerodynamic design system at GE Aircraft Engines (GEAE) consists of many parts: throughflow, secondary flow, geometry generators, blade-to-blade and fully three-dimensional (3D) analysis. This paper describes each of these elements and discusses optimization and computer architecture issues. Emphasis is placed on those areas in which the company is thought to have special capability.
Architectural Techniques For Managing Non-volatile Caches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
As chip power dissipation becomes a critical challenge in scaling processor performance, computer architects are forced to fundamentally rethink the design of modern processors and hence, the chip-design industry is now at a major inflection point in its hardware roadmap. The high leakage power and low density of SRAM pose serious obstacles to its use for designing large on-chip caches and for this reason, researchers are exploring non-volatile memory (NVM) devices, such as spin torque transfer RAM, phase change RAM and resistive RAM. However, since NVMs are not strictly superior to SRAM, effective architectural techniques are required for making them a universal memory solution. This book discusses techniques for designing processor caches using NVM devices. It presents algorithms and architectures for improving their energy efficiency, performance and lifetime. It also provides both qualitative and quantitative evaluation to help the reader gain insights and motivate them to explore further. This book will be highly useful for beginners as well as veterans in computer architecture, chip designers, product managers and technical marketing professionals.
Image and Morphology in Modern Theory of Architecture
NASA Astrophysics Data System (ADS)
Yankovskaya, Y. S.; Merenkov, A. V.
2017-11-01
This paper is devoted to some important and fundamental problems of modern Russian architectural theory. These problems are: a methodological and technological lag; the substitution of humanitarian concepts for modern professional architectural theoretical knowledge; and a preference for traditional historical or historical-theoretical research. One of the most probable ways forward is the formation of useful modern subject (and multi-subject)-oriented concepts in architecture. The criticism and distrust of architectural theory can be overcome by recognizing the important role of the subject (architect, consumer, contractor, ruler, etc.) and by addressing the practical tasks of forming the human environment in today's rapidly changing world and post-industrial society. In this article we consider the evolution of two basic concepts of the theory of architecture: image and morphology.
Parallel algorithm for computation of second-order sequential best rotations
NASA Astrophysics Data System (ADS)
Redif, Soydan; Kasap, Server
2013-12-01
Algorithms for computing an approximate polynomial matrix eigenvalue decomposition of para-Hermitian systems have emerged as a powerful, generic signal processing tool. A technique that has shown much success in this regard is the sequential best rotation (SBR2) algorithm. Proposed is a scheme for parallelising SBR2 with a view to exploiting the modern architectural features and inherent parallelism of field-programmable gate array (FPGA) technology. Experiments show that the proposed scheme can achieve low execution times while requiring minimal FPGA resources.
High-performance computing with quantum processing units
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...
2017-03-01
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
High-performance computing with quantum processing units
DOE Office of Scientific and Technical Information (OSTI.GOV)
Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.
The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
Parallelization of the preconditioned IDR solver for modern multicore computer systems
NASA Astrophysics Data System (ADS)
Bessonov, O. A.; Fedoseyev, A. I.
2012-10-01
This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK on modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (with a user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, processor architectures and computer system organization have changed dramatically in recent years. Because of this, performance criteria and methods have been revisited, and the solver and preconditioner have been parallelized using the OpenMP environment. Results of the successful implementation of efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Fiacco, P. A.; Rice, W. H.
1991-01-01
Computerized medical record systems require structured database architectures for information processing. However, the data must be transferable across heterogeneous platforms and software systems. Client-server architecture allows for distributed processing of information among networked computers and provides the flexibility needed to link diverse systems together effectively. We have incorporated this client-server model with a graphical user interface into an outpatient medical record system, known as SuperChart, for the Department of Family Medicine at SUNY Health Science Center at Syracuse. SuperChart was developed using SuperCard and Oracle. SuperCard uses modern object-oriented programming to support a hypermedia environment. Oracle is a powerful relational database management system that incorporates a client-server architecture. This provides both a distributed database and distributed processing, which improves performance. PMID:1807732
NASA Astrophysics Data System (ADS)
Ford, Eric B.; Dindar, Saleh; Peters, Jorg
2015-08-01
The realism of astrophysical simulations and statistical analyses of astronomical data are set by the available computational resources. Thus, astronomers and astrophysicists are constantly pushing the limits of computational capabilities. For decades, astronomers benefited from massive improvements in computational power that were driven primarily by increasing clock speeds and required relatively little attention to details of the computational hardware. For nearly a decade, increases in computational capabilities have come primarily from increasing the degree of parallelism, rather than increasing clock speeds. Further increases in computational capabilities will likely be led by many-core architectures such as Graphical Processing Units (GPUs) and Intel Xeon Phi. Successfully harnessing these new architectures requires significantly more understanding of the hardware architecture, cache hierarchy, compiler capabilities and network characteristics. I will provide an astronomer's overview of the opportunities and challenges provided by modern many-core architectures and elastic cloud computing. The primary goal is to help an astronomical audience understand what types of problems are likely to yield more than an order of magnitude speed-up and which problems are unlikely to parallelize sufficiently efficiently to be worth the development time and/or costs. I will draw on my experience leading a team in developing the Swarm-NG library for parallel integration of large ensembles of small n-body systems on GPUs, as well as several smaller software projects. I will share lessons learned from collaborating with computer scientists, including both technical and soft skills. 
Finally, I will discuss the challenges of training the next generation of astronomers to be proficient in this new era of high-performance computing, drawing on experience teaching a graduate class on High-Performance Scientific Computing for Astrophysics and organizing a 2014 advanced summer school on Bayesian Computing for Astronomical Data Analysis with support of the Penn State Center for Astrostatistics and Institute for CyberScience.
Islamic Modernism and Architectural Modernism of Muhammadiyah’s Lio Mosque
NASA Astrophysics Data System (ADS)
Prajawisastra, A. F.; Aryanti, T.
2017-03-01
The Muhammadiyah’s Lio Mosque is one of the masterpieces of Achmad Noe’man, the great Indonesian mosque architect. The mosque was built as a community mosque at the center of Muhammadiyah’s quarter in Garut, West Java, in conjunction with the construction of the district’s Muhammadiyah branch. Departing from the established conventions of form, the mosque has neither a dome nor a tajug tumpang tiga (three-tiered pyramidal roof) like other mosques nearby, but instead uses a gable roof and towering minarets. This article aims to analyze the architecture of the Lio Mosque and to learn Achmad Noe’man’s interpretation of modernism, both Islamic modernism and architectural modernism, as reflected in the mosque design. Employing a qualitative approach, this study used observation and interviews with the mosque’s stakeholders. This article argues that the ideology of modernism, held by Achmad Noe’man and the Muhammadiyah organization, was embodied in the Lio Mosque architecture.
Contemporary cybernetics and its facets of cognitive informatics and computational intelligence.
Wang, Yingxu; Kinsner, Witold; Zhang, Du
2009-08-01
This paper explores the architecture, theoretical foundations, and paradigms of contemporary cybernetics from perspectives of cognitive informatics (CI) and computational intelligence. The modern domain and the hierarchical behavioral model of cybernetics are elaborated at the imperative, autonomic, and cognitive layers. The CI facet of cybernetics is presented, which explains how the brain may be mimicked in cybernetics via CI and neural informatics. The computational intelligence facet is described with a generic intelligence model of cybernetics. The compatibility between natural and cybernetic intelligence is analyzed. A coherent framework of contemporary cybernetics is presented toward the development of transdisciplinary theories and applications in cybernetics, CI, and computational intelligence.
Evolution of the SOFIA tracking control system
NASA Astrophysics Data System (ADS)
Fiebig, Norbert; Jakob, Holger; Pfüller, Enrico; Röser, Hans-Peter; Wiedemann, Manuel; Wolf, Jürgen
2014-07-01
The airborne observatory SOFIA (Stratospheric Observatory for Infrared Astronomy) is undergoing a modernization of its tracking system. This includes new, highly sensitive tracking cameras, control computers, filter wheels and other equipment, as well as a major redesign of the control software. The experiences along the migration path from an aged 19" VMEbus-based control system to modern industrial PCs, and from the VxWorks real-time operating system to embedded Linux and a state-of-the-art software architecture, are presented. Further, a concept is presented for operating the new camera as a scientific instrument in parallel with tracking.
Air Traffic Control: Complete and Enforced Architecture Needed for FAA Systems Modernization
DOT National Transportation Integrated Search
1997-02-01
Because of the size, complexity, and importance of FAA's air traffic control : (ATC) modernization, the General Accounting Office (GAO) reviewed it to : determine (1) whether FAA has a target architecture(s), and associated : subarchitectures, to gui...
The Salman Mosque: Achmad Noe’man’s Critique of Indonesian Conventional Mosque Architecture
NASA Astrophysics Data System (ADS)
Holik, A. A. R.; Aryanti, T.
2017-03-01
The Salman Mosque, designed by Achmad Noe’man, was a striking Islamic architectural design in the 1960s when it was built. Unlike the conventional mosques, particularly in Indonesia, it has no dome. Instead, the roof was made of prestressed concrete and resembles a canoe. Using data drawn from field observations, this paper explores the architectural characteristics of the Salman Mosque as a product of Modern architecture. It argues that the domeless mosque, the simple minaret, the wooden wall panels and floor, the women’s balcony, and the roof demonstrate architectural modernism, as opposed to the conventional mosque typology that flourished in Indonesia at the time. This paper further argues that the Salman Mosque is Noe’man’s critique of the Indonesian conventional mosque architecture. It concludes that the architectural features of the Salman Mosque reflect Noe’man’s modern vision of Islam and Islamic architecture.
Feeding People's Curiosity: Leveraging the Cloud for Automatic Dissemination of Mars Images
NASA Technical Reports Server (NTRS)
Knight, David; Powell, Mark
2013-01-01
Smartphones and tablets have made wireless computing ubiquitous, and users expect instant, on-demand access to information. The Mars Science Laboratory (MSL) operations software suite, MSL InterfaCE (MSLICE), employs a different back-end image processing architecture compared to that of the Mars Exploration Rovers (MER) in order to better satisfy modern consumer-driven usage patterns and to offer greater server-side flexibility. Cloud services are a centerpiece of the server-side architecture that allows new image data to be delivered automatically to both scientists using MSLICE and the general public through the MSL website (http://mars.jpl.nasa.gov/msl/).
Dynamic Architecture. New Style Forming Aspects
NASA Astrophysics Data System (ADS)
Belyaeva, T. V.
2017-11-01
The article deals with the methods of buildings and structures transformation in the light of modern solutions in dynamic architecture. The mechanism for the formation of a modern object is proposed. Such design methods are becoming rather relevant in view of today’s trends while the priority of dynamic architecture directions keeps increasing.
A GPU-Based Architecture for Real-Time Data Assessment at Synchrotron Experiments
NASA Astrophysics Data System (ADS)
Chilingaryan, Suren; Mirone, Alessandro; Hammersley, Andrew; Ferrero, Claudio; Helfen, Lukas; Kopmann, Andreas; Rolo, Tomy dos Santos; Vagovic, Patrik
2011-08-01
Advances in digital detector technology are currently leading to rapidly increasing data rates in imaging experiments. Using fast two-dimensional detectors in computed tomography, the data acquisition can be much faster than the reconstruction if no adequate measures are taken, especially when a high photon flux at synchrotron sources is used. We have optimized the reconstruction software employed at the micro-tomography beamlines of our synchrotron facilities to use the computational power of modern graphics cards. The main paradigm of our approach is the full utilization of all system resources. We use a pipelined architecture, where the GPUs are used as compute coprocessors to reconstruct slices, while the CPUs are preparing the next ones. Special attention is devoted to minimizing data transfers between the host and GPU memory and to executing memory transfers in parallel with the computations. We were able to reduce the reconstruction time by a factor of 30 and process a typical data set of 20 GB in 40 seconds. The time needed for the first evaluation of the reconstructed sample is reduced significantly and quasi real-time visualization is now possible.
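The pipelined scheme described in this abstract (one stage prepares the next slice while another reconstructs the current one) can be illustrated with a minimal Python sketch. It is not the authors' implementation: `prepare_slice`, `reconstruct_slice`, and `run_pipeline` are hypothetical names, the "GPU" stage is an ordinary CPU function used as a stand-in, and the queue depth of 2 is an arbitrary assumption.

```python
import queue
import threading


def prepare_slice(index):
    # Hypothetical CPU-side stage: stand-in for loading and filtering one slice.
    return [float(index)] * 4


def reconstruct_slice(data):
    # Hypothetical stand-in for the GPU reconstruction kernel.
    return sum(data)


def run_pipeline(num_slices, depth=2):
    """Overlap slice preparation and reconstruction using a bounded queue."""
    pending = queue.Queue(maxsize=depth)
    results = [None] * num_slices

    def producer():
        for i in range(num_slices):
            # put() blocks when the queue is full, i.e. when reconstruction lags.
            pending.put((i, prepare_slice(i)))
        pending.put(None)  # sentinel: no more slices

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        item = pending.get()
        if item is None:
            break
        i, data = item
        results[i] = reconstruct_slice(data)
    worker.join()
    return results


print(run_pipeline(3))
```

The bounded queue is the design point worth noting: it lets preparation run ahead of reconstruction by at most `depth` slices, which caps memory use while keeping both stages busy.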
High-performance 3D compressive sensing MRI reconstruction.
Kim, Daehyun; Trzasko, Joshua D; Smelyanskiy, Mikhail; Haider, Clifton R; Manduca, Armando; Dubey, Pradeep
2010-01-01
Compressive Sensing (CS) is a nascent sampling and reconstruction paradigm that describes how sparse or compressible signals can be accurately approximated using many fewer samples than traditionally believed. In magnetic resonance imaging (MRI), where scan duration is directly proportional to the number of acquired samples, CS has the potential to dramatically decrease scan time. However, the computationally expensive nature of CS reconstructions has so far precluded their use in routine clinical practice - instead, more-easily generated but lower-quality images continue to be used. We investigate the development and optimization of a proven inexact quasi-Newton CS reconstruction algorithm on several modern parallel architectures, including CPUs, GPUs, and Intel's Many Integrated Core (MIC) architecture. Our (optimized) baseline implementation on a quad-core Core i7 is able to reconstruct a 256 × 160 × 80 volume of the neurovasculature from an 8-channel, 10× undersampled data set within 56 seconds, which is already a significant improvement over existing implementations. The latest six-core Core i7 reduces the reconstruction time further to 32 seconds. Moreover, we show that the CS algorithm benefits from modern throughput-oriented architectures. Specifically, our CUDA-based implementation on NVIDIA GTX480 reconstructs the same dataset in 16 seconds, while Intel's Knights Ferry (KNF) of the MIC architecture even reduces the time to 12 seconds. Such a level of performance allows the neurovascular dataset to be reconstructed within a clinically viable time.
Micro-Biomechanics of the Kebara 2 Hyoid and Its Implications for Speech in Neanderthals
D’Anastasio, Ruggero; Wroe, Stephen; Tuniz, Claudio; Mancini, Lucia; Cesana, Deneb T.; Dreossi, Diego; Ravichandiran, Mayoorendra; Attard, Marie; Parr, William C. H.; Agur, Anne; Capasso, Luigi
2013-01-01
The description of a Neanderthal hyoid from Kebara Cave (Israel) in 1989 fuelled scientific debate on the evolution of speech and complex language. Gross anatomy of the Kebara 2 hyoid differs little from that of modern humans. However, whether Homo neanderthalensis could use speech or complex language remains controversial. Similarity in overall shape does not necessarily demonstrate that the Kebara 2 hyoid was used in the same way as that of Homo sapiens. The mechanical performance of whole bones is partly controlled by internal trabecular geometries, regulated by bone-remodelling in response to the forces applied. Here we show that the Neanderthal and modern human hyoids also present very similar internal architectures and micro-biomechanical behaviours. Our study incorporates detailed analysis of histology, meticulous reconstruction of musculature, and computational biomechanical analysis with models incorporating internal micro-geometry. Because internal architecture reflects the loadings to which a bone is routinely subjected, our findings are consistent with a capacity for speech in the Neanderthals. PMID:24367509
Architecture as a Diplomatic Tool: A Proposal for the New American Embassy in Baghdad, Iraq
2006-04-01
133 William J. R. Curtis, Modern Architecture Since 1900, 3 ed. (New York: Phaidon , 1996) 295-96. 134 Curtis 296. 135 Curtis...Dover, 1986. Curtis, William J. R. Modern Architecture Since 1900. 3d ed. New York: Phaidon , 1996. De Quincy, Quatremère. The True, the Fictive
SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications
Kalinin, Alexandr A.; Palanimalai, Selvam; Dinov, Ivo D.
2018-01-01
The modern web is a successful platform for large scale interactive web applications, including visualizations. However, there are no established design principles for building complex visual analytics (VA) web applications that could efficiently integrate visualizations with data management, computational transformation, hypothesis testing, and knowledge discovery. This imposes a time-consuming design and development process on many researchers and developers. To address these challenges, we consider the design requirements for the development of a module-based VA system architecture, adopting existing practices of large scale web application development. We present the preliminary design and implementation of an open-source platform for Statistics Online Computational Resource Analytical Toolbox (SOCRAT). This platform defines: (1) a specification for an architecture for building VA applications with multi-level modularity, and (2) methods for optimizing module interaction, re-usage, and extension. To demonstrate how this platform can be used to integrate a number of data management, interactive visualization, and analysis tools, we implement an example application for simple VA tasks including raw data input and representation, interactive visualization and analysis. PMID:29630069
SOCRAT Platform Design: A Web Architecture for Interactive Visual Analytics Applications.
Kalinin, Alexandr A; Palanimalai, Selvam; Dinov, Ivo D
2017-04-01
The modern web is a successful platform for large scale interactive web applications, including visualizations. However, there are no established design principles for building complex visual analytics (VA) web applications that could efficiently integrate visualizations with data management, computational transformation, hypothesis testing, and knowledge discovery. This imposes a time-consuming design and development process on many researchers and developers. To address these challenges, we consider the design requirements for the development of a module-based VA system architecture, adopting existing practices of large scale web application development. We present the preliminary design and implementation of an open-source platform for Statistics Online Computational Resource Analytical Toolbox (SOCRAT). This platform defines: (1) a specification for an architecture for building VA applications with multi-level modularity, and (2) methods for optimizing module interaction, re-usage, and extension. To demonstrate how this platform can be used to integrate a number of data management, interactive visualization, and analysis tools, we implement an example application for simple VA tasks including raw data input and representation, interactive visualization and analysis.
SISYPHUS: A high performance seismic inversion factory
NASA Astrophysics Data System (ADS)
Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas
2016-04-01
In recent years, massively parallel high-performance computers have become the standard instruments for solving forward and inverse problems in seismology. Software packages for forward and inverse waveform modelling designed specifically for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, the typical system architecture of modern supercomputing platforms is oriented towards maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for modern massively parallel high-performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. 
The inversion state database represents a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Y.; Cameron, K.W.
1998-11-24
Workload characterization has proven to be an essential tool for architecture design and performance evaluation in both scientific and commercial computing. Traditional workload characterization techniques include FLOPS rate, cache miss ratios, CPI (cycles per instruction) or IPC (instructions per cycle), etc. With the complexity of sophisticated modern superscalar microprocessors, these traditional characterization techniques are not powerful enough to pinpoint the performance bottleneck of an application on a specific microprocessor. They are also incapable of immediately demonstrating the potential performance benefit of any architectural or functional improvement in a new processor design. To solve these problems, many people rely on simulators, which have substantial constraints, especially on large-scale scientific computing applications. This paper presents a new technique of characterizing applications at the instruction level using hardware performance counters. It has the advantage of collecting instruction-level characteristics in a few runs virtually without overhead or slowdown. A variety of instruction counts can be utilized to calculate some average abstract workload parameters corresponding to microprocessor pipelines or functional units. Based on the microprocessor architectural constraints and these calculated abstract parameters, the architectural performance bottleneck for a specific application can be estimated. In particular, the analysis results can provide some insight into the problem that only a small percentage of processor peak performance can be achieved even for many very cache-friendly codes. Meanwhile, the bottleneck estimation can provide suggestions about viable architectural/functional improvements for certain workloads. Eventually, these abstract parameters can lead to the creation of an analytical microprocessor pipeline model and memory hierarchy model.
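The traditional metrics named in this abstract (FLOPS rate, cache miss ratio, CPI and IPC) are simple ratios of raw counter values. The sketch below shows that arithmetic only; the function name `workload_metrics` and its parameter names are hypothetical, and the counter values are assumed to have been read already from whatever hardware-counter interface is available. It does not reproduce the instruction-level technique the paper proposes.

```python
def workload_metrics(cycles, instructions, flops, cache_accesses, cache_misses, seconds):
    """Derive traditional workload metrics from raw counter totals.

    All arguments are assumed to be totals collected over one run
    (hypothetical inputs, not tied to any specific counter library).
    """
    return {
        "cpi": cycles / instructions,            # cycles per instruction
        "ipc": instructions / cycles,            # instructions per cycle (1 / CPI)
        "flops_rate": flops / seconds,           # floating-point operations per second
        "cache_miss_ratio": cache_misses / cache_accesses,
    }


metrics = workload_metrics(
    cycles=2000, instructions=1000, flops=500,
    cache_accesses=400, cache_misses=20, seconds=0.5,
)
print(metrics)
```

As the abstract notes, these averages describe a run but do not by themselves identify which pipeline stage or functional unit limits performance.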
Dense and Sparse Matrix Operations on the Cell Processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel W.; Shalf, John; Oliker, Leonid
2005-05-01
The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, using a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.
Security Framework for Pervasive Healthcare Architectures Utilizing MPEG-21 IPMP Components.
Fragopoulos, Anastasios; Gialelis, John; Serpanos, Dimitrios
2009-01-01
In modern, ubiquitous computing environments, it is more imperative than ever to deploy pervasive healthcare architectures in which the patient is the central point, surrounded by different types of small embedded computing devices that measure sensitive physical indications and interact with hospital databases, thus allowing urgent medical response in critical situations. Such environments must be developed to satisfy the basic security requirements for real-time secure data communication, protection of sensitive medical data and measurements, data integrity and confidentiality, and protection of the monitored patient's privacy. In this work, we argue that the MPEG-21 Intellectual Property Management and Protection (IPMP) components can be used to protect transmitted medical information and enhance the patient's privacy, since they provide selective and controlled access to the medical data sent toward the hospital's servers.
A Survey Of Techniques for Managing and Leveraging Caches in GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2014-09-01
Initially introduced as special-purpose accelerators for graphics applications, graphics processing units (GPUs) have now emerged as general purpose computing platforms for a wide range of applications. To address the requirements of these applications, modern GPUs include sizable hardware-managed caches. However, several factors, such as the unique architecture of GPUs and the rise of CPU–GPU heterogeneous computing, demand effective management of caches to achieve high performance and energy efficiency. Recently, several techniques have been proposed for this purpose. In this paper, we survey several architectural and system-level techniques proposed for managing and leveraging GPU caches. We also discuss the importance and challenges of cache management in GPUs. The aim of this paper is to give readers insights into cache management techniques for GPUs and motivate them to propose even better techniques for leveraging the full potential of caches in the GPUs of tomorrow.
Natural energy and vernacular architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fathy, H.
1986-01-01
This volume presents insights into the indigenous architectural forms in hot arid climates. The author presents his extensive research on climate control, particularly in the Middle East, to demonstrate the advantages of many locally available building materials and traditional building methods. He suggests improved uses of natural energy that can bridge the gap between traditional achievements and modern needs. He argues that various architectural forms in these climates have evolved intuitively from scientifically valid concepts. Such forms combine comfort and beauty, social and physical functionality. He observes that in substituting modern materials, architects sometimes have ignored the environmental context of traditional architecture. As a result, individuals may find themselves physically and psychologically uncomfortable in modern structures. His approach, informed by a sensitive humanism, demonstrates the ways in which traditional architectural forms can be of use in solving problems facing contemporary architecture, in particular the critical housing situation in the Third World.
[Building Process and Architectural Planning Characteristics of Daehan Hospital Main Building].
Lee, Geauchul
2016-04-01
This paper explores the process by which Daehan Hospital was introduced from Japan as a modern medical facility in Korea, and its architectural planning characteristics as a medical facility, through the detailed building process of the Daehan Hospital main building. The most noticeable characteristic of Daehan Hospital is that it was designed and constructed not by Korean engineers but by Japanese engineers. Therefore, Daehan Hospital was influenced by early modern Japanese medical facilities, and Japanese engineers modeled the Daehan Hospital main building on the Tokyo Medical School main building, which was constructed in 1876 as the first national medical school and hospital. The architectural type of the Tokyo Medical School main building was typical of school architecture in the early modern Japanese period, with a middle corridor and a pseudo-Western-style tower, but the building became the model for medical facilities as the symbol of the medical department of Tokyo Imperial University. This was a process of introducing and transplanting a Japanese modern 'model', as with other modern systems and technologies during Korea's modern transition period. However, unlike the Tokyo Medical School main building, the Daehan Hospital main building was constructed not as a wooden building but as a masonry building. Relative to its function, the architectural form and construction costs of the Daehan Hospital main building were excessive, because the Japanese Resident-General of Korea intended to display the superiority of Japanese modernity over the Korean Empire.
Porting plasma physics simulation codes to modern computing architectures using the
NASA Astrophysics Data System (ADS)
Germaschewski, Kai; Abbott, Stephen
2015-11-01
Available computing power has continued to grow exponentially even after single-core performance saturated in the last decade. The increase has since been driven by more parallelism, both using more cores and having more parallelism in each core, e.g. in GPUs and Intel Xeon Phi. Adapting existing plasma physics codes is challenging, in particular as there is no single programming model that covers current and future architectures. We will introduce the open-source
Baryonic and mesonic 3-point functions with open spin indices
NASA Astrophysics Data System (ADS)
Bali, Gunnar S.; Collins, Sara; Gläßle, Benjamin; Heybrock, Simon; Korcyl, Piotr; Löffler, Marius; Rödl, Rudolf; Schäfer, Andreas
2018-03-01
We have implemented a new way of computing three-point correlation functions. It is based on a factorization of the entire correlation function into two parts which are evaluated with open spin (and to some extent flavor) indices. This allows us to estimate the two contributions simultaneously for many different initial and final states and momenta, with little computational overhead. We explain this factorization as well as its efficient implementation in a new library which has been written to provide the necessary functionality on modern parallel architectures and on CPUs, including Intel's Xeon Phi series.
Development of seismic tomography software for hybrid supercomputers
NASA Astrophysics Data System (ADS)
Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton
2015-04-01
Seismic tomography is a technique for computing a velocity model of a geologic structure from the first-arrival travel times of seismic waves. The technique is used in processing regional and global seismic data, in seismic exploration for mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of the development of seismic monitoring systems and the increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for seismic tomography applications, with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and a software package for such systems, to be used in processing large volumes of seismic data (hundreds of gigabytes and more). These algorithms and the software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using an eikonal equation solver, arrival times of seismic waves are computed based on an assumed velocity model of the geologic structure being analyzed. In order to solve the linearized inverse problem, a tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered.
During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
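The regularized linear step in the scheme above can be illustrated with a minimal sketch. It assumes simple Tikhonov (damped least-squares) regularization, which the abstract does not specify; the function name `tikhonov_update`, the toy matrix `G`, and the damping value are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def tikhonov_update(G, residuals, damping):
    """Solve min ||G dm - r||^2 + damping^2 ||dm||^2 for the model update dm.

    G         : (n_rays, n_cells) tomographic matrix (assumed: ray path lengths per cell)
    residuals : (n_rays,) travel time residuals
    damping   : regularization weight (assumed Tikhonov damping)
    """
    n_cells = G.shape[1]
    # Augment the system so that ordinary least squares applies the damping term.
    A = np.vstack([G, damping * np.eye(n_cells)])
    b = np.concatenate([residuals, np.zeros(n_cells)])
    dm, *_ = np.linalg.lstsq(A, b, rcond=None)
    return dm

# Toy example: 3 rays crossing 2 cells.
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
true_dm = np.array([0.2, -0.1])
residuals = G @ true_dm          # synthetic, noise-free residuals
dm = tikhonov_update(G, residuals, damping=1e-6)
```

With noise-free synthetic residuals and very small damping, the recovered update is close to `true_dm`; larger damping trades data fit for a smaller, more stable update.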
A FAST ITERATIVE METHOD FOR SOLVING THE EIKONAL EQUATION ON TETRAHEDRAL DOMAINS
Fu, Zhisong; Kirby, Robert M.; Whitaker, Ross T.
2014-01-01
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically optimal, but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains computational density. Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers. PMID:25221418
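As a rough illustration of the idea of independent local updates, the sketch below solves the eikonal equation on a regular 2D grid with a Jacobi-style fixed-point iteration of the first-order Godunov upwind update. This is a deliberately simplified stand-in under stated assumptions: it is not the paper's tetrahedral-mesh method, it has no active list or data compaction, and it assumes an isotropic speed. All names (`local_update`, `solve_eikonal`) are hypothetical.

```python
import numpy as np

def local_update(a, b, f, h):
    """Godunov upwind solve at one node.

    a, b : smallest neighbor arrival times along the two grid axes
    f    : slowness (1 / speed) at the node
    h    : grid spacing
    """
    if abs(a - b) >= f * h:
        # Only one axis contributes (one-sided update).
        return min(a, b) + f * h
    return 0.5 * (a + b + np.sqrt(2.0 * (f * h) ** 2 - (a - b) ** 2))

def solve_eikonal(slowness, source, h=1.0, tol=1e-9, max_sweeps=1000):
    ny, nx = slowness.shape
    T = np.full((ny, nx), np.inf)
    T[source] = 0.0
    for _ in range(max_sweeps):
        T_old = T.copy()
        # Each node reads only T_old, so all nodes could be updated in parallel.
        for i in range(ny):
            for j in range(nx):
                if (i, j) == source:
                    continue
                a = min(T_old[i - 1, j] if i > 0 else np.inf,
                        T_old[i + 1, j] if i < ny - 1 else np.inf)
                b = min(T_old[i, j - 1] if j > 0 else np.inf,
                        T_old[i, j + 1] if j < nx - 1 else np.inf)
                if np.isinf(a) and np.isinf(b):
                    continue  # the wavefront has not reached this node yet
                T[i, j] = min(T_old[i, j], local_update(a, b, slowness[i, j], h))
        reached = np.isfinite(T)
        if (np.array_equal(reached, np.isfinite(T_old))
                and np.all(np.abs(T[reached] - T_old[reached]) <= tol)):
            break
    return T

# Uniform unit slowness on a 5 x 5 grid with the source in one corner.
T = solve_eikonal(np.ones((5, 5)), source=(0, 0))
```

Because every node update depends only on the previous iterate, the inner loops map naturally onto SIMD or GPU threads; the method in the abstract additionally restricts work to an active set of nodes near the wavefront.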
NASA Astrophysics Data System (ADS)
Nurliani Lukito, Yulia; Previta Handoko, Bella
2018-03-01
During the 1950s, the idea of Minimalism presented itself as one response to the search for a universal language in art and architecture. This particular style, which started as an art movement, has received much criticism relating to the loss of art, but Minimalism has nevertheless spread all over the world and influenced many disciplines, including architecture. In minimalist architecture, elements of design convey simplicity through basic geometrical forms, no decoration, and the use of white color, modern materials and clean spaces. The “less is more” movement in architecture, which can be seen in the works of Mies van der Rohe and also in the International Style that celebrates materiality and rationality, is also understood as Minimalism. Moreover, an important historical connection to minimalist architecture is the relationship to popular representations of how the upscale modern family lived. Recently, the idea of minimalist architecture has appeared in Indonesia as a preferred housing style. Adapting minimalist architecture to a tropical climate can be done partly by modifying the forms and the microclimate, for example by using a passive system approach or additional equipment that creates comfort in the building. This paper investigates the idea of minimalist architecture in Jakarta, Indonesia, and how the idea is widely used for housing. One question raised by this study is whether minimalist architecture in Jakarta shares the same principles as minimalist architecture in its earlier period or is only a trend in housing design. This study analyzes not only the moment when the idea of Minimalism developed in the history of modern architecture but also some important characteristics of minimalist architecture in different eras and places. In addition, this study discusses how minimalist architecture in Jakarta becomes a way of dealing with both modern and local conditions, including a break from tradition.
Fast CPU-based Monte Carlo simulation for radiotherapy dose calculation.
Ziegenhein, Peter; Pirner, Sven; Ph Kamerling, Cornelis; Oelfke, Uwe
2015-08-07
Monte-Carlo (MC) simulations are considered to be the most accurate method for calculating dose distributions in radiotherapy. Their clinical application, however, is still limited by the long runtimes that conventional implementations of MC algorithms require to deliver sufficiently accurate results on high resolution imaging data. In order to overcome this obstacle we developed the software package PhiMC, which is capable of computing precise dose distributions in a sub-minute time-frame by leveraging the potential of modern many- and multi-core CPU-based computers. PhiMC is based on the well verified dose planning method (DPM). We could demonstrate that PhiMC delivers dose distributions which are in excellent agreement with DPM. The multi-core implementation of PhiMC scales well between different computer architectures and achieves a speed-up of up to 37× compared to the original DPM code executed on a modern system. Furthermore, we could show that our CPU-based implementation on a modern workstation is between 1.25× and 1.95× faster than a well-known GPU implementation of the same simulation method on an NVIDIA Tesla C2050. Since CPUs can work with several hundred GB of RAM, the typical GPU memory limitation does not apply to our implementation, and high resolution clinical plans can be calculated.
Multicore Architecture-aware Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srinivasa, Avinash
Modern high performance systems are becoming increasingly complex and powerful due to advancements in processor and memory architecture. In order to keep up with this increasing complexity, applications have to be augmented with certain capabilities to fully exploit such systems. These may be at the application level, such as static or dynamic adaptations, or at the system level, like having strategies in place to override some of the default operating system policies, the main objective being to improve computational performance of the application. The current work proposes two such capabilities with respect to multi-threaded scientific applications, in particular a large scale physics application computing ab-initio nuclear structure. The first involves using a middleware tool to invoke dynamic adaptations in the application, so as to be able to adjust to the changing computational resource availability at run-time. The second involves a strategy for effective placement of data in main memory, to optimize memory access latencies and bandwidth. These capabilities, when included, were found to have a significant impact on the application performance, resulting in average speedups of as much as two to four times.
A High Performance VLSI Computer Architecture For Computer Graphics
NASA Astrophysics Data System (ADS)
Chin, Chi-Yuan; Lin, Wen-Tai
1988-10-01
A VLSI computer architecture, consisting of multiple processors, is presented in this paper to satisfy modern computer graphics demands, e.g. high resolution, realistic animation, real-time display, etc. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.
Medical Signal-Conditioning and Data-Interface System
NASA Technical Reports Server (NTRS)
Braun, Jeffrey; Jacobus, Charles; Booth, Scott; Suarez, Michael; Smith, Derek; Hartnagle, Jeffrey; LePrell, Glenn
2006-01-01
A general-purpose portable, wearable electronic signal-conditioning and data-interface system is being developed for medical applications. The system can acquire multiple physiological signals (e.g., electrocardiographic, electroencephalographic, and electromyographic signals) from sensors on the wearer's body, digitize those signals that are received in analog form, preprocess the resulting data, and transmit the data to one or more remote location(s) via a radiocommunication link and/or the Internet. The system includes a computer running data-object-oriented software that can be programmed to configure the system to accept almost any analog or digital input signals from medical devices. The computing hardware and software implement a general-purpose data-routing-and-encapsulation architecture that supports tagging of input data and routing the data in a standardized way through the Internet and other modern packet-switching networks to one or more computer(s) for review by physicians. The architecture supports multiple-site buffering of data for redundancy and reliability, and supports both real-time and slower-than-real-time collection, routing, and viewing of signal data. Routing and viewing stations support insertion of automated analysis routines to aid in encoding, analysis, viewing, and diagnosis.
NASA Technical Reports Server (NTRS)
Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction, and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts. 3. Development of a code generator for performance prediction. 4. Automated partitioning. 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.
SNAPSHOT: A MODERN, SUSTAINABLE HOLDUP MEASUREMENT SYSTEM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rowe, Nathan C; Younkin, James R; Smith, Steven E
2016-01-01
SNAPSHOT is a software platform designed to eventually replace Holdup Measurement System 4 (HMS 4), which is the current state-of-the-art for acquisition and analysis of nondestructive assay measurement data for in situ nuclear materials, holdup, in support of criticality safety and material control and accounting. HMS 4 is over 10 years old and is currently unsustainable due to hardware and software incompatibilities that have arisen from advances in detector electronics, primarily updates to multi-channel analyzers (MCAs), and both computer and handheld operating systems. SNAPSHOT is a complete redesign of HMS 4 that addresses the issue of compatibility with modern MCAs and operating systems and that is designed with a flexible architecture to support long-term sustainability. It also provides an updated and more user-friendly interface and is being developed under an NQA-1 software quality assurance (SQA) program to facilitate site acceptance for safety-related applications. This paper provides an overview of the SNAPSHOT project including details of the software development process, the SQA program, and the architecture designed to support sustainability.
Optimizing high performance computing workflow for protein functional annotation.
Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene
2014-09-10
Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.
Optimizing high performance computing workflow for protein functional annotation
Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene
2014-01-01
Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296
Capital Architecture: Situating symbolism parallel to architectural methods and technology
NASA Astrophysics Data System (ADS)
Daoud, Bassam
Capital Architecture is a symbol of a nation's global presence and the cultural and social focal point of its inhabitants. Since the advent of High-Modernism in Western cities, and subsequently decolonised capitals, civic architecture no longer seems to be strictly grounded in the philosophy that national buildings shape the legacy of government and the way a nation is regarded through its built environment. Amidst an exceedingly globalized architectural practice and with the growing concern of key heritage foundations over the shortcomings of international modernism in representing its immediate socio-cultural context, the contextualization of public architecture within its sociological, cultural and economic framework in capital cities became the key denominator of this thesis. Civic architecture in capital cities is essential to confront the challenges of symbolizing a nation and demonstrating the legitimacy of the government. In today's dominantly secular Western societies, governmental architecture, especially where the seat of political power lies, is the ultimate form of architectural expression in conveying a sense of identity and underlining a nation's status. Departing with these convictions, this thesis investigates the embodied symbolic power, the representative capacity, and the inherent permanence in contemporary architecture, and in its modes of production. Through a vast study on Modern architectural ideals and heritage -- in parallel to methodologies -- the thesis stimulates the future of large scale governmental building practices and aims to identify and index the key constituents that may respond to the lack of representation in civic architecture in capital cities.
NASA Astrophysics Data System (ADS)
Burnett, W.
2016-12-01
The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability.
The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
Phipps, Eric T.; D'Elia, Marta; Edwards, Harold C.; ...
2017-04-18
Quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).
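The idea of propagating a group of samples together can be sketched with a NumPy analogue. This is only an illustration under stated assumptions: the work described above uses C++ templates within Trilinos, whereas here the ensemble is simply an extra array axis. The toy explicit 1D diffusion solver, the function name `diffuse`, and the parameter values are all hypothetical.

```python
import numpy as np

def diffuse(u0, kappa, steps=50, dt=0.1):
    """Explicit 1D diffusion (unit grid spacing) with boundary values held fixed.

    u0    : (n,) initial state
    kappa : scalar diffusivity, or (s,) array of sampled diffusivities
    Returns an (n,) state for a scalar kappa, or an (n, s) array holding one
    column per sample when kappa is an ensemble.
    """
    kappa = np.asarray(kappa, dtype=float)
    u = np.array(u0, dtype=float)
    if kappa.ndim == 1:
        # Embed the ensemble: every grid value becomes a short vector of samples.
        u = np.repeat(u[:, None], kappa.size, axis=1)
    for _ in range(steps):
        lap = np.zeros_like(u)
        lap[1:-1] = u[2:] - 2.0 * u[1:-1] + u[:-2]
        # One pass over the grid advances all samples at once.
        u = u + dt * kappa * lap
    return u

u0 = np.zeros(11)
u0[5] = 1.0
kappas = np.array([0.5, 1.0, 2.0])   # hypothetical samples of the uncertain input
ensemble = diffuse(u0, kappas)       # shape (11, 3): all samples propagated together
```

The stencil indexing and loop structure are shared by all samples, and the samples for each grid point sit next to each other in memory, which loosely mirrors the reuse and memory-access benefits the abstract describes.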
High Performance Radiation Transport Simulations on TITAN
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M
2012-01-01
In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10-20 petaflop ORNL GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.
Modern Conditions and the Impacts of the Creation of Architectural Environment
NASA Astrophysics Data System (ADS)
Abyzov, Vadym
2017-10-01
The purpose of this research is to identify and analyse the modern conditions and impacts of the creation of the architectural environment and, on this basis, to determine the main directions and tasks of the development of architecture at the appropriate hierarchical levels. A comprehensive review and structural analysis is conducted of all impact factors and current conditions that lead to sustainable architectural design. The main groups of factors and conditions, such as social-economic, natural-geographic, urban, ergonomic, ecological, typological, technical, cultural, and aesthetic, are determined in accordance with their contemporary specifics. This analysis provides an opportunity to define the appropriate hierarchical levels of the modern trends and prospects for creating an effective, attractive and friendly architectural environment. Some examples of the author’s projects and implementations are presented in the article. Such a methodological approach will help to create a holistic view of the creation of the architectural environment, and will make it possible to systematize existing knowledge and concepts, practices and prospects of the means and methods of its formation and development.
Advances in Modern Botnet Understanding and the Accurate Enumeration of Infected Hosts
ERIC Educational Resources Information Center
Nunnery, Christopher Edward
2011-01-01
Botnets remain a potent threat due to evolving modern architectures, inadequate remediation methods, and inaccurate measurement techniques. In response, this research exposes the architectures and operations of two advanced botnets, techniques to enumerate infected hosts, and pursues the scientific refinement of infected-host enumeration data by…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dennig, Yasmin
Sandia National Laboratories has a long history of significant contributions to the high performance community and industry. Our innovative computer architectures allowed the United States to become the first to break the teraFLOP barrier, propelling us to the international spotlight. Our advanced simulation and modeling capabilities have been integral in high consequence US operations such as Operation Burnt Frost. Strong partnerships with industry leaders, such as Cray, Inc. and Goodyear, have enabled them to leverage our high performance computing (HPC) capabilities to gain a tremendous competitive edge in the marketplace. As part of our continuing commitment to providing modern computing infrastructure and systems in support of Sandia missions, we made a major investment in expanding Building 725 to serve as the new home of HPC systems at Sandia. Work is expected to be completed in 2018 and will result in a modern facility of approximately 15,000 square feet of computer center space. The facility will be ready to house the newest National Nuclear Security Administration/Advanced Simulation and Computing (NNSA/ASC) Prototype platform being acquired by Sandia, with delivery in late 2019 or early 2020. This new system will enable continuing advances by Sandia science and engineering staff in the areas of operating system R&D, operation cost effectiveness (power and innovative cooling technologies), user environment and application code performance.
Architecture for an integrated real-time air combat and sensor network simulation
NASA Astrophysics Data System (ADS)
Criswell, Evans A.; Rushing, John; Lin, Hong; Graves, Sara
2007-04-01
An architecture for an integrated air combat and sensor network simulation is presented. The architecture integrates two components: a parallel real-time sensor fusion and target tracking simulation, and an air combat simulation. By integrating these two simulations, it becomes possible to experiment with scenarios in which one or both sides in a battle have very large numbers of primitive passive sensors, and to assess the likely effects of those sensors on the outcome of the battle. Modern Air Power is a real-time theater-level air combat simulation that is currently being used as a part of the USAF Air and Space Basic Course (ASBC). The simulation includes a variety of scenarios from the Vietnam war to the present day, and also includes several hypothetical future scenarios. Modern Air Power includes a scenario editor, an order of battle editor, and full AI customization features that make it possible to quickly construct scenarios for any conflict of interest. The scenario editor makes it possible to place a wide variety of sensors including both high fidelity sensors such as radars, and primitive passive sensors that provide only very limited information. The parallel real-time sensor network simulation is capable of handling very large numbers of sensors on a computing cluster of modest size. It can fuse information provided by disparate sensors to detect and track targets, and produce target tracks.
NASA Astrophysics Data System (ADS)
Yager, Kevin; Albert, Thomas; Brower, Bernard V.; Pellechia, Matthew F.
2015-06-01
The domain of Geospatial Intelligence Analysis is rapidly shifting toward a new paradigm of Activity Based Intelligence (ABI) and information-based Tipping and Cueing. General requirements for an advanced ABIAA system present significant challenges in architectural design, computing resources, data volumes, workflow efficiency, data mining and analysis algorithms, and database structures. These sophisticated ABI software systems must include advanced algorithms that automatically flag activities of interest in less time and within larger data volumes than can be processed by human analysts. In doing this, they must also maintain the geospatial accuracy necessary for cross-correlation of multi-intelligence data sources. Historically, serial architectural workflows have been employed in ABIAA system design for tasking, collection, processing, exploitation, and dissemination. These simpler architectures may produce implementations that solve short term requirements; however, they have serious limitations that preclude them from being used effectively in an automated ABIAA system with multiple data sources. This paper discusses modern ABIAA architectural considerations providing an overview of an advanced ABIAA system and comparisons to legacy systems. It concludes with a recommended strategy and incremental approach to the research, development, and construction of a fully automated ABIAA system.
NASA Astrophysics Data System (ADS)
Lebedev, A. A.; Ivanova, E. G.; Komleva, V. A.; Klokov, N. M.; Komlev, A. A.
2017-01-01
The method considered for teaching the basics of microelectronic amplifier circuits and systems enables students to understand electrical processes more deeply, to understand the relationship between static and dynamic characteristics and, finally, to turn the learning process into a cognitive process. The scheme of problem-based learning can be represented by the following sequence of procedures: the contradiction is perceived and revealed; cognitive motivation is provided by creating a problematic situation (the mental state of the student) that prompts the desire to solve the problem and to ask the question "why?"; a hypothesis is made; searches for solutions are carried out; the answer is sought. Because of the complexity of the architectural schemes, modern methods of computer analysis and synthesis are considered in the work. Examples are given of analog circuits with improved performance engineered by students, within the framework of students' scientific and research work, using standard software and software developed at the Department of Microelectronics, MEPhI.
Timeliness of Creative Subjects in Architecture Education
NASA Astrophysics Data System (ADS)
Vargot, T.
2017-11-01
This article addresses the insufficient number of drawing and painting lessons delivered in architectural education. It compares the education of successful architects of the past and of modern times. The author argues that creative subjects are an essential part of the development and education of future architects. Skills acquired through the study of creative subjects will be used not only as a means of self-expression but also as an instrument in the professional's toolkit. Sergei Tchoban is taken as an example of a successful architect for whom knowledge of hand drawing is very important. He organizes architectural drawing contests for students, promoting creative development in this way. Nowadays, students tend to use computer programs to make architectural projects, losing their individual approach. The creative process becomes a matter of scissors and paste, a mere copy of something that already exists. The proposed solution is to reconsider the department's curriculum and add extra hours for creative subjects.
HYDRA : High-speed simulation architecture for precision spacecraft formation simulation
NASA Technical Reports Server (NTRS)
Martin, Bryan J.; Sohl, Garett.
2003-01-01
The Hierarchical Distributed Reconfigurable Architecture (Hydra) is a scalable simulation architecture that provides flexibility and ease of use and takes advantage of modern computation and communication hardware. It also provides the ability to implement distributed (or workstation-based) simulations and high-fidelity real-time simulation from a common core. Originally designed to serve as a research platform for examining fundamental challenges in formation flying simulation for future space missions, it is also finding use in other missions and applications, all of which can take advantage of the underlying object-oriented structure to easily produce distributed simulations. Hydra automates the process of connecting disparate simulation components (Hydra Clients) through a client-server architecture that uses high-level descriptions of data associated with each client to find and forge desirable connections (Hydra Services) at run time. Services communicate through the use of Connectors, which abstract messaging to provide single-interface access to any desired communication protocol, from shared-memory message passing to TCP/IP to ACE and CORBA. Hydra shares many features with the HLA, while providing more flexibility in connectivity services and behavior overriding.
NASA Astrophysics Data System (ADS)
Guzman, J. C.; Bennett, T.
2008-08-01
The Convergent Radio Astronomy Demonstrator (CONRAD) is a collaboration between the computing teams of two SKA pathfinder instruments, MeerKAT (South Africa) and ASKAP (Australia). Our goal is to produce the required common software to operate, process and store the data from the two instruments. Both instruments are synthesis arrays composed of a large number of antennas (40 - 100) operating at centimeter wavelengths with wide-field capabilities. Key challenges are the processing of high volume of data in real-time as well as the remote mode of operations. Here we present the software architecture for CONRAD. Our design approach is to maximize the use of open solutions and third-party software widely deployed in commercial applications, such as SNMP and LDAP, and to utilize modern web-based technologies for the user interfaces, such as AJAX.
Accelerating next generation sequencing data analysis with system level optimizations.
Kathiresan, Nagarajan; Temanni, Ramzi; Almabrazi, Hakeem; Syed, Najeeb; Jithesh, Puthen V; Al-Ali, Rashid
2017-08-22
Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features of modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system level parameters before running the application. We studied the GATK HaplotypeCaller, which is part of common NGS workflows and consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked, and the execution time of HaplotypeCaller was optimized through various system level parameters, which included: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing; (ii) architecture-specific tuning in the PairHMM library for vectorization; (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer; and (iv) overriding the default 'on-demand' CPU frequency mode with 'performance' mode to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7 respectively.
Fractional Steps methods for transient problems on commodity computer architectures
NASA Astrophysics Data System (ADS)
Krotkiewski, M.; Dabrowski, M.; Podladchikov, Y. Y.
2008-12-01
Fractional Steps methods are suitable for modeling transient processes that are central to many geological applications. Low memory requirements and modest computational complexity facilitate calculations on high-resolution three-dimensional models. An efficient implementation of Alternating Direction Implicit/Locally One-Dimensional schemes for an Opteron-based shared memory system is presented. The memory bandwidth usage, the main bottleneck on modern computer architectures, is specially addressed. High efficiency of above 2 GFlops per CPU is sustained for problems of 1 billion degrees of freedom. The optimized sequential implementation of all 1D sweeps is comparable in execution time to copying the used data in the memory. Scalability of the parallel implementation on up to 8 CPUs is close to perfect. Performing one timestep of the Locally One-Dimensional scheme on a system of $1000^3$ unknowns on 8 CPUs takes only 11 s. We validate the LOD scheme using a computational model of an isolated inclusion subject to a constant far field flux. Next, we study numerically the evolution of a diffusion front and the effective thermal conductivity of composites consisting of multiple inclusions and compare the results with predictions based on the differential effective medium approach. Finally, application of the developed parabolic solver is suggested for a real-world problem of fluid transport and reactions inside a reservoir.
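The idea of a Locally One-Dimensional step can be illustrated with a small sketch. It is not the authors' optimized Opteron implementation and does not address memory bandwidth. It assumes a unit-diffusivity heat equation on a uniform grid with zero-flux boundaries, uses backward Euler in each direction, and solves every 1D line with the Thomas algorithm. The names `thomas`, `implicit_sweep`, and `lod_step`, and the parameter `r = dt / h**2`, are illustrative assumptions.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal).

    a[0] and c[-1] are ignored.
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def implicit_sweep(line, r):
    """One backward-Euler 1D diffusion solve with zero-flux ends (assumed BC)."""
    n = len(line)
    a = [-r] * n
    c = [-r] * n
    b = [1.0 + 2.0 * r] * n
    b[0] = b[-1] = 1.0 + r  # zero-flux boundary rows
    return thomas(a, b, c, line)


def lod_step(u, r):
    """Advance u[i][j][k] by one time step: 1D sweeps along x, then y, then z."""
    nx, ny, nz = len(u), len(u[0]), len(u[0][0])
    for j in range(ny):          # sweeps along x
        for k in range(nz):
            new = implicit_sweep([u[i][j][k] for i in range(nx)], r)
            for i in range(nx):
                u[i][j][k] = new[i]
    for i in range(nx):          # sweeps along y
        for k in range(nz):
            new = implicit_sweep([u[i][j][k] for j in range(ny)], r)
            for j in range(ny):
                u[i][j][k] = new[j]
    for i in range(nx):          # sweeps along z (lines are contiguous lists)
        for j in range(ny):
            u[i][j] = implicit_sweep(u[i][j], r)
    return u
```

Each sweep only touches one line of unknowns at a time, which is why such schemes have low memory requirements; in this sketch the zero-flux boundaries also mean the sum of all grid values is preserved by every step.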
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators
Wang, Wei; Xu, Lifan; Cavazos, John; Huang, Howie H.; Kay, Matthew
2014-01-01
Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case-study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than speedup above the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation provided speedups on GPUs of at least faster than the sequential implementation and faster than a parallelized OpenMP implementation. An implementation of OpenMP on Intel MIC coprocessor provided speedups of with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. 
Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media. PMID:24497950
The efficiency of geophysical adjoint codes generated by automatic differentiation tools
NASA Astrophysics Data System (ADS)
Vlasenko, A. V.; Köhl, A.; Stammer, D.
2016-02-01
The accuracy of numerical models that describe complex physical or chemical processes depends on the choice of model parameters. Estimating an optimal set of parameters by optimization algorithms requires knowledge of the sensitivity of the process of interest to model parameters. Typically the sensitivity computation involves differentiation of the model, which can be performed by applying algorithmic differentiation (AD) tools to the underlying numerical code. However, existing AD tools differ substantially in design, legibility and computational efficiency. In this study we show that, for geophysical data assimilation problems of varying complexity, the performance of adjoint codes generated by the existing AD tools (i) Open_AD, (ii) Tapenade, (iii) NAGWare and (iv) Transformation of Algorithms in Fortran (TAF) can be vastly different. Based on simple test problems, we evaluate the efficiency of each AD tool with respect to computational speed, accuracy of the adjoint, the efficiency of memory usage, and the capability of each AD tool to handle modern FORTRAN 90-95 elements such as structures and pointers, which are new elements that either combine groups of variables or provide aliases to memory addresses, respectively. We show that, while operator overloading tools are the only ones suitable for modern codes written in object-oriented programming languages, their computational efficiency lags behind source transformation by orders of magnitude, rendering the application of these modern tools to practical assimilation problems prohibitive. In contrast, the application of source transformation tools appears to be the most efficient choice, allowing handling even large geophysical data assimilation problems. However, they can only be applied to numerical models written in earlier generations of programming languages. 
Our study indicates that applying existing AD tools to realistic geophysical problems faces limitations that urgently need to be solved to allow the continuous use of AD tools for solving geophysical problems on modern computer architectures.
Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors
NASA Astrophysics Data System (ADS)
Yi, Hongsuk
2014-03-01
Many modern processors are capable of exploiting data-level parallelism through single instruction, multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high performance computing. In this paper, we develop a hierarchical parallelization scheme for accelerated molecular dynamics simulations with Tersoff potentials for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism, combining tightly coupled thread-level and task-level parallelism with 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementations on Xeon Phi is superior to that on x86 CPU architectures.
Advanced Architectures for Modern Weather/Multifunction Radars
2017-03-01
Advanced Architectures for Modern Weather/Multifunction Radars Caleb Fulton The University of Oklahoma Advanced Radar Research Center Norman...and all of them are addressing the need to lower cost while improving beamforming flexibility in future weather radar systems that will be tasked...with multiple non-weather functions. Keywords: Phased arrays, digital beamforming, multifunction radar. Introduction and Overview As the performance
NASA Astrophysics Data System (ADS)
Mezzino, D.; Pei, W.; Santana Quintero, M.; Reyes Rodriguez, R.
2015-08-01
This contribution describes the results of an International workshop on documentation of historic and cultural heritage developed jointly between Universidad de Guadalajara's Centro Universitario de Arte, Arquitectura y Diseño (CUAAD) and Carleton University's Architectural Conservation and Sustainability Program. The objective of the workshop was to create a learning environment for emerging heritage professionals through the use of advanced recording techniques for the documentation of modern architectural heritage in Guadalajara, Mexico. The selected site was Casa Cristo, one of the several architectural projects by Luis Barragán in Guadalajara. The house was built between 1927 and 1929 for Gustavo R. Cristo, mayor of the city. The style of the building reflects the European influences derived from the architect's travel experience, as well as the close connection with local craftsmanship. All of these make the house an outstanding example of modern regional architecture. A systematic documentation strategy was developed for the site, using different survey equipment and techniques to capture the shape, colour, spatial configuration, and current conditions of Casa Cristo for its eventual rehabilitation and conservation.
NASA Astrophysics Data System (ADS)
Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A.; Oliveira, Micael J. T.; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G.; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A. L.
2012-06-01
Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
Andrade, Xavier; Alberdi-Rodriguez, Joseba; Strubbe, David A; Oliveira, Micael J T; Nogueira, Fernando; Castro, Alberto; Muguerza, Javier; Arruabarrena, Agustin; Louie, Steven G; Aspuru-Guzik, Alán; Rubio, Angel; Marques, Miguel A L
2012-06-13
Octopus is a general-purpose density-functional theory (DFT) code, with a particular emphasis on the time-dependent version of DFT (TDDFT). In this paper we present the ongoing efforts to achieve the parallelization of octopus. We focus on the real-time variant of TDDFT, where the time-dependent Kohn-Sham equations are directly propagated in time. This approach has great potential for execution in massively parallel systems such as modern supercomputers with thousands of processors and graphics processing units (GPUs). For harvesting the potential of conventional supercomputers, the main strategy is a multi-level parallelization scheme that combines the inherent scalability of real-time TDDFT with a real-space grid domain-partitioning approach. A scalable Poisson solver is critical for the efficiency of this scheme. For GPUs, we show how using blocks of Kohn-Sham states provides the required level of data parallelism and that this strategy is also applicable for code optimization on standard processors. Our results show that real-time TDDFT, as implemented in octopus, can be the method of choice for studying the excited states of large molecular systems in modern parallel architectures.
Accelerating scientific computations with mixed precision algorithms
NASA Astrophysics Data System (ADS)
Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire
2009-12-01
On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
Program summary
Program title: ITER-REF
Catalogue identifier: AECO_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 7211
No. of bytes in distributed program, including test data, etc.: 41 862
Distribution format: tar.gz
Programming language: FORTRAN 77
Computer: desktop, server
Operating system: Unix/Linux
RAM: 512 Mbytes
Classification: 4.8
External routines: BLAS (optional)
Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.
Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability, resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration; this yields the method commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.
Running time: seconds/minutes
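The solution method described above can be sketched in a few lines. This is an illustration only, not the ITER-REF FORTRAN 77 code: it assumes a small dense system, emulates 32-bit arithmetic in pure Python by rounding every intermediate result through `struct`, and keeps the residual and the solution update in 64-bit. The function names (`f32`, `lu_factor_f32`, `lu_solve_f32`, `refine`) and the stopping tolerance are hypothetical.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """Factor PA = LU with partial pivoting, rounding each operation to 32-bit."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    perm = list(range(n))  # perm[i] = original index of row i of PA
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve_f32(LU, perm, b):
    """Solve Ly = Pb, then Ux = y, in emulated 32-bit arithmetic."""
    n = len(LU)
    x = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):                      # forward substitution
        for j in range(i):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
    for i in reversed(range(n)):            # backward substitution
        for j in range(i + 1, n):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
        x[i] = f32(x[i] / LU[i][i])
    return x


def refine(A, b, max_iter=20, tol=1e-14):
    """Iterative refinement: 32-bit factorization, 64-bit residual and update."""
    LU, perm = lu_factor_f32(A)             # the expensive step, done once
    x = lu_solve_f32(LU, perm, b)
    for _ in range(max_iter):
        r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(LU, perm, r)       # cheap correction solve
        x = [xi + di for xi, di in zip(x, d)]
    return x
```

The design point is that the cubic-cost factorization is done once in the faster precision, while each refinement iteration only needs a quadratic-cost residual and triangular solves.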
Sensing and Measurement Architecture for Grid Modernization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taft, Jeffrey D.; De Martini, Paul
2016-02-01
This paper addresses architecture for grid sensor networks, with primary emphasis on distribution grids. It describes a forward-looking view of sensor network architecture for advanced distribution grids, and discusses key regulatory, financial, and planning issues.
Dreams and creative problem-solving.
Barrett, Deirdre
2017-10-01
Dreams have produced art, music, novels, films, mathematical proofs, designs for architecture, telescopes, and computers. Dreaming is essentially our brain thinking in another neurophysiologic state-and therefore it is likely to solve some problems on which our waking minds have become stuck. This neurophysiologic state is characterized by high activity in brain areas associated with imagery, so problems requiring vivid visualization are also more likely to get help from dreaming. This article reviews great historical dreams and modern laboratory research to suggest how dreams can aid creativity and problem-solving. © 2017 New York Academy of Sciences.
Global magnetohydrodynamic simulations on multiple GPUs
NASA Astrophysics Data System (ADS)
Wong, Un-Hong; Wong, Hon-Cheng; Ma, Yonghui
2014-01-01
Global magnetohydrodynamic (MHD) models play the major role in investigating the solar wind-magnetosphere interaction. However, the huge computation requirement in global MHD simulations is also the main problem that needs to be solved. With the recent development of modern graphics processing units (GPUs) and the Compute Unified Device Architecture (CUDA), it is possible to perform global MHD simulations in a more efficient manner. In this paper, we present a global magnetohydrodynamic (MHD) simulator on multiple GPUs using CUDA 4.0 with GPUDirect 2.0. Our implementation is based on the modified leapfrog scheme, which is a combination of the leapfrog scheme and the two-step Lax-Wendroff scheme. GPUDirect 2.0 is used in our implementation to drive multiple GPUs. All data transferring and kernel processing are managed with CUDA 4.0 API instead of using MPI or OpenMP. Performance measurements are made on a multi-GPU system with eight NVIDIA Tesla M2050 (Fermi architecture) graphics cards. These measurements show that our multi-GPU implementation achieves a peak performance of 97.36 GFLOPS in double precision.
Supercomputing resources empowering superstack with interactive and integrated systems
NASA Astrophysics Data System (ADS)
Rückemann, Claus-Peter
2012-09-01
This paper presents the results from the development and implementation of Superstack algorithms to be dynamically used with integrated systems and supercomputing resources. Processing of geophysical data, thus named geoprocessing, is an essential part of the analysis of geoscientific data. The theory of Superstack algorithms and the practical application on modern computing architectures was inspired by developments introduced with processing of seismic data on mainframes and within the last years leading to high end scientific computing applications. There are several stacking algorithms known but with low signal to noise ratio in seismic data the use of iterative algorithms like the Superstack can support analysis and interpretation. The new Superstack algorithms are in use with wave theory and optical phenomena on highly performant computing resources for huge data sets as well as for sophisticated application scenarios in geosciences and archaeology.
DNS of Flow in a Low-Pressure Turbine Cascade Using a Discontinuous-Galerkin Spectral-Element Method
NASA Technical Reports Server (NTRS)
Garai, Anirban; Diosady, Laslo Tibor; Murman, Scott; Madavan, Nateri
2015-01-01
A new computational capability under development for accurate and efficient high-fidelity direct numerical simulation (DNS) and large eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy and is implemented in a computationally efficient manner on a modern high performance computer architecture. A validation study using this method to perform DNS of flow in a low-pressure turbine airfoil cascade is presented. Preliminary results indicate that the method captures the main features of the flow. Discrepancies between the predicted results and the experiments are likely due to the effects of freestream turbulence not being included in the simulation and will be addressed in the final paper.
Power estimation on functional level for programmable processors
NASA Astrophysics Data System (ADS)
Schneider, M.; Blume, H.; Noll, T. G.
2004-05-01
In this contribution, different approaches to power estimation for programmable processors are presented and evaluated with respect to their applicability to modern digital signal processor architectures such as Very Long Instruction Word (VLIW) architectures. Special emphasis is placed on the concept of Functional-Level Power Analysis (FLPA). This approach is based on the separation of the processor architecture into functional blocks such as the processing unit, clock network, internal memory and others. The power consumption of these blocks is described by parameter-dependent arithmetic model functions. The input parameters of these functions, such as the achieved degree of parallelism or the kind and number of memory accesses, are obtained by an automated, parser-based analysis of the assembler code of the system to be estimated. The approach is demonstrated and evaluated using two modern digital signal processors and a variety of basic digital signal processing algorithms. The estimates for the individual algorithms are compared with physically measured values, yielding a maximum estimation error of 3%. Correspondence to: H. Blume (blume@eecs.rwth-aachen.de)
Highly parallel implementation of non-adiabatic Ehrenfest molecular dynamics
NASA Astrophysics Data System (ADS)
Kanai, Yosuke; Schleife, Andre; Draeger, Erik; Anisimov, Victor; Correa, Alfredo
2014-03-01
While the adiabatic Born-Oppenheimer approximation tremendously lowers computational effort, many questions in modern physics, chemistry, and materials science require an explicit description of coupled non-adiabatic electron-ion dynamics. Electronic stopping, i.e. the energy transfer of a fast projectile atom to the electronic system of the target material, is a notorious example. We recently implemented real-time time-dependent density functional theory based on the plane-wave pseudopotential formalism in the Qbox/qb@ll codes. We demonstrate that explicit integration using a fourth-order Runge-Kutta scheme is very suitable for modern highly parallelized supercomputers. Applying the new implementation to systems with hundreds of atoms and thousands of electrons, we achieved excellent performance and scalability on a large number of nodes both on the BlueGene-based "Sequoia" system at LLNL and on the Cray architecture of "Blue Waters" at NCSA. As an example, we discuss our work on computing the electronic stopping power of aluminum and gold for hydrogen projectiles, showing excellent agreement with experiment. These first-principles calculations allow us to gain important insight into the fundamental physics of electronic stopping.
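As a rough illustration of explicit fourth-order Runge-Kutta time propagation, the sketch below integrates a time-dependent Schrödinger-type equation, dψ/dt = -iHψ with ħ = 1, for a tiny fixed Hamiltonian matrix. It is not the Qbox/qb@ll implementation: in real-time TDDFT the Hamiltonian depends on the evolving density and acts on plane-wave coefficients. The names `apply_h`, `rk4_step`, and `propagate` and the two-level example are assumptions made for illustration.

```python
import math


def apply_h(H, psi):
    """Matrix-vector product H @ psi for a small dense Hamiltonian."""
    return [sum(hij * pj for hij, pj in zip(row, psi)) for row in H]


def rk4_step(H, psi, dt):
    """One explicit RK4 step for d(psi)/dt = -i H psi (hbar = 1 assumed)."""
    def rhs(v):
        return [-1j * x for x in apply_h(H, v)]

    k1 = rhs(psi)
    k2 = rhs([p + 0.5 * dt * k for p, k in zip(psi, k1)])
    k3 = rhs([p + 0.5 * dt * k for p, k in zip(psi, k2)])
    k4 = rhs([p + dt * k for p, k in zip(psi, k3)])
    return [
        p + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for p, a, b, c, d in zip(psi, k1, k2, k3, k4)
    ]


def propagate(H, psi0, dt, steps):
    """Apply rk4_step repeatedly, returning the final state."""
    psi = list(psi0)
    for _ in range(steps):
        psi = rk4_step(H, psi, dt)
    return psi


# Hypothetical two-level system; the exact solution is (cos t, -i sin t).
H = [[0.0, 1.0], [1.0, 0.0]]
psi = propagate(H, [1.0 + 0.0j, 0.0j], dt=0.001, steps=1000)
norm = math.sqrt(sum(abs(p) ** 2 for p in psi))
```

Each step needs only four applications of the Hamiltonian and no linear solves, which is part of what makes explicit schemes attractive on highly parallel machines; the trade-off is that the norm is conserved only approximately and the time step must stay small.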
A Modern Take on the RV Classics: N-body Analysis of GJ 876 and 55 Cnc
NASA Astrophysics Data System (ADS)
Nelson, Benjamin E.; Ford, E. B.; Wright, J.
2013-01-01
Over the past two decades, radial velocity (RV) observations have uncovered a diverse population of exoplanet systems, in particular a subset of multi-planet systems that exhibit strong dynamical interactions. To extract the model parameters (and uncertainties) accurately from these observations, one requires self-consistent n-body integrations and must explore a high-dimensional (7 x number of planets) parameter space, both of which are computationally challenging. Utilizing the power of modern computing resources, we apply our Radial velocity Using N-body Differential Evolution Markov Chain Monte Carlo code (RUN DEMCMC) to two landmark systems from early exoplanet surveys: GJ 876 and 55 Cnc. For GJ 876, we analyze the Keck HIRES (Rivera et al. 2010) and HARPS (Correia et al. 2010) data and constrain the distribution of the Laplace argument. For 55 Cnc, we investigate the orbital architecture based on a cumulative 1086 RV observations from various sources and transit constraints from Winn et al. 2011. In both cases, we also test for long-term orbital stability.
Building Paradigms: Major Transformations in School Architecture (1798-2009)
ERIC Educational Resources Information Center
Gislason, Neil
2009-01-01
This article provides an historical overview of significant trends in school architecture from 1798 to the present. I divide the history of school architecture into two major phases. The first period falls between 1798 and 1921: the modern graded classroom emerged as a standard architectural feature during this period. The second period, which…
NASA Astrophysics Data System (ADS)
Macikowski, Bartosz
2017-10-01
Modernism is usually recognized and associated with the aesthetics of the International Style, represented by white-plastered, horizontally articulated architecture with skimpy decoration, where function was the main imperative of the architects’ ambitions. In Northern Europe though, Modernism also revealed its brick face, representing different manners, styles, and appearances. The brick face of Modernism reflected, in fact, the complexity of the modern change, breaking ties with the historic styles of the 19th century and being still present in the beginning of the 20th century. Regardless of the cosmopolitan character of the International Style and its unified aesthetics, architects tried to find and keep shades of individuality. This was especially visible in the references to either regional or even local traditions. This diversity of modernistic architecture is intensified by its different functions. The language of industrial architecture derives its forms directly from its nature of pure functional idiom, devoted to economic and functional optimization. The industrial form usually seems subordinate to the technical nature of objects. But regardless of that, in the 19th century and the first half of the 20th century we can observe an interesting evolution of styles and tendencies in industrial architecture, even in such a narrow and specific field like the architecture of small hydropower plants. The purpose of the research was to recognize the evolution of the architectural form of hydropower plants as a developing branch of industry in the first half of the 20th century. In Pomerania, during this period, a dynamic growth of investments took place, which concerned the use of the Pomeranian rivers’ potential to produce electric energy. At the end of the 19th century, electricity had a strong meaning as a symbol of a radical civilizational change, which influenced also the aesthetic aspects of architecture. 
This could suggest that the architecture of hydropower plants should be one of the carriers of the new progressive architecture. In fact, in the case of the Pomeranian hydropower plants, their technical solutions were among the most advanced and progressive solutions of those times, sometimes even experimental, adjusted to the diversity of local geographical conditions. Regardless of that, the architecture of the Pomeranian power plants was rather reflecting the diversity and dynamism of the aesthetic discourse of the time (sometimes even representing and adopting traditional or historical forms). The cascade of the power plants Podgaje (1928), Jastrowie (1930), and Ptusza (1930), all part of the same investment on the river Gwda, can be the example of the absorption and development of new aesthetic trends within the same stream of clinker architecture. The paper describes selected examples of Pomeranian power plants as a comparative study which could illustrate the evolution of the brick architecture of the beginning of the 20th century.
Challenges of Big Data Analysis.
Fan, Jianqing; Han, Fang; Liu, Han
2014-06-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
Challenges of Big Data Analysis
Fan, Jianqing; Han, Fang; Liu, Han
2014-01-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This article gives an overview of the salient features of Big Data and how these features drive paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469
Computational sciences in the upstream oil and gas industry
Halsey, Thomas C.
2016-01-01
The predominant technical challenge of the upstream oil and gas industry has always been the fundamental uncertainty of the subsurface from which it produces hydrocarbon fluids. The subsurface can be detected remotely by, for example, seismic waves, or it can be penetrated and studied in the extremely limited vicinity of wells. Inevitably, a great deal of uncertainty remains. Computational sciences have been a key avenue to reduce and manage this uncertainty. In this review, we discuss at a relatively non-technical level the current state of three applications of computational sciences in the industry. The first of these is seismic imaging, which is currently being revolutionized by the emergence of full wavefield inversion, enabled by algorithmic advances and petascale computing. The second is reservoir simulation, also being advanced through the use of modern highly parallel computing architectures. Finally, we comment on the role of data analytics in the upstream industry. This article is part of the themed issue ‘Energy and the subsurface’. PMID:27597785
US NDC Modernization: Service Oriented Architecture Study Status
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamlet, Benjamin R.; Encarnacao, Andre Villanova; Harris, James M.
2014-12-01
This report is a progress update on the USNDC Modernization Service Oriented Architecture (SOA) study describing results from Inception Iteration 1, which occurred between October 2012 and March 2013. The goals during this phase are 1) discovering components of the system that have potential service implementations, 2) identifying applicable SOA patterns for data access, service interfaces, and service orchestration/choreography, and 3) understanding performance tradeoffs for various SOA patterns.
IAServ: an intelligent home care web services platform in a cloud for aging-in-place.
Su, Chuan-Jun; Chiang, Chang-Yu
2013-11-12
As the elderly population has been rapidly expanding and the core tax-paying population has been shrinking, the need for adequate elderly health and housing services continues to grow while the resources to provide such services are becoming increasingly scarce. Thus, increasing the efficiency of the delivery of healthcare services through the use of modern technology is a pressing issue. The seamless integration of such enabling technologies as ontology, intelligent agents, web services, and cloud computing is transforming healthcare from hospital-based treatments to home-based self-care and preventive care. A ubiquitous healthcare platform based on this technological integration, one that synergizes service providers with patients' needs, must be developed to provide personalized healthcare services at the right time, in the right place, and in the right manner. This paper presents the development and overall architecture of IAServ (the Intelligent Aging-in-place Home care Web Services Platform) to provide personalized healthcare service ubiquitously in a cloud computing setting to support the most desirable and cost-efficient method of care for the aged: aging in place. IAServ is expected to offer intelligent, pervasive, accurate, and contextually aware personal care services. Architecturally, the implemented IAServ leverages web services and cloud computing to provide economic, scalable, and robust healthcare services over the Internet.
IAServ: An Intelligent Home Care Web Services Platform in a Cloud for Aging-in-Place
Su, Chuan-Jun; Chiang, Chang-Yu
2013-01-01
As the elderly population has been rapidly expanding and the core tax-paying population has been shrinking, the need for adequate elderly health and housing services continues to grow while the resources to provide such services are becoming increasingly scarce. Thus, increasing the efficiency of the delivery of healthcare services through the use of modern technology is a pressing issue. The seamless integration of such enabling technologies as ontology, intelligent agents, web services, and cloud computing is transforming healthcare from hospital-based treatments to home-based self-care and preventive care. A ubiquitous healthcare platform based on this technological integration, one that synergizes service providers with patients’ needs, must be developed to provide personalized healthcare services at the right time, in the right place, and in the right manner. This paper presents the development and overall architecture of IAServ (the Intelligent Aging-in-place Home care Web Services Platform) to provide personalized healthcare service ubiquitously in a cloud computing setting to support the most desirable and cost-efficient method of care for the aged: aging in place. IAServ is expected to offer intelligent, pervasive, accurate, and contextually aware personal care services. Architecturally, the implemented IAServ leverages web services and cloud computing to provide economic, scalable, and robust healthcare services over the Internet. PMID:24225647
NASA Astrophysics Data System (ADS)
Pane, I. F.; Loebis, M. N.; Azhari, I.; Ginting, N.; Harisdani, D. D.
2018-02-01
The development of Modern Architecture, especially in Europe, was inseparable from the influence of the Avant-Garde, which had a remarkable effect on the mindset of the time. This influence was felt not only in art and aesthetics but also in the development of architectural theories, and the new theories produced new forms in the world of architecture. The transition from the Renaissance to the Modern era was marked by Eclectic Architecture, in which buildings still took classical forms. Classical ornaments were still often used, so the outward appearance of buildings resembled that of the preceding period. Although the concrete skeleton had already been introduced as a new technology, buildings still looked as they had in the classical period. This changed after the Eclectic period, because views on life had changed. The development of modern architecture in Europe was influenced by the Functionalism and International Style movements, which characterize an architecture freed from the past. These movements were also brought to Indonesia, to big cities such as Jakarta, Surabaya, and Medan. This study aims to trace their influence in several colonial buildings built in Medan during the rule of the Dutch East Indies. A qualitative approach, drawing on existing theory, is used to clarify the effect of the movements and to examine the significant relationship between the colonial buildings and the development of Functionalism and International Style in Europe.
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-01-01
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1920 × 1080 @ 60 fps) possible. The structure of the optical flow module, as well as the pre- and post-filtering blocks and a flow reliability computation unit, is described in detail. Three versions of the optical flow module, differing in numerical precision, working frequency, and accuracy of the results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single-scale methods. The designed fixed-point architecture achieves a performance of 418 GOPS with a power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with a power efficiency of 24 GFLOPS/W. Moreover, a 100-fold speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
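For orientation, the per-pixel update at the heart of the iterative Horn-Schunck algorithm can be sketched as follows. This is the textbook software form, not the pipelined FPGA design of the article; `ix`, `iy`, `it` denote the image derivatives, `alpha` the smoothness weight, and the function names are hypothetical.

```python
def horn_schunck_update(ix, iy, it, u_avg, v_avg, alpha):
    """One Horn-Schunck update of the flow (u, v) at a single pixel.

    u_avg and v_avg are the neighbourhood averages of the previous estimate.
    """
    residual = ix * u_avg + iy * v_avg + it
    denom = alpha ** 2 + ix ** 2 + iy ** 2
    return u_avg - ix * residual / denom, v_avg - iy * residual / denom

def iterate_uniform(ix, iy, it, alpha, n_iter):
    """Iterate on a spatially uniform field, where each average equals the value itself."""
    u = v = 0.0
    for _ in range(n_iter):
        u, v = horn_schunck_update(ix, iy, it, u, v, alpha)
    return u, v

# A pattern moving one pixel per frame along x satisfies ix * u + it = 0 with u = 1.
u, v = iterate_uniform(1.0, 0.0, -1.0, 1.0, 60)
```

Every iteration depends on the averaged result of the previous one, which is why unrolling the iterations into pipeline stages is a natural hardware strategy.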
Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek
2014-02-12
This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1920 × 1080 @ 60 fps) possible. The structure of the optical flow module, as well as the pre- and post-filtering blocks and a flow reliability computation unit, is described in detail. Three versions of the optical flow module, differing in numerical precision, working frequency, and accuracy of the results, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single-scale methods. The designed fixed-point architecture achieves a performance of 418 GOPS with a power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with a power efficiency of 24 GFLOPS/W. Moreover, a 100-fold speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems.
Computers in Academic Architecture Libraries.
ERIC Educational Resources Information Center
Willis, Alfred; And Others
1992-01-01
Computers are widely used in architectural research and teaching in U.S. schools of architecture. A survey of libraries serving these schools sought information on the emphasis placed on computers by the architectural curriculum, accessibility of computers to library staff, and accessibility of computers to library patrons. Survey results and…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, Haley; BC Cancer Agency, Surrey, B.C.; BC Cancer Agency, Vancouver, B.C.
2014-08-15
Many have speculated about the future of computational technology in clinical radiation oncology. It has been advocated that the next generation of computational infrastructure will improve on the current generation by incorporating richer aspects of automation, more heavily and seamlessly featuring distributed and parallel computation, and providing more flexibility toward aggregate data analysis. In this report we describe how a recently created (but currently existing) analysis framework (DICOMautomaton) incorporates these aspects. DICOMautomaton supports a variety of use cases but is especially suited for dosimetric outcomes correlation analysis, investigation and comparison of radiotherapy treatment efficacy, and dose-volume computation. We describe: how it overcomes computational bottlenecks by distributing workload across a network of machines; how modern, asynchronous computational techniques are used to reduce blocking and avoid unnecessary computation; and how issues of out-of-date data are addressed using reactive programming techniques and data dependency chains. We describe the internal architecture of the software and give a detailed demonstration of how DICOMautomaton could be used to search for correlations between dosimetric and outcomes data.
High-Performance 3D Compressive Sensing MRI Reconstruction Using Many-Core Architectures.
Kim, Daehyun; Trzasko, Joshua; Smelyanskiy, Mikhail; Haider, Clifton; Dubey, Pradeep; Manduca, Armando
2011-01-01
Compressive sensing (CS) describes how sparse signals can be accurately reconstructed from many fewer samples than required by the Nyquist criterion. Since MRI scan duration is proportional to the number of acquired samples, CS has been gaining significant attention in MRI. However, the computationally intensive nature of CS reconstructions has precluded their use in routine clinical practice. In this work, we investigate how different throughput-oriented architectures can benefit one CS algorithm and what levels of acceleration are feasible on different modern platforms. We demonstrate that a CUDA-based code running on an NVIDIA Tesla C2050 GPU can reconstruct a 256 × 160 × 80 volume from an 8-channel acquisition in 19 seconds, which is in itself a significant improvement over the state of the art. We then show that Intel's Knights Ferry can perform the same 3D MRI reconstruction in only 12 seconds, bringing CS methods even closer to clinical viability.
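The abstract does not name the reconstruction algorithm, so the following is only a generic illustration of sparse recovery: iterative soft-thresholding (ISTA) for the problem min 0.5·||Ax − y||² + λ·||x||₁. The function names are hypothetical, and the step size is assumed to be no larger than the reciprocal of the largest eigenvalue of AᵀA.

```python
import math

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is the proximal map of the l1 norm."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def ista(matrix, measurements, lam, step, n_iter):
    """Iterative soft-thresholding for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual A x - y, then gradient A^T (A x - y) of the data-fidelity term.
        residual = [sum(matrix[i][j] * x[j] for j in range(n_cols)) - measurements[i]
                    for i in range(n_rows)]
        gradient = [sum(matrix[i][j] * residual[i] for i in range(n_rows))
                    for j in range(n_cols)]
        x = [soft_threshold(x[j] - step * gradient[j], step * lam)
             for j in range(n_cols)]
    return x

# With an identity measurement matrix the solution is just the thresholded data.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_hat = ista(identity, [3.0, -0.5, 0.2], lam=1.0, step=1.0, n_iter=5)
```

The cost per iteration is dominated by the two matrix-vector products, which in MRI become Fourier transforms over the volume; that repeated, regular workload is what throughput-oriented hardware accelerates.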
Cosmological neutrino simulations at extreme scale
Emberson, J. D.; Yu, Hao-Ran; Inman, Derek; ...
2017-08-01
Constraining neutrino mass remains an elusive challenge in modern physics. Precision measurements are expected from several upcoming cosmological probes of large-scale structure. Achieving this goal relies on an equal level of precision from theoretical predictions of neutrino clustering. Numerical simulations of the non-linear evolution of cold dark matter and neutrinos play a pivotal role in this process. We incorporate neutrinos into the cosmological N-body code CUBEP3M and discuss the challenges associated with pushing to the extreme scales demanded by the neutrino problem. We highlight code optimizations made to exploit modern high performance computing architectures and present a novel method of data compression that reduces the phase-space particle footprint from 24 bytes in single precision to roughly 9 bytes. We scale the neutrino problem to the Tianhe-2 supercomputer and provide details of our production run, named TianNu, which uses 86% of the machine (13,824 compute nodes). With a total of 2.97 trillion particles, TianNu is currently the world’s largest cosmological N-body simulation and improves upon previous neutrino simulations by two orders of magnitude in scale. We finish with a discussion of the unanticipated computational challenges that were encountered during the TianNu runtime.
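To give a sense of scale for the quoted compression, the arithmetic below estimates the phase-space memory footprint at 24 and at 9 bytes per particle. It is a back-of-the-envelope sketch that assumes all 2.97 trillion particles are stored at the same per-particle cost and uses decimal terabytes; the function name is hypothetical.

```python
def footprint_terabytes(n_particles, bytes_per_particle):
    """Total particle storage in decimal terabytes (1 TB = 1e12 bytes)."""
    return n_particles * bytes_per_particle / 1e12

N_PARTICLES = 2.97e12  # total particle count quoted in the abstract

uncompressed_tb = footprint_terabytes(N_PARTICLES, 24)  # about 71 TB
compressed_tb = footprint_terabytes(N_PARTICLES, 9)     # about 27 TB
saving_fraction = 1 - 9 / 24                            # 0.625
```

Under these assumptions the compression frees roughly five eighths of the particle memory.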
The Osseus platform: a prototype for advanced web-based distributed simulation
NASA Astrophysics Data System (ADS)
Franceschini, Derrick; Riecken, Mark
2016-05-01
Recent technological advances in web-based distributed computing and database technology have made possible a deeper and more transparent integration of some modeling and simulation applications. Despite these advances towards true integration of capabilities, disparate systems, architectures, and protocols will remain in the inventory for some time to come. These disparities present interoperability challenges for distributed modeling and simulation whether the application is training, experimentation, or analysis. Traditional approaches call for building gateways to bridge between disparate protocols and retaining interoperability specialists. Challenges in reconciling data models also persist. These challenges and their traditional mitigation approaches directly contribute to higher costs, schedule delays, and frustration for the end users. Osseus is a prototype software platform originally funded as a research project by the Defense Modeling & Simulation Coordination Office (DMSCO) to examine interoperability alternatives using modern, web-based technology and taking inspiration from the commercial sector. Osseus provides tools and services for nonexpert users to connect simulations, targeting the time and skillset needed to successfully connect disparate systems. The Osseus platform presents a web services interface to allow simulation applications to exchange data using modern techniques efficiently over Local or Wide Area Networks. Further, it provides Service Oriented Architecture capabilities such that finer granularity components such as individual models can contribute to simulation with minimal effort.
An entire universe of the Roman world's architecture found in the human skull.
Turliuc, Dana; Turliuc, Șerban; Cucu, Andrei; Dumitrescu, Gabriela; Costea, Claudia
2017-01-01
Today's neuroanatomical terminology has its origins in the Romans' way of life, in their civil and military house architecture, as well as in the fields of engineering and technology. Despite the fact that they did not know how the nervous system worked and what the role of each neuroanatomic structure was, over time, especially in Renaissance and early modern times, the anatomists sought descriptive names for the nervous structures they have identified by way of similarity with some ancient items. This study aims to briefly review the influence of Roman architecture, engineering, and technology on neuroanatomic nomenclature, the precursor of modern neuroanatomical terminology.
GPU accelerated implementation of NCI calculations using promolecular density.
Rubez, Gaëtan; Etancelin, Jean-Matthieu; Vigouroux, Xavier; Krajecki, Michael; Boisson, Jean-Charles; Hénon, Eric
2017-05-30
The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive to describe ligand-protein binding. A custom implementation for NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The code performances of three versions are examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which reduces drastically the computational time. On a single compute node, the dual-GPU version leads to a 39-fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.
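The two quantities that NCI analysis evaluates at each grid point can be sketched as below: a promolecular density, built as a sum of spherical atomic densities, and the reduced density gradient s = |∇ρ| / (2 (3π²)^(1/3) ρ^(4/3)). This is a CPU-side illustration, not the CUDA code of the paper; modelling each atom with a single exponential and the tuple layout are assumptions made here, and the function names are hypothetical.

```python
import math

def promolecular_density(point, atoms):
    """Sum of spherical atomic densities at point.

    Each atom is (x, y, z, amplitude, decay); one exponential per atom is an
    illustrative assumption.
    """
    total = 0.0
    for x, y, z, amplitude, decay in atoms:
        r = math.dist(point, (x, y, z))
        total += amplitude * math.exp(-decay * r)
    return total

def reduced_density_gradient(rho, grad_norm):
    """s = |grad rho| / (2 * (3 * pi^2)^(1/3) * rho^(4/3))."""
    prefactor = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)
    return grad_norm / (prefactor * rho ** (4.0 / 3.0))

atoms = [(0.0, 0.0, 0.0, 2.0, 1.0), (1.5, 0.0, 0.0, 2.0, 1.0)]
rho_mid = promolecular_density((0.75, 0.0, 0.0), atoms)
```

Every grid point is independent of the others and sums over the same atom list, which is the kind of regular, data-parallel work that maps well to GPUs.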
Creating technical heritage object replicas in a virtual environment
NASA Astrophysics Data System (ADS)
Egorova, Olga; Shcherbinin, Dmitry
2016-03-01
The paper presents innovative informatics methods for creating virtual technical heritage replicas, which are of significant scientific and practical importance not only to researchers but to the public in general. By performing 3D modeling and animation of aircraft, spaceships, architectural-engineering buildings, and other technical objects, the process of learning is achieved while promoting the preservation of the replicas for future generations. Modern approaches based on the wide usage of computer technologies attract a greater number of young people to explore the history of science and technology and renew their interest in the field of mechanical engineering.
NASA Astrophysics Data System (ADS)
Ghosh, Amal K.; Basuray, Amitabha
2008-11-01
The memory devices in multi-valued logic are of most significance in modern research. This paper deals with the implementation of basic memory devices in multi-valued logic using Savart plate and spatial light modulator (SLM) based optoelectronic circuits. Photons are used here as the carrier to speed up the operations. Optical tree architecture (OTA) has been also utilized in the optical interconnection network. We have exploited the advantages of Savart plates, SLMs and OTA and proposed the SLM based high speed JK, D-type and T-type flip-flops in a trinary system.
The Plasma Simulation Code: A modern particle-in-cell code with patch-based load-balancing
NASA Astrophysics Data System (ADS)
Germaschewski, Kai; Fox, William; Abbott, Stephen; Ahmadi, Narges; Maynard, Kristofor; Wang, Liang; Ruhl, Hartmut; Bhattacharjee, Amitava
2016-08-01
This work describes the Plasma Simulation Code (PSC), an explicit, electromagnetic particle-in-cell code with support for different order particle shape functions. We review the basic components of the particle-in-cell method as well as the computational architecture of the PSC code that allows support for modular algorithms and data structure in the code. We then describe and analyze in detail a distinguishing feature of PSC: patch-based load balancing using space-filling curves which is shown to lead to major efficiency gains over unbalanced methods and a previously used simpler balancing method.
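The idea of patch-based load balancing along a space-filling curve can be sketched in a few lines: order the patches along the curve, then cut the ordered list into contiguous segments of roughly equal load. The sketch uses a Morton (Z-order) curve as a simple stand-in, which is an assumption and not necessarily the curve PSC uses; patch loads are assumed positive, and the function names are hypothetical.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x and y to obtain a position along a Z-order curve."""
    code = 0
    for bit in range(bits):
        code |= ((x >> bit) & 1) << (2 * bit)
        code |= ((y >> bit) & 1) << (2 * bit + 1)
    return code

def balance(patch_loads, n_ranks):
    """Map each patch (x, y) to a rank so that ranks own contiguous curve segments."""
    ordered = sorted(patch_loads, key=lambda patch: morton2d(*patch))
    target = sum(patch_loads.values()) / n_ranks  # ideal load per rank
    assignment, running = {}, 0.0
    for patch in ordered:
        load = patch_loads[patch]
        midpoint = running + load / 2  # where this patch sits in the cumulative load
        assignment[patch] = min(int(midpoint / target), n_ranks - 1)
        running += load
    return assignment

# One heavy patch and three light ones split into two ranks of equal load.
loads = {(0, 0): 3, (1, 0): 1, (0, 1): 1, (1, 1): 1}
ranks = balance(loads, 2)
```

Because neighbouring positions on the curve tend to be neighbouring patches in space, contiguous segments keep most communication local while the cuts equalize the work.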
Digital signal conditioning for flight test instrumentation
NASA Technical Reports Server (NTRS)
Bever, Glenn A.
1991-01-01
An introduction to digital measurement processes on aircraft is provided. Flight test instrumentation systems are rapidly evolving from analog-intensive to digital intensive systems, including the use of onboard digital computers. The topics include measurements that are digital in origin, as well as sampling, encoding, transmitting, and storing data. Particular emphasis is placed on modern avionic data bus architectures and what to be aware of when extracting data from them. Examples of data extraction techniques are given. Tradeoffs between digital logic families, trends in digital development, and design testing techniques are discussed. An introduction to digital filtering is also covered.
Study of heterogeneous and reconfigurable architectures in the communication domain
NASA Astrophysics Data System (ADS)
Feldkaemper, H. T.; Blume, H.; Noll, T. G.
2003-05-01
One of the most challenging design issues for next generations of (mobile) communication systems is fulfilling the computational demands while finding an appropriate trade-off between flexibility and implementation aspects, especially power consumption. Flexibility of modern architectures is desirable, e.g. concerning adaptation to new standards and reduction of time-to-market of a new product. Typical target architectures for future communication systems include embedded FPGAs, dedicated macros as well as programmable digital signal and control oriented processor cores as each of these has its specific advantages. These will be integrated as a System-on-Chip (SoC). For such a heterogeneous architecture a design space exploration and an appropriate partitioning plays a crucial role. On the exemplary vehicle of a Viterbi decoder as frequently used in communication systems we show which costs in terms of ATE complexity arise implementing typical components on different types of architecture blocks. A factor of about seven orders of magnitude spans between a physically optimised implementation and an implementation on a programmable DSP kernel. An implementation on an embedded FPGA kernel is in between these two representing an attractive compromise with high flexibility and low power consumption. Extending this comparison to further components, it is shown quantitatively that the cost ratio between different implementation alternatives is closely related to the operation to be performed. This information is essential for the appropriate partitioning of heterogeneous systems.
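For readers unfamiliar with the benchmark component, a hard-decision Viterbi decoder can be sketched in software as follows. The choice of the rate-1/2, constraint-length-3 code with generators (7, 5) is an assumption made for brevity and is not taken from the paper; the message is assumed to end with two zero bits so that the trellis terminates in state 0, and all names are hypothetical.

```python
G1, G2 = 0b111, 0b101  # generators of the (7, 5) convolutional code

def parity(value):
    """Return 1 if value has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1

def encode(bits):
    """Rate-1/2 convolutional encoder; state holds the two previous input bits."""
    state, out = 0, []
    for bit in bits:
        reg = (bit << 2) | state
        out += [parity(reg & G1), parity(reg & G2)]
        state = reg >> 1
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoding over the 4-state trellis, ending in state 0."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]  # encoder starts in state 0
    paths = [[], [], [], []]
    for t in range(0, len(received), 2):
        new_metrics = [inf] * 4
        new_paths = [[] for _ in range(4)]
        for state in range(4):
            if metrics[state] == inf:
                continue
            for bit in (0, 1):
                reg = (bit << 2) | state
                expected = (parity(reg & G1), parity(reg & G2))
                # Branch cost is the Hamming distance to the received symbol pair.
                cost = (expected[0] != received[t]) + (expected[1] != received[t + 1])
                nxt = reg >> 1
                if metrics[state] + cost < new_metrics[nxt]:
                    new_metrics[nxt] = metrics[state] + cost
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    return paths[0]

message = [1, 0, 1, 1, 0, 0]  # last two zeros flush the encoder
coded = encode(message)
corrupted = coded[:]
corrupted[3] ^= 1             # inject a single channel bit error
decoded = viterbi_decode(corrupted)
```

The inner add-compare-select loop is small, regular, and repeated for every state and symbol, which makes the decoder a convenient yardstick for comparing dedicated logic, embedded FPGA, and DSP implementations.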
CSciBox: An Intelligent Assistant for Dating Ice and Sediment Cores
NASA Astrophysics Data System (ADS)
Finlinson, K.; Bradley, E.; White, J. W. C.; Anderson, K. A.; Marchitto, T. M., Jr.; de Vesine, L. R.; Jones, T. R.; Lindsay, C. M.; Israelsen, B.
2015-12-01
CSciBox is an integrated software system for the construction and evaluation of age models of paleo-environmental archives. It incorporates a number of data-processing and visualization facilities, ranging from simple interpolation to reservoir-age correction and 14C calibration via the Calib algorithm, as well as a number of firn and ice-flow models. It employs modern database technology to store paleoclimate proxy data and analysis results in an easily accessible and searchable form, and offers the user access to those data and computational elements via a modern graphical user interface (GUI). In the case of truly large data or computations, CSciBox is parallelizable across modern multi-core processors, or clusters, or even the cloud. The code is open source and freely available on GitHub, as are one-click installers for various versions of Windows and Mac OS X. The system's architecture allows users to incorporate their own software in the form of computational components that can be built smoothly into CSciBox workflows, taking advantage of CSciBox's GUI, data importing facilities, and plotting capabilities. To date, BACON and StratiCounter have been integrated into CSciBox as embedded components. The user can manipulate and compose all of these tools and facilities as she sees fit. Alternatively, she can employ CSciBox's automated reasoning engine, which uses artificial intelligence techniques to explore the gamut of age models and cross-dating scenarios automatically. The automated reasoning engine captures the knowledge of expert geoscientists, and can output a description of its reasoning.
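The simplest kind of age model mentioned above, interpolation between dated horizons, can be sketched as follows. This is not CSciBox code: the function name and the tie-point values are hypothetical, and tie depths are assumed to be strictly increasing.

```python
def interpolate_age(depth, tie_depths, tie_ages):
    """Linearly interpolate an age at depth between dated tie points."""
    points = list(zip(tie_depths, tie_ages))
    for (d0, a0), (d1, a1) in zip(points, points[1:]):
        if d0 <= depth <= d1:
            return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)
    raise ValueError("depth lies outside the dated interval")

# Hypothetical core: depths in cm, ages in years before present.
tie_depths = [0, 100, 300]
tie_ages = [0, 1000, 5000]
age_at_200 = interpolate_age(200, tie_depths, tie_ages)
```

Such a model assumes a constant accumulation rate between tie points, which is why tools of this kind also offer more elaborate alternatives.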
NASA Astrophysics Data System (ADS)
Rossi, D.
2011-09-01
The main focus of this article is to describe a teaching activity. The experience follows research aimed at testing innovative systems for the formal and digital analysis of architectural buildings. In particular, the field of investigation is analytical drawing. An analytical drawing makes it possible to develop interpretative models that resemble reality; these models are built using photomodeling techniques and are designed to re-write modern and contemporary architecture. The buildings surveyed belong to a cultural period called the Modern Movement, historically placed between the two world wars. The Modern Movement aimed to renew existing architectural principles and to redefine them functionally. In Italy these principles arrived during the Fascist period. The heritage of public social buildings (case del Balilla, G.I.L., recreation centers...) built during the Fascist period in central Italy is remarkable for its quantity and, in many cases, for its architectural quality. Buildings of this kind are composed of pure shapes: large cubes (gyms) alternating with long rectangular blocks containing offices create compositions of big volumes and high towers. These features are perfectly suited to a photomodeling survey, in which photography plays a central role and certain, easily distinguishable points must be identified in every picture, lying on the edges of the volumes or along discontinuities in the texture. The goal is documentation to preserve and enhance buildings and urban complexes of modern architecture, with the aim of encouraging artistic preservation.
Epilepsy analytic system with cloud computing.
Shen, Chia-Ping; Zhou, Weizhi; Lin, Feng-Seng; Sung, Hsiao-Ya; Lam, Yan-Yu; Chen, Wei; Lin, Jeng-Wei; Pan, Ming-Kai; Chiu, Ming-Jang; Lai, Feipei
2013-01-01
Biomedical data analytic systems have played an important role in clinical diagnosis for several decades. Today, analyzing these big data to provide decision support for physicians is an emerging research area. This paper presents a parallelized web-based tool with a cloud computing service architecture for analyzing epilepsy. Several modern analytic functions, namely wavelet transform, genetic algorithm (GA), and support vector machine (SVM), are cascaded in the system. To demonstrate the effectiveness of the system, it has been verified with two kinds of electroencephalography (EEG) data: short-term EEG and long-term EEG. The results reveal that our approach achieves a total classification accuracy higher than 90%. In addition, the entire training time is accelerated by a factor of about 4.66, and the prediction time also meets real-time requirements.
MIA-Clustering: a novel method for segmentation of paleontological material.
Dunmore, Christopher J; Wollny, Gert; Skinner, Matthew M
2018-01-01
Paleontological research increasingly uses high-resolution micro-computed tomography (μCT) to study the inner architecture of modern and fossil bone material to answer important questions regarding vertebrate evolution. This non-destructive method allows for the measurement of otherwise inaccessible morphology. Digital measurement is predicated on the accurate segmentation of modern or fossilized bone from other structures imaged in μCT scans, as errors in segmentation can result in inaccurate calculations of structural parameters. Several approaches to image segmentation have been proposed with varying degrees of automation, ranging from completely manual segmentation, to the selection of input parameters required for computational algorithms. Many of these segmentation algorithms provide speed and reproducibility at the cost of flexibility that manual segmentation provides. In particular, the segmentation of modern and fossil bone in the presence of materials such as desiccated soft tissue, soil matrix or precipitated crystalline material can be difficult. Here we present a free open-source segmentation algorithm application capable of segmenting modern and fossil bone, which also reduces subjective user decisions to a minimum. We compare the effectiveness of this algorithm with another leading method by using both to measure the parameters of a known dimension reference object, as well as to segment an example problematic fossil scan. The results demonstrate that the medical image analysis-clustering method produces accurate segmentations and offers more flexibility than those of equivalent precision. Its free availability, flexibility to deal with non-bone inclusions and limited need for user input give it broad applicability in anthropological, anatomical, and paleontological contexts.
NASA Astrophysics Data System (ADS)
Balletti, C.; Costa, M.; Guerra, F.; Martinello, F.; Vernier, P.
2018-05-01
Conservation of modern and contemporary cultural heritage, which ranges from design objects to architecture to cities and territories, is a current topic still in its development phase, because a process of systematic replacement of architectural elements is underway within modernity itself: elements that were the outcome of then-experimental solutions are today reproduced with contemporary materials, analogous in appearance but intimately different, especially in technological content. The paper describes the particular case of La Tour de Meudon, better known as The Tower (1966), by André Bloc, an architect contemporary with Le Corbusier and founder of L'Architecture d'aujourd'hui, who created his habitable sculptures. All his works mark the evolution of geometric abstraction to the free form, and they are still admirable testimonies of a journey that led him from architecture to architecture. His architecture and his sculpture intertwine, opening the plastic unity of form in physical space-time. The survey is a fundamental moment for the knowledge of these hybrid architectures, where the structural component is hidden by its evident plasticity, as if it were a large sculpture with abstract and overlapping geometric shapes. Survey is not only an analysis of geometries: it is instrumental to the other structural and material analyses, since it provides a metric and topological basis on which to spatially locate the phenomena being studied. The integrated survey of the building (laser scanning, photogrammetry, topography) has made it possible to document his project, contributing to the definition of the actual construction characteristics and ascertaining both the material consistency and the state of conservation.
Programmable computing with a single magnetoresistive element
NASA Astrophysics Data System (ADS)
Ney, A.; Pampuch, C.; Koch, R.; Ploog, K. H.
2003-10-01
The development of transistor-based integrated circuits for modern computing is a story of great success. However, the proved concept for enhancing computational power by continuous miniaturization is approaching its fundamental limits. Alternative approaches consider logic elements that are reconfigurable at run-time to overcome the rigid architecture of the present hardware systems. Implementation of parallel algorithms on such `chameleon' processors has the potential to yield a dramatic increase of computational speed, competitive with that of supercomputers. Owing to their functional flexibility, `chameleon' processors can be readily optimized with respect to any computer application. In conventional microprocessors, information must be transferred to a memory to prevent it from getting lost, because electrically processed information is volatile. Therefore the computational performance can be improved if the logic gate is additionally capable of storing the output. Here we describe a simple hardware concept for a programmable logic element that is based on a single magnetic random access memory (MRAM) cell. It combines the inherent advantage of a non-volatile output with flexible functionality which can be selected at run-time to operate as an AND, OR, NAND or NOR gate.
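The run-time selectable gate described in this abstract can be illustrated in software. The following is a minimal Python sketch and not a model of the MRAM physics: the class name `ProgrammableGate` and its methods are hypothetical, and the sketch only mimics two properties named in the abstract, namely a logic function chosen at run-time and an output that stays stored until the next evaluation.

```python
from typing import Callable, Dict

# Truth functions for the four gate types named in the abstract.
GATES: Dict[str, Callable[[bool, bool], bool]] = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "NAND": lambda a, b: not (a and b),
    "NOR": lambda a, b: not (a or b),
}


class ProgrammableGate:
    """Hypothetical software stand-in for a run-time programmable logic element."""

    def __init__(self, mode: str = "AND") -> None:
        self.output = False  # last result, kept until it is overwritten
        self.program(mode)

    def program(self, mode: str) -> None:
        """Select the logic function at run-time."""
        if mode not in GATES:
            raise ValueError(f"unsupported gate: {mode}")
        self.mode = mode

    def evaluate(self, a: bool, b: bool) -> bool:
        """Compute the selected function and store the result."""
        self.output = GATES[self.mode](a, b)
        return self.output


gate = ProgrammableGate("AND")
gate.evaluate(True, False)   # stored output is False
gate.program("NAND")         # same element, new function
gate.evaluate(True, False)   # stored output is now True
```

Reprogramming changes only the selected function; the stored output is untouched until the next call to `evaluate`, which loosely mirrors the combination of logic and storage the abstract describes.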
2010-06-01
DATES COVERED (From - To): APR 2009 – JAN 2010. 4. TITLE AND SUBTITLE: EMERGING NEUROMORPHIC COMPUTING ARCHITECTURES AND ENABLING... 14. ABSTRACT: The highly cross-disciplinary emerging field of neuromorphic computing architectures for cognitive information processing applications...belief systems, software, computer engineering, etc. In our effort to develop cognitive systems atop a neuromorphic computing architecture, we explored
NASA Astrophysics Data System (ADS)
Broten, Gregory S.; Monckton, Simon P.; Collier, Jack; Giesbrecht, Jared
2006-05-01
In 2002 Defence R&D Canada changed research direction from pure tele-operated land vehicles to general autonomy for land, air, and sea craft. The unique constraints of the military environment coupled with the complexity of autonomous systems drove DRDC to carefully plan a research and development infrastructure that would provide state-of-the-art tools without restricting research scope. DRDC's long-term objectives for its autonomy program address disparate unmanned ground vehicle (UGV), unattended ground sensor (UGS), air (UAV), and subsea and surface (UUV and USV) vehicles operating together with minimal human oversight. Individually, these systems will range in complexity from simple reconnaissance mini-UAVs streaming video to sophisticated autonomous combat UGVs exploiting embedded and remote sensing. Together, these systems can provide low-risk, long-endurance battlefield services, assuming they can communicate and cooperate with manned and unmanned systems. A key enabling technology for this new research is a software architecture capable of meeting both DRDC's current and future requirements. DRDC built upon recent advances in the computing science field while developing its software architecture, known as the Architecture for Autonomy (AFA). Although a well-established practice in computing science, frameworks have only recently entered common use by unmanned vehicles. For industry and government, the complexity, cost, and time to re-implement stable systems often exceed the perceived benefits of adopting a modern software infrastructure. Thus, most persevere with legacy software, adapting and modifying software when and wherever possible or necessary, and adopting strategic software frameworks only when no justifiable legacy exists. Conversely, academic programs with short one- or two-year projects frequently exploit strategic software frameworks but with little enduring impact. The open-source movement radically changes this picture.
Academic frameworks, open to public scrutiny and modification, now rival commercial frameworks in both quality and economic impact. Further, industry now realizes that open-source frameworks can reduce the cost and risk of systems engineering. This paper describes the Architecture for Autonomy implemented by DRDC and how this architecture meets DRDC's current needs. It also presents an argument for why this architecture should satisfy DRDC's future requirements as well.
Memristor-Based Synapse Design and Training Scheme for Neuromorphic Computing Architecture
2012-06-01
system level built upon the conventional Von Neumann computer architecture [2][3]. Developing the neuromorphic architecture at chip level by...SCHEME FOR NEUROMORPHIC COMPUTING ARCHITECTURE 5a. CONTRACT NUMBER FA8750-11-2-0046 5b. GRANT NUMBER N/A 5c. PROGRAM ELEMENT NUMBER 62788F 6...creation of memristor-based neuromorphic computing architecture. Rather than the existing crossbar-based neuron network designs, we focus on memristor
NASA Technical Reports Server (NTRS)
Estefan, Jeff A.; Giovannoni, Brian J.
2014-01-01
The Advanced Multi-Mission Operations Systems (AMMOS) is NASA's premier space mission operations product line offering for use in deep-space robotic and astrophysics missions. The general approach to AMMOS modernization over the course of its 29-year history exemplifies a continual, evolutionary approach with periods of sponsor investment peaks and valleys in between. Today, the Multimission Ground Systems and Services (MGSS) office, the program office that manages the AMMOS for NASA, actively pursues modernization initiatives and continues to evolve the AMMOS by incorporating enhanced capabilities and newer technologies into its end-user tool and service offerings. Despite the myriad of modernization investments that have been made over the evolutionary course of the AMMOS, pain points remain. These pain points, based on interviews with numerous flight project mission operations personnel, can be classified principally into two major categories: 1) information-related issues, and 2) process-related issues. By information-related issues, we mean pain points associated with the management and flow of MOS data across the various system interfaces. By process-related issues, we mean pain points associated with the MOS activities performed by mission operators (i.e., humans) and supporting software infrastructure used in support of those activities. In this paper, three foundational concepts (Timeline, Closed Loop Control, and Separation of Concerns) collectively form the basis for expressing a set of core architectural tenets that provides a multifaceted approach to AMMOS system architecture modernization intended to address the information- and process-related issues. Each of these architectural tenets will be further explored in this paper.
Ultimately, we envision the application of these core tenets resulting in a unified vision of a future-state architecture for the AMMOS, one that is intended to result in a highly adaptable, highly efficient, and highly cost-effective set of multimission MOS products and services.
ERIC Educational Resources Information Center
Bennett, Michael A.; Benton, Stephen L.
2001-01-01
Examines the attributions college students (N=301) make toward pictures of college campus buildings. Results reveal that students attributed greater likelihood of individual success to pictures depicting modern architecture than they did to those depicting traditional architecture. (Contains 28 references and 3 tables.) (Author/GCP)
Modern Church Construction in Urals. Problems and Prospects
NASA Astrophysics Data System (ADS)
Surin, D. N.; Tereshina, O. B.
2017-11-01
The article analyzes the problems of modern Orthodox church architecture in Russia, paying special attention to the problems of the Ural region. It justifies the importance of addressing this issue, which is connected with the revival of Orthodox traditions in Russia over the last decades and the need to compensate for tens of thousands of churches destroyed in the Soviet period. Works on the theory and history of Russian architecture and art, and studies of the architectural heritage and the building art of the Ural craftsmen, are used as a scientific and methodological base for the development of church architecture. The article discloses the historically formed architectural features of Russian Orthodox churches, the artistic image of which is designed to create a certain religious and aesthetic experience. It is stated that the restoration of the Russian church construction tradition is possible on the basis of architectural heritage. It sets out the tendencies and vital tasks in church construction and outlines a complex of measures to solve these tasks at the public and regional levels.
The Amazing Labyrinth: An Ancient-Modern Humanities Unit
ERIC Educational Resources Information Center
Ladensack, Carl
1973-01-01
The image of the labyrinth from mythology can find modern-day parallels in architecture, art, music, and literature, all of which contribute to a humanities unit combining the old with the new. (MM)
Developing a Distributed Computing Architecture at Arizona State University.
ERIC Educational Resources Information Center
Armann, Neil; And Others
1994-01-01
Development of Arizona State University's computing architecture, designed to ensure that all new distributed computing pieces will work together, is described. Aspects discussed include the business rationale, the general architectural approach, characteristics and objectives of the architecture, specific services, and impact on the university…
Automatic Blocking Of QR and LU Factorizations for Locality
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yi, Q; Kennedy, K; You, H
2004-03-26
QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To efficiently perform these computations on modern computers, the factorization algorithms need to be blocked when operating on large matrices to effectively exploit the deep cache hierarchy prevalent in today's computer memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Though linear algebra libraries such as LAPACK provide manually blocked implementations of these algorithms, automatically generating blocked versions of the computations can yield further benefits, such as automatic adaptation of different blocking strategies. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of auto-blocked versions with manually tuned versions in LAPACK, both using reference BLAS, ATLAS BLAS and native BLAS specially tuned for the underlying machine architectures.
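To make the idea of blocking concrete, here is a minimal pure-Python sketch of a right-looking blocked LU factorization. It is an illustration under stated assumptions, not the paper's method: it is written by hand rather than generated by dependence hoisting, it omits partial pivoting for brevity (so it assumes nonzero pivots, for example a diagonally dominant matrix), and the function name `blocked_lu` and the `block` parameter are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def blocked_lu(a: Matrix, block: int) -> Matrix:
    """In-place right-looking blocked LU without pivoting.

    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U.
    """
    n = len(a)
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # 1. Factor the current panel (columns k0..k1-1) with unblocked LU.
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # 2. Forward-substitute with the unit lower-triangular diagonal
        #    block to finish the rows of U to the right of the panel.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # 3. Update the trailing submatrix: A22 -= L21 * U12.
        for i in range(k1, n):
            for j in range(k1, n):
                s = 0.0
                for k in range(k0, k1):
                    s += a[i][k] * a[k][j]
                a[i][j] -= s
    return a
```

Step 3 is a matrix-matrix product that reuses the same panel of `block` columns for every trailing element, which is the data reuse that blocking is meant to expose to the cache. Setting `block` equal to the matrix size reduces the routine to ordinary unblocked LU.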
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
The architecture of a modern military health information system.
Mukherji, Raj J; Egyhazy, Csaba J
2004-06-01
This article describes a melding of a government-sponsored architecture for complex systems with an open systems engineering architecture developed by the Institute of Electrical and Electronics Engineers (IEEE). Our experience in using these two architectures in building a complex healthcare system is described in this paper. The work described shows that it is possible to combine these two architectural frameworks in describing the systems, operational, and technical views of a complex automation system. The advantage in combining the two architectural frameworks lies in the simplicity of implementation and ease of understanding of automation system architectural elements by medical professionals.
Frances: A Tool for Understanding Computer Architecture and Assembly Language
ERIC Educational Resources Information Center
Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh
2012-01-01
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…
Tutorial: Computer architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gajski, D.D.; Milutinovic, V.M.; Siegel, H.J.
1986-01-01
This book presents the state-of-the-art in advanced computer architecture. It deals with the concepts underlying current architectures and covers approaches and techniques being used in the design of advanced computer systems.
Outline of a novel architecture for cortical computation.
Majumdar, Kaushik
2008-03-01
In this paper a novel architecture for cortical computation has been proposed. This architecture is composed of computing paths consisting of neurons and synapses. These paths have been decomposed into lateral, longitudinal and vertical components. Cortical computation has then been decomposed into lateral computation (LaC), longitudinal computation (LoC) and vertical computation (VeC). It has been shown that various loop structures in the cortical circuit play important roles in cortical computation as well as in memory storage and retrieval, keeping in conformity with the molecular basis of short and long term memory. A new learning scheme for the brain has also been proposed and how it is implemented within the proposed architecture has been explained. A few mathematical results about the architecture have been proposed, some of which are without proof.
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuznetsov, Valentin; Fischer, Nils Leif; Guo, Yuyi
2018-03-19
The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document-oriented database and the Hadoop ecosystem to provide the necessary flexibility to reliably process, store, and aggregate $\mathcal{O}$(1M) documents on a daily basis. We describe the data transformation, the short- and long-term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.
NASA Astrophysics Data System (ADS)
Strauss, R. Du Toit; Effenberger, Frederic
2017-10-01
In this review, an overview of the recent history of stochastic differential equations (SDEs) in application to particle transport problems in space physics and astrophysics is given. The aim is to present a helpful working guide to the literature and at the same time introduce key principles of the SDE approach via "toy models". Using these examples, we hope to provide an easy way for newcomers to the field to use such methods in their own research. Aspects covered are the solar modulation of cosmic rays, diffusive shock acceleration, galactic cosmic ray propagation and solar energetic particle transport. We believe that the SDE method, due to its simplicity and computational efficiency on modern computer architectures, will be of significant relevance in energetic particle studies in the years to come.
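A toy model in the spirit of this review can be sketched with the Euler-Maruyama scheme. The example below is an assumption-laden illustration and is not taken from the review: it integrates the one-dimensional SDE dx = u dt + sqrt(2κ) dW with constant drift u and diffusion coefficient κ, and the function name `euler_maruyama` and all parameter names are hypothetical.

```python
import math
import random
from typing import List


def euler_maruyama(x0: float, drift: float, kappa: float, t_end: float,
                   n_steps: int, n_particles: int, seed: int = 0) -> List[float]:
    """Integrate dx = drift*dt + sqrt(2*kappa)*dW for many pseudo-particles."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    noise_scale = math.sqrt(2.0 * kappa * dt)  # std. dev. of one Wiener step
    finals = []
    for _ in range(n_particles):
        x = x0
        for _ in range(n_steps):
            x += drift * dt + noise_scale * rng.gauss(0.0, 1.0)
        finals.append(x)
    return finals


positions = euler_maruyama(x0=0.0, drift=0.5, kappa=1.0, t_end=1.0,
                           n_steps=20, n_particles=5000)
mean = sum(positions) / len(positions)
var = sum((p - mean) ** 2 for p in positions) / len(positions)
# For constant coefficients the exact solution has mean x0 + drift*t_end
# and variance 2*kappa*t_end, which the sample estimates should approach.
```

Each pseudo-particle in this sketch is integrated independently of the others, so the outer loop can be split across cores or threads without communication.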
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures
Sharma, Anuj; Manolakos, Elias S.
2015-01-01
Fast-increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
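The reported speedup and efficiency figures can be related with a short calculation. This sketch assumes the conventional definition of parallel efficiency (speedup divided by the number of cores); the abstract does not state which definition the authors used, and the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Conventional parallel efficiency: speedup divided by core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


eff = parallel_efficiency(speedup=42.0, cores=48)
print(round(eff, 3))  # 0.875, which rounds to the reported 0.9
```

Under that assumption, a speedup of 42 on 48 cores is consistent with the quoted efficiency of about 0.9.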
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Bei; Ethier, Stephane; Tang, William; ...
2017-06-29
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Architecture Adaptive Computing Environment
NASA Technical Reports Server (NTRS)
Dorband, John E.
2006-01-01
Architecture Adaptive Computing Environment (aCe) is a software system that includes a language, compiler, and run-time library for parallel computing. aCe was developed to enable programmers to write programs, more easily than was previously possible, for a variety of parallel computing architectures. Heretofore, it has been perceived to be difficult to write parallel programs for parallel computers and more difficult to port the programs to different parallel computing architectures. In contrast, aCe is supportable on all high-performance computing architectures. Currently, it is supported on LINUX clusters. aCe uses parallel programming constructs that facilitate writing of parallel programs. Such constructs were used in single-instruction/multiple-data (SIMD) programming languages of the 1980s, including Parallel Pascal, Parallel Forth, C*, *LISP, and MasPar MPL. In aCe, these constructs are extended and implemented for both SIMD and multiple- instruction/multiple-data (MIMD) architectures. Two new constructs incorporated in aCe are those of (1) scalar and virtual variables and (2) pre-computed paths. The scalar-and-virtual-variables construct increases flexibility in optimizing memory utilization in various architectures. The pre-computed-paths construct enables the compiler to pre-compute part of a communication operation once, rather than computing it every time the communication operation is performed.
González-Plaza, Juan J; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R
2016-01-01
Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional, with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating the juvenile-to-adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species.
The Visualization Toolkit (VTK): Rewriting the rendering code for modern graphics cards
NASA Astrophysics Data System (ADS)
Hanwell, Marcus D.; Martin, Kenneth M.; Chaudhary, Aashish; Avila, Lisa S.
2015-09-01
The Visualization Toolkit (VTK) is an open source, permissively licensed, cross-platform toolkit for scientific data processing, visualization, and data analysis. It is over two decades old, originally developed for a very different graphics card architecture. Modern graphics cards feature fully programmable, highly parallelized architectures with large core counts. VTK's rendering code was rewritten to take advantage of modern graphics cards, maintaining most of the toolkit's programming interfaces. This offers the opportunity to compare the performance of old and new rendering code on the same systems/cards. Significant improvements in rendering speeds and memory footprints mean that scientific data can be visualized in greater detail than ever before. The widespread use of VTK means that these improvements will reap significant benefits.
Towards a Unified Architecture for Data-Intensive Seismology in VERCE
NASA Astrophysics Data System (ADS)
Klampanos, I.; Spinuso, A.; Trani, L.; Krause, A.; Garcia, C. R.; Atkinson, M.
2013-12-01
Modern seismology involves managing, storing and processing large datasets, typically geographically distributed across organisations. Performing computational experiments using these data generates more data, which in turn have to be managed, further analysed and frequently made available within or outside the scientific community. As part of the EU-funded project VERCE (http://verce.eu), we research and develop a number of use-cases, interfacing technologies to satisfy the data-intensive requirements of modern seismology. Our solution seeks to support: (1) familiar programming environments to develop and execute experiments, in particular via Python/ObsPy, (2) a unified view of heterogeneous computing resources, public or private, through the adoption of workflows, (3) monitoring the experiments and validating the data products at varying granularities, via a comprehensive provenance system, (4) reproducibility of experiments and consistency in collaboration, via a shared registry of processing units and contextual metadata (computing resources, data, etc.). Here, we provide a brief account of these components and their roles in the proposed architecture. Our design integrates heterogeneous distributed systems, while allowing researchers to retain current practices and control data handling and execution via higher-level abstractions. At the core of our solution lies the workflow language Dispel. While Dispel can be used to express workflows at fine detail, it may also be used as part of meta- or job-submission workflows. User interaction can be provided through a visual editor or through custom applications on top of parameterisable workflows, which is the approach VERCE follows. According to our design, the scientist may use versions of Dispel/workflow processing elements offered by the VERCE library or override them by introducing custom scientific code, using ObsPy.
This approach has the advantage that, while the scientist uses a familiar tool, the resulting workflow can be executed on a number of underlying stream-processing engines, such as STORM or OGSA-DAI, transparently. While making efficient use of arbitrarily distributed resources and large datasets is a priority, such processing requires adequate provenance tracking and monitoring. Hiding computation and orchestration details via a workflow system allows us to embed provenance harvesting where appropriate without impeding the user's regular working patterns. Our provenance model is based on the W3C PROV standard and can provide information of varying granularity regarding execution, systems and data consumption/production. A video demonstrating a prototype provenance exploration tool can be found at http://bit.ly/15t0Fz0. Keeping experimental methodology and results open and accessible, as well as encouraging reproducibility and collaboration, is of central importance to modern science. As our users are expected to be based at different geographical locations, to have access to different computing resources and to employ customised scientific codes, the use of a shared registry of workflow components, implementations, data and computing resources is critical.
Modern Gemini-Approach to Technology Development for Human Space Exploration
NASA Technical Reports Server (NTRS)
White, Harold
2010-01-01
In NASA's plan to put men on the moon, there were three sequential programs: Mercury, Gemini, and Apollo. The Gemini program was used to develop and integrate the technologies that would be necessary for the Apollo program to successfully put men on the moon. We would like to present an analogous modern approach that leverages legacy ISS hardware designs and integrates developing new technologies into a flexible architecture. This new architecture is scalable, sustainable, and can be used to establish human exploration infrastructure beyond low earth orbit and into deep space.
ERIC Educational Resources Information Center
Twenty-First Century School Fund, Washington, DC.
This report addresses the decision-making process for replacing or modernizing the District of Columbia Public Schools (DCPS) as proposed in the DCPS facility master plan. The three-section document discusses old and historic schools and their future; the schools' historical and architectural value; cost of replacement and modernization; design;…
US NDC Modernization Iteration E1 Prototyping Report: User Interface Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lober, Randall R.
2014-12-01
During the first iteration of the US NDC Modernization Elaboration phase (E1), the SNL US NDC modernization project team completed an initial survey of applicable COTS solutions, and established exploratory prototyping related to the User Interface Framework (UIF) in support of system architecture definition. This report summarizes these activities and discusses planned follow-on work.
US NDC Modernization Iteration E1 Prototyping Report: Common Object Interface
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lewis, Jennifer E.; Hess, Michael M.
2014-12-01
During the first iteration of the US NDC Modernization Elaboration phase (E1), the SNL US NDC modernization project team completed an initial survey of applicable COTS solutions, and established exploratory prototyping related to the Common Object Interface (COI) in support of system architecture definition. This report summarizes these activities and discusses planned follow-on work.
US NDC Modernization Iteration E1 Prototyping Report: Processing Control Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prescott, Ryan; Hamlet, Benjamin R.
2014-12-01
During the first iteration of the US NDC Modernization Elaboration phase (E1), the SNL US NDC modernization project team developed an initial survey of applicable COTS solutions, and established exploratory prototyping related to the processing control framework in support of system architecture definition. This report summarizes these activities and discusses planned follow-on work.
Class Act: In Alabama, Students Turn Tires and Bales of Hay into Striking Architecture for the Poor.
ERIC Educational Resources Information Center
Stewart, Doug
2001-01-01
At the Rural Studio--an off-campus program of Auburn University--architectural students use scavenged and donated materials to create innovative houses and other buildings for poor, rural, primarily African American communities. Materials such as hay bales and old tires are recycled to create full-blown modern architecture, which also fulfills…
ERIC Educational Resources Information Center
Baker, Kate
2014-01-01
The context-free "object building," the sculptural form, reigned in schools of architecture for decades. As we are finally moving on from 20th century modernism, there is an urgency to re-place buildings within their contexts. All too often, students with a background in the discipline of architecture, struggle to design buildings that…
IDC Reengineering Iteration I2 Architectural Prototype Reports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamlet, Benjamin R.
To fulfill the inception phase deliverable “Demonstration of architectural prototype”, the SNL IDC Reengineering project team is providing seven reports describing system prototyping work completed between October 2012 and October 2014 as part of the SNL US NDC Modernization project.
Peng, Yun; Miller, Brandi D; Boone, Timothy B; Zhang, Yingchun
2018-02-12
Weakened pelvic floor support is believed to be the main cause of various pelvic floor disorders. Modern theories of pelvic floor support stress the structural and functional integrity of multiple structures and their interplay to maintain normal pelvic floor functions. Connective tissues provide passive pelvic floor support while pelvic floor muscles provide active support through voluntary contraction. Advanced modern medical technologies allow us to comprehensively and thoroughly evaluate the interaction of supporting structures and assess both active and passive support functions. The pathophysiology of various pelvic floor disorders associated with pelvic floor weakness is now under scrutiny from the combination of (1) morphological, (2) dynamic (through computational modeling), and (3) neurophysiological perspectives. This topical review aims to summarize newly emerged studies assessing pelvic floor support function among these three categories. A literature search was performed with emphasis on (1) medical imaging studies that assess pelvic floor muscle architecture, (2) subject-specific computational modeling studies that address new topics such as modeling muscle contractions, and (3) pelvic floor neurophysiology studies that report novel devices or findings such as high-density surface electromyography techniques. We found that recent computational modeling studies feature more realistic soft tissue constitutive models (e.g., active muscle contraction) as well as an increasing interest in simulating surgical interventions (e.g., artificial sphincter). Diffusion tensor imaging provides a useful non-invasive tool to characterize pelvic floor muscles at the microstructural level, which can be potentially used to improve the accuracy of the simulation of muscle contraction. Studies using high-density surface electromyography anal and vaginal probes on large patient cohorts have been recently reported.
Influences of vaginal delivery on the distribution of innervation zones of pelvic floor muscles are clarified, providing useful guidance for a better protection of women during delivery. We are now in a period of transition to advanced diagnostic and predictive pelvic floor medicine. Our findings highlight the application of diffusion tensor imaging, computational models with consideration of active pelvic floor muscle contraction, high-density surface electromyography, and their potential integration, as tools to push the boundary of our knowledge in pelvic floor support and better shape current clinical practice.
Multigrid methods with space–time concurrency
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.; ...
2017-10-06
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporal dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.
Multigrid methods with space–time concurrency
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falgout, R. D.; Friedhoff, S.; Kolev, Tz. V.
Here, we consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporal dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures.
High-Performance 3D Compressive Sensing MRI Reconstruction Using Many-Core Architectures
Kim, Daehyun; Trzasko, Joshua; Smelyanskiy, Mikhail; Haider, Clifton; Dubey, Pradeep; Manduca, Armando
2011-01-01
Compressive sensing (CS) describes how sparse signals can be accurately reconstructed from many fewer samples than required by the Nyquist criterion. Since MRI scan duration is proportional to the number of acquired samples, CS has been gaining significant attention in MRI. However, the computationally intensive nature of CS reconstructions has precluded their use in routine clinical practice. In this work, we investigate how different throughput-oriented architectures can benefit one CS algorithm and what levels of acceleration are feasible on different modern platforms. We demonstrate that a CUDA-based code running on an NVIDIA Tesla C2050 GPU can reconstruct a 256 × 160 × 80 volume from an 8-channel acquisition in 19 seconds, which is in itself a significant improvement over the state of the art. We then show that Intel's Knights Ferry can perform the same 3D MRI reconstruction in only 12 seconds, bringing CS methods even closer to clinical viability. PMID:21922017
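The reconstruction idea summarized in the abstract above can be illustrated with a very small sketch. This is not the algorithm, data, or code of the cited paper: it assumes a generic l1-regularized least-squares formulation solved with the iterative soft-thresholding algorithm (ISTA), and every name in it (`ista`, `soft_threshold`, the toy sizes `m` and `n`, the weight `lam`) is hypothetical. The step size uses the squared Frobenius norm of the matrix as a safe but pessimistic bound, so convergence is slow.

```python
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    r = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(v) for v in x)


def ista(A, y, lam, n_iter=200):
    """Iterative soft-thresholding for the objective above (illustrative only)."""
    m, n = len(A), len(A[0])
    # Squared Frobenius norm: an upper bound on the largest eigenvalue of A^T A.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(n_iter):
        r = [p - q for p, q in zip(matvec(A, x), y)]
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(xj - gj / L, lam / L) for xj, gj in zip(x, grad)]
    return x


# Toy problem (assumed sizes): 20 measurements of a length-40 signal with 3 nonzeros.
random.seed(0)
m, n = 20, 40
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
for j in (3, 17, 31):
    x_true[j] = 1.0
y = matvec(A, x_true)
x_hat = ista(A, y, lam=0.01)
```

Each iteration is dominated by two matrix-vector products, which is the kind of regular, data-parallel work that throughput-oriented hardware tends to handle well.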
Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aktulga, Hasan Metin; Coffman, Paul; Shan, Tzu-Ray
2015-12-01
Hybrid parallelism allows high performance computing applications to better leverage the increasing on-node parallelism of modern supercomputers. In this paper, we present a hybrid parallel implementation of the widely used LAMMPS/ReaxC package, where the construction of bonded and nonbonded lists and evaluation of complex ReaxFF interactions are implemented efficiently using OpenMP parallelism. Additionally, the performance of the QEq charge equilibration scheme is examined and a dual-solver is implemented. We present the performance of the resulting ReaxC-OMP package on a state-of-the-art multi-core architecture Mira, an IBM BlueGene/Q supercomputer. For system sizes ranging from 32 thousand to 16.6 million particles, speedups in the range of 1.5-4.5x are observed using the new ReaxC-OMP software. Sustained performance improvements have been observed for up to 262,144 cores (1,048,576 processes) of Mira with a weak scaling efficiency of 91.5% in larger simulations containing 16.6 million particles.
NASA Astrophysics Data System (ADS)
Gupta, V.; Gupta, N.; Gupta, S.; Field, E.; Maechling, P.
2003-12-01
Modern laptop computers, and personal computers, can provide capabilities that are, in many ways, comparable to workstations or departmental servers. However, this doesn't mean we should run all computations on our local computers. We have identified several situations in which it is preferable to implement our seismological application programs in a distributed, server-based, computing model. In this model, application programs on the user's laptop, or local computer, invoke programs that run on an organizational server, and the results are returned to the invoking system. Situations in which a server-based architecture may be preferred include: (a) a program is written in a language, or written for an operating environment, that is unsupported on the local computer, (b) software libraries or utilities required to execute a program are not available on the user's computer, (c) a computational program is physically too large, or computationally too expensive, to run on a user's computer, (d) a user community wants to enforce a consistent method of performing a computation by standardizing on a single implementation of a program, and (e) the computational program may require current information that is not available to all client computers. Until recently, distributed, server-based, computational capabilities were implemented using client/server architectures. In these architectures, client programs were often written in the same language, and they executed in the same computing environment, as the servers. Recently, a new distributed computational model, called Web Services, has been developed. Web Services are based on Internet standards such as XML, SOAP, WSDL, and UDDI. Web Services offer the promise of platform- and language-independent distributed computing. To investigate this new computational model, and to provide useful services to the SCEC Community, we have implemented several computational and utility programs using a Web Service architecture.
We have hosted these Web Services as a part of the SCEC Community Modeling Environment (SCEC/CME) ITR Project (http://www.scec.org/cme). We have implemented Web Services for several of the reasons cited previously. For example, we implemented a FORTRAN-based Earthquake Rupture Forecast (ERF) as a Web Service for use by client computers that don't support a FORTRAN runtime environment. We implemented a Generic Mapping Tool (GMT) Web Service for use by systems that don't have local access to GMT. We implemented a Hazard Map Calculator Web Service to execute Hazard calculations that are too computationally intensive to run on a local system. We implemented a Coordinate Conversion Web Service to enforce a standard and consistent method for converting between UTM and Lat/Lon. Our experience developing these services indicates both strengths and weaknesses in current Web Service technology. Client programs that utilize Web Services typically need network access, a significant disadvantage at times. Programs with simple input and output parameters were the easiest to implement as Web Services, while programs with complex parameter types required a significant amount of additional development. We also noted that Web Services are very data-oriented, and adapting object-oriented software into the Web Service model proved problematic. Also, the Web Service approach of converting data types into XML format for network transmission has significant inefficiencies for some data sets.
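The closing remark above about XML conversion overhead can be made concrete with a small sketch. It is not taken from the SCEC/CME services: it simply assumes a list of floating-point values encoded once as a flat XML document and once as packed binary doubles, and compares the sizes. The element names `values` and `v` and the helper names are hypothetical.

```python
import struct
import xml.etree.ElementTree as ET


def encode_xml(values):
    """Encode floats as a flat XML document (hypothetical element names)."""
    root = ET.Element("values")
    for v in values:
        ET.SubElement(root, "v").text = repr(v)
    return ET.tostring(root)


def decode_xml(payload):
    """Parse the XML payload back into a list of floats."""
    return [float(elem.text) for elem in ET.fromstring(payload)]


def encode_binary(values):
    """Pack the same floats as little-endian 8-byte doubles."""
    return struct.pack(f"<{len(values)}d", *values)


values = [i * 0.1 for i in range(1000)]
xml_bytes = encode_xml(values)
bin_bytes = encode_binary(values)

print(len(bin_bytes), "bytes as packed doubles")
print(len(xml_bytes), "bytes as XML")
```

Every value carries its own opening and closing tag plus a decimal text representation, so the XML payload is larger than the 8 bytes per value of the packed form, and it must also be parsed back into numbers on the receiving side.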
Supporting Undergraduate Computer Architecture Students Using a Visual MIPS64 CPU Simulator
ERIC Educational Resources Information Center
Patti, D.; Spadaccini, A.; Palesi, M.; Fazzino, F.; Catania, V.
2012-01-01
The topics of computer architecture are always taught using an Assembly dialect as an example. The most commonly used textbooks in this field use the MIPS64 Instruction Set Architecture (ISA) to help students in learning the fundamentals of computer architecture because of its orthogonality and its suitability for real-world applications. This…
Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques
2013-03-01
MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES AND CIRCUIT TECHNIQUES POLYTECHNIC INSTITUTE OF NEW YORK UNIVERSITY...TECHNICAL REPORT 3. DATES COVERED (From - To) OCT 2010 – OCT 2012 4. TITLE AND SUBTITLE MEMRISTOR-BASED COMPUTING ARCHITECTURE : DESIGN METHODOLOGIES...schemes for a memristor-based reconfigurable architecture design have not been fully explored yet. Therefore, in this project, we investigated
Exploiting current-generation graphics hardware for synthetic-scene generation
NASA Astrophysics Data System (ADS)
Tanner, Michael A.; Keen, Wayne A.
2010-04-01
Increasing seeker frame rate and pixel count, as well as the demand for higher levels of scene fidelity, have driven scene generation software for hardware-in-the-loop (HWIL) and software-in-the-loop (SWIL) testing to higher levels of parallelization. Because modern PC graphics cards provide multiple computational cores (240 shader cores for current NVIDIA Corporation GeForce and Quadro cards), implementation of phenomenology codes on graphics processing units (GPUs) offers significant potential for simultaneous enhancement of simulation frame rate and fidelity. Taking advantage of this potential requires an algorithm implementation that is structured to minimize data transfers between the central processing unit (CPU) and the GPU. In this paper, preliminary methodologies developed at the Kinetic Hardware In-The-Loop Simulator (KHILS) will be presented. Included in this paper will be various language tradeoffs between conventional shader programming, Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), including performance trades and possible pathways for future tool development.
High performance in silico virtual drug screening on many-core processors.
McIntosh-Smith, Simon; Price, James; Sessions, Richard B; Ibarra, Amaurys A
2015-05-01
Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel's Xeon Phi and multi-core CPUs with SIMD instruction sets.
High performance in silico virtual drug screening on many-core processors
Price, James; Sessions, Richard B; Ibarra, Amaurys A
2015-01-01
Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets. PMID:25972727
OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing
NASA Astrophysics Data System (ADS)
Wei, Shoulin; Wang, Feng; Deng, Hui; Liu, Cuiyin; Dai, Wei; Liang, Bo; Mei, Ying; Shi, Congming; Liu, Yingbo; Wu, Jingping
2017-02-01
The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures/frameworks are not well suited for astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework to support rapidly developing high-performance processing pipelines of astronomical big data. We first detail the OpenCluster design principles and implementations and present the APIs facilitated by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems for developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our OpenCluster performance evaluation. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing a high-performance data processing system of astronomical telescopes and for significantly reducing software development expenses.
NASA Astrophysics Data System (ADS)
He, S.; Tang, Z.; Yang, S.
2015-09-01
Baoguo Temple is located halfway up Lingshan Mountain in northern Ningbo, Zhejiang Province, China. The main hall of Baoguo Temple is a Song dynasty wooden structure. As the oldest wooden building in Jiangnan, China, it is a major historical relic under national protection. In 2005, the Baoguo Temple Ancient Architecture Museum was set up and opened to the public. From 2007, in order to protect the temple more effectively and with better foresight, the museum began to build information collecting systems for historical architecture using modern information technology. After comparing related studies at home and abroad, we found that heritage protection abroad started earlier than in China and has already established thorough protection systems and mechanisms, together with relevant laws and regulations. The technology used in protection abroad is not limited to RS, GIS, GPS and VR, but also includes emerging techniques such as computational fluid dynamics models that simulate temperature and humidity conditions. The main body of this paper covers four parts. The first is the existing information system. In this part, we introduce the information collecting system that was initially built in 2007 at the Baoguo Temple Ancient Architecture Museum. Using modern digital information technology, researchers can gradually check and acquire information on the material of the relics, the structural stress conditions and the natural environment, all of which may affect the building. This part is divided into information collection, information management and exhibition. The second part is the design of an upgrade scheme for the original information collecting equipment and technology. The original microenvironment information collecting system is relatively independent, and its data have not been included in the management of the system.
The original sensors transmit signals by wire and interfere with each other when working together, which sometimes causes congestion. In addition, the original system has been working continuously for seven years and cannot adapt to new computer hardware and operating systems. This part is divided into data integration of information collection, upgrading of equipment and addition of information collecting points, and upgrading of the information management and exhibition system. The third part is the scheme design of newly added information collecting projects. Based on the shortcomings exposed so far, the added projects may include real-time information collection of groundwater level and quantity, surface water quantity and velocity, mountain landslides, vibration of the main hall, the wood construction material of the main hall, the structure of the main hall, the condition of key components of the main hall, air composition and pollution such as the concentrations of SO2, PM2.5, O2 and CO2, and insect pests such as termites. The fourth part discusses the comprehensive application of the collected information. This part may include comprehensive analysis, management application and publishing of the collected information, and exhibition of the information collecting system. Through this research, we want to make information collecting work more complete and to protect historical heritage more scientifically and effectively.
[Tuberculosis and the modern ideal of living].
Medici, T C
2003-08-20
Sunlight and fresh air belong to the myths of everyday life. This myth has influenced our times and personal lives as much as industrialization. Today we are hardly aware of the multiple and omnipresent consequences of this myth. The modern movement with all its facets, including modern architecture, is barely conceivable without it. What is the link between this triad with all its effects and tuberculosis, the oldest and most important infectious disease, which still claims more than 3 million deaths per year worldwide? Tuberculosis was treated by sunlight and fresh air at all times. This treatment was at its zenith during the second half of the 19th century, after Hermann Brehmer had initiated this treatment within sanatoria in 1862. The sanatorium vogue lasted until the middle of the last century, when streptomycin was isolated by Selman Waksman in 1943. A new type of hospital was necessary for treating the patients with sunlight and fresh air: the sanatorium with its wide windows, sheltered open balconies, terraces and "Liegehallen". In return, this airy type of building was the forerunner of a new architectural style, called "Neues Bauen". The latter has profoundly influenced our modern ideal of living since Le Corbusier built the Villa Savoye, one of the architectural highlights of the 20th century.
NASA Astrophysics Data System (ADS)
Arab, Yasser; Hassan, Ahmad Sanusi; Qanaa, Bushra
2017-10-01
This research analyzed the façade thermal performance of high-rise buildings with modern and neo-minimalist architectural style. Four high-rise apartment buildings in Penang Island are selected as case studies for this research. The modern architectural style, which was popular during the 1970s to 1990s, nearly disregarded the cultural identity of the country and used the basic geometric shapes in the design. Conversely, the neo-minimalist style is the popular style from the 2010s up to the present. This style is a result of the "less is more" concept, which means using minimal applications to obtain an efficient design. The four selected case studies are as follows: Halaman Kristal 2 and Mutiara Idaman 1 with modern architectural style and Light Linear and Baystar apartments with neo-minimalist style. The research uses Fluke Ti20 thermal imager to capture thermal images of the west façade of the selected case studies on an hourly basis from 12:00 to 6:00 P.M. on March 15, 2017. Results confirm that the neo-minimalist façade elements, such as balconies and recessed walls, as well as other shading elements, are effective in improving the performance of façade shading. Notably, façade shading causes low surface temperature and provides cool indoor atmosphere during the day when the temperature is extremely high outside. Accordingly, this distinct feature partly explains the current popularity of the neo-minimalist architectural style.
Brain architecture: a design for natural computation.
Kaiser, Marcus
2007-12-15
Fifty years ago, John von Neumann compared the architecture of the brain with that of the computers he invented and which are still in use today. In those days, the organization of computers was based on concepts of brain organization. Here, we give an update on current results on the global organization of neural systems. For neural systems, we outline how the spatial and topological architecture of neuronal and cortical networks facilitates robustness against failures, fast processing and balanced network activation. Finally, we discuss mechanisms of self-organization for such architectures. After all, the organization of the brain might again inspire computer architecture.
González-Plaza, Juan J.; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F.; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R.; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R.
2016-01-01
Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional, with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating the juvenile-to-adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and the potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species. PMID:26973682
A new software-based architecture for quantum computer
NASA Astrophysics Data System (ADS)
Wu, Nan; Song, FangMin; Li, Xiangdong
2010-04-01
In this paper, we study a reliable architecture for a quantum computer, together with a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of quantum computing. We also try to address some key issues in software-driven universal quantum computers in detail.
Parallel Finite Element Domain Decomposition for Structural/Acoustic Analysis
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.; Tungkahotara, Siroj; Watson, Willie R.; Rajan, Subramaniam D.
2005-01-01
A domain decomposition (DD) formulation for solving sparse linear systems of equations resulting from finite element analysis is presented. The formulation incorporates mixed direct and iterative equation solving strategies and other novel algorithmic ideas that are optimized to take advantage of sparsity and exploit modern computer architecture, such as memory and parallel computing. The most time-consuming part of the formulation is identified, and the critical roles of direct sparse and iterative solvers within the framework of the formulation are discussed. Experiments on several computer platforms using several complex test matrices are conducted using software based on the formulation. Small-scale structural examples are used to validate the steps in the formulation, and large-scale (1,000,000+ unknowns) duct acoustic examples are used to evaluate the ORIGIN 2000 processors and a cluster of 6 PCs (running under the Windows environment). Statistics show that the formulation is efficient in both sequential and parallel computing environments and that the formulation is significantly faster and consumes less memory than that based on one of the best available commercialized parallel sparse solvers.
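The abstract above does not give the formulation itself, so the following is only a minimal sketch of one common domain decomposition scheme (Schur complement substructuring), not the authors' method. It assumes the subdomain interiors couple only through the interface unknowns, and it solves the interface problem directly for brevity, whereas the abstract describes a mix of direct and iterative solvers. All function names (`solve`, `sub`, `solve_by_substructuring`) are hypothetical.

```python
def solve(a, b):
    """Dense Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def sub(a, rows, cols):
    """Extract the block of `a` with the given row and column indices."""
    return [[a[r][c] for c in cols] for r in rows]


def solve_by_substructuring(a, b, interiors, interface):
    """Solve a x = b with one index list per subdomain interior plus shared interface indices."""
    g = interface
    s = sub(a, g, g)                     # Schur complement, starts as A_gg
    rhs = [b[i] for i in g]              # condensed right-hand side
    local = []
    for idx in interiors:                # each pass is independent: one per processor
        a_ii, a_ig, a_gi = sub(a, idx, idx), sub(a, idx, g), sub(a, g, idx)
        y = solve(a_ii, [b[i] for i in idx])                     # A_ii^-1 b_i
        z = [solve(a_ii, [row[k] for row in a_ig])               # columns of A_ii^-1 A_ig
             for k in range(len(g))]
        for p in range(len(g)):
            rhs[p] -= sum(a_gi[p][q] * y[q] for q in range(len(idx)))
            for k in range(len(g)):
                s[p][k] -= sum(a_gi[p][q] * z[k][q] for q in range(len(idx)))
        local.append((idx, y, z))
    x_g = solve(s, rhs)                  # small interface problem
    x = [0.0] * len(b)
    for p, i in enumerate(g):
        x[i] = x_g[p]
    for idx, y, z in local:              # back-substitute into each interior
        for q, i in enumerate(idx):
            x[i] = y[q] - sum(z[k][q] * x_g[k] for k in range(len(g)))
    return x
```

For a 7-unknown tridiagonal system, for instance, one could pass `interiors=[[0, 1, 2], [4, 5, 6]]` and `interface=[3]`; the two interior solves never touch each other's data, which is the property a parallel implementation would exploit.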
The next generation of command post computing
NASA Astrophysics Data System (ADS)
Arnold, Ross D.; Lieb, Aaron J.; Samuel, Jason M.; Burger, Mitchell A.
2015-05-01
The future of command post computing demands an innovative new solution to address a variety of challenging operational needs. The Command Post of the Future is the Army's primary command and control decision support system, providing situational awareness and collaborative tools for tactical decision making, planning, and execution management from Corps to Company level. However, as the U.S. Army moves towards a lightweight, fully networked battalion, disconnected operations, thin client architecture and mobile computing become increasingly essential. The Command Post of the Future is not designed to support these challenges in the coming decade. Therefore, research into a hybrid blend of technologies is in progress to address these issues. This research focuses on a new command and control system utilizing the rich collaboration framework afforded by Command Post of the Future coupled with a new user interface consisting of a variety of innovative workspace designs. This new system is called Tactical Applications. This paper details a brief history of command post computing, presents the challenges facing the modern Army, and explores the concepts under consideration for Tactical Applications that meet these challenges in a variety of innovative ways.
New materials and structures for photovoltaics
NASA Astrophysics Data System (ADS)
Zunger, Alex; Wagner, S.; Petroff, P. M.
1993-01-01
Despite the fact that over the years crystal chemists have discovered numerous semiconducting substances, and that modern epitaxial growth techniques are able to produce many novel atomic-scale architectures, current electronic and opto-electronic technologies are based on but a handful of ~10 traditional semiconductor core materials. This paper surveys a number of yet-unexploited classes of semiconductors, pointing to the much-needed research in screening, growing, and characterizing promising members of these classes. In light of the unmanageably large number of a-priori possibilities, we emphasize the role that structural chemistry and modern computer-aided design must play in screening potentially important candidates. The basic classes of materials discussed here include nontraditional alloys, such as non-isovalent and heterostructural semiconductors, materials at reduced dimensionality, including superlattices, zeolite-caged nanostructures and organic semiconductors, spontaneously ordered alloys, interstitial semiconductors, filled tetrahedral structures, ordered vacancy compounds, and compounds based on d and f electron elements. A collaborative effort among material predictor, material grower, and material characterizer holds the promise for a successful identification of new and exciting systems.
Platform-dependent optimization considerations for mHealth applications
NASA Astrophysics Data System (ADS)
Kaghyan, Sahak; Akopian, David; Sarukhanyan, Hakob
2015-03-01
Modern mobile devices contain integrated sensors that enable a multitude of applications in such fields as mobile health (mHealth), entertainment, sports, etc. Human physical activity monitoring is one such emerging application. There exists a range of challenges that relate to activity monitoring tasks, particularly in finding optimal solutions and architectures for the development of the respective mobile software applications. This work addresses mobile computations related to the integrated sensing sources that can be used for activity monitoring, such as accelerometers, gyroscopes, the integrated global positioning system (GPS) and WLAN-based positioning. Each of the sensing data sources has its own characteristics, such as specific data formats, data rates and signal acquisition durations, and these specifications affect energy consumption. Energy consumption varies significantly because sensor data acquisition is followed by data analysis, including various transformations and signal processing algorithms. This paper addresses several aspects of more optimal activity monitoring implementations that exploit the state-of-the-art capabilities of modern platforms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arumugam, Kamesh
Efficient parallel implementation of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. It requires exploiting the data-parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of them employ irregular algorithms which exhibit data-dependent control-flow and irregular memory accesses. Furthermore, these applications are often iterative with dependency between steps, making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particle beam dynamics is one such application, where the distribution of work and the memory access pattern at each time step are irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications has focused on optimizing the irregular, data-dependent memory accesses and control-flow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-flow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps.
In this dissertation, we present novel machine learning based optimization techniques to address the parallel implementation challenges of such irregular applications on different HPC architectures. In particular, we use supervised learning to predict the computation structure and use it to address the control-flow and memory access irregularities in the parallel implementation of such applications on GPUs, Xeon Phis, and heterogeneous architectures composed of multi-core CPUs with GPUs or Xeon Phis. We use numerical simulation of charged particle beam dynamics as a motivating example throughout the dissertation to present our new approach, though it should be equally applicable to a wide range of irregular applications. The machine learning approach presented here uses predictive analytics and forecasting techniques to adaptively model and track the irregular memory access pattern at each time step of the simulation in order to anticipate the future memory access pattern. Access pattern forecasts can then be used to formulate optimization decisions during application execution, which improve the performance of the application at a future time step based on the observations from earlier time steps. In heterogeneous architectures, forecasts can also be used to improve the memory performance and resource utilization of all the processing units to deliver good aggregate performance. We used these optimization techniques and the anticipation strategy to design a cache-aware, memory-efficient parallel algorithm to address the irregularities in the parallel implementation of charged particle beam dynamics simulation on different HPC architectures. Experimental results using a diverse mix of HPC architectures show that our anticipation strategy is effective in maximizing data reuse, ensuring workload balance, minimizing branch and memory divergence, and improving resource utilization.
An integrated autonomous rendezvous and docking system architecture using Centaur modern avionics
NASA Technical Reports Server (NTRS)
Nelson, Kurt
1991-01-01
The avionics system for the Centaur upper stage is in the process of being modernized with the current state-of-the-art in strapdown inertial guidance equipment. This equipment includes an integrated flight control processor with a ring laser gyro based inertial guidance system. This inertial navigation unit (INU) uses two MIL-STD-1750A processors and communicates over the MIL-STD-1553B data bus. Commands are translated into load activation through a Remote Control Unit (RCU) which incorporates the use of solid state relays. Also, a programmable data acquisition system replaces separate multiplexer and signal conditioning units. This modern avionics suite is currently being enhanced through independent research and development programs to provide autonomous rendezvous and docking capability using advanced cruise missile image processing technology and integrated GPS navigational aids. A system concept was developed to combine these technologies in order to achieve a fully autonomous rendezvous, docking, and autoland capability. The current system architecture and the evolution of this architecture using advanced modular avionics concepts being pursued for the National Launch System are discussed.
Challenges for Deploying Man-Portable Robots into Hostile Environments
2000-11-01
video, JAUGS, MDARS 1. BACKGROUND In modern-day warfare the most likely battlefield is an urban environment, which poses many threats to today’s...teleoperation, reconnaissance, surveillance, digital video, JAUGS, MDARS 15. SUBJECT TERMS 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT 18...Architecture (MRHA) and the Joint Architecture for Unmanned Ground Systems (JAUGS). The hybrid architecture is termed SMART for Small Robotic Technology. It
Astronomy, Community, and Modern Calendar Buildings
NASA Astrophysics Data System (ADS)
Campion, N.
2016-01-01
This paper will look at Avon Tyrrell House, a “calendar house” dating from 1891 and an example of nineteenth-century astronomical architecture in England. The paper will suggest that “calendar buildings” may represent a genre of modern astronomical architecture which has so far not been studied, that they were designed to create stronger communities precisely because of their astronomical connections, and that they offer scope for further investigation. The paper will contextualize the modern “calendar building” within the tradition of constructing cities and sacred sites as reflections or embodiments of the sky. By creating spaces which were connected to the celestial bodies, it was possible to create human communities which were linked to celestial ones, encouraging social stability and harmony. Such ideas underpinned traditions of the foundation of cities from China, through India, the Middle East, and Mesoamerica.
Architectures for single-chip image computing
NASA Astrophysics Data System (ADS)
Gove, Robert J.
1992-04-01
This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new-generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.
Blended Design Approach of Long Span Structure and Malay Traditional Architecture
NASA Astrophysics Data System (ADS)
Sundari, Titin
2017-12-01
The world's population is growing fast, and with it the need for new, large-scale activities. Architects face the problem of how to provide buildings for activities such as large meetings, conferences, indoor gymnasiums and sports, and many others. A long span building structure is one solution to that problem. Generally, large buildings that implement this structure look technological, modern and futuristic, or even neo-futuristic. On the other hand, many people still want to enjoy the specific and unique sense of local traditional architecture. So do the Malay people, who want easy, pleasant large facilities, which can be provided by implementing modern long span building structure technology. At the same time, their unique sense of Malay traditional architecture can still be maintained. To overcome this double design problem, a blended design approach of long span structure and Malay Traditional Architecture is needed.
Physically Based Virtual Surgery Planning and Simulation Tools for Personal Health Care Systems
NASA Astrophysics Data System (ADS)
Dogan, Firat; Atilgan, Yasemin
Virtual surgery planning and simulation tools have gained a great deal of importance in the last decade as a consequence of increasing capacities at the information technology level. Modern hardware architectures, large scale database systems, grid based computer networks, agile development processes, better 3D visualization and all the other strong aspects of information technology bring the necessary instruments to almost every desk. The last decade’s special software and sophisticated supercomputer environments now serve individual needs inside “tiny smart boxes” at reasonable prices. However, resistance to learning new computerized environments, insufficient training and other old habits prevent effective utilization of IT resources by specialists in the health sector. In this paper, former and current developments in surgery planning and simulation related tools are presented, and future directions and expectations are investigated for better electronic health care systems.
NASA Astrophysics Data System (ADS)
Pei, Zongrui; Eisenbach, Markus
2017-06-01
Dislocations are among the most important defects in determining the mechanical properties of both conventional alloys and high-entropy alloys. The Peierls-Nabarro model supplies an efficient pathway to their geometries and mobility. The difficulty in solving the integro-differential Peierls-Nabarro equation is how to effectively avoid the local minima in the energy landscape of a dislocation core. Among the methods available to optimize dislocation core structures, we choose Particle Swarm Optimization, an algorithm that simulates the social behaviors of organisms. By employing more particles (a bigger swarm) and more iterative steps (allowing the particles to explore for a longer time), the local minima can be effectively avoided, although at a higher computational cost. The advantage of this algorithm is that it is readily parallelized on modern high-performance computing architectures. We demonstrate that the performance of our parallelized algorithm scales linearly with the number of employed cores.
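To illustrate the swarm-size and iteration-count trade-off mentioned in the abstract above, here is a minimal sketch of a basic global-best Particle Swarm Optimization loop. It is a generic textbook variant, not the authors' implementation; the function name `pso_minimize`, the coefficient values, and the box-shaped search domain are assumptions made for illustration, and the objective `f` stands in for whatever energy functional is being minimized.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=30, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize f over the box [lo, hi]^dim with a global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # best point seen by each particle
    best_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        # Move every particle using the bests known at the start of the iteration.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
        # Objective evaluations are independent of each other, so this is the
        # step a parallel implementation would distribute across cores.
        values = [f(p) for p in pos]
        for i, val in enumerate(values):
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
            if val < gbest_val:
                gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val
```

A call such as `pso_minimize(energy, dim=2, bounds=(-5.0, 5.0), n_particles=100, n_iters=500)` (with `energy` a user-supplied function) trades extra objective evaluations for a better chance of escaping local minima, which is the cost the abstract refers to.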
Solvable Family of Driven-Dissipative Many-Body Systems.
Foss-Feig, Michael; Young, Jeremy T; Albert, Victor V; Gorshkov, Alexey V; Maghrebi, Mohammad F
2017-11-10
Exactly solvable models have played an important role in establishing the sophisticated modern understanding of equilibrium many-body physics. Conversely, the relative scarcity of solutions for nonequilibrium models greatly limits our understanding of systems away from thermal equilibrium. We study a family of nonequilibrium models, some of which can be viewed as dissipative analogues of the transverse-field Ising model, in that an effectively classical Hamiltonian is frustrated by dissipative processes that drive the system toward states that do not commute with the Hamiltonian. Surprisingly, a broad and experimentally relevant subset of these models can be solved efficiently. We leverage these solutions to compute the effects of decoherence on a canonical trapped-ion-based quantum computation architecture, and to prove a no-go theorem on steady-state phase transitions in a many-body model that can be realized naturally with Rydberg atoms or trapped ions.
Solvable Family of Driven-Dissipative Many-Body Systems
NASA Astrophysics Data System (ADS)
Foss-Feig, Michael; Young, Jeremy T.; Albert, Victor V.; Gorshkov, Alexey V.; Maghrebi, Mohammad F.
2017-11-01
Exactly solvable models have played an important role in establishing the sophisticated modern understanding of equilibrium many-body physics. Conversely, the relative scarcity of solutions for nonequilibrium models greatly limits our understanding of systems away from thermal equilibrium. We study a family of nonequilibrium models, some of which can be viewed as dissipative analogues of the transverse-field Ising model, in that an effectively classical Hamiltonian is frustrated by dissipative processes that drive the system toward states that do not commute with the Hamiltonian. Surprisingly, a broad and experimentally relevant subset of these models can be solved efficiently. We leverage these solutions to compute the effects of decoherence on a canonical trapped-ion-based quantum computation architecture, and to prove a no-go theorem on steady-state phase transitions in a many-body model that can be realized naturally with Rydberg atoms or trapped ions.
NASA Technical Reports Server (NTRS)
Savely, Robert T.; Loftin, R. Bowen
1990-01-01
Training is a major endeavor in all modern societies. Common training methods include training manuals, formal classes, procedural computer programs, simulations, and on-the-job training. NASA's training approach has focussed primarily on on-the-job training in a simulation environment for both crew and ground based personnel. NASA must explore new approaches to training for the 1990's and beyond. Specific autonomous training systems are described which are based on artificial intelligence technology for use by NASA astronauts, flight controllers, and ground based support personnel that show an alternative to current training systems. In addition to these specific systems, the evolution of a general architecture for autonomous intelligent training systems that integrates many of the features of traditional training programs with artificial intelligence techniques is presented. These Intelligent Computer Aided Training (ICAT) systems would provide much of the same experience that could be gained from the best on-the-job training.
Architecture-Adaptive Computing Environment: A Tool for Teaching Parallel Programming
NASA Technical Reports Server (NTRS)
Dorband, John E.; Aburdene, Maurice F.
2002-01-01
Recently, networked and cluster computation have become very popular. This paper is an introduction to a new C based parallel language for architecture-adaptive programming, aCe C. The primary purpose of aCe (Architecture-adaptive Computing Environment) is to encourage programmers to implement applications on parallel architectures by providing them the assurance that future architectures will be able to run their applications with a minimum of modification. A secondary purpose is to encourage computer architects to develop new types of architectures by providing an easily implemented software development environment and a library of test applications. This new language should be an ideal tool to teach parallel programming. In this paper, we will focus on some fundamental features of aCe C.
Toward a Fault Tolerant Architecture for Vital Medical-Based Wearable Computing.
Abdali-Mohammadi, Fardin; Bajalan, Vahid; Fathi, Abdolhossein
2015-12-01
Advancements in computers and electronic technologies have led to the emergence of a new generation of efficient small intelligent systems. The products of such technologies include smartphones and wearable devices, which have attracted the attention of medical applications. These products are used less in critical medical applications because of their resource constraints and failure sensitivity: without safety considerations, small integrated hardware will endanger patients' lives. Therefore, principles are required for constructing wearable systems in healthcare so that the existing concerns are dealt with. Accordingly, this paper proposes an architecture for constructing wearable systems in critical medical applications. The proposed architecture is a three-tier one, supporting data flow from body sensors to the cloud. The tiers of this architecture are wearable computers, mobile computing, and mobile cloud computing. One of the features of this architecture is its high potential fault tolerance due to the nature of its components. Moreover, the required protocols are presented to coordinate the components of this architecture. Finally, the reliability of this architecture is assessed by simulating the architecture and its components, and other aspects of the proposed architecture are discussed.
The Modern Solar House: Architecture, Energy, and the Emergence of Environmentalism, 1938--1959
NASA Astrophysics Data System (ADS)
Barber, Daniel A.
This dissertation describes the active discourse regarding solar house heating in American architectural, engineering, political, economic, and corporate contexts from the eve of World War II until the late 1950s. Interweaving these multiple narratives, the aim of the project is threefold: to document this vital discourse, to place it in the context of the history of architecture, and to trace through it the emergence of a techno-cultural environmentalism. Experimentation in the solar house relied on the principles of modern architecture for both energy efficiency and claims to cultural relevance. A passive "solar house principle" was developed in the late 30s in the suburban houses of George Fred Keck that involved open plans and flexible roof lines, and emphasized volumetric design. Spurred by wartime concern over energy resource depletion, architectural interest in solar heating also engaged an engineering discourse; in particular, an experimental program at the Massachusetts Institute of Technology led to four solar houses and a codification of its technological parameters. Attention to the MIT projects at the UN and in the Truman and Eisenhower administrations placed the solar house as a central node in an emergent network exploring the problems and possibilities of a renewable resource economy. Further experimentation elaborated on connections between this architectural-engineering discourse and the technical assistance regimes of development assistance; here by MIT researcher Maria Telkes, who also collaborated, at different junctures, with the architects Eleanor Raymond and Aladar Olgyay. The solar house discourse was further developed as a cultural project in the 1958 competition to design a solar heated residence, "Living With the Sun," which coalesced the diverse formal tendencies of midcentury modernism to promote the solar house as an innovation in both lifestyle and policy.
Though the examples described are not successful as either technological objects or cultural projects, the story of the modern solar house excavates a history of the present anxiety concerning the relationship between environmental and social conditions. Perhaps most cogently, the narrative reconfigures the role of architecture within such discussions, as a site for both technological innovation and for experimentation in the formation of an environmentalist culture.
Advanced computer architecture specification for automated weld systems
NASA Technical Reports Server (NTRS)
Katsinis, Constantine
1994-01-01
This report describes the requirements for an advanced automated weld system and the associated computer architecture, and defines the overall system specification from a broad perspective. According to the requirements of welding procedures as they relate to an integrated multiaxis motion control and sensor architecture, the computer system requirements are developed based on a proven multiple-processor architecture with an expandable, distributed-memory, single global bus architecture, containing individual processors which are assigned to specific tasks that support sensor or control processes. The specified architecture is sufficiently flexible to integrate previously developed equipment, be upgradable and allow on-site modifications.
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
Recursive computer architecture for VLSI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treleaven, P.C.; Hopkins, R.P.
1982-01-01
A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is restricted to simple, identical microcomputers, each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems is termed fifth generation computers by the Japanese. 30 references.
Hypercluster Parallel Processor
NASA Technical Reports Server (NTRS)
Blech, Richard A.; Cole, Gary L.; Milner, Edward J.; Quealy, Angela
1992-01-01
Hypercluster computer system includes multiple digital processors, operation of which coordinated through specialized software. Configurable according to various parallel-computing architectures of shared-memory or distributed-memory class, including scalar computer, vector computer, reduced-instruction-set computer, and complex-instruction-set computer. Designed as flexible, relatively inexpensive system that provides single programming and operating environment within which one can investigate effects of various parallel-computing architectures and combinations on performance in solution of complicated problems like those of three-dimensional flows in turbomachines. Hypercluster software and architectural concepts are in public domain.
The Weather Forecast Using Data Mining Research Based on Cloud Computing.
NASA Astrophysics Data System (ADS)
Wang, ZhanJie; Mazharul Mujib, A. B. M.
2017-10-01
Weather forecasting has been an important application in meteorology and one of the most scientifically and technologically challenging problems around the world. In this study, we analyze the use of data mining techniques in forecasting weather. This paper proposes a modern method to develop a service-oriented architecture for weather information systems that forecast weather using these data mining techniques. This can be carried out by using artificial neural network and decision tree algorithms and meteorological data collected at specific times. The algorithm produced the best results when generating classification rules for the mean weather variables. The results showed that these data mining techniques can be sufficient for weather forecasting.
Distributed Computing Architecture for Image-Based Wavefront Sensing and 2D FFTs
NASA Technical Reports Server (NTRS)
Smith, Jeffrey S.; Dean, Bruce H.; Haghani, Shadan
2006-01-01
Image-based wavefront sensing (WFS) provides significant advantages over interferometric-based wavefront sensors, such as optical design simplicity and stability. However, the image-based approach is computationally intensive, and therefore specialized high-performance computing architectures are required in applications utilizing the image-based approach. The development and testing of these high-performance computing architectures are essential to such missions as the James Webb Space Telescope (JWST), Terrestrial Planet Finder-Coronagraph (TPF-C and CorSpec), and Spherical Primary Optical Telescope (SPOT). The development of these specialized computing architectures requires numerous two-dimensional Fourier transforms, which necessitate an all-to-all communication when applied on a distributed computational architecture. Several solutions for distributed computing are presented with an emphasis on a 64-node cluster of DSPs, multiple DSP FPGAs, and an application of low-diameter graph theory. Timing results and performance analysis will be presented. The solutions offered could be applied to other all-to-all communication and scientifically computationally complex problems.
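The all-to-all communication mentioned in this abstract arises because a 2D transform is usually split into row transforms, a global transpose, and a second pass of row transforms. The following is a minimal single-process sketch of that idea, not the authors' DSP or FPGA implementation: the "nodes" are plain Python lists, a naive O(n^2) DFT stands in for an FFT, and all function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive 1D discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def all_to_all_transpose(slabs):
    """Simulate the all-to-all exchange: every node sends a block to every
    other node, and the net effect is a transpose of the global matrix."""
    rows = [row for slab in slabs for row in slab]
    cols = [list(col) for col in zip(*rows)]
    per_node = len(cols) // len(slabs)
    return [cols[i * per_node:(i + 1) * per_node] for i in range(len(slabs))]

def distributed_dft2(image, n_nodes):
    """2D DFT of a square image split into row slabs across n_nodes."""
    n = len(image)
    assert n % n_nodes == 0, "rows must divide evenly among nodes"
    per_node = n // n_nodes
    slabs = [image[i * per_node:(i + 1) * per_node] for i in range(n_nodes)]
    slabs = [[dft(row) for row in slab] for slab in slabs]   # local row pass
    slabs = all_to_all_transpose(slabs)                      # communication
    slabs = [[dft(row) for row in slab] for slab in slabs]   # local column pass
    slabs = all_to_all_transpose(slabs)                      # restore layout
    return [row for slab in slabs for row in slab]

def direct_dft2(image):
    """Reference 2D DFT computed straight from the definition."""
    n, m = len(image), len(image[0])
    return [[sum(image[r][c] * cmath.exp(-2j * cmath.pi * (u * r / n + v * c / m))
                 for r in range(n) for c in range(m))
             for v in range(m)]
            for u in range(n)]
```

Only the transpose step needs data from every other node, which is why the interconnect topology matters for this workload.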
Peer-to-peer Monte Carlo simulation of photon migration in topical applications of biomedical optics
NASA Astrophysics Data System (ADS)
Doronin, Alexander; Meglinski, Igor
2012-09-01
In the framework of further development of the unified approach of photon migration in complex turbid media, such as biological tissues, we present a peer-to-peer (P2P) Monte Carlo (MC) code. Object-oriented programming is used to generalize the MC model for multipurpose use in various applications of biomedical optics. The online user interface providing multiuser access is developed using modern web technologies, such as Microsoft Silverlight and ASP.NET. The emerging P2P network utilizing computers with different types of compute unified device architecture-capable graphics processing units (GPUs) is applied for acceleration and to overcome the limitations imposed by multiuser access in the online MC computational tool. The developed P2P MC was validated by comparing the results of simulation of diffuse reflectance and fluence rate distribution for a semi-infinite scattering medium with known analytical results, results of the adding-doubling method, and other GPU-based MC techniques developed in the past. The best speedup of processing multiuser requests in a range of 4 to 35 s was achieved using single-precision computing, and double-precision computing for floating-point arithmetic operations provides higher accuracy.
Analysis of impact of general-purpose graphics processor units in supersonic flow modeling
NASA Astrophysics Data System (ADS)
Emelyanov, V. N.; Karpenko, A. G.; Kozelkov, A. S.; Teterina, I. V.; Volkov, K. N.; Yalozo, A. V.
2017-06-01
Computational methods are widely used in prediction of complex flowfields associated with off-normal situations in aerospace engineering. Modern graphics processing units (GPUs) provide architectures and new programming models that make it possible to harness their large processing power and to design computational fluid dynamics (CFD) simulations at both high performance and low cost. Possibilities for the use of GPUs in the simulation of external and internal flows on unstructured meshes are discussed. The finite volume method is applied to solve three-dimensional unsteady compressible Euler and Navier-Stokes equations on unstructured meshes with high-resolution numerical schemes. CUDA technology is used to implement the parallel computational algorithms. Solutions of some benchmark test cases on GPUs are reported, and the computed results are compared with experimental and computational data. Approaches to optimization of the CFD code related to the use of different types of memory are considered. The speedup of the solution on GPUs is compared with respect to the solution on a central processing unit (CPU). Performance measurements show that the numerical schemes developed achieve a 20-50x speedup on GPU hardware compared to the CPU reference implementation. The results obtained provide a promising perspective for designing a GPU-based software framework for applications in CFD.
CSP: A Multifaceted Hybrid Architecture for Space Computing
NASA Technical Reports Server (NTRS)
Rudolph, Dylan; Wilson, Christopher; Stewart, Jacob; Gauvin, Patrick; George, Alan; Lam, Herman; Crum, Gary Alex; Wirthlin, Mike; Wilson, Alex; Stoddard, Aaron
2014-01-01
Research on the CHREC Space Processor (CSP) takes a multifaceted hybrid approach to embedded space computing. Working closely with the NASA Goddard SpaceCube team, researchers at the National Science Foundation (NSF) Center for High-Performance Reconfigurable Computing (CHREC) at the University of Florida and Brigham Young University are developing hybrid space computers that feature an innovative combination of three technologies: commercial-off-the-shelf (COTS) devices, radiation-hardened (RadHard) devices, and fault-tolerant computing. Modern COTS processors provide the utmost in performance and energy-efficiency but are susceptible to ionizing radiation in space, whereas RadHard processors are virtually immune to this radiation but are more expensive, larger, less energy-efficient, and generations behind in speed and functionality. By featuring COTS devices to perform the critical data processing, supported by simpler RadHard devices that monitor and manage the COTS devices, and augmented with novel uses of fault-tolerant hardware, software, information, and networking within and between COTS devices, the resulting system can maximize performance and reliability while minimizing energy consumption and cost. NASA Goddard has adopted the CSP concept and technology with plans underway to feature flight-ready CSP boards on two upcoming space missions.
The architectural form of Qikou Cave dwellings in Chinese "Earth" culture
NASA Astrophysics Data System (ADS)
Chen, Xuanchen; Feng, Xinqun
2018-03-01
Cave building is not only a kind of architecture with a unique style, but also a manifestation of Chinese traditional culture. Cave culture is an important part of Chinese traditional culture. The main purpose of this thesis, which studies the architectural form of Qikou Cave, is to analyze how cave building plays a positive role in promoting the development and application of modern resources and in cultural transmission. Based on a large amount of literature and taking Qikou Cave as an example, the paper studies the morphological characteristics of cave building and takes an optimistic outlook on its future development and the sustainable development of resources. It is expected that cave culture can be further explored to promote traditional Chinese culture and to drive the development of the modern construction industry and resource conservation.
Workflows for Full Waveform Inversions
NASA Astrophysics Data System (ADS)
Boehm, Christian; Krischer, Lion; Afanasiev, Michael; van Driel, Martin; May, Dave A.; Rietmann, Max; Fichtner, Andreas
2017-04-01
Despite many theoretical advances and the increasing availability of high-performance computing clusters, full seismic waveform inversions still face considerable challenges regarding data and workflow management. While the community has access to solvers which can harness modern heterogeneous computing architectures, the computational bottleneck has fallen to these often manpower-bounded issues that need to be overcome to facilitate further progress. Modern inversions involve huge amounts of data and require a tight integration between numerical PDE solvers, data acquisition and processing systems, nonlinear optimization libraries, and job orchestration frameworks. To this end we created a set of libraries and applications revolving around Salvus (http://salvus.io), a novel software package designed to solve large-scale full waveform inverse problems. This presentation focuses on solving passive source seismic full waveform inversions from local to global scales with Salvus. We discuss (i) design choices for the aforementioned components required for full waveform modeling and inversion, (ii) their implementation in the Salvus framework, and (iii) how it is all tied together by a usable workflow system. We combine state-of-the-art algorithms ranging from high-order finite-element solutions of the wave equation to quasi-Newton optimization algorithms using trust-region methods that can handle inexact derivatives. All is steered by an automated interactive graph-based workflow framework capable of orchestrating all necessary pieces. This naturally facilitates the creation of new Earth models and hopefully sparks new scientific insights. Additionally, and even more importantly, it enhances reproducibility and reliability of the final results.
Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS
NASA Astrophysics Data System (ADS)
Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.
2018-03-01
Simulation of breaking waves using the Navier-Stokes equations via the moving particle semi-implicit (MPS) method over a closed domain is given. The results show that parallel computing on a multicore architecture using the OpenMP platform can reduce the computational time to almost half of the serial time. Here, a comparison of two computer architectures (AMD and Intel) is performed. The Intel architecture is shown to be better than the AMD architecture in CPU time. However, in efficiency, the computer with the AMD architecture scores slightly higher than the Intel one. For the simulation with 1512 particles, the CPU times using Intel and AMD are 12662.47 and 28282.30, respectively. Moreover, for the same number of particles, AMD obtains an efficiency of 50.09% and Intel up to 49.42%.
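The abstract reports CPU time and efficiency without stating how efficiency was computed. A common convention, assumed here, is speedup divided by the number of threads. The sketch below uses made-up timings and a hypothetical thread count purely to show the arithmetic; none of the inputs come from the paper.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    return t_serial / t_parallel

def parallel_efficiency(t_serial, t_parallel, n_threads):
    """Speedup per thread; 1.0 would mean perfect scaling (assumed definition)."""
    return speedup(t_serial, t_parallel) / n_threads

# Hypothetical example: a run that halves the serial time on 4 threads.
example = parallel_efficiency(t_serial=100.0, t_parallel=50.0, n_threads=4)
```

Under this definition a machine can have the lower absolute CPU time and still the lower efficiency, which matches the pattern the abstract describes for Intel versus AMD.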
A synchronized computational architecture for generalized bilateral control of robot arms
NASA Technical Reports Server (NTRS)
Bejczy, Antal K.; Szakaly, Zoltan
1987-01-01
This paper describes a computational architecture for an interconnected high-speed distributed computing system for generalized bilateral control of robot arms. The key method of the architecture is the use of fully synchronized, interrupt-driven software. Since an objective of the development is to utilize the processing resources efficiently, the synchronization is done at the hardware level to reduce system software overhead. The architecture also achieves a balanced load on the communication channel. The paper also describes some architectural relations to trading or sharing manual and automatic control.
Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation
NASA Technical Reports Server (NTRS)
Stocker, John C.; Golomb, Andrew M.
2011-01-01
Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
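To make the discrete event simulation idea concrete, here is a small sketch of an event-queue loop for service requests competing for a fixed pool of servers. It is far simpler than the two-part model the abstract describes: arrivals and service times are given as fixed numbers rather than drawn from probability distributions, and the function and variable names are hypothetical.

```python
import heapq
from collections import deque

def simulate(requests, n_servers):
    """Run a FIFO multi-server queue as a discrete event simulation.

    requests: list of (arrival_time, service_time) pairs.
    Returns the response time (completion minus arrival) of each request.
    """
    # The event list is a priority queue ordered by simulated time.
    events = [(arrival, i, "arrive") for i, (arrival, _) in enumerate(requests)]
    heapq.heapify(events)
    free_servers = n_servers
    waiting = deque()
    response = {}
    while events:
        now, i, kind = heapq.heappop(events)
        if kind == "arrive":
            waiting.append(i)
        else:  # "depart": the request finished and its server is released
            free_servers += 1
            response[i] = now - requests[i][0]
        # Start as many waiting requests as there are free servers.
        while free_servers > 0 and waiting:
            j = waiting.popleft()
            free_servers -= 1
            heapq.heappush(events, (now + requests[j][1], j, "depart"))
    return [response[i] for i in range(len(requests))]

demand = [(0, 5), (1, 5), (2, 5)]          # hypothetical workload
constrained = simulate(demand, n_servers=1)
provisioned = simulate(demand, n_servers=3)
```

Comparing `constrained` with `provisioned` shows how a resource constraint turns into queueing delay, which is the kind of quantity such a model is meant to expose.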
Primary School Architecture in Portugal: A Case Study
ERIC Educational Resources Information Center
Freire da Silva, Jose M. R.
2008-01-01
Educational facilities became important public and specialised buildings since governments began to face the right of populations to education. Policies to provide educational buildings that respect modern notions of comfort and hygiene led architects to develop architectural concepts that corresponded to new demands on education. The need to…
Fischbach, Martin; Wiebusch, Dennis; Latoschik, Marc Erich
2017-04-01
Modularity, modifiability, reusability, and API usability are important software qualities that determine the maintainability of software architectures. Virtual, Augmented, and Mixed Reality (VR, AR, MR) systems, modern computer games, as well as interactive human-robot systems often include various dedicated input-, output-, and processing subsystems. These subsystems collectively maintain a real-time simulation of a coherent application state. The resulting interdependencies between individual state representations, mutual state access, overall synchronization, and flow of control implies a conceptual close coupling whereas software quality asks for a decoupling to develop maintainable solutions. This article presents five semantics-based software techniques that address this contradiction: Semantic grounding, code from semantics, grounded actions, semantic queries, and decoupling by semantics. These techniques are applied to extend the well-established entity-component-system (ECS) pattern to overcome some of this pattern's deficits with respect to the implied state access. A walk-through of central implementation aspects of a multimodal (speech and gesture) VR-interface is used to highlight the techniques' benefits. This use-case is chosen as a prototypical example of complex architectures with multiple interacting subsystems found in many VR, AR and MR architectures. Finally, implementation hints are given, lessons learned regarding maintainability pointed-out, and performance implications discussed.
Architectural Specialization for Inter-Iteration Loop Dependence Patterns
2015-10-01
Architectural Specialization for Inter-Iteration Loop Dependence Patterns. Christopher Batten, Computer Systems Laboratory, School of Electrical and... [Figure: "Trends in Computer Architecture"; series: Transistors (Thousands), Frequency (MHz), Typical Power (W); labeled processors: MIPS R2K, Intel P4, DEC Alpha 21264; data collected by M...] [Figure labels: (Tasks per Joule), Simple Processor, Design Power Constraint, High-Performance Architectures, Embedded Architectures, Design Performance]
NASA Astrophysics Data System (ADS)
Jiang, Yuning; Kang, Jinfeng; Wang, Xinan
2017-03-01
Resistive switching memory (RRAM) is considered one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested on the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
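For readers unfamiliar with the classifier named in this abstract, the following is a plain software sketch of k-nearest neighbor classification. It does not model the RRAM crossbar, where distances would be evaluated in parallel in hardware; the squared Euclidean distance, the function name, and the toy data are all assumptions for illustration.

```python
from collections import Counter

def knn_classify(training_set, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors.

    training_set: list of (feature_vector, label) pairs.
    """
    def squared_distance(vector):
        return sum((a - b) ** 2 for a, b in zip(vector, query))

    # Sort stored examples by distance to the query and keep the k closest.
    nearest = sorted(training_set, key=lambda item: squared_distance(item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy two-class data set (hypothetical).
toy_data = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
            ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
```

The per-query cost in software grows with the number of stored examples, which is the part a parallel in-memory architecture aims to remove.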
Cognitive Architectures and Human-Computer Interaction. Introduction to Special Issue.
ERIC Educational Resources Information Center
Gray, Wayne D.; Young, Richard M.; Kirschenbaum, Susan S.
1997-01-01
In this introduction to a special issue on cognitive architectures and human-computer interaction (HCI), editors and contributors provide a brief overview of cognitive architectures. The following four architectures are represented by articles in this issue: Soar; LICAI (linked model of comprehension-based action planning and instruction taking);…
Biomimetic design processes in architecture: morphogenetic and evolutionary computational design.
Menges, Achim
2012-03-01
Design computation has profound impact on architectural design methods. This paper explains how computational design enables the development of biomimetic design processes specific to architecture, and how they need to be significantly different from established biomimetic processes in engineering disciplines. The paper first explains the fundamental difference between computer-aided and computational design in architecture, as the understanding of this distinction is of critical importance for the research presented. Thereafter, the conceptual relation and possible transfer of principles from natural morphogenesis to design computation are introduced and the related developments of generative, feature-based, constraint-based, process-based and feedback-based computational design methods are presented. This morphogenetic design research is then related to exploratory evolutionary computation, followed by the presentation of two case studies focusing on the exemplary development of spatial envelope morphologies and urban block morphologies.
High Temporal Resolution Mapping of Seismic Noise Sources Using Heterogeneous Supercomputers
NASA Astrophysics Data System (ADS)
Paitz, P.; Gokhberg, A.; Ermert, L. A.; Fichtner, A.
2017-12-01
The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems like earthquake fault zones, volcanoes, geothermal and hydrocarbon reservoirs. We present results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service providing seismic noise source maps for Central Europe with high temporal resolution. We use source imaging methods based on the cross-correlation of seismic noise records from all seismic stations available in the region of interest. The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept to provide the interested researchers worldwide with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for the generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise source mapping itself rests on the measurement of logarithmic amplitude ratios in suitably pre-processed noise correlations, and the use of simplified sensitivity kernels. 
During the implementation we addressed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service-oriented architecture for coordination of various sub-systems, and engineering an appropriate data storage solution. The present pilot version of the service implements noise source maps for Switzerland. Extension of the solution to Central Europe is planned for the next project phase.
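The core operation named in this abstract, cross-correlating records from pairs of stations, can be sketched in a few lines. This is a direct time-domain version on tiny synthetic sequences; it omits the pre-processing, the logarithmic amplitude ratio measurement, and the sensitivity kernels the abstract mentions, and the station data and function names are hypothetical.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum_i a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(a)):
            j = i + lag
            if 0 <= j < len(b):
                total += a[i] * b[j]
        result[lag] = total
    return result

def peak_lag(a, b, max_lag):
    """Lag at which the cross-correlation is largest."""
    correlation = cross_correlation(a, b, max_lag)
    return max(correlation, key=correlation.get)

# Synthetic records: station_b sees the same signal three samples later.
station_a = [0, 1, -2, 3, 0, -1, 2, 0, 0, 0, 0, 0]
station_b = [0, 0, 0] + station_a[:-3]
delay = peak_lag(station_a, station_b, max_lag=5)
```

With many stations the number of pairs grows quadratically, which is one reason the processing is placed on a massively parallel machine.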
The flight telerobotic servicer: From functional architecture to computer architecture
NASA Technical Reports Server (NTRS)
Lumia, Ronald; Fiala, John
1989-01-01
After a brief tutorial on the NASA/National Bureau of Standards Standard Reference Model for Telerobot Control System Architecture (NASREM) functional architecture, the approach to its implementation is shown. First, interfaces must be defined which are capable of supporting the known algorithms. This is illustrated by considering the interfaces required for the SERVO level of the NASREM functional architecture. After interface definition, the specific computer architecture for the implementation must be determined. This choice is obviously technology dependent. An example illustrating one possible mapping of the NASREM functional architecture to a particular set of computers which implements it is shown. The result of choosing the NASREM functional architecture is that it provides a technology independent paradigm which can be mapped into a technology dependent implementation capable of evolving with technology in the laboratory and in space.
Architectural Symbols of a City - Case Study
NASA Astrophysics Data System (ADS)
Poplatek, Jacek
2017-10-01
The identity of a city is understood as a collection of individual features, which give the city its individual character and distinguish it from other places; it undoubtedly constitutes a cultural value, which should be cherished. A city is made special thanks to its geographical location, landscape values, urban layout and architecture. In the case of Sopot - a spa located on the Bay of Gdansk, the mosaic of the above-mentioned features has created a unique image of a seaside resort. Sopot architecture is distinguished by a complex of buildings dating back to the turn of the 20th century, which is the largest one in the country. The architecture of the city is dominated by eclectic influences, mainly Neo-gothic and Art-Nouveau, as well as early modernism; it is also possible to find examples of holiday architecture, with characteristic wooden verandas. The identity of a city and its image is not always permanent and unchanging in time. In the case of Sopot, only 5% of the existing buildings were damaged during the Second World War. However, the most important ones, characteristic for the city and located in its representative part, were destroyed. The war was followed by a period of economic stagnation and isolation from the free world, which lasted for almost 45 years. At that time there were no comprehensive revitalisation projects for this prestigious area of the city. The buildings constructed in the 1960s did not create an architecturally and spatially coherent urban tissue. The situation changed in 1989, when Poland regained its sovereignty. Since that time numerous investment projects have been carried out in Sopot, including the prestigious ones, located in the representative part of the city. This paper has been devoted to Sopot architecture - both historic and modern, the dominating architectural trends and the issues connected with the coexistence of “the old and the new”.
The buildings characteristic for the city, historic and modern ones, which constituted (or constitute at present) important landmarks in the urban area, and which were (or still are) the city symbols, have been analysed. Unfortunately, some of the buildings constructed over the last 25 years in the representative part of the city are not consistent with its unique character. The decisions made by investors, architects, city authorities and the monument preservation office may have serious negative effects; they may cause degradation of urban space and, as a result, harm its image. In the summary of this paper possible dangers connected with realising investments in the most important city locations, the ones with historic context, have been indicated, and recommendations aimed at elimination of such dangers have been presented. The priority - particularly in cities with an established, unique image - should be to ensure that architectural and cultural heritage is preserved, while new architecture should speak with modern language and introduce new values to its historic surroundings.
Multi-GPU Jacobian accelerated computing for soft-field tomography.
Borsic, A; Attardo, E A; Halter, R J
2012-10-01
Image reconstruction in soft-field tomography is based on an inverse problem formulation, where a forward model is fitted to the data. In medical applications, where the anatomy presents complex shapes, it is common to use finite element models (FEMs) to represent the volume of interest and solve a partial differential equation that models the physics of the system. Over the last decade, there has been a shifting interest from 2D modeling to 3D modeling, as the underlying physics of most problems are 3D. Although the increased computational power of modern computers allows working with much larger FEM models, the computational time required to reconstruct 3D images on a fine 3D FEM model can be significant, on the order of hours. For example, in electrical impedance tomography (EIT) applications using a dense 3D FEM mesh with half a million elements, a single reconstruction iteration takes approximately 15-20 min with optimized routines running on a modern multi-core PC. It is desirable to accelerate image reconstruction to enable researchers to more easily and rapidly explore data and reconstruction parameters. Furthermore, providing high-speed reconstructions is essential for some promising clinical applications of EIT. For 3D problems, 70% of the computing time is spent building the Jacobian matrix, and 25% of the time in forward solving. In this work, we focus on accelerating the Jacobian computation by using single and multiple GPUs. First, we discuss an optimized implementation on a modern multi-core PC architecture and show how computing time is bounded by the CPU-to-memory bandwidth; this factor limits the rate at which data can be fetched by the CPU. Gains associated with the use of multiple CPU cores are minimal, since data operands cannot be fetched fast enough to saturate the processing power of even a single CPU core. GPUs have much faster memory bandwidths compared to CPUs and better parallelism.
We are able to obtain acceleration factors of 20 times on a single NVIDIA S1070 GPU, and of 50 times on four GPUs, bringing the Jacobian computing time for a fine 3D mesh from 12 min to 14 s. We regard this as an important step toward gaining interactive reconstruction times in 3D imaging, particularly when coupled in the future with acceleration of the forward problem. While we demonstrate results for EIT, these results apply to any soft-field imaging modality where the Jacobian matrix is computed with the adjoint method.
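The figures quoted in this abstract can be checked with a short calculation. The first line recomputes the Jacobian acceleration from the stated times (12 min to 14 s). The second applies Amdahl's law, which is an analysis added here and not something the abstract performs, to estimate the effect on a whole reconstruction iteration when only the 70% Jacobian share is accelerated 50 times. The function name is hypothetical.

```python
def amdahl_speedup(accelerated_fraction, factor):
    """Overall speedup when only part of the run time is accelerated."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / factor)

# Jacobian time drops from 12 minutes to 14 seconds (values from the abstract).
jacobian_speedup = (12 * 60) / 14

# Whole-iteration estimate: 70% of the time is Jacobian work, accelerated 50x.
iteration_speedup = amdahl_speedup(accelerated_fraction=0.70, factor=50)
```

The whole-iteration estimate comes out at roughly 3x, far below the Jacobian-only factor, which illustrates why the abstract points to acceleration of the forward problem as the next step.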
Multi-GPU Jacobian Accelerated Computing for Soft Field Tomography
Borsic, A.; Attardo, E. A.; Halter, R. J.
2012-01-01
Image reconstruction in soft-field tomography is based on an inverse problem formulation, where a forward model is fitted to the data. In medical applications, where the anatomy presents complex shapes, it is common to use Finite Element Models to represent the volume of interest and to solve a partial differential equation that models the physics of the system. Over the last decade, there has been a shifting interest from 2D modeling to 3D modeling, as the underlying physics of most problems are three-dimensional. Though the increased computational power of modern computers allows working with much larger FEM models, the computational time required to reconstruct 3D images on a fine 3D FEM model can be significant, on the order of hours. For example, in Electrical Impedance Tomography applications using a dense 3D FEM mesh with half a million elements, a single reconstruction iteration takes approximately 15 to 20 minutes with optimized routines running on a modern multi-core PC. It is desirable to accelerate image reconstruction to enable researchers to more easily and rapidly explore data and reconstruction parameters. Further, providing high-speed reconstructions is essential for some promising clinical applications of EIT. For 3D problems, 70% of the computing time is spent building the Jacobian matrix, and 25% of the time in forward solving. In the present work, we focus on accelerating the Jacobian computation by using single and multiple GPUs. First, we discuss an optimized implementation on a modern multi-core PC architecture and show how computing time is bounded by the CPU-to-memory bandwidth; this factor limits the rate at which data can be fetched by the CPU. Gains associated with use of multiple CPU cores are minimal, since data operands cannot be fetched fast enough to saturate the processing power of even a single CPU core. GPUs have much faster memory bandwidths compared to CPUs and better parallelism.
We are able to obtain acceleration factors of 20 times on a single NVIDIA S1070 GPU, and of 50 times on 4 GPUs, bringing the Jacobian computing time for a fine 3D mesh from 12 minutes to 14 seconds. We regard this as an important step towards gaining interactive reconstruction times in 3D imaging, particularly when coupled in the future with acceleration of the forward problem. While we demonstrate results for Electrical Impedance Tomography, these results apply to any soft-field imaging modality where the Jacobian matrix is computed with the Adjoint Method. PMID:23010857
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Donaldson, Thomas; Doty, Karl; Engle, Steven W.; Larson, Robert E.; O'Reilly, John G.
1988-01-01
The results of a Phase I Small Business Innovative Research (SBIR) project performed for the NASA Langley Computational Structural Mechanics Group are described. The project resulted in the identification of a family of chordal-ring interconnection architectures with excellent potential to serve as the basis for new multimicroprocessor (MMP) computers. The paper presents examples of how computational algorithms from structural mechanics can be efficiently implemented on the chordal-ring architecture.
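The appeal of adding chords to a ring is that they shorten the longest path between any two processors. The sketch below builds a simple symmetric variant in which every node links to its ring neighbor and to the node a fixed chord length away, then measures the network diameter with breadth-first search. The abstract does not define the chordal-ring family it studied, so this particular construction, the node count, and the chord length are assumptions.

```python
from collections import deque

def chordal_ring(n_nodes, chords=()):
    """Adjacency sets for a ring of n_nodes with optional chord lengths."""
    adjacency = {i: set() for i in range(n_nodes)}
    for i in range(n_nodes):
        for step in (1, *chords):      # step 1 is the plain ring link
            j = (i + step) % n_nodes
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency

def diameter(adjacency):
    """Longest shortest-path hop count between any pair of nodes."""
    worst = 0
    for source in adjacency:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        worst = max(worst, max(hops.values()))
    return worst

plain_ring = diameter(chordal_ring(16))
with_chords = diameter(chordal_ring(16, chords=(4,)))
```

Comparing `plain_ring` with `with_chords` shows how much a single chord length reduces worst-case message hops at the cost of two extra links per node.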
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1992-01-01
The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
"Not a shack in the woods": architecture for tuberculosis in Muskoka and Toronto.
Adams, Annmarie; Burke, Stacie
2006-01-01
This paper explores architecture as a primary source in the history of tuberculosis. In comparing five Ontario sanatoria built between 1897 and 1923, we identify a range of types and a growing resemblance of ex-urban TB sanatoria to urban hospitals. Existing literature on Canadian TB hospital architecture suggests the endurance of picturesque architecture, but the cottage plan was only one of the types deemed appropriate for consumptives in the early 20th century, even in Muskoka. Furthermore we argue that urban and ex-urban TB ideologies actually coalesce about 1923, best illustrated in the boldly modern architecture of Muskoka's new Gage pavilion.
Morphology of muscle attachment sites in the modern human hand does not reflect muscle architecture.
Williams-Hatala, E M; Hatala, K G; Hiles, S; Rabey, K N
2016-06-23
Muscle attachment sites (entheses) on dry bones are regularly used by paleontologists to infer soft tissue anatomy and to reconstruct behaviors of extinct organisms. This method is commonly applied to fossil hominin hand bones to assess their abilities to participate in Paleolithic stone tool behaviors. Little is known, however, about how or even whether muscle anatomy and activity regimes influence the morphologies of their entheses, especially in the hand. Using the opponens muscles from a sample of modern humans, we tested the hypothesis that aspects of hand muscle architecture that are known to be influenced by behavior correlate with the size and shape of their associated entheses. Results show no consistent relationships between these behaviorally-influenced aspects of muscle architecture and entheseal morphology. Consequently, it is likely premature to infer patterns of behavior, such as stone tool making in fossil hominins, from these same entheses.
Morphology of muscle attachment sites in the modern human hand does not reflect muscle architecture
Williams-Hatala, E. M.; Hatala, K. G.; Hiles, S.; Rabey, K. N.
2016-01-01
Muscle attachment sites (entheses) on dry bones are regularly used by paleontologists to infer soft tissue anatomy and to reconstruct behaviors of extinct organisms. This method is commonly applied to fossil hominin hand bones to assess their abilities to participate in Paleolithic stone tool behaviors. Little is known, however, about how or even whether muscle anatomy and activity regimes influence the morphologies of their entheses, especially in the hand. Using the opponens muscles from a sample of modern humans, we tested the hypothesis that aspects of hand muscle architecture that are known to be influenced by behavior correlate with the size and shape of their associated entheses. Results show no consistent relationships between these behaviorally-influenced aspects of muscle architecture and entheseal morphology. Consequently, it is likely premature to infer patterns of behavior, such as stone tool making in fossil hominins, from these same entheses. PMID:27334440
An end-to-end communications architecture for condition-based maintenance applications
NASA Astrophysics Data System (ADS)
Kroculick, Joseph
2014-06-01
This paper explores challenges in implementing an end-to-end communications architecture for Condition-Based Maintenance Plus (CBM+) data transmission which aligns with the Army's Network Modernization Strategy. The Army's Network Modernization strategy is based on rolling out network capabilities which connect the smallest unit and Soldier level to enterprise systems. CBM+ is a continuous improvement initiative over the life cycle of a weapon system or equipment to improve the reliability and maintenance effectiveness of Department of Defense (DoD) systems. CBM+ depends on the collection, processing and transport of large volumes of data. An important capability that enables CBM+ is an end-to-end network architecture that enables data to be uploaded from the platform at the tactical level to enterprise data analysis tools. To connect end-to-end maintenance processes in the Army's supply chain, a CBM+ network capability can be developed from available network capabilities.
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.; Lytle, John K. (Technical Monitor)
2002-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS Architecture including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms which has been the focus of NASA Ames. Additional features of the object oriented architecture that support MultiDisciplinary (MD) Coupling, computer aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.
Distributed computing environments for future space control systems
NASA Technical Reports Server (NTRS)
Viallefont, Pierre
1993-01-01
The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.
Electro-Optic Computing Architectures. Volume I
1998-02-01
The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro-Optic Interface (EOI), an Optical Interconnection Unit (OW
A Platform Architecture for Sensor Data Processing and Verification in Buildings
ERIC Educational Resources Information Center
Ortiz, Jorge Jose
2013-01-01
This thesis examines the state of the art of building information systems and evaluates their architecture in the context of emerging technologies and applications for deep analysis of the built environment. We observe that modern building information systems are difficult to extend, do not provide general services for application development, do…
ERIC Educational Resources Information Center
Demski, Jennifer
2012-01-01
When one thinks of 21st century schools, one thinks of geometric modern architecture, sustainable building materials, and high-tech modular classrooms. It's rare, though, that a district has the space or the money to build that school from the ground up. Instead, the challenge for most is the transformation of the 20th century architecture to…
Midcentury Modern High Schools: Rebooting the Architecture
ERIC Educational Resources Information Center
Havens, Kevin
2010-01-01
A high school is more than a building; it's a repository of memories for many community members. High schools built at the turn of the century are not only cultural and civic landmarks, they are also often architectural treasures. When these facilities become outdated, a renovation that preserves the building's aesthetics and character is usually…
Modular multiple sensors information management for computer-integrated surgery.
Vaccarella, Alberto; Enquobahrie, Andinet; Ferrigno, Giancarlo; Momi, Elena De
2012-09-01
In the past 20 years, technological advancements have modified the concept of modern operating rooms (ORs) with the introduction of computer-integrated surgery (CIS) systems, which promise to enhance the outcomes, safety and standardization of surgical procedures. With CIS, different types of sensors (mainly position-sensing devices, force sensors and intra-operative imaging devices) are widely used. Recently, the need for a combined use of different sensors has raised issues related to synchronization and spatial consistency of data from different sources of information. In this study, we propose a centralized, multi-sensor management software architecture for a distributed CIS system, which addresses sensor information consistency in both space and time. The software was developed as a data server module in a client-server architecture, using two open-source software libraries: Image-Guided Surgery Toolkit (IGSTK) and OpenCV. The ROBOCAST project (FP7 ICT 215190), which aims at integrating robotic and navigation devices and technologies in order to improve the outcome of the surgical intervention, was used as the benchmark. An experimental protocol was designed in order to prove the feasibility of a centralized module for data acquisition and to test the application latency when dealing with optical and electromagnetic tracking systems and ultrasound (US) imaging devices. Our results show that a centralized approach is suitable for minimizing synchronization errors; latency in the client-server communication was estimated to be 2 ms (median value) for tracking systems and 40 ms (median value) for US images. The proposed centralized approach proved to be adequate for neurosurgery requirements. Latency introduced by the proposed architecture does not affect tracking system performance in terms of frame rate and limits the US image frame rate to 25 fps, which is acceptable for providing visual feedback to the surgeon in the OR.
The origin of modern metabolic networks inferred from phylogenomic analysis of protein architecture.
Caetano-Anollés, Gustavo; Kim, Hee Shin; Mittenthal, Jay E
2007-05-29
Metabolism represents a complex collection of enzymatic reactions and transport processes that convert metabolites into molecules capable of supporting cellular life. Here we explore the origins and evolution of modern metabolism. Using phylogenomic information linked to the structure of metabolic enzymes, we sort out recruitment processes and discover that most enzymatic activities were associated with the nine most ancient and widely distributed protein fold architectures. An analysis of newly discovered functions showed enzymatic diversification occurred early, during the onset of the modern protein world. Most importantly, phylogenetic reconstruction exercises and other evidence suggest strongly that metabolism originated in enzymes with the P-loop hydrolase fold in nucleotide metabolism, probably in pathways linked to the purine metabolic subnetwork. Consequently, the first enzymatic takeover of an ancient biochemistry or prebiotic chemistry was related to the synthesis of nucleotides for the RNA world.
Image-Processing Software For A Hypercube Computer
NASA Technical Reports Server (NTRS)
Lee, Meemong; Mazer, Alan S.; Groom, Steven L.; Williams, Winifred I.
1992-01-01
Concurrent Image Processing Executive (CIPE) is software system intended to develop and use image-processing application programs on concurrent computing environment. Designed to shield programmer from complexities of concurrent-system architecture, it provides interactive image-processing environment for end user. CIPE utilizes architectural characteristics of particular concurrent system to maximize efficiency while preserving architectural independence from user and programmer. CIPE runs on Mark-IIIfp 8-node hypercube computer and associated SUN-4 host computer.
Experimental Comparison of Two Quantum Computing Architectures
2017-03-28
Experimental comparison of two quantum computing architectures. Norbert M. Linke, Dmitri...the vast computing power a universal quantum computer could offer, several candidate systems are being explored. They have allowed experimental...existing systems and the role of architecture in quantum computer design. These will be crucial for the realization of more advanced future incarna
The TIM Barrel Architecture Facilitated the Early Evolution of Protein-Mediated Metabolism.
Goldman, Aaron David; Beatty, Joshua T; Landweber, Laura F
2016-01-01
The triosephosphate isomerase (TIM) barrel protein fold is a structurally repetitive architecture that is present in approximately 10% of all enzymes. It is generally assumed that this ubiquity in modern proteomes reflects an essential historical role in early protein-mediated metabolism. Here, we provide quantitative and comparative analyses to support several hypotheses about the early importance of the TIM barrel architecture. An information theoretical analysis of protein structures supports the hypothesis that the TIM barrel architecture could arise more easily by duplication and recombination compared to other mixed α/β structures. We show that TIM barrel enzymes corresponding to the most taxonomically broad superfamilies also have the broadest range of functions, often aided by metal and nucleotide-derived cofactors that are thought to reflect an earlier stage of metabolic evolution. By comparison to other putatively ancient protein architectures, we find that the functional diversity of TIM barrel proteins cannot be explained simply by their antiquity. Instead, the breadth of TIM barrel functions can be explained, in part, by the incorporation of a broad range of cofactors, a trend that does not appear to be shared by proteins in general. These results support the hypothesis that the simple and functionally general TIM barrel architecture may have arisen early in the evolution of protein biosynthesis and provided an ideal scaffold to facilitate the metabolic transition from ribozymes, peptides, and geochemical catalysts to modern protein enzymes.
SPOT4 Operational Control Center (CMP)
NASA Technical Reports Server (NTRS)
Zaouche, G.
1993-01-01
CNES(F) is responsible for the development of a new generation of Operational Control Center (CMP), which will operate the new heliosynchronous remote sensing satellite (SPOT4). This Operational Control Center benefits greatly from the experience of the first generation of control centers and from recent advances in computer technology and standards. The CMP is designed to operate two satellites at the same time with a reduced pool of controllers. The architecture of this CMP is simple, robust, and flexible, since it is based on powerful distributed workstations interconnected through an Ethernet LAN. The application software uses modern and formal software engineering methods in order to improve quality and reliability and to facilitate maintenance. This software is table driven, so it can be easily adapted to other operational needs. Operation tasks are automated to the maximum extent, so that the CMP could be operated automatically with very limited human intervention for supervision and decision making. This paper provides an overview of the SPOT4 mission and associated ground segment. It also details the CMP, its functions, and its software and hardware architecture.
Infrastructure and the Virtual Observatory
NASA Astrophysics Data System (ADS)
Dowler, P.; Gaudet, S.; Schade, D.
2011-07-01
The modern data center is faced with architectural and software engineering challenges that grow along with the challenges facing observatories: massive data flow, distributed computing environments, and distributed teams collaborating on large and small projects. By using VO standards as key components of the infrastructure, projects can take advantage of a decade of intellectual investment by the IVOA community. By their nature, these standards are proven and tested designs that already exist. Adopting VO standards saves considerable design effort, allows projects to take advantage of open-source software and test suites to speed development, and enables the use of third party tools that understand the VO protocols. The evolving CADC architecture now makes heavy use of VO standards. We show examples of how these standards may be used directly, coupled with non-VO standards, or extended with custom capabilities to solve real problems and provide value to our users. In the end, we use VO services as major parts of the core infrastructure to reduce cost rather than as an extra layer with additional cost and we can deliver more general purpose and robust services to our user community.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient.
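What "memory-bandwidth bound" means in practice can be illustrated with a roofline-style estimate. This model is our illustration and is not taken from the report; the peak rate, the two bandwidths and the kernel's arithmetic intensity are invented numbers, and `attainable_gflops` is a hypothetical helper.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style bound: a kernel runs at the compute peak or at
    bandwidth * arithmetic intensity, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

# Hypothetical processor and kernel (all numbers are assumptions).
peak = 1000.0      # GFLOP/s the cores could sustain
intensity = 0.25   # floating-point operations per byte moved

for label, bandwidth in [("DRAM", 100.0), ("near-memory", 400.0)]:
    print(label, attainable_gflops(peak, bandwidth, intensity))
```

With these made-up figures the kernel reaches only a small fraction of peak from DRAM, and its rate scales directly with bandwidth once the data lives in the faster memory level, which is the motivation the abstract gives for near-memory.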
PCM-Based Durable Write Cache for Fast Disk I/O
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Zhuo; Wang, Bin; Carpenter, Patrick
2012-01-01
Flash based solid-state devices (FSSDs) have been adopted within the memory hierarchy to improve the performance of hard disk drive (HDD) based storage systems. However, with the fast development of storage-class memories, new storage technologies with better performance and higher write endurance than FSSDs are emerging, e.g., phase-change memory (PCM). Understanding how to leverage these state-of-the-art storage technologies for modern computing systems is important to solve challenging data intensive computing problems. In this paper, we propose to leverage PCM for a hybrid PCM-HDD storage architecture. We identify the limitations of traditional LRU caching algorithms for PCM-based caches, and develop a novel hash-based write caching scheme called HALO to improve random write performance of hard disks. To address the limited durability of PCM devices and solve the degraded spatial locality in traditional wear-leveling techniques, we further propose novel PCM management algorithms that provide effective wear-leveling while maximizing access parallelism. We have evaluated this PCM-based hybrid storage architecture using applications with a diverse set of I/O access patterns. Our experimental results demonstrate that the HALO caching scheme leads to an average reduction of 36.8% in execution time compared to the LRU caching scheme, and that the SFC wear leveling extends the lifetime of PCM by a factor of 21.6.
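For readers unfamiliar with the baseline, the sketch below shows a minimal LRU write cache of the kind the abstract compares against. It does not reproduce HALO, whose details the abstract does not give; the class name `LRUWriteCache`, the block addresses and the capacity are all hypothetical.

```python
from collections import OrderedDict

class LRUWriteCache:
    """Minimal LRU write-back cache: the least recently written block is
    flushed to disk when the cache overflows."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> data, oldest first
        self.flushed = []            # addresses written to disk, in flush order

    def write(self, addr, data):
        if addr in self.blocks:
            self.blocks.move_to_end(addr)   # rewritten block becomes most recent
        self.blocks[addr] = data
        if len(self.blocks) > self.capacity:
            victim, _ = self.blocks.popitem(last=False)  # evict the oldest block
            self.flushed.append(victim)

cache = LRUWriteCache(capacity=2)
for addr in [900, 10, 500, 20, 10, 700]:
    cache.write(addr, b"x")
print(cache.flushed)
```

In this toy run the blocks reach the disk in recency order (900, 10, 500, 20) rather than in address order, so the disk still sees scattered writes. That is one plausible reading of the LRU limitation the authors refer to, offered here as an interpretation rather than as their stated analysis.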
A Computationally Efficient Visual Saliency Algorithm Suitable for an Analog CMOS Implementation.
D'Angelo, Robert; Wood, Richard; Lowry, Nathan; Freifeld, Geremy; Huang, Haiyao; Salthouse, Christopher D; Hollosi, Brent; Muresan, Matthew; Uy, Wes; Tran, Nhut; Chery, Armand; Poppe, Dorothy C; Sonkusale, Sameer
2018-06-27
Computer vision algorithms are often limited in their application by the large amount of data that must be processed. Mammalian vision systems mitigate this high bandwidth requirement by prioritizing certain regions of the visual field with neural circuits that select the most salient regions. This work introduces a novel and computationally efficient visual saliency algorithm for performing this neuromorphic attention-based data reduction. The proposed algorithm has the added advantage that it is compatible with an analog CMOS design while still achieving comparable performance to existing state-of-the-art saliency algorithms. This compatibility allows for direct integration with the analog-to-digital conversion circuitry present in CMOS image sensors. This integration leads to power savings in the converter by quantizing only the salient pixels. Further system-level power savings are gained by reducing the amount of data that must be transmitted and processed in the digital domain. The analog CMOS compatible formulation relies on a pulse width (i.e., time mode) encoding of the pixel data that is compatible with pulse-mode imagers and slope based converters often used in imager designs. This letter begins by discussing this time-mode encoding for implementing neuromorphic architectures. Next, the proposed algorithm is derived. Hardware-oriented optimizations and modifications to this algorithm are proposed and discussed. Next, a metric for quantifying saliency accuracy is proposed, and simulation results of this metric are presented. Finally, an analog synthesis approach for a time-mode architecture is outlined, and postsynthesis transistor-level simulations that demonstrate functionality of an implementation in a modern CMOS process are discussed.
Processing-in-Memory Enabled Graphics Processors for 3D Rendering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Chenhao; Song, Shuaiwen; Wang, Jing
2017-02-06
The performance of 3D rendering on a Graphics Processing Unit (GPU), which converts a 3D vector stream into 2D frames with 3D image effects, significantly impacts users’ gaming experience on modern computer systems. Due to the high texture throughput in 3D rendering, main memory bandwidth becomes a critical obstacle for improving the overall rendering performance. 3D stacked memory systems such as Hybrid Memory Cube (HMC) provide opportunities to significantly overcome the memory wall by directly connecting logic controllers to DRAM dies. Based on the observation that texel fetches significantly impact off-chip memory traffic, we propose two architectural designs to enable Processing-In-Memory based GPU for efficient 3D rendering.
Sensor Systems Based on FPGAs and Their Applications: A Survey
de la Piedra, Antonio; Braeken, An; Touhafi, Abdellah
2012-01-01
In this manuscript, we present a survey of designs and implementations of research sensor nodes that rely on FPGAs, either based upon standalone platforms or as a combination of microcontroller and FPGA. Several current challenges in sensor networks are distinguished and linked to the features of modern FPGAs. As it turns out, low-power optimized FPGAs are able to enhance the computation of several types of algorithms in terms of speed and power consumption in comparison to microcontrollers of commercial sensor nodes. We show that architectures based on the combination of microcontrollers and FPGA can play a key role in the future of sensor networks, in fields where processing capabilities such as strong cryptography, self-testing and data compression, among others, are paramount.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
NASA Astrophysics Data System (ADS)
Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.
2017-02-01
We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Bonsai: an event-based framework for processing and controlling data streams
Lopes, Gonçalo; Bonacchi, Niccolò; Frazão, João; Neto, Joana P.; Atallah, Bassam V.; Soares, Sofia; Moreira, Luís; Matias, Sara; Itskov, Pavel M.; Correia, Patrícia A.; Medina, Roberto E.; Calcaterra, Lorenza; Dreosti, Elena; Paton, Joseph J.; Kampff, Adam R.
2015-01-01
The design of modern scientific experiments requires the control and monitoring of many different data streams. However, the serial execution of programming instructions in a computer makes it a challenge to develop software that can deal with the asynchronous, parallel nature of scientific data. Here we present Bonsai, a modular, high-performance, open-source visual programming framework for the acquisition and online processing of data streams. We describe Bonsai's core principles and architecture and demonstrate how it allows for the rapid and flexible prototyping of integrated experimental designs in neuroscience. We specifically highlight some applications that require the combination of many different hardware and software components, including video tracking of behavior, electrophysiology and closed-loop control of stimulation. PMID:25904861
PyNEST: A Convenient Interface to the NEST Simulator.
Eppler, Jochen Martin; Helias, Moritz; Muller, Eilif; Diesmann, Markus; Gewaltig, Marc-Oliver
2008-01-01
The neural simulation tool NEST (http://www.nest-initiative.org) is a simulator for heterogeneous networks of point neurons or neurons with a small number of compartments. It aims at simulations of large neural systems with more than 10^4 neurons and 10^7 to 10^9 synapses. NEST is implemented in C++ and can be used on a large range of architectures from single-core laptops over multi-core desktop computers to super-computers with thousands of processor cores. Python (http://www.python.org) is a modern programming language that has recently received considerable attention in Computational Neuroscience. Python is easy to learn and has many extension modules for scientific computing (e.g. http://www.scipy.org). In this contribution we describe PyNEST, the new user interface to NEST. PyNEST combines NEST's efficient simulation kernel with the simplicity and flexibility of Python. Compared to NEST's native simulation language SLI, PyNEST makes it easier to set up simulations, generate stimuli, and analyze simulation results. We describe how PyNEST connects NEST and Python and how it is implemented. With a number of examples, we illustrate how it is used.
PyNEST: A Convenient Interface to the NEST Simulator
Eppler, Jochen Martin; Helias, Moritz; Muller, Eilif; Diesmann, Markus; Gewaltig, Marc-Oliver
2008-01-01
The neural simulation tool NEST (http://www.nest-initiative.org) is a simulator for heterogeneous networks of point neurons or neurons with a small number of compartments. It aims at simulations of large neural systems with more than 10^4 neurons and 10^7 to 10^9 synapses. NEST is implemented in C++ and can be used on a large range of architectures from single-core laptops over multi-core desktop computers to super-computers with thousands of processor cores. Python (http://www.python.org) is a modern programming language that has recently received considerable attention in Computational Neuroscience. Python is easy to learn and has many extension modules for scientific computing (e.g. http://www.scipy.org). In this contribution we describe PyNEST, the new user interface to NEST. PyNEST combines NEST's efficient simulation kernel with the simplicity and flexibility of Python. Compared to NEST's native simulation language SLI, PyNEST makes it easier to set up simulations, generate stimuli, and analyze simulation results. We describe how PyNEST connects NEST and Python and how it is implemented. With a number of examples, we illustrate how it is used. PMID:19198667
A new paradigm for atomically detailed simulations of kinetics in biophysical systems.
Elber, Ron
2017-01-01
The kinetics of biochemical and biophysical events determine the course of life processes and have attracted considerable interest and research. For example, modeling of biological networks and cellular responses relies on the availability of information on rate coefficients. Atomically detailed simulations hold the promise of supplementing experimental data to obtain a more complete kinetic picture. However, simulations at biological time scales are challenging. Typical computer resources are insufficient to provide the ensemble of trajectories at the correct length that is required for straightforward calculations of time scales. In recent years, new technologies have emerged that make atomically detailed simulations of rate coefficients possible. Instead of computing complete trajectories from reactants to products, these approaches launch a large number of short trajectories at different positions. Since the trajectories are short, they can be computed trivially in parallel on modern computer architectures. The starting and termination positions of the short trajectories are chosen, following statistical mechanics theory, to enhance efficiency. Analysis of these trajectories produces accurate estimates of time scales as long as hours. The theory of Milestoning, which exploits the use of short trajectories, is discussed, and several applications are described.
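How short trajectories can yield long time scales is easier to see with a toy calculation. The sketch below assumes the short trajectories have already been summarized as transition probabilities between milestones and mean lifetimes at each milestone; it then solves the first-passage relation tau_i = t_i + sum_j K_ij tau_j by simple iteration. The three-milestone system, its numbers and the function name are invented for illustration and are not taken from the paper.

```python
def mean_first_passage_times(K, lifetimes, product, tol=1e-12, max_iter=100000):
    """Solve tau_i = t_i + sum_j K[i][j] * tau_j with tau[product] = 0
    by repeated sweeps (fixed-point iteration)."""
    tau = {m: 0.0 for m in lifetimes}
    tau[product] = 0.0
    for _ in range(max_iter):
        change = 0.0
        for m in lifetimes:
            new = lifetimes[m] + sum(p * tau[n] for n, p in K[m].items())
            change = max(change, abs(new - tau[m]))
            tau[m] = new
        if change < tol:
            break
    return tau

# Hypothetical statistics from short trajectories (arbitrary time units):
# K[i][j] is the probability that a trajectory started at milestone i hits j next.
K = {0: {1: 1.0}, 1: {0: 0.5, 2: 0.5}}
lifetimes = {0: 1.0, 1: 2.0}   # mean time spent before hitting the next milestone

tau = mean_first_passage_times(K, lifetimes, product=2)
print(round(tau[0], 6))   # reactant-to-product time from local statistics only
```

Each entry of `K` and `lifetimes` can be estimated from an independent batch of short runs, which is why the approach parallelizes so easily; the overall time (6 units here) emerges only when the local pieces are combined.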
The Educational Challenge of Unraveling the Fantasies of Ontological Security
ERIC Educational Resources Information Center
Stein, Sharon; Hunt, Dallas; Suša, Rene; de Oliveira Andreotti, Vanessa
2017-01-01
In this article we address the current context of intensified racialized state securitization by tracing its roots to the naturalized colonial architectures of everyday modern life--which we present through the metaphor of "the house modernity built." While contemporary crises are often perceived to derive from external threats to the…
A Fruitful Exchange/Conflict: Engineers and Mathematicians in Early Modern Italy
ERIC Educational Resources Information Center
Maffioli, Cesare S.
2013-01-01
Exchanges of learning and controversies between engineers and mathematicians were important factors in the development of early modern science. This theme is discussed by focusing, first, on architectural and mathematical dynamism in mid 16th-century Milan. While some engineers-architects referred to Euclid and Vitruvius for improving their…
Digital optical computers at the optoelectronic computing systems center
NASA Technical Reports Server (NTRS)
Jordan, Harry F.
1991-01-01
The Digital Optical Computing Program within the National Science Foundation Engineering Research Center for Opto-electronic Computing Systems has as its specific goal research on optical computing architectures suitable for use at the highest possible speeds. The program can be targeted toward exploiting the time domain because other programs in the Center are pursuing research on parallel optical systems, exploiting optical interconnection and optical devices and materials. Using a general purpose computing architecture as the focus, we are developing design techniques, tools and architecture for operation at the speed of light limit. Experimental work is being done with the somewhat low speed components currently available but with architectures which will scale up in speed as faster devices are developed. The design algorithms and tools developed for a general purpose, stored program computer are being applied to other systems such as optimally controlled optical communication networks.
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires k^2 − 1 comparisons per sample for a direct implementation; thus, the cost scales poorly with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses fewer computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
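The kernel decomposition mentioned in this abstract can be illustrated in software. The following is only a minimal Python sketch of the one-dimensional van Herk/Gil-Werman idea (per-block prefix and suffix maxima) and of a separable row-then-column pass for a k × k kernel; it is not the paper's hardware design, and the function names running_max and running_max_2d are hypothetical.

```python
def running_max(values, k):
    """Maximum of every length-k window of `values` (HGW-style sketch)."""
    n = len(values)
    if k <= 0 or k > n:
        raise ValueError("window length k must satisfy 1 <= k <= len(values)")
    # Pad with -inf so the data splits into whole blocks of length k.
    x = list(values) + [float("-inf")] * ((-n) % k)
    m = len(x)
    prefix = [0.0] * m  # max from the start of each block up to index i
    suffix = [0.0] * m  # max from index i up to the end of its block
    for i in range(m):
        prefix[i] = x[i] if i % k == 0 else max(prefix[i - 1], x[i])
    for i in range(m - 1, -1, -1):
        suffix[i] = x[i] if i % k == k - 1 else max(suffix[i + 1], x[i])
    # A window [i, i+k-1] touches at most two blocks: the tail of the first
    # is covered by suffix[i], the head of the second by prefix[i+k-1].
    return [max(suffix[i], prefix[i + k - 1]) for i in range(n - k + 1)]


def running_max_2d(image, k):
    """k x k running max via a row pass followed by a column pass."""
    rows = [running_max(row, k) for row in image]
    cols = [running_max(list(col), k) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


print(running_max([1, 3, 2, 5, 4, 1, 0], 3))  # [3, 5, 5, 5, 4]
```

Each output sample costs a small, fixed number of comparisons regardless of k, which is the property that makes the one-dimensional passes attractive compared with the direct k^2 − 1 comparisons per sample. A running min follows by swapping max for min and padding with +inf.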
Electro-Optic Computing Architectures: Volume II. Components and System Design and Analysis
1998-02-01
The objective of the Electro-Optic Computing Architecture (EOCA) program was to develop multi-function electro-optic interfaces and optical...interconnect units to enhance the performance of parallel processor systems and form the building blocks for future electro-optic computing architectures...Specifically, three multi-function interface modules were targeted for development - an Electro-Optic Interface (EOI), an Optical Interconnection Unit
Single-chip microprocessor that communicates directly using light
NASA Astrophysics Data System (ADS)
Sun, Chen; Wade, Mark T.; Lee, Yunsup; Orcutt, Jason S.; Alloatti, Luca; Georgas, Michael S.; Waterman, Andrew S.; Shainline, Jeffrey M.; Avizienis, Rimas R.; Lin, Sen; Moss, Benjamin R.; Kumar, Rajesh; Pavanello, Fabio; Atabaki, Amir H.; Cook, Henry M.; Ou, Albert J.; Leu, Jonathan C.; Chen, Yu-Hsin; Asanović, Krste; Ram, Rajeev J.; Popović, Miloš A.; Stojanović, Vladimir M.
2015-12-01
Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems--from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic-photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic-photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic-photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a ‘zero-change’ approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic-photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.
Single-chip microprocessor that communicates directly using light.
Sun, Chen; Wade, Mark T; Lee, Yunsup; Orcutt, Jason S; Alloatti, Luca; Georgas, Michael S; Waterman, Andrew S; Shainline, Jeffrey M; Avizienis, Rimas R; Lin, Sen; Moss, Benjamin R; Kumar, Rajesh; Pavanello, Fabio; Atabaki, Amir H; Cook, Henry M; Ou, Albert J; Leu, Jonathan C; Chen, Yu-Hsin; Asanović, Krste; Ram, Rajeev J; Popović, Miloš A; Stojanović, Vladimir M
2015-12-24
Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems--from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic-photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic-photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic-photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a 'zero-change' approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic-photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.
Recommended Practice: Creating Cyber Forensics Plans for Control Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eric Cornelius; Mark Fabro
Cyber forensics has been in the popular mainstream for some time, and has matured into an information-technology capability that is very common among modern information security programs. The goal of cyber forensics is to support the elements of troubleshooting, monitoring, recovery, and the protection of sensitive data. Moreover, in the event of a crime being committed, cyber forensics is also the approach to collecting, analyzing, and archiving data as evidence in a court of law. Although scalable to many information technology domains, especially modern corporate architectures, cyber forensics can be challenging when being applied to non-traditional environments, which are not comprised of current information technologies or are designed with technologies that do not provide adequate data storage or audit capabilities. In addition, further complexity is introduced if the environments are designed using proprietary solutions and protocols, thus limiting the ease of which modern forensic methods can be utilized. The legacy nature and somewhat diverse or disparate component aspects of control systems environments can often prohibit the smooth translation of modern forensics analysis into the control systems domain. Compounded by a wide variety of proprietary technologies and protocols, as well as critical system technologies with no capability to store significant amounts of event information, the task of creating a ubiquitous and unified strategy for technical cyber forensics on a control systems device or computing resource is far from trivial. To date, no direction regarding cyber forensics as it relates to control systems has been produced other than what might be privately available from commercial vendors. Current materials have been designed to support event recreation (event-based), and although important, these requirements do not always satisfy the needs associated with incident response or forensics that are driven by cyber incidents.
To address these issues and to accommodate for the diversity in both system and architecture types, a framework based in recommended practices to address forensics in the control systems domain is required. This framework must be fully flexible to allow for deployment into any control systems environment regardless of technologies used. Moreover, the framework and practices must provide for direction on the integration of modern network security technologies with traditionally closed systems, the result being a true defense-in-depth strategy for control systems architectures. This document takes the traditional concepts of cyber forensics and forensics engineering and provides direction regarding augmentation for control systems operational environments. The goal is to provide guidance to the reader with specifics relating to the complexity of cyber forensics for control systems, guidance to allow organizations to create a self-sustaining cyber forensics program, and guidance to support the maintenance and evolution of such programs. As the current control systems cyber security community of interest is without any specific direction on how to proceed with forensics in control systems environments, this information product is intended to be a first step.
NASA Astrophysics Data System (ADS)
Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng
2018-04-01
In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture shows performances of better speed-device consumption efficiency compared with the previous kernel storage architecture. Further we improve the architecture for a high accuracy and low power computing by utilizing the binary storage and the series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows excellent performances including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) more than 67 times speed boost; 3) 71.4% energy saving.
NASA Technical Reports Server (NTRS)
Hsia, T. C.; Lu, G. Z.; Han, W. H.
1987-01-01
In advanced robot control problems, on-line computation of the inverse Jacobian solution is frequently required. A parallel processing architecture is an effective way to reduce computation time. A parallel processing architecture is developed for the inverse Jacobian (inverse differential kinematic equation) of the PUMA arm. The proposed pipeline/parallel algorithm can be implemented on an IC chip using systolic linear arrays. This implementation requires 27 processing cells and 25 time units. Computation time is thus significantly reduced.
Method and computer program product for maintenance and modernization backlogging
Mattimore, Bernard G; Reynolds, Paul E; Farrell, Jill M
2013-02-19
According to one embodiment, a computer program product for determining future facility conditions includes a computer readable medium having computer readable program code stored therein. The computer readable program code includes computer readable program code for calculating a time period specific maintenance cost, for calculating a time period specific modernization factor, and for calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. In another embodiment, a computer-implemented method for calculating future facility conditions includes calculating a time period specific maintenance cost, calculating a time period specific modernization factor, and calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. Other embodiments are also presented.
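The relation stated in this abstract is a plain sum of three time-period-specific terms. A minimal Python sketch of that arithmetic follows; the class and function names are hypothetical, the input values are made up for illustration, and how each term is itself calculated is not described in the abstract and is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class PeriodInputs:
    """Hypothetical container for one time period's three terms."""
    maintenance_cost: float
    modernization_factor: float
    backlog_factor: float


def future_facility_conditions(period: PeriodInputs) -> float:
    # Sum of the three time-period-specific terms, as stated in the abstract.
    return (period.maintenance_cost
            + period.modernization_factor
            + period.backlog_factor)


# Illustrative (made-up) inputs for a single period.
print(future_facility_conditions(PeriodInputs(100.0, 25.0, 10.0)))  # 135.0
```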
The Common Evolution of Geometry and Architecture from a Geodetic Point of View
NASA Astrophysics Data System (ADS)
Bellone, T.; Fiermonte, F.; Mussio, L.
2017-05-01
Throughout history the link between geometry and architecture has been strong and while architects have used mathematics to construct their buildings, geometry has always been the essential tool allowing them to choose spatial shapes which are aesthetically appropriate. Sometimes it is geometry which drives architectural choices, but at other times it is architectural innovation which facilitates the emergence of new ideas in geometry. Among the best known types of geometry (Euclidean, projective, analytical, Topology, descriptive, fractal,…) those most frequently employed in architectural design are: - Euclidean Geometry - Projective Geometry - The non-Euclidean geometries. Entire architectural periods are linked to specific types of geometry. Euclidean geometry, for example, was the basis for architectural styles from Antiquity through to the Romanesque period. Perspective and Projective geometry, for their part, were important from the Gothic period through the Renaissance and into the Baroque and Neo-classical eras, while non-Euclidean geometries characterize modern architecture.
Fluid/Structure Interaction Studies of Aircraft Using High Fidelity Equations on Parallel Computers
NASA Technical Reports Server (NTRS)
Guruswamy, Guru; VanDalsem, William (Technical Monitor)
1994-01-01
Aeroelasticity, which involves strong coupling of fluids, structures and controls, is an important element in designing an aircraft. Computational aeroelasticity using low fidelity methods, such as the linear aerodynamic flow equations coupled with the modal structural equations, is well advanced. Though these low fidelity approaches are computationally less intensive, they are not adequate for the analysis of modern aircraft such as High Speed Civil Transport (HSCT) and Advanced Subsonic Transport (AST) which can experience complex flow/structure interactions. HSCT can experience vortex induced aeroelastic oscillations whereas AST can experience transonic buffet associated structural oscillations. Both aircraft may experience a dip in the flutter speed at the transonic regime. For accurate aeroelastic computations at these complex fluid/structure interaction situations, high fidelity equations such as the Navier-Stokes for fluids and the finite-elements for structures are needed. Computations using these high fidelity equations require large computational resources both in memory and speed. Current conventional super computers have reached their limitations both in memory and speed. As a result, parallel computers have evolved to overcome the limitations of conventional computers. This paper will address the transition that is taking place in computational aeroelasticity from conventional computers to parallel computers. The paper will address special techniques needed to take advantage of the architecture of new parallel computers. Results will be illustrated from computations made on iPSC/860 and IBM SP2 computers by using the ENSAERO code that directly couples the Euler/Navier-Stokes flow equations with high resolution finite-element structural equations.
Laghari, Samreen; Niazi, Muaz A
2016-01-01
Computer Networks have a tendency to grow at an unprecedented scale. Modern networks involve not only computers but also a wide variety of other interconnected devices ranging from mobile phones to other household items fitted with sensors. This vision of the "Internet of Things" (IoT) implies an inherent difficulty in modeling problems. It is practically impossible to implement and test all scenarios for large-scale and complex adaptive communication networks as part of Complex Adaptive Communication Networks and Environments (CACOONS). The goal of this study is to explore the use of Agent-based Modeling as part of the Cognitive Agent-based Computing (CABC) framework to model a complex communication network problem. We use Exploratory Agent-based Modeling (EABM), as part of the CABC framework, to develop an autonomous multi-agent architecture for managing carbon footprint in a corporate network. To evaluate the application of complexity in practical scenarios, we have also introduced a company-defined computer usage policy. The conducted experiments demonstrated two important results. First, a CABC-based modeling approach such as Agent-based Modeling can be an effective approach to modeling complex problems in the domain of IoT. Second, the specific problem of managing the carbon footprint can be solved using a multiagent system approach.
Taylor, Andrea B; Vinyard, Christopher J
2013-05-01
The jaw-closing muscles are responsible for generating many of the forces and movements associated with feeding. Muscle physiologic cross-sectional area (PCSA) and fiber length are two architectural parameters that heavily influence muscle function. While there have been numerous comparative studies of hominoid and hominin craniodental and mandibular morphology, little is known about hominoid jaw-muscle fiber architecture. We present novel data on masseter and temporalis internal muscle architecture for small- and large-bodied hominoids. Hominoid scaling patterns are evaluated and compared with representative New- (Cebus) and Old-World (Macaca) monkeys. Variation in hominoid jaw-muscle fiber architecture is related to both absolute size and allometry. PCSAs scale close to isometry relative to jaw length in anthropoids, but likely with positive allometry in hominoids. Thus, large-bodied apes may be capable of generating both absolutely and relatively greater muscle forces compared with smaller-bodied apes and monkeys. Compared with extant apes, modern humans exhibit a reduction in masseter PCSA relative to condyle-M1 length but retain relatively long fibers, suggesting humans may have sacrificed relative masseter muscle force during chewing without appreciably altering muscle excursion/contraction velocity. Lastly, craniometric estimates of PCSAs underestimate hominoid masseter and temporalis PCSAs by more than 50% in gorillas, and overestimate masseter PCSA by as much as 30% in humans. These findings underscore the difficulty of accurately estimating jaw-muscle fiber architecture from craniometric measures and suggest models of fossil hominin and hominoid bite forces will be improved by incorporating architectural data in estimating jaw-muscle forces. Copyright © 2013 Wiley Periodicals, Inc.
Trend analysis of modern high-rise construction
NASA Astrophysics Data System (ADS)
Radushinsky, Dmitry; Gubankov, Andrey; Mottaeva, Asiiat
2018-03-01
The article reviews the main trends of modern high-rise construction and considers a number of architectural, engineering and technological, economic and image factors that have influenced the intensification of construction of high-rise buildings in the 21st century. The key factors of modern high-rise construction are identified, which are associated with an attractive image component for businessmen and politicians, with the ability to translate current views on architecture and innovations in construction technologies and the lobbying of relevant structures, as well as the opportunity to serve as an effective driver in the development of a complex of national economy sectors with the achievement of a multiplicative effect. An assessment is given of the priority role of foreign architectural bureaus in the design of super-high buildings in Russia at the present stage. The economic expediency of constructing high-rise buildings, including those with only a residential function, is investigated. The article also examines the role of skyscrapers as an important component of the image of cities in the marketing of places and territories, and the connection between the availability of a high-rise center (the City) and the possibilities of attracting a "creative class" and creating a large working space for specialists on the basis of territorial proximity and density of high-rise buildings.
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
Physical Realization of a Supervised Learning System Built with Organic Memristive Synapses
NASA Astrophysics Data System (ADS)
Lin, Yu-Pu; Bennett, Christopher H.; Cabaret, Théo; Vodenicarevic, Damir; Chabi, Djaafar; Querlioz, Damien; Jousselme, Bruno; Derycke, Vincent; Klein, Jacques-Olivier
2016-09-01
Multiple modern applications of electronics call for inexpensive chips that can perform complex operations on natural data with limited energy. A vision for accomplishing this is implementing hardware neural networks, which fuse computation and memory, with low cost organic electronics. A challenge, however, is the implementation of synapses (analog memories) composed of such materials. In this work, we introduce robust, fastly programmable, nonvolatile organic memristive nanodevices based on electrografted redox complexes that implement synapses thanks to a wide range of accessible intermediate conductivity states. We demonstrate experimentally an elementary neural network, capable of learning functions, which combines four pairs of organic memristors as synapses and conventional electronics as neurons. Our architecture is highly resilient to issues caused by imperfect devices. It tolerates inter-device variability and an adaptable learning rule offers immunity against asymmetries in device switching. Highly compliant with conventional fabrication processes, the system can be extended to larger computing systems capable of complex cognitive tasks, as demonstrated in complementary simulations.
NASA Astrophysics Data System (ADS)
Lu, Bin; Cheng, Xiaomin; Feng, Jinlong; Guan, Xiawei; Miao, Xiangshui
2016-07-01
Nonvolatile memory devices or circuits that can implement both storage and calculation are a crucial requirement for improving the efficiency of modern computers. In this work, we realize logic functions by using a [GeTe/Sb2Te3]n super lattice phase change memory (PCM) cell in which a higher threshold voltage is needed for phase change when a magnetic field is applied. First, the [GeTe/Sb2Te3]n super lattice cells were fabricated and the R-V curve was measured. Then we designed the logic circuits with the super lattice PCM cell, verified by HSPICE simulation and experiments. Seven basic logic functions are first demonstrated in this letter; then several multi-input logic gates are presented. The proposed logic devices offer the advantages of simple structures and low power consumption, indicating that the super lattice PCM has potential in future nonvolatile central processing unit design, facilitating the development of massively parallel computing architectures.
Pei, Zongrui; Max-Planck-Inst. fur Eisenforschung, Duseldorf; Eisenbach, Markus
2017-02-06
Dislocations are among the most important defects in determining the mechanical properties of both conventional alloys and high-entropy alloys. The Peierls-Nabarro model supplies an efficient pathway to their geometries and mobility. The difficulty in solving the integro-differential Peierls-Nabarro equation is how to effectively avoid the local minima in the energy landscape of a dislocation core. Among the methods available to optimize the dislocation core structures, we choose Particle Swarm Optimization, an algorithm that simulates the social behaviors of organisms. By employing more particles (a bigger swarm) and more iterative steps (allowing them to explore for a longer time), the local minima can be effectively avoided, although this requires more computational cost. The advantage of this algorithm is that it is readily parallelized on modern high-performance computing architectures. We demonstrate that the performance of our parallelized algorithm scales linearly with the number of employed cores.
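For readers unfamiliar with Particle Swarm Optimization, the following is a minimal, generic Python sketch of the method on a toy objective. It does not implement the Peierls-Nabarro model or the authors' parallel code; the function names, the toy sphere objective, and the parameter values (inertia and the two acceleration coefficients) are assumptions chosen for illustration only.

```python
import random


def sphere(point):
    """Toy objective with a single minimum at the origin (illustrative only)."""
    return sum(x * x for x in point)


def particle_swarm(objective, dim, n_particles=30, n_steps=200,
                   bounds=(-5.0, 5.0), inertia=0.7,
                   c_cognitive=1.5, c_social=1.5, seed=0):
    """Return (best_position, best_value) found by a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's own best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    swarm_pos, swarm_val = best_pos[g][:], best_val[g]  # best seen by the swarm
    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own best,
                # and pull toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (swarm_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            # This evaluation is independent per particle, so it is the
            # natural unit of work to distribute across cores.
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < swarm_val:
                    swarm_val, swarm_pos = val, pos[i][:]
    return swarm_pos, swarm_val


position, value = particle_swarm(sphere, dim=2)
print(position, value)
```

Raising n_particles or n_steps corresponds to the bigger swarm and longer exploration described in the abstract, at proportionally higher cost. The loop here is sequential; a parallel version would hand the per-particle objective evaluations to separate workers.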
Physical Realization of a Supervised Learning System Built with Organic Memristive Synapses.
Lin, Yu-Pu; Bennett, Christopher H; Cabaret, Théo; Vodenicarevic, Damir; Chabi, Djaafar; Querlioz, Damien; Jousselme, Bruno; Derycke, Vincent; Klein, Jacques-Olivier
2016-09-07
Multiple modern applications of electronics call for inexpensive chips that can perform complex operations on natural data with limited energy. A vision for accomplishing this is implementing hardware neural networks, which fuse computation and memory, with low cost organic electronics. A challenge, however, is the implementation of synapses (analog memories) composed of such materials. In this work, we introduce robust, fastly programmable, nonvolatile organic memristive nanodevices based on electrografted redox complexes that implement synapses thanks to a wide range of accessible intermediate conductivity states. We demonstrate experimentally an elementary neural network, capable of learning functions, which combines four pairs of organic memristors as synapses and conventional electronics as neurons. Our architecture is highly resilient to issues caused by imperfect devices. It tolerates inter-device variability and an adaptable learning rule offers immunity against asymmetries in device switching. Highly compliant with conventional fabrication processes, the system can be extended to larger computing systems capable of complex cognitive tasks, as demonstrated in complementary simulations.
WDEC: A Code for Modeling White Dwarf Structure and Pulsations
NASA Astrophysics Data System (ADS)
Bischoff-Kim, Agnès; Montgomery, Michael H.
2018-05-01
The White Dwarf Evolution Code (WDEC), written in Fortran, makes models of white dwarf stars. It is fast, versatile, and includes the latest physics. The code evolves hot (∼100,000 K) input models down to a chosen effective temperature by relaxing the models to be solutions of the equations of stellar structure. The code can also be used to obtain g-mode oscillation modes for the models. WDEC has a long history going back to the late 1960s. Over the years, it has been updated and re-packaged for modern computer architectures and has specifically been used in computationally intensive asteroseismic fitting. Generations of white dwarf astronomers and dozens of publications have made use of the WDEC, although the last true instrument paper is the original one, published in 1975. This paper discusses the history of the code, necessary to understand why it works the way it does, details the physics and features in the code today, and points the reader to where to find the code and a user guide.
Art at the Airport: An Exploration of New Art Worlds
ERIC Educational Resources Information Center
Szekely, Ilona
2012-01-01
Many airports have transformed empty waiting spaces into mini malls, children's play areas, and displays of beautiful art, making a long wait a bit more pleasant. For the modern airport, showcasing art has become an important component, with perks including a built-in global audience, as well as the vast spaces of modern architecture. For the art…
Collaborative Working Architecture for IoT-Based Applications.
Mora, Higinio; Signes-Pont, María Teresa; Gil, David; Johnsson, Magnus
2018-05-23
The new sensing applications need enhanced computing capabilities to handle the requirements of complex and huge data processing. The Internet of Things (IoT) concept brings processing and communication features to devices. In addition, the Cloud Computing paradigm provides resources and infrastructures for performing the computations and outsourcing the work from the IoT devices. This scenario opens new opportunities for designing advanced IoT-based applications, however, there is still much research to be done to properly gear all the systems for working together. This work proposes a collaborative model and an architecture to take advantage of the available computing resources. The resulting architecture involves a novel network design with different levels which combines sensing and processing capabilities based on the Mobile Cloud Computing (MCC) paradigm. An experiment is included to demonstrate that this approach can be used in diverse real applications. The results show the flexibility of the architecture to perform complex computational tasks of advanced applications.
NASA Astrophysics Data System (ADS)
Golovina, Svetlana; Oblasov, Yurii
2018-03-01
The skyscraper is a significant architectural structure in the world's largest cities. The appearance of a skyscraper in a city's architectural composition enhances its status, introduces dynamics into the shape of the city, and modernizes the existing environment. Its architectural structure can have both expressive triumphal forms and ascetic ones. For a deep understanding, the architecture of high-rise buildings must be considered according to several criteria. Various approaches can be found in the competitive development of high-rise buildings in Moscow and US cities in the middle of the twentieth century. In this article we consider how, and on what basis, the architectural decisions of high-rise buildings were formed.
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
TTEthernet for Integrated Spacecraft Networks
NASA Technical Reports Server (NTRS)
Loveless, Andrew
2015-01-01
Aerospace projects have traditionally employed federated avionics architectures, in which each computer system is designed to perform one specific function (e.g. navigation). There are obvious downsides to this approach, including excessive weight (from so much computing hardware), and inefficient processor utilization (since modern processors are capable of performing multiple tasks). There has therefore been a push for integrated modular avionics (IMA), in which common computing platforms can be leveraged for different purposes. This consolidation of multiple vehicle functions to shared computing platforms can significantly reduce spacecraft cost, weight, and design complexity. However, the application of IMA principles introduces significant challenges, as the data network must accommodate traffic of mixed criticality and performance levels - potentially all related to the same shared computer hardware. Because individual network technologies are rarely so competent, the development of truly integrated network architectures often proves unreasonable. Several different types of networks are utilized - each suited to support a specific vehicle function. Critical functions are typically driven by precise timing loops, requiring networks with strict guarantees regarding message latency (i.e. determinism) and fault-tolerance. Alternatively, non-critical systems generally employ data networks prioritizing flexibility and high performance over reliable operation. Switched Ethernet has seen widespread success filling this role in terrestrial applications. Its high speed, flexibility, and the availability of inexpensive commercial off-the-shelf (COTS) components make it desirable for inclusion in spacecraft platforms. Basic Ethernet configurations have been incorporated into several preexisting aerospace projects, including both the Space Shuttle and International Space Station (ISS). 
However, classical switched Ethernet cannot provide the high level of network determinism required by real-time spacecraft applications. Even with modern advancements, the uncoordinated (i.e. event-driven) nature of Ethernet communication unavoidably leads to message contention within network switches. The arbitration process used to resolve such conflicts introduces variation in the time it takes for messages to be forwarded. TTEthernet introduces decentralized clock synchronization to switched Ethernet, enabling message transmission according to a time-triggered (TT) paradigm. A network planning tool is used to allocate each device a finite amount of time in which it may transmit a frame. Each time slot is repeated sequentially to form a periodic communication schedule that is then loaded onto each TTEthernet device (e.g. switches and end systems). Each network participant references the synchronized time in order to dispatch messages at predetermined instances. This schedule guarantees that no contention exists between time-triggered Ethernet frames in the network switches, therefore eliminating the need for arbitration (and the timing variation it causes). Besides time-triggered messaging, TTEthernet networks may provide two additional traffic classes to support communication of different criticality levels. In the rate-constrained (RC) traffic class, the frame payload size and rate of transmission along each communication channel are limited to predetermined maximums. The network switches can therefore be configured to accommodate the known worst-case traffic pattern, and buffer overflows can be eliminated. The best-effort (BE) traffic class behaves akin to classical Ethernet. No guarantees are provided regarding transmission latency or successful message delivery.
TTEthernet coordinates transmission of all three traffic classes over the same physical connections, therefore accommodating the full spectrum of traffic criticality levels required in IMA architectures. Common computing platforms (e.g. LRUs) can share networking resources in such a way that failures in non-critical systems (using BE or RC communication modes) cannot impact flight-critical functions (using TT communication). Furthermore, TTEthernet hardware (e.g. switches, cabling) can be shared by both TTEthernet and classical Ethernet traffic.
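The time-triggered schedule described in this abstract can be illustrated with a minimal sketch. It is a toy model, not the TTEthernet planning tool: the device names, the microsecond units, and the functions `build_schedule`, `has_contention`, and `owner_at` are all hypothetical, and windows are simply laid out back to back within one cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    device: str      # hypothetical device name
    start_us: int    # offset from the start of the cycle, in microseconds
    length_us: int   # length of the transmission window


def build_schedule(windows, cycle_us):
    """Lay out one transmission window per device, back to back, in one cycle.

    `windows` maps a device name to its window length in microseconds.
    Raises ValueError if the windows do not fit in the cycle.
    """
    slots, offset = [], 0
    for device, length in windows.items():
        if offset + length > cycle_us:
            raise ValueError("windows do not fit in one cycle")
        slots.append(Slot(device, offset, length))
        offset += length
    return slots


def has_contention(slots):
    """Return True if any two windows overlap in time."""
    ordered = sorted(slots, key=lambda s: s.start_us)
    return any(a.start_us + a.length_us > b.start_us
               for a, b in zip(ordered, ordered[1:]))


def owner_at(slots, t_us, cycle_us):
    """Device allowed to transmit at synchronized time t_us, or None if idle."""
    phase = t_us % cycle_us  # the schedule repeats every cycle
    for s in slots:
        if s.start_us <= phase < s.start_us + s.length_us:
            return s.device
    return None


# Assumed example: three devices sharing a 1000 microsecond cycle.
schedule = build_schedule({"nav": 200, "gnc": 300, "telemetry": 100}, cycle_us=1000)
```

Because every device derives its window from the same synchronized time, no two windows in this sketch overlap, which is the property that removes the need for arbitration. The unallocated remainder of the cycle is idle in this model.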
[Data collection in anesthesia. Experiences with the inauguration of a new information system].
Zbinden, A M; Rothenbühler, H; Häberli, B
1997-06-01
In many institutions information systems are used to process off-line anaesthesia data for invoices, statistical purposes, and quality assurance. Information systems are also increasingly being used to improve process control in order to reduce costs. Most of today's systems were created when information technology and working processes in anaesthesia were very different from those in use today. Thus, many institutions must now replace their computer systems but are probably not aware of how complex this change will be. Modern information systems mostly use client-server architecture and relational data bases. Substituting an old system with a new one is frequently a greater task than designing a system from scratch. This article gives the conclusions drawn from the experience obtained when a large departmental computer system was redesigned in a university hospital. The new system was based on a client-server architecture and was developed by an external company without preceding conceptual analysis. Modules for patient, anaesthesia, surgical, and pain-service data were included. Data were analysed using a separate statistical package (RS/1 from Bolt Beranek), taking advantage of its powerful precompiled procedures. Development and introduction of the new system took much more time and effort than expected despite the use of modern software tools. Introduction of the new program required intensive user training despite the choice of modern graphic screen layouts. Automatic data-reading systems could not be used, as too many faults occurred and the effort for the user was too high. However, after the initial problems were solved the system turned out to be a powerful tool for quality control (both process and outcome quality), billing, and scheduling. The statistical analysis of the data resulted in meaningful and relevant conclusions.
Before creating a new information system, the working processes have to be analysed and, if possible, made more efficient; a detailed programme specification must then be made. A servicing and maintenance contract should be drawn up before the order is given to a company. Time periods of equal duration have to be scheduled for defining, writing, testing and introducing the program. Modern client-server systems with relational data bases are by no means simpler to establish and maintain than previous mainframe systems with hierarchical data bases, and thus, experienced computer specialists need to be close at hand. We recommend collecting data only once for both statistics and quality control. To verify data quality, a system of random spot-sampling has to be established. Despite the large investments needed to build up such a system, we consider it a powerful tool for helping to solve the difficult daily problems of managing a surgical and anaesthesia unit.
GeantV: from CPU to accelerators
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Arora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Sehgal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The GeantV project aims to research and develop the next-generation simulation software describing the passage of particles through matter. While modern CPU architectures are being targeted first, resources such as GPGPU, Intel© Xeon Phi, Atom or ARM can no longer be ignored by HEP CPU-bound applications. The proof-of-concept GeantV prototype has been engineered mainly for CPUs with vector units, but from the early stages we have foreseen a bridge to arbitrary accelerators. A software layer consisting of architecture- and technology-specific backends currently supports this concept. This approach makes it possible to abstract out basic types such as scalar/vector and to formalize generic computation kernels that transparently use library- or device-specific constructs based on Vc, CUDA, Cilk+ or Intel intrinsics. While the main goal of this approach is portable performance, as a bonus it insulates the core application and algorithms from the technology layer. This keeps our application maintainable in the long term and adaptable to changes on the backend side. The paper presents the first results of basket-based GeantV geometry navigation on the Intel© Xeon Phi KNC architecture. We present the scalability and vectorization study, conducted using Intel performance tools, as well as our preliminary conclusions on the use of accelerators for GeantV transport. We also describe the current work and preliminary results for using the GeantV transport kernel on GPUs.
NASA Astrophysics Data System (ADS)
Leon, Angel Luis
2003-11-01
This thesis reports on the study of the acoustic properties of 18 theaters belonging to the Andalusian historical and architectural heritage. These theaters have undergone recent renovations to modernize and equip them appropriately. Coincident with this work, evaluations and qualification assessments of their acoustic properties have been carried out for the individual theaters and for the group as a whole. Data for this purpose consisted of acoustic measurements in situ, both before and after the renovation. These results have been compared with computer simulations of sound fields. Variables and parameters considered include the following: reverberation time, rapid speech transition index, background noise, definition, clarity, strength, lateral efficiency, interaural cross-correlation coefficient, volume/seat ratio, volume/audience-area ratio. Based on the measurements and analysis, general conclusions are given in regard to the acoustic performance of theaters whose typology and size are comparable to those that were used in this study (between 800 and 8000 cubic meters). It is noted that these properties are comparable to those of the majority of European theaters. The results and conclusions are presented so that they should be of interest to architectural acoustics practitioners and to architects who are involved in the planning of renovation projects for theaters. Thesis advisors: Juan J. Sendra and Jaime Navarro. Copies of this thesis written in Spanish may be obtained by contacting the author, Angel L. Leon, E.T.S. de Arquitectura de Sevilla, Dpto. de Construcciones Arquitectonicas I, Av. Reina Mercedes, 2, 41012 Sevilla, Spain. E-mail address: leonr@us.es
Computer vision camera with embedded FPGA processing
NASA Astrophysics Data System (ADS)
Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel
2000-03-01
Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
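The Laplacian of Gaussian edge detection mentioned at the end of this abstract can be sketched in software as follows. This is a single-scale illustration in plain Python, not the FPGA architecture from the paper: the kernel size, the value of sigma, the synthetic step-edge image, and the function names are assumptions, and the kernel is sampled without its normalization constant and then shifted to sum to zero.

```python
import math


def log_kernel(size, sigma):
    """Sampled Laplacian of Gaussian (odd size), shifted to sum to zero."""
    half = size // 2
    k = [[0.0] * size for _ in range(size)]
    for y in range(-half, half + 1):
        for x in range(-half, half + 1):
            r2 = x * x + y * y
            gauss = math.exp(-r2 / (2 * sigma ** 2))
            k[y + half][x + half] = (r2 - 2 * sigma ** 2) / sigma ** 4 * gauss
    mean = sum(map(sum, k)) / (size * size)
    return [[v - mean for v in row] for row in k]


def convolve(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image (no padding)."""
    n = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - n + 1):
        row = []
        for j in range(w - n + 1):
            row.append(sum(image[i + a][j + b] * kernel[n - 1 - a][n - 1 - b]
                           for a in range(n) for b in range(n)))
        out.append(row)
    return out


def zero_crossings(row, eps=1e-9):
    """Indices j where the response changes sign between samples j and j + 1."""
    signs = [0 if abs(v) < eps else (1 if v > 0 else -1) for v in row]
    return [j for j in range(len(signs) - 1) if signs[j] * signs[j + 1] < 0]


# Assumed test image: a vertical step edge between columns 5 and 6.
image = [[0.0] * 6 + [1.0] * 6 for _ in range(9)]
response = convolve(image, log_kernel(5, 1.0))
edges = zero_crossings(response[0])
```

Output column j of the valid convolution is centred on image column j + 2 for a 5x5 kernel, so a sign change between output samples 3 and 4 marks the edge between image columns 5 and 6. A multi-scale detector would repeat this for several values of sigma.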
State-of-the-art in Heterogeneous Computing
Brodtkorb, Andre R.; Dyken, Christopher; Hagen, Trond R.; ...
2010-01-01
Node level heterogeneous architectures have become attractive during the last decade for several reasons: compared to traditional symmetric CPUs, they offer high peak performance and are energy and/or cost efficient. With the increase of fine-grained parallelism in high-performance computing, as well as the introduction of parallelism in workstations, there is an acute need for a good overview and understanding of these architectures. We give an overview of the state-of-the-art in heterogeneous computing, focusing on three commonly found architectures: the Cell Broadband Engine Architecture, graphics processing units (GPUs), and field programmable gate arrays (FPGAs). We present a review of hardware, available software tools, and an overview of state-of-the-art techniques and algorithms. Furthermore, we present a qualitative and quantitative comparison of the architectures, and give our view on the future of heterogeneous computing.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1991-01-01
The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
Advanced computer architecture for large-scale real-time applications.
DOT National Transportation Integrated Search
1973-04-01
Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...
Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.
ERIC Educational Resources Information Center
Beltrametti, Monica; English, Will
1994-01-01
Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…
ERIC Educational Resources Information Center
Farid, Ayman A.; Zaghloul, Weaam M.; Dewidar, Khaled M.
2014-01-01
The great shift in sustainability and computer aided design in the field of architecture caused a remarkable change in the architecture philosophy. New aspects of beauty and aesthetic values are being introduced, and traditional definitions of beauty cannot fully cover these aspects, which causes a gap between new architecture works criticism and…
NASA Technical Reports Server (NTRS)
Colloredo, Scott; Gray, James A.
2011-01-01
The impending conclusion of the Space Shuttle Program and the Constellation Program cancellation unveiled in the FY2011 President's budget created a large void for human spaceflight capability and specifically launch activity from the Florida Launch Site (FLS). This void created an opportunity to re-architect the launch site to be more accommodating to the future NASA heavy lift and commercial space industry. The goal is to evolve the heritage capabilities into a more affordable and flexible launch complex. This case study will discuss the FLS architecture evolution from the trade studies to select primary launch site locations for future customers, to improving infrastructure; promoting environmental remediation/compliance; improving offline processing, manufacturing, & recovery; developing range interface and control services with the US Air Force, and developing modernization efforts for the Launch Pad, Vehicle Assembly Building, Mobile Launcher, and supporting infrastructure. The architecture studies will steer how to best invest limited modernization funding from initiatives like the 21 st elSe and other potential funding.
Programmable hardware for reconfigurable computing systems
NASA Astrophysics Data System (ADS)
Smith, Stephen
1996-10-01
In 1945 the work of J. von Neumann and H. Goldstein created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
Efficient Phase Unwrapping Architecture for Digital Holographic Microscopy
Hwang, Wen-Jyi; Cheng, Shih-Chang; Cheng, Chau-Jern
2011-01-01
This paper presents a novel phase unwrapping architecture for accelerating the computational speed of digital holographic microscopy (DHM). A fast Fourier transform (FFT) based phase unwrapping algorithm providing a minimum squared error solution is adopted for hardware implementation because of its simplicity and robustness to noise. The proposed architecture is realized in a pipeline fashion to maximize throughput of the computation. Moreover, the number of hardware multipliers and dividers is minimized to reduce the hardware costs. The proposed architecture is used as a custom user logic in a system on programmable chip (SOPC) for physical performance measurement. Experimental results reveal that the proposed architecture is effective for expediting the computational speed while consuming low hardware resources for designing an embedded DHM system. PMID:22163688
Fault tolerant and lifetime control architecture for autonomous vehicles
NASA Astrophysics Data System (ADS)
Bogdanov, Alexander; Chen, Yi-Liang; Sundareswaran, Venkataraman; Altshuler, Thomas
2008-04-01
Increased vehicle autonomy, survivability and utility can provide an unprecedented impact on mission success and are one of the most desirable improvements for modern autonomous vehicles. We propose a general architecture of intelligent resource allocation, reconfigurable control and system restructuring for autonomous vehicles. The architecture is based on fault-tolerant control and lifetime prediction principles, and it provides improved vehicle survivability, extended service intervals, greater operational autonomy through lower rate of time-critical mission failures and lesser dependence on supplies and maintenance. The architecture enables mission distribution, adaptation and execution constrained on vehicle and payload faults and desirable lifetime. The proposed architecture will allow managing missions more efficiently by weighing vehicle capabilities versus mission objectives and replacing the vehicle only when it is necessary.
ERIC Educational Resources Information Center
Betts, Janelle Lyon
2001-01-01
Describes a high school art assignment in which students utilize Appleworks or Claris Works to design their own house, after learning about architectural styles and how to use the computer program. States that the project develops student computer skills and increases student knowledge about architecture. (CMK)
NASA Astrophysics Data System (ADS)
Georgiev, K.; Zlatev, Z.
2010-11-01
The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The computational domain of the model covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If DEM is applied on fine grids, its discretization leads to a huge computational problem. This implies that a model such as DEM must be run on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, some results are presented comparing runs of this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.). The main idea in the parallel version of DEM is the domain partitioning approach. The effective use of the cache and hierarchical memories of modern computers is discussed, as well as the performance, speed-ups and efficiency achieved. The parallel code of DEM, created by using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are presented in short.
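The domain partitioning idea named in this abstract can be illustrated with a minimal sketch. It is not the DEM code and does not use MPI: the row-wise split, the one-row stencil width, and the functions `partition_rows` and `halo_rows` are assumptions chosen only to show how a grid might be divided among processes and which boundary rows neighbours would have to exchange. It assumes at least as many rows as ranks.

```python
def partition_rows(n_rows, n_ranks):
    """Split n_rows grid rows into n_ranks contiguous, near-equal blocks.

    Returns a list of (start, stop) half-open row ranges, one per rank.
    """
    base, extra = divmod(n_rows, n_ranks)
    ranges, start = [], 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional row each.
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def halo_rows(ranges, rank):
    """Rows a rank must receive from its neighbours for a one-row stencil."""
    start, stop = ranges[rank]
    n_rows = ranges[-1][1]
    halo = []
    if start > 0:
        halo.append(start - 1)   # last row of the block above
    if stop < n_rows:
        halo.append(stop)        # first row of the block below
    return halo


# Assumed example: a grid with 10 rows divided among 4 processes.
ranges = partition_rows(10, 4)
```

In a message-passing implementation each rank would compute on its own block and exchange only the halo rows with its neighbours at each step, which keeps communication small relative to computation as the grid is refined.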
The National Library of Kosovo "PJETER Bogdani" Rapid Condition Assessment and Documentation
NASA Astrophysics Data System (ADS)
Eppich, R.; Ramku, B.; Binakaj, N.
2017-08-01
The National Library of Kosovo "Pjetër Bogdani" is a symbol of Prishtina, Kosovo and the quest for knowledge. It is simultaneously an icon of modernity and symbol of the past. Unfortunately, it suffered through the Kosovo war and neglect in times of economic difficulty. It was also featured in the travel section of the British newspaper The Telegraph: "One of the world's 30 ugliest buildings?" In late 2015 the Kosovo Architectural Foundation, a non-profit dedicated to the spirit of creating and preserving unique architecture, became concerned with the reputation and condition of the Library, contacted the Kosovo Ministry of Culture, visited the site and initiated a project to raise awareness and document this modern masterpiece. The Getty Foundation and their Keeping it Modern grant program awarded funding for initial condition assessment, documentation, capacity building and investigations. This paper discusses the project to document and improve the image and awareness of this important structure and set priorities for its future.
Analysis of view synthesis prediction architectures in modern coding standards
NASA Astrophysics Data System (ADS)
Tian, Dong; Zou, Feng; Lee, Chris; Vetro, Anthony; Sun, Huifang
2013-09-01
Depth-based 3D formats are currently being developed as extensions to both AVC and HEVC standards. The availability of depth information facilitates the generation of intermediate views for advanced 3D applications and displays, and also enables more efficient coding of the multiview input data through view synthesis prediction (VSP) techniques. This paper outlines several approaches that have been explored to realize view synthesis prediction in modern video coding standards such as AVC and HEVC. The benefits and drawbacks of various architectures are analyzed in terms of performance, complexity, and other design considerations. It is concluded that block-based VSP for multiview video signals provides attractive coding gains with complexity comparable to traditional motion/disparity compensation.
US NDC Modernization: Service Oriented Architecture Proof of Concept
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamlet, Benjamin R.; Encarnacao, Andre Villanova; Jackson, Keilan R.
2014-12-01
This report is a progress update on the US NDC Modernization Service Oriented Architecture (SOA) study describing results from a proof of concept project completed from May through September 2013. Goals for this proof of concept are 1) gain experience configuring, using, and running an Enterprise Service Bus (ESB), 2) understand the implications of wrapping existing software in standardized interfaces for use as web services, and 3) gather performance metrics for a notional seismic event monitoring pipeline implemented using services with various data access and communication patterns. The proof of concept is a follow on to a previous SOA performance study. Work was performed by four undergraduate summer student interns under the guidance of Sandia staff.
Evaluation of Visual Computer Simulator for Computer Architecture Education
ERIC Educational Resources Information Center
Imai, Yoshiro; Imai, Masatoshi; Moritoh, Yoshio
2013-01-01
This paper presents trial evaluation of a visual computer simulator in 2009-2011, which has been developed to play some roles of both instruction facility and learning tool simultaneously. And it illustrates an example of Computer Architecture education for University students and usage of e-Learning tool for Assembly Programming in order to…
Design of a massively parallel computer using bit serial processing elements
NASA Technical Reports Server (NTRS)
Aburdene, Maurice F.; Khouri, Kamal S.; Piatt, Jason E.; Zheng, Jianqing
1995-01-01
A 1-bit serial processor designed for a parallel computer architecture is described. This processor is used to develop a massively parallel computational engine, with a single instruction-multiple data (SIMD) architecture. The computer is simulated and tested to verify its operation and to measure its performance for further development.
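The abstract does not describe how the processing elements operate, so the following is only a generic sketch of bit-serial SIMD arithmetic, not the design from the report. It assumes each processing element (PE) holds a one-bit carry and that all PEs execute the same full-adder step on their own operands, one bit position per step; the function name and operand width are hypothetical.

```python
def bit_serial_add(a_words, b_words, width):
    """Add pairs of unsigned integers one bit per step across all PEs.

    Each list index stands for one processing element (PE). At every step,
    all PEs apply the same full-adder operation to their own bit, which is
    the lockstep behaviour of a SIMD array.
    Returns (sums, carries): the width-bit sums and the final carry bits.
    """
    carries = [0] * len(a_words)
    sums = [0] * len(a_words)
    for bit in range(width):                      # one broadcast step per bit
        for pe, (a, b) in enumerate(zip(a_words, b_words)):
            x, y, c = (a >> bit) & 1, (b >> bit) & 1, carries[pe]
            sums[pe] |= (x ^ y ^ c) << bit        # sum bit of a full adder
            carries[pe] = (x & y) | (c & (x ^ y))  # carry out to the next bit
    return sums, carries


# Assumed example: three PEs adding 8-bit operands.
sums, carries = bit_serial_add([3, 5, 255], [1, 10, 1], width=8)
```

The inner loop over PEs stands in for hardware parallelism: in a real array the per-bit step would take the same time regardless of how many PEs there are, so throughput grows with the array size even though each addition takes `width` steps.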
A heterogeneous hierarchical architecture for real-time computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skroch, D.A.; Fornaro, R.J.
The need for high-speed data acquisition and control algorithms has prompted continued research in the area of multiprocessor systems and related programming techniques. The result presented here is a unique hardware and software architecture for high-speed real-time computer systems. The implementation of a prototype of this architecture has required the integration of architecture, operating systems and programming languages into a cohesive unit. This report describes a Heterogeneous Hierarchical Architecture for Real-Time (H² ART) and system software for program loading and interprocessor communication.
An Architecture for SCADA Network Forensics
NASA Astrophysics Data System (ADS)
Kilpatrick, Tim; Gonzalez, Jesus; Chandia, Rodrigo; Papa, Mauricio; Shenoi, Sujeet
Supervisory control and data acquisition (SCADA) systems are widely used in industrial control and automation. Modern SCADA protocols often employ TCP/IP to transport sensor data and control signals. Meanwhile, corporate IT infrastructures are interconnecting with previously isolated SCADA networks. The use of TCP/IP as a carrier protocol and the interconnection of IT and SCADA networks raise serious security issues. This paper describes an architecture for SCADA network forensics. In addition to supporting forensic investigations of SCADA network incidents, the architecture incorporates mechanisms for monitoring process behavior, analyzing trends and optimizing plant performance.
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction. PMID:24904400
NASA Astrophysics Data System (ADS)
Al-Refaie, Ahmed F.; Tennyson, Jonathan
2017-12-01
Construction and diagonalization of the Hamiltonian matrix is the rate-limiting step in most low-energy electron-molecule collision calculations. Tennyson (1996) implemented a novel algorithm for Hamiltonian construction which took advantage of the structure of the wavefunction in such calculations. This algorithm is re-engineered to make use of modern computer architectures and the use of appropriate diagonalizers is considered. Test calculations demonstrate that significant speed-ups can be gained using multiple CPUs. This opens the way to calculations which consider higher collision energies, larger molecules and/or more target states. The methodology, which is implemented as part of the UK molecular R-matrix codes (UKRMol and UKRMol+), can also be used for studies of bound molecular Rydberg states, photoionization and positron-molecule collisions.
Development of an Object-Oriented Turbomachinery Analysis Code within the NPSS Framework
NASA Technical Reports Server (NTRS)
Jones, Scott M.
2014-01-01
During the preliminary or conceptual design phase of an aircraft engine, the turbomachinery designer needs to estimate the effects of a large number of design parameters such as flow size, stage count, blade count, radial position, etc. on the weight and efficiency of a turbomachine. Computer codes are invariably used to perform this task; however, such codes are often very old, written in outdated languages with arcane input files, and rarely adaptable to new architectures or unconventional layouts. Given the need to perform these kinds of preliminary design trades, a modern 2-D turbomachinery design and analysis code has been written using the Numerical Propulsion System Simulation (NPSS) framework. This paper discusses the development of the governing equations and the structure of the primary objects used in the Object-Oriented Turbomachinery Analysis Code (OTAC).
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mendygral, P. J.; Radcliffe, N.; Kandalla, K.
2017-02-01
We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
2D-RBUC for efficient parallel compression of residuals
NASA Astrophysics Data System (ADS)
Đurđević, Đorđe M.; Tartalja, Igor I.
2018-02-01
In this paper, we present a method for lossless compression of residuals with an efficient SIMD parallel decompression. The residuals originate from lossy or near-lossless compression of height fields, which are commonly used to represent models of terrains. The algorithm is founded on the existing RBUC method for compression of non-uniform data sources. We have adapted the method to capture 2D spatial locality of height fields, and developed the data decompression algorithm for modern GPU architectures already present even in home computers. In combination with the point-level SIMD-parallel lossless/lossy height field compression method HFPaC, characterized by fast progressive decompression and a seamlessly reconstructed surface, the newly proposed method trades off a small efficiency degradation for a non-negligible compression ratio benefit (measured up to 91%).
Automatic high-throughput screening of colloidal crystals using machine learning
NASA Astrophysics Data System (ADS)
Spellings, Matthew; Glotzer, Sharon C.
Recent improvements in hardware and software have united to pose an interesting problem for computational scientists studying self-assembly of particles into crystal structures: while studies covering large swathes of parameter space can be dispatched at once using modern supercomputers and parallel architectures, identifying the different regions of a phase diagram is often a serial task completed by hand. While analytic methods exist to distinguish some simple structures, they can be difficult to apply, and automatic identification of more complex structures is still lacking. In this talk we describe one method to create numerical "fingerprints" of local order and use them to analyze a study of complex ordered structures. We can use these methods as first steps toward automatic exploration of parameter space and, more broadly, the strategic design of new materials.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiang, Nai-Yuan; Zavala, Victor M.
We present a filter line-search algorithm that does not require inertia information of the linear system. This feature enables the use of a wide range of linear algebra strategies and libraries, which is essential to tackle large-scale problems on modern computing architectures. The proposed approach performs curvature tests along the search step to detect negative curvature and to trigger convexification. We prove that the approach is globally convergent and we implement the approach within a parallel interior-point framework to solve large-scale and highly nonlinear problems. Our numerical tests demonstrate that the inertia-free approach is as efficient as inertia detection via symmetric indefinite factorizations. We also demonstrate that the inertia-free approach can lead to reductions in solution time because it reduces the amount of convexification needed.
Three Program Architecture for Design Optimization
NASA Technical Reports Server (NTRS)
Miura, Hirokazu; Olson, Lawrence E. (Technical Monitor)
1998-01-01
In this presentation, I would like to review the historical perspective on the program architecture used to build design optimization capabilities based on mathematical programming and other numerical search techniques. It is rather straightforward to classify the program architecture in three categories as shown above. However, the relative importance of each of the three approaches has not been static; instead, it has changed dynamically as the capabilities of available computational resources have increased. For example, we once considered that the direct coupling architecture would never be used for practical problems, but the availability of computer systems such as multi-processors has changed that view. In this presentation, I would like to review the roles of the three architectures from historical as well as current and future perspectives. There may also be some possibility for the emergence of a hybrid architecture. I hope to provide some seeds for active discussion of where we are heading in the very dynamic environment of high speed computing and communication.
ERIC Educational Resources Information Center
Arumi, Francisco N.
Computer programs capable of describing the thermal behavior of buildings are used to help architectural students understand environmental systems. The Numerical Simulation Laboratory at the Architectural School of the University of Texas at Austin was developed to provide the necessary software capable of simulating the energy transactions…
A Distributed Architecture for Tsunami Early Warning and Collaborative Decision-support in Crises
NASA Astrophysics Data System (ADS)
Moßgraber, J.; Middleton, S.; Hammitzsch, M.; Poslad, S.
2012-04-01
The presentation will describe work on the system architecture that is being developed in the EU FP7 project TRIDEC on "Collaborative, Complex and Critical Decision-Support in Evolving Crises". The challenges for a Tsunami Early Warning System (TEWS) are manifold and the success of a system depends crucially on the system's architecture. A modern warning system following a system-of-systems approach has to integrate various components and sub-systems such as different information sources, services and simulation systems. Furthermore, it has to take into account the distributed and collaborative nature of warning systems. In order to create an architecture that supports the whole spectrum of a modern, distributed and collaborative warning system one must deal with multiple challenges. Obviously, one cannot expect to tackle these challenges adequately with a monolithic system or with a single technology. Therefore, a system architecture providing the blueprints to implement the system-of-systems approach has to combine multiple technologies and architectural styles. At the bottom layer it has to reliably integrate a large set of conventional sensors, such as seismic sensors and sensor networks, buoys and tide gauges, and also innovative and unconventional sensors, such as streams of messages from social media services. At the top layer it has to support collaboration on high-level decision processes and facilitate information sharing between organizations. In between, the system has to process all data and integrate information on a semantic level in a timely manner. This complex communication follows an event-driven mechanism allowing events to be published, detected and consumed by various applications within the architecture. Therefore, at the upper layer the event-driven architecture (EDA) aspects are combined with principles of service-oriented architectures (SOA) using standards for communication and data exchange.
The most prominent challenges on this layer include providing a framework for information integration on a syntactic and semantic level, leveraging distributed processing resources for a scalable data processing platform, and automating data processing and decision support workflows.
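The event-driven mechanism described in this abstract (events published, detected and consumed by independent applications) can be illustrated with a minimal in-process sketch. Everything here is an assumption made for illustration: the `EventBus` class, the topic names, and the magnitude threshold are hypothetical and are not taken from the TRIDEC system, which the abstract describes only at the architectural level.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class EventBus:
    """Minimal publish/subscribe broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to every subscriber of the topic; return the delivery count."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


def demo() -> List[str]:
    bus = EventBus()
    log: List[str] = []

    # Top layer: a decision-support consumer that only sees alerts.
    bus.subscribe("alert.tsunami", lambda e: log.append(f"ALERT region={e['region']}"))

    # Middle layer: detects significant readings and publishes a derived event.
    def detect(reading: dict) -> None:
        if reading["magnitude"] >= 7.0:  # assumed illustrative threshold
            bus.publish("alert.tsunami", {"region": reading["region"]})

    # Bottom layer: raw sensor readings arrive on their own topic.
    bus.subscribe("sensor.seismic", detect)
    bus.publish("sensor.seismic", {"region": "A", "magnitude": 5.1})
    bus.publish("sensor.seismic", {"region": "B", "magnitude": 7.4})
    return log
```

Calling `demo()` routes two sensor readings through the detector; only the second crosses the assumed threshold and reaches the alert consumer. The producers and consumers never reference each other directly, which is the decoupling property the event-driven style is meant to provide.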
NASA Astrophysics Data System (ADS)
Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.
2017-12-01
As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.
[Notes on hospital architecture in Brazil: between the traditional and the modern].
Costa, Renato Gama-Rosa
2011-12-01
The relationship between the history of health assistance and architecture is not always obvious. The article points to some challenges in investigating this relation, which is most readily visible in the construction of medical facilities, especially hospitals and sanitariums. In Brazil, this fledgling field has begun drawing the attention of researchers from the applied human and social sciences, especially in more recent decades.
Scaling Support Vector Machines On Modern HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Fu, Haohuan; Song, Shuaiwen
2015-02-01
We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86-based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
ERIC Educational Resources Information Center
Meehan, Mark W.
2012-01-01
This dissertation investigates the development and function of the Institute of Traditional Islamic Art and Architecture in Amman, Jordan. A vertical case study using grounded theory methodology, the research attempts to create a rich and holistic understanding of the Institute. Specific areas of study include the factors involved in the founding…
Trust-Management, Intrusion-Tolerance, Accountability, and Reconstitution Architecture (TIARA)
2009-12-01
Tainting, tagged, metadata, architecture, hardware, processor, microkernel, zero-kernel, co-design...microkernels (e.g., [27]) embraced the idea that it was beneficial to reduce the kernel, separating out services as separate processes isolated from...limited adoption. More recently Tanenbaum [72] notes the security virtues of microkernels and suggests the modern importance of security makes it
NASA Technical Reports Server (NTRS)
Weeks, Cindy Lou
1986-01-01
Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas; Schuman, Catherine; Patton, Robert
The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshop we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.
A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potok, Thomas E; Schuman, Catherine D; Young, Steven R
Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphics processing unit (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.
ERIC Educational Resources Information Center
Amenyo, John-Thones
2012-01-01
Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…
Overview of computational structural methods for modern military aircraft
NASA Technical Reports Server (NTRS)
Kudva, J. N.
1992-01-01
Computational structural methods are essential for designing modern military aircraft. This briefing deals with computational structural methods (CSM) currently used. First a brief summary of modern day aircraft structural design procedures is presented. Following this, several ongoing CSM related projects at Northrop are discussed. Finally, shortcomings in this area, future requirements, and summary remarks are given.
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures are best suited to which problems. The architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
Switching from computer to microcomputer architecture education
NASA Astrophysics Data System (ADS)
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-03-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to microcomputer architecture. The authors present their strategies towards a successful crossing of boundaries between engineering disciplines. This communication aims at providing a different aspect on professional courses that are, nowadays, addressed at the expense of traditional courses.
Three-Dimensional Nanobiocomputing Architectures With Neuronal Hypercells
2007-06-01
Neumann architectures, and CMOS fabrication. Novel solutions of massive parallel distributed computing and processing (pipelined due to systolic... and processing platforms utilizing molecular hardware within an enabling organization and architecture. The design technology is based on utilizing a...Microsystems and Nanotechnologies investigated a novel 3D3 (Hardware Software Nanotechnology) technology to design super-high performance computing
On the Impact of Execution Models: A Case Study in Computational Chemistry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chavarría-Miranda, Daniel; Halappanavar, Mahantesh; Krishnamoorthy, Sriram
2015-05-25
Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore the impact of execution models on the utilization of an HPC system. We demonstrate a 50 percent improvement in performance by using work stealing relative to a more traditional static scheduling approach. We also use a novel semi-matching technique for load balancing that has comparable performance to a traditional hypergraph-based partitioning implementation, which is computationally expensive. Using this study, we found that execution model design choices and assumptions can limit critical optimizations such as global, dynamic load balancing and finding the correct balance between available work units and different system and runtime overheads. With the emergence of multi- and many-core architectures and the consequent growth in the complexity of HPC platforms, we believe that these lessons will be beneficial to researchers tuning diverse applications on modern HPC platforms, especially on emerging dynamic platforms with energy-induced performance variability.
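The contrast this abstract draws between static scheduling and work stealing can be made concrete with a toy makespan simulation. This is only a sketch under stated assumptions: the task costs, the contiguous-block partition, and the steal-from-the-longest-queue policy are hypothetical choices for illustration, and the result says nothing about the chemistry kernel or the 50 percent figure reported in the study.

```python
import heapq
from collections import deque
from typing import List, Sequence


def split_static(costs: Sequence[float], workers: int) -> List[List[float]]:
    """Partition tasks into contiguous blocks, one block per worker."""
    base, extra = divmod(len(costs), workers)
    blocks, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        blocks.append(list(costs[start:start + size]))
        start += size
    return blocks


def makespan_static(costs: Sequence[float], workers: int) -> float:
    """Each worker runs only its own block; the slowest worker sets the finish time."""
    return max(sum(block) for block in split_static(costs, workers))


def makespan_work_stealing(costs: Sequence[float], workers: int) -> float:
    """Workers start from the same blocks, but an idle worker steals one task
    from the tail of the currently longest queue (assumed policy)."""
    queues = [deque(block) for block in split_static(costs, workers)]
    idle_at = [(0.0, w) for w in range(workers)]  # (time the worker becomes idle, id)
    heapq.heapify(idle_at)
    finish = 0.0
    while idle_at:
        now, w = heapq.heappop(idle_at)
        if queues[w]:
            cost = queues[w].popleft()      # take own work from the front
        else:
            victim = max(range(workers), key=lambda v: len(queues[v]))
            if not queues[victim]:
                continue                    # nothing left anywhere; worker retires
            cost = queues[victim].pop()     # steal from the victim's tail
        finish = max(finish, now + cost)
        heapq.heappush(idle_at, (now + cost, w))
    return finish


# Deliberately imbalanced workload: four expensive tasks followed by four cheap ones.
COSTS = [8, 8, 8, 8, 1, 1, 1, 1]
```

With two workers, `makespan_static(COSTS, 2)` gives 32 time units because one worker receives all of the expensive tasks, while `makespan_work_stealing(COSTS, 2)` gives 20 because the lightly loaded worker takes over two of them. The simulation ignores steal overhead and communication cost, which are among the runtime overheads the abstract says must be balanced against available work.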
FEBio: finite elements for biomechanics.
Maas, Steve A; Ellis, Benjamin J; Ateshian, Gerard A; Weiss, Jeffrey A
2012-01-01
In the field of computational biomechanics, investigators have primarily used commercial software that is neither geared toward biological applications nor sufficiently flexible to follow the latest developments in the field. This lack of a tailored software environment has hampered research progress, as well as dissemination of models and results. To address these issues, we developed the FEBio software suite (http://mrl.sci.utah.edu/software/febio), a nonlinear implicit finite element (FE) framework, designed specifically for analysis in computational solid biomechanics. This paper provides an overview of the theoretical basis of FEBio and its main features. FEBio offers modeling scenarios, constitutive models, and boundary conditions, which are relevant to numerous applications in biomechanics. The open-source FEBio software is written in C++, with particular attention to scalar and parallel performance on modern computer architectures. Software verification is a large part of the development and maintenance of FEBio, and to demonstrate the general approach, the description and results of several problems from the FEBio Verification Suite are presented and compared to analytical solutions or results from other established and verified FE codes. An additional simulation is described that illustrates the application of FEBio to a research problem in biomechanics. Together with the pre- and postprocessing software PREVIEW and POSTVIEW, FEBio provides a tailored solution for research and development in computational biomechanics.
Switching from Computer to Microcomputer Architecture Education
ERIC Educational Resources Information Center
Bolanakis, Dimosthenis E.; Kotsis, Konstantinos T.; Laopoulos, Theodore
2010-01-01
In the last decades, the technological and scientific evolution of the computing discipline has been widely affecting research in software engineering education, which nowadays advocates more enlightened and liberal ideas. This article reviews cross-disciplinary research on a computer architecture class in consideration of its switching to…
An Architecture for Cross-Cloud System Management
NASA Astrophysics Data System (ADS)
Dodda, Ravi Teja; Smith, Chris; van Moorsel, Aad
The emergence of the cloud computing paradigm promises flexibility and adaptability through on-demand provisioning of compute resources. As the utilization of cloud resources extends beyond a single provider, for business as well as technical reasons, the issue of effectively managing such resources comes to the fore. Different providers expose different interfaces to their compute resources utilizing varied architectures and implementation technologies. This heterogeneity poses a significant system management problem, and can limit the extent to which the benefits of cross-cloud resource utilization can be realized. We address this problem through the definition of an architecture to facilitate the management of compute resources from different cloud providers in a homogeneous manner. This preserves the flexibility and adaptability promised by the cloud computing paradigm, whilst enabling the benefits of cross-cloud resource utilization to be realized. The practical efficacy of the architecture is demonstrated through an implementation utilizing compute resources managed through different interfaces on the Amazon Elastic Compute Cloud (EC2) service. Additionally, we provide empirical results highlighting the performance differential of these different interfaces, and discuss the impact of this performance differential on efficiency and profitability.
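One common way to manage heterogeneous provider interfaces uniformly, as this abstract sets out to do, is an adapter layer behind a single abstract interface. The sketch below is an assumption-laden illustration, not the architecture from the paper: `ComputeProvider`, `FakeCloudA`, `FakeCloudB` and both adapters are hypothetical in-memory stand-ins, and no real cloud API is called.

```python
from abc import ABC, abstractmethod
from itertools import count
from typing import Dict, List


class ComputeProvider(ABC):
    """Uniform management interface (hypothetical, not taken from the paper)."""

    @abstractmethod
    def start(self, label: str) -> str:
        """Provision one compute resource and return its identifier."""

    @abstractmethod
    def stop(self, resource_id: str) -> None:
        """Release the resource with the given identifier."""

    @abstractmethod
    def running(self) -> List[str]:
        """Return the identifiers of resources that are currently up."""


class FakeCloudA:
    """Stand-in provider with one native method per operation."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.vms: Dict[str, str] = {}

    def run_instance(self, tag: str) -> str:
        vm_id = f"a-{next(self._ids)}"
        self.vms[vm_id] = tag
        return vm_id

    def terminate_instance(self, vm_id: str) -> None:
        del self.vms[vm_id]


class FakeCloudB:
    """Stand-in provider with a single request-document entry point."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.nodes: Dict[str, dict] = {}

    def submit(self, request: dict) -> dict:
        if request["action"] == "create":
            node = {"id": f"b-{next(self._ids)}", "label": request["label"], "state": "up"}
            self.nodes[node["id"]] = node
            return node
        if request["action"] == "delete":
            self.nodes[request["id"]]["state"] = "down"
            return self.nodes[request["id"]]
        raise ValueError(f"unknown action: {request['action']}")


class AdapterA(ComputeProvider):
    """Maps the uniform interface onto FakeCloudA's method calls."""

    def __init__(self, cloud: FakeCloudA) -> None:
        self._cloud = cloud

    def start(self, label: str) -> str:
        return self._cloud.run_instance(label)

    def stop(self, resource_id: str) -> None:
        self._cloud.terminate_instance(resource_id)

    def running(self) -> List[str]:
        return sorted(self._cloud.vms)


class AdapterB(ComputeProvider):
    """Maps the uniform interface onto FakeCloudB's request documents."""

    def __init__(self, cloud: FakeCloudB) -> None:
        self._cloud = cloud

    def start(self, label: str) -> str:
        return self._cloud.submit({"action": "create", "label": label})["id"]

    def stop(self, resource_id: str) -> None:
        self._cloud.submit({"action": "delete", "id": resource_id})

    def running(self) -> List[str]:
        return sorted(n["id"] for n in self._cloud.nodes.values() if n["state"] == "up")


def demo() -> Dict[str, List[str]]:
    providers: Dict[str, ComputeProvider] = {
        "a": AdapterA(FakeCloudA()),
        "b": AdapterB(FakeCloudB()),
    }
    ids = {name: p.start("worker") for name, p in providers.items()}
    providers["a"].stop(ids["a"])  # management code never touches a native API
    return {name: p.running() for name, p in providers.items()}
```

The management code in `demo()` is written once against `ComputeProvider`; supporting another provider means adding one adapter rather than changing callers. A real implementation would also have to reconcile differences the fakes hide, such as asynchronous provisioning and the per-interface performance differences the abstract reports.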
NASA Astrophysics Data System (ADS)
Sarkar, Subir; Banerjee, Santanu; Samanta, Pradip; Chakraborty, Nivedita; Chakraborty, Partha Pratim; Mukhopadhyay, Soumik; Singh, Arvind K.
2014-09-01
Microbial mat-related structures (MRS) in siliciclastics have been investigated from four Proterozoic formations in India, namely the Marwar Supergroup, the Vindhyan Supergroup, the Chhattisgarh Supergroup and the Khariar Group, for their spectral variations, genetic aspects, palaeo-environmental significance and influence on sequence stratigraphic architecture. The maximum diversification of MRS is found in shallow marine coastal Precambrian successions. Observations from modern environments as well as the Precambrian rock record clearly indicate that features like petee ridges, sand-cracks, gas domes, multi-directed ripples, reticulate surfaces, sieve-like surfaces and setulf are most likely to form in the shallowest part of marine basins, in upper intertidal to supratidal conditions, while wrinkle structures, roll-up structures and patchy ripples had a broader range of palaeogeographic settings, from supratidal to subtidal conditions. The discoidal microbial colony (DMC) represents a special variety of mat-layer feature in modern environments that may have diverse internal architecture and sometimes falsely resembles Ediacaran medusoids. The uniqueness of the sequence stratigraphic architecture of microbial mat-covered sediment is reflected by the presence of more amalgamated HSTs compared to TSTs. The preservation of forced and normal regressive deposits on a low-gradient epeiric shelf under low continental freeboard indicates that a microbial mat-infested sea floor impedes erosion, and concomitant sediment supply may facilitate the formation and preservation of regressive packages.
Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.; ...
2017-01-03
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient.
Thermalnet: a Deep Convolutional Network for Synthetic Thermal Image Generation
NASA Astrophysics Data System (ADS)
Kniaz, V. V.; Gorbatsevich, V. S.; Mizginov, V. A.
2017-05-01
Deep convolutional neural networks have dramatically changed the landscape of modern computer vision. Nowadays methods based on deep neural networks show the best performance among image recognition and object detection algorithms. While the polishing of network architectures has received a lot of scholarly attention, from the practical point of view the preparation of a large image dataset for successful training of a neural network has become one of the major challenges. This challenge is particularly profound for image recognition in wavelengths lying outside the visible spectrum. For example, no infrared or radar image datasets large enough for successful training of a deep neural network are available to date in the public domain. Recent advances in deep neural networks show that they are also capable of arbitrary image transformations such as super-resolution image generation, grayscale image colorisation and imitation of the style of a given artist. Thus a natural question arises: how could deep neural networks be used for augmentation of existing large image datasets? This paper is focused on the development of the Thermalnet deep convolutional neural network for augmentation of existing large visible image datasets with synthetic thermal images. The Thermalnet network architecture is inspired by colorisation deep neural networks.
A Specification for a Godunov-type Eulerian 2-D Hydrocode, Revision 0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nystrom, William D; Robey, Jonathan M
2012-05-01
The purpose of this code specification is to describe an algorithm for solving the Euler equations of hydrodynamics in a 2D rectangular region in sufficient detail to allow a software developer to produce an implementation on their target platform using their programming language of choice without requiring detailed knowledge and experience in the field of computational fluid dynamics. It should be possible for a software developer who is proficient in the programming language of choice and is knowledgeable of the target hardware to produce an efficient implementation of this specification if they also possess a thorough working knowledge of parallel programming and have some experience in scientific programming using fields and meshes. On modern architectures, it will be important to focus on issues related to the exploitation of the fine grain parallelism and data locality present in this algorithm. This specification aims to make that task easier by presenting the essential details of the algorithm in a systematic and language neutral manner while also avoiding the inclusion of implementation details that would likely be specific to a particular type of programming paradigm or platform architecture.
Mantle convection on modern supercomputers
NASA Astrophysics Data System (ADS)
Weismüller, Jens; Gmeiner, Björn; Mohr, Marcus; Waluga, Christian; Wohlmuth, Barbara; Rüde, Ulrich; Bunge, Hans-Peter
2015-04-01
Mantle convection is the cause of plate tectonics, the formation of mountains and oceans, and the main driving mechanism behind earthquakes. The convection process is modeled by a system of partial differential equations describing the conservation of mass, momentum and energy. Characteristic of mantle flow is the vast disparity of length scales from global to microscopic, turning mantle convection simulations into a challenging application for high-performance computing. As system size and technical complexity of the simulations continue to increase, design and implementation of simulation models for next generation large-scale architectures demand an interdisciplinary co-design. Here we report on recent advances of the TERRA-NEO project, which is part of the high visibility SPPEXA program, and a joint effort of four research groups in computer sciences, mathematics and geophysical application under the leadership of FAU Erlangen. TERRA-NEO develops algorithms for future HPC infrastructures, focusing on high computational efficiency and resilience in next generation mantle convection models. We present software that can resolve the Earth's mantle with up to 10^12 grid points and scales efficiently to massively parallel hardware with more than 50,000 processors. We use our simulations to explore the dynamic regime of mantle convection assessing the impact of small scale processes on global mantle flow.
EMG amplifier with wireless data transmission
NASA Astrophysics Data System (ADS)
Kowalski, Grzegorz; Wildner, Krzysztof
2017-08-01
Wireless medical diagnostics is a trend in modern technology used in medicine. This paper presents the concept, hardware architecture and software implementation of an electromyography (EMG) signal amplifier with wireless data transmission. This amplifier consists of three components: an analogue bioelectric signal processing module, a micro-controller circuit and an application enabling data acquisition via a personal computer. The analogue bioelectric signal processing circuit receives electromyography signals from the skin surface, followed by initial analogue processing and preparation of the signals for further digital processing. The second module is a micro-controller circuit designed to wirelessly transmit the electromyography signals from the analogue signal converter to a personal computer. Its purpose is to eliminate the need for wired connections between the patient and the data logging device. The third block is a computer application designed to display the transmitted electromyography signals, as well as data capture and analysis. Its purpose is to provide a graphical representation of the collected data. The entire device has been thoroughly tested to ensure proper functioning. In use, the device displayed the captured electromyography signal from the arm of the patient. Amplitude-frequency characteristics were set in order to investigate the bandwidth and the overall gain of the device.
A Low Cost VLSI Architecture for Spike Sorting Based on Feature Extraction with Peak Search.
Chang, Yuan-Jyun; Hwang, Wen-Jyi; Chen, Chih-Chang
2016-12-07
The goal of this paper is to present a novel VLSI architecture for spike sorting with high classification accuracy, low area costs and low power consumption. A novel feature extraction algorithm with low computational complexities is proposed for the design of the architecture. In the feature extraction algorithm, a spike is separated into two portions based on its peak value. The area of each portion is then used as a feature. The algorithm is simple to implement and less susceptible to noise interference. Based on the algorithm, a novel architecture capable of identifying peak values and computing spike areas concurrently is proposed. To further accelerate the computation, a spike can be divided into a number of segments for the local feature computation. The local features are subsequently merged with the global ones by a simple hardware circuit. The architecture can also be easily operated in conjunction with the circuits for commonly-used spike detection algorithms, such as the Non-linear Energy Operator (NEO). The architecture has been implemented by an Application-Specific Integrated Circuit (ASIC) with 90-nm technology. Comparisons to the existing works show that the proposed architecture is well suited for real-time multi-channel spike detection and feature extraction requiring low hardware area costs, low power consumption and high classification accuracy.
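The feature extraction step described above (split a spike at its peak, use the area of each portion as a feature) can be sketched in a few lines of software. This is only an illustrative sketch, not the authors' hardware design: the function name `peak_area_features` is hypothetical, the peak is assumed to be the sample with the largest absolute amplitude, the area is assumed to be the sum of absolute sample values, and the peak sample is arbitrarily assigned to the first portion.

```python
def peak_area_features(spike):
    """Split a spike at its peak and return the area of each portion.

    Assumptions (not taken from the paper): the peak is the sample with
    the largest absolute amplitude, and "area" is the sum of absolute
    sample values. The peak sample is counted in the first portion.
    """
    if not spike:
        raise ValueError("spike must contain at least one sample")
    # Locate the peak sample.
    peak = max(range(len(spike)), key=lambda i: abs(spike[i]))
    # Area of the portion up to and including the peak.
    area_before = sum(abs(v) for v in spike[:peak + 1])
    # Area of the portion after the peak.
    area_after = sum(abs(v) for v in spike[peak + 1:])
    return area_before, area_after


if __name__ == "__main__":
    print(peak_area_features([0, 1, 3, 2, 1]))
```

Because both areas are plain running sums, the peak search and the accumulation can proceed in a single pass over the samples, which is consistent with the abstract's remark that peak identification and area computation can run concurrently.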
GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing
Fang, Ye; Ding, Yun; Feinstein, Wei P.; Koppelman, David M.; Moreno, Juana; Jarrell, Mark; Ramanujam, J.; Brylinski, Michal
2016-01-01
Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of a large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs yielding a near-ideal scaling. Further, using Xeon Phi gives 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249. PMID:27420300
NASA Astrophysics Data System (ADS)
Ford, Eric B.
2009-05-01
We present the results of a highly parallel Kepler equation solver using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX and the "Compute Unified Device Architecture" (CUDA) programming environment. We apply this to evaluate a goodness-of-fit statistic (e.g., χ²) for Doppler observations of stars potentially harboring multiple planetary companions (assuming negligible planet-planet interactions). Given the high dimensionality of the model parameter space (at least five dimensions per planet), a global search is extremely computationally demanding. We expect that the underlying Kepler solver and model evaluator will be combined with a wide variety of more sophisticated algorithms to provide efficient global search, parameter estimation, model comparison, and adaptive experimental design for radial velocity and/or astrometric planet searches. We tested multiple implementations using single precision, double precision, pairs of single precision, and mixed precision arithmetic. We find that the vast majority of computations can be performed using single precision arithmetic, with selective use of compensated summation for increased precision. However, standard single precision is not adequate for calculating the mean anomaly from the time of observation and orbital period when evaluating the goodness-of-fit for real planetary systems and observational data sets. Using all double precision, our GPU code outperforms a similar code using a modern CPU by a factor of over 60. Using mixed precision, our GPU code provides a speed-up factor of over 600, when evaluating n_sys > 1024 model planetary systems each containing n_pl = 4 planets and assuming n_obs = 256 observations of each system. We conclude that modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's equation and a goodness-of-fit statistic for orbital models when presented with a large parameter space.
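For readers unfamiliar with the building blocks named in this abstract, the following serial sketch shows Kepler's equation, M = E - e sin E, solved by Newton's method, together with compensated (Kahan) summation. It is not the paper's CUDA code: the function names, the starting guess, the tolerance and the choice of Newton's method are assumptions made for illustration, and Python floats are double precision throughout.

```python
import math


def mean_anomaly(t, period, t0=0.0):
    """Mean anomaly M = 2*pi*(t - t0)/P, with t0 the (assumed) reference epoch."""
    return 2.0 * math.pi * ((t - t0) / period)


def solve_kepler(m_anom, ecc, tol=1e-12, max_iter=50):
    """Solve M = E - e*sin(E) for the eccentric anomaly E by Newton's method."""
    if not 0.0 <= ecc < 1.0:
        raise ValueError("eccentricity must be in [0, 1)")
    m = m_anom % (2.0 * math.pi)  # reduce M to [0, 2*pi)
    # Common starting guess: M for moderate eccentricity, pi otherwise.
    e_anom = m if ecc < 0.8 else math.pi
    for _ in range(max_iter):
        residual = e_anom - ecc * math.sin(e_anom) - m
        delta = residual / (1.0 - ecc * math.cos(e_anom))
        e_anom -= delta
        if abs(delta) < tol:
            break
    return e_anom


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition."""
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation
        t = total + y
        compensation = (t - total) - y  # low-order bits lost in total + y
        total = t
    return total


if __name__ == "__main__":
    E = solve_kepler(mean_anomaly(2.5, 10.0), 0.3)
    print(E, E - 0.3 * math.sin(E))
```

Each model evaluation solves the equation independently for every observation time, which is why the problem maps naturally onto many GPU threads; a goodness-of-fit sum over observations is the kind of accumulation where compensated summation helps at reduced precision.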
NASA Astrophysics Data System (ADS)
Lavrov, V. V.; Spirin, N. A.
2016-09-01
Advances in modern science and technology are inherently connected with the development, implementation, and widespread use of computer systems based on mathematical modeling. Algorithms and computer systems are gaining practical significance solving a range of process tasks in metallurgy of MES-level (Manufacturing Execution Systems - systems controlling industrial process) of modern automated information systems at the largest iron and steel enterprises in Russia. This fact determines the necessity to develop information-modeling systems based on mathematical models that will take into account the physics of the process, the basics of heat and mass exchange, the laws of energy conservation, and also the peculiarities of the impact of technological and standard characteristics of raw materials on the manufacturing process data. Special attention in this set of operations for metallurgic production is devoted to blast-furnace production, as it consumes the greatest amount of energy, up to 50% of the fuel used in ferrous metallurgy. The paper deals with the requirements, structure and architecture of BF Process Engineer's Automated Workstation (AWS), a computer decision support system of MES Level implemented in the ICS of the Blast Furnace Plant at Magnitogorsk Iron and Steel Works. It presents a brief description of main model subsystems as well as assumptions made in the process of mathematical modelling. Application of the developed system allows the engineering and process staff to analyze online production situations in the blast furnace plant, to solve a number of process tasks related to control of heat, gas dynamics and slag conditions of blast-furnace smelting as well as to calculate the optimal composition of blast-furnace slag, which eventually results in increasing technical and economic performance of blast-furnace production.
Cache Energy Optimization Techniques For Modern Processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mittal, Sparsh
2013-01-01
Modern multicore processors are employing large last-level caches, for example Intel's E7-8800 processor uses 24MB L3 cache. Further, with each CMOS technology generation, leakage energy has been dramatically increasing and hence, leakage energy is expected to become a major source of energy dissipation, especially in last-level caches (LLCs). The conventional schemes of cache energy saving either aim at saving dynamic energy or are based on properties specific to first-level caches, and thus these schemes have limited utility for last-level caches. Further, several other techniques require offline profiling or per-application tuning and hence are not suitable for product systems. In this book, we present novel cache leakage energy saving schemes for single-core and multicore systems; desktop, QoS, real-time and server systems. Also, we present cache energy saving techniques for caches designed with both conventional SRAM devices and emerging non-volatile devices such as STT-RAM (spin-torque transfer RAM). We present software-controlled, hardware-assisted techniques which use dynamic cache reconfiguration to configure the cache to the most energy efficient configuration while keeping the performance loss bounded. To profile and test a large number of potential configurations, we utilize low-overhead, micro-architecture components, which can be easily integrated into modern processor chips. We adopt a system-wide approach to save energy to ensure that cache reconfiguration does not increase energy consumption of other components of the processor. We have compared our techniques with state-of-the-art techniques and have found that our techniques outperform them in terms of energy efficiency and other relevant metrics. The techniques presented in this book have important applications in improving energy-efficiency of higher-end embedded, desktop, QoS, real-time, server processors and multitasking systems.
This book is intended to be a valuable guide for both newcomers and veterans in the field of cache power management. It will help graduate students, CAD tool developers and designers in understanding the need of energy efficiency in modern computing systems. Further, it will be useful for researchers in gaining insights into algorithms and techniques for micro-architectural and system-level energy optimization using dynamic cache reconfiguration. We sincerely believe that the "food for thought" presented in this book will inspire the readers to develop even better ideas for designing "green" processors of tomorrow.
Integrating Xgrid into the HENP distributed computing model
NASA Astrophysics Data System (ADS)
Hajdu, L.; Kocoloski, A.; Lauret, J.; Miller, M.
2008-07-01
Modern Macintosh computers feature Xgrid, a distributed computing architecture built directly into Apple's OS X operating system. While the approach is radically different from those generally expected by the Unix based Grid infrastructures (Open Science Grid, TeraGrid, EGEE), opportunistic computing on Xgrid is nonetheless a tempting and novel way to assemble a computing cluster with a minimum of additional configuration. In fact, it requires only the default operating system and authentication to a central controller from each node. OS X also implements arbitrarily extensible metadata, allowing an instantly updated file catalog to be stored as part of the filesystem itself. The low barrier to entry allows an Xgrid cluster to grow quickly and organically. This paper and presentation will detail the steps that can be taken to make such a cluster a viable resource for HENP research computing. We will further show how to provide users with a unified job submission framework by integrating Xgrid through the STAR Unified Meta-Scheduler (SUMS), putting task and job submission within easy reach of users already using the tool for traditional Grid or local cluster job submission. We will discuss additional steps that can be taken to make an Xgrid cluster a full partner in grid computing initiatives, focusing on Open Science Grid integration. MIT's Xgrid system currently supports the work of multiple research groups in the Laboratory for Nuclear Science, and has become an important tool for generating simulations and conducting data analyses at the Massachusetts Institute of Technology.
Generic Divide and Conquer Internet-Based Computing
NASA Technical Reports Server (NTRS)
Radenski, Atanas; Follen, Gregory J. (Technical Monitor)
2001-01-01
The rapid growth of internet-based applications and the proliferation of networking technologies have been transforming traditional commercial application areas as well as computer and computational sciences and engineering. This growth stimulates the exploration of new, internet-oriented software technologies that can open new research and application opportunities not only for the commercial world, but also for the scientific and high-performance computing applications community. The general goal of this research project is to contribute to better understanding of the transition to internet-based high-performance computing and to develop solutions for some of the difficulties of this transition. More specifically, our goal is to design an architecture for generic divide and conquer internet-based computing, to develop a portable implementation of this architecture, to create an example library of high-performance divide-and-conquer computing agents that run on top of this architecture, and to evaluate the performance of these agents. We have been designing an architecture that incorporates a master task-pool server and utilizes satellite computational servers that operate on the Internet in a dynamically changing large configuration of lower-end nodes provided by volunteer contributors. Our designed architecture is intended to be complementary to and accessible from computational grids such as Globus, Legion, and Condor. Grids provide remote access to existing high-end computing resources; in contrast, our goal is to utilize idle processor time of lower-end internet nodes. Our project is focused on a generic divide-and-conquer paradigm and its applications that operate on a loose and ever changing pool of lower-end internet nodes.
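The master task-pool idea in this abstract can be illustrated with a small local sketch. Here a thread pool stands in for the satellite computational servers and the calling function plays the role of the master; none of this is the project's actual implementation, and the names `split_to_leaves` and `run_task_pool` are hypothetical. As a simplification, the master divides the problem down to small leaf tasks up front and combines all partial results in one step, which assumes the combine operation is associative.

```python
from concurrent.futures import ThreadPoolExecutor


def split_to_leaves(problem, is_small, divide):
    """Recursively divide a problem until every piece is small enough."""
    if is_small(problem):
        return [problem]
    leaves = []
    for sub in divide(problem):
        leaves.extend(split_to_leaves(sub, is_small, divide))
    return leaves


def run_task_pool(problem, is_small, divide, solve, combine, workers=4):
    """Master: build the task pool, farm leaf tasks out, combine the results."""
    tasks = split_to_leaves(problem, is_small, divide)
    # Worker threads stand in for remote, volunteer-provided nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(solve, tasks))
    return combine(partials)


if __name__ == "__main__":
    data = list(range(100))
    total = run_task_pool(
        data,
        is_small=lambda xs: len(xs) <= 8,
        divide=lambda xs: (xs[: len(xs) // 2], xs[len(xs) // 2:]),
        solve=sum,
        combine=sum,
    )
    print(total)
```

Dividing on the master side keeps workers from blocking on their own subtasks, which matters when the pool of nodes is small or changes over time. A real internet-based deployment would additionally need task reassignment when a volunteer node disappears, which this sketch does not attempt.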
THE COMPUTER AND THE ARCHITECTURAL PROFESSION.
ERIC Educational Resources Information Center
HAVILAND, DAVID S.
THE ROLE OF ADVANCING TECHNOLOGY IN THE FIELD OF ARCHITECTURE IS DISCUSSED IN THIS REPORT. PROBLEMS IN COMMUNICATION AND THE DESIGN PROCESS ARE IDENTIFIED. ADVANTAGES AND DISADVANTAGES OF COMPUTERS ARE MENTIONED IN RELATION TO MAN AND MACHINE INTERACTION. PRESENT AND FUTURE IMPLICATIONS OF COMPUTER USAGE ARE IDENTIFIED AND DISCUSSED WITH RESPECT…
The Contribution of Visualization to Learning Computer Architecture
ERIC Educational Resources Information Center
Yehezkel, Cecile; Ben-Ari, Mordechai; Dreyfus, Tommy
2007-01-01
This paper describes a visualization environment and associated learning activities designed to improve learning of computer architecture. The environment, EasyCPU, displays a model of the components of a computer and the dynamic processes involved in program execution. We present the results of a research program that analysed the contribution of…
Technology advances and market forces: Their impact on high performance architectures
NASA Technical Reports Server (NTRS)
Best, D. R.
1978-01-01
Reasonable projections into future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmer, architect, and user must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application with greater attention to testability, maintainability, reliability, and usability than supercomputer development programs of the past.
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vineyard, Craig Michael; Verzi, Stephen Joseph
As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.
Computing architecture for autonomous microgrids
Goldsmith, Steven Y.
2015-09-29
A computing architecture that facilitates autonomously controlling operations of a microgrid is described herein. A microgrid network includes numerous computing devices that execute intelligent agents, each of which is assigned to a particular entity (load, source, storage device, or switch) in the microgrid. The intelligent agents can execute in accordance with predefined protocols to collectively perform computations that facilitate uninterrupted control of the microgrid.
Blueprint for a microwave trapped ion quantum computer.
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K
2017-02-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
International travel as medical research: architecture and the modern hospital.
Logan, Cameron; Willis, Julie
2010-01-01
The design and development of the modern hospital in Australia had a profound impact on medical practice and research at a variety of levels. Between the late 1920s and the 1950s hospital architects, administrators, and politicians travelled widely in order to review the latest international developments in the hospital field. They were motivated by Australia's geographic isolation and a growing concern with how to govern the population at the level of physical health. While not 'medical research' in the conventional sense of the term, this travel was a powerful generator of medical thinking in Australia and has left a rich archival legacy. This paper draws on that archive to demonstrate the ways in which architectural research and international networks of hospital specialists profoundly shaped the provision of medical infrastructure in Australia.
Natural language processing: an introduction.
Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W
2011-01-01
To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.
Natural language processing: an introduction
Ohno-Machado, Lucila; Chapman, Wendy W
2011-01-01
Objectives: To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience: This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope: We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786
Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.
Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo
2016-07-19
Computing alignments between two or more sequences is a common operation frequently performed in computational molecular biology. The continuing growth of biological sequence databases establishes the need for efficient parallel implementations on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi.
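As background for the database-scanning kernel mentioned above, here is a minimal serial sketch of Smith-Waterman local alignment scoring. It is not the paper's Xeon Phi implementation: it uses a simple linear gap penalty and illustrative scoring values (`match=2`, `mismatch=-1`, `gap=-1`) that are assumptions, where production protein scanners typically use substitution matrices and affine gaps.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b (linear gap penalty).

    Only two rows of the dynamic-programming matrix are kept, since the
    score alone (no traceback) is needed when scanning a database.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Each inner-loop iteration is one cell update, the unit behind the GCUPS (giga cell updates per second) figure quoted in the abstract. Scoring one query against many database sequences is independent per sequence, which is what makes the cluster-level and thread-level parallelism described above possible.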
2016-01-01
Background: Computer networks have a tendency to grow at an unprecedented scale. Modern networks involve not only computers but also a wide variety of other interconnected devices ranging from mobile phones to other household items fitted with sensors. This vision of the "Internet of Things" (IoT) implies an inherent difficulty in modeling problems. Purpose: It is practically impossible to implement and test all scenarios for large-scale and complex adaptive communication networks as part of Complex Adaptive Communication Networks and Environments (CACOONS). The goal of this study is to explore the use of Agent-based Modeling as part of the Cognitive Agent-based Computing (CABC) framework to model a complex communication network problem. Method: We use Exploratory Agent-based Modeling (EABM), as part of the CABC framework, to develop an autonomous multi-agent architecture for managing carbon footprint in a corporate network. To evaluate the application of complexity in practical scenarios, we have also introduced a company-defined computer usage policy. Results: The conducted experiments demonstrated two important results. First, a CABC-based modeling approach such as Agent-based Modeling can be an effective approach to modeling complex problems in the domain of IoT. Second, the specific problem of managing the carbon footprint can be solved using a multiagent system approach. PMID:26812235
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Youngblood, John N.; Saha, Aindam
1987-01-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel systems to increase system performance. Research conducted in the development of a specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
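The abstract above says tasks are allocated to processing elements using critical path analysis but gives no further detail. As a rough illustration only, the sketch below ranks tasks by the length of the longest path from each task to the end of the task graph and then greedily places ready tasks on the earliest-free processing element. This is a generic list-scheduling heuristic, not the authors' optimal allocation algorithm; the task names, durations, and function names are hypothetical, and communication costs are ignored.

```python
from functools import lru_cache


def bottom_levels(durations, successors):
    """Longest path length from each task to the graph exit, task included."""
    @lru_cache(maxsize=None)
    def level(task):
        tails = (level(s) for s in successors.get(task, ()))
        return durations[task] + max(tails, default=0)

    return {task: level(task) for task in durations}


def list_schedule(durations, successors, n_pe):
    """Greedy allocation: the most critical ready task goes to the earliest-free PE.

    Returns ({task: (pe_index, start_time)}, makespan).
    """
    levels = bottom_levels(durations, successors)
    preds = {task: set() for task in durations}
    for task, succs in successors.items():
        for s in succs:
            preds[s].add(task)

    finish, placement = {}, {}
    pe_free = [0] * n_pe               # time at which each PE becomes idle
    remaining = set(durations)
    while remaining:
        ready = [t for t in remaining if all(p in finish for p in preds[t])]
        task = min(ready, key=lambda t: (-levels[t], t))   # highest level first
        pe = min(range(n_pe), key=lambda p: pe_free[p])
        start = max([pe_free[pe]] + [finish[p] for p in preds[task]])
        finish[task] = start + durations[task]
        pe_free[pe] = finish[task]
        placement[task] = (pe, start)
        remaining.remove(task)
    return placement, max(finish.values())


# Hypothetical four-task graph: A feeds B and C, which both feed D.
durations = {"A": 2, "B": 3, "C": 1, "D": 2}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
```

With two processing elements the example graph finishes in 7 time units, which equals its critical path length (A, B, D); with a single element it needs 8.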
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, C.C.; Youngblood, J.N.; Saha, A.
1987-12-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel systems to increase system performance. Research conducted in the development of a specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, task definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
Modernizing the Opposed-Piston Engine for More Efficient Military Ground Vehicle Applications
2012-08-01
stroke (OP2S) engines and their use in military applications. It also highlights the engine’s fundamental architectural advantages. In addition, the...rejection. Furthermore, the paper includes an overview of the fundamental challenges of OP2S engines, along with a discussion of how Achates Power...It also highlights the engine’s fundamental architectural advantages. In addition, the paper introduces the Achates Power opposed-piston engine
Urban landscape architecture design under the view of sustainable development
NASA Astrophysics Data System (ADS)
Chen, WeiLin
2017-08-01
Sustainable development is the main direction of development in modern urban landscape design, and landscape architecture guided by it is an effective means of promoting the sustainable development of city gardens. Drawing on the connotations of sustainable development and sustainable design, this paper analyzes and discusses the design of urban landscape under the concept of sustainable development.
SU(2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi) are also presented. To obtain high performance, the code must be optimized for the GPU architecture, i.e., the implementation must exploit the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU(2) lattice gauge configurations, for the mean plaquette, for the Polyakov loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved a speedup of 200× over one CPU in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
NASA Astrophysics Data System (ADS)
Agaian, Sos S.; Akopian, David; D'Souza, Sunil
2006-02-01
Modern mobile devices are some of the most technologically advanced devices that people use on a daily basis, and current trends in mobile phone technology indicate that the tasks achievable by mobile devices will soon exceed our imagination. This paper undertakes a case study of the development and implementation of one of the first known steganography (data hiding) applications on a mobile device. Steganography is traditionally accomplished using the high processing speeds of desktop or notebook computers. With the introduction of mobile platform operating systems, there arises an opportunity for users to develop and embed their own applications. We take advantage of this opportunity with the introduction of wireless steganographic algorithms. Thus we demonstrate that custom applications, popular with security establishments, can also be developed on mobile systems independent of both the mobile device manufacturer and the mobile service provider. For example, this might be a very important feature if the communication is to be controlled exclusively by authorized personnel. The paper begins by reviewing the technological capabilities of modern mobile devices. Then we address a suitable development platform which is based on the Symbian™/Series60™ architecture. Finally, two data hiding applications developed for Symbian™/Series60™ mobile phones are presented.
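The abstract does not say which data hiding algorithms the two applications use. Purely to illustrate what "data hiding" means in practice, the sketch below uses least-significant-bit (LSB) embedding, a common textbook technique swapped in here as an assumption; the function names and the treatment of the cover as a raw byte string are hypothetical.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Hide message in the least significant bit of successive cover bytes."""
    # Message bits, most significant bit of each byte first.
    bits = [(byte >> k) & 1 for byte in message for k in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # overwrite only the lowest bit
    return bytes(stego)


def extract(stego: bytes, n_bytes: int) -> bytes:
    """Recover n_bytes of hidden data from the low bits of stego."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for b in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)
```

Because each cover byte changes by at most one unit, the modification is hard to notice when the cover is image or audio sample data, and the per-byte work is light enough for the constrained processors the paper targets.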
NASA Astrophysics Data System (ADS)
Carlton, David Bryan
The exponential improvements in speed, energy efficiency, and cost that the computer industry has relied on for growth during the last 50 years are in danger of ending within the decade. These improvements all have relied on scaling the size of the silicon-based transistor that is at the heart of every modern CPU down to smaller and smaller length scales. However, as the size of the transistor reaches scales that are measured in the number of atoms that make it up, it is clear that this scaling cannot continue forever. As a result of this, there has been a great deal of research effort directed at the search for the next device that will continue to power the growth of the computer industry. However, due to the billions of dollars of investment that conventional silicon transistors have received over the years, it is unlikely that a technology will emerge that will be able to beat it outright in every performance category. More likely, different devices will possess advantages over conventional transistors for certain applications and uses. One of these emerging computing platforms is nanomagnetic logic (NML). NML-based circuits process information by manipulating the magnetization states of single-domain nanomagnets coupled to their nearest neighbors through magnetic dipole interactions. The state variable is magnetization direction and computations can take place without passing an electric current. This makes them extremely attractive as a replacement for conventional transistor-based computing architectures for certain ultra-low power applications. In most work to date, nanomagnetic logic circuits have used an external magnetic clocking field to reset the system between computations. The clocking field is then subsequently removed very slowly relative to the magnetization dynamics, guiding the nanomagnetic logic circuit adiabatically into its magnetic ground state. 
In this dissertation, I will discuss the dynamics behind this process and show that it is greatly influenced by thermal fluctuations. The magnetic ground state containing the answer to the computation is reached by a stochastic process very similar to the thermal annealing of crystalline materials. We will discuss how these dynamics affect the expected reliability, speed, and energy dissipation of NML systems operating under these conditions. Next I will show how a slight change in the properties of the nanomagnets that make up a NML circuit can completely alter the dynamics by which computations take place. The addition of biaxial anisotropy to the magnetic energy landscape creates a metastable state along the hard axis of the nanomagnet. This metastability can be used to remove the stochastic nature of the computation and has large implications for reliability, speed, and energy dissipation which will all be discussed. The changes to NML operation by the addition of biaxial anisotropy introduce new challenges to realizing a commercially viable logic architecture. In the final chapter, I will discuss these challenges and talk about the architectural changes that are necessary to make a working NML circuit based on nanomagnets with biaxial anisotropy.
Lunar architecture and urbanism
NASA Astrophysics Data System (ADS)
Sherwood, Brent
1992-09-01
Human civilization and architecture have defined each other for over 5000 years on Earth. Even in the novel environment of space, persistent issues of human urbanism will eclipse, within a historically short time, the technical challenges of space settlement that dominate our current view. By adding modern topics in space engineering, planetology, life support, human factors, material invention, and conservation to their already renaissance array of expertise, urban designers can responsibly apply ancient, proven standards to the exciting new opportunities afforded by space. Inescapable facts about the Moon set real boundaries within which tenable lunar urbanism and its component architecture must eventually develop.
Lunar architecture and urbanism
NASA Technical Reports Server (NTRS)
Sherwood, Brent
1992-01-01
Human civilization and architecture have defined each other for over 5000 years on Earth. Even in the novel environment of space, persistent issues of human urbanism will eclipse, within a historically short time, the technical challenges of space settlement that dominate our current view. By adding modern topics in space engineering, planetology, life support, human factors, material invention, and conservation to their already renaissance array of expertise, urban designers can responsibly apply ancient, proven standards to the exciting new opportunities afforded by space. Inescapable facts about the Moon set real boundaries within which tenable lunar urbanism and its component architecture must eventually develop.
Multiprocessor architecture: Synthesis and evaluation
NASA Technical Reports Server (NTRS)
Standley, Hilda M.
1990-01-01
Multiprocessor computer architecture evaluation for structural computations is the focus of the research effort described. Results obtained are expected to lead to more efficient use of existing architectures and to suggest designs for new, application-specific architectures. The brief descriptions given outline a number of related efforts directed toward this purpose. The difficulty in analyzing an existing architecture or in designing a new computer architecture lies in the fact that the performance of a particular architecture, within the context of a given application, is determined by a number of factors. These include, but are not limited to, the efficiency of the computation algorithm, the programming language and support environment, the quality of the program written in the programming language, the multiplicity of the processing elements, the characteristics of the individual processing elements, the interconnection network connecting processors and non-local memories, and the shared memory organization covering the spectrum from no shared memory (all local memory) to one global access memory. These performance determiners may be loosely classified as being software or hardware related. This distinction is not clear or even appropriate in many cases. The effect of the choice of algorithm is ignored by assuming that the algorithm is specified as given. Effort directed toward the removal of the effect of the programming language and program resulted in the design of a high-level parallel programming language. Two characteristics of the fundamental structure of the architecture (memory organization and interconnection network) are examined.
Modern Role of Architectural Style Category in Russian Practice
NASA Astrophysics Data System (ADS)
Chudinova, V. G.
2017-11-01
The article examines the functional aspects of architectural style use as an instrument for communication, self-positioning, representation and commercial attractiveness. The modern Russian practice is marked by a predominance of decorating methods rather than architectural ones used to create an artistic image. Specific examples illustrate stylistic trends that are indicative of an identity crisis. The problem goes beyond the scope of a custom or corporate design. This is especially evident in the former Soviet Union republics. The style issue is inherent in the advertising and commercial sphere of interior design and furnishing promotion. One can state that the category of art studies has been introduced into the collective consciousness but in some extrinsic way. Marketing managers and designers who have a very vague idea of the fundamental scientific concepts form a new language and market demand not only for a design work product but also for its positioning. This leads to a semantic distortion of the architectural style characteristic and a misconception. At the same time, there is a growing need for new definitions and verbalizations of perceptual experience. The conclusions contain an assumption on a reversible scientific and practical process when the theory is forced to accept and attend to a spontaneously formed and deep-rooted system of meanings. The need to develop the architectural theory in the communication language realm is brought up. The research problem is stated both for social science and anthropology as well as for culturology and art history.
GeantV: From CPU to accelerators
Amadio, G.; Ananya, A.; Apostolakis, J.; ...
2016-01-01
The GeantV project aims to research and develop the next-generation simulation software describing the passage of particles through matter. While modern CPU architectures are being targeted first, resources such as GPGPU, Intel© Xeon Phi, Atom or ARM can no longer be ignored by HEP CPU-bound applications. The proof-of-concept GeantV prototype has been engineered mainly for CPUs with vector units, but we have foreseen from the early stages a bridge to arbitrary accelerators. A software layer consisting of architecture/technology-specific backends currently supports this concept. This approach makes it possible to abstract out basic types such as scalar/vector, and also to formalize generic computation kernels that transparently use library- or device-specific constructs based on Vc, CUDA, Cilk+ or Intel intrinsics. While the main goal of this approach is portable performance, as a bonus it insulates the core application and algorithms from the technology layer. This keeps our application maintainable in the long term and adaptable to changes on the backend side. The paper presents the first results of basket-based GeantV geometry navigation on the Intel© Xeon Phi KNC architecture. We present the scalability and vectorization study, conducted using Intel performance tools, as well as our preliminary conclusions on the use of accelerators for GeantV transport. Lastly, we also describe the current work and preliminary results for using the GeantV transport kernel on GPUs.
The SysMan monitoring service and its management environment
NASA Astrophysics Data System (ADS)
Debski, Andrzej; Janas, Ekkehard
1996-06-01
Management of modern information systems is becoming more and more complex. There is a growing need for powerful, flexible and affordable management tools to assist system managers in maintaining such systems. It is at the same time evident that effective management should integrate network management, system management and application management in a uniform way. Object oriented OSI management architecture with its four basic modelling concepts (information, organization, communication and functional models) together with widely accepted distribution platforms such as ANSA/CORBA, constitutes a reliable and modern framework for the implementation of a management toolset. This paper focuses on the presentation of concepts and implementation results of an object oriented management toolset developed and implemented within the framework of the ESPRIT project 7026 SysMan. An overview is given of the implemented SysMan management services including the System Management Service, Monitoring Service, Network Management Service, Knowledge Service, Domain and Policy Service, and the User Interface. Special attention is paid to the Monitoring Service which incorporates the architectural key entity responsible for event management. Its architecture and building components, especially filters, are emphasized and presented in detail.
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
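To make the notion of an "inner product processor" in a linear array concrete, the sketch below simulates a row of identical cells, each of which repeatedly performs the step c ← c + a·b as samples stream past. For simplicity it computes a direct discrete Fourier transform (one output bin per cell) rather than the FFT dataflow studied in the report, and it ignores the timing skew of a real systolic array; both simplifications, and the function names, are assumptions made for illustration.

```python
import cmath


def inner_product_step(c, a, b):
    """The single operation each cell performs: accumulate a product."""
    return c + a * b


def systolic_dft(samples):
    """Direct DFT computed by a simulated linear array of inner-product cells.

    Cell k accumulates output bin k. One sample enters the array per time
    step; in hardware all cells would update concurrently.
    """
    n = len(samples)
    acc = [0j] * n
    for t, x in enumerate(samples):
        for k in range(n):
            twiddle = cmath.exp(-2j * cmath.pi * k * t / n)
            acc[k] = inner_product_step(acc[k], x, twiddle)
    return acc
```

With N cells the array produces all N bins after N time steps, which is the pipelined, real-time behavior the abstract is concerned with, at the cost of N multiply-accumulate units.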
Hierarchical parallel computer architecture defined by computational multidisciplinary mechanics
NASA Technical Reports Server (NTRS)
Padovan, Joe; Gute, Doug; Johnson, Keith
1989-01-01
The goal is to develop an architecture for parallel processors enabling optimal handling of multi-disciplinary computation of fluid-solid simulations employing finite element and difference schemes. The goals, philosophical and modeling directions, static and dynamic poly trees, example problems, interpolative reduction, and the impact on solvers are shown in viewgraph form.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Uhr, L.
1987-01-01
This book is written by research scientists involved in the development of massively parallel, but hierarchically structured, algorithms, architectures, and programs for image processing, pattern recognition, and computer vision. The book gives an integrated picture of the programs and algorithms that are being developed, and also of the multi-computer hardware architectures for which these systems are designed.
Computer Architecture's Changing Role in Rebooting Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik P.
In this paper, Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Computer Architecture's Changing Role in Rebooting Computing
DeBenedictis, Erik P.
2017-04-26
In this paper, Windows 95 started the Wintel era, in which Microsoft Windows running on Intel x86 microprocessors dominated the computer industry and changed the world. Retaining the x86 instruction set across many generations let users buy new and more capable microprocessors without having to buy software to work with new architectures.
Design of a fault tolerant airborne digital computer. Volume 1: Architecture
NASA Technical Reports Server (NTRS)
Wensley, J. H.; Levitt, K. N.; Green, M. W.; Goldberg, J.; Neumann, P. G.
1973-01-01
This volume is concerned with the architecture of a fault tolerant digital computer for an advanced commercial aircraft. All of the computations of the aircraft, including those presently carried out by analogue techniques, are to be carried out in this digital computer. Among the important qualities of the computer are the following: (1) The capacity is to be matched to the aircraft environment. (2) The reliability is to be selectively matched to the criticality and deadline requirements of each of the computations. (3) The system is to be readily expandable and contractible. (4) The design is to be appropriate to post-1975 technology. Three candidate architectures are discussed and assessed in terms of the above qualities. Of the three candidates, a newly conceived architecture, Software Implemented Fault Tolerance (SIFT), provides the best match to the above qualities. In addition, SIFT is particularly simple and believable. The other candidates, Bus Checker System (BUCS), also newly conceived in this project, and the Hopkins multiprocessor, are potentially more efficient than SIFT in the use of redundancy, but otherwise are not as attractive.
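The abstract does not describe how SIFT achieves fault tolerance in software. A minimal sketch, assuming the commonly described scheme of running a task on several processors and taking a majority vote over the results, could look like this; the function name and the choice to raise an error when no strict majority exists are illustrative assumptions.

```python
from collections import Counter


def vote(results):
    """Majority vote over replicated task results.

    Returns (agreed_value, indices_of_disagreeing_replicas).
    Raises ValueError when no value is held by a strict majority.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no majority among replicas")
    disagreeing = [i for i, r in enumerate(results) if r != value]
    return value, disagreeing
```

Under this assumption, matching reliability to criticality (quality 2 in the list above) amounts to choosing how many replicas each task gets: three replicas mask one faulty result, five mask two, and a non-critical task can run unreplicated.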
Using a software-defined computer in teaching the basics of computer architecture and operation
NASA Astrophysics Data System (ADS)
Kosowska, Julia; Mazur, Grzegorz
2017-08-01
The paper describes the concept and implementation of SDC_One software-defined computer designed for experimental and didactic purposes. Equipped with extensive hardware monitoring mechanisms, the device enables the students to monitor the computer's operation on bus transfer cycle or instruction cycle basis, providing the practical illustration of basic aspects of computer's operation. In the paper, we describe the hardware monitoring capabilities of SDC_One and some scenarios of using it in teaching the basics of computer architecture and microprocessor operation.
2012-04-30
tool that provides a means of balancing capability development against cost and interdependent risks through the use of modern portfolio theory ...Focardi, 2007; Tutuncu & Cornuejols, 2007) that are extensions of modern portfolio and control theory . The reformulation allows for possible changes...Acquisition: Wave Model context • An Investment Portfolio Approach – Mean Variance Approach – Mean - Variance : A Robust Version • Concept
Orthorectification by Using GPGPU Method
NASA Astrophysics Data System (ADS)
Sahin, H.; Kulur, S.
2012-07-01
Owing to the nature of graphics processing, recently released GPUs offer highly parallel processing units with high memory bandwidth and computational power at the teraflops level. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors with very fast computing capabilities and high memory bandwidth compared to central processing units (CPUs). Data-parallel computation can be briefly described as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capabilities has attracted the attention of researchers dealing with complex problems that require intensive computation. This interest has given rise to the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". Graphics processors are powerful yet cheap and affordable hardware, so they have become an alternative to conventional processors. Graphics chips, once standard application hardware, have been transformed into modern, powerful and programmable processors that meet general computing needs. In recent years especially, the use of graphics processing units for general-purpose computation has drawn researchers and developers in this direction. The biggest problem is that graphics processing units use programming models that differ from conventional programming methods. Efficient GPU programming therefore requires re-coding the existing algorithm with the limitations and structure of the graphics hardware in mind. Multi-core processors likewise cannot be programmed using traditional, event-procedure programming methods. GPUs are especially effective when the same computing steps must be repeated for many data elements and high accuracy is needed, which makes the computation both faster and more accurate. CPUs, which perform one computation at a time according to flow control, are slower by comparison. This structure can be exploited in various applications of computer technology. This study covers how general-purpose parallel programming and the computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm is coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language, and its results are compared with those of a traditional CPU implementation. In a second application, projective rectification is coded using the GPGPU method and CUDA. Sample images of various sizes were evaluated and compared against the program's results. The GPGPU method is especially useful when the same computations are repeated on highly dense data, allowing the solution to be found quickly.
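The projective rectification mentioned above is a good example of the "same computation repeated for many data elements" pattern. The sketch below shows the per-pixel work in plain Python: each output pixel is mapped through a 3×3 projective transform back into the source image and sampled. The function names, the nearest-neighbour sampling, and the nested-list image representation are illustrative assumptions; the paper's CUDA implementation is not reproduced here.

```python
def apply_homography(h, x, y):
    """Map point (x, y) through the 3x3 projective transform h (nested lists)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def rectify(image, h_inv, out_width, out_height, fill=0):
    """Build the rectified image by inverse mapping every output pixel.

    h_inv maps output pixel coordinates to source image coordinates.
    Pixels that fall outside the source are set to fill.
    """
    rows, cols = len(image), len(image[0])
    out = [[fill] * out_width for _ in range(out_height)]
    for v in range(out_height):
        for u in range(out_width):
            # The body of this loop is independent for every (u, v).
            x, y = apply_homography(h_inv, u, v)
            xi, yi = round(x), round(y)
            if 0 <= xi < cols and 0 <= yi < rows:
                out[v][u] = image[yi][xi]
    return out
```

Since no output pixel depends on any other, the two nested loops can be replaced by one GPU thread per pixel, which is the data-to-thread mapping the abstract describes.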
Correlations between Climate Change and the Modern European Construction
NASA Astrophysics Data System (ADS)
Gumińska, Anna
2017-10-01
The aim of the study was to analyze the links between climate change and the way modern cities are structured and respond to it. How do these changes affect building materials and technologies, and does climate change affect the type of technology and materials used? The most important results come from analysing selected examples of modern European buildings: their use of materials and technology and their adaptation to the changing climate. The examples of contemporary architecture are drawn from Germany, Italy, Denmark, Norway and Sweden and are supported by photographic documentation. The most important criteria affecting the buildings are the elements that shape the changing climate, as well as existing legal and technical requirements. The main conclusion was that modern urban space is adapted to the changing climate and to climatic phenomena previously unprecedented in the area: intense and sudden rain, snow, floods, strong winds, abundant sunshine, large temperature changes, the urban "heat island" effect, and atmospheric pollution. Building materials and technologies contribute to the optimal conservation of natural resources, and buildings are shaped so as to ensure safety, resilience and environmental protection. However, there is still a need for continuous monitoring of climate change and of the criteria affecting the design and construction of urban and central facilities. Key words: energy efficiency, renewable energy, climate change, contemporary architecture.
A Serial Bus Architecture for Parallel Processing Systems
1986-09-01
pins are needed to effect the data transfer. As Integrated Circuits grow in computational power, more communication capacity is needed, pushing...chip. The wider the communication path the more pins are needed to effect the data transfer. As Integrated Circuits grow in computational power, more...13 2. A Suitable Architecture Sought 14 II. OPTIMUM ARCHITECTURE OF LARGE INTEGRATED A. PARTIONING SILICON FOR MAXIMUM 1? 1. Transistor
FTDD973: A multimedia knowledge-based system and methodology for operator training and diagnostics
NASA Technical Reports Server (NTRS)
Hekmatpour, Amir; Brown, Gary; Brault, Randy; Bowen, Greg
1993-01-01
FTDD973 (973 Fabricator Training, Documentation, and Diagnostics) is an interactive multimedia knowledge-based system and methodology for computer-aided training and certification of operators, as well as tool and process diagnostics in IBM's CMOS SGP fabrication line (building 973). FTDD973 is an example of what can be achieved with modern multimedia workstations. Knowledge-based systems, hypertext, hypergraphics, high resolution images, audio, motion video, and animation are technologies that in synergy can be far more useful than each by itself. FTDD973's modular and object-oriented architecture is also an example of how improvements in software engineering are finally making it possible to combine many software modules into one application. FTDD973 is developed in ExperMedia/2, an OS/2 multimedia expert system shell for domain experts.
NASA Astrophysics Data System (ADS)
MacDonald, J.
This paper looks at the use of astronomical programmes and the development of new media modeling techniques as a means to better understand archaeoastronomy. The paper also suggests that these new methods and technologies are a means of furthering the public perception of archaeoastronomy and the important role that 'astronomy' played in the history and development of human culture. This discussion is rooted in a computer simulation of Stonehenge and its land and skyscape. The integration of the astronomy software allows viewing horizon astronomical alignments in relation to digitally recreated Neolithic/Early Bronze Age (EBA) monumental architecture. This work shows how modern virtual modelling techniques can be a tool for testing archaeoastronomical hypotheses, as well as a demonstrative tool for teaching and promoting archaeoastronomy in mainstream media.
A high-speed DAQ framework for future high-level trigger and event building clusters
NASA Astrophysics Data System (ADS)
Caselle, M.; Ardila Perez, L. E.; Balzer, M.; Dritschler, T.; Kopmann, A.; Mohr, H.; Rota, L.; Vogelgesang, M.; Weber, M.
2017-03-01
Modern data acquisition and trigger systems require a throughput of several GB/s and latencies of the order of microseconds. To satisfy such requirements, a heterogeneous readout system based on FPGA readout cards and GPU-based computing nodes coupled by InfiniBand has been developed. The incoming data from the back-end electronics is delivered directly into the internal memory of GPUs through dedicated peer-to-peer PCIe communication. High performance DMA engines have been developed for direct communication between FPGAs and GPUs using "DirectGMA (AMD)" and "GPUDirect (NVIDIA)" technologies. The proposed infrastructure is a candidate for future generations of event building clusters, high-level trigger filter farms and low-level trigger systems. In this paper the heterogeneous FPGA-GPU architecture is presented and its performance discussed.
NASA Technical Reports Server (NTRS)
Stuart, James R.
1995-01-01
The Teledesic satellites are a new class of small satellites which demonstrate the important commercial benefits of using technologies developed for other purposes by U.S. National Laboratories. The Teledesic satellite architecture, subsystem design features, and new technologies are described. The new Teledesic satellite manufacturing, integration, and test approaches which use modern high volume production techniques and result in surprisingly low space segment costs are discussed. The constellation control and management features and attendant software architecture features are addressed. After briefly discussing the economic and technological impact on the USA commercial space industries of the space communications revolution and such large constellation projects, the paper concludes with observations on the trend toward future system architectures using networked groups of much smaller satellites.
Modeling driver behavior in a cognitive architecture.
Salvucci, Dario D
2006-01-01
This paper explores the development of a rigorous computational model of driver behavior in a cognitive architecture--a computational framework with underlying psychological theories that incorporate basic properties and limitations of the human system. Computational modeling has emerged as a powerful tool for studying the complex task of driving, allowing researchers to simulate driver behavior and explore the parameters and constraints of this behavior. An integrated driver model developed in the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture is described that focuses on the component processes of control, monitoring, and decision making in a multilane highway environment. This model accounts for the steering profiles, lateral position profiles, and gaze distributions of human drivers during lane keeping, curve negotiation, and lane changing. The model demonstrates how cognitive architectures facilitate understanding of driver behavior in the context of general human abilities and constraints and how the driving domain benefits cognitive architectures by pushing model development toward more complex, realistic tasks. The model can also serve as a core computational engine for practical applications that predict and recognize driver behavior and distraction.
Advanced cloud fault tolerance system
NASA Astrophysics Data System (ADS)
Sumangali, K.; Benny, Niketa
2017-11-01
Cloud computing has become a prevalent on-demand service on the internet to store, manage and process data. A pitfall that accompanies cloud computing is the failures that can be encountered in the cloud. To overcome these failures, we require a fault tolerance mechanism to abstract faults from users. We have proposed a fault tolerant architecture, which is a combination of proactive and reactive fault tolerance. This architecture essentially increases the reliability and the availability of the cloud. In the future, we would like to compare evaluations of our proposed architecture with existing architectures and further improve it.
Heavy Lift Vehicle (HLV) Avionics Flight Computing Architecture Study
NASA Technical Reports Server (NTRS)
Hodson, Robert F.; Chen, Yuan; Morgan, Dwayne R.; Butler, A. Marc; Sdhuh, Joseph M.; Petelle, Jennifer K.; Gwaltney, David A.; Coe, Lisa D.; Koelbl, Terry G.; Nguyen, Hai D.
2011-01-01
A NASA multi-Center study team was assembled from LaRC, MSFC, KSC, JSC and WFF to examine potential flight computing architectures for a Heavy Lift Vehicle (HLV) to better understand avionics drivers. The study examined Design Reference Missions (DRMs) and vehicle requirements that could impact the vehicle's avionics. The study considered multiple self-checking and voting architectural variants and examined reliability, fault-tolerance, mass, power, and redundancy management impacts. Furthermore, a goal of the study was to develop the skills and tools needed to rapidly assess additional architectures should requirements or assumptions change.
Design Directions: Looking for What Is 'Missing'
ERIC Educational Resources Information Center
AIA Journal, 1978
1978-01-01
In modern architecture of the 1970s, esthetic choices that appear regularly are "historicism," "high-tech," and "slick style." The Brooklyn Children's Museum is one of the many examples photographed. (Author/MLF)
The new landscape of parallel computer architecture
NASA Astrophysics Data System (ADS)
Shalf, John
2007-07-01
The past few years have seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.
Copy Hiding Application Interface
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jones, Holger; Poliakoff, David; Robinson, Peter
2016-10-06
CHAI is a light-weight framework which abstracts the automated movement of data (e.g. to/from Host/Device) via RAJA-like performance portability programming model constructs. It can be viewed as a utility framework and an adjunct to RAJA (A Performance Portability Framework). Performance Portability is a technique that abstracts the complexities of modern Heterogeneous Architectures while allowing the original program to undergo incremental, minimally invasive code changes in order to adapt to the newer architectures.
The Live Access Server - A Web-Services Framework for Earth Science Data
NASA Astrophysics Data System (ADS)
Schweitzer, R.; Hankin, S. C.; Callahan, J. S.; O'Brien, K.; Manke, A.; Wang, X. Y.
2005-12-01
The Live Access Server (LAS) is a general purpose Web-server for delivering services related to geo-science data sets. Data providers can use the LAS architecture to build custom Web interfaces to their scientific data. Users and client programs can then access the LAS site to search the provider's on-line data holdings, make plots of data, create sub-sets in a variety of formats, compare data sets and perform analysis on the data. The Live Access Server software has continued to evolve by expanding the types of data (in-situ observations and curvilinear grids) it can serve and by taking advantage of advances in software infrastructure both in the earth sciences community (THREDDS, the GrADS Data Server, the Anagram framework and Java netCDF 2.2) and in the Web community (Java Servlet and the Apache Jakarta frameworks). This presentation will explore the continued evolution of the LAS architecture towards a complete Web-services-based framework. Additionally, we will discuss the redesign and modernization of some of the support tools available to LAS installers. Soon after the initial implementation, the LAS architecture was redesigned to separate the components that are responsible for the user interaction (the User Interface Server) from the components that are responsible for interacting with the data and producing the output requested by the user (the Product Server). During this redesign, we changed the implementation of the User Interface Server from CGI and JavaScript to the Java Servlet specification using Apache Jakarta Velocity backed by a database store for holding the user interface widget components. The User Interface Server is now quite flexible and highly configurable because we modernized the components used for the implementation. Meanwhile, the implementation of the Product Server has remained a Perl CGI-based system. Clearly, the time has come to modernize this part of the LAS architecture.
Before undertaking such a modernization it is important to understand what we hope to gain. Specifically we would like to make it even easier to add new output products into our core system based on the Ferret analysis and visualization package. By carefully factoring the tasks needed to create a product we will be able to create new products simply by adding a description of the product into the configuration and by writing the Ferret script needed to create the product. No code will need to be added to the Product Server to bring the new product on-line. The new architecture should be faster at extracting and processing configuration information needed to address each request. Finally, the new Product Server architecture should make it even easier to pass specialized configuration information to the Product Server to deal with unanticipated special data structures or processing requirements.
Modernism in Belgrade: Classification of Modernist Housing Buildings 1919-1980
NASA Astrophysics Data System (ADS)
Dragutinovic, Anica; Pottgiesser, Uta; De Vos, Els; Melenhorst, Michel
2017-10-01
Yugoslavian Modernist Architecture, although part of a larger cultural phenomenon, has received hardly any international attention, since there are only a few internationally published studies about it. Nevertheless, Modernist Architecture of the Inter-war Yugoslavia (Kingdom of Yugoslavia), and especially Modernist Architecture of the Post-war Yugoslavia (Socialist Federal Republic of Yugoslavia under the “reign” of Tito), represents the most important architectural heritage of the 20th century in former Yugoslavian countries. Belgrade, as the capital city of both newly founded Yugoslavia(s), experienced an immediate economic, political and cultural expansion after both wars, as well as a large population increase. The construction of sufficient and appropriate new housing was a major undertaking in both periods (1919-1940 and 1948-1980), although conceived and realized with deeply diverging views. The transition from villas and modest apartment buildings, as main housing typologies in the Inter-war period, to the mass housing of the Post-war period, was not only a result of the different socio-political context of the two Yugoslavia(s), but also of the country’s industrialization, modernization and technological development. Through the classification of Modernist housing buildings in Belgrade, this paper will investigate the relations between the transformations of the main housing typologies executed under different socio-political contexts on the one side, and the development of building technologies, construction systems and materials applied on those buildings on the other side. The paper wants to shed light on the Yugoslavian Modernist Architecture in order to increase the international awareness of its architectural and heritage values. The aim is an integrated re-evaluation of the buildings, presentation of their current condition and potentials for future (re)use with a specific focus on building envelopes and construction.
The Development of Sociocultural Competence with the Help of Computer Technology
ERIC Educational Resources Information Center
Rakhimova, Alina E.; Yashina, Marianna E.; Mukhamadiarova, Albina F.; Sharipova, Astrid V.
2017-01-01
The article describes the process of developing sociocultural knowledge and competences using computer technologies. On the whole, the development of modern computer technologies allows teachers to broaden trainees' sociocultural outlook and trace their progress online. Observation of modern computer technologies and estimation…
Enabling a high throughput real time data pipeline for a large radio telescope array with GPUs
NASA Astrophysics Data System (ADS)
Edgar, R. G.; Clark, M. A.; Dale, K.; Mitchell, D. A.; Ord, S. M.; Wayth, R. B.; Pfister, H.; Greenhill, L. J.
2010-10-01
The Murchison Widefield Array (MWA) is a next-generation radio telescope currently under construction in the remote Western Australia Outback. Raw data will be generated continuously at 5 GiB/s, grouped into 8 s cadences. This high throughput motivates the development of on-site, real time processing and reduction in preference to archiving, transport and off-line processing. Each batch of 8 s data must be completely reduced before the next batch arrives. Maintaining real time operation will require a sustained performance of around 2.5 TFLOP/s (including convolutions, FFTs, interpolations and matrix multiplications). We describe a scalable heterogeneous computing pipeline implementation, exploiting both the high computing density and FLOP-per-Watt ratio of modern GPUs. The architecture is highly parallel within and across nodes, with all major processing elements performed by GPUs. Necessary scatter-gather operations along the pipeline are loosely synchronized between the nodes hosting the GPUs. The MWA will be a frontier scientific instrument and a pathfinder for planned peta- and exa-scale facilities.
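The real-time constraint quoted in this abstract can be turned into a per-batch budget with a little arithmetic. The sketch below is only an illustration built from the three figures in the abstract (5 GiB per second, 8 s cadence, about 2.5 TFLOP per second); the function name `batch_budget` and its parameters are hypothetical and do not come from the MWA pipeline.

```python
GIB = 2**30  # bytes per gibibyte


def batch_budget(rate_gib_per_s=5.0, cadence_s=8.0, sustained_tflops=2.5):
    """Return (bytes per batch, FLOP per batch, FLOP per input byte).

    All defaults are the figures quoted in the abstract; nothing else
    about the real pipeline is assumed.
    """
    bytes_per_batch = rate_gib_per_s * GIB * cadence_s
    flop_per_batch = sustained_tflops * 1e12 * cadence_s
    return bytes_per_batch, flop_per_batch, flop_per_batch / bytes_per_batch


if __name__ == "__main__":
    nbytes, flop, intensity = batch_budget()
    print(f"{nbytes / GIB:.0f} GiB arrive per batch")
    print(f"{flop / 1e12:.0f} TFLOP of work per batch")
    print(f"{intensity:.0f} FLOP per input byte")
```

Under these assumptions each 8 s batch carries 40 GiB and implies roughly 20 TFLOP of work, which is several hundred floating-point operations per input byte.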
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trędak, Przemysław, E-mail: przemyslaw.tredak@fuw.edu.pl; Rudnicki, Witold R.; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, ul. Pawińskiego 5a, 02-106 Warsaw
The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to the highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.
VLSI Implementation of Fault Tolerance Multiplier based on Reversible Logic Gate
NASA Astrophysics Data System (ADS)
Ahmad, Nabihah; Hakimi Mokhtar, Ahmad; Othman, Nurmiza binti; Fhong Soon, Chin; Rahman, Ab Al Hadi Ab
2017-08-01
The multiplier is one of the essential components of the digital world, used in digital signal processing, microprocessors and quantum computing, and widely used in arithmetic units. Due to the complexity of the multiplier, the likelihood of errors is very high. This paper aims to design a 2×2 bit fault-tolerant multiplier based on reversible logic gates with low power consumption and high performance. The design has been implemented using 90 nm Complementary Metal Oxide Semiconductor (CMOS) technology in Synopsys Electronic Design Automation (EDA) tools. The multiplier architecture is implemented using reversible logic gates. The fault-tolerant multiplier uses a combination of three reversible logic gates, the Double Feynman gate (F2G), the New Fault Tolerance (NFT) gate and the Islam Gate (IG), with an area of 160 μm × 420.3 μm (67,248 μm², about 0.067 mm²). This design achieves a low power consumption of 122.85 μW and a propagation delay of 16.99 ns. The proposed fault-tolerant multiplier achieves low power consumption and high performance, which makes it suitable for modern computing applications, as it has fault tolerance capabilities.
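The abstract names the Double Feynman gate (F2G) without defining it. As a hedged illustration, the sketch below uses the mapping commonly given for F2G in the reversible-logic literature, (A, B, C) -> (A, A xor B, A xor C); that mapping is an assumption here and is not stated in the abstract. The helper names `is_reversible` and `preserves_parity` are hypothetical. The code checks the two properties that make such a gate usable in fault-tolerant designs: the mapping is one-to-one, and input parity equals output parity.

```python
from itertools import product


def f2g(a, b, c):
    """Double Feynman gate as commonly defined (assumed, not from the abstract)."""
    return a, a ^ b, a ^ c


def is_reversible(gate, n):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=n))
    return len({gate(*bits) for bits in inputs}) == len(inputs)


def preserves_parity(gate, n):
    """Parity-preserving gates keep (sum of bits mod 2) unchanged."""
    return all(
        sum(bits) % 2 == sum(gate(*bits)) % 2
        for bits in product((0, 1), repeat=n)
    )


if __name__ == "__main__":
    print("reversible:", is_reversible(f2g, 3))
    print("parity preserving:", preserves_parity(f2g, 3))
```

Parity preservation is what allows a single-bit fault to be detected by comparing input and output parity, which is the usual sense of "fault tolerant" for reversible gates.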
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Nark, Douglas M.; Nguyen, Duc T.; Tungkahotara, Siroj
2006-01-01
A finite element solution to the convected Helmholtz equation in a nonuniform flow is used to model the noise field within 3-D acoustically treated aero-engine nacelles. Options to select linear or cubic Hermite polynomial basis functions and isoparametric elements are included. However, the key feature of the method is a domain decomposition procedure that is based upon the inter-mixing of an iterative and a direct solve strategy for solving the discrete finite element equations. This procedure is optimized to take full advantage of sparsity and exploit the increased memory and parallel processing capability of modern computer architectures. Example computations are presented for the Langley Flow Impedance Test facility and a rectangular mapping of a full scale, generic aero-engine nacelle. The accuracy and parallel performance of this new solver are tested on both model problems using a supercomputer that contains hundreds of central processing units. Results show that the method gives extremely accurate attenuation predictions, achieves super-linear speedup over hundreds of CPUs, and solves upward of 25 million complex equations in a quarter of an hour.
Systems-on-chip approach for real-time simulation of wheel-rail contact laws
NASA Astrophysics Data System (ADS)
Mei, T. X.; Zhou, Y. J.
2013-04-01
This paper presents the development of a systems-on-chip approach to speed up the simulation of wheel-rail contact laws, which can be used to reduce the requirement for high-performance computers and enable simulation in real time for the use of hardware-in-loop for experimental studies of the latest vehicle dynamic and control technologies. The wheel-rail contact laws are implemented using a field programmable gate array (FPGA) device with a design that substantially outperforms modern general-purpose PC platforms or fixed architecture digital signal processor devices in terms of processing time, configuration flexibility and cost. In order to utilise the FPGA's parallel-processing capability, the operations in the contact laws algorithms are arranged in a parallel manner and multi-contact patches are tackled simultaneously in the design. The interface between the FPGA device and the host PC is achieved by using a high-throughput and low-latency Ethernet link. The development is based on FASTSIM algorithms, although the design can be adapted and expanded for even more computationally demanding tasks.
Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias
2014-06-10
We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.
Lexical is as lexical does: computational approaches to lexical representation
Woollams, Anna M.
2015-01-01
In much of neuroimaging and neuropsychology, regions of the brain have been associated with ‘lexical representation’, with little consideration as to what this cognitive construct actually denotes. Within current computational models of word recognition, there are a number of different approaches to the representation of lexical knowledge. Structural lexical representations, found in original theories of word recognition, have been instantiated in modern localist models. However, such a representational scheme lacks neural plausibility in terms of economy and flexibility. Connectionist models have therefore adopted distributed representations of form and meaning. Semantic representations in connectionist models necessarily encode lexical knowledge. Yet when equipped with recurrent connections, connectionist models can also develop attractors for familiar forms that function as lexical representations. Current behavioural, neuropsychological and neuroimaging evidence shows a clear role for semantic information, but also suggests some modality- and task-specific lexical representations. A variety of connectionist architectures could implement these distributed functional representations, and further experimental and simulation work is required to discriminate between these alternatives. Future conceptualisations of lexical representations will therefore emerge from a synergy between modelling and neuroscience. PMID:25893204
Blocked inverted indices for exact clustering of large chemical spaces.
Thiel, Philipp; Sach-Peltason, Lisa; Ottmann, Christian; Kohlbacher, Oliver
2014-09-22
The calculation of pairwise compound similarities based on fingerprints is one of the fundamental tasks in chemoinformatics. Methods for efficient calculation of compound similarities are of the utmost importance for various applications like similarity searching or library clustering. With the increasing size of public compound databases, exact clustering of these databases is desirable, but often computationally prohibitively expensive. We present an optimized inverted index algorithm for the calculation of all pairwise similarities on 2D fingerprints of a given data set. In contrast to other algorithms, it neither requires GPU computing nor yields a stochastic approximation of the clustering. The algorithm has been designed to work well with multicore architectures and shows excellent parallel speedup. As an application example of this algorithm, we implemented a deterministic clustering application, which has been designed to decompose virtual libraries comprising tens of millions of compounds in a short time on current hardware. Our results show that our implementation achieves more than 400 million Tanimoto similarity calculations per second on a common desktop CPU. Deterministic clustering of the available chemical space thus can be done on modern multicore machines within a few days.
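The core idea of computing all pairwise Tanimoto similarities through an inverted index can be sketched in a few lines. This is a minimal, unoptimized illustration and not the blocked, multicore algorithm the abstract describes; the function names and the representation of a fingerprint as a set of on-bit positions are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_inverted_index(fingerprints):
    """Map each bit position to the ascending list of compound ids that set it."""
    index = defaultdict(list)
    for cid, bits in enumerate(fingerprints):
        for bit in bits:
            index[bit].append(cid)
    return index


def all_pairs_tanimoto(fingerprints, threshold=0.0):
    """Return {(i, j): similarity} for i < j among pairs sharing at least one bit.

    Tanimoto(A, B) = |A & B| / (|A| + |B| - |A & B|). Pairs with no shared
    bit have similarity 0 and are never visited, which is where the
    inverted index saves work on sparse fingerprints.
    """
    fps = [set(f) for f in fingerprints]
    shared = defaultdict(int)
    for posting in build_inverted_index(fps).values():
        # Every pair in a posting list shares this bit.
        for i, j in combinations(posting, 2):
            shared[(i, j)] += 1
    result = {}
    for (i, j), common in shared.items():
        sim = common / (len(fps[i]) + len(fps[j]) - common)
        if sim >= threshold:
            result[(i, j)] = sim
    return result


if __name__ == "__main__":
    demo = [{1, 2, 3}, {2, 3, 4}, {7}]
    print(all_pairs_tanimoto(demo))
```

Because the result is exact, a clustering built on it is deterministic, in contrast to the stochastic approximations the abstract mentions.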
Architectures, Models, Algorithms, and Software Tools for Configurable Computing
2000-03-06
and J.G. Nash. The gated interconnection network for dynamic programming. Plenum, 1988. [18] Ju wook Jang, Heonchul Park, and Viktor K. Prasanna. A ...Sep. 1997. [2] C. Ebeling, D. C. Cronquist, P. Franklin and C. Fisher, "RaPiD - A configurable computing architecture for compute-intensive...ABSTRACT (Maximum 200 words) The Models, Algorithms, and Architectures for Reconfigurable Computing (MAARC) project developed a sound framework for
Blueprint for a microwave trapped ion quantum computer
Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.
2017-01-01
The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Analysis of Introducing Active Learning Methodologies in a Basic Computer Architecture Course
ERIC Educational Resources Information Center
Arbelaitz, Olatz; Martín, José I.; Muguerza, Javier
2015-01-01
This paper presents an analysis of introducing active methodologies in the Computer Architecture course taught in the second year of the Computer Engineering Bachelor's degree program at the University of the Basque Country (UPV/EHU), Spain. The paper reports the experience from three academic years, 2011-2012, 2012-2013, and 2013-2014, in which…
ERIC Educational Resources Information Center
Nikolic, B.; Radivojevic, Z.; Djordjevic, J.; Milutinovic, V.
2009-01-01
Courses in Computer Architecture and Organization are regularly included in Computer Engineering curricula. These courses are usually organized in such a way that students obtain not only a purely theoretical experience, but also a practical understanding of the topics lectured. This practical work is usually done in a laboratory using simulators…
A Project-Based Learning Approach to Programmable Logic Design and Computer Architecture
ERIC Educational Resources Information Center
Kellett, C. M.
2012-01-01
This paper describes a course in programmable logic design and computer architecture as it is taught at the University of Newcastle, Australia. The course is designed around a major design project and has two supplemental assessment tasks that are also described. The context of the Computer Engineering degree program within which the course is…
ERIC Educational Resources Information Center
Stanley, Timothy D.; Wong, Lap Kei; Prigmore, Daniel; Benson, Justin; Fishler, Nathan; Fife, Leslie; Colton, Don
2007-01-01
Students learn better when they both hear and do. In computer architecture courses "doing" can be difficult in small schools without hardware laboratories hosted by computer engineering, electrical engineering, or similar departments. Software solutions exist. Our success with George Mills' Multimedia Logic (MML) is the focus of this paper. MML…
The Role of Sketch in Architecture Design
NASA Astrophysics Data System (ADS)
Li, Yanjin; Ning, Wen
2017-06-01
With the continuous development of computer technology, we rely more and more on the computer and pay more and more attention to the final design results, so that we ignore the importance of the sketch. However, the sketch is the most basic and effective way of architecture design. Based on a study of the sketches for the Tjibaou Cultural Center, the paper explores the role of the sketch in architecture design.
SU(2) lattice gauge theory simulations on Fermi GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardoso, Nuno, E-mail: nunocardoso@cftp.ist.utl.p; Bicudo, Pedro, E-mail: bicudo@ist.utl.p
2011-05-10
In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU(2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200x the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2x slower) than single precision computations.
Exploration of operator method digital optical computers for application to NASA
NASA Technical Reports Server (NTRS)
1990-01-01
Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implements. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternative to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dinda, Peter August
2015-03-17
This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems softwaremore » for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern, and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. 
For effort (3), although a prototype connecting Palacios with the GEM5 architectural simulator was demonstrated, our conclusion was that such a platform was less useful for design space exploration than anticipated due to inherent complexity of the connection between the instruction set architecture level and the microarchitectural level. For effort (4), we found that a code injection approach proved to be more fruitful. The results of our efforts are publicly available in the open source Palacios codebase and published papers, all of which are available from the project web site, v3vee.org. Palacios is currently one of the two codebases (the other being Sandia’s Kitten lightweight kernel) that underlies the node operating system for the DOE Hobbes Project, one of two projects tasked with building a systems software prototype for the national exascale computing effort.
Layered Architectures for Quantum Computers and Quantum Repeaters
NASA Astrophysics Data System (ADS)
Jones, Nathan C.
This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.
Neural Simulations on Multi-Core Architectures
Eichner, Hubert; Klug, Tobias; Borst, Alexander
2009-01-01
Neuroscience is witnessing increasing knowledge about the anatomy and electrophysiological properties of neurons and their connectivity, leading to an ever increasing computational complexity of neural simulations. At the same time, a rather radical change in personal computer technology emerges with the establishment of multi-cores: high-density, explicitly parallel processor architectures for both high performance as well as standard desktop computers. This work introduces strategies for the parallelization of biophysically realistic neural simulations based on the compartmental modeling technique and results of such an implementation, with a strong focus on multi-core architectures and automation, i.e. user-transparent load balancing. PMID:19636393
Advanced flight computer. Special study
NASA Technical Reports Server (NTRS)
Coo, Dennis
1995-01-01
This report documents a special study to define a 32-bit radiation hardened, SEU tolerant flight computer architecture, and to investigate current or near-term technologies and development efforts that contribute to the Advanced Flight Computer (AFC) design and development. An AFC processing node architecture is defined. Each node may consist of a multi-chip processor as needed. The modular, building block approach uses VLSI technology and packaging methods that demonstrate a feasible AFC module in 1998 that meets the AFC goals. The defined architecture and approach demonstrate a clear low-risk, low-cost path to the 1998 production goal, with intermediate prototypes in 1996.
Advanced information processing system for advanced launch system: Avionics architecture synthesis
NASA Technical Reports Server (NTRS)
Lala, Jaynarayan H.; Harper, Richard E.; Jaskowiak, Kenneth R.; Rosch, Gene; Alger, Linda S.; Schor, Andrei L.
1991-01-01
The Advanced Information Processing System (AIPS) is a fault-tolerant distributed computer system architecture that was developed to meet the real time computational needs of advanced aerospace vehicles. One such vehicle is the Advanced Launch System (ALS) being developed jointly by NASA and the Department of Defense to launch heavy payloads into low earth orbit at one tenth the cost (per pound of payload) of the current launch vehicles. An avionics architecture that utilizes the AIPS hardware and software building blocks was synthesized for ALS. The AIPS for ALS architecture synthesis process starting with the ALS mission requirements and ending with an analysis of the candidate ALS avionics architecture is described.
GASP-PL/I Simulation of Integrated Avionic System Processor Architectures. M.S. Thesis
NASA Technical Reports Server (NTRS)
Brent, G. A.
1978-01-01
A development study sponsored by NASA was completed in July 1977 which proposed a complete integration of all aircraft instrumentation into a single modular system. Instead of using the current single-function aircraft instruments, computers compiled and displayed inflight information for the pilot. A processor architecture called the Team Architecture was proposed. This is a hardware/software approach to high-reliability computer systems. A follow-up study of the proposed Team Architecture is reported. GASP-PL/I simulation models are used to evaluate the operating characteristics of the Team Architecture. The problem, model development, simulation programs, and results are presented at length. Also included are program input formats, outputs and listings.
Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment
NASA Technical Reports Server (NTRS)
Duong, Tuan A.; Duong, Vu A.
2012-01-01
A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, the bio-inspired models of visual pathway and olfactory receptor processing are combined as processing components, to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual data and odorant.
Electromagnetic Physics Models for Parallel Computing Architectures
NASA Astrophysics Data System (ADS)
Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Duhem, L.; Elvira, D.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.
2016-10-01
The recent emergence of hardware architectures characterized by many-core or accelerated processors has opened new opportunities for concurrent programming models taking advantage of both SIMD and SIMT architectures. GeantV, a next generation detector simulation, has been designed to exploit both the vector capability of mainstream CPUs and multi-threading capabilities of coprocessors including NVidia GPUs and Intel Xeon Phi. The characteristics of these architectures are very different in terms of the vectorization depth and type of parallelization needed to achieve optimal performance. In this paper we describe implementation of electromagnetic physics models developed for parallel computing architectures as a part of the GeantV project. Results of preliminary performance evaluation and physics validation are presented as well.
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Secchi, Simone; Tumeo, Antonino; Villa, Oreste
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.
Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Villa, Oreste; Tumeo, Antonino; Secchi, Simone
Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only from 25 to 200 times slower than real time.
Computational structures for robotic computations
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chang, P. R.
1987-01-01
The computational problem of inverse kinematics and inverse dynamics of robot manipulators is discussed, taking advantage of parallel and pipelined architectures. For the computation of the inverse kinematic position solution, a maximum pipelined CORDIC architecture has been designed based on a functional decomposition of the closed-form joint equations. For the inverse dynamics computation, an efficient p-fold parallel algorithm has also been developed that overcomes the recurrence problem of the Newton-Euler equations of motion and achieves the time lower bound of O(log_2 n).
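As a rough illustration of the CORDIC idea mentioned in this abstract, the following minimal sketch runs rotation-mode CORDIC in floating point to obtain a cosine and sine from shift-and-add style updates. It is not the pipelined hardware design from the report; the function name `cordic_sin_cos` and the iteration count are assumptions made for the example.

```python
import math


def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC sketch: return (cos(theta), sin(theta)).

    Valid for |theta| <= pi/2 (hypothetical helper, floating-point model only).
    """
    # Elementary rotation angles atan(2^-i) used by every CORDIC stage.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale by the inverse gain.
    gain = 1.0
    for i in range(iterations):
        gain *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y


if __name__ == "__main__":
    c, s = cordic_sin_cos(0.6)
    print(c, s, math.cos(0.6), math.sin(0.6))
```

Each loop iteration corresponds to one stage that a pipelined implementation could evaluate concurrently on successive inputs, which is presumably what makes the method attractive for joint-angle computations.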
Yokohama, Noriya
2013-07-01
This report was aimed at structuring the design of architectures and studying performance measurement of a parallel computing environment using a Monte Carlo simulation for particle therapy using a high performance computing (HPC) instance within a public cloud-computing infrastructure. Performance measurements showed an approximately 28 times faster speed than seen with single-thread architecture, combined with improved stability. A study of methods of optimizing the system operations also indicated lower cost.
Efficient development of memory bounded geo-applications to scale on modern supercomputers
NASA Astrophysics Data System (ADS)
Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric
2016-04-01
Numerical modeling is currently a key tool in the area of geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might strongly vary in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks and the coupled poromechanical solver makes it possible to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup.
In both cases, near peak memory bandwidth transfer is achieved. Our approach allows us to get the best out of the current hardware.
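To make the phrase "iterative finite difference scheme" concrete, here is a minimal sketch of plain Jacobi relaxation for the 1D Laplace equation on a regular grid. It is a stand-in chosen for brevity, not the authors' preconditioned solver; the function name `jacobi_laplace_1d` and all parameters are assumptions for the example.

```python
def jacobi_laplace_1d(n, left, right, tol=1e-10, max_iter=100_000):
    """Relax u'' = 0 on n grid points (n >= 3) with fixed boundary values.

    Returns the final grid values and the number of sweeps performed.
    """
    u = [0.0] * n
    u[0], u[-1] = left, right
    for iteration in range(1, max_iter + 1):
        new = u[:]
        # Each interior point only needs its two nearest neighbours.
        for i in range(1, n - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        # Largest change in this sweep, used as a simple convergence measure.
        change = max(abs(new[i] - u[i]) for i in range(1, n - 1))
        u = new
        if change < tol:
            return u, iteration
    return u, max_iter


if __name__ == "__main__":
    values, sweeps = jacobi_laplace_1d(11, 0.0, 1.0)
    print(sweeps, [round(v, 4) for v in values])
```

Because every update reads only adjacent grid points, a domain-decomposed version would need to exchange just the boundary values with neighbouring subdomains, which is consistent with the local point-to-point communication pattern the abstract describes.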
ERIC Educational Resources Information Center
Berg, A. I.; And Others
Five articles which were selected from a Russian language book on cybernetics and then translated are presented here. They deal with the topics of: computer-developed computers, heuristics and modern sciences, linguistics and practice, cybernetics and moral-ethical considerations, and computer chess programs. (Author/JY)
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.
2003-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full engine 3 Dimensional Computational Fluid Dynamic propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT).
Solving the Cauchy-Riemann equations on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations, a set of coupled first order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
Battlefield awareness computers: the engine of battlefield digitization
NASA Astrophysics Data System (ADS)
Ho, Jackson; Chamseddine, Ahmad
1997-06-01
To modernize the army for the 21st century, the U.S. Army Digitization Office (ADO) initiated in 1995 the Force XXI Battle Command Brigade-and-Below (FBCB2) Applique program which became a centerpiece in the U.S. Army's master plan to win future information wars. The Applique team led by TRW fielded a 'tactical Internet' for Brigade and below command to demonstrate the advantages of 'shared situation awareness' and battlefield digitization in advanced war-fighting experiments (AWE) to be conducted in March 1997 at the Army's National Training Center in California. Computing Devices is designated the primary hardware developer for the militarized version of the battlefield awareness computers. The first generation of militarized battlefield awareness computer, designated as the V3 computer, was an integration of off-the-shelf components developed to meet the aggressive delivery requirements of the Task Force XXI AWE. The design efficiency and cost effectiveness of the computer hardware were secondary in importance to delivery deadlines imposed by the March 1997 AWE. However, declining defense budgets will impose cost constraints on the Force XXI production hardware that can only be met by rigorous value engineering to further improve design optimization for battlefield awareness without compromising the level of reliability the military has come to expect in modern military hardened vetronics. To answer the Army's needs for a more cost effective computing solution, Computing Devices developed a second generation 'combat ready' battlefield awareness computer, designated the V3+, which is designed specifically to meet the upcoming demands of Force XXI (FBCB2) and beyond. The primary design objective is to achieve a technologically superior design, value engineered to strike an optimal balance between reliability, life cycle cost, and procurement cost.
Recognizing that the diverse digitization demands of Force XXI cannot be adequately met by any one computer hardware solution, Computing Devices is planning to develop a notebook sized military computer designed for space limited vehicle-mounted applications, as well as a high-performance portable workstation equipped with a 19", full color, ultra-high resolution and high brightness active matrix liquid crystal display (AMLCD) targeting the command posts and tactical operations centers (TOC) applications. Together with the wearable computers Computing Devices developed at the Minneapolis facility for dismounted soldiers, Computing Devices will have a complete suite of interoperable battlefield awareness computers spanning the entire spectrum of battle digitization operating environments. Although this paper's primary focus is on a second generation 'combat ready' battlefield awareness computer or the V3+, this paper also briefly discusses the extension of the V3+ architecture to address the needs of the embedded and command post applications.
Kalman filter tracking on parallel architectures
NASA Astrophysics Data System (ADS)
Cerati, G.; Elmer, P.; Krutelyov, S.; Lantz, S.; Lefebvre, M.; McDermott, K.; Riley, D.; Tadel, M.; Wittich, P.; Wurthwein, F.; Yagil, A.
2017-10-01
We report on the progress of our studies towards a Kalman filter track reconstruction algorithm with optimal performance on manycore architectures. The combinatorial structure of these algorithms is not immediately compatible with an efficient SIMD (or SIMT) implementation; the challenge for us is to recast the existing software so it can readily generate hundreds of shared-memory threads that exploit the underlying instruction set of modern processors. We show how the data and associated tasks can be organized in a way that is conducive to both multithreading and vectorization. We demonstrate very good performance on Intel Xeon and Xeon Phi architectures, as well as promising first results on Nvidia GPUs.
On the Performance of an Algebraic Multigrid Solver on Multicore Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, A H; Schulz, M; Yang, U M
2010-04-29
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.
A Novel Approach to Noise-Filtering Based on a Gain-Scheduling Neural Network Architecture
NASA Technical Reports Server (NTRS)
Troudet, T.; Merrill, W.
1994-01-01
A gain-scheduling neural network architecture is proposed to enhance the noise-filtering efficiency of feedforward neural networks, in terms of both nominal performance and robustness. The synergistic benefits of the proposed architecture are demonstrated and discussed in the context of the noise-filtering of signals that are typically encountered in aerospace control systems. The synthesis of such a gain-scheduled neurofiltering provides the robustness of linear filtering, while preserving the nominal performance advantage of conventional nonlinear neurofiltering. Quantitative performance and robustness evaluations are provided for the signal processing of pitch rate responses to typical pilot command inputs for a modern fighter aircraft model.
A single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder
NASA Technical Reports Server (NTRS)
Hsu, I. S.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.
1986-01-01
A description of a single VLSI chip for computing syndromes in the (255, 223) Reed-Solomon decoder is presented. The architecture that leads to this single VLSI chip design makes use of the dual basis multiplication algorithm. The same architecture can be applied to design VLSI chips to compute various kinds of number theoretic transforms.
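For readers unfamiliar with syndrome computation, the sketch below evaluates a received 255-symbol word at 32 successive powers of a primitive element of GF(2^8) using Horner's rule. It uses log/antilog table multiplication rather than the dual basis multiplier of the chip, and the field polynomial 0x11d and the roots alpha^1 through alpha^32 are assumptions made for the example, not parameters taken from the report.

```python
PRIM_POLY = 0x11D  # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1

# Build antilog (EXP) and log (LOG) tables for GF(2^8).
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _power in range(255):
    EXP[_power] = _value
    LOG[_value] = _power
    _value <<= 1
    if _value & 0x100:
        _value ^= PRIM_POLY
for _power in range(255, 510):
    EXP[_power] = EXP[_power - 255]  # duplicate so log sums need no modulo


def gf_mul(a, b):
    """Multiply two GF(2^8) elements via the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received, n_parity=32):
    """Return [r(alpha^1), ..., r(alpha^n_parity)] for a received word.

    received[0] is taken as the highest-degree coefficient (an assumption).
    """
    result = []
    for j in range(1, n_parity + 1):
        alpha_j = EXP[j]
        acc = 0
        for symbol in received:           # Horner evaluation of r(x) at alpha^j
            acc = gf_mul(acc, alpha_j) ^ symbol
        result.append(acc)
    return result


if __name__ == "__main__":
    word = [0] * 255
    word[251] = 0x53                      # single error in the x^3 coefficient
    print(syndromes(word)[:4])
```

All syndromes are zero for an error-free codeword; a nonzero syndrome vector is what the later decoding stages use to locate and correct errors.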
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL)
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Owen, Jeffrey E.
1988-01-01
A direct-execution parallel architecture for the Advanced Continuous Simulation Language (ACSL) is presented which overcomes the traditional disadvantages of simulations executed on a digital computer. The incorporation of parallel processing allows the mapping of simulations into a digital computer to be done in the same inherently parallel manner as they are currently mapped onto an analog computer. The direct-execution format maximizes the efficiency of the executed code since the need for a high level language compiler is eliminated. Resolution is greatly increased over that which is available with an analog computer without the sacrifice in execution speed normally expected with digital computer simulations. Although this report covers all aspects of the new architecture, key emphasis is placed on the processing element configuration and the microprogramming of the ACSL constructs. The execution times for all ACSL constructs are computed using a model of a processing element based on the AMD 29000 CPU and the AMD 29027 FPU. The increase in execution speed provided by parallel processing is exemplified by comparing the derived execution times of two ACSL programs with the execution times for the same programs executed on a similar sequential architecture.
2007-04-01
Services and System Capabilities; Enterprise Rules and Standards for Interoperability; Navy, AF, Army, TRANSCOM, DFAS, DLA; Enterprise Shared Services and System...Where commonality among components exists, there are also opportunities for identifying and leveraging shared services. A service-oriented architecture...and (3) shared services. The BMA federation strategy, according to these officials, is the first mission area federation strategy, and it is their
Evidence of common and separate eye and hand accumulators underlying flexible eye-hand coordination
Jana, Sumitash; Gopal, Atul
2016-01-01
Eye and hand movements are initiated by anatomically separate regions in the brain, and yet these movements can be flexibly coupled and decoupled, depending on the need. The computational architecture that enables this flexible coupling of independent effectors is not understood. Here, we studied the computational architecture that enables flexible eye-hand coordination using a drift diffusion framework, which predicts that the variability of the reaction time (RT) distribution scales with its mean. We show that a common stochastic accumulator to threshold, followed by a noisy effector-dependent delay, explains eye-hand RT distributions and their correlation in a visual search task that required decision-making, while an interactive eye and hand accumulator model did not. In contrast, in an eye-hand dual task, an interactive model better predicted the observed correlations and RT distributions than a common accumulator model. Notably, these two models could only be distinguished on the basis of the variability and not the means of the predicted RT distributions. Additionally, signatures of separate initiation signals were also observed in a small fraction of trials in the visual search task, implying that these distinct computational architectures were not a manifestation of the task design per se. Taken together, our results suggest two unique computational architectures for eye-hand coordination, with task context biasing the brain toward instantiating one of the two architectures. NEW & NOTEWORTHY Previous studies on eye-hand coordination have considered mainly the means of eye and hand reaction time (RT) distributions. Here, we leverage the approximately linear relationship between the mean and standard deviation of RT distributions, as predicted by the drift-diffusion model, to propose the existence of two distinct computational architectures underlying coordinated eye-hand movements. 
These architectures, for the first time, provide a computational basis for the flexible coupling between eye and hand movements. PMID:27784809
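The common accumulator account described in this abstract can be illustrated with a small simulation: a single noisy accumulator rises to a threshold, and each effector then adds its own noisy delay. This is a minimal sketch under assumed parameter values (drift, threshold, noise level, and delay distributions are all hypothetical), not the authors' fitted model.

```python
import math
import random


def simulate_common_accumulator(n_trials=500, drift=4.0, threshold=1.0,
                                noise=1.0, dt=0.001, seed=0):
    """Simulate eye and hand reaction times (seconds) sharing one accumulator."""
    rng = random.Random(seed)
    eye_rts, hand_rts = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        # Euler-Maruyama integration of the drift-diffusion process.
        while x < threshold:
            x += drift * dt + noise * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            t += dt
        # Effector-dependent delays (assumed values): short for eye, longer for hand.
        eye_rts.append(t + max(0.0, rng.gauss(0.05, 0.01)))
        hand_rts.append(t + max(0.0, rng.gauss(0.15, 0.03)))
    return eye_rts, hand_rts


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    eye, hand = simulate_common_accumulator()
    print("eye-hand RT correlation:", round(pearson(eye, hand), 3))
```

Because both reaction times inherit the same accumulator variability, they come out strongly correlated in this sketch; an interactive model with separate accumulators would be expected to produce weaker coupling, which is the kind of contrast the abstract draws.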
The Global Experience of Deployment of Energy-Efficient Technologies in High-Rise Construction
NASA Astrophysics Data System (ADS)
Potienko, Natalia D.; Kuznetsova, Anna A.; Solyakova, Darya N.; Klyueva, Yulia E.
2018-03-01
The objective of this research is to examine issues related to the increasing importance of energy-efficient technologies in high-rise construction. The aim of the paper is to investigate modern approaches to building design that involve implementation of various energy-saving technologies in diverse climates and at different structural levels, including the levels of urban development, functionality, planning, construction and engineering. The research methodology is based on the comprehensive analysis of the advanced global expertise in the design and construction of energy-efficient high-rise buildings, with the examination of their positive and negative features. The research also defines the basic principles of energy-efficient architecture. Besides, it draws parallels between the climate characteristics of countries that lead in the field of energy-efficient high-rise construction, on the one hand, and the climate in Russia, on the other, which makes it possible to use the vast experience of many countries, wholly or partially. The paper also gives an analytical review of the results arrived at by implementing energy efficiency principles into high-rise architecture. The study findings determine the impact of energy-efficient technologies on high-rise architecture and planning solutions. In conclusion, the research states that, apart from aesthetic and compositional interpretation of architectural forms, an architect nowadays has to address the task of finding a synthesis between technological and architectural solutions, which requires knowledge of advanced technologies. The study findings reveal that the implementation of modern energy-efficient technologies into high-rise construction is of immediate interest and is sure to bring long-term benefits.
A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.
Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan
2017-11-01
The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
Computer graphics in architecture and engineering
NASA Technical Reports Server (NTRS)
Greenberg, D. P.
1975-01-01
The present status of the application of computer graphics to the building profession or architecture and its relationship to other scientific and technical areas were discussed. It was explained that, due to the fragmented nature of architecture and building activities (in contrast to the aerospace industry), a comprehensive, economic utilization of computer graphics in this area is not practical and its true potential cannot now be realized due to the present inability of architects and structural, mechanical, and site engineers to rely on a common data base. Future emphasis will therefore have to be placed on a vertical integration of the construction process and effective use of a three-dimensional data base, rather than on waiting for any technological breakthrough in interactive computing.
Innovative architectures for dense multi-microprocessor computers
NASA Technical Reports Server (NTRS)
Larson, Robert E.
1989-01-01
The purpose is to summarize a Phase 1 SBIR project performed for the NASA/Langley Computational Structural Mechanics Group. The project was performed from February to August 1987. The main objectives of the project were to: (1) expand upon previous research into the application of chordal ring architectures to the general problem of designing multi-microcomputer architectures, (2) attempt to identify a family of chordal rings such that each chordal ring can be simply expanded to produce the next member of the family, (3) perform a preliminary, high-level design of an expandable multi-microprocessor computer based upon chordal rings, and (4) analyze the potential use of chordal ring based multi-microprocessors for sparse matrix problems and other applications arising in computational structural mechanics.
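As a rough illustration of why chordal rings are attractive for interconnects, the sketch below builds one common formulation of a chordal ring (each node i linked to i±1 and i±w, modulo n) and measures its diameter by breadth-first search. The abstract does not say which chordal ring family the project studied, so the construction, the function names `chordal_ring` and `diameter`, and the parameters n and w are all assumptions made for this example.

```python
from collections import deque

def chordal_ring(n, w):
    """Adjacency lists for an n-node ring with chords of length w.

    Assumed formulation: node i is linked to i+1, i-1, i+w and i-w (mod n).
    With w == 1 this degenerates to a plain ring.
    """
    return {
        i: sorted({(i + 1) % n, (i - 1) % n, (i + w) % n, (i - w) % n} - {i})
        for i in range(n)
    }

def diameter(adj):
    """Longest shortest-path length (in hops); assumes a connected graph."""
    worst = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst

plain = chordal_ring(8, 1)    # ordinary 8-node ring
chorded = chordal_ring(8, 3)  # same ring with chords of length 3
print(diameter(plain), diameter(chorded))
```

In this small case the chords cut the worst-case hop count in half while adding only two links per node, which is the kind of trade-off a multi-microprocessor design would weigh.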
Fault tolerant architectures for integrated aircraft electronics systems
NASA Technical Reports Server (NTRS)
Levitt, K. N.; Melliar-Smith, P. M.; Schwartz, R. L.
1983-01-01
Work on possible architectures for future flight control computer systems is described. Ada for Fault-Tolerant Systems, the NETS Network Error-Tolerant System architecture, and voting in asynchronous systems are covered.
Martins, Goncalo; Moondra, Arul; Dubey, Abhishek; Bhattacharjee, Anirban; Koutsoukos, Xenofon D.
2016-01-01
In modern networked control applications, confidentiality and integrity are important features to address in order to protect against attacks. Moreover, network control systems are a fundamental part of the communication components of current cyber-physical systems (e.g., automotive communications). Many networked control systems employ Time-Triggered (TT) architectures that provide mechanisms enabling the exchange of precise and synchronous messages. TT systems have computation and communication constraints, and with the aim to enable secure communications in the network, it is important to evaluate the computational and communication overhead of implementing secure communication mechanisms. This paper presents a comprehensive analysis and evaluation of the effects of adding a Hash-based Message Authentication Code (HMAC) to TT networked control systems. The contributions of the paper include (1) the analysis and experimental validation of the communication overhead, as well as a scalability analysis that utilizes the experimental result for both wired and wireless platforms and (2) an experimental evaluation of the computational overhead of HMAC based on a kernel-level Linux implementation. An automotive application is used as an example, and the results show that it is feasible to implement a secure communication mechanism without interfering with the existing automotive controller execution times. The methods and results of the paper can be used for evaluating the performance impact of security mechanisms and, thus, for the design of secure wired and wireless TT networked control systems. PMID:27463718
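The two overheads discussed in this abstract can be made concrete with a small user-space sketch. It appends an HMAC tag to a message and reports the extra bytes on the wire and the mean time to compute a tag. This is not the paper's kernel-level implementation: the SHA-256 hash, the key, the frame contents, and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import time

KEY = b"hypothetical-shared-key"  # assumed pre-shared key, illustration only

def tag_message(key: bytes, payload: bytes) -> bytes:
    """Return the HMAC-SHA256 tag for payload (hash choice is an assumption)."""
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify_message(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the received tag against a fresh one."""
    return hmac.compare_digest(tag_message(key, payload), tag)

def mean_tag_time(key: bytes, payload: bytes, rounds: int = 10_000) -> float:
    """Average seconds per tag computation: a crude computational overhead."""
    start = time.perf_counter()
    for _ in range(rounds):
        tag_message(key, payload)
    return (time.perf_counter() - start) / rounds

frame = b"\x01\x02speed=42"       # hypothetical control message
tag = tag_message(KEY, frame)
overhead_bytes = len(tag)         # communication overhead per message
print(overhead_bytes, verify_message(KEY, frame, tag))
print(f"{mean_tag_time(KEY, frame) * 1e6:.2f} us per tag")
```

In a time-triggered schedule, both numbers would have to fit inside the slot already allotted to the message, which is the feasibility question the abstract addresses.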
Reinforced Adversarial Neural Computer for de Novo Molecular Design.
Putin, Evgeny; Asadulaev, Arip; Ivanenkov, Yan; Aladinskiy, Vladimir; Sanchez-Lengeling, Benjamin; Aspuru-Guzik, Alán; Zhavoronkov, Alex
2018-06-12
In silico modeling is a crucial milestone in modern drug design and development. Although computer-aided approaches in this field are well-studied, the application of deep learning methods in this research area is still in its early stages. In this work, we present an original deep neural network (DNN) architecture named RANC (Reinforced Adversarial Neural Computer) for the de novo design of novel small-molecule organic structures based on the generative adversarial network (GAN) paradigm and reinforcement learning (RL). As a generator RANC uses a differentiable neural computer (DNC), a category of neural networks, with increased generation capabilities due to the addition of an explicit memory bank, which can mitigate common problems found in adversarial settings. The comparative results have shown that RANC trained on the SMILES string representation of the molecules outperforms its first DNN-based counterpart ORGANIC by several metrics relevant to drug discovery: the number of unique structures, passing medicinal chemistry filters (MCFs), Muegge criteria, and high QED scores. RANC is able to generate structures that match the distributions of the key chemical features/descriptors (e.g., MW, logP, TPSA) and lengths of the SMILES strings in the training data set. Therefore, RANC can be reasonably regarded as a promising starting point to develop novel molecules with activity against different biological targets or pathways. In addition, this approach allows scientists to save time and covers a broad chemical space populated with novel and diverse compounds.
Adaptations in Electronic Structure Calculations in Heterogeneous Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Talamudupula, Sai
Modern quantum chemistry deals with electronic structure calculations of unprecedented complexity and accuracy. They demand full power of high-performance computing and must be in tune with the given architecture for superior efficiency. To make such applications resource-aware, it is desirable to enable their static and dynamic adaptations using some external software (middleware), which may monitor both system availability and application needs, rather than mix science with system-related calls inside the application. The present work investigates scientific application interlinking with middleware based on the example of the computational chemistry package GAMESS and middleware NICAN. The existing synchronous model is limited by the possible delays due to the middleware processing time under the sustainable runtime system conditions. Proposed asynchronous and hybrid models aim at overcoming this limitation. When linked with NICAN, the fragment molecular orbital (FMO) method is capable of adapting statically and dynamically its fragment scheduling policy based on the computing platform conditions. Significant execution time and throughput gains have been obtained due to such static adaptations when the compute nodes have very different core counts. Dynamic adaptations are based on the main memory availability at run time. NICAN prompts FMO to postpone scheduling certain fragments, if there is not enough memory for their immediate execution. Hence, FMO may be able to complete the calculations whereas without such adaptations it aborts.
Framework for Architecture Trade Study Using MBSE and Performance Simulation
NASA Technical Reports Server (NTRS)
Ryan, Jessica; Sarkani, Shahram; Mazzuchim, Thomas
2012-01-01
Increasing complexity in modern systems as well as cost and schedule constraints require a new paradigm of system engineering to fulfill stakeholder needs. Challenges facing efficient trade studies include poor tool interoperability, lack of simulation coordination (design parameters) and requirements flowdown. A recent trend toward Model Based System Engineering (MBSE) includes flexible architecture definition, program documentation, requirements traceability and system engineering reuse. As a new domain MBSE still lacks governing standards and commonly accepted frameworks. This paper proposes a framework for efficient architecture definition using MBSE in conjunction with Domain Specific simulation to evaluate trade studies. A general framework is provided followed with a specific example including a method for designing a trade study, defining candidate architectures, planning simulations to fulfill requirements and finally a weighted decision analysis to optimize system objectives.
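The final step named in this abstract, a weighted decision analysis over candidate architectures, can be sketched as a normalized weighted sum. The criteria, weights, scores, and candidate names below are invented for illustration and do not come from the paper; in the framework described, the scores would be supplied by the domain-specific simulations.

```python
def weighted_scores(weights, candidates):
    """Normalized weighted sum of per-criterion scores for each candidate."""
    total = sum(weights.values())
    return {
        name: sum(weights[c] * scores[c] for c in weights) / total
        for name, scores in candidates.items()
    }

# Hypothetical stakeholder weights and simulation-derived scores (0-10 scale).
weights = {"cost": 5, "performance": 3, "risk": 2}
candidates = {
    "Architecture A": {"cost": 6, "performance": 9, "risk": 5},
    "Architecture B": {"cost": 8, "performance": 6, "risk": 7},
}

scores = weighted_scores(weights, candidates)
best = max(scores, key=scores.get)
print(scores, best)
```

Changing the weights and rerunning is a quick way to see how sensitive the chosen architecture is to stakeholder priorities.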
HyperForest: A high performance multi-processor architecture for real-time intelligent systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, P. Jr.; Rebeil, J.P.; Pollard, H.
1997-04-01
Intelligent Systems are characterized by the intensive use of computer power. The computer revolution of the last few years is what has made possible the development of the first generation of Intelligent Systems. Software for second generation Intelligent Systems will be more complex and will require more powerful computing engines in order to meet real-time constraints imposed by new robots, sensors, and applications. A multiprocessor architecture was developed that merges the advantages of message-passing and shared-memory structures: expandability and real-time compliance. The HyperForest architecture will provide an expandable real-time computing platform for computationally intensive Intelligent Systems and open the doors for the application of these systems to more complex tasks in environmental restoration and cleanup projects, flexible manufacturing systems, and DOE's own production and disassembly activities.
Motion camera based on a custom vision sensor and an FPGA architecture
NASA Astrophysics Data System (ADS)
Arias-Estrada, Miguel
1998-09-01
A digital camera for custom focal plane arrays was developed. The camera allows the test and development of analog or mixed-mode arrays for focal plane processing. The camera is used with a custom sensor for motion detection to implement a motion computation system. The custom focal plane sensor detects moving edges at the pixel level using analog VLSI techniques. The sensor communicates motion events using the event-address protocol associated with a temporal reference. In a second stage, a coprocessing architecture based on a field programmable gate array (FPGA) computes the time-of-travel between adjacent pixels. The FPGA allows rapid prototyping and flexible architecture development. Furthermore, the FPGA interfaces the sensor to a compact PC which is used for high-level control and data communication to the local network. The camera could be used in applications such as self-guided vehicles, mobile robotics and smart surveillance systems. The programmability of the FPGA allows the exploration of further signal processing like spatial edge detection or image segmentation tasks. The article details the motion algorithm, the sensor architecture, the use of the event-address protocol for velocity vector computation and the FPGA architecture used in the motion camera system.
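The time-of-travel idea can be sketched in one dimension: if a moving edge triggers events at two adjacent pixels, the speed is the pixel spacing divided by the difference in timestamps. The event layout (pixel address, timestamp in microseconds) and the function name are assumptions made for this sketch; the abstract does not give the actual event format or the FPGA arithmetic.

```python
def velocity_from_events(event_a, event_b):
    """Estimate edge velocity in pixels per second from two address events.

    Each event is (pixel_address, timestamp_us): an assumed 1-D layout.
    Returns None when the timestamps coincide, since the time-of-travel
    is then below the clock resolution.
    """
    (addr_a, t_a), (addr_b, t_b) = event_a, event_b
    dt_us = t_b - t_a
    if dt_us == 0:
        return None
    return (addr_b - addr_a) * 1_000_000 / dt_us

# An edge crosses pixel 10 and reaches pixel 11 two milliseconds later.
v = velocity_from_events((10, 0), (11, 2000))
print(v)  # pixels per second; the sign gives the direction of travel
```

A two-dimensional velocity vector would follow from applying the same calculation along each image axis.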
Topical perspective on massive threading and parallelism.
Farber, Robert M
2011-09-01
Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive threading and massive parallelism, such as the 1.3 million thread Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world--is how to capitalize on these new massively threaded computational architectures, especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelism with some insight into the future. Published by Elsevier Inc.
Computational Chemistry Using Modern Electronic Structure Methods
ERIC Educational Resources Information Center
Bell, Stephen; Dines, Trevor J.; Chowdhry, Babur Z.; Withnall, Robert
2007-01-01
Various modern electronic structure methods are nowadays used to teach computational chemistry to undergraduate students. Such quantum calculations can now be easily used even for large molecules.
Modelling parallel programs and multiprocessor architectures with AXE
NASA Technical Reports Server (NTRS)
Yan, Jerry C.; Fineman, Charles E.
1991-01-01
AXE, An Experimental Environment for Parallel Systems, was designed to model and simulate parallel systems at the process level. It provides an integrated environment for specifying computation models, multiprocessor architectures, data collection, and performance visualization. AXE is being used at NASA-Ames for developing resource management strategies, parallel problem formulation, multiprocessor architectures, and operating system issues related to the High Performance Computing and Communications Program. AXE's simple, structured user-interface enables the user to model parallel programs and machines precisely and efficiently. Its quick turn-around time keeps the user interested and productive. AXE models multicomputers. The user may easily modify various architectural parameters including the number of sites, connection topologies, and overhead for operating system activities. Parallel computations in AXE are represented as collections of autonomous computing objects known as players. Their use and behavior are described. Performance data of the multiprocessor model can be observed on a color screen. These include CPU and message routing bottlenecks, and the dynamic status of the software.
A high performance parallel computing architecture for robust image features
NASA Astrophysics Data System (ADS)
Zhou, Renyan; Liu, Leibo; Wei, Shaojun
2014-03-01
A design of a parallel architecture for image feature detection and description is proposed in this article. The major component of this architecture is a 2D cellular network composed of simple reprogrammable processors, enabling the Hessian Blob Detector and Haar Response Calculation, which are the most computing-intensive stages of the Speeded Up Robust Features (SURF) algorithm. Combining this 2D cellular network and dedicated hardware for SURF descriptors, this architecture achieves real-time image feature detection with minimal software in the host processor. A prototype FPGA implementation of the proposed architecture achieves 1318.9 GOPS general pixel processing at a 100 MHz clock and up to 118 fps in VGA (640 × 480) image feature detection. The proposed architecture is stand-alone and scalable, so it is easy to migrate to a VLSI implementation.