Performance study of a data flow architecture
NASA Technical Reports Server (NTRS)
Adams, George
1985-01-01
Teams of scientists studied data flow concepts, static data flow machine architecture, and the VAL language. Each team mapped its application onto the machine and coded it in VAL. The principal findings of the study were: (1) Five of the seven applications used the full power of the target machine. The galactic simulation and multigrid fluid flow teams found that a significantly smaller version of the machine (16 processing elements) would suffice. (2) A number of machine design parameters including processing element (PE) function unit numbers, array memory size and bandwidth, and routing network capability were found to be crucial for optimal machine performance. (3) The study participants readily acquired VAL programming skills. (4) Participants learned that application-based performance evaluation is a sound method of evaluating new computer architectures, even those that are not fully specified. During the course of the study, participants developed models for using computers to solve numerical problems and for evaluating new architectures. These models form the bases for future evaluation studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fischler, M.
1992-04-01
The issues to be addressed here are those of "balance" in machine architecture. By this, we mean how much emphasis must be placed on various aspects of the system to maximize its usefulness for physics. There are three components that contribute to the utility of a system: How the machine can be used, how big a problem can be attacked, and what the effective capabilities (power) of the hardware are like. The effective power issue is a matter of evaluating the impact of design decisions trading off architectural features such as memory bandwidth and interprocessor communication capabilities. What is studied is the effect these machine parameters have on how quickly the system can solve desired problems. There is a reasonable method for studying this: One selects a few representative algorithms and computes the impact of changing memory bandwidths, and so forth. The only room for controversy here is in the selection of representative problems. The issue of how big a problem can be attacked boils down to a balance of memory size versus power. Although this is a balance issue it is very different than the effective power situation, because no firm answer can be given at this time. The power to memory ratio is highly problem dependent, and optimizing it requires several pieces of physics input, including: how big a lattice is needed for interesting results; what sort of algorithms are best to use; and how many sweeps are needed to get valid results. We seem to be at the threshold of learning things about these issues, but for now, the memory size issue will necessarily be addressed in terms of best guesses, rules of thumb, and researchers' opinions.
Exascale Hardware Architectures Working Group
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hemmert, S; Ang, J; Chiang, P
2011-03-15
The ASC Exascale Hardware Architecture working group is challenged to provide input on the following areas impacting the future use and usability of potential exascale computer systems: processor, memory, and interconnect architectures, as well as the power and resilience of these systems. Going forward, there are many challenging issues that will need to be addressed. First, power constraints in processor technologies will lead to steady increases in parallelism within a socket. Additionally, all cores may not be fully independent nor fully general purpose. Second, there is a clear trend toward less balanced machines, in terms of compute capability compared to memory and interconnect performance. In order to mitigate the memory issues, memory technologies will introduce 3D stacking, eventually moving on-socket and likely on-die, providing greatly increased bandwidth but unfortunately also likely providing smaller memory capacity per core. Off-socket memory, possibly in the form of non-volatile memory, will create a complex memory hierarchy. Third, communication energy will dominate the energy required to compute, such that interconnect power and bandwidth will have a significant impact. All of the above changes are driven by the need for greatly increased energy efficiency, as current technology will prove unsuitable for exascale, due to unsustainable power requirements of such a system. These changes will have the most significant impact on programming models and algorithms, but they will be felt across all layers of the machine. There is clear need to engage all ASC working groups in planning for how to deal with technological changes of this magnitude. The primary function of the Hardware Architecture Working Group is to facilitate codesign with hardware vendors to ensure future exascale platforms are capable of efficiently supporting the ASC applications, which in turn need to meet the mission needs of the NNSA Stockpile Stewardship Program.
This issue is relatively immediate, as there is only a small window of opportunity to influence hardware design for 2018 machines. Given the short timeline, a firm co-design methodology with vendors is of prime importance.
Stanford Hardware Development Program
NASA Technical Reports Server (NTRS)
Peterson, A.; Linscott, I.; Burr, J.
1986-01-01
Architectures for high performance, digital signal processing, particularly for high resolution, wide band spectrum analysis were developed. These developments are intended to provide instrumentation for NASA's Search for Extraterrestrial Intelligence (SETI) program. The real time signal processing is both formal and experimental. The efficient organization and optimal scheduling of signal processing algorithms were investigated. The work is complemented by efforts in processor architecture design and implementation. A high resolution, multichannel spectrometer that incorporates special purpose microcoded signal processors is being tested. A general purpose signal processor for the data from the multichannel spectrometer was designed to function as the processing element in a highly concurrent machine. The processor performance required for the spectrometer is in the range of 1000 to 10,000 million instructions per second (MIPS). Multiple node processor configurations, where each node performs at 100 MIPS, are sought. The nodes are microprogrammable and are interconnected through a network with high bandwidth for neighboring nodes, and medium bandwidth for nodes at larger distance. The implementation of both the current multichannel spectrometer and the signal processor as Very Large Scale Integration CMOS chip sets was commenced.
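As a rough reading of the figures in this abstract, the number of 100-MIPS nodes needed to cover the stated 1000 to 10,000 MIPS range can be estimated as below. This is only a back-of-the-envelope sketch: it assumes ideal linear scaling with no interconnect or scheduling overhead, which the abstract does not claim, and the function name `nodes_required` is hypothetical.

```python
import math


def nodes_required(total_mips: float, node_mips: float = 100.0) -> int:
    """Minimum node count for a target throughput.

    Assumption (not from the abstract): throughput scales linearly
    with node count, i.e. no communication or scheduling overhead.
    """
    if total_mips <= 0 or node_mips <= 0:
        raise ValueError("throughput figures must be positive")
    return math.ceil(total_mips / node_mips)


if __name__ == "__main__":
    # Lower and upper ends of the range quoted in the abstract.
    print(nodes_required(1_000), nodes_required(10_000))
```

Under that idealized assumption the configuration would span roughly 10 to 100 nodes; any real overhead would push the count higher.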
Architectural requirements for the Red Storm computing system.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Camp, William J.; Tomkins, James Lee
This report is based on the Statement of Work (SOW) describing the various requirements for delivering a new supercomputer system to Sandia National Laboratories (Sandia) as part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative (ASCI) program. This system is named Red Storm and will be a distributed memory, massively parallel processor (MPP) machine built primarily out of commodity parts. The requirements presented here distill extensive architectural and design experience accumulated over a decade and a half of research, development and production operation of similar machines at Sandia. Red Storm will have an unusually high bandwidth, low latency interconnect, specially designed hardware and software reliability features, a light weight kernel compute node operating system and the ability to rapidly switch major sections of the machine between classified and unclassified computing environments. Particular attention has been paid to architectural balance in the design of Red Storm, and it is therefore expected to achieve an atypically high fraction of its peak speed of 41 TeraOPS on real scientific computing applications. In addition, Red Storm is designed to be upgradeable to many times this initial peak capability while still retaining appropriate balance in key design dimensions. Installation of the Red Storm computer system at Sandia's New Mexico site is planned for 2004, and it is expected that the system will be operated for a minimum of five years following installation.
AHaH Computing–From Metastable Switches to Attractors to Machine Learning
Nugent, Michael Alexander; Molter, Timothy Wesley
2014-01-01
Modern computing architecture based on the separation of memory and processing leads to a well known problem called the von Neumann bottleneck, a restrictive limit on the data bandwidth between CPU and RAM. This paper introduces a new approach to computing we call AHaH computing where memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We envision that both von Neumann and AHaH computing architectures will operate together on the same machine, but that the AHaH computing processor may reduce the power consumption and processing time for certain adaptive learning tasks by orders of magnitude. The paper begins by drawing a connection between the properties of volatility, thermodynamics, and Anti-Hebbian and Hebbian (AHaH) plasticity. We show how AHaH synaptic plasticity leads to attractor states that extract the independent components of applied data streams and how they form a computationally complete set of logic functions. After introducing a general memristive device model based on collections of metastable switches, we show how adaptive synaptic weights can be formed from differential pairs of incremental memristors. We also disclose how arrays of synaptic weights can be used to build a neural node circuit operating AHaH plasticity. By configuring the attractor states of the AHaH node in different ways, high level machine learning functions are demonstrated. This includes unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization of procedures–all key capabilities of biological nervous systems and modern machine learning algorithms with real world application. PMID:24520315
Programmable bandwidth management in software-defined EPON architecture
NASA Astrophysics Data System (ADS)
Li, Chengjun; Guo, Wei; Wang, Wei; Hu, Weisheng; Xia, Ming
2016-07-01
This paper proposes a software-defined EPON architecture which replaces the hardware-implemented DBA module with a reprogrammable DBA module. The DBA module allows pluggable bandwidth allocation algorithms among multiple ONUs adaptive to traffic profiles and network states. We also introduce a bandwidth management scheme executed at the controller to manage the customized DBA algorithms for all data queues of ONUs. Our performance investigation verifies the effectiveness of this new EPON architecture, and numerical results show that software-defined EPONs can achieve less traffic delay and provide better support to service differentiation in comparison with traditional EPONs.
Miyamoto, Kenji; Kuwano, Shigeru; Terada, Jun; Otaka, Akihiro
2016-01-25
We analyze the mobile fronthaul (MFH) bandwidth and the wireless transmission performance in the split-PHY processing (SPP) architecture, which redefines the functional split of centralized/cloud RAN (C-RAN) while preserving high wireless coordinated multi-point (CoMP) transmission/reception performance. The SPP architecture splits the base station (BS) functions between wireless channel coding/decoding and wireless modulation/demodulation, and employs its own CoMP joint transmission and reception schemes. Simulation results show that the SPP architecture reduces the MFH bandwidth by up to 97% from conventional C-RAN while matching the wireless bit error rate (BER) performance of conventional C-RAN in uplink joint reception with only a 2-dB signal to noise ratio (SNR) penalty.
A novel EPON architecture for supporting direct communication between ONUs
NASA Astrophysics Data System (ADS)
Wang, Liqian; Chen, Xue; Wang, Zhen
2008-11-01
In a traditional EPON, the optical signal from one ONU cannot reach other ONUs, so ONUs cannot transmit packets directly to one another. The packets must be relayed by the OLT, which consumes both upstream and downstream bandwidth. Bandwidth utilization is therefore low, and it falls further as inter-ONU traffic grows. When the EPON carries P2P (peer-to-peer) and VPN applications, there is a large volume of packets among ONUs and the traditional EPON suffers from low bandwidth utilization. In the worst case, the bandwidth utilization of a traditional EPON is only 50 percent. This paper proposes a novel EPON architecture and a novel medium access control protocol to realize direct packet transmission between ONUs. The proposed EPON adopts a novel circled architecture in the splitter. Thanks to the circled splitter, optical signals from an ONU can reach the other ONUs, and packets can be transmitted directly between two ONUs. Traffic between two ONUs consumes only upstream bandwidth, so the bandwidth cost is reduced by 50 percent. Moreover, this direct transmission reduces packet latency.
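The 50 percent figures in this abstract can be reproduced with a very simple accounting model, sketched below. The model is an assumption made here for illustration, not the paper's analysis: each unit of ONU-to-OLT traffic is charged one unit of link capacity, each unit of inter-ONU traffic is charged two units (upstream plus downstream) when relayed by the OLT and one unit when sent directly, and utilization is delivered traffic divided by capacity consumed. The function name `utilization` is hypothetical.

```python
def utilization(inter_onu_fraction: float, direct: bool) -> float:
    """Delivered traffic per unit of link capacity consumed.

    inter_onu_fraction: share of traffic exchanged between ONUs (0..1).
    direct: True for direct ONU-to-ONU transmission, False when the
            OLT relays inter-ONU packets (the traditional case).
    Simplified accounting model; an assumption, not the paper's own.
    """
    if not 0.0 <= inter_onu_fraction <= 1.0:
        raise ValueError("inter_onu_fraction must lie in [0, 1]")
    # Relayed inter-ONU traffic crosses the network twice (up, then down).
    cost_per_inter_onu_unit = 1.0 if direct else 2.0
    consumed = (1.0 - inter_onu_fraction) + inter_onu_fraction * cost_per_inter_onu_unit
    return 1.0 / consumed


if __name__ == "__main__":
    for f in (0.0, 0.5, 1.0):
        print(f, utilization(f, direct=False), utilization(f, direct=True))
```

In this model the relayed case drops to 0.5 when all traffic is inter-ONU, matching the worst case quoted above, while direct transmission halves the capacity charged to inter-ONU traffic.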
Software-defined optical network for metro-scale geographically distributed data centers.
Samadi, Payman; Wen, Ke; Xu, Junjie; Bergman, Keren
2016-05-30
The emergence of cloud computing and big data has rapidly increased the deployment of small and mid-sized data centers. Enterprises and cloud providers require an agile network among these data centers to empower application reliability and flexible scalability. We present a software-defined inter data center network to enable on-demand scale out of data centers on a metro-scale optical network. The architecture consists of a combined space/wavelength switching platform and a Software-Defined Networking (SDN) control plane equipped with a wavelength and routing assignment module. It enables establishing transparent and bandwidth-selective connections from L2/L3 switches, on-demand. The architecture is evaluated in a testbed consisting of 3 data centers, 5-25 km apart. We successfully demonstrated end-to-end bulk data transfer and Virtual Machine (VM) migrations across data centers with less than 100 ms connection setup time and close to full link capacity utilization.
Hybrid petacomputing meets cosmology: The Roadrunner Universe project
NASA Astrophysics Data System (ADS)
Habib, Salman; Pope, Adrian; Lukić, Zarija; Daniel, David; Fasel, Patricia; Desai, Nehal; Heitmann, Katrin; Hsu, Chung-Hsing; Ankeny, Lee; Mark, Graham; Bhattacharya, Suman; Ahrens, James
2009-07-01
The target of the Roadrunner Universe project at Los Alamos National Laboratory is a set of very large cosmological N-body simulation runs on the hybrid supercomputer Roadrunner, the world's first petaflop platform. Roadrunner's architecture presents opportunities and difficulties characteristic of next-generation supercomputing. We describe a new code designed to optimize performance and scalability by explicitly matching the underlying algorithms to the machine architecture, and by using the physics of the problem as an essential aid in this process. While applications will differ in specific exploits, we believe that such a design process will become increasingly important in the future. The Roadrunner Universe project code, MC3 (Mesh-based Cosmology Code on the Cell), uses grid and direct particle methods to balance the capabilities of Roadrunner's conventional (Opteron) and accelerator (Cell BE) layers. Mirrored particle caches and spectral techniques are used to overcome communication bandwidth limitations and possible difficulties with complicated particle-grid interaction templates.
Design and deployment of an elastic network test-bed in IHEP data center based on SDN
NASA Astrophysics Data System (ADS)
Zeng, Shan; Qi, Fazhi; Chen, Gang
2017-10-01
High energy physics experiments produce huge amounts of raw data, but because network resources are shared, there is no guarantee of available bandwidth for each experiment, which may cause link congestion. Meanwhile, with the development of cloud computing technologies, IHEP has established a cloud platform based on OpenStack which ensures the flexibility of computing and storage resources, and more and more computing applications have been deployed on virtual machines created by OpenStack. However, under the traditional network architecture, network capability cannot be requested elastically, which becomes the bottleneck restricting the flexible application of cloud computing. To solve these problems, we propose an elastic cloud data center network architecture based on SDN, and we also design a high performance controller cluster based on OpenDaylight. Finally, we present our current test results.
A Bandwidth-Optimized Multi-Core Architecture for Irregular Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Secchi, Simone; Tumeo, Antonino; Villa, Oreste
This paper presents an architecture template for next-generation high performance computing systems specifically targeted to irregular applications. We start our work by considering that future generation interconnection and memory bandwidth full-system numbers are expected to grow by a factor of 10. In order to keep up with such a communication capacity, while still resorting to fine-grained multithreading as the main way to tolerate unpredictable memory access latencies of irregular applications, we show how overall performance scaling can benefit from the multi-core paradigm. At the same time, we also show how such an architecture template must be coupled with specific techniques in order to optimize bandwidth utilization and achieve the maximum scalability. We propose a technique based on memory references aggregation, together with the related hardware implementation, as one of such optimization techniques. We explore the proposed architecture template by focusing on the Cray XMT architecture and, using a dedicated simulation infrastructure, validate the performance of our template with two typical irregular applications. Our experimental results prove the benefits provided by both the multi-core approach and the bandwidth optimization reference aggregation technique.
Dynamically programmable cache
NASA Astrophysics Data System (ADS)
Nakkar, Mouna; Harding, John A.; Schwartz, David A.; Franzon, Paul D.; Conte, Thomas
1998-10-01
Reconfigurable machines have recently been used as co-processors to accelerate the execution of certain algorithms or program subroutines. The problems with the above approach include high reconfiguration time and limited partial reconfiguration. By far the most critical problems are: (1) the small on-chip memory which results in slower execution time, and (2) small FPGA areas that cannot implement large subroutines. Dynamically Programmable Cache (DPC) is a novel architecture for embedded processors which offers solutions to the above problems. To solve memory access problems, DPC processors merge reconfigurable arrays with the data cache at various cache levels to create a multi-level reconfigurable machine. As a result, DPC machines have both higher data accessibility and FPGA memory bandwidth. To solve the limited FPGA resource problem, DPC processors implement a multi-context switching (virtualization) concept. Virtualization allows implementation of large subroutines with fewer FPGA cells. Additionally, DPC processors can parallelize the execution of several operations, resulting in faster execution time. In this paper, DPC machines are shown to be 5X faster than an Altera FLEX10K FPGA chip and 2X faster than a Sun Ultral SPARC station for two different algorithms (convolution and motion estimation).
In-camera video-stream processing for bandwidth reduction in web inspection
NASA Astrophysics Data System (ADS)
Jullien, Graham A.; Li, QiuPing; Hajimowlana, S. Hossain; Morvay, J.; Conflitti, D.; Roberts, James W.; Doody, Brian C.
1996-02-01
Automated machine vision systems are now widely used for industrial inspection tasks where video-stream data information is taken in by the camera and then sent out to the inspection system for future processing. In this paper we describe a prototype system for on-line programming of arbitrary real-time video data stream bandwidth reduction algorithms; the output of the camera only contains information that has to be further processed by a host computer. The processing system is built into a DALSA CCD camera and uses a microcontroller interface to download bit-stream data to a XILINX FPGA. The FPGA is directly connected to the video data-stream and outputs data to a low bandwidth output bus. The camera communicates to a host computer via an RS-232 link to the microcontroller. Static memory is used to both generate a FIFO interface for buffering defect burst data, and for off-line examination of defect detection data. In addition to providing arbitrary FPGA architectures, the internal program of the microcontroller can also be changed via the host computer and a ROM monitor. This paper describes a prototype system board, mounted inside a DALSA camera, and discusses some of the algorithms currently being implemented for web inspection applications.
VENI, video, VICI: The merging of computer and video technologies
NASA Technical Reports Server (NTRS)
Horowitz, Jay G.
1993-01-01
The topics covered include the following: High Definition Television (HDTV) milestones; visual information bandwidth; television frequency allocation and bandwidth; horizontal scanning; workstation RGB color domain; NTSC color domain; American HDTV time-table; HDTV image size; digital HDTV hierarchy; task force on digital image architecture; open architecture model; future displays; and the ULTIMATE imaging system.
A Hybrid OFDM-TDM Architecture with Decentralized Dynamic Bandwidth Allocation for PONs
Cevik, Taner
2013-01-01
One of the major challenges of passive optical networks is to achieve a fair arbitration mechanism that will prevent possible collisions from occurring at the upstream channel when multiple users attempt to access the common fiber at the same time. Therefore, in this study we mainly focus on fair bandwidth allocation among users, and present a hybrid Orthogonal Frequency Division Multiplexed/Time Division Multiplexed architecture with a dynamic bandwidth allocation scheme that provides satisfying service qualities to the users depending on their varying bandwidth requirements. Unnecessary delays in centralized schemes occurring during bandwidth assignment stage are eliminated by utilizing a decentralized approach. Instead of sending bandwidth demands to the optical line terminal (OLT) which is the only competent authority, each optical network unit (ONU) runs the same bandwidth demand determination algorithm. ONUs inform each other via signaling channel about the status of their queues. This information is fed to the bandwidth determination algorithm which is run by each ONU in a distributed manner. Furthermore, Light Load Penalty, which is a phenomenon in optical communications, is mitigated by limiting the amount of bandwidth that an ONU can demand. PMID:24194684
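The decentralized idea described in this abstract, where every ONU runs the same deterministic routine on the shared queue statuses and so arrives at the same grants without consulting the OLT, can be illustrated with the sketch below. The abstract does not give the actual allocation rule, so the proportional sharing used here is a stand-in assumption, as are the names `allocate`, `capacity`, and `max_demand`; only the cap on per-ONU demand (the abstract's mitigation for Light Load Penalty) comes from the text.

```python
def allocate(queues: dict[str, int], capacity: int, max_demand: int) -> dict[str, int]:
    """Deterministic grant computation that every ONU can run locally.

    queues: queue occupancy reported by each ONU over the signaling channel.
    capacity: bandwidth units available in the next cycle.
    max_demand: upper bound on what any single ONU may request.
    Proportional sharing is an illustrative assumption, not the paper's rule.
    """
    if capacity < 0 or max_demand < 0:
        raise ValueError("capacity and max_demand must be non-negative")
    # Cap each demand, iterating in sorted ONU order so every node
    # performs exactly the same computation.
    demands = {onu: min(q, max_demand) for onu, q in sorted(queues.items())}
    total = sum(demands.values())
    if total <= capacity:
        return demands  # every capped demand fits in this cycle
    # Otherwise scale proportionally; floor division keeps the sum within capacity.
    return {onu: d * capacity // total for onu, d in demands.items()}


if __name__ == "__main__":
    status = {"onu1": 100, "onu2": 300}
    print(allocate(status, capacity=200, max_demand=200))
```

Because the routine depends only on the shared status and fixed parameters, two ONUs given the same reports compute identical grants, which is what lets the scheme skip the request/grant round trip to the OLT.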
TrustGuard: A Containment Architecture with Verified Output
2017-01-01
that the TrustGuard system has minimal performance decline, despite restrictions such as high communication latency and limited available bandwidth...design are the availability of high bandwidth and low delays between the host and the monitoring chip. 3-D integration provides an alternate way of...TRUSTGUARD: A CONTAINMENT ARCHITECTURE WITH VERIFIED OUTPUT SOUMYADEEP GHOSH A DISSERTATION PRESENTED TO THE FACULTY OF PRINCETON UNIVERSITY IN
Klonoff, David C
2017-07-01
The Internet of Things (IoT) is generating an immense volume of data. With cloud computing, medical sensor and actuator data can be stored and analyzed remotely by distributed servers. The results can then be delivered via the Internet. The devices in the IoT include such wireless diabetes devices as blood glucose monitors, continuous glucose monitors, insulin pens, insulin pumps, and closed-loop systems. The cloud model for data storage and analysis is increasingly unable to process the data avalanche, and processing is being pushed out to the edge of the network, closer to where the data-generating devices are. Fog computing and edge computing are two architectures for data handling that can offload data from the cloud, process it near the patient, and transmit information machine-to-machine or machine-to-human in milliseconds or seconds. Sensor data can be processed near the sensing and actuating devices with fog computing (with local nodes) and with edge computing (within the sensing devices). Compared to cloud computing, fog computing and edge computing offer five advantages: (1) greater data transmission speed, (2) less dependence on limited bandwidths, (3) greater privacy and security, (4) greater control over data generated in foreign countries where laws may limit use or permit unwanted governmental access, and (5) lower costs because more sensor-derived data are used locally and less data are transmitted remotely. Connected diabetes devices almost all use fog computing or edge computing because diabetes patients require a very rapid response to sensor input and cannot tolerate delays for cloud computing.
High speed all-optical networks
NASA Technical Reports Server (NTRS)
Chlamtac, Imrich
1993-01-01
An inherent problem of conventional point-to-point WAN architectures is that they cannot translate optical transmission bandwidth into comparable user available throughput due to the limiting electronic processing speed of the switching nodes. This report presents the first solution to WDM based WAN networks that overcomes this limitation. The proposed Lightnet architecture takes into account the idiosyncrasies of WDM switching/transmission leading to an efficient and pragmatic solution. The Lightnet architecture trades the ample WDM bandwidth for a reduction in the number of processing stages and a simplification of each switching stage, leading to drastically increased effective network throughputs.
Compact VLSI neural computer integrated with active pixel sensor for real-time ATR applications
NASA Astrophysics Data System (ADS)
Fang, Wai-Chi; Udomkesmalee, Gabriel; Alkalai, Leon
1997-04-01
A compact VLSI neural computer integrated with an active pixel sensor has been under development to mimic what is inherent in biological vision systems. This electronic eye-brain computer is targeted for real-time machine vision applications which require both high-bandwidth communication and high-performance computing for data sensing, synergy of multiple types of sensory information, feature extraction, target detection, target recognition, and control functions. The neural computer is based on a composite structure which combines Annealing Cellular Neural Network (ACNN) and Hierarchical Self-Organization Neural Network (HSONN). The ACNN architecture is a programmable and scalable multi-dimensional array of annealing neurons which are locally connected with their local neurons. Meanwhile, the HSONN adopts a hierarchical structure with nonlinear basis functions. The ACNN+HSONN neural computer is effectively designed to perform programmable functions for machine vision processing in all levels with its embedded host processor. It provides a two order-of-magnitude increase in computation power over the state-of-the-art microcomputer and DSP microelectronics. A compact current-mode VLSI design feasibility of the ACNN+HSONN neural computer is demonstrated by a 3D 16X8X9-cube neural processor chip design in a 2-micrometers CMOS technology. Integration of this neural computer as one slice of a 4'X4' multichip module into the 3D MCM based avionics architecture for NASA's New Millennium Program is also described.
Performance highlights of the ALMA correlators
NASA Astrophysics Data System (ADS)
Baudry, Alain; Lacasse, Richard; Escoffier, Ray; Webber, John; Greenberg, Joseph; Platt, Laurence; Treacy, Robert; Saez, Alejandro F.; Cais, Philippe; Comoretto, Giovanni; Quertier, Benjamin; Okumura, Sachiko K.; Kamazaki, Takeshi; Chikada, Yoshihiro; Watanabe, Manabu; Okuda, Takeshi; Kurono, Yasutake; Iguchi, Satoru
2012-09-01
Two large correlators have been constructed to combine the signals captured by the ALMA antennas deployed in the Atacama Desert in Chile at an elevation of 5050 meters. The Baseline correlator was fabricated by an NRAO/European team to process up to 64 antennas for 16 GHz bandwidth in two polarizations, and another correlator, the Atacama Compact Array (ACA) correlator, was fabricated by a Japanese team to process up to 16 antennas. Both correlators meet the same specifications except for the number of processed antennas. The main architectural differences between these two large machines will be underlined. Selected features of the Baseline and ACA correlators as well as the main technical challenges met by the designers will be briefly discussed. The Baseline correlator is the largest correlator ever built for radio astronomy. Its digital hybrid architecture provides a wide variety of observing modes including the ability to divide each input baseband into 32 frequency-mobile sub-bands for high spectral resolution and to be operated as a conventional 'lag' correlator for high time resolution. The various observing modes offered by the ALMA correlators to the science community for 'Early Science' are presented, as well as future observing modes. Coherently phasing the array to provide VLBI maps of extremely compact sources is another feature of the ALMA correlators. Finally, the status and availability of these large machines will be presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Tumeo, Antonino; Ferrandi, Fabrizio
Emerging applications such as data mining, bioinformatics, knowledge discovery, and social network analysis are irregular. They use data structures based on pointers or linked lists, such as graphs, unbalanced trees or unstructured grids, which generate unpredictable memory accesses. These data structures usually are large, but difficult to partition. These applications mostly are memory bandwidth bounded and have high synchronization intensity. However, they also have large amounts of inherent dynamic parallelism, because they potentially perform a task for each of the elements they are exploring. Several efforts are looking at accelerating these applications on hybrid architectures, which integrate general purpose processors with reconfigurable devices. Some solutions, which demonstrated significant speedups, include custom hand-tuned accelerators or even full processor architectures on the reconfigurable logic. In this paper we present an approach for the automatic synthesis of accelerators from C, targeted at irregular applications. In contrast to typical High Level Synthesis paradigms, which construct a centralized Finite State Machine, our approach generates dynamically scheduled hardware components. While parallelism exploitation in typical HLS-generated accelerators is usually bound within a single execution flow, our solution allows concurrently running multiple execution flows, thus also exploiting the coarser grain task parallelism of irregular applications. Our approach supports multiple, multi-ported and distributed memories, and atomic memory operations. Its main objective is parallelizing as many memory operations as possible, independently of their execution time, to maximize the memory bandwidth utilization. This significantly differs from current HLS flows, which usually consider a single memory port and require precise scheduling of memory operations.
A key innovation of our approach is the generation of a memory interface controller, which dynamically maps concurrent memory accesses to multiple ports. We present a case study on a typical irregular kernel, Graph Breadth First search (BFS), exploring different tradeoffs in terms of parallelism and number of memories.
The P-Mesh: A Commodity-based Scalable Network Architecture for Clusters
NASA Technical Reports Server (NTRS)
Nitzberg, Bill; Kuszmaul, Chris; Stockdale, Ian; Becker, Jeff; Jiang, John; Wong, Parkson; Tweten, David (Technical Monitor)
1998-01-01
We designed a new network architecture, the P-Mesh, which combines the scalability and fault resilience of a torus with the performance of a switch. We compare the scalability, performance, and cost of the hub, switch, torus, tree, and P-Mesh architectures. The latter three are capable of scaling to thousands of nodes; however, the torus has severe performance limitations with that many processors. The tree and P-Mesh have similar latency, bandwidth, and bisection bandwidth, but the P-Mesh outperforms the switch architecture (a lower bound for tree performance) on 16-node NAS Parallel Benchmark tests by up to 23%, and costs 40% less. Further, the P-Mesh has better fault resilience characteristics. The P-Mesh architecture trades increased management overhead for lower cost, and is a good bridging technology while tree uplinks remain expensive.
Importance of balanced architectures in the design of high-performance imaging systems
NASA Astrophysics Data System (ADS)
Sgro, Joseph A.; Stanton, Paul C.
1999-03-01
Imaging systems employed in demanding military and industrial applications, such as automatic target recognition and computer vision, typically require real-time high-performance computing resources. While high-performance computing systems have traditionally relied on proprietary architectures and custom components, recent advances in high-performance general-purpose microprocessor technology have produced an abundance of low-cost components suitable for use in high-performance computing systems. A common pitfall in the design of high-performance imaging systems, particularly systems employing scalable multiprocessor architectures, is the failure to balance computational and memory bandwidth. The performance of standard cluster designs, for example, in which several processors share a common memory bus, is typically constrained by memory bandwidth. The characteristic symptom of this problem is the failure of system performance to scale as more processors are added. The problem is exacerbated if I/O and memory functions share the same bus. The recent introduction of microprocessors with large internal caches and high-performance external memory interfaces makes it practical to design high-performance imaging systems with balanced computational and memory bandwidth. Real-world examples of such designs will be presented, along with a discussion of adapting algorithm design to best utilize available memory bandwidth.
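The scaling symptom described in this abstract can be illustrated with a very small model. The sketch below is not taken from the paper; the function name, the units, and the example figures (an 800 MB/s bus, 200 MB/s demand per processor) are assumptions chosen only to show how a shared memory bus caps the number of processors that do useful work.

```python
def effective_processors(num_procs, bus_bandwidth, demand_per_proc):
    """Number of processors' worth of work sustained on a shared memory bus.

    Hypothetical model: each processor needs `demand_per_proc` (e.g. MB/s)
    to run at full speed, and all processors share `bus_bandwidth`.
    """
    if num_procs < 1 or bus_bandwidth <= 0 or demand_per_proc <= 0:
        raise ValueError("all arguments must be positive")
    # The bus can deliver at most bus_bandwidth, however many processors ask.
    sustained = min(num_procs * demand_per_proc, bus_bandwidth)
    return sustained / demand_per_proc


# Assumed figures: 800 MB/s bus, 200 MB/s needed per processor.
for n in (1, 2, 4, 8, 16):
    print(n, effective_processors(n, 800.0, 200.0))
```

Under these assumed numbers the curve flattens once four processors saturate the bus, which is the failure to scale that the abstract describes.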
NASA Astrophysics Data System (ADS)
Larger, Laurent; Baylón-Fuentes, Antonio; Martinenghi, Romain; Udaltsov, Vladimir S.; Chembo, Yanne K.; Jacquot, Maxime
2017-01-01
Reservoir computing, originally referred to as an echo state network or a liquid state machine, is a brain-inspired paradigm for processing temporal information. It involves learning a "read-out" interpretation for nonlinear transients developed by high-dimensional dynamics when the latter is excited by the information signal to be processed. This novel computational paradigm is derived from recurrent neural network and machine learning techniques. It has recently been implemented in photonic hardware for a dynamical system, which opens the path to ultrafast brain-inspired computing. We report on a novel implementation involving an electro-optic phase-delay dynamics designed with off-the-shelf optoelectronic telecom devices, thus providing the targeted wide bandwidth. Computational efficiency is demonstrated experimentally with speech-recognition tasks. State-of-the-art speed performances reach one million words per second, with very low word error rate. In addition to record processing speed, our investigations have revealed computing-efficiency improvements through yet-unexplored temporal-information-processing techniques, such as simultaneous multisample injection and pitched sampling at the read-out compared to information "write-in".
High-speed ultrafast laser machining with tertiary beam positioning (Conference Presentation)
NASA Astrophysics Data System (ADS)
Yang, Chuan; Zhang, Haibin
2017-03-01
For an industrial laser application, high process throughput and low average cost of ownership are critical to commercial success. Benefiting from high peak power, nonlinear absorption and a small achievable spot size, ultrafast lasers offer the advantages of a minimal heat-affected zone, great taper and sidewall quality, and a small-via capability that exceeds the limits of their predecessors in via drilling for electronic packaging. In the past decade, ultrafast lasers have both grown in power and fallen in cost. For example, disk and fiber technology have both recently shown stable operation in the 50W to 200W range, mostly at high repetition rates (beyond 500 kHz) that help avoid detrimental nonlinear effects. However, scaling the throughput effectively and efficiently with the fast-growing power capability of ultrafast lasers while keeping the beneficial laser-material interactions is very challenging, mainly because of the bottleneck imposed by the inertia-related acceleration limit and servo gain bandwidth when only stages and galvanometers are used. On the other hand, inertia-free scanning solutions such as acousto-optic and electro-optic deflectors have a small scan field and are therefore not suitable for large-panel processing. Our recent system developments combine stages, galvanometers, and AODs into a coordinated tertiary architecture for beam positioning that is both high-bandwidth and large-field. Synchronized three-level movements allow extremely fast local speed and continuous motion over the whole stage travel range. We present via drilling results from such an ultrafast system with up to 3MHz pulse-to-pulse random access, enabling high-quality, low-cost ultrafast machining with emerging high-average-power laser sources.
ASIC-based architecture for the real-time computation of 2D convolution with large kernel size
NASA Astrophysics Data System (ADS)
Shao, Rui; Zhong, Sheng; Yan, Luxin
2015-12-01
Bidimensional convolution is a low-level processing algorithm of interest in many areas, but its high computational cost constrains the size of the kernels, especially in real-time embedded systems. This paper presents a hardware architecture for the ASIC-based implementation of 2-D convolution with medium-to-large kernels. To improve the efficiency of on-chip storage resources and to reduce off-chip bandwidth, a data cache that exploits data reuse is proposed. Multi-block SPRAM is used to cross-cache the images, and an on-chip ping-pong operation takes full advantage of data reuse in the convolution calculation; on this basis, a new ASIC data scheduling scheme and overall architecture are designed. Experimental results show that the structure can perform real-time convolution with templates as large as 40×32, improves the utilization of on-chip memory bandwidth and on-chip memory resources, and maximizes data throughput while reducing the need for off-chip memory bandwidth.
Wide-bandwidth high-resolution search for extraterrestrial intelligence
NASA Technical Reports Server (NTRS)
Horowitz, Paul
1995-01-01
Research was accomplished during the third year of the grant on: BETA architecture, an FFT array, a feature extractor, the Pentium array and workstation, and a radio astronomy spectrometer. The BETA (this SETI project) system architecture has been evolving generally in the direction of greater robustness against terrestrial interference. The new design adds a powerful state-memory feature, multiple simultaneous thresholds, and the ability to integrate multiple spectra in a flexible state-machine architecture. The FFT array is reported with regards to its hardware verification, array production, and control. The feature extractor is responsible for maintaining a moving baseline, recognizing large spectral peaks, following the progress of previously identified interesting spectral regions, and blocking signals from regions previously identified as containing interference. The Pentium array consists of 21 Pentium-based PC motherboards, each with 16 MByte of RAM and an Ethernet interface. Each motherboard receives and processes the data from a feature extractor/correlator board set, passing on the results of a first analysis to the central Unix workstation (through which each is also booted). The radio astronomy spectrometer is a technological spinoff from SETI work. It is proposed to be a combined spectrometer and power-accumulator, for use at Arecibo Observatory to search for neutral hydrogen emission from condensations of neutral hydrogen at high redshift (z = 5).
High bandwidth electro-optic technology for intersatellite optical communications
NASA Technical Reports Server (NTRS)
Krainak, Michael A.
1992-01-01
The research and development of electronic and electro-optic components for geosynchronous and low earth orbiting satellite optical high bandwidth communications at the NASA-Goddard Space Flight Center is reviewed. Intersatellite optical communications retains a strong reliance on microwave circuit technology in several areas - the microwave to optical interface, the laser transmitter modulation driver and the optical receiver. A microwave to optical interface is described requiring high bandwidth electronic downconverters and demodulators. Electrical bandwidth and current drive requirements for the laser modulation driver for three laser alternatives are discussed. Bandwidth and noise requirements are presented for optical receiver architectures.
NASA Technical Reports Server (NTRS)
Richard, Mark A.
1993-01-01
The recent discovery of high temperature superconductors (HTS) has generated a substantial amount of interest in microstrip antenna applications. However, the high permittivity of substrates compatible with HTS results in narrow bandwidths and high patch edge impedances of such antennas. To investigate the performance of superconducting microstrip antennas, three antenna architectures at K and Ka-band frequencies are examined. Superconducting microstrip antennas that are directly coupled, gap coupled, and electromagnetically coupled to a microstrip transmission line were designed and fabricated on lanthanum aluminate substrates using YBa2Cu3O7 superconducting thin films. For each architecture, a single patch antenna and a four element array were fabricated. Measurements from these antennas, including input impedance, bandwidth, patterns, efficiency, and gain are presented. The measured results show usable antennas can be constructed using any of the architectures. All architectures show excellent gain characteristics, with less than 2 dB of total loss in the four element arrays. Although the direct and gap coupled antennas are the simplest antennas to design and fabricate, they suffer from narrow bandwidths. The electromagnetically coupled antenna, on the other hand, allows the flexibility of using a low permittivity substrate for the patch radiator, while using HTS for the feed network, thus increasing the bandwidth while effectively utilizing the low loss properties of HTS. Each antenna investigated in this research is the first of its kind reported.
Efficient traffic grooming with dynamic ONU grouping for multiple-OLT-based access network
NASA Astrophysics Data System (ADS)
Zhang, Shizong; Gu, Rentao; Ji, Yuefeng; Wang, Hongxiang
2015-12-01
Fast bandwidth growth calls for large-scale, high-density access scenarios, where the clustered deployment of multiple Passive Optical Network (PON) systems can be adopted as an appropriate solution to meet the huge bandwidth demands, especially for a future 5G mobile network. However, the lack of interaction between different optical line terminals (OLTs) wastes part of the bandwidth resources. To increase the bandwidth efficiency, as well as reduce bandwidth pressure at the edge of a network, we propose a centralized flexible PON architecture based on Time- and Wavelength-Division Multiplexing PON (TWDM PON). It can provide flexible affiliation for optical network units (ONUs) and different OLTs to support access network traffic localization. Specifically, a dynamic ONU grouping algorithm (DGA) is provided to obtain the minimal OLT outbound traffic. Simulation results show that DGA obtains an average 25.23% traffic gain increment under different OLT numbers within a small ONU number situation, and the traffic gain will increase dramatically with the increment of the ONU number. As the DGA can be deployed easily as an application running above the centralized control plane, the proposed architecture can be helpful to improve the network efficiency for future traffic-intensive access scenarios.
Coarse-Grain Bandwidth Estimation Techniques for Large-Scale Space Network
NASA Technical Reports Server (NTRS)
Cheung, Kar-Ming; Jennings, Esther
2013-01-01
In this paper, we describe a top-down analysis and simulation approach to size the bandwidths of a store-and-forward network for a given network topology, a mission traffic scenario, and a set of data types with different latency requirements. We use these techniques to estimate the wide area network (WAN) bandwidths of the ground links for different architecture options of the proposed Integrated Space Communication and Navigation (SCaN) Network.
A System Architecture for Efficient Transmission of Massive DNA Sequencing Data.
Sağiroğlu, Mahmut Şamil; Külekci, M Oğuzhan
2017-11-01
The DNA sequencing data analysis pipelines require significant computational resources. In that sense, cloud computing infrastructures appear as a natural choice for this processing. However, the first practical difficulty in reaching the cloud computing services is the transmission of the massive DNA sequencing data from where they are produced to where they will be processed. The daily practice here begins with compressing the data in FASTQ file format, and then sending these data via fast data transmission protocols. In this study, we address the weaknesses in that daily practice and present a new system architecture that incorporates the computational resources available on the client side while dynamically adapting itself to the available bandwidth. Our proposal considers the real-life scenarios, where the bandwidth of the connection between the parties may fluctuate, and also the computing power on the client side may be of any size ranging from moderate personal computers to powerful workstations. The proposed architecture aims at utilizing both the communication bandwidth and the computing resources for satisfying the ultimate goal of reaching the results as early as possible. We present a prototype implementation of the proposed architecture, and analyze several real-life cases, which provide useful insights for the sequencing centers, especially on deciding when to use a cloud service and in what conditions.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1992-01-01
The theory of intelligent machines proposes a hierarchical organization for the functions of an autonomous robot based on the principle of increasing precision with decreasing intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed. The authors present a computer architecture that implements the lower two levels of the intelligent machine. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Execution-level controllers for motion and vision systems are briefly addressed, as well as the Petri net transducer software used to implement coordination-level functions. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
NASA Astrophysics Data System (ADS)
Yang, Chen; Liu, LeiBo; Yin, ShouYi; Wei, ShaoJun
2014-12-01
The computational capability of a coarse-grained reconfigurable array (CGRA) can be significantly restrained due to data and context memory bandwidth bottlenecks. Traditionally, two methods have been used to resolve this problem. One method loads the context into the CGRA at run time. This method occupies very small on-chip memory but induces very large latency, which leads to low computational efficiency. The other method adopts a multi-context structure. This method loads the context into the on-chip context memory at the boot phase. Broadcasting the pointer of a set of contexts changes the hardware configuration on a cycle-by-cycle basis. The size of the context memory induces a large area overhead in multi-context structures, which results in major restrictions on application complexity. This paper proposes a Predictable Context Cache (PCC) architecture to address the above context issues by buffering the context inside a CGRA. In this architecture, context is dynamically transferred into the CGRA. Utilizing a PCC significantly reduces the on-chip context memory and the complexity of the applications running on the CGRA is no longer restricted by the size of the on-chip context memory. Data preloading is the most frequently used approach to hide input data latency and speed up the data transmission process for the data bandwidth issue. Rather than fundamentally reducing the amount of input data, the transferred data and computations are processed in parallel. However, the data preloading method cannot work efficiently because data transmission becomes the critical path as the reconfigurable array scale increases. This paper also presents a Hierarchical Data Memory (HDM) architecture as a solution to the efficiency problem. In this architecture, high internal bandwidth is provided to buffer both reused input data and intermediate data. 
The HDM architecture relieves the external memory from the data transfer burden so that the performance is significantly improved. As a result of using PCC and HDM, experiments running mainstream video decoding programs achieved performance improvements of 13.57%-19.48% when there was a reasonable memory size. Therefore, 1080p@35.7fps for H.264 high profile video decoding can be achieved on PCC and HDM architecture when utilizing a 200 MHz working frequency. Further, the size of the on-chip context memory no longer restricted complex applications, which were efficiently executed on the PCC and HDM architecture.
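The reported operating point (1080p at 35.7 fps with a 200 MHz clock) implies a per-frame and per-macroblock cycle budget. The arithmetic below is an illustrative back-of-the-envelope calculation, not a figure from the paper; it assumes that a 1080p H.264 frame is coded as 1920×1088 samples in 16×16 macroblocks, and the function names are hypothetical.

```python
CLOCK_HZ = 200e6            # working frequency stated in the abstract
FRAMES_PER_SECOND = 35.7    # decoding rate stated in the abstract

# Assumption: 1080p is coded as 1920x1088 with 16x16 macroblocks.
MACROBLOCKS_PER_FRAME = (1920 // 16) * (1088 // 16)


def cycles_per_frame(clock_hz, fps):
    """Clock cycles available to decode one frame."""
    return clock_hz / fps


def cycles_per_macroblock(clock_hz, fps, macroblocks):
    """Average clock cycles available per macroblock."""
    return cycles_per_frame(clock_hz, fps) / macroblocks


frame_budget = cycles_per_frame(CLOCK_HZ, FRAMES_PER_SECOND)
mb_budget = cycles_per_macroblock(CLOCK_HZ, FRAMES_PER_SECOND,
                                  MACROBLOCKS_PER_FRAME)
print(f"{frame_budget:.0f} cycles per frame, {mb_budget:.1f} per macroblock")
```

Under these assumptions the budget is roughly 5.6 million cycles per frame, or somewhat under 700 cycles per macroblock on average.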
QoS support over ultrafast TDM optical networks
NASA Astrophysics Data System (ADS)
Narvaez, Paolo; Siu, Kai-Yeung; Finn, Steven G.
1999-08-01
HLAN is a promising architecture for realizing Tb/s access networks based on ultra-fast optical TDM technologies. This paper presents new research results on efficient algorithms for the support of quality of service over the HLAN network architecture. In particular, we propose a new scheduling algorithm that emulates fair queuing in a distributed manner for bandwidth allocation. The proposed scheduler collects information on the queue of each host on the network and then instructs each host how much data to send. Our new scheduling algorithm ensures full bandwidth utilization, while guaranteeing fairness among all hosts.
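The abstract does not give the scheduling algorithm itself. As a rough illustration of the two stated goals (full utilization and fairness), the sketch below computes a max-min fair split of one scheduling round's capacity from reported queue lengths. The function name, the units, and the centralized loop are assumptions for illustration and should not be read as the HLAN scheduler.

```python
def max_min_allocation(queue_lengths, capacity):
    """Split `capacity` among hosts by max-min fairness.

    No host is granted more than it has queued, and any share a host
    cannot use is redistributed equally among the hosts that still
    have data waiting.
    """
    allocation = [0.0] * len(queue_lengths)
    remaining = float(capacity)
    active = [i for i, q in enumerate(queue_lengths) if q > 0]
    while active and remaining > 1e-12:
        share = remaining / len(active)
        still_active = []
        for i in active:
            need = queue_lengths[i] - allocation[i]
            grant = min(share, need)
            allocation[i] += grant
            remaining -= grant
            if queue_lengths[i] - allocation[i] > 1e-12:
                still_active.append(i)
        if len(still_active) == len(active):
            # Every host took a full equal share, so capacity is used up.
            break
        active = still_active
    return allocation


# Three hosts report queues of 2, 8 and 10 units; the round carries 12.
print(max_min_allocation([2, 8, 10], 12))
```

In this example the lightly loaded host is fully served and the other two split the rest equally, so all 12 units are used.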
Flexible All-Digital Receiver for Bandwidth Efficient Modulations
NASA Technical Reports Server (NTRS)
Gray, Andrew; Srinivasan, Meera; Simon, Marvin; Yan, Tsun-Yee
2000-01-01
An all-digital high data rate parallel receiver architecture developed jointly by Goddard Space Flight Center and the Jet Propulsion Laboratory is presented. This receiver utilizes only a small number of high speed components along with a majority of lower speed components operating in a parallel frequency domain structure implementable in CMOS, and can currently process up to 600 Mbps with standard QPSK modulation. Performance results for this receiver for bandwidth efficient QPSK modulation schemes such as square-root raised cosine pulse shaped QPSK and Feher's patented QPSK are presented, demonstrating the flexibility of the receiver architecture.
Solving the Cauchy-Riemann equations on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
Discussed is the implementation of a single algorithm on three parallel-vector computers. The algorithm is a relaxation scheme for the solution of the Cauchy-Riemann equations, a set of coupled first-order partial differential equations. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit-serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The machine architectures are briefly described. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Conclusions are presented.
All-optical central-frequency-programmable and bandwidth-tailorable radar
Zou, Weiwen; Zhang, Hao; Long, Xin; Zhang, Siteng; Cui, Yuanjun; Chen, Jianping
2016-01-01
Radar has been widely used for military, security, and rescue purposes, and modern radar should be reconfigurable at multi-bands and have programmable central frequencies and considerable bandwidth agility. Microwave photonics or photonics-assisted radio-frequency technology is a unique solution to providing such capabilities. Here, we demonstrate an all-optical central-frequency-programmable and bandwidth-tailorable radar architecture that provides a coherent system and utilizes one mode-locked laser for both signal generation and reception. Heterodyning of two individually filtered optical pulses that are pre-chirped via wavelength-to-time mapping generates a wideband linearly chirped radar signal. The working bands can be flexibly tailored with the desired bandwidth at a user-preferred carrier frequency. Radar echoes are first modulated onto the pre-chirped optical pulse, which is also used for signal generation, and then stretched in time or compressed in frequency several fold based on the time-stretch principle. Thus, digitization is facilitated without loss of detection ability. We believe that our results demonstrate an innovative radar architecture with an ultra-high-range resolution. PMID:26795596
Series Elastic Actuators for legged robots
NASA Astrophysics Data System (ADS)
Pratt, Jerry E.; Krupp, Benjamin T.
2004-09-01
Series Elastic Actuators provide many benefits in force control of robots in unconstrained environments. These benefits include high force fidelity, extremely low impedance, low friction, and good force control bandwidth. Series Elastic Actuators employ a novel mechanical design architecture which goes against the common machine design principle of "stiffer is better." A compliant element is placed between the gear train and driven load to intentionally reduce the stiffness of the actuator. A position sensor measures the deflection, and the force output is accurately calculated using Hooke's Law (F=Kx). A control loop then servos the actuator to the desired output force. The resulting actuator has inherent shock tolerance, high force fidelity and extremely low impedance. These characteristics are desirable in many applications including legged robots, exoskeletons for human performance amplification, robotic arms, haptic interfaces, and adaptive suspensions. We describe several variations of Series Elastic Actuators that have been developed using both electric and hydraulic components.
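The measurement and control steps described here can be sketched in a few lines. This is a minimal illustration, not the authors' controller: it assumes a blocked (stationary) load, a motor treated as an ideal velocity source, and a plain proportional law, and the stiffness, gain, and time step are made-up values.

```python
def spring_force(stiffness, deflection):
    """Hooke's law, F = K * x, applied to the measured spring deflection."""
    return stiffness * deflection


def simulate_force_servo(desired_force, stiffness, gain, dt, steps):
    """Servo the spring force toward `desired_force`.

    Assumptions (illustrative only): the load does not move, and the
    motor command is a velocity that directly changes the deflection.
    """
    deflection = 0.0
    for _ in range(steps):
        error = desired_force - spring_force(stiffness, deflection)
        motor_velocity = gain * error      # proportional control law
        deflection += motor_velocity * dt  # motor winds up the spring
    return spring_force(stiffness, deflection)


# Assumed values: K = 1000 N/m, target 50 N, gain 0.01 (m/s)/N, 10 ms step.
final_force = simulate_force_servo(desired_force=50.0, stiffness=1000.0,
                                   gain=0.01, dt=0.01, steps=200)
print(round(final_force, 3))
```

With these values the force error shrinks by a factor of 0.9 per step, so the output settles at the requested force; a real actuator would also need to account for load motion and motor dynamics.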
Selecting a Benchmark Suite to Profile High-Performance Computing (HPC) Machines
2014-11-01
architectures. Machines now contain central processing units (CPUs), graphics processing units (GPUs), and many integrated core (MIC) architecture all...evaluate the feasibility and applicability of a new architecture just released to the market. Researchers are often unsure how available resources will...architectures. Having a suite of programs running on different architectures, such as GPUs, MICs, and CPUs, adds complexity and technical challenges
Cross-layer shared protection strategy towards data plane in software defined optical networks
NASA Astrophysics Data System (ADS)
Xiong, Yu; Li, Zhiqiang; Zhou, Bin; Dong, Xiancun
2018-04-01
In order to ensure reliable data transmission on the data plane and minimize resource consumption, a novel protection strategy for the data plane is proposed in software defined optical networks (SDON). Firstly, we establish an SDON architecture with a hierarchical data plane structure, which divides the data plane into four layers to obtain fine-grained bandwidth resources. Then, we design the cross-layer routing and resource allocation based on this network architecture. By jointly considering the bandwidth resources on all the layers, the SDN controller can allocate bandwidth resources to the working path and backup path in an economical manner. Next, we construct auxiliary graphs and transform the shared protection problem into the graph vertex coloring problem. Therefore, the resource consumption on backup paths can be reduced further. The simulation results demonstrate that the proposed protection strategy can achieve lower protection overhead and higher resource utilization ratio.
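The abstract does not say how the auxiliary graph is built or which coloring method is used. One plausible reading, shown below purely as an assumption, is that each vertex is a backup path, an edge joins two backup paths that may not share a resource, and each color is one unit of backup resource, so fewer colors means less backup capacity. The greedy largest-degree-first heuristic and all names in the example are illustrative.

```python
def greedy_coloring(conflicts):
    """Greedy vertex coloring, highest-degree vertices first.

    `conflicts` maps each vertex to the set of vertices it must not
    share a color with. Returns a dict vertex -> color index.
    """
    colors = {}
    order = sorted(conflicts, key=lambda v: (-len(conflicts[v]), v))
    for vertex in order:
        used = {colors[n] for n in conflicts[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1  # smallest color not used by a neighbor
        colors[vertex] = color
    return colors


# Hypothetical backup paths b1..b4; an edge means "cannot share".
conflicts = {
    "b1": {"b2"},
    "b2": {"b1", "b3"},
    "b3": {"b2"},
    "b4": set(),
}
coloring = greedy_coloring(conflicts)
print(coloring, "resources needed:", len(set(coloring.values())))
```

Here four backup paths fit into two shared resource units. Greedy coloring is a heuristic and does not guarantee the minimum number of colors in general.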
Modelling of internal architecture of kinesin nanomotor as a machine language.
Khataee, H R; Ibrahim, M Y
2012-09-01
Kinesin is a protein-based natural nanomotor that transports molecular cargoes within cells by walking along microtubules. The kinesin nanomotor is considered a bio-nanoagent which is able to sense the cell through its sensors (i.e. its heads and tail), make decisions internally and perform actions on the cell through its actuator (i.e. its motor domain). The study maps the agent-based architectural model of the internal decision-making process of the kinesin nanomotor to a machine language using an automata algorithm. The applied automata algorithm receives the internal agent-based architectural model of the kinesin nanomotor as a deterministic finite automaton (DFA) model and generates a regular machine language. The generated regular machine language was accepted by the architectural DFA model of the nanomotor and was also in good agreement with its natural behaviour. The internal agent-based architectural model of the kinesin nanomotor indicates the degree of autonomy and intelligence of the nanomotor's interactions with its cell. Thus, our developed regular machine language can model the degree of autonomy and intelligence of the kinesin nanomotor's interactions with its cell as a language. Modelling of internal architectures of autonomous and intelligent bio-nanosystems as machine languages can lay the foundation towards the concept of bio-nanoswarms and next phases of the bio-nanorobotic systems development.
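To make the idea of a DFA accepting a regular language concrete, the sketch below runs a word through a two-state automaton. The states ("waiting", "stepping") and input symbols ("atp", "step") are placeholders invented for this example; they are not the states or alphabet of the paper's kinesin model.

```python
def dfa_accepts(transitions, start, accepting, word):
    """Return True if the DFA accepts `word` (a sequence of symbols).

    `transitions` maps (state, symbol) -> next state. A missing entry is
    treated as a move to an implicit dead state, i.e. the word is rejected.
    """
    state = start
    for symbol in word:
        key = (state, symbol)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


# Placeholder automaton: the accepted language is (atp step)*.
transitions = {
    ("waiting", "atp"): "stepping",
    ("stepping", "step"): "waiting",
}
accepting = {"waiting"}

print(dfa_accepts(transitions, "waiting", accepting,
                  ["atp", "step", "atp", "step"]))
print(dfa_accepts(transitions, "waiting", accepting, ["atp"]))
```

The first word completes two full cycles and is accepted; the second stops mid-cycle in a non-accepting state and is rejected.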
Reverse time migration: A seismic processing application on the connection machine
NASA Technical Reports Server (NTRS)
Fiebrich, Rolf-Dieter
1987-01-01
The implementation of a reverse time migration algorithm on the Connection Machine, a massively parallel computer, is described. Essential architectural features of this machine as well as programming concepts are presented. The data structures and parallel operations for the implementation of the reverse time migration algorithm are described. The algorithm matches the Connection Machine architecture closely and executes almost at the peak performance of this machine.
A computer architecture for intelligent machines
NASA Technical Reports Server (NTRS)
Lefebvre, D. R.; Saridis, G. N.
1991-01-01
The Theory of Intelligent Machines proposes a hierarchical organization for the functions of an autonomous robot based on the Principle of Increasing Precision With Decreasing Intelligence. An analytic formulation of this theory using information-theoretic measures of uncertainty for each level of the intelligent machine has been developed in recent years. A computer architecture that implements the lower two levels of the intelligent machine is presented. The architecture supports an event-driven programming paradigm that is independent of the underlying computer architecture and operating system. Details of Execution Level controllers for motion and vision systems are addressed, as well as the Petri net transducer software used to implement Coordination Level functions. Extensions to UNIX and VxWorks operating systems which enable the development of a heterogeneous, distributed application are described. A case study illustrates how this computer architecture integrates real-time and higher-level control of manipulator and vision systems.
NASA Astrophysics Data System (ADS)
Bock, Carlos; Prat, Josep; Walker, Stuart D.
2005-12-01
A novel time/space/wavelength division multiplexing (TDM/WDM) architecture using the free spectral range (FSR) periodicity of the arrayed waveguide grating (AWG) is presented. A shared tunable laser and a photoreceiver stack featuring dynamic bandwidth allocation (DBA) and remote modulation are used for transmission and reception. Transmission tests show correct operation at 2.5 Gb/s to a 30-km reach, and network performance calculations using queue modeling demonstrate that a high-bandwidth-demanding application could be deployed on this network.
NASA Astrophysics Data System (ADS)
Yang, Wei; Hall, Trevor J.
2013-12-01
The Internet is entering an era of cloud computing to provide more cost effective, eco-friendly and reliable services to consumer and business users. As a consequence, the nature of the Internet traffic has been fundamentally transformed from a pure packet-based pattern to today's predominantly flow-based pattern. Cloud computing has also brought about an unprecedented growth in the Internet traffic. In this paper, a hybrid optical switch architecture is presented to deal with the flow-based Internet traffic, aiming to offer flexible and intelligent bandwidth on demand to improve fiber capacity utilization. The hybrid optical switch is capable of integrating IP into optical networks for cloud-based traffic with predictable performance, for which the delay performance of the electronic module in the hybrid optical switch architecture is evaluated through simulation.
Buset, Jonathan M; El-Sahn, Ziad A; Plant, David V
2012-06-18
We demonstrate an improved overlapped-subcarrier multiplexed (O-SCM) WDM PON architecture transmitting over a single feeder using cost sensitive intensity modulation/direct detection transceivers, data re-modulation and simple electronics. Incorporating electronic equalization and Reed-Solomon forward-error correction codes helps to overcome the bandwidth limitation of a remotely seeded reflective semiconductor optical amplifier (RSOA)-based ONU transmitter. The O-SCM architecture yields greater spectral efficiency and higher bit rates than many other SCM techniques while maintaining resilience to upstream impairments. We demonstrate full-duplex 5 Gb/s transmission over 20 km and analyze BER performance as a function of transmitted and received power. The architecture provides flexibility to network operators by relaxing common design constraints and enabling full-duplex operation at BER ∼ 10^-10 over a wide range of OLT launch powers from 3.5 to 8 dBm.
NASA Technical Reports Server (NTRS)
Albus, James S.
1996-01-01
The Real-time Control System (RCS) developed at NIST and elsewhere over the past two decades defines a reference model architecture for design and analysis of complex intelligent control systems. The RCS architecture consists of a hierarchically layered set of functional processing modules connected by a network of communication pathways. The primary distinguishing feature of the layers is the bandwidth of the control loops. The characteristic bandwidth of each level is determined by the spatial and temporal integration window of filters, the temporal frequency of signals and events, the spatial frequency of patterns, and the planning horizon and granularity of the planners that operate at each level. At each level, tasks are decomposed into sequential subtasks, to be performed by cooperating sets of subordinate agents. At each level, signals from sensors are filtered and correlated with spatial and temporal features that are relevant to the control function being implemented at that level.
Parallel machine architecture and compiler design facilities
NASA Technical Reports Server (NTRS)
Kuck, David J.; Yew, Pen-Chung; Padua, David; Sameh, Ahmed; Veidenbaum, Alex
1990-01-01
The objective is to provide an integrated simulation environment for studying and evaluating various issues in designing parallel systems, including machine architectures, parallelizing compiler techniques, and parallel algorithms. The status of the Delta project (whose objective is to provide a facility to allow rapid prototyping of parallelized compilers that can target different machine architectures) is summarized. Included are surveys of the program manipulation tools developed, the environmental software supporting Delta, and the compiler research projects in which Delta has played a role.
Path connectivity based spectral defragmentation in flexible bandwidth networks.
Wang, Ying; Zhang, Jie; Zhao, Yongli; Zhang, Jiawei; Zhao, Jie; Wang, Xinbo; Gu, Wanyi
2013-01-28
Optical networks with flexible bandwidth provisioning have become a very promising networking architecture. They enable efficient resource utilization and support heterogeneous bandwidth demands. In this paper, two novel spectrum defragmentation approaches, i.e. the Maximum Path Connectivity (MPC) algorithm and the Path Connectivity Triggering (PCT) algorithm, are proposed based on the notion of Path Connectivity, which is defined to represent the maximum variation of node switching ability along the path in flexible bandwidth networks. A cost-performance-ratio based profitability model is given to denote the pros and cons of spectrum defragmentation. We compare these two proposed algorithms with a non-defragmentation algorithm in terms of blocking probability. Then we analyze the differences in defragmentation profitability between the MPC and PCT algorithms.
Dragas, Jelena; Jäckel, David; Hierlemann, Andreas; Franke, Felix
2017-01-01
Reliable real-time low-latency spike sorting with large data throughput is essential for studies of neural network dynamics and for brain-machine interfaces (BMIs), in which the stimulation of neural networks is based on the networks' most recent activity. However, the majority of existing multi-electrode spike-sorting algorithms are unsuited for processing high quantities of simultaneously recorded data. Recording from large neuronal networks using large high-density electrode sets (thousands of electrodes) imposes high demands on the data-processing hardware regarding computational complexity and data transmission bandwidth; this, in turn, entails demanding requirements in terms of chip area, memory resources and processing latency. This paper presents computational complexity optimization techniques, which facilitate the use of spike-sorting algorithms in large multi-electrode-based recording systems. The techniques are then applied to a previously published algorithm, on its own, unsuited for large electrode set recordings. Further, a real-time low-latency high-performance VLSI hardware architecture of the modified algorithm is presented, featuring a folded structure capable of processing the activity of hundreds of neurons simultaneously. The hardware is reconfigurable “on-the-fly” and adaptable to the nonstationarities of neuronal recordings. By transmitting exclusively spike time stamps and/or spike waveforms, its real-time processing offers the possibility of data bandwidth and data storage reduction. PMID:25415989
NASA Astrophysics Data System (ADS)
Tekin, Tolga; Töpper, Michael; Reichl, Herbert
2009-05-01
Technological frontiers between semiconductor technology, packaging, and system design are disappearing. Scaling down geometries [1] alone does not provide improvement of performance, less power, smaller size, and lower cost. It will require "More than Moore" [2] through the tighter integration of system level components at the package level. System-in-Package (SiP) will deliver the efficient use of three dimensions (3D) through innovation in packaging and interconnect technology. A key bottleneck to the implementation of high-performance microelectronic systems, including SiP, is the lack of low-latency, high-bandwidth, and high-density off-chip interconnects. Some of the challenges in achieving high-bandwidth chip-to-chip communication using electrical interconnects include the high losses in the substrate dielectric, reflections and impedance discontinuities, and susceptibility to crosstalk [3]. Obviously, the incentive for the use of photonics to overcome the challenges and leverage low-latency and high-bandwidth communication will enable the vision of optical computing within next generation architectures. Supercomputers of today offer sustained performance of more than petaflops, which can be increased by utilizing optical interconnects. Next generation computing architectures are needed with ultra-low power consumption and ultra-high performance with novel interconnection technologies. In this paper we will discuss a CMOS compatible underlying technology to enable next generation optical computing architectures. By introducing a new optical layer within the 3D SiP, the development of converged microsystems, deployment for next generation optical computing architecture will be leveraged.
Software architecture standard for simulation virtual machine, version 2.0
NASA Technical Reports Server (NTRS)
Sturtevant, Robert; Wessale, William
1994-01-01
The Simulation Virtual Machine (SVM) is an Ada architecture which eases the effort involved in real-time software maintenance and sustaining engineering. The Software Architecture Standard defines the infrastructure from which all the simulation models are built. SVM was developed for and used in the Space Station Verification and Training Facility.
Scalable Architecture for Multihop Wireless ad Hoc Networks
NASA Technical Reports Server (NTRS)
Arabshahi, Payman; Gray, Andrew; Okino, Clayton; Yan, Tsun-Yee
2004-01-01
A scalable architecture for wireless digital data and voice communications via ad hoc networks has been proposed. Although the details of the architecture and of its implementation in hardware and software have yet to be developed, the broad outlines of the architecture are fairly clear: This architecture departs from current commercial wireless communication architectures, which are characterized by low effective bandwidth per user and are not well suited to low-cost, rapid scaling in large metropolitan areas. This architecture is inspired by a vision more akin to that of more than two dozen noncommercial community wireless networking organizations established by volunteers in North America and several European countries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murphy, Richard C.
2009-09-01
This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.
FPGA cluster for high-performance AO real-time control system
NASA Astrophysics Data System (ADS)
Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.
2006-06-01
Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which leads to a more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, needs to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems such as task distribution, node communication, and system verification are discussed.
Building a Terabyte Memory Bandwidth Compute Node with Four Consumer Electronics GPUs
NASA Astrophysics Data System (ADS)
Omlin, Samuel; Räss, Ludovic; Podladchikov, Yuri
2014-05-01
GPUs released for consumer electronics are generally built with the same chip architectures as the GPUs released for professional usage. With regard to scientific computing, there are no obvious important differences in functionality or performance between the two types of releases, yet the price can differ by up to one order of magnitude. For example, the consumer electronics release of the most recent NVIDIA Kepler architecture (GK110), named GeForce GTX TITAN, performed equally well in conducted memory bandwidth tests as the professional release, named Tesla K20; the consumer electronics release costs about one third of the professional release. We explain how to design and assemble a well adjusted computer with four high-end consumer electronics GPUs (GeForce GTX TITAN) combining more than 1 terabyte/s memory bandwidth. We compare the system's performance and precision with those of hardware released for professional usage. The system can be used as a powerful workstation for scientific computing or as a compute node in a home-built GPU cluster.
Flexible software architecture for user-interface and machine control in laboratory automation.
Arutunian, E B; Meldrum, D R; Friedman, N A; Moody, S E
1998-10-01
We describe a modular, layered software architecture for automated laboratory instruments. The design consists of a sophisticated user interface, a machine controller and multiple individual hardware subsystems, each interacting through a client-server architecture built entirely on top of open Internet standards. In our implementation, the user-interface components are built as Java applets that are downloaded from a server integrated into the machine controller. The user-interface client can thereby provide laboratory personnel with a familiar environment for experiment design through a standard World Wide Web browser. Data management and security are seamlessly integrated at the machine-controller layer using QNX, a real-time operating system. This layer also controls hardware subsystems through a second client-server interface. This architecture has proven flexible and relatively easy to implement and allows users to operate laboratory automation instruments remotely through an Internet connection. The software architecture was implemented and demonstrated on the Acapella, an automated fluid-sample-processing system that is under development at the University of Washington.
Adaptive Code Division Multiple Access Protocol for Wireless Network-on-Chip Architectures
NASA Astrophysics Data System (ADS)
Vijayakumaran, Vineeth
Massive levels of integration following Moore's Law ushered in a paradigm shift in the way on-chip interconnections were designed. With ever higher numbers of cores on the same die, traditional bus-based interconnections are no longer a scalable communication infrastructure. On-chip networks were proposed to enable a scalable plug-and-play mechanism for interconnecting hundreds of cores on the same chip. Wired interconnects between the cores in a traditional Network-on-Chip (NoC) system become a bottleneck as the number of cores increases, thereby increasing the latency and energy needed to transmit signals over them. Hence, many alternative emerging interconnect technologies have been proposed, namely 3D, photonic and multi-band RF interconnects. Although they provide better connectivity, higher speed and higher bandwidth compared to wired interconnects, they also face challenges with heat dissipation and manufacturing difficulties. On-chip wireless interconnects are another proposed alternative, which need no physical interconnection layout because data travels over the wireless medium. They are integrated into a hybrid NoC architecture consisting of both wired and wireless links, which provides higher bandwidth, lower latency, less area overhead and reduced energy dissipation in communication. However, as the bandwidth of the wireless channels is limited, an efficient media access control (MAC) scheme is required to enhance the utilization of the available bandwidth. This thesis proposes using a multiple access mechanism such as Code Division Multiple Access (CDMA) to enable multiple transmitter-receiver pairs to send data over the wireless channel simultaneously. It will be shown that such a hybrid wireless NoC with an efficient CDMA based MAC protocol can significantly increase the performance of the system while lowering the energy dissipation in data transfer.
In this work it is shown that the wireless NoC with the proposed CDMA based MAC protocol outperformed the wired counterparts and several other wireless architectures proposed in literature in terms of bandwidth and packet energy dissipation. Significant gains were observed in packet energy dissipation and bandwidth even with scaling the system to higher number of cores. Non-uniform traffic simulations showed that the proposed CDMA-WiNoC was consistent in bandwidth across all traffic patterns. It is also shown that the CDMA based MAC scheme does not introduce additional reliability concerns in data transfer over the on-chip wireless interconnects.
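The idea that several transmitter-receiver pairs can share one channel simultaneously under CDMA can be sketched in a few lines of Python. This is a toy, noise-free model that assumes Walsh-Hadamard spreading codes and perfectly synchronized senders; the abstract does not state which code family or chip length the thesis uses, so both are assumptions made for illustration.

```python
def walsh_codes(n):
    """Build n mutually orthogonal +/-1 Walsh-Hadamard codes (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    codes = [[1]]
    while len(codes) < n:
        # Sylvester construction: [H H; H -H]
        codes = [row + row for row in codes] + \
                [row + [-chip for chip in row] for row in codes]
    return codes


def spread(bits, code):
    """Map each bit to +/-1 and multiply it by every chip of the code."""
    return [(1 if bit else -1) * chip for bit in bits for chip in code]


def despread(channel, code):
    """Correlate each chip-length window with the code to recover the bits."""
    n = len(code)
    bits = []
    for start in range(0, len(channel), n):
        window = channel[start:start + n]
        correlation = sum(x * chip for x, chip in zip(window, code))
        bits.append(1 if correlation > 0 else 0)
    return bits


codes = walsh_codes(4)
bits_a, bits_b = [1, 0, 1], [0, 0, 1]
# Both senders transmit at the same time; the shared medium adds their signals.
channel = [a + b for a, b in zip(spread(bits_a, codes[1]),
                                 spread(bits_b, codes[2]))]
recovered_a = despread(channel, codes[1])
recovered_b = despread(channel, codes[2])
```

Because the codes are orthogonal, each receiver's correlation cancels the other sender's contribution, so `recovered_a` and `recovered_b` match the transmitted bits in this idealized setting.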
Toward a Mobility-Driven Architecture for Multimodal Underwater Networking
2017-02-01
applications. By equipping AUVs with short-range, high-bandwidth underwater wireless communications, which feature lower energy-per-bit cost than acoustic...protocols. They suffer from significant transmission path losses at high frequencies, long propagation delays, low and distance-dependent bandwidth, time...of data preprocessing, data compression, and either tethering to a surface buoy able to use radio frequency (RF) communications or using undersea
Framework for teleoperated microassembly systems
NASA Astrophysics Data System (ADS)
Reinhart, Gunther; Anton, Oliver; Ehrenstrasser, Michael; Patron, Christian; Petzold, Bernd
2002-02-01
Manual assembly of minute parts is currently done using simple devices such as tweezers or magnifying glasses. The operator therefore requires a great deal of concentration for successful assembly. Teleoperated micro-assembly systems are a promising method for overcoming the scaling barrier. However, most of today's telepresence systems are based on proprietary and one-of-a-kind solutions. Frameworks which supply the basic functions of a telepresence system, e.g. to establish flexible communication links that depend on bandwidth requirements or to synchronize distributed components, are not currently available. Large amounts of time and money have to be invested in order to create task-specific teleoperated micro-assembly systems from scratch. For this reason, an object-oriented framework for telepresence systems that is based on CORBA as a common middleware was developed at the Institute for Machine Tools and Industrial Management (iwb). The framework is based on a distributed architectural concept and is realized in C++. External hardware components such as haptic, video or sensor devices are coupled to the system by means of defined software interfaces. In this case, the special requirements of teleoperation systems have to be considered, e.g. dynamic parameter settings for sensors during operation. Consequently, an architectural concept based on logical sensors has been developed to achieve maximum flexibility and to enable a task-oriented integration of hardware components.
Accessing Geospatial Services in Limited Bandwidth Service-Oriented Architecture (SOA) Environments
ERIC Educational Resources Information Center
Boggs, James D.
2013-01-01
First responders are continuously moving at an incident site and this movement requires them to access Service-Oriented Architecture services, such as a Web Map Service, via mobile wireless networks. First responders from inside a building often have problems in communicating to devices outside that building due to propagation obstacles. Dynamic…
High speed all optical networks
NASA Technical Reports Server (NTRS)
Chlamtac, Imrich; Ganz, Aura
1990-01-01
An inherent problem of conventional point-to-point wide area network (WAN) architectures is that they cannot translate optical transmission bandwidth into comparable user available throughput due to the limiting electronic processing speed of the switching nodes. The first solution to wavelength division multiplexing (WDM) based WAN networks that overcomes this limitation is presented. The proposed Lightnet architecture takes into account the idiosyncrasies of WDM switching/transmission leading to an efficient and pragmatic solution. The Lightnet architecture trades the ample WDM bandwidth for a reduction in the number of processing stages and a simplification of each switching stage, leading to drastically increased effective network throughputs. The principle of the Lightnet architecture is the construction and use of virtual topology networks, embedded in the original network in the wavelength domain. For this construction Lightnets utilize the new concept of lightpaths which constitute the links of the virtual topology. Lightpaths are all-optical, multihop, paths in the network that allow data to be switched through intermediate nodes using high throughput passive optical switches. The use of the virtual topologies and the associated switching design introduce a number of new ideas, which are discussed in detail.
The Tera Multithreaded Architecture and Unstructured Meshes
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.; Mavriplis, Dimitri J.
1998-01-01
The Tera Multithreaded Architecture (MTA) is a new parallel supercomputer currently being installed at San Diego Supercomputing Center (SDSC). This machine has an architecture quite different from contemporary parallel machines. The computational processor is a custom design and the machine uses hardware to support very fine grained multithreading. The main memory is shared, hardware randomized and flat. These features make the machine highly suited to the execution of unstructured mesh problems, which are difficult to parallelize on other architectures. We report the results of a study carried out during July-August 1998 to evaluate the execution of EUL3D, a code that solves the Euler equations on an unstructured mesh, on the 2 processor Tera MTA at SDSC. Our investigation shows that parallelization of an unstructured code is extremely easy on the Tera. We were able to get an existing parallel code (designed for a shared memory machine) running on the Tera by changing only the compiler directives. Furthermore, a serial version of this code was compiled to run in parallel on the Tera by judicious use of directives to invoke the "full/empty" tag bits of the machine to obtain synchronization. This version achieves 212 and 406 Mflop/s on one and two processors respectively, and requires no attention to partitioning or placement of data, issues that would be of paramount importance in other parallel architectures.
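The reported rates of 212 and 406 Mflop/s can be turned into a speedup and a parallel efficiency with simple arithmetic. The short Python sketch below does that; the function name is a hypothetical helper, and it assumes that the ratio of sustained Mflop/s is a fair proxy for speedup on the same problem.

```python
def speedup_and_efficiency(rate_one, rate_p, processors):
    """Return (speedup, parallel efficiency) from sustained rates in Mflop/s."""
    speedup = rate_p / rate_one
    return speedup, speedup / processors


# Figures quoted in the abstract: 212 Mflop/s on one processor, 406 on two.
speedup, efficiency = speedup_and_efficiency(212.0, 406.0, 2)
print(f"speedup = {speedup:.2f}, efficiency = {efficiency:.1%}")
```

With these inputs the speedup is about 1.92 and the efficiency about 96%, which is consistent with the abstract's claim that no attention to data partitioning or placement was needed.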
Using Multiple FPGA Architectures for Real-time Processing of Low-level Machine Vision Functions
Thomas H. Drayer; William E. King; Philip A. Araman; Joseph G. Tront; Richard W. Conners
1995-01-01
In this paper, we investigate the use of multiple Field Programmable Gate Array (FPGA) architectures for real-time machine vision processing. The use of FPGAs for low-level processing represents an excellent tradeoff between software and special purpose hardware implementations. A library of modules that implement common low-level machine vision operations is presented...
Rosen's (M,R) system as an X-machine.
Palmer, Michael L; Williams, Richard A; Gatherer, Derek
2016-11-07
Robert Rosen's (M,R) system is an abstract biological network architecture that is allegedly both irreducible to sub-models of its component states and non-computable on a Turing machine. (M,R) stands as an obstacle to both reductionist and mechanistic presentations of systems biology, principally due to its self-referential structure. If (M,R) has the properties claimed for it, computational systems biology will not be possible, or at best will be a science of approximate simulations rather than accurate models. Several attempts have been made, at both empirical and theoretical levels, to disprove this assertion by instantiating (M,R) in software architectures. So far, these efforts have been inconclusive. In this paper, we attempt to demonstrate why - by showing how both finite state machine and stream X-machine formal architectures fail to capture the self-referential requirements of (M,R). We then show that a solution may be found in communicating X-machines, which remove self-reference using parallel computation, and then synthesise such machine architectures with object-orientation to create a formal basis for future software instantiations of (M,R) systems. Copyright © 2016 Elsevier Ltd. All rights reserved.
Neural networks with fuzzy Petri nets for modeling a machining process
NASA Astrophysics Data System (ADS)
Hanna, Moheb M.
1998-03-01
The paper presents an intelligent architecture based on a feedforward neural network with fuzzy Petri nets for modeling product quality in a CNC machining center. It discusses how the proposed architecture can be used for modeling, monitoring and controlling a product quality specification such as surface roughness. The surface roughness represents the output quality specification of a part manufactured by a CNC machining center as a result of a milling process. The neural network approach employs the selected input parameters, which are defined by the machine operator via the CNC code. The fuzzy Petri nets approach utilizes the exact input milling parameters, such as spindle speed, feed rate, tool diameter and coolant (off/on), which can be obtained via the machine or a sensor system. An aim of the proposed architecture is to model the demanded quality of surface roughness as high, medium or low.
High Performance Power Amplifiers Utilizing Novel Balun Design Techniques
NASA Astrophysics Data System (ADS)
Stameroff, Alexander Nicholas
In this PhD research, a new power amplifier architecture is introduced. This work develops the push-pull architecture into a multifunctional matching network and combiner to create a high power, high efficiency, linear power amplifier (PA) that operates over a wide bandwidth. The traditional push-pull architecture uses an input balun to split a single ended signal into a differential signal, amplify it, and recombine it. This new technique realizes this architecture as a planar, hybrid PA in X band. The first contribution of this work is the development of planar Marchand baluns that operate over a wide bandwidth. An analysis technique is developed and broadside coupled Marchand baluns in an inhomogeneous medium are employed. These baluns operate over a bandwidth from 5 to 26 GHz with amplitude and phase imbalances less than 0.5 dB and 5°, respectively. The even and odd mode behavior of the Marchand balun is utilized to provide harmonic matching for the PA. The balun inherently presents an open circuit to common mode signals at its center frequency. This is utilized to match the second harmonic to an open circuit condition. A band-stop filter is used as a harmonic trap to match the third harmonic to a short circuit. This achieves inverse class F matching for high efficiency operation. This network simultaneously acts as a combiner and matching network for high power and efficiency. A prototype PA was fabricated to prove this concept and achieves a saturated output power, Psat, greater than 33 dBm and a power added efficiency, PAE, greater than 62% over the bandwidth from 9.7 to 10.3 GHz. This technique was refined to operate over a wide bandwidth. The harmonic trap was removed and the out-of-band behavior of the balun was used to provide the short circuit matching at the third harmonic. A prototype PA was fabricated that achieved a 1 dB compressed power, P1dB, and PAE greater than 40 dBm and 55% respectively over the band from 8 to 12 GHz.
Finally, the technique was extended to combine power from four transistors by the development of a 4-to-1 balun. A prototype PA was fabricated to prove this concept and achieves a P1dB and PAE greater than 43 dBm and 55% over the band from 8 to 12 GHz.
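The power levels quoted above are in dBm, so a short conversion helps compare the three prototypes on a linear scale. The following is a minimal sketch, not taken from the dissertation: the function names are hypothetical, and the PAE function simply encodes the standard definition (RF output power minus RF input power, divided by DC supply power).

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm (decibels relative to 1 mW) to watts."""
    return 10 ** (p_dbm / 10.0) / 1000.0


def power_added_efficiency(p_out_w: float, p_in_w: float, p_dc_w: float) -> float:
    """Standard PAE definition: (P_out - P_in) / P_dc, as a fraction."""
    return (p_out_w - p_in_w) / p_dc_w


# Output levels quoted in the abstract, converted to watts.
for label, level_dbm in [("Psat > 33 dBm", 33.0),
                         ("P1dB > 40 dBm", 40.0),
                         ("P1dB > 43 dBm", 43.0)]:
    print(f"{label}: about {dbm_to_watts(level_dbm):.2f} W")
```

On this scale the 3 dB step from 40 dBm to 43 dBm is roughly a doubling of output power, which is what one would expect from combining four transistors instead of two, assuming similar devices.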
Distributed Antenna-Coupled TES for FIR Detectors Arrays
NASA Technical Reports Server (NTRS)
Day, Peter K.; Leduc, Henry G.; Dowell, C. Darren; Lee, Richard A.; Zmuidzinas, Jonas
2007-01-01
We describe a new architecture for a superconducting detector for the submillimeter and far-infrared. This detector uses a distributed hot-electron transition edge sensor (TES) to collect the power from a focal-plane-filling slot antenna array. The sensors lie directly across the slots of the antenna and match the antenna impedance of about 30 ohms. Each pixel contains many sensors that are wired in parallel as a single distributed TES, which results in a low impedance that readily matches to a multiplexed SQUID readout. These detectors are inherently polarization sensitive, with very low cross-polarization response, but can also be configured to sum both polarizations. The dual-polarization design can have a bandwidth of 50%. The use of electron-phonon decoupling eliminates the need for micro-machining, making the focal plane much easier to fabricate than with absorber-coupled, mechanically isolated pixels. We discuss applications of these detectors and a hybridization scheme compatible with arrays of tens of thousands of pixels.
How do I resolve problems reading the binary data?
Atmospheric Science Data Center
2014-12-08
... affecting compilation would be differing versions of the operating system and compilers the read software are being run on. Big ... Unix machines are Big Endian architecture while Linux systems are Little Endian architecture. Data generated on a Unix machine are ...
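A byte-order mismatch of the kind described above can be illustrated with Python's standard `struct` module. This is only a sketch under assumptions: the record layout (a run of 32-bit floats) and the helper name `read_floats` are hypothetical and do not describe any particular data product's read software.

```python
import struct


def read_floats(buffer: bytes, count: int, big_endian: bool = True) -> list:
    """Decode `count` 32-bit floats from `buffer` with an explicit byte order."""
    prefix = ">" if big_endian else "<"  # ">" big endian, "<" little endian
    return list(struct.unpack(f"{prefix}{count}f", buffer[:4 * count]))


# Simulate data written on a big-endian machine.
raw = struct.pack(">3f", 1.5, -2.0, 273.25)

correct = read_floats(raw, 3, big_endian=True)    # matches the writer
swapped = read_floats(raw, 3, big_endian=False)   # wrong byte order

print("correct byte order:", correct)
print("wrong byte order:  ", swapped)
```

Stating the byte order explicitly in the format string, rather than relying on the native order of the reading machine, keeps the same read code working on both kinds of hardware.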
Pilot-in-the-Loop Analysis of Propulsive-Only Flight Control Systems
NASA Technical Reports Server (NTRS)
Chou, Hwei-Lan; Biezad, Daniel J.
1996-01-01
Longitudinal control system architectures are presented which directly couple flight stick motions to throttle commands for a multi-engine aircraft. This coupling enables positive attitude control with complete failure of the flight control system. The architectures chosen vary from simple feedback gains to classical lead-lag compensators with and without prefilters. Each architecture is reviewed for its appropriateness for piloted flight. The control systems are then analyzed with pilot-in-the-loop metrics related to bandwidth required for landing. Results indicate that current and proposed bandwidth requirements should be modified for throttles only flight control. Pilot ratings consistently showed better ratings than predicted by analysis. Recommendations are made for more robust design and implementation. The use of Quantitative Feedback Theory for compensator design is discussed. Although simple and effective augmented control can be achieved in a wide variety of failed configurations, a few configuration characteristics are dominant for pilot-in-the-loop control. These characteristics will be tested in a simulator study involving failed flight controls for a multi-engine aircraft.
2007-03-01
potential of moving closer to the goal of a fully service-oriented GIG by allowing even computing- and bandwidth-constrained elements to participate...the functionality provided by core network assets with relatively unlimited bandwidth and computing resources. Finally, the nature of information is...the Department of Defense is a requirement for ubiquitous computer connectivity. An espoused vehicle for delivering that ubiquity is the Global
Analysis of performance improvements for host and GPU interface of the APENet+ 3D Torus network
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Paolucci, P. S.; Rossetti, D.; Simula, F.; Tosoratto, L.; Vicini, P.
2014-06-01
APEnet+ is an INFN (Italian Institute for Nuclear Physics) project aiming to develop a custom 3-dimensional torus interconnect network optimized for hybrid CPU-GPU clusters dedicated to high-performance scientific computing. The APEnet+ interconnect fabric is built on an FPGA-based PCI Express board with 6 bi-directional off-board links showing 34 Gbps of raw bandwidth per direction, and leverages the peer-to-peer capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain real zero-copy, GPU-to-GPU low-latency transfers. APEnet+ transfer latency is minimized by adopting an RDMA protocol implemented in the FPGA with specialized hardware blocks tightly coupled with an embedded microprocessor. This architecture provides a high-performance, low-latency offload engine for both the transmit and receive sides of data transactions: preliminary results are encouraging, showing a 50% bandwidth increase for large packet size transfers. In this paper we describe the APEnet+ architecture, detailing the hardware implementation, and discuss the impact of such RDMA-specialized hardware on host interface latency and bandwidth.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near memory is used as cache, but we believe that this will be inefficient.
Machine-Learning Approach for Design of Nanomagnetic-Based Antennas
NASA Astrophysics Data System (ADS)
Gianfagna, Carmine; Yu, Huan; Swaminathan, Madhavan; Pulugurtha, Raj; Tummala, Rao; Antonini, Giulio
2017-08-01
We propose a machine-learning approach for design of planar inverted-F antennas with a magneto-dielectric nanocomposite substrate. It is shown that machine-learning techniques can be efficiently used to characterize nanomagnetic-based antennas by accurately mapping the particle radius and volume fraction of the nanomagnetic material to antenna parameters such as gain, bandwidth, radiation efficiency, and resonant frequency. A modified mixing rule model is also presented. In addition, the inverse problem is addressed through machine learning as well, where given the antenna parameters, the corresponding design space of possible material parameters is identified.
Fully programmable and scalable optical switching fabric for petabyte data center.
Zhu, Zhonghua; Zhong, Shan; Chen, Li; Chen, Kai
2015-02-09
We present a converged EPS and OCS switching fabric for data center networks (DCNs) based on a distributed optical switching architecture leveraging both WDM and SDM technologies. The architecture is topology adaptive and well suited to dynamic and diverse *-cast traffic patterns. Compared to a typical folded-Clos network, the new architecture is more readily scalable to future multi-petabyte data centers with 1000+ racks while providing a higher link bandwidth, reducing transceiver count by 50%, and improving cabling efficiency by more than 90%.
Enabling Low-Power, Multi-Modal Neural Interfaces Through a Common, Low-Bandwidth Feature Space.
Irwin, Zachary T; Thompson, David E; Schroeder, Karen E; Tat, Derek M; Hassani, Ali; Bullard, Autumn J; Woo, Shoshana L; Urbanchek, Melanie G; Sachs, Adam J; Cederna, Paul S; Stacey, William C; Patil, Parag G; Chestek, Cynthia A
2016-05-01
Brain-Machine Interfaces (BMIs) have shown great potential for generating prosthetic control signals. Translating BMIs into the clinic requires fully implantable, wireless systems; however, current solutions have high power requirements which limit their usability. Lowering this power consumption typically limits the system to a single neural modality, or signal type, and thus to a relatively small clinical market. Here, we address both of these issues by investigating the use of signal power in a single narrow frequency band as a decoding feature for extracting information from electrocorticographic (ECoG), electromyographic (EMG), and intracortical neural data. We have designed and tested the Multi-modal Implantable Neural Interface (MINI), a wireless recording system which extracts and transmits signal power in a single, configurable frequency band. In prerecorded datasets, we used the MINI to explore low frequency signal features and any resulting tradeoff between power savings and decoding performance losses. When processing intracortical data, the MINI achieved a power consumption 89.7% less than a more typical system designed to extract action potential waveforms. When processing ECoG and EMG data, the MINI achieved similar power reductions of 62.7% and 78.8%. At the same time, using the single signal feature extracted by the MINI, we were able to decode all three modalities with less than a 9% drop in accuracy relative to using high-bandwidth, modality-specific signal features. We believe this system architecture can be used to produce a viable, cost-effective, clinical BMI.
Fiber to the serving area: telephone-like star architecture for CATV
NASA Astrophysics Data System (ADS)
Fellows, David M.
1992-02-01
CATV systems traditionally use a tree and branch architecture to bring up to 550 MHz of analog bandwidth to every home in a franchise area. This changed slightly with the advent of AM fiber optic equipment, as fiber optics were used in an overlay fashion to reduce coaxial amplifier cascades and improve subscriber quality and reliability. Within the last year, fiber has economically replaced coaxial trunking. The resulting fiber to the serving area architecture combines fiber and coaxial stars for a network that looks much like the carrier serving area architectures used by telephone companies.
An intelligent CNC machine control system architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, D.J.; Loucks, C.S.
1996-10-01
Intelligent, agile manufacturing relies on automated programming of digitally controlled processes. Currently, processes such as Computer Numerically Controlled (CNC) machining are difficult to automate because of highly restrictive controllers and poor software environments. It is also difficult to utilize sensors and process models for adaptive control, or to integrate machining processes with other tasks within a factory floor setting. As part of a Laboratory Directed Research and Development (LDRD) program, a CNC machine control system architecture based on object-oriented design and graphical programming has been developed to address some of these problems and to demonstrate automated agile machining applications using platform-independent software.
A Cascaded Self-Similar Rat-Race Hybrid Coupler Architecture and its Compact Ka-Band Implementation
2017-03-01
real-estate and limit the system-level performance, including bandwidth, gain, and energy-efficiency. These many challenges are positioning passive...and are used in numerous RF/mm-wave systems for radar and wireless communications. Although a Marchand balun covers a large bandwidth, it is...requires multiple λ/4 transmission lines (t-lines), making its on-chip designs very costly even for RF/mm-wave bands. Reported miniaturized rat-race
2010-07-22
dependent, providing a natural bandwidth match between compute cores and the memory subsystem. • High Bandwidth Density. Waveguides crossing the chip...simulate this memory access architecture on a 256-core chip with a concentrated 64-node network using detailed traces of high-performance embedded...memory modules, we place memory access points (MAPs) around the periphery of the chip connected to the network. These MAPs, shown in Figure 4, contain
Software defined multi-OLT passive optical network for flexible traffic allocation
NASA Astrophysics Data System (ADS)
Zhang, Shizong; Gu, Rentao; Ji, Yuefeng; Zhang, Jiawei; Li, Hui
2016-10-01
With the rapid growth of 4G mobile network and vehicular network services, mobile terminal users have an increasing demand for data sharing among different radio remote units (RRUs) and roadside units (RSUs). Meanwhile, commercial video-streaming and video/voice conference applications delivered through peer-to-peer (P2P) technology continue to drive a sharp increase in bandwidth demand from both business and residential subscribers. However, although wavelength division multiplexing (WDM) and orthogonal frequency division multiplexing (OFDM) technologies have been proposed to meet the ever-increasing bandwidth demand in the access network, the bandwidth of optical fiber is limited by optical component properties and modulation/demodulation technology, and blindly adding wavelengths is at odds with the cost-sensitive nature of the access network. In this paper, we propose a software defined multi-OLT PON architecture to support efficient scheduling of access network traffic. By introducing software defined networking technology and a wavelength selective switch into the TWDM PON system in the central office, multiple OLTs can be treated as a bandwidth resource pool that supports flexible traffic allocation for optical network units (ONUs). Moreover, under the configuration of the control plane, ONUs can change their affiliation between different OLTs under different traffic situations, so inter-OLT traffic can be localized and the data exchange pressure on the core network relieved. Because this architecture is designed to follow the TWDM PON specification as closely as possible, the existing optical distribution network (ODN) investment is preserved and conventional EPON/GPON equipment remains compatible with the proposed architecture.
In addition, based on this architecture, we propose a dynamic wavelength scheduling algorithm, which can be deployed as an application on the control plane and schedules wavelength resources effectively between different OLTs according to the traffic situation. Simulation results show that, by using the scheduling algorithm, network traffic between different OLTs can be optimized effectively, and the wavelength utilization of the multi-OLT system can be improved due to the flexible wavelength scheduling.
A New mHealth Communication Framework for Use in Wearable WBANs and Mobile Technologies
Hamida, Sana Tmar-Ben; Hamida, Elyes Ben; Ahmed, Beena
2015-01-01
Driven by the development of biomedical sensors and the availability of high mobile bandwidth, mobile health (mHealth) systems are now offering a wider range of new services. This revolution makes the idea of in-home health monitoring practical and provides the opportunity for assessment in “real-world” environments producing more ecologically valid data. In the field of insomnia diagnosis, for example, it is now possible to offer patients wearable sleep monitoring systems which can be used in the comfort of their homes over long periods of time. The recorded data collected from body sensors can be sent to a remote clinical back-end system for analysis and assessment. Most of the research on sleep reported in the literature mainly looks into how to automate the analysis of the sleep data and does not address the problem of the efficient encoding and secure transmissions of the collected health data. This article reviews the key enabling communication technologies and research challenges for the design of efficient mHealth systems. An end-to-end mHealth system architecture enabling the remote assessment and monitoring of patient's sleep disorders is then proposed and described as a case study. Finally, various mHealth data serialization formats and machine-to-machine (M2M) communication protocols are evaluated and compared under realistic operating conditions. PMID:25654718
A new mHealth communication framework for use in wearable WBANs and mobile technologies.
Hamida, Sana Tmar-Ben; Hamida, Elyes Ben; Ahmed, Beena
2015-02-03
Driven by the development of biomedical sensors and the availability of high mobile bandwidth, mobile health (mHealth) systems are now offering a wider range of new services. This revolution makes the idea of in-home health monitoring practical and provides the opportunity for assessment in "real-world" environments producing more ecologically valid data. In the field of insomnia diagnosis, for example, it is now possible to offer patients wearable sleep monitoring systems which can be used in the comfort of their homes over long periods of time. The recorded data collected from body sensors can be sent to a remote clinical back-end system for analysis and assessment. Most of the research on sleep reported in the literature mainly looks into how to automate the analysis of the sleep data and does not address the problem of the efficient encoding and secure transmissions of the collected health data. This article reviews the key enabling communication technologies and research challenges for the design of efficient mHealth systems. An end-to-end mHealth system architecture enabling the remote assessment and monitoring of patient's sleep disorders is then proposed and described as a case study. Finally, various mHealth data serialization formats and machine-to-machine (M2M) communication protocols are evaluated and compared under realistic operating conditions.
Performance prediction: A case study using a multi-ring KSR-1 machine
NASA Technical Reports Server (NTRS)
Sun, Xian-He; Zhu, Jianping
1995-01-01
While computers with tens of thousands of processors have successfully delivered high performance power for solving some of the so-called 'grand-challenge' applications, the notion of scalability is becoming an important metric in the evaluation of parallel machine architectures and algorithms. In this study, the prediction of scalability and its application are carefully investigated. A simple formula is presented to show the relation between scalability, single processor computing power, and degradation of parallelism. A case study is conducted on a multi-ring KSR1 shared virtual memory machine. Experimental and theoretical results show that the influence of topology variation of an architecture is predictable. Therefore, the performance of an algorithm on a sophisticated, hierarchical architecture can be predicted and the best algorithm-machine combination can be selected for a given application.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karamooz, Saeed; Breeding, John Eric; Justice, T Alan
As MicroTCA expands into applications beyond the telecommunications industry from which it originated, it faces new challenges in the area of inter-blade communications. The ability to achieve deterministic, low-latency communications between blades is critical to realizing a scalable architecture. In the past, legacy bus architectures accomplished inter-blade communications using dedicated parallel buses across the backplane. Because of limited fabric resources on its backplane, MicroTCA uses the carrier hub (MCH) for this purpose. Unfortunately, MCH products from commercial vendors are limited to standard bus protocols such as PCI Express, Serial Rapid IO and 10/40GbE. While these protocols have exceptional throughput capability, they are neither deterministic nor necessarily low-latency. To overcome this limitation, an MCH has been developed based on the Xilinx Virtex-7 690T FPGA. This MCH provides the system architect/developer complete flexibility in both the interface protocol and routing of information between blades. In this paper, we present the application of this configurable MCH concept to the Machine Protection System under development for the Spallation Neutron Source's proton accelerator. Specifically, we demonstrate the use of the configurable MCH as a 12x4-lane crossbar switch using the Aurora protocol to achieve a deterministic, low-latency data link. In this configuration, the crossbar has an aggregate bandwidth of 48 GB/s.
Sigint Application for Polymorphous Computing Architecture (PCA): Wideband DF
2006-08-01
Polymorphous Computing Architecture (PCA) program as stated by Robert Graybill is to Develop the computing foundation for agile systems by establishing...ubiquitous MUSIC algorithm rely upon an underlying narrowband signal model [8]. In this case, narrowband means that the signal bandwidth is less than...a wideband DF algorithm is needed to compensate for this model inadequacy. Among the various wideband DF techniques available, the coherent signal
Bandwidth Study of the Microwave Reflectors with Rectangular Corrugations
NASA Astrophysics Data System (ADS)
Zhang, Liang; He, Wenlong; Donaldson, Craig R.; Cross, Adrian W.
2016-09-01
The mode-selective microwave reflector with periodic rectangular corrugations in the inner surface of a circular metallic waveguide is studied in this paper. The relations between the bandwidth and reflection coefficient for different numbers of corrugation sections were studied through a global optimization method. Two types of reflectors were investigated. One does not consider the phase response and the other does. Both types of broadband reflectors operating at W-band were machined and measured to verify the numerical simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tal, J.; Lopez, A.; Edwards, J.M.
1995-04-01
In this paper, an alternative solution to the traditional CNC machine tool controller has been introduced. Software and hardware modules have been described and their incorporation in a CNC control system has been outlined. This type of CNC machine tool controller demonstrates that technology is accessible and can be readily implemented into an open architecture machine tool controller. Benefit to the user is greater controller flexibility, while being economically achievable. PC based, motion as well as non-motion features will provide flexibility through a Windows environment. Up-grading this type of controller system through software revisions will keep the machine tool in a competitive state with minimal effort. Software and hardware modules are mass produced permitting competitive procurement and incorporation. Open architecture CNC systems provide diagnostics thus enhancing maintainability, and machine tool up-time. A major concern of traditional CNC systems has been operator training time. Training time can be greatly minimized by making use of Windows environment features.
A survey of compiler optimization techniques
NASA Technical Reports Server (NTRS)
Schneck, P. B.
1972-01-01
Major optimization techniques of compilers are described and grouped into three categories: machine dependent, architecture dependent, and architecture independent. Machine-dependent optimizations tend to be local and are performed upon short spans of generated code by using particular properties of an instruction set to reduce the time or space required by a program. Architecture-dependent optimizations are global and are performed while generating code. These optimizations consider the structure of a computer, but not its detailed instruction set. Architecture-independent optimizations are also global but are based on analysis of the program flow graph and the dependencies among statements of the source program. A conceptual review of a universal optimizer that performs architecture-independent optimizations at source-code level is also presented.
Clark, Edward B; Hickinbotham, Simon J; Stepney, Susan
2017-05-01
We present a novel stringmol-based artificial chemistry system modelled on the universal constructor architecture (UCA) first explored by von Neumann. In a UCA, machines interact with an abstract description of themselves to replicate by copying the abstract description and constructing the machines that the abstract description encodes. DNA-based replication follows this architecture, with DNA being the abstract description, the polymerase being the copier, and the ribosome being the principal machine in expressing what is encoded on the DNA. This architecture is semantically closed as the machine that defines what the abstract description means is itself encoded on that abstract description. We present a series of experiments with the stringmol UCA that show the evolution of the meaning of genomic material, allowing the concept of semantic closure and transitions between semantically closed states to be elucidated in the light of concrete examples. We present results that show, for the first time in an in silico system, simultaneous evolution of the genomic material, copier and constructor of a UCA, giving rise to viable offspring. © 2017 The Author(s).
NASA Astrophysics Data System (ADS)
Balamurugan, A. M.; Sivasubramanian, A.
2014-06-01
Optical Burst Switching (OBS) is an emerging technology that could make future optical networks viable. It can meet the requirements of bandwidth-intensive applications. Optical transmission has advanced considerably and remains an active research field, mainly because optical data transmission can run at very high speeds. OBS is still far from mature, however, particularly with respect to security threats. The move from the optical switching paradigm to optical burst switching faces serious difficulties in burst aggregation, routing, authentication, dispute resolution and quality of service (QoS). This paper proposes a framework built on a quantum key distribution (QKD) based secure edge router architecture to provide burst confidentiality. The QKD protocol is described as indestructible and therefore offers a high level of confidentiality. The architecture was implemented in an FPGA using several models and results were collected. The results show that the proposed model is suitable for real-time secure routing applications in optical burst switched networks.
High-performance, scalable optical network-on-chip architectures
NASA Astrophysics Data System (ADS)
Tan, Xianfang
The rapid advance of technology enables a large number of processing cores to be integrated into a single chip, which is called a Chip Multiprocessor (CMP) or a Multiprocessor System-on-Chip (MPSoC) design. The on-chip interconnection network, which is the communication infrastructure for these processing cores, plays a central role in a many-core system. With the continuously increasing complexity of many-core systems, traditional metallic wired electronic networks-on-chip (NoC) have become a bottleneck because of the unbearable latency in data transmission and extremely high on-chip energy consumption. Optical networks-on-chip (ONoC) have been proposed as a promising alternative paradigm to electronic NoC, with the benefits of optical signaling such as extremely high bandwidth, negligible latency, and low power consumption. This dissertation focuses on the design of high-performance and scalable ONoC architectures, and the contributions are highlighted as follows: 1. A micro-ring resonator (MRR)-based Generic Wavelength-routed Optical Router (GWOR) is proposed. A method for developing a GWOR of any size is introduced. GWOR is a scalable non-blocking ONoC architecture with simple structure, low cost and high power efficiency compared to existing ONoC designs. 2. To expand the bandwidth and improve the fault tolerance of the GWOR, a redundant GWOR architecture is designed by cascading different types of GWORs into one network. 3. A redundant GWOR built with MRR-based comb switches is proposed. Replacing the general MRRs with comb switches expands the bandwidth while keeping the topology of the GWOR unchanged. 4. A butterfly fat tree (BFT)-based hybrid optoelectronic NoC (HONoC) architecture is developed in which GWORs are used for global communication and electronic routers are used for local communication. The proposed HONoC uses fewer electronic routers and links than its electronic BFT-based NoC counterpart.
It takes advantage of GWOR in optical communication and of BFT in non-uniform traffic communication and three-dimensional (3D) implementation. 5. A cycle-accurate NoC simulator is developed to evaluate the performance of the proposed HONoC architectures. It is a comprehensive platform that can simulate both electronic and optical NoCs. HONoC architectures of different sizes are evaluated in terms of throughput, latency and energy dissipation. Simulation results confirm that HONoC achieves good network performance with lower power consumption.
Dual-scale topology optoelectronic processor.
Marsden, G C; Krishnamoorthy, A V; Esener, S C; Lee, S H
1991-12-15
The dual-scale topology optoelectronic processor (D-STOP) is a parallel optoelectronic architecture for matrix algebraic processing. The architecture can be used for matrix-vector multiplication and two types of vector outer product. The computations are performed electronically, which allows multiplication and summation concepts in linear algebra to be generalized to various nonlinear or symbolic operations. This generalization permits the application of D-STOP to many computational problems. The architecture uses a minimum number of optical transmitters, which thereby reduces fabrication requirements while maintaining area-efficient electronics. The necessary optical interconnections are space invariant, minimizing space-bandwidth requirements.
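The generalization of multiplication and summation mentioned above can be sketched in software as a matrix-vector product whose two elementary operations are passed in as parameters. This is an illustrative sketch only; the function `generalized_matvec` and its argument names are hypothetical and say nothing about how D-STOP hardware is organized.

```python
from functools import reduce
import operator


def generalized_matvec(matrix, vector, combine, accumulate):
    """Matrix-vector product with user-supplied element and row operations.

    combine(a, x) replaces multiplication; accumulate(s, t) replaces addition.
    """
    return [
        reduce(accumulate, (combine(a, x) for a, x in zip(row, vector)))
        for row in matrix
    ]


matrix = [[1, 2], [3, 4]]
vector = [5, 6]

# Ordinary linear algebra: multiply, then sum.
linear = generalized_matvec(matrix, vector, operator.mul, operator.add)

# A nonlinear variant: max-min composition, as used in fuzzy relations.
max_min = generalized_matvec(matrix, vector, min, max)

print(linear)   # ordinary product
print(max_min)  # max-min composition
```

The same loop structure serves both cases; only the pair of operations changes, which is the sense in which electronic computation lets the linear-algebra pattern cover nonlinear or symbolic operations.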
Implementation of an ADI method on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
The implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are: the MPP, an SIMD machine with 16K bit serial processors; FLEX/32, an MIMD machine with 20 processors; and CRAY/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the FLEX/32 and CRAY/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
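Each ADI sweep reduces to many independent tridiagonal systems, and Gaussian elimination specialized to a tridiagonal matrix is commonly known as the Thomas algorithm. The sketch below shows that serial algorithm in plain Python; it is not the authors' code, it assumes no pivoting is needed (for example, a diagonally dominant matrix), and the argument layout is a convention chosen here.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by Gaussian elimination (Thomas algorithm).

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# 3x3 example with 2 on the diagonal and -1 off the diagonal.
solution = solve_tridiagonal([0.0, -1.0, -1.0],
                             [2.0, 2.0, 2.0],
                             [-1.0, -1.0, 0.0],
                             [0.0, 0.0, 4.0])
print(solution)
```

The forward and backward loops each depend on the previous step, so a single system is inherently sequential. That is consistent with using this approach where many systems can be solved side by side, and a cyclic elimination scheme where parallelism within one system is needed.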
Implementation of an ADI method on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1987-01-01
In this paper the implementation of an ADI method for solving the diffusion equation on three parallel/vector computers is discussed. The computers were chosen so as to encompass a variety of architectures. They are the MPP, an SIMD machine with 16-Kbit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2, an MIMD machine with four vector processors. The Gaussian elimination algorithm is used to solve a set of tridiagonal systems on the Flex/32 and Cray/2 while the cyclic elimination algorithm is used to solve these systems on the MPP. The implementation of the method is discussed in relation to these architectures and measures of the performance on each machine are given. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally conclusions are presented.
Biomorphic architectures for autonomous Nanosat designs
NASA Technical Reports Server (NTRS)
Hasslacher, Brosl; Tilden, Mark W.
1995-01-01
Modern space tool design is the science of making a machine both massively complex while at the same time extremely robust and dependable. We propose a novel nonlinear control technique that produces capable, self-organizing, micron-scale space machines at low cost and in large numbers by parallel silicon assembly. Experiments using biomorphic architectures (with ideal space attributes) have produced a wide spectrum of survival-oriented machines that are reliably domesticated for work applications in specific environments. In particular, several one-chip satellite prototypes show interesting control properties that can be turned into numerous application-specific machines for autonomous, disposable space tasks. We believe that the real power of these architectures lies in their potential to self-assemble into larger, robust, loosely coupled structures. Assembly takes place at hierarchical space scales, with different attendant properties, allowing for inexpensive solutions to many daunting work tasks. The nature of biomorphic control, design, engineering options, and applications are discussed.
Frequency Domain Beamforming for a Deep Space Network Downlink Array
NASA Technical Reports Server (NTRS)
Navarro, Robert
2012-01-01
This paper describes a frequency domain beamformer, currently in development, to array up to 8 antennas of NASA's Deep Space Network. The objective of this array is to replace and enhance the capability of the DSN 70m antennas with multiple 34m antennas for telemetry, navigation and radio science use. The array will coherently combine the entire 500 MHz of usable bandwidth available to DSN receivers. A frequency domain beamforming architecture was chosen over a time domain based architecture to handle the large signal bandwidth and efficiently perform delay and phase calibration. The antennas of the DSN are spaced far enough apart that random atmospheric and phase variations between antennas need to be calibrated out on an ongoing basis in real-time. The calibration is done using measurements obtained from a correlator. This DSN Downlink Array expands upon a proof of concept breadboard array built previously to develop the technology and will become an operational asset of the Deep Space Network. Design parameters for frequency channelization, array calibration and delay corrections will be presented, as well as a method to efficiently calibrate the array for both wide and narrow bandwidth telemetry.
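To illustrate why delay correction is convenient in the frequency domain, here is a small generic sketch: a time delay becomes a per-bin phase rotation, so each antenna's spectrum can be aligned by a multiplication before the spectra are summed. This is not the DSN design. The plain DFT standing in for the channelizer, the circular-delay model, and all function names are assumptions made for illustration only.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (stand-in for a real channelizer)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def apply_delay(spectrum, delay_samples):
    """Delay a signal by rotating the phase of each frequency bin."""
    n = len(spectrum)
    shifted = []
    for k, value in enumerate(spectrum):
        # Use a signed bin index so upper bins count as negative frequencies.
        signed_k = k if k <= n // 2 else k - n
        shifted.append(value * cmath.exp(-2j * cmath.pi * signed_k * delay_samples / n))
    return shifted


def beamform(antenna_samples, delays):
    """Undo each antenna's known delay in the frequency domain, then average."""
    n = len(antenna_samples[0])
    combined = [0j] * n
    for samples, delay in zip(antenna_samples, delays):
        aligned = apply_delay(dft(samples), -delay)  # negative delay = advance
        combined = [c + a for c, a in zip(combined, aligned)]
    return [value.real / len(antenna_samples) for value in idft(combined)]
```

In this toy model the delays are known exactly. In the system described above they would instead come from ongoing correlator measurements.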
Two-level main memory co-design: Multi-threaded algorithmic primitives, analysis, and simulation
Bender, Michael A.; Berry, Jonathan W.; Hammond, Simon D.; ...
2017-01-03
A challenge in computer architecture is that processors often cannot be fed data from DRAM as fast as CPUs can consume it. Therefore, many applications are memory-bandwidth bound. With this motivation and the realization that traditional architectures (with all DRAM reachable only via bus) are insufficient to feed groups of modern processing units, vendors have introduced a variety of non-DDR 3D memory technologies (Hybrid Memory Cube (HMC), Wide I/O 2, High Bandwidth Memory (HBM)). These offer higher bandwidth and lower power by stacking DRAM chips on the processor or nearby on a silicon interposer. We will call these solutions “near-memory,” and if user-addressable, “scratchpad.” High-performance systems on the market now offer two levels of main memory: near-memory on package and traditional DRAM further away. In the near term we expect the latencies of near-memory and DRAM to be similar. Here, it is natural to think of near-memory as another module on the DRAM level of the memory hierarchy. Vendors are expected to offer modes in which the near-memory is used as cache, but we believe that this will be inefficient.
Efficient green lasers for high-resolution scanning micro-projector displays
NASA Astrophysics Data System (ADS)
Bhatia, Vikram; Bauco, Anthony S.; Oubei, Hassan M.; Loeber, David A. S.
2010-02-01
Laser-based projectors are gaining increased acceptance in the mobile device market due to their low power consumption, superior image quality and small size. The basic configuration of such micro-projectors is a miniature mirror that creates an image by raster scanning the collinear red, blue and green laser beams that are individually modulated on a pixel-by-pixel basis. The image resolution of these displays can be limited by the modulation bandwidth of the laser sources, and the modulation speed of the green laser has been one of the key limitations in the development of these displays. We will discuss how this limitation is fundamental to the architecture of many laser designs and then present a green laser configuration which overcomes these difficulties. In this green laser architecture, infra-red light from a distributed Bragg-reflector (DBR) laser diode undergoes conversion to green light in a waveguided second harmonic generator (SHG) crystal. The direct doubling in a single pass through the SHG crystal allows the device to operate at the large modulation bandwidth of the DBR laser. We demonstrate that the resultant product has a small footprint (<0.7 cc envelope volume), high efficiency (>9% electrical-to-optical conversion) and large modulation bandwidth (>100 MHz).
Dynamic Resource Allocation for IEEE802.16e
NASA Astrophysics Data System (ADS)
Nascimento, Alberto; Rodriguez, Jonathan
Mobile communications has witnessed an exponential increase in the number of users, services and applications. New high-bandwidth applications are targeted for B3G networks, raising more stringent requirements for Dynamic Resource Allocation (DRA) architectures and packet schedulers, which must be spectrum efficient and deliver QoS for heterogeneous applications and services. In this paper we propose a new cross-layer-based architecture framework embedded in a newly designed DRA architecture for the Mobile WiMAX standard. System level simulation results show that the proposed architecture can be considered a viable candidate solution for supporting mixed services in a cost-effective manner, in contrast to existing approaches.
Hardware architecture design of a fast global motion estimation method
NASA Astrophysics Data System (ADS)
Liang, Chaobing; Sang, Hongshi; Shen, Xubang
2015-12-01
VLSI implementation of gradient-based global motion estimation (GME) faces two main challenges: irregular data access and high off-chip memory bandwidth requirement. We previously proposed a fast GME method that reduces computational complexity by choosing a certain number of small patches containing corners and using them in a gradient-based framework. A hardware architecture is designed to implement this method and further reduce the off-chip memory bandwidth requirement. On-chip memories are used to store coordinates of the corners and template patches, while the Gaussian pyramids of both the template and reference frame are stored in off-chip SDRAMs. By performing the geometric transform only on the coordinates of the center pixel of a 3-by-3 patch in the template image, a 5-by-5 area containing the warped 3-by-3 patch in the reference image is extracted from the SDRAMs by burst read. Patch-based and burst-mode data access helps to keep the off-chip memory bandwidth requirement at a minimum. Although patch size varies at different pyramid levels, all patches are processed in terms of 3x3 patches, so the utilization of the patch-processing circuit reaches 100%. FPGA implementation results show that the design utilizes 24,080 bits of on-chip memory and, for a sequence with a resolution of 352x288 and a frequency of 60Hz, the off-chip bandwidth requirement is only 3.96Mbyte/s, compared with 243.84Mbyte/s for the original gradient-based GME method. This design can be used in applications like video codec, video stabilization, and super-resolution, where real-time GME is a necessity and a minimum memory bandwidth requirement is appreciated.
Recursive computer architecture for VLSI
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treleaven, P.C.; Hopkins, R.P.
1982-01-01
A general-purpose computer architecture based on the concept of recursion and suitable for VLSI computer systems built from replicated (lego-like) computing elements is presented. The recursive computer architecture is defined by presenting a program organisation, a machine organisation and an experimental machine implementation oriented to VLSI. The experimental implementation is being restricted to simple, identical microcomputers each containing a memory, a processor and a communications capability. This future generation of lego-like computer systems is termed fifth generation computers by the Japanese. 30 references.
NASA Technical Reports Server (NTRS)
Stevens, H. D.; Miles, E. S.; Rock, S. J.; Cannon, R. H.
1994-01-01
Expanding man's presence in space requires dexterous robots capable of being controlled from the Earth. Traditional 'hand-in-glove' control paradigms require the human operator to directly control virtually every aspect of the robot's operation. While the human provides excellent judgment and perception, human interaction is limited by low bandwidth, delayed communications. These delays make 'hand-in-glove' operation from Earth impractical. In order to alleviate many of the problems inherent to remote operation, Stanford University's Aerospace Robotics Laboratory (ARL) has developed the Object-Based Task-Level Control architecture. Object-Based Task-Level Control (OBTLC) removes the burden of teleoperation from the human operator and enables execution of tasks not possible with current techniques. OBTLC is a hierarchical approach to control where the human operator is able to specify high-level, object-related tasks through an intuitive graphical user interface. Infrequent task-level commands replace constant joystick operations, eliminating communications bandwidth and time delay problems. The details of robot control and task execution are handled entirely by the robot and computer control system. The ARL has implemented the OBTLC architecture on a set of Free-Flying Space Robots. The capability of the OBTLC architecture has been demonstrated by controlling the ARL Free-Flying Space Robots from NASA Ames Research Center.
NASA Technical Reports Server (NTRS)
Fatoohi, Rod; Saini, Subbash; Ciotti, Robert
2006-01-01
We study the performance of inter-process communication on four high-speed multiprocessor systems using a set of communication benchmarks. The goal is to identify certain limiting factors and bottlenecks with the interconnect of these systems as well as to compare these interconnects. We measured network bandwidth using different numbers of communicating processors and communication patterns, such as point-to-point communication, collective communication, and dense communication patterns. The four platforms are: a 512-processor SGI Altix 3700 BX2 shared-memory machine with 3.2 GB/s links; a 64-processor (single-streaming) Cray X1 shared-memory machine with 32 1.6 GB/s links; a 128-processor Cray Opteron cluster using a Myrinet network; and a 1280-node Dell PowerEdge cluster with an InfiniBand network. Our results show the impact of the network bandwidth and topology on the overall performance of each interconnect.
NASA Astrophysics Data System (ADS)
He, Huimin; Liu, Fengman; Li, Baoxia; Xue, Haiyun; Wang, Haidong; Qiu, Delong; Zhou, Yunyan; Cao, Liqiang
2016-11-01
With the development of the multicore processor, the bandwidth and capacity of the memory, rather than the memory area, are the key factors in server performance. At present, however, the new architectures, such as fully buffered DIMM (FBDIMM), hybrid memory cube (HMC), and high bandwidth memory (HBM), cannot be commercially applied in the server. Therefore, a new architecture for the server is proposed. CPU and memory are separated onto different boards, and optical interconnection is used for the communication between them. Each optical module corresponds to each dual inline memory module (DIMM) with 64 channels. Compared to the previous technology, not only can the architecture realize high-capacity and wide-bandwidth memory, it also can reduce power consumption and cost, and be compatible with the existing dynamic random access memory (DRAM). In this article, the proposed module with system-in-package (SiP) integration is demonstrated. The optical module includes a silicon photonic chip, a promising technology for next-generation data exchanging centers. Because of the bandwidth-distance performance of the optical interconnection, SerDes chips are introduced to convert the 64-bit data at 800 Mbps from/to 4-channel data at 12.8 Gbps after/before they are transmitted through optical fiber. All the devices are packaged on cheap organic substrates. To ensure the performance of the whole system, several optimization efforts have been performed on the two modules. High-speed interconnection traces have been designed and simulated with electromagnetic simulation software. Steady-state thermal characteristics of the transceiver module have been evaluated by ANSYS APDL based on finite-element methodology (FEM). Heat sinks are placed at the hotspot area to ensure the reliability of all working chips. Finally, this transceiver system based on silicon photonics is measured, and the eye diagrams of data and clock signals are verified.
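The SerDes figures quoted above can be checked with simple arithmetic. The sketch below assumes that "64-bit data at 800 Mbps" means 64 parallel lanes each running at 800 Mbps, and that each of the 4 serial channels carries 12.8 Gbps; the constant and function names are illustrative, not from the article.

```python
# Figures from the abstract, expressed in Mbps to keep the arithmetic exact.
PARALLEL_LANES = 64          # width of the parallel data bus (bits)
PARALLEL_RATE_MBPS = 800     # assumed per-lane rate
SERIAL_CHANNELS = 4          # optical channels after serialization
SERIAL_RATE_MBPS = 12_800    # 12.8 Gbps per serial channel


def aggregate_mbps(lanes, rate_mbps):
    """Total throughput of `lanes` parallel links at `rate_mbps` each."""
    return lanes * rate_mbps


parallel_total = aggregate_mbps(PARALLEL_LANES, PARALLEL_RATE_MBPS)
serial_total = aggregate_mbps(SERIAL_CHANNELS, SERIAL_RATE_MBPS)

# Each serial channel multiplexes this many parallel lanes.
lanes_per_channel = PARALLEL_LANES // SERIAL_CHANNELS
```

Under these assumptions both sides carry 51,200 Mbps (51.2 Gbps), with 16 parallel lanes multiplexed onto each serial channel, so the conversion preserves the aggregate payload rate. Any line-coding overhead is ignored here.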
Functional language and data flow architectures
NASA Technical Reports Server (NTRS)
Ercegovac, M. D.; Patel, D. R.; Lang, T.
1983-01-01
This is a tutorial article about language and architecture approaches for highly concurrent computer systems based on the functional style of programming. The discussion concentrates on the basic aspects of functional languages, and sequencing models such as data-flow, demand-driven and reduction which are essential at the machine organization level. Several examples of highly concurrent machines are described.
Submicron Systems Architecture Project
1981-11-01
This project is concerned with the architecture, design, and testing of VLSI Systems. The principal activities in this report period include: The Tree Machine; COPE, The Homogeneous Machine; Computational Arrays; Switch-Level Model for MOS Logic Design; Testing; Local Network and Designer Workstations; Self-timed Systems; Characterization of Deadlock Free Resource Contention; Concurrency Algebra; Language Design and Logic for Program Verification.
Broadband locally resonant metamaterials with graded hierarchical architecture
NASA Astrophysics Data System (ADS)
Liu, Chenchen; Reina, Celia
2018-03-01
We investigate the effect of hierarchical designs on the bandgap structure of periodic lattice systems with inner resonators. A detailed parameter study reveals various interesting features of structures with two levels of hierarchy as compared with one level systems with identical static mass. In particular: (i) their overall bandwidth is approximately equal, yet bounded above by the bandwidth of the single-resonator system; (ii) the number of bandgaps increases with the level of hierarchy; and (iii) the spectrum of bandgap frequencies is also enlarged. Taking advantage of these features, we propose graded hierarchical structures with ultra-broadband properties. These designs are validated over analogous continuum models via finite element simulations, demonstrating their capability to overcome the bandwidth narrowness that is typical of resonant metamaterials.
NASA Astrophysics Data System (ADS)
Zhao, Yongli; Li, Yajie; Wang, Xinbo; Chen, Bowen; Zhang, Jie
2016-09-01
A hierarchical software-defined networking (SDN) control architecture is designed for multi-domain optical networks with the OpenDaylight (ODL) controller. The OpenFlow-based Control Virtual Network Interface (CVNI) protocol is deployed between the network orchestrator and the domain controllers. Then, a dynamic bandwidth on demand (BoD) provisioning solution is proposed based on time scheduling in software-defined multi-domain optical networks (SD-MDON). Shared Risk Link Groups (SRLG)-disjoint routing schemes are adopted to separate each tenant for reliability. The SD-MDON testbed is built based on the proposed hierarchical control architecture. Then the proposed time scheduling-based BoD (Ts-BoD) solution is experimentally demonstrated on the testbed. The performance of the Ts-BoD solution is evaluated with respect to blocking probability, resource utilization, and lightpath setup latency.
Designing a VMEbus FDDI adapter card
NASA Astrophysics Data System (ADS)
Venkataraman, Raman
1992-03-01
This paper presents a system architecture for a VMEbus FDDI adapter card containing a node core, FDDI block, frame buffer memory and system interface unit. Most of the functions of the PHY and MAC layers of FDDI are implemented with National's FDDI chip set, and the SMT implementation is simplified with a low cost microcontroller. The factors that influence the system bus bandwidth utilization and FDDI bandwidth utilization are the data path and frame buffer memory architecture. The VRAM based frame buffer memory has two sections: LLC frame memory and SMT frame memory. Each section, with an independent serial access memory (SAM) port, provides independent access after the initial data transfer cycle on the main port, and hence the throughput is maximized on each port of the memory. The SAM port simplifies the system bus master DMA design, and the VMEbus interface can be designed with low-cost off-the-shelf interface chips.
Cooperating reduction machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kluge, W.E.
1983-11-01
This paper presents a concept and a system architecture for the concurrent execution of program expressions of a concrete reduction language based on lambda-expressions. If formulated appropriately, these expressions are well-suited for concurrent execution, following a demand-driven model of computation. In particular, recursive program expressions with nonlinear expansion may, at run time, recursively be partitioned into a hierarchy of independent subexpressions which can be reduced by a corresponding hierarchy of virtual reduction machines. This hierarchy unfolds and collapses dynamically, with virtual machines recursively assuming the role of masters that create and eventually terminate, or synchronize with, slaves. The paper also proposes a nonhierarchically organized system of reduction machines, each featuring a stack architecture, that effectively supports the allocation of virtual machines to the real machines of the system in compliance with their hierarchical order of creation and termination. 25 references.
An Application Server for Scientific Collaboration
NASA Astrophysics Data System (ADS)
Cary, John R.; Luetkemeyer, Kelly G.
1998-11-01
Tech-X Corporation has developed SciChat, an application server for scientific collaboration. Connections are made to the server through a Java client, that can either be an application or an applet served in a web page. Once connected, the client may choose to start or join a session. A session includes not only other clients, but also an application. Any client can send a command to the application. This command is executed on the server and echoed to all clients. The results of the command, whether numerical or graphical, are then distributed to all of the clients; thus, multiple clients can interact collaboratively with a single application. The client is developed in Java, the server in C++, and the middleware is the Common Object Request Broker Architecture. In this system, the Graphical User Interface processing is on the client machine, so one does not have the disadvantages of insufficient bandwidth as occurs when running X over the internet. Because the server, client, and middleware are object oriented, new types of servers and clients specialized to particular scientific applications are more easily developed.
Knowledge Reasoning with Semantic Data for Real-Time Data Processing in Smart Factory
Wang, Shiyong; Li, Di; Liu, Chengliang
2018-01-01
The application of high-bandwidth networks and cloud computing in manufacturing systems will be followed by mass data. Industrial data analysis plays important roles in condition monitoring, performance optimization, flexibility, and transparency of the manufacturing system. However, the currently existing architectures are mainly for offline data analysis, not suitable for real-time data processing. In this paper, we first define the smart factory as a cloud-assisted and self-organized manufacturing system in which physical entities such as machines, conveyors, and products organize production through intelligent negotiation and the cloud supervises this self-organized process for fault detection and troubleshooting based on data analysis. Then, we propose a scheme to integrate knowledge reasoning and semantic data where the reasoning engine processes the ontology model with real time semantic data coming from the production process. Based on these ideas, we build a benchmarking system for smart candy packing application that supports direct consumer customization and flexible hybrid production, and the data are collected and processed in real time for fault diagnosis and statistical analysis. PMID:29415444
NASA Astrophysics Data System (ADS)
Zhang, Huibin; Wang, Yuqiao; Chen, Haoran; Zhao, Yongli; Zhang, Jie
2017-12-01
In software defined optical networks (SDON), the centralized control plane may encounter numerous intrusion threats which compromise the security level of provisioned services. In this paper, the issue of control plane security is studied and two machine-learning-based control plane intrusion detection techniques are proposed for SDON with properly selected features such as bandwidth, route length, etc. We validate the feasibility and efficiency of the proposed techniques by simulations. Results show that an accuracy of 83% for intrusion detection can be achieved with the proposed machine-learning-based control plane intrusion detection techniques.
High Performance Databases For Scientific Applications
NASA Technical Reports Server (NTRS)
French, James C.; Grimshaw, Andrew S.
1997-01-01
The goal for this task is to develop an Extensible File System (ELFS). ELFS addresses the following problems: 1. Providing high bandwidth performance architectures; 2. Reducing the cognitive burden faced by applications programmers when they attempt to optimize; and 3. Seamlessly managing the proliferation of data formats and architectural differences. The ELFS approach consists of language and run-time system support that permits the specification of a hierarchy of file classes.
Geotechnical engineering practices in Canada and Europe
DOT National Transportation Integrated Search
2011-12-01
This report describes Machine-to-Machine service architecture and how it is evolving over the next several years. Nearly 50 billion Machine-to-Machine (M2M) devices are predicted to be deployed by all sectors by 2025. The largest impediment to M2M de...
Bandwidth increasing mechanism by introducing a curve fixture to the cantilever generator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Weiqun, E-mail: weiqunliu@home.swjtu.edu.cn; Liu, Congzhi; Ren, Bingyu
2016-07-25
A nonlinear wideband generator architecture by clamping the cantilever beam generator with a curve fixture is proposed. Devices with different nonlinear stiffness can be obtained by properly choosing the fixture curve according to the design requirements. Three available generator types are presented and discussed for polynomial curves. Experimental investigations show that the proposed mechanism effectively extends the operation bandwidth with good power performance. Especially, the simplicity and easy feasibility allow the mechanism to be widely applied for vibration generators in different scales and environments.
Software architecture for time-constrained machine vision applications
NASA Astrophysics Data System (ADS)
Usamentiaga, Rubén; Molleda, Julio; García, Daniel F.; Bulnes, Francisco G.
2013-01-01
Real-time image and video processing applications require skilled architects, and recent trends in the hardware platform make the design and implementation of these applications increasingly complex. Many frameworks and libraries have been proposed or commercialized to simplify the design and tuning of real-time image processing applications. However, they tend to lack flexibility, because they are normally oriented toward particular types of applications, or they impose specific data processing models such as the pipeline. Other issues include large memory footprints, difficulty for reuse, and inefficient execution on multicore processors. We present a novel software architecture for time-constrained machine vision applications that addresses these issues. The architecture is divided into three layers. The platform abstraction layer provides a high-level application programming interface for the rest of the architecture. The messaging layer provides a message-passing interface based on a dynamic publish/subscribe pattern. A topic-based filtering in which messages are published to topics is used to route the messages from the publishers to the subscribers interested in a particular type of message. The application layer provides a repository for reusable application modules designed for machine vision applications. These modules, which include acquisition, visualization, communication, user interface, and data processing, take advantage of the power of well-known libraries such as OpenCV, Intel IPP, or CUDA. Finally, the proposed architecture is applied to a real machine vision application: a jam detector for steel pickling lines.
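The messaging layer described above routes messages by topic from publishers to interested subscribers. A minimal in-process sketch of that pattern follows. The class `MessageBus` and its method names are hypothetical and are not the API of the architecture in the paper; a real implementation would also need threading and timing guarantees, which are omitted here.

```python
from collections import defaultdict


class MessageBus:
    """Tiny topic-based publish/subscribe dispatcher (illustrative only)."""

    def __init__(self):
        # Maps each topic name to the callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register `callback` to receive every message published to `topic`."""
        self._subscribers[topic].append(callback)

    def unsubscribe(self, topic, callback):
        """Remove a previously registered callback from `topic`."""
        self._subscribers[topic].remove(callback)

    def publish(self, topic, message):
        """Deliver `message` to all subscribers of `topic`; return the count."""
        # Copy the list so callbacks may subscribe or unsubscribe while running.
        callbacks = list(self._subscribers.get(topic, ()))
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

For example, an acquisition module could publish frames to a `"frame"` topic while visualization and processing modules subscribe to it. Publishers and subscribers then only share topic names, which is what keeps application modules reusable.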
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vineyard, Craig Michael; Verzi, Stephen Joseph
As high performance computing architectures pursue more computational power, there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an open challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular, we explored neurogenesis inspired resource allocation, and were able to show that a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.
VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures
Moreland, Kenneth; Sewell, Christopher; Usher, William; ...
2016-05-09
Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
VTK-m: Accelerating the Visualization Toolkit for Massively Threaded Architectures
Moreland, Kenneth; Sewell, Christopher; Usher, William; ...
2016-05-09
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
Engineering molecular machines
NASA Astrophysics Data System (ADS)
Erman, Burak
2016-04-01
Biological molecular motors use chemical energy, mostly in the form of ATP hydrolysis, and convert it to mechanical energy. Correlated thermal fluctuations are essential for the function of a molecular machine and it is the hydrolysis of ATP that modifies the correlated fluctuations of the system. Correlations are consequences of the molecular architecture of the protein. The idea that synthetic molecular machines may be constructed by designing the proper molecular architecture is challenging. In their paper, Sarkar et al (2016 New J. Phys. 18 043006) propose a synthetic molecular motor based on the coarse grained elastic network model of proteins and show by numerical simulations that motor function is realized, ranging from deterministic to thermal, depending on temperature. This work opens up a new range of possibilities of molecular architecture based engine design.
Silicon Nanophotonics for Many-Core On-Chip Networks
NASA Astrophysics Data System (ADS)
Mohamed, Moustafa
The number of cores in many-core architectures is scaling to unprecedented levels, requiring ever-increasing communication capacity. Traditionally, architects have pursued higher throughput at the expense of latency. This trend has become problematic for performance in many-core architectures. Moreover, power consumption is increasing with system scaling, mandating nontraditional solutions. Nanophotonics can address these problems, offering benefits in the three frontiers of many-core processor design: latency, bandwidth, and power. Nanophotonics leverages circuit-switching flow control, allowing low latency; in addition, the power consumption of optical links is significantly lower than that of their electrical counterparts at intermediate and long link lengths. Finally, through wavelength-division multiplexing, we can sustain high bandwidth without sacrificing throughput. This thesis focuses on realizing nanophotonics for communication in many-core architectures at different design levels, considering the reliability challenges that our fabrication and measurements reveal. First, we study how to design on-chip networks for low latency, low power, and high bandwidth by exploiting the full potential of nanophotonics. The design process considers device level limitations and capabilities on one hand, and system level demands in terms of power and performance on the other hand. The design involves the choice of devices, designing the optical link, the topology, the arbitration technique, and the routing mechanism. Next, we address the problem of reliability in on-chip networks. Reliability problems not only degrade performance but can also block communication. Hence, we propose a reliability-aware design flow and present a reliability management technique based on this flow to address reliability in the system. In the proposed flow, reliability is modeled and analyzed at the device, architecture, and system level. Our reliability management technique is superior to existing solutions in terms of power and performance. In fact, our solution can scale to a thousand cores with low overhead.
NASA Astrophysics Data System (ADS)
Ko, Tony H.; Hartl, Ingmar; Drexler, Wolfgang; Ghanta, Ravi K.; Fujimoto, James G.
2002-06-01
Quantitative, three-dimensional mapping of retinal architectural morphology was achieved using an ultrahigh resolution ophthalmic OCT system. This OCT system utilizes a broad bandwidth titanium-sapphire laser light source generating bandwidths of up to 300 nm near 800 nm center wavelength. The system enables real-time cross-sectional imaging of the retina with ~3 micrometers axial resolution. The macula and the papillomacular axis of a normal human subject were systematically mapped using a series of linear scans. Edge detection and segmentation algorithms were developed to quantify retinal and intraretinal thicknesses. Topographic mapping of the total retinal thickness and the total ganglion cell/inner plexiform layer thickness was achieved around the macula. A topographic mapping quantifying the progressive thickening of the nerve fiber layer (NFL) nasally approaching the optic disk was also demonstrated. The ability to create three-dimensional topographic mapping of retinal architectural morphology at ~3 micrometers axial resolution will be relevant for the diagnosis of many retinal diseases. The topographic quantification of these structures can serve as a powerful tool for developing algorithms and clinical scanning protocols for the screening and staging of ophthalmic diseases such as glaucoma.
Yang, Hui; Zhang, Jie; Zhao, Yongli; Ji, Yuefeng; Li, Hui; Lin, Yi; Li, Gang; Han, Jianrui; Lee, Young; Ma, Teng
2014-07-28
Data center interconnection with elastic optical networks is a promising scenario to meet the high burstiness and high-bandwidth requirements of data center services. We previously implemented enhanced software defined networking over elastic optical network for data center application [Opt. Express 21, 26990 (2013)]. Building on that work, this study considers time-aware data center service scheduling with elastic service time and service bandwidth according to various time sensitivity requirements. A novel time-aware enhanced software defined networking (TeSDN) architecture for elastic data center optical interconnection is proposed in this paper by introducing a time-aware resources scheduling (TaRS) scheme. TeSDN can accommodate data center services with the required QoS considering the time dimension, and can enhance cross-stratum optimization of application and elastic optical network stratum resources based on spectrum elasticity, application elasticity, and time elasticity. The overall feasibility and efficiency of the proposed architecture are experimentally verified on our OpenFlow-based testbed. The performance of the TaRS scheme under a heavy traffic load scenario is also quantitatively evaluated on the TeSDN architecture in terms of blocking probability and resource occupation rate.
Scaling Support Vector Machines On Modern HPC Platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Fu, Haohuan; Song, Shuaiwen
2015-02-01
We designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multicore and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi co-processor (MIC). We propose various novel analysis methods and optimization techniques to fully utilize the multilevel parallelism provided by these architectures and serve as general optimization methods for other machine learning tools.
Application-oriented integrated control center (AICC) for heterogeneous optical networks
NASA Astrophysics Data System (ADS)
Zhao, Yongli; Zhang, Jie; Cao, Xuping; Wang, Dajiang; Wu, Koubo; Cai, Yinxiang; Gu, Wanyi
2011-12-01
Various broad-bandwidth services, such as data center applications and cloud computing, have been consuming the bandwidth resources of optical networks. Challenges remain for future optical networks even though the available bandwidth is increasing with the development of transmission technologies. The relationship between the upper application layer and the lower network resource layer needs further research. In order to improve the efficiency of network resources and the capability of service provisioning, heterogeneous optical network resources can be abstracted as unified Application Programming Interfaces (APIs), which can be opened to various upper applications through the Application-oriented Integrated Control Center (AICC) proposed in the paper. A novel OpenFlow-based unified control architecture is proposed for the optimization of cross-layer resources. Numerical results from simulation experiments show good performance of AICC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brandt, James M.; Devine, Karen Dragon; Gentile, Ann C.
2014-09-01
As computer systems grow in both size and complexity, the need for applications and run-time systems to adjust to their dynamic environment also grows. The goal of the RAAMP LDRD was to combine static architecture information and real-time system state with algorithms to conserve power, reduce communication costs, and avoid network contention. We developed new data collection and aggregation tools to extract static hardware information (e.g., node/core hierarchy, network routing) as well as real-time performance data (e.g., CPU utilization, power consumption, memory bandwidth saturation, percentage of used bandwidth, number of network stalls). We created application interfaces that allowed this data to be used easily by algorithms. Finally, we demonstrated the benefit of integrating system and application information for two use cases. The first used real-time power consumption and memory bandwidth saturation data to throttle concurrency to save power without increasing application execution time. The second used static or real-time network traffic information to reduce or avoid network congestion by remapping MPI tasks to allocated processors. Results from our work are summarized in this report; more details are available in our publications [2, 6, 14, 16, 22, 29, 38, 44, 51, 54].
A new multifunction acousto-optic signal processor
NASA Technical Reports Server (NTRS)
Berg, N. J.; Casseday, M. W.; Filipov, A. N.; Pellegrino, J. M.
1984-01-01
An acousto-optic architecture for simultaneously obtaining time integration correlation and high-speed power spectrum analysis was constructed using commercially available TeO2 modulators and photodiode detector arrays. The correlator section of the processor uses coherent interferometry to attain maximum bandwidth and dynamic range while achieving a time-bandwidth product of 1 million. Two correlator outputs are achieved in this system configuration. One is optically filtered and magnified 2:1 to decrease the spatial frequency to a level where a 25-MHz bandwidth may be sampled by a 62-mm array with elements on 25-micrometer centers. The other output is magnified by a factor of 10 such that the center 4 microseconds of information is available for estimation of time-difference-of-arrival to within 10 ns. The Bragg cell spectrum-analyzer section, which also has two outputs, resolves a 25-MHz instantaneous bandwidth to 25 kHz and can determine discrete-frequency reception time to within 15 microseconds. A microprocessor combines spectrum analysis information with that obtained from the correlator.
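The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are ours, the array pitch is assumed to be 25 micrometers, and the time-bandwidth product is assumed to refer to the 25-MHz bandwidth.

```python
# Back-of-the-envelope checks on numbers quoted in the abstract.
bandwidth_hz = 25e6      # instantaneous bandwidth of the spectrum analyzer
resolution_hz = 25e3     # quoted frequency resolution

# Number of resolvable frequency channels across the band.
channels = round(bandwidth_hz / resolution_hz)

# Integration time implied by a time-bandwidth product of 1 million,
# ASSUMING the product is taken at the 25 MHz bandwidth.
time_bandwidth_product = 1e6
integration_time_s = time_bandwidth_product / bandwidth_hz

# Detector elements in a 62 mm array, ASSUMING a 25 micrometer pitch.
elements = round(62e-3 / 25e-6)

print(channels, integration_time_s, elements)
```

Under these assumptions the analyzer resolves the band into about a thousand channels, and the array has a few thousand elements available to sample the correlator output.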
2009-04-22
bandwidth and response times. Forrester Research uses the analogy of a consumer using an automated teller machine to explain how technical SLAs should...be crafted. “It’s not enough that you put your card and Personal Identification Number (PIN) [in the machine ] and request to withdraw cash...IRR) Net Present Value (NPV) Other Relevant Metrics Payback Period Cost/Benefit Ratio Cost, Economic, and/or Financial Analysis Yes Yes Yes
Automated Discovery of Machine-Specific Code Improvements
1984-12-01
operation of the source language. Additional analysis may reveal special features of the target architecture that may be exploited to generate efficient...Additional analysis may reveal special features of the target architecture that may be exploited to generate efficient code. Such analysis is optional...incorporate knowledge of the source language, but do not refer to features of the target machine. These early phases are sometimes referred to as the
NASA Astrophysics Data System (ADS)
Zheng, Jun; Ansari, Nirwan
2005-06-01
Call for Papers: Optical Access Networks With the wide deployment of fiber-optic technology over the past two decades, we have witnessed a tremendous growth of bandwidth capacity in the backbone networks of today's telecommunications infrastructure. However, access networks, which cover the "last-mile" areas and serve numerous residential and small business users, have not been scaled up commensurately. The local subscriber lines for telephone and cable television are still using twisted pairs and coaxial cables. Most residential connections to the Internet are still through dial-up modems operating at a low speed on twisted pairs. As the demand for access bandwidth increases with emerging high-bandwidth applications, such as distance learning, high-definition television (HDTV), and video on demand (VoD), the last-mile access networks have become a bandwidth bottleneck in today's telecommunications infrastructure. To ease this bottleneck, it is imperative to provide sufficient bandwidth capacity in the access networks to open the bottleneck and thus present more opportunities for the provisioning of multiservices. Optical access solutions promise huge bandwidth to service providers and low-cost high-bandwidth services to end users and are therefore widely considered the technology of choice for next-generation access networks. To realize the vision of optical access networks, however, many key issues still need to be addressed, such as network architectures, signaling protocols, and implementation standards. The major challenges lie in the fact that an optical solution must be not only robust, scalable, and flexible, but also implemented at a low cost comparable to that of existing access solutions in order to increase the economic viability of many potential high-bandwidth applications. In recent years, optical access networks have been receiving tremendous attention from both academia and industry. 
A large number of research activities have been carried out or are now underway in this hot area. The purpose of this feature issue is to expose the networking community to the latest research breakthroughs and progress in the area of optical access networks. This feature issue aims to present a collection of papers that focus on state-of-the-art research in various networking aspects of optical access networks. Original papers are solicited from all researchers involved in the area of optical access networks. Topics of interest include, but are not limited to: optical access network architectures and protocols; passive optical networks (BPON, EPON, GPON, etc.); active optical networks; multiple access control; multiservices and QoS provisioning; network survivability; field trials and standards; performance modeling and analysis.
Acoustic transient classification with a template correlation processor.
Edwards, R T
1999-10-01
I present an architecture for acoustic pattern classification using trinary-trinary template correlation. In spite of its computational simplicity, the algorithm and architecture represent a method which greatly reduces bandwidth of the input, storage requirements of the classifier memory, and power consumption of the system without compromising classification accuracy. The linear system should be amenable to training using recently-developed methods such as Independent Component Analysis (ICA), and we predict that behavior will be qualitatively similar to that of structures in the auditory cortex.
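A minimal sketch of what a trinary-trinary template correlation could look like in software is given below. It is not the author's circuit: the threshold quantizer, the template values, and the names `to_trinary`, `correlate`, and `classify` are all hypothetical and chosen only to illustrate why the arithmetic is cheap when both operands take values in {-1, 0, +1}.

```python
def to_trinary(values, threshold):
    """Quantize real values to -1, 0, or +1 using a symmetric threshold."""
    return [1 if v > threshold else -1 if v < -threshold else 0 for v in values]


def correlate(signal, template):
    """Trinary-trinary correlation: every product is -1, 0, or +1,
    so the sum reduces to counting agreements minus disagreements."""
    return sum(s * t for s, t in zip(signal, template))


def classify(signal, templates):
    """Return the label of the template with the highest correlation."""
    return max(templates, key=lambda label: correlate(signal, templates[label]))


# Hypothetical templates and input, for illustration only.
templates = {
    "click": [1, 1, -1, -1, 0, 0],
    "chirp": [-1, 0, 1, 0, -1, 1],
}
raw = [0.9, 0.4, -0.7, -0.8, 0.05, -0.1]
signal = to_trinary(raw, threshold=0.2)
print(signal, classify(signal, templates))
```

Because each stored template entry needs less than two bits and each product needs no multiplier, this kind of scheme plausibly accounts for the storage and power savings the abstract describes.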
Low-Power Architectures for Large Radio Astronomy Correlators
NASA Technical Reports Server (NTRS)
D'Addario, Larry R.
2011-01-01
The architecture of a cross-correlator for a synthesis radio telescope with N greater than 1000 antennas is studied with the objective of minimizing power consumption. It is found that the optimum architecture minimizes memory operations, and this implies preference for a matrix structure over a pipeline structure and avoiding the use of memory banks as accumulation registers when sharing multiply-accumulators among baselines. A straw-man design for N = 2000 and bandwidth of 1 GHz, based on ASICs fabricated in a 90 nm CMOS process, is presented. The cross-correlator proper (excluding per-antenna processing) is estimated to consume less than 35 kW.
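The scale of the straw-man design can be illustrated with a rough calculation. The sketch below assumes one complex multiply-accumulate (CMAC) per baseline per hertz of bandwidth and ignores polarization products and autocorrelations; these simplifications, and all names in the code, are ours rather than the report's.

```python
def baselines(n_antennas):
    """Number of distinct antenna pairs (cross-correlation baselines)."""
    return n_antennas * (n_antennas - 1) // 2


n = 2000                 # antennas in the straw-man design
bandwidth_hz = 1e9       # processed bandwidth
power_w = 35e3           # quoted upper bound on correlator power

pairs = baselines(n)

# ASSUMPTION: one complex multiply-accumulate per baseline per Hz.
cmac_per_s = pairs * bandwidth_hz

# Implied upper bound on energy per operation under that assumption.
energy_per_cmac_j = power_w / cmac_per_s

print(pairs, cmac_per_s, energy_per_cmac_j)
```

Under these assumptions the budget works out to a few tens of picojoules per operation at most, which suggests why avoiding memory operations matters so much in the design.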
Open multi-agent control architecture to support virtual-reality-based man-machine interfaces
NASA Astrophysics Data System (ADS)
Freund, Eckhard; Rossmann, Juergen; Brasch, Marcel
2001-10-01
Projective Virtual Reality is a new and promising approach to intuitively operable man-machine interfaces for the commanding and supervision of complex automation systems. The user interface part of Projective Virtual Reality builds heavily on the latest Virtual Reality techniques, a task deduction component, and automatic action planning capabilities. In order to realize man-machine interfaces for complex applications, not only the Virtual Reality part has to be considered; the capabilities of the underlying robot and automation controller are also of great importance. This paper presents a control architecture that has proved to be an ideal basis for the realization of complex robotic and automation systems that are controlled by Virtual Reality based man-machine interfaces. The architecture not only provides a well-suited framework for the real-time control of a multi-robot system but also supports Virtual Reality metaphors and augmentations that make it easier for the user to command and supervise a complex system. The developed control architecture has already been used for a number of applications. Its capability to integrate, in real time, sensor information from sensors at different levels of abstraction helps to make the realized automation system very responsive to real-world changes. In this paper, the architecture is described comprehensively, its main building blocks are discussed, and one realization built on an open source real-time operating system is presented. The software design and the features of the architecture that make it generally applicable to the distributed control of automation agents in real-world applications are explained. Its application to the commanding and control of experiments in the Columbus space laboratory, the European contribution to the International Space Station (ISS), is described as one example.
An implementation of a reference symbol approach to generic modulation in fading channels
NASA Technical Reports Server (NTRS)
Young, R. J.; Lodge, J. H.; Pacola, L. C.
1990-01-01
As mobile satellite communications systems evolve over the next decade, they will have to adapt to a changing tradeoff between bandwidth and power. This paper presents a flexible approach to digital modulation and coding that will accommodate both wideband and narrowband schemes. This architecture could be the basis for a family of modems, each satisfying a specific power and bandwidth constraint, yet all having a large number of common signal processing blocks. The implementation of this generic approach, with general purpose digital processors for transmission of 4.8 kilobits per sec. digitally encoded speech, is described.
Investigation of voltage source designs for Electrical Impedance Mammography (EIM) Systems.
Qureshi, Tabassum R; Chatwin, Chris R; Zhou, Zhou; Li, Nan; Wang, W
2012-01-01
According to Jossinet, the interesting characteristics of breast tissues mostly lie above 1 MHz; therefore a wideband excitation source covering higher frequencies (i.e. above 1 MHz) is required. The main objective of this research is to establish a feasible bandwidth envelope that can be used to design a constant EIM voltage source over a wide bandwidth with low output impedance for practical implementation. An excitation source is one of the major components in bio-impedance measurement systems. In any bio-impedance measurement system, excitation can be achieved either by injecting current and measuring the resulting voltages, or by applying voltages and measuring the current developed. This paper describes three voltage source architectures and, based on a comparison of their bandwidths, proposes a differential voltage-controlled voltage source (VCVS) that can be used over a wide bandwidth (>15 MHz). It also describes the performance of the designed EIM voltage source for different load conditions and load capacitances, reporting a signal-to-noise ratio of approximately 90 dB at 10 MHz, the signal phase, and a maximum source output impedance of 4.75 kΩ at 10 MHz. Optimum data obtained using Pspice® are used to demonstrate the high-bandwidth performance of the source.
Evaluation of architectures for an ASP MPEG-4 decoder using a system-level design methodology
NASA Astrophysics Data System (ADS)
Garcia, Luz; Reyes, Victor; Barreto, Dacil; Marrero, Gustavo; Bautista, Tomas; Nunez, Antonio
2005-06-01
Trends in multimedia consumer electronics, digital video and audio, aim to reach users through low-cost mobile devices connected to data broadcasting networks with limited bandwidth. An emergent broadcasting network is the digital audio broadcasting (DAB) network, which provides CD-quality audio transmission together with robustness and efficiency techniques that allow good-quality reception while in motion. This paper focuses on the system-level evaluation of different architectural options to allow low-bandwidth digital video reception over DAB, based on video compression techniques. Profiling and design space exploration techniques are applied to the ASP MPEG-4 decoder in order to find the best HW/SW partition given the application and platform constraints. An innovative SystemC-based system-level design tool, called CASSE, is used for modelling, exploration, and evaluation of different ASP MPEG-4 decoder HW/SW partitions. System-level trade-offs and quantitative data derived from this analysis are also presented in this work.
Experimental demonstration of spectrum-sliced elastic optical path network (SLICE).
Kozicki, Bartłomiej; Takara, Hidehiko; Tsukishima, Yukio; Yoshimatsu, Toshihide; Yonenaga, Kazushige; Jinno, Masahiko
2010-10-11
We describe an experimental demonstration of the spectrum-sliced elastic optical path network (SLICE) architecture. We employ the optical orthogonal frequency-division multiplexing (OFDM) modulation format and bandwidth-variable optical cross-connects (OXC) to generate, transmit and receive optical paths with bandwidths of up to 1 Tb/s. We experimentally demonstrate elastic optical path setup and spectrally-efficient transmission of multiple channels with bit rates ranging from 40 to 140 Gb/s between six nodes of a mesh network. We show dynamic bandwidth scalability for optical paths with bit rates of 40 to 440 Gb/s. Moreover, we demonstrate multihop transmission of a 1 Tb/s optical path over 400 km of standard single-mode fiber (SMF). Finally, we investigate the filtering properties and the required guard band width for spectrally-efficient allocation of optical paths in SLICE.
Equalizing Si photodetectors fabricated in standard CMOS processes
NASA Astrophysics Data System (ADS)
Guerrero, E.; Aguirre, J.; Sánchez-Azqueta, C.; Royo, G.; Gimeno, C.; Celma, S.
2017-05-01
This work presents a new continuous-time equalization approach to overcome the limited bandwidth of integrated CMOS photodetectors. It is based on a split-path topology that features completely decoupled controls for boosting and gain; this capability allows a better tuning of the equalizer in comparison with other architectures based on the degenerated differential pair, which is particularly helpful to achieve a proper calibration of the system. The equalizer is intended to enhance the bandwidth of CMOS standard n-well/p-bulk differential photodiodes (DPDs), which falls below 10MHz representing a bottleneck in fully integrated optoelectronic interfaces to fulfill the low-cost requirements of modern smart sensors. The proposed equalizer has been simulated in a 65nm CMOS process and biased with a single supply voltage of 1V, where the bandwidth of the DPD has been increased up to 3 GHz.
Architectures for intelligent machines
NASA Technical Reports Server (NTRS)
Saridis, George N.
1991-01-01
The theory of intelligent machines has recently been reformulated to incorporate new architectures that use neural and Petri nets. The analytic functions of an intelligent machine are implemented by intelligent controls, using entropy as a measure. The resulting hierarchical control structure is based on the principle of increasing precision with decreasing intelligence. Each of the three levels of the intelligent control uses a different architecture in order to satisfy the requirements of the principle: the organization level is modeled after a Boltzmann machine for abstract reasoning, task planning and decision making; the coordination level is composed of a number of Petri net transducers supervised, for command exchange, by a dispatcher, which also serves as an interface to the organization level; the execution level includes the sensory, navigation-planning, and control hardware, which interacts one-to-one with the appropriate coordinators, while a VME bus provides a channel for database exchange among the several devices. This system is currently implemented on a robotic transporter, designed for space construction at the CIRSSE laboratories at the Rensselaer Polytechnic Institute. The progress of its development is reported.
Machine Learning for the Knowledge Plane
2006-06-01
this idea is to combine techniques from machine learning with new architectural concepts in networking to make the internet self-aware and self...work on the machine learning portion of the Knowledge Plane. This consisted of three components: (a) we wrote a document formulating the various
Implementation and analysis of a Navier-Stokes algorithm on parallel computers
NASA Technical Reports Server (NTRS)
Fatoohi, Raad A.; Grosch, Chester E.
1988-01-01
The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.
Flexible architecture of data acquisition firmware based on multi-behaviors finite state machine
NASA Astrophysics Data System (ADS)
Arpaia, Pasquale; Cimmino, Pasquale
2016-11-01
A flexible firmware architecture for different kinds of data acquisition systems, ranging from high-precision bench instruments to low-cost wireless transducer networks, is presented. The key component is a multi-behavior finite state machine, easily configurable to both low- and high-performance requirements, to diverse operating systems, and to on-line as well as batch measurement algorithms. The proposed solution was validated experimentally on three case studies with different data acquisition architectures: (i) concentrated, in a high-precision instrument for magnetic measurements at CERN, (ii) decentralized, for telemedicine remote monitoring of patients at home, and (iii) distributed, for remote monitoring of buildings' energy loss.
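One way to picture a finite state machine whose behavior is configurable per deployment is sketched below. This is not the authors' firmware: the states, events, class name `AcquisitionFSM`, and the two example behaviors are hypothetical, chosen only to show a fixed transition table combined with pluggable per-state actions.

```python
class AcquisitionFSM:
    """Fixed transition table; the action run on entering each state is pluggable."""

    # Hypothetical states and events for a generic acquisition cycle.
    TRANSITIONS = {
        ("idle", "configure"): "configured",
        ("configured", "start"): "acquiring",
        ("acquiring", "stop"): "idle",
    }

    def __init__(self, behaviors):
        self.state = "idle"
        self.behaviors = behaviors  # maps state name -> callable
        self.log = []

    def fire(self, event):
        key = (self.state, event)
        if key not in self.TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = self.TRANSITIONS[key]
        action = self.behaviors.get(self.state)
        if action is not None:
            self.log.append(action())
        return self.state


# Same state machine, two different behaviors (both illustrative).
bench = AcquisitionFSM({"acquiring": lambda: "buffered high-rate read"})
sensor = AcquisitionFSM({"acquiring": lambda: "single low-power sample"})
for fsm in (bench, sensor):
    for event in ("configure", "start", "stop"):
        fsm.fire(event)
print(bench.log, sensor.log)
```

Keeping the transition table fixed while swapping the actions is one plausible reading of how a single architecture could serve both a bench instrument and a low-cost wireless node.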
Neural architecture design based on extreme learning machine.
Bueno-Crespo, Andrés; García-Laencina, Pedro J; Sancho-Gómez, José-Luis
2013-12-01
Selecting the optimal neural architecture to solve a pattern classification problem entails choosing the relevant input units, the number of hidden neurons, and their corresponding interconnection weights. This problem has been widely studied, but existing solutions usually involve excessive computational cost for most problems and do not provide a unique solution. This paper proposes a new technique to efficiently design the MultiLayer Perceptron (MLP) architecture for classification using the Extreme Learning Machine (ELM) algorithm. The proposed method provides a high generalization capability and a unique solution for the architecture design. Moreover, the selected final network retains only those input connections that are relevant for the classification task. Experimental results show these advantages.
A Real-Time Tool Positioning Sensor for Machine-Tools
Ruiz, Antonio Ramon Jimenez; Rosas, Jorge Guevara; Granja, Fernando Seco; Honorato, Jose Carlos Prieto; Taboada, Jose Juan Esteve; Serrano, Vicente Mico; Jimenez, Teresa Molina
2009-01-01
In machining, natural oscillations and elastic, gravitational, or temperature deformations still make it difficult to guarantee the quality of fabricated parts. In this paper we present an optical measurement system designed to track and localize in 3D a reference retro-reflector close to the machine tool's drill. The complete system and its components are described in detail. Several tests, some static (including impacts and rotations) and others dynamic (executing linear and circular trajectories), were performed on two different machine tools. For the first time, a laser tracking system has been integrated into the position control loop of a machine tool. Results indicate that oscillations and deformations close to the tool can be estimated with micrometric resolution and a bandwidth from 0 to more than 100 Hz. This sensor therefore opens the possibility of on-line compensation of oscillations and deformations. PMID:22408472
2011-06-01
solutions that operate reliable under adverse conditions including a bandwidth-limited environment, and provide them with customised information...236 Klein, G. (1998) Sources of Power: How people make decisions, MIT Press, Cambridge, Mass ., USA, 1998 NATO (2007) NATO Architecture Framework
QoS-aware integrated fiber-wireless standard compliant architecture based on XGPON and EDCA
NASA Astrophysics Data System (ADS)
Kaur, Ravneet; Srivastava, Anand
2018-01-01
The converged Fiber-Wireless (FiWi) broadband access network is a promising candidate that is reliable, robust, cost-efficient, ubiquitous, and capable of providing a huge amount of bandwidth. To meet ever-increasing bandwidth requirements, it has become crucial to investigate the performance issues that arise with the deployment of next-generation Passive Optical Networks (PONs) and their integration with various wireless technologies. Apart from providing high-speed internet access for mass use, this combined architecture aims to enable delivery of high-quality and effective e-services in different categories including health, education, finance, banking, agriculture, and e-government. In this work, we present an integrated architecture of 10-Gigabit-capable PON (XG-PON) and Enhanced Distributed Channel Access (EDCA) that combines the benefits of both technologies to meet the QoS demands of subscribers. Performance evaluation of the standards-compliant hybrid network is done using the discrete-event Network Simulator-3 (NS-3), and results are reported in terms of throughput, average delay, average packet loss rate, and fairness index. Per-class throughput signifies the effectiveness of QoS distribution, whereas aggregate throughput indicates effective utilization of the wireless channel. To the best of our knowledge, such work has not been reported so far.
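The abstract does not say which fairness index is used. As an illustration only, the sketch below computes Jain's fairness index, a common choice for per-user throughput; the function name and the sample throughput values are hypothetical.

```python
def jain_fairness(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly equal shares."""
    if not throughputs:
        raise ValueError("need at least one throughput value")
    total = sum(throughputs)
    squares = sum(x * x for x in throughputs)
    if squares == 0:
        return 1.0  # all users received zero, which is (trivially) equal
    return total * total / (len(throughputs) * squares)


# Hypothetical per-station throughputs in Mb/s.
equal_share = jain_fairness([5, 5, 5, 5])
one_hog = jain_fairness([10, 0, 0, 0])
print(equal_share, one_hog)
```

The index ranges from 1/n, when a single user takes everything, to 1 for a perfectly even split, so it complements the aggregate throughput figure.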
Liu, Weisong; Huang, Zhitao; Wang, Xiang; Sun, Weichao
2017-01-01
In a cognitive radio sensor network (CRSN), wideband spectrum sensing devices that aim to exploit temporarily vacant spectrum intervals as soon as possible are of great importance. However, increasingly high signal frequencies and wide bandwidths require an extremely high sampling rate, which may exceed the front-end bandwidth of today’s best analog-to-digital converters (ADCs). The recently proposed modulated wideband converter (MWC) is an attractive analog compressed sensing technique that can greatly reduce the sampling rate. However, the MWC has high hardware complexity owing to its parallel channel structure, especially when the number of signals increases. In this paper, we propose a single channel modulated wideband converter (SCMWC) scheme for spectrum sensing of band-limited wide-sense stationary (WSS) signals. With one antenna or sensor, this scheme reduces not only the sampling rate but also the hardware complexity. We then present a new SCMWC-based single node CR prototype system, on which the spectrum sensing algorithm was tested. Experiments on our hardware prototype show that the proposed architecture leads to successful spectrum sensing, and that the total sampling rate and hardware size equal those of a single MWC channel. PMID:28471410
Cross channel dependency requirements of the multi-path redundant avionics suite
NASA Astrophysics Data System (ADS)
Martin, Fred; Adams, Darryl
Requirements for cross channel dependencies in the multipath redundant avionics suite (MPRAS) architecture are described. MPRAS is a data synchronous avionics architecture for space launch vehicle applications. The MPRAS cross channel data link (CCDL) provides the mechanism, required by data synchronous architectures, to exchange data and maintain synchronization among redundant channels. MPRAS architectural requirements impose a variety of characteristics for cross channel dependencies which make traditional CCDL solutions unacceptable for MPRAS target applications. The MPRAS CCDL requirements have led to a CCDL design which maintains resilience to faults, does not introduce large cross channel bandwidth reductions, and meets the other established MPRAS CCDL requirements. A review of fault-tolerant system principles applicable to CCDL issues is presented as well as a top-level functional description of the MPRAS CCDL design.
Terzenidis, Nikos; Moralis-Pegios, Miltiadis; Mourgias-Alexandris, George; Vyrsokinos, Konstantinos; Pleros, Nikos
2018-04-02
As data centers depart from traditional server-centric architectures towards disaggregated systems that can offer increased resource utilization at reduced cost and energy envelopes, high-port switching with highly stringent latency and bandwidth requirements becomes a necessity. We present an optical switch architecture exploiting a hybrid broadcast-and-select/wavelength routing scheme with small-scale optical feedforward buffering. The architecture is experimentally demonstrated at 10 Gb/s, reporting error-free performance with a power penalty of <2.5 dB. Moreover, network simulations for a 256-node system revealed low latency values of only 605 nsec at throughput values reaching 80% when employing 2-packet-size optical buffers, while multi-rack network performance was also investigated.
Machine Learning-Aided, Robust Wideband Spectrum Sensing for Cognitive Radios
2015-06-12
to even Approved for public release; distribution is unlimited. 2 on the order of a giga -Hertz (GHz). Due to wide bandwidth and noncontiguous...Frequency Band CS Compressive Sampling DFT Discrete Fourier Transform EMI Electro Magnetic Interference FFT Fast Fourier Transform GHz Giga Hertz Hz Hertz
An assessment of the connection machine
NASA Technical Reports Server (NTRS)
Schreiber, Robert
1990-01-01
The CM-2 is an example of a connection machine. The strengths and problems of this implementation are considered as well as important issues in the architecture and programming environment of connection machines in general. These are contrasted to the same issues in Multiple Instruction/Multiple Data (MIMD) microprocessors and multicomputers.
Characterizing output bottlenecks in a supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Bing; Chase, Jeffrey; Dillow, David A
2012-01-01
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
Frances: A Tool for Understanding Computer Architecture and Assembly Language
ERIC Educational Resources Information Center
Sondag, Tyler; Pokorny, Kian L.; Rajan, Hridesh
2012-01-01
Students in all areas of computing require knowledge of the computing device including software implementation at the machine level. Several courses in computer science curricula address these low-level details such as computer architecture and assembly languages. For such courses, there are advantages to studying real architectures instead of…
SWARM: A 32 GHz Correlator and VLBI Beamformer for the Submillimeter Array
NASA Astrophysics Data System (ADS)
Primiani, Rurik A.; Young, Kenneth H.; Young, André; Patel, Nimesh; Wilson, Robert W.; Vertatschitsch, Laura; Chitwood, Billie B.; Srinivasan, Ranjani; MacMahon, David; Weintroub, Jonathan
2016-03-01
A 32 GHz bandwidth, VLBI-capable correlator and phased array has been designed and deployed at the Smithsonian Astrophysical Observatory’s Submillimeter Array (SMA). The SMA Wideband Astronomical ROACH2 Machine (SWARM) integrates two instruments: a correlator with 140 kHz spectral resolution across its full 32 GHz band, used for connected interferometric observations, and a phased array summer used when the SMA participates as a station in the Event Horizon Telescope (EHT) very long baseline interferometry (VLBI) array. For each SWARM quadrant, Reconfigurable Open Architecture Computing Hardware (ROACH2) units shared under open-source from the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER) are equipped with a pair of ultra-fast analog-to-digital converters (ADCs), a field programmable gate array (FPGA) processor, and eight 10 Gigabit Ethernet (GbE) ports. A VLBI data recorder interface designated the SWARM digital back end, or SDBE, is implemented with a ninth ROACH2 per quadrant, feeding four Mark6 VLBI recorders with an aggregate recording rate of 64 Gbps. This paper describes the design and implementation of SWARM, as well as its deployment at SMA with reference to verification and science data.
A native IP satellite communications system
NASA Astrophysics Data System (ADS)
Koudelka, O.; Schmidt, M.; Ebert, J.; Schlemmer, H.; Kastner-Puschl, S.; Riedler, W.
2004-08-01
In the framework of ESA's ARTES-5 program the Institute of Applied Systems Technology (Joanneum Research) in cooperation with the Department of Communications and Wave Propagation has developed a novel meshed satellite communications system which is optimised for Internet traffic and applications (L*IP, Local Network Interconnection via Satellite Systems Using the IP Protocol Suite). Both symmetrical and asymmetrical connections are supported. Bandwidth on demand and guaranteed quality of service are key features of the system. A novel multi-frequency TDMA access scheme utilises efficient methods of IP encapsulation. In contrast to other solutions it avoids legacy transport network techniques. While the DVB-RCS standard is based on ATM or MPEG transport cells, the L*IP system uses variable-length cells, which reduces the overhead significantly. A flexible and programmable platform based on Linux machines was chosen to allow easy implementation and adaptation to different standards. This makes it possible to apply the system not only to satellite communications but also to integrate it seamlessly with terrestrial fixed broadcast wireless access systems. The platform is also an ideal test-bed for a variety of interactive broadband communications systems. The paper describes the system architecture and the key features of the system.
Multimedia And Internetworking Architecture Infrastructure On Interactive E-Learning System
NASA Astrophysics Data System (ADS)
Indah, K. A. T.; Sukarata, G.
2018-01-01
Interactive e-learning is a distance learning method that uses information technology, electronic systems, or computers as a means of teaching and learning without direct face-to-face contact between teacher and student. A strong dependence on emerging technologies greatly influences how the architecture is designed to produce a powerful interactive e-learning network. This paper analyzes an architecture model in which learning can be done interactively, involving many participants (N-way synchronized distance learning), using video conferencing technology. A broadband internet network and multicast techniques are also used so that bandwidth is used efficiently.
Experimental evaluation of achromatic phase shifters for mid-infrared starlight suppression.
Gappinger, Robert O; Diaz, Rosemary T; Ksendzov, Alexander; Lawson, Peter R; Lay, Oliver P; Liewer, Kurt M; Loya, Frank M; Martin, Stefan R; Serabyn, Eugene; Wallace, James K
2009-02-10
Phase shifters are a key component of nulling interferometry, one of the potential routes to enabling the measurement of faint exoplanet spectra. Here, three different achromatic phase shifters are evaluated experimentally in the mid-infrared, where such nulling interferometers may someday operate. The methods evaluated include the use of dispersive glasses, a through-focus field inversion, and field reversals on reflection from antisymmetric flat-mirror periscopes. All three approaches yielded deep, broadband, mid-infrared nulls, but the deepest broadband nulls were obtained with the periscope architecture. In the periscope system, average null depths of 4 × 10⁻⁵ were obtained with a 25% bandwidth, and 2 × 10⁻⁵ with a 20% bandwidth, at a central wavelength of 9.5 μm. The best short term nulls at 20% bandwidth were approximately 9 × 10⁻⁶, in line with error budget predictions and the limits of the current generation of hardware.
Embedded instrumentation architecture
Boyd, Gerald M.; Farrow, Jeffrey
2015-09-29
The various technologies presented herein relate to generating copies of an incoming signal, wherein each copy of the signal can undergo different processing to facilitate control of bandwidth demands during communication of one or more signals relating to the incoming signal. A signal sharing component can be utilized to share copies of the incoming signal between a plurality of circuits/components which can include a first A/D converter, a second A/D converter, and a comparator component. The first A/D converter can operate at a low sampling rate and accordingly generates, and continuously transmits, a signal having a low bandwidth requirement. The second A/D converter can operate at a high sampling rate and hence generates a signal having a high bandwidth requirement. Transmission of a signal from the second A/D converter can be controlled by a signaling event (e.g., a signal pulse) being determined to have occurred by the comparator component.
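The signal-sharing scheme described above can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented circuit: the function names, the decimation factor, the threshold, and the capture window are all hypothetical, and a list of floats stands in for the shared copies of the analog signal.

```python
def low_rate_stream(samples, decimation):
    """Model the slow A/D converter: keep every `decimation`-th sample.

    This stream is sent continuously and needs little bandwidth.
    """
    return samples[::decimation]


def gated_high_rate(samples, threshold, window):
    """Model the fast A/D converter gated by the comparator.

    Full-rate samples are released only when the comparator sees a sample
    whose magnitude reaches `threshold`. Returns (start_index, segment)
    pairs, one per detected event.
    """
    segments = []
    i = 0
    while i < len(samples):
        if abs(samples[i]) >= threshold:
            segments.append((i, samples[i:i + window]))
            i += window  # skip past the captured window
        else:
            i += 1
    return segments


# Hypothetical input: a quiet signal with one short pulse.
signal = [0.0] * 20
signal[8:11] = [0.9, 1.2, 0.7]

slow = low_rate_stream(signal, decimation=5)
bursts = gated_high_rate(signal, threshold=0.5, window=4)
print(slow, bursts)
```

In this model the high-rate path transmits four samples instead of twenty, which is the bandwidth saving the abstract attributes to event-controlled transmission.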
Livermore Big Artificial Neural Network Toolkit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Essen, Brian Van; Jacobs, Sam; Kim, Hyojin
2016-07-01
LBANN is a toolkit that is designed to train artificial neural networks efficiently on high performance computing architectures. It is optimized to take advantage of key High Performance Computing features to accelerate neural network training. Specifically it is optimized for low-latency, high bandwidth interconnects, node-local NVRAM, node-local GPU accelerators, and high bandwidth parallel file systems. It is built on top of the open source Elemental distributed-memory dense and sparse-direct linear algebra and optimization library that is released under the BSD license. The algorithms contained within LBANN are drawn from the academic literature and implemented to work within a distributed-memory framework.
Architecture design of motion estimation for ITU-T H.263
NASA Astrophysics Data System (ADS)
Ku, Chung-Wei; Lin, Gong-Sheng; Chen, Liang-Gee; Lee, Yung-Ping
1997-01-01
Digital video and audio systems have become the trend in multimedia because they provide high quality and are easy to process. However, because a huge amount of information is needed while bandwidth is limited, data compression plays an important role in such systems. For example, a 176 x 144 monochrome sequence at 10 frames/sec requires a bandwidth of about 2 Mbps. This wastes channel resources and limits the applications. MPEG (Moving Picture Experts Group) standardizes the video codec scheme and achieves a high compression ratio while providing good quality. MPEG-1 is used for frame sizes of about 352 x 240 at 30 frames per second, and MPEG-2 provides scalability and can be applied to scenes with higher definition, such as HDTV (high definition television). On the other hand, some applications, such as videophone and video-conferencing, require a very low bit rate. Because the channel bandwidth of the telephone network is very limited, a very high compression ratio is required. ITU-T announced the H.263 video coding standard to meet the above requirements [8]. According to the simulation results of TMN-5 [22], it outperforms H.261 with little overhead of complexity. Since wireless communication is the trend in the near future, low power design of the video codec is an important issue for portable visual telephones. Motion estimation is the most computation-intensive part of the whole video codec; about 60% of the encoder's computation is spent on it. Several architectures have been proposed for efficient processing of block matching algorithms. In this paper, in order to meet the requirements of H.263 and the expectation of low power consumption, a modified version of the sandwich architecture in [21] is proposed. Based on the parallel processing philosophy, low power is expected, and the generation of either one motion vector or four motion vectors with half-pixel accuracy is achieved concurrently. In addition, we present our solution for supporting the other additional modes of H.263 with the proposed architecture.
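The figure of about 2 Mbps quoted for a 176 x 144 monochrome sequence at 10 frames/sec can be checked with a short calculation. The sketch assumes 8 bits per pixel, which the abstract does not state, and the 28.8 kbit/s telephone-line rate used for the compression ratio is likewise an illustrative assumption.

```python
def raw_bit_rate(width, height, fps, bits_per_pixel=8):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * fps


# Frame size and frame rate from the abstract; 8 bits/pixel is assumed.
qcif_rate = raw_bit_rate(176, 144, 10)

# Hypothetical telephone-line channel rate (an assumption, not from the source).
channel_rate = 28_800
required_ratio = qcif_rate / channel_rate

print(f"raw rate: {qcif_rate / 1e6:.2f} Mbit/s")
print(f"compression ratio needed: {required_ratio:.1f}:1")
```

Under these assumptions the raw rate is 2,027,520 bit/s, which matches the abstract's figure, and the required compression ratio is roughly 70:1.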
NASA Technical Reports Server (NTRS)
Rogers, David
1988-01-01
The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
Programmable hardware for reconfigurable computing systems
NASA Astrophysics Data System (ADS)
Smith, Stephen
1996-10-01
In 1945 the work of J. von Neumann and H. Goldstine created the principal architecture for electronic computation that has now lasted fifty years. Nevertheless alternative architectures have been created that have computational capability, for special tasks, far beyond that feasible with von Neumann machines. The emergence of high capacity programmable logic devices has made the realization of these architectures practical. The original ENIAC and EDVAC machines were conceived to solve special mathematical problems that were far from today's concept of 'killer applications.' In a similar vein programmable hardware computation is being used today to solve unique mathematical problems. Our programmable hardware activity is focused on the research and development of novel computational systems based upon the reconfigurability of our programmable logic devices. We explore our programmable logic architectures and their implications for programmable hardware. One programmable hardware board implementation is detailed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Here, one of the most critical challenges for high-performance computing (HPC) scientific visualization is execution on massively threaded processors. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Our current production scientific visualization software is not designed for these new types of architectures. To address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moreland, Kenneth; Sewell, Christopher; Usher, William
Execution on massively threaded processors is one of the most critical challenges for high-performance computing (HPC) scientific visualization. Of the many fundamental changes we are seeing in HPC systems, one of the most profound is a reliance on new processor types optimized for execution bandwidth over latency hiding. Moreover, our current production scientific visualization software is not designed for these new types of architectures. In order to address this issue, the VTK-m framework serves as a container for algorithms, provides flexible data representation, and simplifies the design of visualization algorithms on new and future computer architecture.
Herrero, Héctor; Outón, Jose Luis; Puerto, Mildred; Sallé, Damien; López de Ipiña, Karmele
2017-01-01
This paper presents a state machine-based architecture, which enhances the flexibility and reusability of industrial robots, more concretely dual-arm multisensor robots. The proposed architecture, in addition to allowing absolute control of the execution, eases the programming of new applications by increasing the reusability of the developed modules. Through an easy-to-use graphical user interface, operators are able to create, modify, reuse and maintain industrial processes, increasing the flexibility of the cell. Moreover, the proposed approach is applied in a real use case in order to demonstrate its capabilities and feasibility in industrial environments. A comparative analysis is presented for evaluating the presented approach versus traditional robot programming techniques. PMID:28561750
Herrero, Héctor; Outón, Jose Luis; Puerto, Mildred; Sallé, Damien; López de Ipiña, Karmele
2017-05-31
This paper presents a state machine-based architecture, which enhances the flexibility and reusability of industrial robots, more concretely dual-arm multisensor robots. The proposed architecture, in addition to allowing absolute control of the execution, eases the programming of new applications by increasing the reusability of the developed modules. Through an easy-to-use graphical user interface, operators are able to create, modify, reuse and maintain industrial processes, increasing the flexibility of the cell. Moreover, the proposed approach is applied in a real use case in order to demonstrate its capabilities and feasibility in industrial environments. A comparative analysis is presented for evaluating the presented approach versus traditional robot programming techniques.
A Machine Learning Concept for DTN Routing
NASA Technical Reports Server (NTRS)
Dudukovich, Rachel; Hylton, Alan; Papachristou, Christos
2017-01-01
This paper discusses the concept and architecture of a machine learning based router for delay tolerant space networks. The techniques of reinforcement learning and Bayesian learning are used to supplement the routing decisions of the popular Contact Graph Routing algorithm. An introduction to the concepts of Contact Graph Routing, Q-routing and Naive Bayes classification are given. The development of an architecture for a cross-layer feedback framework for DTN (Delay-Tolerant Networking) protocols is discussed. Finally, initial simulation setup and results are given.
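The abstract names Q-routing but gives no equations. The sketch below assumes the commonly cited form of the Q-routing update, in which a node revises its delivery-time estimate from the queueing delay, the link delay, and the chosen neighbor's own best estimate. The table layout, node names, and delay values are hypothetical and are not taken from the paper.

```python
def q_routing_update(q, node, dest, neighbor, queue_delay, link_delay,
                     learning_rate=0.5):
    """Apply one Q-routing update at `node`.

    q[x][dest][y] is node x's estimated delivery time to `dest` when it
    forwards via neighbor y. Returns the updated estimate.
    """
    # Best remaining-time estimate reported by the neighbor; zero if the
    # neighbor is itself the destination.
    if neighbor == dest:
        remaining = 0.0
    else:
        remaining = min(q[neighbor][dest].values())

    target = queue_delay + link_delay + remaining
    old = q[node][dest][neighbor]
    q[node][dest][neighbor] = old + learning_rate * (target - old)
    return q[node][dest][neighbor]


# Hypothetical three-node example: A forwards to B, and B reaches C directly.
q = {
    "A": {"C": {"B": 10.0}},
    "B": {"C": {"C": 2.0}},
}
new_estimate = q_routing_update(q, "A", "C", "B",
                                queue_delay=1.0, link_delay=3.0)
print(new_estimate)
```

In a delay tolerant network the delay terms would presumably reflect scheduled contacts rather than instantaneous link latency; that mapping is left out of this sketch.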
The dynamic analysis of drum roll lathe for machining of rollers
NASA Astrophysics Data System (ADS)
Qiao, Zheng; Wu, Dongxu; Wang, Bo; Li, Guo; Wang, Huiming; Ding, Fei
2014-08-01
An ultra-precision machine tool for machining rollers has been designed and assembled. Because the dynamic characteristics of the machine tool have an obvious impact on the quality of the microstructures on the roller surface, this paper analyzes the dynamic characteristics of the existing machine tool, as well as the influence that fixing a large, slender roller in the machine has on them. First, a finite element model of the machine tool is built and simplified; on that basis, a finite element modal analysis is carried out to obtain the natural frequencies and mode shapes of the first four modes of the machine tool. According to the modal analysis results, the weak-stiffness subsystems of the machine tool can be further improved and a reasonable control system bandwidth can be designed. Finally, considering the shock imposed on the feed system and cutting tool by frequent fast positioning of the Z axis, a transient analysis is conducted in ANSYS. Based on the results of the transient analysis, the vibration behavior of key components of the machine tool and its impact on the cutting process are explored.
MWAHCA: a multimedia wireless ad hoc cluster architecture.
Diaz, Juan R; Lloret, Jaime; Jimenez, Jose M; Sendra, Sandra
2014-01-01
Wireless ad hoc networks provide a flexible and adaptable infrastructure to transport data over a great variety of environments. Recently, real-time audio and video data transmission has increased due to the appearance of many multimedia applications. One of the major challenges is to ensure the quality of multimedia streams after they have passed through a wireless ad hoc network. This requires adapting the network architecture to the multimedia QoS requirements. In this paper we propose a new architecture to organize and manage cluster-based ad hoc networks in order to provide multimedia streams. The proposed architecture adapts the wireless network topology in order to improve the quality of audio and video transmissions. To achieve this goal, the architecture uses information such as each node's capacity and the QoS parameters (bandwidth, delay, jitter, and packet loss). The architecture splits the network into clusters which are specialized in specific multimedia traffic. The real system performance study provided at the end of the paper demonstrates the feasibility of the proposal.
NASA Astrophysics Data System (ADS)
Li, Yan; Collier, Martin
2007-11-01
Wavelength-routed networks have received enormous attention due to the fact that they are relatively simple to implement and implicitly offer Quality of Service (QoS) guarantees. However, they suffer from a bandwidth inefficiency problem and require complex Routing and Wavelength Assignment (RWA). Most attempts to address the above issues exploit the joint use of WDM and TDM technologies. The resultant TDM-based wavelength-routed networks partition the wavelength bandwidth into fixed-length time slots organized as a fixed-length frame. Multiple connections can thus time-share a wavelength and the grooming of their traffic leads to better bandwidth utilization. The capability of switching in both wavelength and time domains in such networks also mitigates the RWA problem. However, TDM-based wavelength-routed networks work in synchronous mode and strict synchronization among all network nodes is required. Global synchronization for all-optical networks which operate at extremely high speed is technically challenging, and deploying an optical synchronizer for each wavelength involves considerable cost. An Optical Slotted Circuit Switching (OSCS) architecture is proposed in this paper. In an OSCS network, slotted circuits are created to better utilize the wavelength bandwidth than in classic wavelength-routed networks. The operation of the protocol is such as to avoid the need for global synchronization required by TDM-based wavelength-routed networks.
A self-defining hierarchical data system
NASA Technical Reports Server (NTRS)
Bailey, J.
1992-01-01
The Self-Defining Data System (SDS) is a system which allows the creation of self-defining hierarchical data structures in a form which allows the data to be moved between different machine architectures. Because the structures are self-defining they can be used for communication between independent modules in a distributed system. Unlike disk-based hierarchical data systems such as Starlink's HDS, SDS works entirely in memory and is very fast. Data structures are created and manipulated as internal dynamic structures in memory managed by SDS itself. A structure may then be exported into a caller supplied memory buffer in a defined external format. This structure can be written as a file or sent as a message to another machine. It remains static in structure until it is reimported into SDS. SDS is written in portable C and has been run on a number of different machine architectures. Structures are portable between machines with SDS looking after conversion of byte order, floating point format, and alignment. A Fortran callable version is also available for some machines.
A new software-based architecture for quantum computer
NASA Astrophysics Data System (ADS)
Wu, Nan; Song, FangMin; Li, Xiangdong
2010-04-01
In this paper, we study a reliable architecture of a quantum computer and a new instruction set and machine language for the architecture, which can improve the performance and reduce the cost of the quantum computing. We also try to address some key issues in detail in the software-driven universal quantum computers.
Machine vision systems using machine learning for industrial product inspection
NASA Astrophysics Data System (ADS)
Lu, Yi; Chen, Tie Q.; Chen, Jie; Zhang, Jian; Tisler, Anthony
2002-02-01
Machine vision inspection requires efficient processing time and accurate results. In this paper, we present a machine vision inspection architecture, SMV (Smart Machine Vision). SMV decomposes a machine vision inspection problem into two stages, Learning Inspection Features (LIF) and On-Line Inspection (OLI). The LIF is designed to learn visual inspection features from design data and/or from inspection products. During the OLI stage, the inspection system uses the knowledge learnt by the LIF component to inspect the visual features of products. In this paper we present two machine vision inspection systems developed under the SMV architecture for two different types of products, Printed Circuit Board (PCB) and Vacuum Fluorescent Display (VFD) boards. In the VFD board inspection system, the LIF component learns inspection features from a VFD board and its display patterns. In the PCB board inspection system, the LIF learns the inspection features from the CAD file of a PCB board. In both systems, the LIF component also incorporates interactive learning to make the inspection system more powerful and efficient. The VFD system has been deployed successfully in three different manufacturing companies and the PCB inspection system is in the process of being deployed in a manufacturing plant.
Pyramidal neurovision architecture for vision machines
NASA Astrophysics Data System (ADS)
Gupta, Madan M.; Knopf, George K.
1993-08-01
The vision system employed by an intelligent robot must be active; active in the sense that it must be capable of selectively acquiring the minimal amount of relevant information for a given task. An efficient active vision system architecture that is based loosely upon the parallel-hierarchical (pyramidal) structure of the biological visual pathway is presented in this paper. Although the computational architecture of the proposed pyramidal neuro-vision system is far less sophisticated than the architecture of the biological visual pathway, it does retain some essential features such as the converging multilayered structure of its biological counterpart. In terms of visual information processing, the neuro-vision system is constructed from a hierarchy of several interactive computational levels, whereupon each level contains one or more nonlinear parallel processors. Computationally efficient vision machines can be developed by utilizing both the parallel and serial information processing techniques within the pyramidal computing architecture. A computer simulation of a pyramidal vision system for active scene surveillance is presented.
NASA Astrophysics Data System (ADS)
Serrano, Rafael; González, Luis Carlos; Martín, Francisco Jesús
2009-11-01
Under the SENSOR-IA project, funded by the Order of Incentives to the Regional Technology Centers of the Council of Innovation, Science and Enterprise of Andalusia, an architecture for optimizing a machining process in real time through a rule-based expert system has been developed. The architecture consists of a sensor data acquisition and processing engine (SATD) and a rule-based expert system (SE) that communicates with the SATD. The SE has been designed as an inference engine with an algorithm for effective action, using a modus ponens rule model of goal-oriented rules. The pilot test demonstrated that it is possible to govern the machining process in real time based on rules contained in an SE. The tests were done with approximate rules. Future work includes an exhaustive collection of data with different tool materials and geometries in a database to extract more precise rules.
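The modus ponens rule model mentioned above can be illustrated with a small forward-chaining loop: whenever all premises of a rule are known, its conclusion is added. This is a generic sketch, not the SENSOR-IA inference algorithm, which the abstract does not describe; the rule contents and fact names are hypothetical.

```python
def forward_chain(facts, rules):
    """Derive every conclusion reachable by repeated modus ponens.

    `rules` is a list of (premises, conclusion) pairs, where `premises`
    is a tuple of fact names that must all hold.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical machining rules (illustrative only, not from the project).
rules = [
    (("vibration_high", "spindle_load_high"), "tool_wear_suspected"),
    (("tool_wear_suspected",), "reduce_feed_rate"),
]
facts = {"vibration_high", "spindle_load_high"}

derived = forward_chain(facts, rules)
print(sorted(derived))
```

A goal-oriented engine of the kind the abstract mentions would more likely work backward from a desired action to the sensor facts that support it; the forward loop is used here only because it is the shortest correct illustration of modus ponens.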
Parallel Algorithms for Computer Vision
1990-04-01
NA86-1, Thinking Machines Corporation, Cambridge, MA, December 1986. [43] J. Little, G. Blelloch, and T. Cass. How to program the connection machine for... to program the connection machine for computer vision. In Proc. Workshop on Comp. Architecture for Pattern Analysis and Machine Intell., 1987. [92] J...In Proceedings of SPIE Conf. on Advances in Intelligent Robotics Systems, Bellingham, VA, 1987. SPIE. [91] J. Little, G. Blelloch, and T. Cass. How
Proceedings of the NASA Conference on Space Telerobotics, volume 1
NASA Technical Reports Server (NTRS)
Rodriguez, Guillermo (Editor); Seraji, Homayoun (Editor)
1989-01-01
The theme of the Conference was man-machine collaboration in space. Topics addressed include: redundant manipulators; man-machine systems; telerobot architecture; remote sensing and planning; navigation; neural networks; fundamental AI research; and reasoning under uncertainty.
Cache write generate for parallel image processing on shared memory architectures.
Wittenbrink, C M; Somani, A K; Chen, C H
1996-01-01
We investigate cache write generate, our cache mode invention. We demonstrate that for parallel image processing applications, the new mode improves main memory bandwidth, CPU efficiency, cache hits, and cache latency. We use register level simulations validated by the UW-Proteus system. Many memory, cache, and processor configurations are evaluated.
Optoelectronic interconnects for 3D wafer stacks
NASA Astrophysics Data System (ADS)
Ludwig, David E.; Carson, John C.; Lome, Louis S.
1996-01-01
Wafer and chip stacking are envisioned as a means of providing increased processing power within the small confines of a three-dimensional structure. Optoelectronic devices can play an important role in these dense 3-D processing electronic packages in two ways. In pure electronic processing, optoelectronics can provide a method for increasing the number of input/output communication channels within the layers of the 3-D chip stack. Non-free space communication links allow the density of highly parallel input/output ports to increase dramatically over typical edge bus connections. In hybrid processors, where electronics and optics play a role in defining the computational algorithm, free space communication links are typically utilized for, among other reasons, the increased network link complexity which can be achieved. Free space optical interconnections provide bandwidths and interconnection complexity unobtainable in pure electrical interconnections. Stacked 3-D architectures can provide the electronics real estate and structure to deal with the increased bandwidth and global information provided by free space optical communications. This paper provides definitions and examples of 3-D stacked architectures in optoelectronics processors. The benefits and issues of these technologies are discussed.
Optoelectronic interconnects for 3D wafer stacks
NASA Astrophysics Data System (ADS)
Ludwig, David; Carson, John C.; Lome, Louis S.
1996-01-01
Wafer and chip stacking are envisioned as means of providing increased processing power within the small confines of a three-dimensional structure. Optoelectronic devices can play an important role in these dense 3-D processing electronic packages in two ways. In pure electronic processing, optoelectronics can provide a method for increasing the number of input/output communication channels within the layers of the 3-D chip stack. Non-free space communication links allow the density of highly parallel input/output ports to increase dramatically over typical edge bus connections. In hybrid processors, where electronics and optics play a role in defining the computational algorithm, free space communication links are typically utilized for, among other reasons, the increased network link complexity which can be achieved. Free space optical interconnections provide bandwidths and interconnection complexity unobtainable in pure electrical interconnections. Stacked 3-D architectures can provide the electronics real estate and structure to deal with the increased bandwidth and global information provided by free space optical communications. This paper will provide definitions and examples of 3-D stacked architectures in optoelectronics processors. The benefits and issues of these technologies will be discussed.
Multicast routing for wavelength-routed WDM networks with dynamic membership
NASA Astrophysics Data System (ADS)
Huang, Nen-Fu; Liu, Te-Lung; Wang, Yao-Tzung; Li, Bo
2000-09-01
Future broadband networks must support integrated services and offer flexible bandwidth usage. In our previous work, we explored the optical link control layer on top of the optical layer, which enables bandwidth-on-demand service directly over wavelength division multiplexed (WDM) networks. Today, more and more applications and services such as video-conferencing software and Virtual LAN service require multicast support over the underlying networks. Currently, it is difficult to provide wavelength multicast over the optical switches without optical/electronic conversions, although such conversions incur extra cost. In this paper, based on the proposed wavelength router architecture (equipped with ATM switches to offer O/E and E/O conversions when necessary), a dynamic multicast routing algorithm is proposed to furnish multicast services over WDM networks. The goal is to join a new group member to the multicast tree so that the cost, including the link cost and the optical/electronic conversion cost, is kept as low as possible. The effectiveness of the proposed wavelength router architecture as well as the dynamic multicast algorithm is evaluated by simulation.
Satellite teleradiology test bed for digital mammography
NASA Astrophysics Data System (ADS)
Barnett, Bruce G.; Dudding, Kathryn E.; Abdel-Malek, Aiman A.; Mitchell, Robert J.
1996-05-01
Teleradiology offers significant improvement in efficiency and patient compliance over current practices in traditional film/screen-based diagnosis. The increasing number of women who need to be screened for breast cancer, including those in remote rural regions, makes the advantages of teleradiology especially attractive for digital mammography. At the same time, the size and resolution of digital mammograms are among the most challenging to support in a cost effective teleradiology system. This paper will describe a teleradiology architecture developed for use with digital mammography by GE Corporate Research and Development in collaboration with Massachusetts General Hospital under National Cancer Institute (NCI/NIH) grant number R01 CA60246-01. The testbed architecture is based on the Digital Imaging and Communications in Medicine (DICOM) standard, created by the American College of Radiology and National Electrical Manufacturers Association. The testbed uses several Sun workstations running SunOS, which emulate a rural examination facility connected to a central diagnostic facility, and uses a TCP-based DICOM application to transfer images over a satellite link. Network performance depends on the product of the bandwidth and the round-trip time. A satellite link has a round trip of 513 milliseconds, making the bandwidth-delay product a significant problem. This type of high bandwidth, high delay network is called a Long Fat Network, or LFN. The goal of this project was to quantify the performance of the satellite link, and evaluate the effectiveness of TCP over an LFN. Four workstations have Sun's HSI/S (High Speed Interface) option. Two are connected by a cable, and two are connected through a satellite link. Both interfaces have the same T1 bandwidth (1.544 Megabits per second). The only difference was the round trip time. Even with large window buffers, the time to transfer a file over the satellite link was significantly longer, due to the bandwidth-delay product.
To compensate for this, TCP extensions for LFNs such as the Window Scaling Option (described in RFC1323) were necessary to optimize the use of the link. A high level analysis of throughput, with and without these TCP extensions, will be discussed. Recommendations will be made as to the critical areas for future work.
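The bandwidth-delay arithmetic behind this abstract can be sketched as follows, using the T1 rate and round-trip time it quotes. The 65,535-byte figure is the largest window a standard 16-bit TCP window field can advertise without the Window Scaling Option; the function names are illustrative, and the testbed's actual window settings are not given in the abstract.

```python
T1_BITS_PER_SECOND = 1.544e6       # T1 bandwidth quoted in the abstract
SATELLITE_RTT_SECONDS = 0.513      # satellite round-trip time quoted in the abstract
MAX_UNSCALED_WINDOW_BYTES = 65535  # 16-bit TCP window field, no window scaling


def bandwidth_delay_product_bytes(bits_per_second, rtt_seconds):
    """Bytes that must be in flight to keep the link full."""
    return bits_per_second * rtt_seconds / 8


def window_limited_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


bdp_bytes = bandwidth_delay_product_bytes(T1_BITS_PER_SECOND, SATELLITE_RTT_SECONDS)
ceiling_bps = window_limited_throughput_bps(MAX_UNSCALED_WINDOW_BYTES, SATELLITE_RTT_SECONDS)

print(f"bandwidth-delay product: {bdp_bytes:.0f} bytes")
print(f"throughput ceiling without window scaling: {ceiling_bps / 1e6:.2f} Mbit/s")
```

Under these numbers the link needs roughly 99 kB in flight, which is more than an unscaled window can cover, so an unscaled sender cannot fill the T1 over the satellite path.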
NASA Technical Reports Server (NTRS)
Rickard, D. A.; Bodenheimer, R. E.
1976-01-01
Digital computer components which perform two dimensional array logic operations (Tse logic) on binary data arrays are described. The properties of Golay transforms which make them useful in image processing are reviewed, and several architectures for Golay transform processors are presented with emphasis on the skeletonizing algorithm. Conventional logic control units developed for the Golay transform processors are described. One is a unique microprogrammable control unit that uses a microprocessor to control the Tse computer. The remaining control units are based on programmable logic arrays. Performance criteria are established and utilized to compare the various Golay transform machines developed. A critique of Tse logic is presented, and recommendations for additional research are included.
NASA Technical Reports Server (NTRS)
Tick, Evan
1987-01-01
This note describes an efficient software emulator for the Warren Abstract Machine (WAM) Prolog architecture. The version of the WAM implemented is called Lcode. The Lcode emulator, written in C, executes the 'naive reverse' benchmark at 3900 LIPS. The emulator is one of a set of tools used to measure the memory-referencing characteristics and performance of Prolog programs. These tools include a compiler, assembler, and memory simulators. An overview of the Lcode architecture is given here, followed by a description and listing of the emulator code implementing each Lcode instruction. This note will be of special interest to those studying the WAM and its performance characteristics. In general, this note will be of interest to those creating efficient software emulators for abstract machine architectures.
Micromachined TWTs for THz Radiation Sources
NASA Technical Reports Server (NTRS)
Booske, John H.; vanderWeide, Daniel W.; Kory, Carol L.; Limbach, S.; Downey, Alan (Technical Monitor)
2001-01-01
The Terahertz (THz) region of the electromagnetic spectrum (about 300 - 3000 GHz in frequency or about 0.1 - 1 mm free space wavelength) has enormous potential for high-data-rate communications, spectroscopy, astronomy, space research, medicine, biology, surveillance, remote sensing, industrial process control, etc. It has been characterized as the most scientifically rich, yet under-utilized, region of the electromagnetic spectrum. The most critical roadblock to full exploitation of the THz band is lack of coherent radiation sources that are powerful (0.001 - 1.0 W continuous wave), efficient (> 1%), frequency agile (instantaneously tunable over 1% bandwidths or more), reliable, and comparatively inexpensive. To develop vacuum electron device (VED) radiation sources satisfying these requirements, fabrication and packaging approaches must be heavily considered to minimize costs, in addition to the basic interaction physics and circuit design. To minimize size of the prime power supply, beam voltage must be minimized, preferably 10 kV. Solid state sources satisfy the low voltage requirement, but are many orders of magnitude below power, efficiency, and bandwidth requirements. On the other hand, typical fast-wave VED sources in this regime (e.g., gyrotrons, FELs) tend to be large, expensive, high voltage and very high power devices unsuitable for most of the applications cited above. VEDs based on grating or inter-digital (ID) circuits have been researched and developed. However, achieving forward-wave amplifier operation with instantaneous fractional bandwidths > 1% is problematic for these devices with low-energy (< 15 kV) electron beams. Moreover, the interaction impedance is quite low unless the beam-circuit spacing is kept particularly narrow, often leading to significant beam interception. One solution to satisfy the THz source requirements mentioned above is to develop micromachined VEDs, or "micro-VEDs". 
Among other benefits, micro-machining technologies provide superior high frequency wall conductivity as a result of superior surface smoothness compared with conventional mechanical or electric discharge machining approaches. Micro-VED technologies are already being applied to the development of millimeter-wave klystrons at Stanford Linear Accelerator Center and submillimeter-wave klystrons at the University of Leeds. We are investigating the use of micro-machining technologies to develop THz regime TWTs, with emphasis on folded-waveguide TWTs. The folded-waveguide TWT (FW-TWT) has several features that make it attractive for THz-regime micro-VED applications. It is a relatively simple circuit to design and fabricate, it is amenable to precision pattern replication by micro-machining, and it has been demonstrated capable of forward-wave amplification with appreciable bandwidth. We are conducting experimental and computational studies of micro-VED FW-TWTs to examine their feasibility for applications at frequencies from 200 - 1000 GHz.
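The frequency and wavelength ranges quoted for the THz band are related by the free-space relation wavelength = c / frequency. A minimal sketch of that conversion (the function name is illustrative):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def free_space_wavelength_mm(frequency_ghz):
    """Free-space wavelength in millimetres for a frequency given in GHz."""
    return SPEED_OF_LIGHT_M_PER_S / (frequency_ghz * 1e9) * 1e3


# Band edges quoted in the abstract, plus the 200 - 1000 GHz study range.
for f_ghz in (200, 300, 1000, 3000):
    print(f"{f_ghz} GHz: {free_space_wavelength_mm(f_ghz):.3f} mm")
```

This reproduces the stated correspondence of about 300 - 3000 GHz with about 1 - 0.1 mm.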
Scalable Motion Estimation Processor Core for Multimedia System-on-Chip Applications
NASA Astrophysics Data System (ADS)
Lai, Yeong-Kang; Hsieh, Tian-En; Chen, Lien-Fei
2007-04-01
In this paper, we describe a high-throughput and scalable motion estimation processor architecture for multimedia system-on-chip applications. The number of processing elements (PEs) is scalable according to the variable algorithm parameters and the performance required for different applications. Using the PE rings efficiently and an intelligent memory-interleaving organization, the efficiency of the architecture can be increased. Moreover, using efficient on-chip memories and a data management technique can effectively decrease the power consumption and memory bandwidth. Techniques for reducing the number of interconnections and external memory accesses are also presented. Our results demonstrate that the proposed scalable PE-ringed architecture is a flexible and high-performance processor core in multimedia system-on-chip applications.
Extending the BEAGLE library to a multi-FPGA platform.
Jin, Zheming; Bakos, Jason D
2013-01-19
Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. 
To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
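The throughput estimate in this abstract follows a simple bandwidth-bound model: arithmetic intensity times peak memory bandwidth times memory efficiency. The sketch below reproduces that arithmetic with the figures quoted above; the function and constant names are illustrative and not part of the BEAGLE API.

```python
FLOPS_PER_BLOCK = 130           # floating-point operations per I/O block (from the abstract)
BYTES_PER_BLOCK = 64            # bytes of I/O per block (from the abstract)
PEAK_BANDWIDTH_GB_PER_S = 76.8  # platform peak memory bandwidth (from the abstract)
MEMORY_EFFICIENCY = 0.5         # approximate achieved fraction of peak (from the abstract)


def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of memory traffic."""
    return flops / bytes_moved


def bandwidth_bound_gflops(intensity, peak_gb_per_s, efficiency):
    """Throughput of a memory-bound kernel: intensity x bandwidth x efficiency."""
    return intensity * peak_gb_per_s * efficiency


intensity = arithmetic_intensity(FLOPS_PER_BLOCK, BYTES_PER_BLOCK)
throughput = bandwidth_bound_gflops(intensity, PEAK_BANDWIDTH_GB_PER_S, MEMORY_EFFICIENCY)

print(f"arithmetic intensity: {intensity:.2f} ops/byte")
print(f"estimated throughput: {throughput:.0f} Gflops")
```

The model makes the design emphasis explicit: with intensity fixed by the algorithm, throughput scales only with memory bandwidth and with how efficiently the implementation uses it.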
Initial Characterization of Optical Communications with Disruption-Tolerant Network Protocols
NASA Technical Reports Server (NTRS)
Schoolcraft, Joshua; Wilson, Keith
2011-01-01
Disruption-tolerant networks (DTNs) are groups of network assets connected with a suite of communication protocol technologies designed to mitigate the effects of link delay and disruption. Application of DTN protocols to diverse groups of network resources in multiple sub-networks results in an overlay network-of-networks with autonomous data routing capability. In space environments where delay or disruption is expected, performance of this type of architecture (such as an interplanetary internet) can increase with the inclusion of new communications mediums and techniques. Space-based optical communication links are therefore an excellent building block of space DTN architectures. When compared to traditional radio frequency (RF) communications, optical systems can provide extremely power-efficient and high bandwidth links bridging sub-networks. Because optical links are more susceptible to link disruption and experience the same light-speed delays as RF, optical-enabled DTN architectures can lessen potential drawbacks and maintain the benefits of autonomous optical communications over deep space distances. These environment-driven expectations - link delay and interruption, along with asymmetric data rates - are the purpose of the proof-of-concept experiment outlined herein. In recognizing the potential of these two technologies, we report an initial experiment and characterization of the performance of a DTN-enabled space optical link. The experiment design employs a point-to-point free-space optical link configured to have asymmetric bandwidth. This link connects two networked systems running a DTN protocol implementation designed and written at JPL for use on spacecraft, and further configured for higher bandwidth performance. 
Comparing baseline data transmission metrics with and without periodic optical link interruptions, the experiment confirmed the DTN protocols' ability to handle real-world unexpected link outages while maintaining capability of reliably delivering data at relatively high rates. Finally, performance characterizations from this data suggest performance optimizations to configuration and protocols for future optical-specific DTN space link scenarios.
MBASIC batch processor architectural overview
NASA Technical Reports Server (NTRS)
Reynolds, S. M.
1978-01-01
The MBASIC (TM) batch processor, a language translator designed to operate in the MBASIC (TM) environment, is described. Features include: (1) a CONVERT TO BATCH command, usable from the ready mode; and (2) translation of the user's program in stages through several levels of intermediate language and optimization. The processor is to be designed and implemented in both machine-independent and machine-dependent sections. The architecture is planned so that optimization processes are transparent to the rest of the system and need not be included in the first design implementation cycle.
Lasercom system architecture with reduced complexity
NASA Technical Reports Server (NTRS)
Lesh, James R. (Inventor); Chen, Chien-Chung (Inventor); Ansari, Homayoon (Inventor)
1994-01-01
Spatial acquisition and precision beam pointing functions are critical to spaceborne laser communication systems. In the present invention, a single high bandwidth CCD detector is used to perform both spatial acquisition and tracking functions. Compared to previous lasercom hardware design, the array tracking concept offers reduced system complexity by reducing the number of optical elements in the design. Specifically, the design requires only one detector and one beam steering mechanism. It also provides the means to optically close the point-ahead control loop. The technology required for high bandwidth array tracking was examined and shown to be consistent with current state of the art. The single detector design can lead to a significantly reduced system complexity and a lower system cost.
LaserCom System Architecture With Reduced Complexity
NASA Technical Reports Server (NTRS)
Lesh, James R. (Inventor); Chen, Chien-Chung (Inventor); Ansari, Homa-Yoon (Inventor)
1996-01-01
Spatial acquisition and precision beam pointing functions are critical to spaceborne laser communication systems. In the present invention a single high bandwidth CCD detector is used to perform both spatial acquisition and tracking functions. Compared to previous lasercom hardware design, the array tracking concept offers reduced system complexity by reducing the number of optical elements in the design. Specifically, the design requires only one detector and one beam steering mechanism. It also provides means to optically close the point-ahead control loop. The technology required for high bandwidth array tracking was examined and shown to be consistent with current state of the art. The single detector design can lead to a significantly reduced system complexity and a lower system cost.
Kwon, Kun-Sup; Yoon, Won-Sang
2010-01-01
In this paper we propose a method of removing from synthesizer output spurious signals due to quasi-amplitude modulation and superposition effect in a frequency-hopping synthesizer with direct digital frequency synthesizer (DDFS)-driven phase-locked loop (PLL) architecture, which has the advantages of high frequency resolution, fast transition time, and small size. There are spurious signals that depend on normalized frequency of DDFS. They can be dominant if they occur within the PLL loop bandwidth. We suggest that such signals can be eliminated by purposefully creating frequency errors in the developed synthesizer.
Adaptive Attitude Control of the Crew Launch Vehicle
NASA Technical Reports Server (NTRS)
Muse, Jonathan
2010-01-01
An H(sub infinity)-NMA architecture for the Crew Launch Vehicle was developed in a state feedback setting. The minimal complexity adaptive law was shown to improve baseline performance relative to a performance metric based on Crew Launch Vehicle design requirements for almost all of the Worst-on-Worst dispersion cases. The adaptive law was able to maintain stability for some dispersions that are unstable with the nominal control law. Due to the nature of the H(sub infinity)-NMA architecture, the augmented adaptive control signal has low bandwidth, which is a great benefit for a manned launch vehicle.
NASA Astrophysics Data System (ADS)
MacMahon, David H. E.; Price, Danny C.; Lebofsky, Matthew; Siemion, Andrew P. V.; Croft, Steve; DeBoer, David; Enriquez, J. Emilio; Gajjar, Vishal; Hellbourg, Gregory; Isaacson, Howard; Werthimer, Dan; Abdurashidova, Zuhra; Bloss, Marty; Brandt, Joe; Creager, Ramon; Ford, John; Lynch, Ryan S.; Maddalena, Ronald J.; McCullough, Randy; Ray, Jason; Whitehead, Mark; Woody, Dave
2018-04-01
The Breakthrough Listen Initiative is undertaking a comprehensive search for radio and optical signatures from extraterrestrial civilizations. An integral component of the project is the design and implementation of wide-bandwidth data recorder and signal processing systems. The capabilities of these systems, particularly at radio frequencies, directly determine survey speed; further, given a fixed observing time and spectral coverage, they determine sensitivity as well. Here, we detail the Breakthrough Listen wide-bandwidth data recording system deployed at the 100 m aperture Robert C. Byrd Green Bank Telescope. The system digitizes up to 6 GHz of bandwidth at 8 bits for both polarizations, storing the resultant 24 GB/s of data to disk. This system is among the highest data rate baseband recording systems in use in radio astronomy. A future system expansion will double recording capacity, to achieve a total Nyquist bandwidth of 12 GHz in two polarizations. In this paper, we present details of the system architecture, along with salient configuration and disk-write optimizations used to achieve high-throughput data capture on commodity compute servers and consumer-class hard disk drives.
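The 24 GB/s figure can be checked with a short calculation. The sketch assumes real-valued Nyquist sampling at twice the digitized bandwidth, which is an assumption made here because it is consistent with the quoted rate; the abstract itself does not state the sampling scheme. Names are illustrative.

```python
def recording_rate_bytes_per_s(bandwidth_hz, bits_per_sample, polarizations):
    """Raw data rate, assuming real-valued Nyquist sampling (2 samples per Hz
    of bandwidth). The sampling scheme is an assumption, not from the abstract."""
    samples_per_second = 2 * bandwidth_hz
    return samples_per_second * bits_per_sample * polarizations / 8


current = recording_rate_bytes_per_s(6e9, bits_per_sample=8, polarizations=2)
expanded = recording_rate_bytes_per_s(12e9, bits_per_sample=8, polarizations=2)

print(f"6 GHz system:  {current / 1e9:.0f} GB/s")
print(f"12 GHz system: {expanded / 1e9:.0f} GB/s")
```

Under the same assumption, the planned expansion to 12 GHz of Nyquist bandwidth would double the raw rate, which illustrates why disk-write throughput is a central design concern.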
Interaction with Machine Improvisation
NASA Astrophysics Data System (ADS)
Assayag, Gerard; Bloch, George; Cont, Arshia; Dubnov, Shlomo
We describe two multi-agent architectures for improvisation-oriented musician-machine interaction systems that learn in real time from human performers. The improvisation kernel is based on sequence modeling and statistical learning. We present two frameworks of interaction with this kernel. In the first, the stylistic interaction is guided by a human operator in front of an interactive computer environment. In the second framework, the stylistic interaction is delegated to machine intelligence and therefore, knowledge propagation and decision are taken care of by the computer alone. The first framework involves a hybrid architecture using two popular composition/performance environments, Max and OpenMusic, that are put to work and communicate together, each one handling the process at a different time/memory scale. The second framework shares the same representational schemes with the first but uses an Active Learning architecture based on collaborative, competitive and memory-based learning to handle stylistic interactions. Both systems are capable of processing real-time audio/video as well as MIDI. After discussing the general cognitive background of improvisation practices, the statistical modelling tools and the concurrent agent architecture are presented. Then, an Active Learning scheme is described and considered in terms of using different improvisation regimes for improvisation planning. Finally, we provide more details about the different system implementations and describe several performances with the system.
HTMT-class Latency Tolerant Parallel Architecture for Petaflops Scale Computation
NASA Technical Reports Server (NTRS)
Sterling, Thomas; Bergman, Larry
2000-01-01
Computational Aero Sciences and other numeric intensive computation disciplines demand computing throughputs substantially greater than the Teraflops scale systems only now becoming available. The related fields of fluids, structures, thermal, combustion, and dynamic controls are among the interdisciplinary areas that in combination with sufficient resolution and advanced adaptive techniques may force performance requirements towards Petaflops. This will be especially true for compute intensive models such as Navier-Stokes or when such system models are only part of a larger design optimization computation involving many design points. Yet recent experience with conventional MPP configurations comprising commodity processing and memory components has shown that larger scale frequently results in higher programming difficulty and lower system efficiency. While important advances in system software and algorithmic techniques have had some impact on efficiency and programmability for certain classes of problems, in general it is unlikely that software alone will resolve the challenges to higher scalability. As in the past, future generations of high-end computers may require a combination of hardware architecture and system software advances to enable efficient operation at a Petaflops level. The NASA-led HTMT project has engaged the talents of a broad interdisciplinary team to develop a new strategy in high-end system architecture to deliver petaflops scale computing in the 2004/5 timeframe. The Hybrid-Technology, MultiThreaded parallel computer architecture incorporates several advanced technologies in combination with an innovative dynamic adaptive scheduling mechanism to provide unprecedented performance and efficiency within practical constraints of cost, complexity, and power consumption.
The emerging superconductor Rapid Single Flux Quantum electronics can operate at 100 GHz (the record is 770 GHz) and one percent of the power required by conventional semiconductor logic. Wave Division Multiplexing optical communications can approach a peak per fiber bandwidth of 1 Tbps and the new Data Vortex network topology employing this technology can connect tens of thousands of ports providing a bi-section bandwidth on the order of a Petabyte per second with latencies well below 100 nanoseconds, even under heavy loads. Processor-in-Memory (PIM) technology combines logic and memory on the same chip exposing the internal bandwidth of the memory row buffers at low latency. And holographic photorefractive storage technologies provide high-density memory with access a thousand times faster than conventional disk technologies. Together these technologies enable a new class of shared memory system architecture with a peak performance in the range of a Petaflops but size and power requirements comparable to today's largest Teraflops scale systems. To achieve high sustained performance, HTMT combines an advanced multithreading processor architecture with a memory-driven coarse-grained latency management strategy called "percolation", yielding high efficiency while reducing much of the parallel programming burden. This paper will present the basic system architecture characteristics made possible through this series of advanced technologies and then give a detailed description of the new percolation approach to runtime latency management.
A Cognitive Systems Engineering Approach to Developing HMI Requirements for New Technologies
NASA Technical Reports Server (NTRS)
Fern, Lisa Carolynn
2016-01-01
This document examines the challenges inherent in designing and regulating to support human-automation interaction for new technologies that will be deployed into complex systems. A key question for new technologies is how work will be accomplished by the human and machine agents. This question has traditionally been framed as how functions should be allocated between humans and machines. Such framing misses the coordination and synchronization that is needed for the different human and machine roles in the system to accomplish their goals. Coordination and synchronization demands are driven by the underlying human-automation architecture of the new technology, which is typically not specified explicitly by the designers. The human machine interface (HMI), which is intended to facilitate human-machine interaction and cooperation, however, typically is defined explicitly and therefore serves as a proxy for human-automation cooperation requirements with respect to technical standards for technologies. Unfortunately, mismatches between the HMI and the coordination and synchronization demands of the underlying human-automation architecture can lead to system breakdowns. A methodology is needed that both designers and regulators can utilize to evaluate the expected performance of a new technology given potential human-automation architectures. Three experiments were conducted to inform the minimum HMI requirements for a detect and avoid system for unmanned aircraft systems (UAS). The results of the experiments provided empirical input to specific minimum operational performance standards that UAS manufacturers will have to meet in order to operate UAS in the National Airspace System (NAS). These studies represent a success story for how to objectively and systematically evaluate prototype technologies as part of the process for developing regulatory requirements.
They also provide an opportunity to reflect on the lessons learned from a recent research effort in order to improve the methodology for defining technology requirements for regulators in the future. The biggest shortcoming of the presented research program was the absence of the explicit definition, generation and analysis of potential human-automation architectures. Failure to execute this step in the research process resulted in less efficient evaluation of the candidate prototype technologies in addition to the complete absence of different approaches to human-automation cooperation. For example, all of the prototype technologies that were evaluated in the research program assumed a human-automation architecture that relied on serial processing from the automation to the human. While this type of human-automation architecture is typical across many different technologies and in many different domains, it ignores different architectures where humans and automation work in parallel. Defining potential human-automation architectures a priori also allows regulators to develop scenarios that will stress the performance boundaries of the technology during the evaluation phase. The importance of adding this step of generating and evaluating candidate human-automation architectures prior to formal empirical evaluation is discussed.
NASA Astrophysics Data System (ADS)
Chao, I.-Fen; Zhang, Tsung-Min
2015-06-01
Long-reach passive optical networks (LR-PONs) have been considered to be promising solutions for future access networks. In this paper, we propose a distributed medium access control (MAC) scheme over an advantageous LR-PON network architecture that reroutes the control information from and back to all ONUs through an (N + 1) × (N + 1) star coupler (SC) deployed near the ONUs, thereby overcoming the extremely long propagation delay problem in LR-PONs. In the network, the control slot is designed to contain all bandwidth requirements of all ONUs and is in-band time-division-multiplexed with a number of data slots within a cycle. In the proposed MAC scheme, a novel profit-weight-based dynamic bandwidth allocation (P-DBA) scheme is presented. The algorithm is designed to efficiently and fairly distribute the amount of excess bandwidth based on a profit value derived from the excess bandwidth usage of each ONU, which resolves the problems of previously reported DBA schemes that are either unfair or inefficient. The simulation results show that the proposed decentralized algorithms exhibit a nearly three-order-of-magnitude improvement in delay performance compared to the centralized algorithms over LR-PONs. Moreover, the newly proposed P-DBA scheme guarantees low delay performance and fairness even when under attack by a malevolent ONU, irrespective of traffic loads and burstiness.
NASA Astrophysics Data System (ADS)
Impemba, Ernesto; Inzerilli, Tiziano
2003-07-01
Integration of satellite access networks with the Internet is seen as a strategic goal to achieve in order to provide ubiquitous broadband access to Internet services in Next Generation Networks (NGNs). One of the main interworking aspects which has been most studied is an efficient management of satellite resources, i.e. bandwidth and buffer space, in order to satisfy the most demanding application requirements as to delay control and bandwidth assurance. In this context, resource management in DVB-S/DVB-RCS satellite technologies, emerging technologies for broadband satellite access and transport of IP applications, is a research issue largely investigated as a means to provide efficient bi-directional communications across satellites. This is in particular one of the principal goals of the SATIP6 project, sponsored within the 5th EU Research Programme Framework, i.e. IST. In this paper we present a possible approach to efficiently exploit bandwidth, the most critical resource in a broadband satellite access network, while pursuing satisfaction of delay and bandwidth requirements for applications with guaranteed QoS through a traffic control architecture to be implemented in ground terminals. Performance of this approach is assessed in terms of efficient exploitation of the uplink bandwidth and differentiation and minimization of queuing delays for the most demanding applications over a time-varying capacity. OPNET simulations are used as the analysis tool.
WATERLOOP V2/64: A highly parallel machine for numerical computation
NASA Astrophysics Data System (ADS)
Ostlund, Neil S.
1985-07-01
Current technological trends suggest that the high performance scientific machines of the future are very likely to consist of a large number (greater than 1024) of processors connected and communicating with each other in some as yet undetermined manner. Such an assembly of processors should behave as a single machine in obtaining numerical solutions to scientific problems. However, the appropriate way of organizing both the hardware and software of such an assembly of processors is an unsolved and active area of research. It is particularly important to minimize the organizational overhead of interprocessor communication, global synchronization, and contention for shared resources if the performance of a large number (n) of processors is to be anything like the desirable n times the performance of a single processor. In many situations, adding a processor actually decreases the performance of the overall system since the extra organizational overhead is larger than the extra processing power added. The systolic loop architecture is a new multiple processor architecture which attempts a solution to the problem of how to organize a large number of asynchronous processors into an effective computational system while minimizing the organizational overhead. This paper gives a brief overview of the basic systolic loop architecture, systolic loop algorithms for numerical computation, and a 64-processor implementation of the architecture, WATERLOOP V2/64, that is being used as a testbed for exploring the hardware, software, and algorithmic aspects of the architecture.
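The overhead argument in this abstract can be illustrated with a toy model. This is only a sketch: the linear overhead term and the constants below are assumptions chosen for illustration and are not taken from the paper.

```python
def run_time(n, work=1.0, overhead_per_proc=0.001):
    """Toy model (assumed): compute time shrinks as work/n,
    while organizational overhead grows linearly with n."""
    return work / n + overhead_per_proc * n


def speedup(n, work=1.0, overhead_per_proc=0.001):
    """Speedup of n processors relative to a single processor."""
    return run_time(1, work, overhead_per_proc) / run_time(n, work, overhead_per_proc)


def best_processor_count(max_n, work=1.0, overhead_per_proc=0.001):
    """Processor count in 1..max_n with the highest modeled speedup."""
    return max(range(1, max_n + 1),
               key=lambda n: speedup(n, work, overhead_per_proc))
```

Under these assumed constants the modeled speedup peaks at a modest processor count and then falls, which is the "adding a processor decreases performance" regime the abstract describes.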
Gonzalez, Enrique; Peña, Raul; Avila, Alfonso; Vargas-Rosales, Cesar; Munoz-Rodriguez, David
2017-01-01
The continuous technological advances in favor of mHealth represent a key factor in the improvement of medical emergency services. This systematic review presents the identification, study, and classification of the most up-to-date approaches surrounding the deployment of architectures for mHealth. Our review includes 25 articles obtained from databases such as IEEE Xplore, Scopus, SpringerLink, ScienceDirect, and SAGE. This review focused on studies addressing mHealth systems for outdoor emergency situations. In 60% of the articles, the deployment architecture relied on the connective infrastructure associated with emergent technologies such as cloud services, distributed services, Internet-of-things, machine-to-machine, vehicular ad hoc network, and service-oriented architecture. In 40% of the literature reviewed, the deployment architecture for mHealth considered traditional connective infrastructure. Only 20% of the studies implemented an energy consumption protocol to extend system lifetime. We concluded that there is a need for more integrated solutions specifically for outdoor scenarios. Energy consumption protocols need to be implemented and evaluated. Emergent connective technologies are redefining information management and overcoming traditional technologies. PMID:29075430
Low-power, transparent optical network interface for high bandwidth off-chip interconnects.
Liboiron-Ladouceur, Odile; Wang, Howard; Garg, Ajay S; Bergman, Keren
2009-04-13
The recent emergence of multicore architectures and chip multiprocessors (CMPs) has accelerated the bandwidth requirements in high-performance processors for both on-chip and off-chip interconnects. For next generation computing clusters, the delivery of scalable power efficient off-chip communications to each compute node has emerged as a key bottleneck to realizing the full computational performance of these systems. The power dissipation is dominated by the off-chip interface and the necessity to drive high-speed signals over long distances. We present a scalable photonic network interface approach that fully exploits the bandwidth capacity offered by optical interconnects while offering significant power savings over traditional E/O and O/E approaches. The power-efficient interface optically aggregates electronic serial data streams into a multiple WDM channel packet structure at time-of-flight latencies. We demonstrate a scalable optical network interface with 70% improvement in power efficiency for a complete end-to-end PCI Express data transfer.
WebPresent: a World Wide Web-based telepresentation tool for physicians
NASA Astrophysics Data System (ADS)
Sampath-Kumar, Srihari; Banerjea, Anindo; Moshfeghi, Mehran
1997-05-01
In this paper, we present the design architecture and the implementation status of WebPresent - a world wide web based tele-presentation tool. This tool allows a physician to use a conference server workstation and make a presentation of patient cases to a geographically distributed audience. The audience consists of other physicians collaborating on patients' health care management and physicians participating in continuing medical education. These physicians are at several locations with networks of different bandwidth and capabilities connecting them. Audiences also receive the patient case information on different computers ranging from high-end display workstations to laptops with low-resolution displays. WebPresent is a scalable networked multimedia tool which supports the presentation of hypertext, images, audio, video, and a white-board to remote physicians with hospital Intranet access. WebPresent allows the audience to receive customized information. The data received can differ in resolution and bandwidth, depending on the availability of resources such as display resolution and network bandwidth.
MWAHCA: A Multimedia Wireless Ad Hoc Cluster Architecture
Diaz, Juan R.; Jimenez, Jose M.; Sendra, Sandra
2014-01-01
Wireless ad hoc networks provide a flexible and adaptable infrastructure to transport data over a great variety of environments. Recently, real-time audio and video data transmission has increased due to the appearance of many multimedia applications. One of the major challenges is to ensure the quality of multimedia streams when they have passed through a wireless ad hoc network. It requires adapting the network architecture to the multimedia QoS requirements. In this paper we propose a new architecture to organize and manage cluster-based ad hoc networks in order to provide multimedia streams. The proposed architecture adapts the wireless network topology in order to improve the quality of audio and video transmissions. In order to achieve this goal, the architecture uses information such as each node's capacity and the QoS parameters (bandwidth, delay, jitter, and packet loss). The architecture splits the network into clusters which are specialized in specific multimedia traffic. The real system performance study provided at the end of the paper demonstrates the feasibility of the proposal. PMID:24737996
Drive to miniaturization: integrated optical networks on mobile platforms
NASA Astrophysics Data System (ADS)
Salour, Michael M.; Batayneh, Marwan; Figueroa, Luis
2011-11-01
With rapid growth of the Internet, bandwidth demand for data traffic is continuing to explode. In addition, emerging and future applications are becoming more and more network centric. With the proliferation of data communication platforms and data-intensive applications (e.g. cloud computing), high-bandwidth materials such as video clips dominating the Internet, and social networking tools, a networking technology is very desirable which can scale the Internet's capability (particularly its bandwidth) by two to three orders of magnitude. As the limits of Moore's law are approached, optical mesh networks based on wavelength-division multiplexing (WDM) have the ability to satisfy the large- and scalable-bandwidth requirements of our future backbone telecommunication networks. In addition, this trend is also affecting other special-purpose systems in applications such as mobile platforms, automobiles, aircraft, ships, tanks, and micro unmanned air vehicles (UAVs) which are becoming independent systems roaming the sky while sensing data, processing, making decisions, and even communicating and networking with other heterogeneous systems. Recently, WDM optical technologies have seen advances in their transmission speeds, switching technologies, routing protocols, and control systems. Such advances have made WDM optical technology an appealing choice for the design of future Internet architectures. Along these lines, scientists across the entire spectrum of network architectures, from the physical layer to applications, have been working on developing devices and communication protocols which can take full advantage of the rapid advances in WDM technology. Nevertheless, the focus has always been on large-scale telecommunication networks that span hundreds and even thousands of miles. Given these advances, we investigate the vision and applicability of integrating the traditionally large-scale WDM optical networks into miniaturized mobile platforms such as UAVs. 
We explain the benefits of WDM optical technology for these applications. We also describe some of the limitations of WDM optical networks as the size of a vehicle gets smaller, such as in micro-UAVs, and study the miniaturization and communication system limitations in such environments.
Yu, Jen-Shiang K; Hwang, Jenn-Kang; Tang, Chuan Yi; Yu, Chin-Hui
2004-01-01
A number of recently released numerical libraries including Automatically Tuned Linear Algebra Subroutines (ATLAS) library, Intel Math Kernel Library (MKL), GOTO numerical library, and AMD Core Math Library (ACML) for AMD Opteron processors, are linked against the executables of the Gaussian 98 electronic structure calculation package, which is compiled by updated versions of Fortran compilers such as Intel Fortran compiler (ifc/efc) 7.1 and PGI Fortran compiler (pgf77/pgf90) 5.0. The ifc 7.1 delivers about 3% of improvement on 32-bit machines compared to the former version 6.0. The performance improvement from pgf77 3.3 to 5.0 is also around 3% when utilizing the original unmodified optimization options of the compiler enclosed in the software. Nevertheless, if extensive compiler tuning options are used, the speed can be further accelerated to about 25%. The performances of these fully optimized numerical libraries are similar. The double-precision floating-point (FP) instruction sets (SSE2) are also functional on AMD Opteron processors operated in 32-bit compilation, and Intel Fortran compiler has performed better optimization. Hardware-level tuning is able to improve memory bandwidth by adjusting the DRAM timing, and the efficiency in the CL2 mode is further accelerated by 2.6% compared to that of the CL2.5 mode. The FP throughput is measured by simultaneous execution of two identical copies of each of the test jobs. The resultant performance impact suggests that IA64 and AMD64 architectures are able to deliver significantly higher throughput than the IA32, which is consistent with the SpecFPrate2000 benchmarks.
NASA Astrophysics Data System (ADS)
Zheng, Jun; Ansari, Nirwan
2005-02-01
Call for Papers: Optical Access Networks With the wide deployment of fiber-optic technology over the past two decades, we have witnessed a tremendous growth of bandwidth capacity in the backbone networks of today's telecommunications infrastructure. However, access networks, which cover the "last-mile" areas and serve numerous residential and small business users, have not been scaled up commensurately. The local subscriber lines for telephone and cable television are still using twisted pairs and coaxial cables. Most residential connections to the Internet are still through dial-up modems operating at a low speed on twisted pairs. As the demand for access bandwidth increases with emerging high-bandwidth applications, such as distance learning, high-definition television (HDTV), and video on demand (VoD), the last-mile access networks have become a bandwidth bottleneck in today's telecommunications infrastructure. To ease this bottleneck, it is imperative to provide sufficient bandwidth capacity in the access networks to open the bottleneck and thus present more opportunities for the provisioning of multiservices. Optical access solutions promise huge bandwidth to service providers and low-cost high-bandwidth services to end users and are therefore widely considered the technology of choice for next-generation access networks. To realize the vision of optical access networks, however, many key issues still need to be addressed, such as network architectures, signaling protocols, and implementation standards. The major challenges lie in the fact that an optical solution must be not only robust, scalable, and flexible, but also implemented at a low cost comparable to that of existing access solutions in order to increase the economic viability of many potential high-bandwidth applications. In recent years, optical access networks have been receiving tremendous attention from both academia and industry. 
A large number of research activities have been carried out or are now underway in this hot area. The purpose of this feature issue is to expose the networking community to the latest research breakthroughs and progress in the area of optical access networks.
Examining the architecture of cellular computing through a comparative study with a computer.
Wang, Degeng; Gribskov, Michael
2005-06-22
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.
Economical ground data delivery
NASA Technical Reports Server (NTRS)
Markley, Richard W.; Byrne, Russell H.; Bromberg, Daniel E.
1994-01-01
Data delivery in the Deep Space Network (DSN) involves transmission of a small amount of constant, high-priority traffic and a large amount of bursty, low priority data. The bursty traffic may be initially buffered and then metered back slowly as bandwidth becomes available. Today both types of data are transmitted over dedicated leased circuits. The authors investigated the potential of saving money by designing a hybrid communications architecture that uses leased circuits for high-priority network communications and dial-up circuits for low-priority traffic. Such an architecture may significantly reduce costs and provide an emergency backup. The architecture presented here may also be applied to any ground station-to-customer network within the range of a common carrier. The authors compare estimated costs for various scenarios and suggest security safeguards that should be considered.
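The buffer-and-meter behavior described above can be sketched as a per-time-step loop. This is a minimal illustration only: the function name, the unit-free capacity figures, and the rule that high-priority traffic is always served first are assumptions, not details taken from the paper.

```python
def meter_traffic(high_priority, bursty_arrivals, link_capacity):
    """Serve high-priority traffic first each step, then drain the
    buffered low-priority backlog with whatever capacity is left.

    Returns the low-priority amount sent per step and the final backlog.
    """
    backlog = 0
    sent_low = []
    for hp, burst in zip(high_priority, bursty_arrivals):
        backlog += burst                        # buffer the new burst
        spare = max(link_capacity - hp, 0)      # capacity left after high priority
        out = min(spare, backlog)               # meter back what fits
        backlog -= out
        sent_low.append(out)
    return sent_low, backlog
```

For example, a single burst of 10 units on a link of capacity 5 that carries a constant 2 units of high-priority traffic drains at 3 units per step until the buffer is empty.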
Real-Time Wavefront Control for the PALM-3000 High Order Adaptive Optics System
NASA Technical Reports Server (NTRS)
Truong, Tuan N.; Bouchez, Antonin H.; Dekany, Richard G.; Guiwits, Stephen R.; Roberts, Jennifer E.; Troy, Mitchell
2008-01-01
We present a cost-effective scalable real-time wavefront control architecture based on off-the-shelf graphics processing units hosted in an ultra-low latency, high-bandwidth interconnect PC cluster environment composed of modules written in the component-oriented language of nesC. The architecture enables full-matrix reconstruction of the wavefront at up to 2 kHz with latency under 250 µs for the PALM-3000 adaptive optics system, a state-of-the-art upgrade on the 5.1 meter Hale Telescope that consists of a 64 x 64 subaperture Shack-Hartmann wavefront sensor and a 3368 active actuator high order deformable mirror in series with a 241 active actuator tweeter DM. The architecture can easily scale up to support much larger AO systems at higher rates and lower latency.
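A rough sense of the compute load of full-matrix reconstruction can be obtained from the figures quoted in the abstract. The sketch below treats reconstruction as one dense matrix-vector product per frame and assumes two slope measurements per subaperture, all subapertures in use, and both mirrors driven from a single matrix; none of these assumptions is stated in the abstract, so the result is an upper-bound style estimate only.

```python
def reconstruction_flops_per_second(n_slopes, n_actuators, frame_rate_hz):
    """One multiply and one add per matrix element, per frame."""
    return 2 * n_slopes * n_actuators * frame_rate_hz


subapertures = 64 * 64             # 64 x 64 Shack-Hartmann sensor
n_slopes = 2 * subapertures        # assumed: x and y slope per subaperture
n_actuators = 3368 + 241           # assumed: both mirrors share one matrix
rate = reconstruction_flops_per_second(n_slopes, n_actuators, 2000)
print(f"{rate / 1e9:.1f} GFLOP/s")
```

Under these assumptions the sustained rate is on the order of 100 GFLOP/s at 2 kHz.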
Exploration of operator method digital optical computers for application to NASA
NASA Technical Reports Server (NTRS)
1990-01-01
Digital optical computer design has been focused primarily towards parallel (single point-to-point interconnection) implementation. This architecture is compared to currently developing VHSIC systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. The focus is on a figure of merit termed Gate Interconnect Bandwidth Product (GIBP). Conventional parallel optical digital computer architecture demonstrates only marginal competitiveness at best when compared to projected semiconductor implementations. Global, analog global, quasi-digital, and full digital interconnects are briefly examined as alternatives to parallel digital computer architecture. Digital optical computing is becoming a very tough competitor to semiconductor technology since it can support a very high degree of three dimensional interconnect density and high degrees of Fan-In without capacitive loading effects at very low power consumption levels.
A visual programming environment for the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl; Crockett, Thomas W.; Middleton, David
1988-01-01
The Navier-Stokes computer is a high-performance, reconfigurable, pipelined machine designed to solve large computational fluid dynamics problems. Due to the complexity of the architecture, development of effective, high-level language compilers for the system appears to be a very difficult task. Consequently, a visual programming methodology has been developed which allows users to program the system at an architectural level by constructing diagrams of the pipeline configuration. These schematic program representations can then be checked for validity and automatically translated into machine code. The visual environment is illustrated by using a prototype graphical editor to program an example problem.
NASA Astrophysics Data System (ADS)
Fern, Lisa Carolynn
This dissertation examines the challenges inherent in designing and regulating to support human-automation interaction for new technologies that will be deployed into complex systems. A key question for new technologies with increasingly capable automation, is how work will be accomplished by human and machine agents. This question has traditionally been framed as how functions should be allocated between humans and machines. Such framing misses the coordination and synchronization that is needed for the different human and machine roles in the system to accomplish their goals. Coordination and synchronization demands are driven by the underlying human-automation architecture of the new technology, which are typically not specified explicitly by designers. The human machine interface (HMI), which is intended to facilitate human-machine interaction and cooperation, typically is defined explicitly and therefore serves as a proxy for human-automation cooperation requirements with respect to technical standards for technologies. Unfortunately, mismatches between the HMI and the coordination and synchronization demands of the underlying human-automation architecture can lead to system breakdowns. A methodology is needed that both designers and regulators can utilize to evaluate the predicted performance of a new technology given potential human-automation architectures. Three experiments were conducted to inform the minimum HMI requirements for a detect and avoid (DAA) system for unmanned aircraft systems (UAS). The results of the experiments provided empirical input to specific minimum operational performance standards that UAS manufacturers will have to meet in order to operate UAS in the National Airspace System (NAS). These studies represent a success story for how to objectively and systematically evaluate prototype technologies as part of the process for developing regulatory requirements. 
They also provide an opportunity to reflect on the lessons learned in order to improve the methodology for defining technology requirements for regulators in the future. The biggest shortcoming of the presented research program was the absence of the explicit definition, generation and analysis of potential human-automation architectures. Failure to execute this step in the research process resulted in less efficient evaluation of the candidate prototype technologies, in addition to a lack of exploration of different approaches to human-automation cooperation. Defining potential human-automation architectures a priori also allows regulators to develop scenarios that will stress the performance boundaries of the technology during the evaluation phase. The importance of adding this step of generating and evaluating candidate human-automation architectures prior to formal empirical evaluation is discussed. This document concludes with a look at both the importance of, and the challenges facing, the inclusion of examining human-automation coordination issues as part of the safety assurance activities of new technologies.
2014-11-01
networks were trained to predict an individual's electrocardiogram (ECG) and arterial blood pressure (ABP) waveform data, which can potentially help...various ESN architectures for prediction tasks, and establishes the benefits of using ESN architecture designs for predicting ECG and ABP waveforms...arterial blood pressure (ABP) waveforms immediately prior to the machine generated alarms. When tested, the algorithm suppressed approximately 59.7
Multiprocessor Z-Buffer Architecture for High-Speed, High Complexity Computer Image Generation.
1983-12-01
Oversampling 50 17. "Poking Through" Effects 51 18. Sampling Paths 52 19. Triangle Variables 54 20. Intelligent Tiling Algorithm 61 21. Tiler Functional Blocks...64 * 22. HSD Interface 65 23. Tiling Machine Setup 67 24. Tiling Machine 68 25. Tile Accumulate 69 26. A lx$ Sorting Machine 77 27. A 2x8 Sorting...Delay 227 87. Effect of Triangle Size on Tiler Throughput Rates 229 88. Tiling Machine Setup Stage Performance for Oversample Mode 234 89. Tiling
Embedded control system for computerized franking machine
NASA Astrophysics Data System (ADS)
Shi, W. M.; Zhang, L. B.; Xu, F.; Zhan, H. W.
2007-12-01
This paper presents a novel control system for a franking machine. A methodology for operating a franking machine using functional controls consisting of connection, configuration and franking electromechanical drive is studied. A set of enabling technologies to synthesize postage management software architectures driven microprocessor-based embedded systems is proposed. The cryptographic algorithm that calculates mail items is analyzed to enhance the postal indicia accountability and security. The study indicated that the franking machine offers reliability, performance and flexibility in printing mail items.
Trends and New Directions in Software Architecture
2014-10-10
frameworks Open source Cloud strategies NoSQL Machine Learning MDD Incremental approaches Dashboards Distributed development...complexity grows NoSQL Models are not created equal 2014 Our Current Research Lightweight Evaluation and Architecture Prototyping for Big Data
NASA Technical Reports Server (NTRS)
Boriakoff, Valentin
1994-01-01
The goal of this project was the feasibility study of a particular architecture of a digital signal processing machine operating in real time which could do in a pipeline fashion the computation of the fast Fourier transform (FFT) of a time-domain sampled complex digital data stream. The particular architecture makes use of simple identical processors (called inner product processors) in a linear organization called a systolic array. Through computer simulation the new architecture to compute the FFT with systolic arrays was proved to be viable, and computed the FFT correctly and with the predicted particulars of operation. Integrated circuits to compute the operations expected of the vital node of the systolic architecture were proven feasible, and even with a 2 micron VLSI technology can execute the required operations in the required time. Actual construction of the integrated circuits was successful in one variant (fixed point) and unsuccessful in the other (floating point).
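The role of the inner product processor can be sketched as follows. This is not the report's FFT architecture: it is a plain O(N²) discrete Fourier transform in which every output is accumulated by repeated inner-product steps of the form acc + a·b, and it does not model the systolic pipeline timing. The function names are hypothetical.

```python
import cmath


def inner_product_step(acc, a, b):
    """The basic cell operation assumed here: accumulate a * b into acc."""
    return acc + a * b


def dft_by_inner_products(samples):
    """Direct DFT in which each output bin is built from inner-product steps."""
    n = len(samples)
    out = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            twiddle = cmath.exp(-2j * cmath.pi * k * t / n)
            acc = inner_product_step(acc, x, twiddle)
        out.append(acc)
    return out
```

A constant input such as [1, 1, 1, 1] puts all of its energy in bin 0, which gives a quick sanity check on the accumulation.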
NASA Technical Reports Server (NTRS)
Weeks, Cindy Lou
1986-01-01
Experiments were conducted at NASA Ames Research Center to define multi-tasking software requirements for multiple-instruction, multiple-data stream (MIMD) computer architectures. The focus was on specifying solutions for algorithms in the field of computational fluid dynamics (CFD). The program objectives were to allow researchers to produce usable parallel application software as soon as possible after acquiring MIMD computer equipment, to provide researchers with an easy-to-learn and easy-to-use parallel software language which could be implemented on several different MIMD machines, and to enable researchers to list preferred design specifications for future MIMD computer architectures. Analysis of CFD algorithms indicated that extensions of an existing programming language, adaptable to new computer architectures, provided the best solution to meeting program objectives. The CoFORTRAN Language was written in response to these objectives and to provide researchers a means to experiment with parallel software solutions to CFD algorithms on machines with parallel architectures.
Guidance and control for unmanned ground vehicles
NASA Astrophysics Data System (ADS)
Bateman, Peter J.
1994-06-01
Techniques for the guidance, control, and navigation of unmanned ground vehicles are described in terms of the communication bandwidth requirements for driving and control of a vehicle remote from the human operator. Modes of operation are conveniently classified as conventional teleoperation, supervisory control, and fully autonomous control. The fundamental problem of maintaining a robust non-line-of-sight communications link between the human controller and the remote vehicle is discussed, as this provides the impetus for greater autonomy in the control system and the greatest scope for innovation. While supervisory control still requires the man to be providing the primary navigational intelligence, fully autonomous operation requires that mission navigation is provided solely by on-board machine intelligence. Methods directed at achieving this performance are described using various active and passive sensing of the terrain for route navigation and obstacle detection. Emphasis is given to TV imagery and signal processing techniques for image understanding. Reference is made to the limitations of current microprocessor technology and suitable computer architectures. Some of the more recent control techniques involve the use of neural networks, fuzzy logic, and data fusion and these are discussed in the context of road following and cross country navigation. Examples of autonomous vehicle testbeds operated at various laboratories around the world are given.
Providing QoS through machine-learning-driven adaptive multimedia applications.
Ruiz, Pedro M; Botía, Juan A; Gómez-Skarmeta, Antonio
2004-06-01
We investigate the optimization of the quality of service (QoS) offered by real-time multimedia adaptive applications through machine learning algorithms. These applications are able to adapt in real time their internal settings (i.e., video sizes, audio and video codecs, among others) to the unpredictably changing capacity of the network. Traditional adaptive applications just select a set of settings to consume less than the available bandwidth. We propose a novel approach in which the selected set of settings is the one which offers a better user-perceived QoS among all those combinations which satisfy the bandwidth restrictions. We use a genetic algorithm to decide when to trigger the adaptation process depending on the network conditions (i.e., loss-rate, jitter, etc.). Additionally, the selection of the new set of settings is done according to a set of rules which model the user-perceived QoS. These rules are learned using the SLIPPER rule induction algorithm over a set of examples extracted from scores provided by real users. We will demonstrate that the proposed approach guarantees a good user-perceived QoS even when the network conditions are constantly changing.
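The selection rule described in this abstract (pick the setting with the best user-perceived QoS among those that fit the available bandwidth) can be sketched as follows. This is only an illustration: the `Setting` class, the candidate names, and all numbers are hypothetical, and the scores stand in for the output of the learned rule set, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Setting:
    name: str              # hypothetical label for a codec/size combination
    bandwidth_kbps: float  # bandwidth the combination consumes
    perceived_qos: float   # assumed score from a user-perception model


def pick_setting(settings: Sequence[Setting], available_kbps: float) -> Optional[Setting]:
    """Return the feasible setting with the highest perceived QoS, or None."""
    feasible = [s for s in settings if s.bandwidth_kbps <= available_kbps]
    if not feasible:
        return None
    return max(feasible, key=lambda s: s.perceived_qos)


# Illustrative, made-up candidates.
candidates = [
    Setting("low", 100, 2.1),
    Setting("audio-first", 250, 3.8),
    Setting("video-first", 280, 3.2),
    Setting("high", 600, 4.5),
]
best = pick_setting(candidates, available_kbps=300)
print(best.name if best else "no feasible setting")
```

A purely bandwidth-driven policy would accept any of the three feasible candidates; ranking by perceived QoS is what distinguishes the approach the abstract proposes.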
Ultra-broadband and planar sound diffuser with high uniformity of reflected intensity
NASA Astrophysics Data System (ADS)
Fan, Xu-Dong; Zhu, Yi-Fan; Liang, Bin; Yang, Jing; Yang, Jun; Cheng, Jian-Chun
2017-09-01
Schroeder diffusers, as a classical design of acoustic diffusers proposed over 40 years ago, play key roles in many practical scenarios ranging from architectural acoustics to noise control to particle manipulation. Despite the great success of conventional acoustic diffusers, it is still worth pursuing ideal acoustic diffusers that are essentially expected to produce perfect sound diffuse reflection within the unlimited bandwidth. Here, we propose a different mechanism for designing acoustic diffusers to overcome the basic limits in intensity uniformity and working bandwidth in the previous designs and demonstrate a practical implementation by acoustic metamaterials with dispersionless phase-steering capability. In stark contrast to the existing production of diffuse fields relying on random scattering of sound energy by using a specific mathematical number sequence of periodically distributed unit cells, we directly mold the reflected wavefront into the desired shape by precisely manipulating the local phases of individual subwavelength metastructures. We also benchmark our design via numerical simulation with a commercially available Schroeder diffuser, and the results verify that our proposed diffuser scatters incident acoustic energy into all directions more uniformly within an ultra-broad band regardless of the incident angle. Furthermore, our design enables further improvement of the working bandwidth just by simply downscaling each individual element. With ultra-broadband functionality and high uniformity of reflected intensity, our metamaterial-based production of the diffusive field opens a route to the design and application of acoustic diffusers and may have a significant impact on various fields such as architectural acoustics and medical ultrasound imaging/treatment.
NASA Astrophysics Data System (ADS)
Lazar, Aurel A.; White, John S.
1987-07-01
Theoretical analysis of an integrated local area network (ILAN) model of MAGNET, an integrated network testbed developed at Columbia University, shows that the bandwidth freed up during video and voice calls, in periods of little movement in the images and periods of silence in the speech signals, could be utilized efficiently for graphics and data transmission. Based on these investigations, an architecture supporting adaptive protocols that are dynamically controlled by the requirements of a fluctuating load and changing user environment has been advanced. To further analyze the behavior of the network, a real-time packetized video system has been implemented. This system is embedded in the real-time multimedia workstation EDDY, which integrates video, voice, and data traffic flows. Protocols supporting variable-bandwidth, fixed-quality packetized video transport are described in detail.
NASA Astrophysics Data System (ADS)
Lazar, Aurel A.; White, John S.
1986-11-01
Theoretical analysis of an ILAN model of MAGNET, an integrated network testbed developed at Columbia University, shows that the bandwidth freed up by video and voice calls during periods of little movement in the images and silence periods in the speech signals could be utilized efficiently for graphics and data transmission. Based on these investigations, an architecture supporting adaptive protocols that are dynamically controlled by the requirements of a fluctuating load and changing user environment has been advanced. To further analyze the behavior of the network, a real-time packetized video system has been implemented. This system is embedded in the real-time multimedia workstation EDDY, which integrates video, voice, and data traffic flows. Protocols supporting variable-bandwidth, constant-quality packetized video transport are described in detail.
An All-Optical Access Metro Interface for Hybrid WDM/TDM PON Based on OBS
NASA Astrophysics Data System (ADS)
Segarra, Josep; Sales, Vicent; Prat, Josep
2007-04-01
A new all-optical access metro network interface based on optical burst switching (OBS) is proposed. A hybrid wavelength-division multiplexing/time-division multiplexing (WDM/TDM) access architecture with reflective optical network units (ONUs), an arrayed-waveguide-grating outside plant, and a tunable laser stack at the optical line terminal (OLT) is presented as a solution for the passive optical network. By means of OBS and a dynamic bandwidth allocation (DBA) protocol, which polls the ONUs, the available access bandwidth is managed. All the network intelligence and costly equipment is located at the OLT, where the DBA module is centrally implemented, providing quality of service (QoS). To scale this access network, an optical cross connect (OXC) is then used to attain a large number of ONUs by the same OLT. The hybrid WDM/TDM structure is also extended toward the metropolitan area network (MAN) by introducing the concept of OBS multiplexer (OBS-M). The network element OBS-M bridges the MAN and access networks by offering all-optical cross connection, wavelength conversion, and data signaling. The proposed innovative OBS-M node yields a full optical data network, interfacing access and metro with a geographically distributed access control. The resulting novel access metro architectures are nonblocking and, with an improved signaling, provide QoS, scalability, and very low latency. Finally, numerical analysis and simulations demonstrate the traffic performance of the proposed access scheme and all-optical access metro interface and architectures.
Live Virtual Constructive Distributed Test Environment Characterization Report
NASA Technical Reports Server (NTRS)
Murphy, Jim; Kim, Sam K.
2013-01-01
This report documents message latencies observed over various Live, Virtual, Constructive, (LVC) simulation environment configurations designed to emulate possible system architectures for the Unmanned Aircraft Systems (UAS) Integration in the National Airspace System (NAS) Project integrated tests. For each configuration, four scenarios with progressively increasing air traffic loads were used to determine system throughput and bandwidth impacts on message latency.
Adaptive packet switch with an optical core (demonstrator)
NASA Astrophysics Data System (ADS)
Abdo, Ahmad; Bishtein, Vadim; Clark, Stewart A.; Dicorato, Pino; Lu, David T.; Paredes, Sofia A.; Taebi, Sareh; Hall, Trevor J.
2004-11-01
A three-stage opto-electronic packet switch architecture is described consisting of a reconfigurable optical centre stage surrounded by two electronic buffering stages partitioned into sectors to ease memory contention. A Flexible Bandwidth Provision (FBP) algorithm, implemented on a soft-core processor, is used to change the configuration of the input sectors and optical centre stage to set up internal paths that will provide variable bandwidth to serve the traffic. The switch is modeled by a bipartite graph built from a service matrix, which is a function of the arriving traffic. The bipartite graph is decomposed by solving an edge-colouring problem and the resulting permutations are used to configure the switch. Simulation results show that this architecture exhibits a dramatic reduction of complexity and increased potential for scalability, at the price of only a modest spatial speed-up k, 1
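The decomposition step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' FBP algorithm: it assumes an integer service matrix whose rows and columns all sum to the same value k, and it peels off one perfect matching at a time, which amounts to edge-colouring the corresponding bipartite multigraph with k colours. The function names and the example matrix are hypothetical.

```python
def find_perfect_matching(matrix):
    """Kuhn's augmenting-path matching on the positive entries of a square matrix.

    Returns perm with perm[row] = column, or None if no perfect matching exists.
    """
    n = len(matrix)
    match_col = [-1] * n  # match_col[c] = row currently matched to column c

    def try_row(r, seen):
        for c in range(n):
            if matrix[r][c] > 0 and not seen[c]:
                seen[c] = True
                if match_col[c] == -1 or try_row(match_col[c], seen):
                    match_col[c] = r
                    return True
        return False

    for r in range(n):
        if not try_row(r, [False] * n):
            return None
    perm = [0] * n
    for c, r in enumerate(match_col):
        perm[r] = c
    return perm


def decompose(service):
    """Split a matrix with equal row/column sums into permutations (one per colour)."""
    remaining = [row[:] for row in service]  # work on a copy
    perms = []
    while any(any(row) for row in remaining):
        perm = find_perfect_matching(remaining)
        if perm is None:
            raise ValueError("row and column sums must all be equal")
        for r, c in enumerate(perm):
            remaining[r][c] -= 1
        perms.append(perm)
    return perms


# Hypothetical 3x3 service matrix; every row and column sums to 3.
service = [[2, 1, 0],
           [0, 2, 1],
           [1, 0, 2]]
perms = decompose(service)
for slot, perm in enumerate(perms):
    print(f"slot {slot}: input -> output mapping {perm}")
```

Each returned permutation is one conflict-free switch configuration; applying them in sequence serves every entry of the service matrix exactly as many times as requested.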
Serial Back-Plane Technologies in Advanced Avionics Architectures
NASA Technical Reports Server (NTRS)
Varnavas, Kosta
2005-01-01
Current back plane technologies such as VME, and current personal computer back planes such as PCI, are shared bus systems that can exhibit nondeterministic latencies. This means a card can take control of the bus and use resources indefinitely, affecting the ability of other cards in the back plane to acquire the bus. This significantly reduces the reliability of the system. Additionally, these parallel busses only have bandwidths in the 100s of megahertz range, and EMI and noise effects get worse the higher the bandwidth goes. To provide scalable, fault-tolerant, advanced computing systems, more applicable to today's connected computing environment and to better meet the needs of future requirements for advanced space instruments and vehicles, serial back-plane technologies should be implemented in advanced avionics architectures. Serial backplane technologies eliminate the problem of one card getting the bus and never relinquishing it, or one minor problem on the backplane bringing the whole system down. Being serial instead of parallel reduces many of the signal integrity issues associated with parallel back planes and thus significantly improves reliability. The increased speeds associated with a serial backplane are an added bonus.
The sixth generation robot in space
NASA Technical Reports Server (NTRS)
Butcher, A.; Das, A.; Reddy, Y. V.; Singh, H.
1990-01-01
The knowledge based simulator developed in the artificial intelligence laboratory has become a working test bed for experimenting with intelligent reasoning architectures. With this simulator, recently, small experiments have been done with an aim to simulate robot behavior to avoid colliding paths. An automatic extension of such experiments to intelligently planning robots in space demands advanced reasoning architectures. One such architecture for general purpose problem solving is explored. The robot, seen as a knowledge base machine, goes via predesigned abstraction mechanism for problem understanding and response generation. The three phases in one such abstraction scheme are: abstraction for representation, abstraction for evaluation, and abstraction for resolution. Such abstractions require multimodality. This multimodality requires the use of intensional variables to deal with beliefs in the system. Abstraction mechanisms help in synthesizing possible propagating lattices for such beliefs. The machine controller enters into a sixth generation paradigm.
Suciu, George; Suciu, Victor; Martian, Alexandru; Craciunescu, Razvan; Vulpe, Alexandru; Marcu, Ioana; Halunga, Simona; Fratu, Octavian
2015-11-01
Big data storage and processing are considered as one of the main applications for cloud computing systems. Furthermore, the development of the Internet of Things (IoT) paradigm has advanced the research on Machine to Machine (M2M) communications and enabled novel tele-monitoring architectures for E-Health applications. However, there is a need for converging current decentralized cloud systems, general software for processing big data and IoT systems. The purpose of this paper is to analyze existing components and methods of securely integrating big data processing with cloud M2M systems based on Remote Telemetry Units (RTUs) and to propose a converged E-Health architecture built on Exalead CloudView, a search based application. Finally, we discuss the main findings of the proposed implementation and future directions.
Communication acoustics in Bell Labs
NASA Astrophysics Data System (ADS)
Flanagan, J. L.
2004-05-01
Communication acoustics has been a central theme in Bell Labs research since its inception. Telecommunication serves human information exchange. And, humans favor spoken language as a principal mode. The atmospheric medium typically provides the link between articulation and hearing. Creation, control and detection of sound, and the human's facility for generation and perception are basic ingredients of telecommunication. Electronics technology of the 1920s ushered in great advances in communication at a distance, a strong economical impetus being to overcome bandwidth limitations of wireline and cable. Early research established criteria for speech transmission with high quality and intelligibility. These insights supported exploration of means for efficient transmission, that is, obtaining the greatest amount of speech information over a given bandwidth. Transoceanic communication was initiated by undersea cables for telegraphy. But these long cables exhibited very limited bandwidth (on the order of a few hundred Hz). The challenge of sending voice across the oceans spawned perhaps the best known speech compression technique of history, the Vocoder, which parametrized the signal for transmission in about 300 Hz bandwidth, one-tenth that required for the typical waveform channel. Quality and intelligibility were grave issues (and they still are). At the same time parametric representation offered possibilities for encryption and privacy inside a traditional voice bandwidth. Confidential conversations between Roosevelt and Churchill during World War II were carried over high-frequency radio by an encrypted vocoder system known as Sigsaly. Major engineering advances in the late 1940s and early 1950s moved telecommunications into a new regime: digital technology.
These key advances were at least three: (i) new understanding of time-discrete (sampled) representation of signals, (ii) digital computation (especially binary based), and (iii) evolving capabilities in microelectronics that ultimately provided circuits of enormous complexity with low cost and power. Digital transmission (as exemplified in pulse code modulation (PCM) and its many derivatives) became a telecommunication mainstay, along with switches to control and route information in digital form. Concomitantly, storage means for digital information advanced, providing another impetus for speech compression. More and more, humans saw the need to exchange speech information with machines, as well as with other humans. Human-machine speech communication came to full stride in the early 1990s, and now has expanded to multimodal domains that begin to support enhanced naturalness, using contemporaneous sight, sound and touch signaling. Packet transmission is supplanting circuit switching, and voice and video are commonly being carried by Internet protocol.
Ntofon, Okung-Dike; Channegowda, Mayur P; Efstathiou, Nikolaos; Rashidi Fard, Mehdi; Nejabati, Reza; Hunter, David K; Simeonidou, Dimitra
2013-02-25
In this paper, a novel Software-Defined Networking (SDN) architecture is proposed for high-end Ultra High Definition (UHD) media applications. UHD media applications require huge amounts of bandwidth that can only be met with high-capacity optical networks. In addition, there are requirements for control frameworks capable of delivering effective application performance with efficient network utilization. A novel SDN-based Controller that tightly integrates application-awareness with network control and management is proposed for such applications. An OpenFlow-enabled test-bed demonstrator is reported with performance evaluations of advanced online and offline media- and network-aware schedulers.
A 1V low power second-order delta-sigma modulator for biomedical signal application.
Hsu, Chih-Han; Tang, Kea-Tiong
2013-01-01
This paper presents the design and implementation of a low-power delta-sigma modulator for biomedical applications in a standard 90 nm CMOS technology. The delta-sigma modulator is implemented as a second-order feedforward architecture. A low-quiescent-current operational transconductance amplifier (OTA) is utilized to reduce power consumption. The modulator operates from a 1 V power supply and achieves a 64.87 dB signal-to-noise-and-distortion ratio (SNDR) in a 10 kHz bandwidth with an oversampling ratio (OSR) of 64. The power consumption is 17.14 µW, and the figure-of-merit (FOM) is 0.60 pJ/conv.
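The reported figure of merit can be cross-checked from the other numbers in this abstract. The sketch below assumes the widely used Walden definition, FOM = P / (2 · BW · 2^ENOB) with ENOB = (SNDR − 1.76) / 6.02; the abstract does not state which definition the authors used, so that choice and the function names are assumptions.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_pj(power_w: float, bandwidth_hz: float, sndr_db: float) -> float:
    """Walden figure of merit in pJ per conversion step (assumed definition)."""
    conversions_per_s = 2.0 * bandwidth_hz * 2.0 ** enob(sndr_db)
    return power_w / conversions_per_s * 1e12


# Values taken from the abstract: 17.14 uW, 10 kHz bandwidth, 64.87 dB SNDR.
fom = walden_fom_pj(17.14e-6, 10e3, 64.87)
print(f"ENOB = {enob(64.87):.2f} bits, FOM = {fom:.2f} pJ/conv")
```

Under this assumption the result rounds to the 0.60 pJ/conv quoted in the abstract, which suggests the three reported quantities are mutually consistent.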
Parallel-Processing CMOS Circuitry for M-QAM and 8PSK TCM
NASA Technical Reports Server (NTRS)
Gray, Andrew; Lee, Dennis; Hoy, Scott; Fisher, Dave; Fong, Wai; Ghuman, Parminder
2009-01-01
There has been some additional development of parts reported in "Multi-Modulator for Bandwidth-Efficient Communication" (NPO-40807), NASA Tech Briefs, Vol. 32, No. 6 (June 2009), page 34. The focus was on 1) the generation of M-order quadrature amplitude modulation (M-QAM) and octonary-phase-shift-keying, trellis-coded modulation (8PSK TCM); 2) the use of square-root raised-cosine pulse-shaping filters; 3) a parallel-processing architecture that enables low-speed complementary metal oxide/semiconductor (CMOS) circuitry to perform the coding, modulation, and pulse-shaping computations at a high rate; and 4) implementation of the architecture in a CMOS field-programmable gate array.
NASA Astrophysics Data System (ADS)
Benini, Luca
2017-06-01
The "internet of everything" envisions trillions of connected objects loaded with high-bandwidth sensors requiring massive amounts of local signal processing, fusion, pattern extraction and classification. From the computational viewpoint, the challenge is formidable and can be addressed only by pushing computing fabrics toward massive parallelism and brain-like energy efficiency levels. CMOS technology can still take us a long way toward this goal, but technology scaling is losing steam. Energy efficiency improvement will increasingly hinge on architecture, circuits, design techniques such as heterogeneous 3D integration, mixed-signal preprocessing, event-based approximate computing and non-Von-Neumann architectures for scalable acceleration.
Predicate calculus for an architecture of multiple neural networks
NASA Astrophysics Data System (ADS)
Consoli, Robert H.
1990-08-01
Future projects with neural networks will require multiple individual network components. Current efforts along these lines are ad hoc. This paper relates the neural network to a classical device and derives a multi-part architecture from that model. Further, it provides a Predicate Calculus variant for describing the location and nature of the trainings and suggests Resolution Refutation as a method for determining the performance of the system as well as the location of needed trainings for specific proofs. 2. THE NEURAL NETWORK AND A CLASSICAL DEVICE. Recently investigators have been making reports about architectures of multiple neural networks [1-4]. These efforts are appearing at an early stage in neural network investigations; they are characterized by architectures suggested directly by the problem space. Touretzky and Hinton suggest an architecture for processing logical statements [1]; the design of this architecture arises from the syntax of a restricted class of logical expressions and exhibits syntactic limitations. In similar fashion, a multiple neural network arises out of a control problem [2], from the sequence learning problem [3], and from the domain of machine learning [4]. But a general theory of multiple neural devices is missing. More general attempts to relate single or multiple neural networks to classical computing devices are not common, although an attempt is made to relate single neural devices to a Turing machine, and Sun et al. develop a multiple neural architecture that performs pattern classification.
Broadband spectroscopy of dynamic impedances with short chirp pulses.
Min, M; Land, R; Paavle, T; Parve, T; Annus, P; Trebbels, D
2011-07-01
An impedance spectrum of dynamic systems is time dependent. Fast impedance changes take place, for example, in high throughput microfluidic devices and in operating cardiovascular systems. Measurements must be as short as possible to avoid significant impedance changes during the spectrum analysis, and as long as possible for enlarging the excitation energy and obtaining a better signal-to-noise ratio (SNR). The authors propose to use specific short chirp pulses for excitation. Thanks to the specific properties of the chirp function, it is possible to meet the needs for a spectrum bandwidth, measurement time and SNR so that the most accurate impedance spectrogram can be obtained. The chirp wave excitation can include thousands of cycles when the impedance changes slowly, but in the case of very high speed changes it can be shorter than a single cycle, preserving the same excitation bandwidth. For example, a 100 kHz bandwidth can be covered by chirp pulses with durations from 10 µs to 1 s; only the excitation energy differs, also by a factor of 10^5. After discussing theoretical short chirp properties in detail, the authors show how to generate short chirps in the microsecond range with a bandwidth up to a few MHz by using digital synthesis architectures developed inside a low-cost standard field programmable gate array.
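The duration/energy trade-off in this abstract can be illustrated numerically. The sketch below assumes a unit-amplitude linear chirp sweeping from 0 Hz to the stated bandwidth, which is only one possible pulse shape and not necessarily the authors' waveform; the function name and sample counts are arbitrary choices.

```python
import math


def chirp_energy(duration_s: float, bandwidth_hz: float, n_samples: int) -> float:
    """Energy of sin(pi * (B/T) * t^2) over [0, T], by midpoint integration.

    The instantaneous frequency rises linearly from 0 Hz to bandwidth_hz.
    """
    rate = bandwidth_hz / duration_s  # sweep rate in Hz per second
    dt = duration_s / n_samples
    energy = 0.0
    for i in range(n_samples):
        t = (i + 0.5) * dt
        energy += math.sin(math.pi * rate * t * t) ** 2 * dt
    return energy


# Same 100 kHz bandwidth, two very different pulse durations.
short = chirp_energy(10e-6, 100e3, 1_000)     # half a cycle in total
long_ = chirp_energy(1.0, 100e3, 1_000_000)   # about 50,000 cycles
ratio = long_ / short
print(f"energy ratio long/short = {ratio:.3g}")
```

The ratio comes out on the order of 10^5, in line with the abstract. It is not exactly the ratio of the durations, because the sub-cycle pulse does not average sin² to one half.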
THE COMPUTER AND THE ARCHITECTURAL PROFESSION.
ERIC Educational Resources Information Center
HAVILAND, DAVID S.
THE ROLE OF ADVANCING TECHNOLOGY IN THE FIELD OF ARCHITECTURE IS DISCUSSED IN THIS REPORT. PROBLEMS IN COMMUNICATION AND THE DESIGN PROCESS ARE IDENTIFIED. ADVANTAGES AND DISADVANTAGES OF COMPUTERS ARE MENTIONED IN RELATION TO MAN AND MACHINE INTERACTION. PRESENT AND FUTURE IMPLICATIONS OF COMPUTER USAGE ARE IDENTIFIED AND DISCUSSED WITH RESPECT…
Computer Security Primer: Systems Architecture, Special Ontology and Cloud Virtual Machines
ERIC Educational Resources Information Center
Waguespack, Leslie J.
2014-01-01
With the increasing proliferation of multitasking and Internet-connected devices, security has reemerged as a fundamental design concern in information systems. The shift of IS curricula toward a largely organizational perspective of security leaves little room for focus on its foundation in systems architecture, the computational underpinnings of…
Managing Parallelism and Resources in Scientific Dataflow Programs
1990-03-01
1983. [52] K. Hiraki , K. Nishida, S. Sekiguchi, and T. Shimada. Maintainence architecture and its LSI implementation of a dataflow computer with a... Hiraki , and K. Nishida. An architecture of a data flow machine and its evaluation. In Proceedings of CompCon 84, pages 486-490. IEEE, 1984. [84] N
Extending the BEAGLE library to a multi-FPGA platform
2013-01-01
Background Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein’s pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein’s pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. Results The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform’s peak memory bandwidth and the implementation’s memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE’s CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE’s GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. 
Conclusions The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor. PMID:23331707
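The bandwidth-bound performance model stated in the Results can be reproduced in a few lines. All input values come from the abstract; the variable names are illustrative.

```python
# Values quoted in the abstract.
flops_per_block = 130      # floating-point operations per block of I/O
bytes_per_block = 64       # bytes of I/O per block
peak_bandwidth_gbs = 76.8  # platform peak memory bandwidth, GB/s
memory_efficiency = 0.5    # fraction of peak bandwidth actually achieved

# Arithmetic intensity in operations per byte.
intensity = flops_per_block / bytes_per_block

# Memory-bound throughput: intensity x peak bandwidth x memory efficiency.
throughput_gflops = intensity * peak_bandwidth_gbs * memory_efficiency

print(f"arithmetic intensity = {intensity:.2f} ops/byte")
print(f"estimated throughput = {throughput_gflops:.0f} Gflops")
```

Because the kernel is memory bound, throughput in this model scales linearly with both peak bandwidth and memory efficiency, which is why the Conclusions emphasize memory efficiency as the design target.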
New laser glass for short pulsed laser applications: the BLG80 (Conference Presentation)
NASA Astrophysics Data System (ADS)
George, Simi A.
2017-03-01
To achieve the highest peak powers in a solid state laser (SSL) system, significant energy output and short pulses are necessary. For mode-locked lasers, it is well known from the Fourier theorem that the largest gain bandwidths produce the narrowest pulse widths, and such pulses are transform limited. For an inhomogeneously broadened line width of a laser medium, if the intensity of the pulses follows a Gaussian function, then the resulting mode-locked pulse will have a Gaussian shape with the emission bandwidth/pulse duration relationship $\tau_{pulse} \geq 0.44\,\lambda_0^2/(c\,\Delta\lambda)$. Thus, for high peak power SSL systems, laser designers incorporate gain materials capable of broad emission bandwidths. Available energy outputs from a phosphate glass host doped with rare-earth ions are unparalleled. Unfortunately, the emission bandwidths achievable from glass-based gain materials are typically many factors smaller than those of the Ti:Sapphire crystal. In order to overcome this limitation, a hybrid "mixed" laser glass amplifier - OPCPA approach was developed. The Texas petawatt laser that is currently in operation at the University of Texas-Austin and producing high peak powers uses this hybrid architecture. In this mixed-glass laser design, a phosphate and a silicate glass are used in series to achieve the broader bandwidth required before compression. Though proven, this technology is still insufficient for future compact petawatt and exawatt systems capable of producing high energies and shorter pulse durations. New glasses with bandwidths two to three times larger than what is now available from glass hosts are needed if there is to be an alternative to Ti:Sapphire for laser designers. In this paper, we present new materials that may meet the necessary characteristics and demonstrate their laser and emission characteristics through internal and external studies.
Rofoee, Bijan Rahimzadeh; Zervas, Georgios; Yan, Yan; Amaya, Norberto; Qin, Yixuan; Simeonidou, Dimitra
2013-03-11
The paper presents a novel network architecture-on-demand approach using on-chip and off-chip implementations, enabling programmable, highly efficient and transparent networking, well suited for intra-datacenter communications. The implemented FPGA-based adaptable line-card with on-chip design, along with an architecture on demand (AoD) based off-chip flexible switching node, delivers single-chip dual L2-Packet/L1-time shared optical network (TSON) server Network Interface Cards (NICs) interconnected through a transparent AoD-based switch. It enables hitless adaptation between Ethernet over wavelength switched network (EoWSON) and TSON-based sub-wavelength switching, providing flexible bitrates while meeting strict bandwidth and QoS requirements. The on- and off-chip performance results show high throughput (9.86 Gbps Ethernet, 8.68 Gbps TSON), high QoS, as well as hitless switch-over.
Automatic Adaptation of Tunable Distributed Applications
2001-01-01
size, weight, and battery life, with a single CPU, less memory, smaller hard disk, and lower bandwidth network connectivity. The power of PDAs is...wireless, and bluetooth [32] facilities; thus achieving different rates of data transmission. 1 With the trend of “write once, run everywhere...applications, a single component can execute on multiple processors (or machines) in parallel. These parallel applications, written in a specialized language
Li, Siqi; Jiang, Huiyan; Pang, Wenbo
2017-05-01
Accurate cell grading of cancerous tissue pathological image is of great importance in medical diagnosis and treatment. This paper proposes a joint multiple fully connected convolutional neural network with extreme learning machine (MFC-CNN-ELM) architecture for hepatocellular carcinoma (HCC) nuclei grading. First, in preprocessing stage, each grayscale image patch with the fixed size is obtained using center-proliferation segmentation (CPS) method and the corresponding labels are marked under the guidance of three pathologists. Next, a multiple fully connected convolutional neural network (MFC-CNN) is designed to extract the multi-form feature vectors of each input image automatically, which considers multi-scale contextual information of deep layer maps sufficiently. After that, a convolutional neural network extreme learning machine (CNN-ELM) model is proposed to grade HCC nuclei. Finally, a back propagation (BP) algorithm, which contains a new up-sample method, is utilized to train MFC-CNN-ELM architecture. The experiment comparison results demonstrate that our proposed MFC-CNN-ELM has superior performance compared with related works for HCC nuclei grading. Meanwhile, external validation using ICPR 2014 HEp-2 cell dataset shows the good generalization of our MFC-CNN-ELM architecture. Copyright © 2017 Elsevier Ltd. All rights reserved.
An Energy-Efficient Multi-Tier Architecture for Fall Detection Using Smartphones.
Guvensan, M Amac; Kansiz, A Oguz; Camgoz, N Cihan; Turkmen, H Irem; Yavuz, A Gokhan; Karsligil, M Elif
2017-06-23
Automatic detection of fall events is vital to providing fast medical assistance to the casualty, particularly when the injury causes loss of consciousness. Optimization of the energy consumption of mobile applications, especially those which run 24/7 in the background, is essential for longer use of smartphones. In order to improve energy efficiency without compromising fall detection performance, we propose a novel 3-tier architecture that combines simple thresholding methods with machine learning algorithms. The proposed method is implemented in a mobile application, called uSurvive, for Android smartphones. It runs as a background service, monitors the activities of a person in daily life, and automatically sends a notification to the appropriate authorities and/or user-defined contacts when it detects a fall. The performance of the proposed method was evaluated in terms of fall detection performance and energy consumption. Real-life performance tests conducted on two different models of smartphone demonstrate that our 3-tier architecture with feature reduction could save up to 62% of energy compared to machine-learning-only solutions. In addition to this energy saving, the hybrid method achieves 93% accuracy, which is superior to thresholding methods and better than machine-learning-only solutions.
An Efficient, Highly Flexible Multi-Channel Digital Downconverter Architecture
NASA Technical Reports Server (NTRS)
Goodhart, Charles E.; Soriano, Melissa A.; Navarro, Robert; Trinh, Joseph T.; Sigman, Elliott H.
2013-01-01
In this innovation, a digital downconverter has been created that produces a large (16 or greater) number of output channels of smaller bandwidths. Additionally, this design has the flexibility to tune each channel independently to anywhere in the input bandwidth to cover a wide range of output bandwidths (from 32 MHz down to 1 kHz). Both the flexibility in channel frequency selection and the more than four orders of magnitude range in output bandwidths (decimation rates from 32 to 640,000) presented significant challenges to be solved. The solution involved breaking the digital downconversion process into a two-stage process. The first stage is a 2× oversampled filter bank that divides the whole input bandwidth as a real input signal into seven overlapping, contiguous channels represented with complex samples. Using the symmetry of the sine and cosine functions in a similar way to that of an FFT (fast Fourier transform), this downconversion is very efficient and gives seven channels fixed in frequency. An arbitrary number of smaller bandwidth channels can be formed from second-stage downconverters placed after the first stage of downconversion. Because of the overlapping of the first stage, there is no gap in coverage of the entire input bandwidth. The input to any of the second-stage downconverting channels has a multiplexer that chooses one of the seven wideband channels from the first stage. These second-stage downconverters take up fewer resources because they operate at lower bandwidths than doing the entire downconversion process from the input bandwidth for each independent channel. These second-stage downconverters are each independent with fine frequency control tuning, providing extreme flexibility in positioning the center frequency of a downconverted channel.
Finally, these second-stage downconverters have flexible decimation factors over four orders of magnitude. The algorithm was developed to run in an FPGA (field programmable gate array) at input data sampling rates of up to 1,280 MHz. The current implementation takes a 1,280-MHz real input, and first breaks it up into seven 160-MHz complex channels, each spaced 80 MHz apart. The eighth channel at baseband was not required for this implementation, and omitting it allowed further optimization. Afterwards, 16 second-stage narrowband channels with independently tunable center frequencies and bandwidth settings are implemented. A future implementation in a larger Xilinx FPGA will hold up to 32 independent second-stage channels.
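The multiplexer step described above can be illustrated with a short sketch. It assumes the first-stage layout quoted in the abstract (seven 160-MHz complex channels centred every 80 MHz, with the baseband channel omitted); the function names and the "nearest centre" selection rule are hypothetical and are not taken from the actual FPGA design.

```python
# Assumed first-stage layout from the abstract: seven complex channels,
# each 160 MHz wide, centred at 80, 160, ..., 560 MHz.
CHANNEL_WIDTH_MHZ = 160.0
CHANNEL_SPACING_MHZ = 80.0
NUM_CHANNELS = 7


def channel_center(k):
    """Centre frequency (MHz) of first-stage channel k, for k = 1..7."""
    return k * CHANNEL_SPACING_MHZ


def select_wideband_channel(center_mhz, bandwidth_mhz):
    """Pick a first-stage channel that fully contains the requested band.

    Hypothetical rule: among the channels that contain the whole band,
    choose the one whose centre is closest to the requested centre.
    Returns None when no single first-stage channel contains the band.
    """
    low = center_mhz - bandwidth_mhz / 2.0
    high = center_mhz + bandwidth_mhz / 2.0
    best = None
    for k in range(1, NUM_CHANNELS + 1):
        c = channel_center(k)
        half = CHANNEL_WIDTH_MHZ / 2.0
        if low >= c - half and high <= c + half:
            if best is None or abs(c - center_mhz) < abs(channel_center(best) - center_mhz):
                best = k
    return best


def fine_tuning_offset(center_mhz, k):
    """Offset (MHz) the second-stage tuner must remove after channel k is chosen."""
    return center_mhz - channel_center(k)
```

Because adjacent channels overlap by half their width, any band narrower than 80 MHz that lies inside the 0 to 640 MHz input range falls entirely within at least one channel, which is the "no gap in coverage" property the abstract mentions.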
Combline designs improve mm-wave filter performance
NASA Astrophysics Data System (ADS)
Hey-Shipton, Gregory L.
1990-10-01
Combline filters with 2- to 75-percent bandwidths and orders up to 19 are discussed. They are realized as coupled rectangular coaxial transmission lines, since this type of transmission line is characterized by machinability and the wide variation in coupling coefficients that can be realized with rectangular bars. A broadband combline filter designed as a 19th-order, 0.01-dB equal-ripple Chebyshev type is presented, along with a third-order 0.001-dB equal-ripple Chebyshev filter with a 200-MHz bandwidth centered at 8.0 GHz. Interfaces to standard 50-ohm coaxial lines, as well as structures for waveguide interfaces are described, and focus is placed on a two-step impedance transformer matching a 538-ohm waveguide characteristic impedance to a 95-ohm filter terminal impedance.
Wang, Zhihui; Kiryu, Tohru
2006-04-01
Since machine-based exercise still uses local facilities, it is affected by time and place. We designed a web-based system architecture based on the Java 2 Enterprise Edition that can accomplish continuously supported machine-based exercise. In this system, exercise programs and machines are loosely coupled and dynamically integrated on the site of exercise via the Internet. We then extended the conventional health promotion model, which contains three types of players (users, exercise trainers, and manufacturers), by adding a new player: exercise program creators. Moreover, we developed a self-describing strategy to accommodate a variety of exercise programs and provide ease of use to users on the web. We illustrate our novel design with examples taken from our feasibility study on a web-based cycle ergometer exercise system. A biosignal-based workload control approach was introduced to ensure that users performed appropriate exercise alone.
A Multi-Component Automated Laser-Origami System for Cyber-Manufacturing
NASA Astrophysics Data System (ADS)
Ko, Woo-Hyun; Srinivasa, Arun; Kumar, P. R.
2017-12-01
Cyber-manufacturing systems can be enhanced by an integrated network architecture that is easily configurable, reliable, and scalable. We consider a cyber-physical system for use in an origami-type laser-based custom manufacturing machine employing folding and cutting of sheet material to manufacture 3D objects. We have developed such a system for use in a laser-based autonomous custom manufacturing machine equipped with real-time sensing and control. The basic elements in the architecture are built around the laser processing machine. They include a sensing system to estimate the state of the workpiece, a control system determining control inputs for a laser system based on the estimated data and user’s job requests, a robotic arm manipulating the workpiece in the work space, and middleware, named Etherware, supporting the communication among the systems. We demonstrate automated 3D laser cutting and bending to fabricate a 3D product as an experimental result.
A Cloud-based Approach to Medical NLP
Chard, Kyle; Russell, Michael; Lussier, Yves A.; Mendonça, Eneida A; Silverstein, Jonathan C.
2011-01-01
Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN. PMID:22195072
A cloud-based approach to medical NLP.
Chard, Kyle; Russell, Michael; Lussier, Yves A; Mendonça, Eneida A; Silverstein, Jonathan C
2011-01-01
Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN.
Design and analysis of APD photoelectric detecting circuit
NASA Astrophysics Data System (ADS)
Fang, R.; Wang, C.
2015-11-01
In a LADAR system, the photoelectric detecting circuit is the key part of photoelectric conversion, and it determines the response speed, sensitivity, and fidelity of the system. This paper presents the design of a matched APD photoelectric detecting circuit. The circuit accomplishes low-noise readout and high-gain amplification of the weak photoelectric signal. The main performance characteristics of the circuit, especially noise and transient response, are analyzed. In order to obtain a large bandwidth, decompensated operational amplifiers are applied. Circuit simulations allow the architecture to be validated and the overall performance to be predicted. The simulation results show that the gain of the detecting circuit is 630 kΩ, the bandwidth is 100 MHz, and a dynamic range of 28 dB is achieved. Furthermore, the variation of the output pulse width is less than 0.9 ns.
NASA Technical Reports Server (NTRS)
Lin, Shu (Principal Investigator); Uehara, Gregory T.; Nakamura, Eric; Chu, Cecilia W. P.
1996-01-01
The (64, 40, 8) subcode of the third-order Reed-Muller (RM) code for high-speed satellite communications is proposed. The RM subcode can be used either alone or as an inner code of a concatenated coding system with the NASA standard (255, 223, 33) Reed-Solomon (RS) code as the outer code to achieve high performance (or low bit-error rate) with reduced decoding complexity. It can also be used as a component code in a multilevel bandwidth-efficient coded modulation system to achieve reliable bandwidth-efficient data transmission. The progress made toward achieving the goal of implementing a decoder system based upon this code is summarized. The development of the integrated circuit prototype sub-trellis IC, particularly focusing on the design methodology, is addressed.
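The code parameters quoted above can be checked with the standard textbook formulas; the sketch below is illustrative and the function names are hypothetical. The full third-order RM code of length 64 has dimension 42, so a (64, 40, 8) subcode drops two information bits while keeping minimum distance 8. For a Reed-Solomon code the minimum distance is n - k + 1, so a distance of 33 at length 255 corresponds to k = 223.

```python
from math import comb


def reed_muller_params(r, m):
    """Return (n, k, d) of the r-th order Reed-Muller code RM(r, m)."""
    n = 2 ** m                                  # block length
    k = sum(comb(m, i) for i in range(r + 1))   # dimension
    d = 2 ** (m - r)                            # minimum distance
    return n, k, d


def reed_solomon_min_distance(n, k):
    """Minimum distance of an (n, k) Reed-Solomon code (it meets the Singleton bound)."""
    return n - k + 1


print(reed_muller_params(3, 6))
print(reed_solomon_min_distance(255, 223))
```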
NASA Astrophysics Data System (ADS)
Murshid, Syed H.; Muralikrishnan, Hari P.; Kozaitis, Samuel P.
2012-06-01
Bandwidth increase has always been an important area of research in communications. A novel multiplexing technique known as Spatial Domain Multiplexing (SDM) has been developed at the Optronics Laboratory of Florida Institute of Technology to increase the bandwidth to the Tbit/s range. In this technique, space inside the fiber is used effectively to transmit up to four channels of the same wavelength at the same time. Experimental and theoretical analysis shows that these channels follow independent helical paths inside the fiber without interfering with each other. Multiple pigtail laser sources of exactly the same wavelength are used to launch light into a single carrier fiber in such a fashion that the resulting channels follow independent helical trajectories. These helically propagating light beams form optical vortices inside the fiber and carry their own Orbital Angular Momentum (OAM). The outputs of these beams appear as concentric donut-shaped rings when projected on a screen. This endeavor presents the experimental outputs and simulated results for a four-channel spatially multiplexed system, effectively increasing the system bandwidth by a factor of four.
Open Architecture Data System for NASA Langley Combined Loads Test System
NASA Technical Reports Server (NTRS)
Lightfoot, Michael C.; Ambur, Damodar R.
1998-01-01
The Combined Loads Test System (COLTS) is a new structures test complex that is being developed at NASA Langley Research Center (LaRC) to test large curved panels and cylindrical shell structures. These structural components are representative of aircraft fuselage sections of subsonic and supersonic transport aircraft and cryogenic tank structures of reusable launch vehicles. Test structures are subjected to combined loading conditions that simulate realistic flight load conditions. The facility consists of two pressure-box test machines and one combined loads test machine. Each test machine possesses a unique set of requirements for research data acquisition and real-time data display. Given the complex nature of the mechanical and thermal loads to be applied to the various research test articles, each data system has been designed with connectivity attributes that support both data acquisition and data management functions. This paper addresses the research-driven data acquisition requirements for each test machine and demonstrates how an open architecture data system design not only meets those needs but also provides robust data sharing between data systems, including the various control systems which apply spectra of mechanical and thermal loading profiles.
NASA Astrophysics Data System (ADS)
Benedetti, Marcello; Realpe-Gómez, John; Perdomo-Ortiz, Alejandro
2018-07-01
Machine learning has been presented as one of the key applications for near-term quantum technologies, given its high commercial value and wide range of applicability. In this work, we introduce the quantum-assisted Helmholtz machine: a hybrid quantum–classical framework with the potential of tackling high-dimensional real-world machine learning datasets on continuous variables. Instead of using quantum computers only to assist deep learning, as previous approaches have suggested, we use deep learning to extract a low-dimensional binary representation of data, suitable for processing on relatively small quantum computers. Then, the quantum hardware and deep learning architecture work together to train an unsupervised generative model. We demonstrate this concept using 1644 quantum bits of a D-Wave 2000Q quantum device to model a sub-sampled version of the MNIST handwritten digit dataset with 16 × 16 continuous valued pixels. Although we illustrate this concept on a quantum annealer, adaptations to other quantum platforms, such as ion-trap technologies or superconducting gate-model architectures, could be explored within this flexible framework.
Ritchie, Marylyn D; White, Bill C; Parker, Joel S; Hahn, Lance W; Moore, Jason H
2003-01-01
Background: Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases. Results: Using simulated data, we show that a genetic programming optimized neural network approach is able to model gene-gene interactions as well as a traditional back propagation neural network. Furthermore, the genetic programming optimized neural network is better than the traditional back propagation neural network approach in terms of predictive ability and power to detect gene-gene interactions when non-functional polymorphisms are present. Conclusion: This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases. PMID:12846935
NASA Astrophysics Data System (ADS)
Pleros, N.; Kalfas, G.; Mitsolidou, C.; Vagionas, C.; Tsiokos, D.; Miliou, A.
2017-01-01
Future broadband access networks in the 5G framework will need to be bilateral, exploiting both optical and wireless technologies. This paper deals with new approaches and synergies on radio-over-fiber (RoF) technologies and how those can be leveraged to seamlessly converge wireless technology for agility and mobility with passive optical networks (PON)-based backhauling. The proposed convergence paradigm is based upon a holistic network architecture mixing mm-wave wireless access with photonic integration, dynamic capacity allocation and network coding schemes to enable high-bandwidth and low-latency fixed and 60 GHz wireless personal area communications for gigabit rate per user, proposing and deploying on top a Medium-Transparent MAC (MT-MAC) protocol as a low-latency bandwidth allocation mechanism. We have evaluated alternative network topologies between the central office (CO) and the access point module (APM) for data rates up to 2.5 Gb/s and SC frequencies up to 60 GHz. Optical network coding is demonstrated for SCM-based signaling to enhance bandwidth utilization and facilitate optical-wireless convergence in 5G applications, reporting medium-transparent network coding directly at the physical layer between end-users communicating over a RoF infrastructure. Towards equipping the physical layer with the appropriate agility to support MT-MAC protocols, a monolithic InP-based Remote Antenna Unit optoelectronic PIC interface is shown that ensures control over the optical resource allocation while at the same time assisting broadband wireless service. Finally, the MT-MAC protocol is analysed, and simulation and analytical results are presented that are found to be in good agreement, confirming latency values lower than 1 ms for small- to mid-load conditions.
WDM mid-board optics for chip-to-chip wavelength routing interconnects in the H2020 ICT-STREAMS
NASA Astrophysics Data System (ADS)
Kanellos, G. T.; Pleros, N.
2017-02-01
Multi-socket server boards have emerged to increase the processing power density on the board level and further flatten the data center networks beyond leaf-spine architectures. Scaling however the number of processors per board puts current electronic technologies into challenge, as it requires high bandwidth interconnects and high throughput switches with increased number of ports that are currently unavailable. On-board optical interconnection has proved the potential to efficiently satisfy the bandwidth needs, but their use has been limited to parallel links without performing any smart routing functionality. With CWDM optical interconnects already a commodity, cyclical wavelength routing proposed to fit the datacom for rack-to-rack and board-to-board communication now becomes a promising on-board routing platform. ICT-STREAMS is a European research project that aims to combine WDM parallel on-board transceivers with a cyclical AWGR, in order to create a new board-level, chip-to-chip interconnection paradigm that will leverage WDM parallel transmission to a powerful wavelength routing platform capable to interconnect multiple processors with unprecedented bandwidth and throughput capacity. Direct, any-to-any, on-board interconnection of multiple processors will significantly contribute to further flatten the data centers and facilitate east-west communication. In the present communication, we present ICT-STREAMS on-board wavelength routing architecture for multiple chip-to-chip interconnections and evaluate the overall system performance in terms of throughput and latency for several schemes and traffic profiles. We also review recent advances of the ICT-STREAMS platform key-enabling technologies that span from Si in-plane lasers and polymer based electro-optical circuit boards to silicon photonics transceivers and photonic-crystal amplifiers.
Deep learning methods for protein torsion angle prediction.
Li, Haiou; Hou, Jie; Adhikari, Badri; Lyu, Qiang; Cheng, Jianlin
2017-09-18
Deep learning is one of the most powerful machine learning methods and has achieved state-of-the-art performance in many domains. Since deep learning was introduced to the field of bioinformatics in 2012, it has achieved success in a number of areas such as protein residue-residue contact prediction, secondary structure prediction, and fold recognition. In this work, we developed deep learning methods to improve the prediction of torsion (dihedral) angles of proteins. We design four different deep learning architectures to predict protein torsion angles. The architectures include a deep neural network (DNN) and a deep restricted Boltzmann machine (DRBM), as well as a deep recurrent neural network (DRNN) and a deep recurrent restricted Boltzmann machine (DReRBM), since protein torsion angle prediction is a sequence-related problem. In addition to existing protein features, two new features (predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments) are used as input to each of the four deep learning architectures to predict phi and psi angles of the protein backbone. The mean absolute error (MAE) of phi and psi angles predicted by DRNN, DReRBM, DRBM and DNN is about 20-21° and 29-30° on an independent dataset. The MAE of the phi angle is comparable to existing methods, but the MAE of the psi angle is 29°, 2° lower than existing methods. On the latest CASP12 targets, our methods also achieved performance better than or comparable to a state-of-the-art method. Our experiment demonstrates that deep learning is a valuable method for predicting protein torsion angles. The deep recurrent network architecture performs slightly better than the deep feed-forward architecture, and the predicted residue contact number and the error distribution of torsion angles extracted from sequence fragments are useful features for improving prediction accuracy.
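Since torsion angles are periodic, a mean absolute error over them is usually computed on the wrapped difference, so that 179° and -179° count as 2° apart rather than 358°. The sketch below shows that convention; it is an assumption about how such a metric is typically defined, not a statement of the exact evaluation code used in the paper, and the function names are hypothetical.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def angular_mae(predicted_deg, true_deg):
    """Mean absolute error over paired angle lists, respecting periodicity."""
    if len(predicted_deg) != len(true_deg) or not predicted_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(angular_difference(p, t) for p, t in zip(predicted_deg, true_deg))
    return total / len(predicted_deg)
```

For example, `angular_mae([179.0, 10.0], [-179.0, 30.0])` averages a wrapped error of 2° and a plain error of 20°.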
Development Of A Three-Dimensional Circuit Integration Technology And Computer Architecture
NASA Astrophysics Data System (ADS)
Etchells, R. D.; Grinberg, J.; Nudd, G. R.
1981-12-01
This paper is the first of a series 1,2,3 describing a range of efforts at Hughes Research Laboratories, which are collectively referred to as "Three-Dimensional Microelectronics." The technology being developed is a combination of a unique circuit fabrication/packaging technology and a novel processing architecture. The packaging technology greatly reduces the parasitic impedances associated with signal-routing in complex VLSI structures, while simultaneously allowing circuit densities orders of magnitude higher than the current state-of-the-art. When combined with the 3-D processor architecture, the resulting machine exhibits a one- to two-order of magnitude simultaneous improvement over current state-of-the-art machines in the three areas of processing speed, power consumption, and physical volume. The 3-D architecture is essentially that commonly referred to as a "cellular array", with the ultimate implementation having as many as 512 x 512 processors working in parallel. The three-dimensional nature of the assembled machine arises from the fact that the chips containing the active circuitry of the processor are stacked on top of each other. In this structure, electrical signals are passed vertically through the chips via thermomigrated aluminum feedthroughs. Signals are passed between adjacent chips by micro-interconnects. This discussion presents a broad view of the total effort, as well as a more detailed treatment of the fabrication and packaging technologies themselves. The results of performance simulations of the completed 3-D processor executing a variety of algorithms are also presented. Of particular pertinence to the interests of the focal-plane array community is the simulation of the UNICORNS nonuniformity correction algorithms as executed by the 3-D architecture.
Evaluating science return in space exploration initiative architectures
NASA Technical Reports Server (NTRS)
Budden, Nancy Ann; Spudis, Paul D.
1993-01-01
Science is an important aspect of the Space Exploration Initiative, a program to explore the Moon and Mars with people and machines. Different SEI mission architectures are evaluated on the basis of three variables: access (to the planet's surface), capability (including number of crew, equipment, and supporting infrastructure), and time (being the total number of man-hours available for scientific activities). This technique allows us to estimate the scientific return to be expected from different architectures and from different implementations of the same architecture. Our methodology allows us to maximize the scientific return from the initiative by illuminating the different emphases and returns that result from the alternative architectural decisions.
Flexible Endian Adjustment for Cross Architecture Binary Translation
NASA Astrophysics Data System (ADS)
Zhu, Tong; Liu, Bo; Guan, Haibing; Liang, Alei
Different architectures and/or ISA (Instruction Set Architecture) representations use different data arrangement formats in memory. Therefore, the adjustment of byte packing order (endianness) is indispensable in cross-architecture binary translation if the source and target machines are of heterogeneous endianness; without it, the mismatch may cause system failure. The issue is inconspicuous but may lead to a significant performance bottleneck. This paper investigates the key aspects of endianness and finds several solutions to endian adjustment for cross-architecture binary translation. In particular, it considers the two principal methods of this field, byte swapping and address swizzling, and compares them in our DBT (Dynamic Binary Translator), CrossBit.
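The two methods named above can be contrasted in a small sketch. It assumes a big-endian guest and a little-endian host with 32-bit words; the function names are hypothetical and this is not CrossBit's implementation. Byte swapping keeps guest memory in guest order and reverses bytes on each word access, while address swizzling stores words in host order and instead adjusts the address of sub-word accesses.

```python
import struct


def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return (((value & 0x000000FF) << 24) |
            ((value & 0x0000FF00) << 8) |
            ((value & 0x00FF0000) >> 8) |
            ((value & 0xFF000000) >> 24))


def load_word_byte_swapping(guest_memory, addr):
    """Byte swapping: memory holds guest (big-endian) bytes; swap after a host load."""
    host_value = struct.unpack_from("<I", guest_memory, addr)[0]  # little-endian host load
    return bswap32(host_value)


def load_byte_swizzled(host_memory, addr):
    """Address swizzling: words are stored in host order, so a guest byte
    address is XORed with 3 to find the matching byte inside the word."""
    return host_memory[addr ^ 3]


guest_memory = bytes([0x11, 0x22, 0x33, 0x44])   # guest word 0x11223344, big-endian layout
host_memory = struct.pack("<I", 0x11223344)      # same word stored in host order

print(hex(load_word_byte_swapping(guest_memory, 0)))
print([hex(load_byte_swizzled(host_memory, a)) for a in range(4)])
```

With swizzling, aligned word loads need no extra work and the cost moves to byte and halfword accesses; with byte swapping, every multi-byte access pays for the swap. Which trade-off wins depends on the access mix of the translated program.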
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Ang; Song, Shuaiwen; Brugel, Eric
To continuously comply with Moore’s Law, modern parallel machines become increasingly complex. Effectively tuning application performance for these machines therefore becomes a daunting task. Moreover, identifying performance bottlenecks at application and architecture level, as well as evaluating various optimization strategies, are becoming extremely difficult when the entanglement of numerous correlated factors is being presented. To tackle these challenges, we present a visual analytical model named “X”. It is intuitive and sufficiently flexible to track all the typical features of a parallel machine.
Application of multirate digital filter banks to wideband all-digital phase-locked loops design
NASA Technical Reports Server (NTRS)
Sadr, Ramin; Shah, Biren; Hinedi, Sami
1993-01-01
A new class of architecture for all-digital phase-locked loops (DPLL's) is presented in this article. These architectures, referred to as parallel DPLL (PDPLL), employ multirate digital filter banks (DFB's) to track signals with a lower processing rate than the Nyquist rate, without reducing the input (Nyquist) bandwidth. The PDPLL basically trades complexity for hardware-processing speed by introducing parallel processing in the receiver. It is demonstrated here that the DPLL performance is identical to that of a PDPLL for both steady-state and transient behavior. A test signal with a time-varying Doppler characteristic is used to compare the performance of both the DPLL and the PDPLL.
Application of multirate digital filter banks to wideband all-digital phase-locked loops design
NASA Astrophysics Data System (ADS)
Sadr, Ramin; Shah, Biren; Hinedi, Sami
1993-06-01
A new class of architecture for all-digital phase-locked loops (DPLL's) is presented in this article. These architectures, referred to as parallel DPLL (PDPLL), employ multirate digital filter banks (DFB's) to track signals with a lower processing rate than the Nyquist rate, without reducing the input (Nyquist) bandwidth. The PDPLL basically trades complexity for hardware-processing speed by introducing parallel processing in the receiver. It is demonstrated here that the DPLL performance is identical to that of a PDPLL for both steady-state and transient behavior. A test signal with a time-varying Doppler characteristic is used to compare the performance of both the DPLL and the PDPLL.
Application of multirate digital filter banks to wideband all-digital phase-locked loops design
NASA Astrophysics Data System (ADS)
Sadr, R.; Shah, B.; Hinedi, S.
1992-11-01
A new class of architecture for all-digital phase-locked loops (DPLL's) is presented in this article. These architectures, referred to as parallel DPLL (PDPLL), employ multirate digital filter banks (DFB's) to track signals with a lower processing rate than the Nyquist rate, without reducing the input (Nyquist) bandwidth. The PDPLL basically trades complexity for hardware-processing speed by introducing parallel processing in the receiver. It is demonstrated here that the DPLL performance is identical to that of a PDPLL for both steady-state and transient behavior. A test signal with a time-varying Doppler characteristic is used to compare the performance of both the DPLL and the PDPLL.
Application of multirate digital filter banks to wideband all-digital phase-locked loops design
NASA Technical Reports Server (NTRS)
Sadr, R.; Shah, B.; Hinedi, S.
1992-01-01
A new class of architecture for all-digital phase-locked loops (DPLL's) is presented in this article. These architectures, referred to as parallel DPLL (PDPLL), employ multirate digital filter banks (DFB's) to track signals with a lower processing rate than the Nyquist rate, without reducing the input (Nyquist) bandwidth. The PDPLL basically trades complexity for hardware-processing speed by introducing parallel processing in the receiver. It is demonstrated here that the DPLL performance is identical to that of a PDPLL for both steady-state and transient behavior. A test signal with a time-varying Doppler characteristic is used to compare the performance of both the DPLL and the PDPLL.
GPU-computing in econophysics and statistical physics
NASA Astrophysics Data System (ADS)
Preis, T.
2011-03-01
A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction into the field of GPU computing and includes examples. In particular computationally expensive analyses employed in financial market context are coded on a graphics card architecture which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.
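As an illustration of why the Ising model maps well onto parallel hardware, the sketch below uses a checkerboard Metropolis update: sites of one colour have no neighbours of the same colour, so all of them could be updated concurrently. This is a plain CPU sketch in Python under that assumption, not the article's GPU code; the function names, the lattice size, and the choice of checkerboard decomposition are illustrative.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng, coupling=1.0):
    """One Metropolis sweep of a square Ising lattice with periodic boundaries.

    Sites are visited colour by colour. Within a colour the updates are
    independent, which is the part a GPU would run in parallel.
    The lattice size is assumed to be even so the colouring stays consistent.
    """
    size = len(spins)
    for color in (0, 1):
        for i in range(size):
            for j in range(size):
                if (i + j) % 2 != color:
                    continue
                neighbours = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j] +
                              spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
                delta_e = 2.0 * coupling * spins[i][j] * neighbours  # energy cost of a flip
                if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def magnetization(spins):
    """Average spin per site."""
    size = len(spins)
    return sum(sum(row) for row in spins) / float(size * size)


rng = random.Random(0)
lattice = [[1] * 8 for _ in range(8)]
for _ in range(10):
    checkerboard_sweep(lattice, beta=0.4, rng=rng)
print(magnetization(lattice))
```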
An efficient optical architecture for sparsely connected neural networks
NASA Technical Reports Server (NTRS)
Hine, Butler P., III; Downie, John D.; Reid, Max B.
1990-01-01
An architecture for a general-purpose optical neural network processor is presented in which the interconnections and weights are formed by directing coherent beams holographically, thereby making use of the space-bandwidth products of the recording medium for sparsely interconnected networks more efficiently than the commonly used vector-matrix multiplier, since all of the hologram area is in use. An investigation is made of the use of computer-generated holograms recorded on such updatable media as thermoplastic materials, in order to define the interconnections and weights of a neural network processor; attention is given to limits on interconnection densities, diffraction efficiencies, and weighting accuracies possible with such an updatable thin film holographic device.
Hybrid electro-optics and chipscale integration of electronics and photonics
NASA Astrophysics Data System (ADS)
Dalton, L. R.; Robinson, B. H.; Elder, D. L.; Tillack, A. F.; Johnson, L. E.
2017-08-01
Taken together, theory-guided nano-engineering of organic electro-optic materials and hybrid device architectures have permitted dramatic improvement of the performance of electro-optic devices. For example, the voltage-length product has been improved by nearly a factor of 10^4, bandwidths have been extended to nearly 200 GHz, device footprints reduced to less than 200 μm², and femtojoule energy efficiency achieved. This presentation discusses the utilization of new coarse-grained theoretical methods and advanced quantum mechanical methods to quantitatively simulate the physical properties of new classes of organic electro-optic materials and to evaluate their performance in nanoscopic device architectures, accounting for the effect on chromophore ordering at interfaces in nanoscopic waveguides.
Fiber-Optic Network Architectures for Onboard Avionics Applications Investigated
NASA Technical Reports Server (NTRS)
Nguyen, Hung D.; Ngo, Duc H.
2003-01-01
This project is part of a study within the Advanced Air Transportation Technologies program undertaken at the NASA Glenn Research Center. The main focus of the program is the improvement of air transportation, with particular emphasis on air transportation safety. Current and future advances in digital data communications between an aircraft and the outside world will require high-bandwidth onboard communication networks. Radiofrequency (RF) systems, with their interconnection network based on coaxial cables and waveguides, increase the complexity of communication systems onboard modern civil and military aircraft with respect to weight, power consumption, and safety. In addition, safety and reliability concerns from electromagnetic interference between the RF components embedded in these communication systems exist. A simple, reliable, and lightweight network that is free from the effects of electromagnetic interference and capable of supporting the broadband communications needs of future onboard digital avionics systems cannot be easily implemented using existing coaxial cable-based systems. Fiber-optical communication systems can meet all these challenges of modern avionics applications in an efficient, cost-effective manner. The objective of this project is to present a number of optical network architectures for onboard RF signal distribution. Because of the emergence of a number of digital avionics devices requiring high-bandwidth connectivity, fiber-optic RF networks onboard modern aircraft will play a vital role in ensuring a low-noise, highly reliable RF communication system. Two approaches are being used for network architectures for aircraft onboard fiber-optic distribution systems: a hybrid RF-optical network and an all-optical wavelength division multiplexing (WDM) network.
JPEG XS-based frame buffer compression inside HEVC for power-aware video compression
NASA Astrophysics Data System (ADS)
Willème, Alexandre; Descampe, Antonin; Rouvroy, Gaël; Pellegrin, Pascal; Macq, Benoit
2017-09-01
With the emergence of Ultra-High Definition video, reference frame buffers (FBs) inside HEVC-like encoders and decoders have to sustain huge bandwidth. The power consumed by these external memory accesses accounts for a significant share of the codec's total consumption. This paper describes a solution to significantly decrease the FB's bandwidth, making the HEVC encoder more suitable for use in power-aware applications. The proposed prototype consists of integrating an embedded lightweight, low-latency and visually lossless codec at the FB interface inside HEVC in order to store each reference frame as several compressed bitstreams. As opposed to previous works, our solution compresses large picture areas (ranging from a CTU to a frame stripe) independently in order to better exploit the spatial redundancy found in the reference frame. This work investigates two data reuse schemes, namely Level-C and Level-D. Our approach is made possible thanks to simplified motion estimation mechanisms, which further reduce the FB's bandwidth and induce very low quality degradation. In this work, we integrated JPEG XS, the upcoming standard for lightweight low-latency video compression, inside HEVC. In practice, the proposed implementation is based on HM 16.8 and on XSM 1.1.2 (JPEG XS Test Model). This paper describes the architecture of our HEVC with JPEG XS-based frame buffer compression and then compares its performance to that of the HM encoder. Compared to previous works, our prototype provides significant external memory bandwidth reduction. Depending on the reuse scheme, one can expect bandwidth and FB size reduction ranging from 50% to 83.3% without significant quality degradation.
The Light Node Communication Framework: A New Way to Communicate Inside Smart Homes.
Plantevin, Valère; Bouzouane, Abdenour; Gaboury, Sebastien
2017-10-20
The Internet of things has profoundly changed the way we imagine information science and architecture, and smart homes are an important part of this domain. Created a decade ago, the few existing prototypes use technologies of the day, forcing designers to create centralized and costly architectures that raise some issues concerning reliability, scalability, and ease of access which cannot be tolerated in the context of assistance. In this paper, we briefly introduce a new kind of architecture where the focus is placed on distribution. More specifically, we respond to the first issue we encountered by proposing a lightweight and portable messaging protocol. After running several tests, we observed a maximized bandwidth, whereby no packets were lost and good encryption was obtained. These results tend to prove that our innovation may be employed in a real context of distribution with small entities.
NASA Astrophysics Data System (ADS)
Li, Xingfeng; Gan, Chaoqin; Liu, Zongkang; Yan, Yuqi; Qiao, HuBao
2018-01-01
In this paper, a novel architecture of hybrid PON for smart grid is proposed by introducing a wavelength-routing module (WRM). By using conventional optical passive components, a WRM with M ports is designed. The symmetry and passivity of the WRM make it easy to integrate and inexpensive in practice. Via the WRM, two types of network based on different ONU interconnection schemes can realize online access. Depending on optical switches and interconnecting fibers, full-fiber-fault protection and dynamic bandwidth allocation are realized in these networks. With the help of amplitude modulation, DPSK modulation and RSOA technology, wavelength triple-reuse is achieved. By injecting signals into the left and right branches of the access ring simultaneously, the transmission delay is decreased. Finally, the performance analysis and simulation of the network verify the feasibility of the proposed architecture.
The Light Node Communication Framework: A New Way to Communicate Inside Smart Homes
Bouzouane, Abdenour; Gaboury, Sebastien
2017-01-01
The Internet of things has profoundly changed the way we imagine information science and architecture, and smart homes are an important part of this domain. Created a decade ago, the few existing prototypes use technologies of the day, forcing designers to create centralized and costly architectures that raise some issues concerning reliability, scalability, and ease of access which cannot be tolerated in the context of assistance. In this paper, we briefly introduce a new kind of architecture where the focus is placed on distribution. More specifically, we respond to the first issue we encountered by proposing a lightweight and portable messaging protocol. After running several tests, we observed a maximized bandwidth, whereby no packets were lost and good encryption was obtained. These results tend to prove that our innovation may be employed in a real context of distribution with small entities. PMID:29053581
Feasibility of Using Distributed Wireless Mesh Networks for Medical Emergency Response
Braunstein, Brian; Trimble, Troy; Mishra, Rajesh; Manoj, B. S.; Rao, Ramesh; Lenert, Leslie
2006-01-01
Achieving reliable, efficient data communications networks at a disaster site is a difficult task. Network paradigms, such as Wireless Mesh Network (WMN) architectures, form one exemplar for providing high-bandwidth, scalable data communication for medical emergency response activity. WMNs are created by self-organized wireless nodes that use multi-hop wireless relaying for data transfer. In this paper, we describe our experience using a mesh network architecture we developed for homeland security and medical emergency applications. We briefly discuss the architecture and present the traffic behavioral observations made by a client-server medical emergency application tested during a large-scale homeland security drill. We present our traffic measurements, describe lessons learned, and offer functional requirements (based on field testing) for practical 802.11 mesh medical emergency response networks. With certain caveats, the results suggest that 802.11 mesh networks are feasible and scalable systems for field communications in disaster settings. PMID:17238308
An Evaluation of Architectural Platforms for Parallel Navier-Stokes Computations
NASA Technical Reports Server (NTRS)
Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.
1996-01-01
We study the computational, communication, and scalability characteristics of a computational fluid dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architecture platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.
Parallelizing Navier-Stokes Computations on a Variety of Architectural Platforms
NASA Technical Reports Server (NTRS)
Jayasimha, D. N.; Hayder, M. E.; Pillay, S. K.
1997-01-01
We study the computational, communication, and scalability characteristics of a Computational Fluid Dynamics application, which solves the time accurate flow field of a jet using the compressible Navier-Stokes equations, on a variety of parallel architectural platforms. The platforms chosen for this study are a cluster of workstations (the LACE experimental testbed at NASA Lewis), a shared memory multiprocessor (the Cray YMP), and distributed memory multiprocessors with different topologies - the IBM SP and the Cray T3D. We investigate the impact of various networks connecting the cluster of workstations on the performance of the application and the overheads induced by popular message passing libraries used for parallelization. The work also highlights the importance of matching the memory bandwidth to the processor speed for good single processor performance. By studying the performance of an application on a variety of architectures, we are able to point out the strengths and weaknesses of each of the example computing platforms.
Content addressable memory project
NASA Technical Reports Server (NTRS)
Hall, Josh; Levy, Saul; Smith, D.; Wei, S.; Miyake, K.; Murdocca, M.
1991-01-01
The progress on the Rutgers CAM (Content Addressable Memory) Project is described. The overall design of the system is completed at the architectural level and described. The machine is composed of two kinds of cells: (1) the CAM cells, which include both memory and processor, and support local processing within each cell; and (2) the tree cells, which have a smaller instruction set, and provide global processing over the CAM cells. A parameterized design of the basic CAM cell is completed. Progress was made on the final specification of the CPS. The machine architecture was driven by the design of algorithms whose requirements are reflected in the resulting instruction set(s). A few of these algorithms are described.
Proceedings of the NASA Conference on Space Telerobotics, volume 3
NASA Technical Reports Server (NTRS)
Rodriguez, Guillermo (Editor); Seraji, Homayoun (Editor)
1989-01-01
The theme of the Conference was man-machine collaboration in space. The Conference provided a forum for researchers and engineers to exchange ideas on the research and development required for application of telerobotics technology to the space systems planned for the 1990s and beyond. The Conference: (1) provided a view of current NASA telerobotic research and development; (2) stimulated technical exchange on man-machine systems, manipulator control, machine sensing, machine intelligence, concurrent computation, and system architectures; and (3) identified important unsolved problems of current interest which can be dealt with by future research.
The EP-3E vs. the BAMS UAS: An Operating and Support Cost Comparison
2012-09-01
Accountability Office HALE High Altitude Long Endurance ISR Intelligence, Surveillance and Reconnaissance JCC Joint Architecture...others are very complex high altitude long endurance (HALE) aircraft. However, most share the common need for satellite bandwidth. The DoD plan is...collection sites, and risks as they apply to the BAMS UAS. These factors were not adequately considered in the original O&S analysis. Once the analysis
In-Storage Embedded Accelerator for Sparse Pattern Processing
2016-09-13
computation. As a result, a very small processor could be used and still make full use of storage device bandwidth. When the host software sends...Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee et al. "A view of cloud computing." Communications of the ACM 53, no. 4 (2010...Laboratory, * MIT Computer Science & Artificial Intelligence Laboratory Abstract: We present a novel system architecture for sparse pattern
Multifunction Multiband Airborne Radio Architecture Study.
1982-01-01
30 to 88, 108 to 156, and 255 to 400 MHz band allocations. (ii) On designated operating channels: two in the 225 to 400 MHz bandwidth, one in each of...altimeter, direction finding, and relay. 2.2 BASELINE SYSTEM APPROACH This subsection describes the TRW-proposed baseline design for the MFBARS system...problem and SINCGARS application. Major efforts were directed toward reducing the overall costs while retaining required performance. Significantly, cost
Rodríguez-Lera, Francisco J; Matellán-Olivera, Vicente; Conde-González, Miguel Á; Martín-Rico, Francisco
2018-05-01
Generation of autonomous behavior for robots is a general unsolved problem. Users perceive robots as repetitive tools that do not respond to dynamic situations. This research deals with the generation of natural behaviors in assistive service robots for dynamic domestic environments, particularly, a motivational-oriented cognitive architecture to generate more natural behaviors in autonomous robots. The proposed architecture, called HiMoP, is based on three elements: a Hierarchy of needs to define robot drives; a set of Motivational variables connected to robot needs; and a Pool of finite-state machines to run robot behaviors. The first element is inspired by Alderfer's hierarchy of needs, which specifies the variables defined in the motivational component. The pool of finite-state machines implements the available robot actions, and those actions are dynamically selected taking into account the motivational variables and the external stimuli. Thus, the robot is able to exhibit different behaviors even under similar conditions. A customized version of the "Speech Recognition and Audio Detection Test," proposed by the RoboCup Federation, has been used to illustrate how the architecture works and how it dynamically adapts and activates robot behaviors taking into account internal variables and external stimuli.
Plug-and-play design approach to smart harness for modular small satellites
NASA Astrophysics Data System (ADS)
Mughal, M. Rizwan; Ali, Anwar; Reyneri, Leonardo M.
2014-02-01
A typical satellite involves many different components that vary in bandwidth demand. Sensors that require a very low data rate may reside on a simple two- or three-wire interface such as I2C, SPI, etc. Complex sensors that require high data rate and bandwidth may reside on an optical interface. The AraMiS architecture is an enhanced capability architecture with different satellite configurations. While keeping the low-cost and COTS approach of CubeSats, it extends the modularity concept as it also targets different satellite shapes and sizes. But modularity moves beyond the mechanical structure: the tiles also have thermo-mechanical, harness and signal-processing functionalities. Further modularizing the system, every tile can also host a variable number of small sensors, actuators or payloads, connected using a plug-and-play approach. Every subsystem is housed in a small daughter board and is supplied, by the main tile, with power and data distribution functions, power and data harness, and mechanical support, and is attached and interconnected with space-grade spring-loaded connectors. The tile software is also modular and allows a quick adaptation to specific subsystems. The basic software for the CPU is properly hardened to guarantee a high level of radiation tolerance at very low cost.
Project Integration Architecture: Implementation of the CORBA-Served Application Infrastructure
NASA Technical Reports Server (NTRS)
Jones, William Henry
2005-01-01
The Project Integration Architecture (PIA) has been demonstrated in a single-machine C++ implementation prototype. The architecture is in the process of being migrated to a Common Object Request Broker Architecture (CORBA) implementation. The migration of the Foundation Layer interfaces is fundamentally complete. The implementation of the Application Layer infrastructure for that migration is reported. The Application Layer provides for distributed user identification and authentication, per-user/per-instance access controls, server administration, the formation of mutually-trusting application servers, a server locality protocol, and an ability to search for interface implementations through such trusted server networks.
MIT CSAIL and Lincoln Laboratory Task Force Report
2016-08-01
projects have been very diverse, spanning several areas of CSAIL concentration, including robotics, big data analytics, wireless communications...spanning several areas of CSAIL concentration, including robotics, big data analytics, wireless communications, computing architectures and...to machine learning systems and algorithms, such as recommender systems, and “Big Data” analytics. Advanced computing architectures broadly refer to
Exploring the Function Space of Deep-Learning Machines
NASA Astrophysics Data System (ADS)
Li, Bo; Saad, David
2018-06-01
The function space of deep-learning machines is investigated by studying growth in the entropy of functions of a given error with respect to a reference function, realized by a deep-learning machine. Using physics-inspired methods we study both sparsely and densely connected architectures to discover a layerwise convergence of candidate functions, marked by a corresponding reduction in entropy when approaching the reference function, gain insight into the importance of having a large number of layers, and observe phase transitions as the error increases.
Unorganized machines for seasonal streamflow series forecasting.
Siqueira, Hugo; Boccato, Levy; Attux, Romis; Lyra, Christiano
2014-05-01
Modern unorganized machines--extreme learning machines and echo state networks--provide an elegant balance between processing capability and mathematical simplicity, circumventing the difficulties associated with the conventional training approaches of feedforward/recurrent neural networks (FNNs/RNNs). This work performs a detailed investigation of the applicability of unorganized architectures to the problem of seasonal streamflow series forecasting, considering scenarios associated with four Brazilian hydroelectric plants and four distinct prediction horizons. Experimental results indicate the pertinence of these models to the focused task.
Proceedings of the 1986 IEEE international conference on systems, man and cybernetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1986-01-01
This book presents the papers given at a conference on man-machine systems. Topics considered at the conference included neural model-based cognitive theory and engineering, user interfaces, adaptive and learning systems, human interaction with robotics, decision making, the testing and evaluation of expert systems, software development, international conflict resolution, intelligent interfaces, automation in man-machine system design aiding, knowledge acquisition in expert systems, advanced architectures for artificial intelligence, pattern recognition, knowledge bases, and machine vision.
An Energy-Efficient Multi-Tier Architecture for Fall Detection on Smartphones
Guvensan, M. Amac; Kansiz, A. Oguz; Camgoz, N. Cihan; Turkmen, H. Irem; Yavuz, A. Gokhan; Karsligil, M. Elif
2017-01-01
Automatic detection of fall events is vital to providing fast medical assistance to the casualty, particularly when the injury causes loss of consciousness. Optimization of the energy consumption of mobile applications, especially those which run 24/7 in the background, is essential for longer use of smartphones. In order to improve energy-efficiency without compromising on the fall detection performance, we propose a novel 3-tier architecture that combines simple thresholding methods with machine learning algorithms. The proposed method is implemented on a mobile application, called uSurvive, for Android smartphones. It runs as a background service and monitors the activities of a person in daily life and automatically sends a notification to the appropriate authorities and/or user defined contacts when it detects a fall. The performance of the proposed method was evaluated in terms of fall detection performance and energy consumption. Real life performance tests conducted on two different models of smartphone demonstrate that our 3-tier architecture with feature reduction could save up to 62% of energy compared to machine learning only solutions. In addition to this energy saving, the hybrid method has an accuracy of 93%, which is superior to thresholding methods and better than machine learning only solutions. PMID:28644378
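The tiered idea in this abstract, cheap threshold checks gating a more expensive classifier, can be sketched as below. This is only an illustration under stated assumptions: the threshold values, the stillness check, and the `classifier` callback are hypothetical and are not taken from uSurvive or the paper.

```python
import math

# Hypothetical thresholds, in units of g; not values from the paper.
IMPACT_THRESHOLD_G = 2.5
STILLNESS_THRESHOLD_G = 0.3


def magnitude(sample):
    """Euclidean norm of one (x, y, z) accelerometer sample."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def detect_fall(window, classifier):
    """Run a window of samples through three increasingly costly tiers."""
    mags = [magnitude(s) for s in window]

    # Tier 1 (cheapest): reject windows without an impact-like peak.
    peak_index = max(range(len(mags)), key=mags.__getitem__)
    if mags[peak_index] < IMPACT_THRESHOLD_G:
        return False

    # Tier 2: after the peak the signal should settle near 1 g (lying still).
    after = mags[peak_index + 1:]
    if not after or max(abs(m - 1.0) for m in after) > STILLNESS_THRESHOLD_G:
        return False

    # Tier 3 (most expensive): only now call the learned model.
    return classifier(mags)


# Usage with a stand-in classifier that counts how often it is invoked.
calls = []


def counting_classifier(mags):
    calls.append(len(mags))
    return True


resting = [(0.0, 0.0, 1.0)] * 11
fall_like = [(0.0, 0.0, 1.0)] * 5 + [(0.0, 0.0, 3.0)] + [(0.0, 0.0, 1.0)] * 5

resting_result = detect_fall(resting, counting_classifier)
fall_result = detect_fall(fall_like, counting_classifier)
print(resting_result, fall_result, len(calls))
```

Because the classifier runs only on windows that pass both threshold tiers, most ordinary activity never reaches the costly stage, which is the kind of saving the abstract attributes to its hybrid design.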
NASA Technical Reports Server (NTRS)
Denning, Peter J.; Tichy, Walter F.
1990-01-01
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines, and current research focuses on which architectures perform best. The architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed.
ATCA for Machines-- Advanced Telecommunications Computing Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, R.S.; /SLAC
2008-04-22
The Advanced Telecommunications Computing Architecture is a new industry open standard for electronics instrument modules and shelves being evaluated for the International Linear Collider (ILC). It is the first industrial standard designed for High Availability (HA). ILC availability simulations have shown clearly that the capabilities of ATCA are needed in order to achieve acceptable integrated luminosity. The ATCA architecture looks attractive for beam instruments and detector applications as well. This paper provides an overview of ongoing R&D including application of HA principles to power electronics systems.
The TOTEM DAQ based on the Scalable Readout System (SRS)
NASA Astrophysics Data System (ADS)
Quinto, Michele; Cafagna, Francesco S.; Fiergolski, Adrian; Radicioni, Emilio
2018-02-01
The TOTEM (TOTal cross section, Elastic scattering and diffraction dissociation Measurement at the LHC) experiment at the LHC has been designed to measure the total proton-proton cross-section and study the elastic and diffractive scattering at the LHC energies. In order to cope with the increased machine luminosity and the higher statistics required by the extension of the TOTEM physics program, approved for the LHC's Run Two phase, the previous VME based data acquisition system has been replaced with a new one based on the Scalable Readout System. The system features an aggregated data throughput of 2 GB/s towards the online storage system. This makes it possible to sustain a maximum trigger rate of ~24 kHz, to be compared with the 1 kHz rate of the previous system. The trigger rate is further improved by implementing zero-suppression and second-level hardware algorithms in the Scalable Readout System. The new system fulfils the requirements for an increased efficiency, providing higher bandwidth, and increasing the purity of the data recorded. Moreover, full compatibility has been guaranteed with the legacy front-end hardware, as well as with the DAQ interface of the CMS experiment and with the LHC's Timing, Trigger and Control distribution system. In this contribution we describe in detail the architecture of the full system and its performance measured during the commissioning phase at the LHC Interaction Point.
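As a rough consistency check on the figures quoted in this abstract, the sketch below divides the aggregate throughput by the maximum trigger rate to get the average event size the readout could carry. It assumes decimal units (1 GB = 10^9 bytes) and that throughput is the only limit; the variable names are illustrative and do not come from the TOTEM software.

```python
# Back-of-the-envelope check of the quoted DAQ figures.
# Assumptions (not from the source): decimal units, and link throughput
# is the only bottleneck.

THROUGHPUT_BYTES_PER_S = 2e9   # 2 GB/s aggregate throughput
NEW_TRIGGER_RATE_HZ = 24e3     # ~24 kHz maximum trigger rate
OLD_TRIGGER_RATE_HZ = 1e3      # 1 kHz rate of the previous VME system


def sustainable_event_size(throughput, trigger_rate):
    """Average bytes per event that a given throughput can carry."""
    return throughput / trigger_rate


event_size = sustainable_event_size(THROUGHPUT_BYTES_PER_S, NEW_TRIGGER_RATE_HZ)
rate_gain = NEW_TRIGGER_RATE_HZ / OLD_TRIGGER_RATE_HZ

print(f"average event size budget: {event_size / 1e3:.1f} kB")
print(f"trigger-rate gain over the old system: {rate_gain:.0f}x")
```

Under these assumptions the budget is about 83 kB per event; zero-suppression, which the abstract mentions, would shrink the actual event size and leave headroom for higher rates.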
Proposed hardware architectures of particle filter for object tracking
NASA Astrophysics Data System (ADS)
Abd El-Halym, Howida A.; Mahmoud, Imbaby Ismail; Habib, SED
2012-12-01
In this article, efficient hardware architectures for particle filter (PF) are presented. We propose three different architectures for Sequential Importance Resampling Filter (SIRF) implementation. The first architecture is a two-step sequential PF machine, where particle sampling, weight, and output calculations are carried out in parallel during the first step followed by sequential resampling in the second step. For the weight computation step, a piecewise linear function is used instead of the classical exponential function. This decreases the complexity of the architecture without degrading the results. The second architecture speeds up the resampling step via a parallel, rather than a serial, architecture. This second architecture targets a balance between hardware resources and the speed of operation. The third architecture implements the SIRF as a distributed PF composed of several processing elements and a central unit. All the proposed architectures are captured in VHDL, synthesized using the Xilinx environment, and verified using the ModelSim simulator. Synthesis results confirmed the resource reduction and speed-up advantages of our architectures.
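The two-step structure described here (sample, weight, and output first, then resample) and the replacement of the exponential by a piecewise linear function can be illustrated in software. The sketch below is a minimal one-dimensional example, not the authors' VHDL design: the random-walk model, the breakpoints of the approximation, and the choice of systematic resampling are all assumptions made for illustration.

```python
import math
import random

# Assumed breakpoints for a piecewise-linear stand-in for exp(-x), x >= 0.
_KNOTS = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
_VALUES = [math.exp(-k) for k in _KNOTS]


def pwl_exp_neg(x):
    """Piecewise-linear approximation of exp(-x); clamps beyond the last knot."""
    if x >= _KNOTS[-1]:
        return _VALUES[-1]  # small positive floor keeps the weight sum nonzero
    for i in range(len(_KNOTS) - 1):
        x0, x1 = _KNOTS[i], _KNOTS[i + 1]
        if x < x1:
            t = (x - x0) / (x1 - x0)
            return _VALUES[i] + t * (_VALUES[i + 1] - _VALUES[i])
    return _VALUES[-1]


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples using one random offset and a fixed stride."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step
    out, i, cumulative = [], 0, weights[0]
    for _ in range(n):
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        out.append(particles[i])
        u += step
    return out


def sir_step(particles, measurement, rng, process_std=1.0, meas_std=1.0):
    """One SIR iteration for a 1-D random-walk state observed in noise."""
    # Step 1: sample, weight, and compute the output estimate.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    raw = [pwl_exp_neg(0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted]
    total = sum(raw)
    weights = [w / total for w in raw]
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Step 2: resample according to the normalized weights.
    return systematic_resample(predicted, weights, rng), estimate


rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
true_state = 3.0
estimate = None
for _ in range(20):
    measurement = true_state + rng.gauss(0.0, 1.0)
    particles, estimate = sir_step(particles, measurement, rng)
print(f"estimate after 20 steps: {estimate:.2f}")
```

The table lookup plus one multiply-add in `pwl_exp_neg` is the software analogue of why a piecewise linear weight function is cheaper than evaluating an exponential.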
NASA Astrophysics Data System (ADS)
Zheng, Jun; Ansari, Nirwan
2005-01-01
Submission Deadline: 1 June 2005
Cloud services for the Fermilab scientific stakeholders
Timm, S.; Garzoglio, G.; Mhashilkar, P.; ...
2015-12-23
As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest Squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic ray simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.
Cloud services for the Fermilab scientific stakeholders
DOE Office of Scientific and Technical Information (OSTI.GOV)
Timm, S.; Garzoglio, G.; Mhashilkar, P.
As part of the Fermilab/KISTI cooperative research project, Fermilab has successfully run an experimental simulation workflow at scale on a federation of Amazon Web Services (AWS), FermiCloud, and local FermiGrid resources. We used the CernVM-FS (CVMFS) file system to deliver the application software. We established Squid caching servers in AWS as well, using the Shoal system to let each individual virtual machine find the closest Squid server. We also developed an automatic virtual machine conversion system so that we could transition virtual machines made on FermiCloud to Amazon Web Services. We used this system to successfully run a cosmic ray simulation of the NOvA detector at Fermilab, making use of both AWS spot pricing and network bandwidth discounts to minimize the cost. On FermiCloud we also were able to run the workflow at the scale of 1000 virtual machines, using a private network routable inside of Fermilab. As a result, we present in detail the technological improvements that were used to make this work a reality.
Performance analysis of an all-digital BPSK direct sequence spread-spectrum IF receiver architecture
NASA Astrophysics Data System (ADS)
Chung, Bong-Young; Chien, Charles; Samueli, Henry; Jain, Rajeev
1993-09-01
A VLSI architecture for an all-digital binary phase shift keyed (BPSK) direct-sequence (DS) spread spectrum (SS) IF receiver is presented, and an in-depth performance analysis is given. The all-digital architecture incorporates a Costas loop for carrier recovery and a delay-locked loop for clock recovery. For the PN acquisition block, a robust energy detection scheme is proposed to reduce false PN locks over a broad range of signal-to-noise ratios. The proposed architecture is intended for use in the 902-928 MHz unlicensed spread spectrum radio band. A 100 kb/s information rate and a 12.7 Mchips/second PN code rate are assumed. The IF center frequency is 12.7 MHz and the IF sampling rate is 50.8 Msamples/second, which is the Nyquist rate for the 25.4 MHz bandwidth signal. Finite wordlength effects have been simulated to optimize the architecture, thereby minimizing the chip area, and results of the finite wordlength simulations demonstrate that the chip architecture achieves a bit error rate performance within 1 dB of theory in an additive white Gaussian noise channel. The probability of PN acquisition within 5 ms is approximately 56% at -17 dB IF input SNR and 82% at -11 dB IF input SNR.
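The rates quoted in this abstract fit together arithmetically, and the short sketch below works through them. It assumes the 25.4 MHz figure is the null-to-null main lobe of the BPSK chip stream (twice the chip rate) centered on the IF; the processing gain is a derived quantity that the abstract does not state.

```python
import math

# Figures quoted in the abstract.
BIT_RATE_HZ = 100e3        # information rate
CHIP_RATE_HZ = 12.7e6      # PN code rate
IF_CENTER_HZ = 12.7e6      # IF center frequency
SAMPLE_RATE_HZ = 50.8e6    # IF sampling rate

# Spreading factor and the processing gain it implies (derived, not quoted).
chips_per_bit = CHIP_RATE_HZ / BIT_RATE_HZ
processing_gain_db = 10 * math.log10(chips_per_bit)

# Assumed: signal bandwidth is the BPSK main lobe, twice the chip rate.
main_lobe_bw_hz = 2 * CHIP_RATE_HZ
highest_freq_hz = IF_CENTER_HZ + main_lobe_bw_hz / 2
nyquist_rate_hz = 2 * highest_freq_hz

samples_per_chip = SAMPLE_RATE_HZ / CHIP_RATE_HZ

print(f"chips per bit:       {chips_per_bit:.0f}")
print(f"processing gain:     {processing_gain_db:.1f} dB")
print(f"main-lobe bandwidth: {main_lobe_bw_hz / 1e6:.1f} MHz")
print(f"Nyquist rate:        {nyquist_rate_hz / 1e6:.1f} MHz")
print(f"samples per chip:    {samples_per_chip:.0f}")
```

With the IF centered at 12.7 MHz, the main lobe spans 0 to 25.4 MHz, so sampling at 50.8 Msamples/second is exactly twice the highest frequency, which agrees with the abstract's statement that this is the Nyquist rate.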
Three-Axis Attitude Estimation With a High-Bandwidth Angular Rate Sensor
NASA Technical Reports Server (NTRS)
Bayard, David S.; Green, Joseph J.
2013-01-01
A continuing challenge for modern instrument pointing control systems is to meet the increasingly stringent pointing performance requirements imposed by emerging advanced scientific, defense, and civilian payloads. Instruments such as adaptive optics telescopes, space interferometers, and optical communications make unprecedented demands on precision pointing capabilities. A cost-effective method was developed for increasing the pointing performance for this class of NASA applications. The solution was to develop an attitude estimator that fuses star tracker and gyro measurements with a high-bandwidth angular rotation sensor (ARS). An ARS is a rate sensor whose bandwidth extends well beyond that of the gyro, typically up to 1,000 Hz or higher. The most promising ARS sensor technology is based on a magnetohydrodynamic concept, and has recently become available commercially. The key idea is that the sensor fusion of the star tracker, gyro, and ARS provides a high-bandwidth attitude estimate suitable for supporting pointing control with a fast-steering mirror or other type of tip/tilt correction for increased performance. The ARS is relatively inexpensive and can be bolted directly next to the gyro and star tracker on the spacecraft bus. The high-bandwidth attitude estimator fuses an ARS sensor with a standard three-axis suite comprised of a gyro and star tracker. The estimation architecture is based on a dual-complementary filter (DCF) structure. The DCF takes a frequency-weighted combination of the sensors such that each sensor is most heavily weighted in a frequency region where it has the lowest noise. An important property of the DCF is that it avoids the need to model disturbance torques in the filter mechanization. This is important because the disturbance torques are generally not known in applications.
This property represents an advantage over the prior art because it overcomes a weakness of the Kalman filter that arises when fusing more than one rate measurement. An additional advantage over prior art is that, computationally, the DCF requires significantly fewer real-time calculations than a Kalman filter formulation. There are essentially two reasons for this: the DCF state is not augmented with angular rate, and measurement updates occur at the slower gyro rate instead of the faster ARS sampling rate. Finally, the DCF has a simple and compelling architecture. The DCF is exactly equivalent to flying two identical attitude observers, one at low rate and one at high rate. These attitude observers are exactly of the form currently flown on typical three-axis spacecraft.
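As a rough illustration of the attitude-observer form mentioned above, the following single-axis sketch blends an integrated rate measurement with a direct attitude measurement. It is a generic first-order complementary observer, not the DCF from the report: the abstract does not describe how the two observers are interconnected, so only one is shown, and the function names, gain, and noise-free measurements are all assumptions.

```python
def observer_step(theta_hat, rate_meas, theta_meas, gain, dt):
    """One Euler step of a first-order complementary attitude observer.

    The rate measurement is integrated (trusted at high frequency), while the
    attitude measurement pulls the estimate back (trusted at low frequency).
    """
    innovation = theta_meas - theta_hat
    return theta_hat + dt * (rate_meas + gain * innovation)


def run_observer(theta0_hat, true_rate, gain, dt, steps):
    """Track a constant-rate rotation using noise-free, hypothetical sensors."""
    theta_hat = theta0_hat
    for k in range(steps):
        theta_true = true_rate * k * dt
        theta_hat = observer_step(theta_hat, true_rate, theta_true, gain, dt)
    return theta_hat, true_rate * steps * dt


# Start with a deliberate 1 rad initial error and let the observer converge.
estimate, truth = run_observer(theta0_hat=1.0, true_rate=0.5,
                               gain=2.0, dt=0.01, steps=1000)
error = abs(estimate - truth)
print(f"estimate={estimate:.6f} rad, truth={truth:.6f} rad, error={error:.2e}")
```

In this idealized setting the estimation error shrinks by a factor of (1 - gain * dt) per step, and the observer carries only the attitude state, with no angular-rate state and no disturbance torque model, which is the property the abstract emphasizes.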
Control Demonstration of Multiple Doubly-Fed Induction Motors for Hybrid Electric Propulsion
NASA Technical Reports Server (NTRS)
Sadey, David J.; Bodson, Marc; Csank, Jeffrey T.; Hunker, Keith R.; Theman, Casey J.; Taylor, Linda M.
2017-01-01
The Convergent Aeronautics Solutions (CAS) High Voltage-Hybrid Electric Propulsion (HVHEP) task was formulated to support the move into future hybrid-electric aircraft. The goal of this project is to develop a new AC power architecture to support the needs of higher efficiency and lower emissions. This proposed architecture will adopt the use of the doubly-fed induction machine (DFIM) for propulsor drive motor application. DFIMs are attractive for several reasons, including but not limited to the ability to self-start, the ability to operate sub- and super-synchronously, and the need for only fractionally rated power converters on a per-unit basis, depending on the required range of operation. This paper focuses on the presentation and analysis of a novel strategy that allows for independent operation of each of the aforementioned doubly-fed induction motors. This strategy includes synchronization, soft-start, and closed loop speed control of each motor as a means of controlling output thrust, be it concurrently or differentially. This strategy has recently been demonstrated on a low power test bed using fractional horsepower machines. Simulation and hardware test results are presented in the paper.
NASA Astrophysics Data System (ADS)
Radziszewski, Kacper
2017-10-01
This paper presents the results of machine learning research investigating the scope of application of artificial neural network algorithms as a tool in architectural design. The computational experiment used the backward propagation of errors method to train an artificial neural network on the geometry of the details of a Roman Corinthian order capital. During the experiment, a combination of five local geometry parameters gave the best results as the input training data set: Theta, Phi, and Rho in a spherical coordinate system based on the capital volume centroid, followed by the Z value of the Cartesian coordinate system and the distance from vertical planes created based on the capital symmetry. Additionally, the optimal count and structure of the artificial neural network's hidden layers were found during the experiment, giving an error below 0.2% for the input parameters mentioned above. Once successfully trained, the artificial network was able to mimic the composition of the details on any other geometry type given. Despite calculating the transformed geometry locally and separately for each of the thousands of surface points, the system could create visually attractive, diverse, and complex patterns. The designed tool, based on the supervised learning method of machine learning, makes it possible to generate new architectural forms free of the bounds of the designer's imagination. Implementing the infinitely broad computational methods of machine learning, or artificial intelligence in general, could not only accelerate and simplify the design process but also offer an opportunity to explore never-before-seen, unpredictable forms or solutions for everyday architectural practice.
Engineering artificial machines from designable DNA materials for biomedical applications.
Qi, Hao; Huang, Guoyou; Han, Yulong; Zhang, Xiaohui; Li, Yuhui; Pingguan-Murphy, Belinda; Lu, Tian Jian; Xu, Feng; Wang, Lin
2015-06-01
Deoxyribonucleic acid (DNA) has emerged as a building block for the fabrication of nanostructures with completely artificial architecture and geometry. The remarkable ability of DNA to build two- and three-dimensional structures raises the possibility of developing smart nanomachines with versatile controllability for various applications. Here, we review recent progress in engineering DNA machines for specific bioengineering and biomedical applications.
Engineering Artificial Machines from Designable DNA Materials for Biomedical Applications
Huang, Guoyou; Han, Yulong; Zhang, Xiaohui; Li, Yuhui; Pingguan-Murphy, Belinda; Lu, Tian Jian; Xu, Feng
2015-01-01
Deoxyribonucleic acid (DNA) has emerged as a building block for the fabrication of nanostructures with completely artificial architecture and geometry. The remarkable ability of DNA to build two- and three-dimensional structures raises the possibility of developing smart nanomachines with versatile controllability for various applications. Here, we review recent progress in engineering DNA machines for specific bioengineering and biomedical applications. PMID:25547514
Highly parallel sparse Cholesky factorization
NASA Technical Reports Server (NTRS)
Gilbert, John R.; Schreiber, Robert
1990-01-01
Several fine grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and it is used to analyze the algorithms.
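For readers unfamiliar with the dense kernel that the sparse algorithm relies on, here is a minimal serial sketch of column-oriented dense Cholesky factorization. It is a textbook formulation in plain Python, not the data-parallel Connection Machine code described in the abstract, and the function name and test matrix are chosen only for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L * L^T equals a.

    `a` must be a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        diag = a[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        lower[j][j] = math.sqrt(diag)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            off = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = off / lower[j][j]
    return lower


matrix = [[4.0, 12.0, -16.0],
          [12.0, 37.0, -43.0],
          [-16.0, -43.0, 98.0]]
factor = cholesky(matrix)
for row in factor:
    print(row)
```

The updates within one column are independent of each other, which is the kind of regularity a data-parallel machine can exploit; the sparse algorithm in the abstract applies many such dense factorizations simultaneously.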
Feature recognition and detection for ancient architecture based on machine vision
NASA Astrophysics Data System (ADS)
Zou, Zheng; Wang, Niannian; Zhao, Peng; Zhao, Xuefeng
2018-03-01
Ancient architecture has very high historical and artistic value. Ancient buildings have a wide variety of textures and decorative paintings, which carry a great deal of historical meaning. Research on and statistics of these compositional and decorative features therefore play an important role in subsequent research. Until recently, however, the statistics of these components were compiled mainly by hand, which is inefficient and consumes a great deal of labor and time. At present, with the strong support of big data and GPU-accelerated training, machine vision with deep learning at its core has developed rapidly and is widely used in many fields. This paper proposes an approach to recognizing and detecting the textures, decorations, and other features of ancient buildings based on machine vision. First, a large number of surface texture images of ancient building components are classified manually to form a set of samples. Then, a convolutional neural network is trained on the samples to obtain a classification detector. Finally, its precision is verified.
Wavelet-enhanced convolutional neural network: a new idea in a deep learning paradigm.
Savareh, Behrouz Alizadeh; Emami, Hassan; Hajiabadi, Mohamadreza; Azimi, Seyed Majid; Ghafoori, Mahyar
2018-05-29
Manual brain tumor segmentation is a challenging task that requires the use of machine learning techniques. One of the machine learning techniques that has been given much attention is the convolutional neural network (CNN). The performance of the CNN can be enhanced by combining it with other data analysis tools such as the wavelet transform. In this study, one of the best-known implementations of the CNN, the fully convolutional network (FCN), was used in brain tumor segmentation, and its architecture was enhanced by the wavelet transform. In this combination, the wavelet transform was used as a complementary and enhancing tool for the CNN in brain tumor segmentation. Comparing the performance of the basic FCN architecture against the wavelet-enhanced form revealed a remarkable superiority of the enhanced architecture in brain tumor segmentation tasks. Using enhancing tools such as the wavelet transform and other mathematical functions can improve the performance of the CNN in any image processing task, such as segmentation and classification.
Lazar, Aurel A; Pnevmatikakis, Eftychios A
2011-03-01
We investigate architectures for time encoding and time decoding of visual stimuli such as natural and synthetic video streams (movies, animation). The architecture for time encoding is akin to models of the early visual system. It consists of a bank of filters in cascade with single-input multi-output neural circuits. Neuron firing is based on either a threshold-and-fire or an integrate-and-fire spiking mechanism with feedback. We show that analog information is represented by the neural circuits as projections on a set of band-limited functions determined by the spike sequence. Under Nyquist-type and frame conditions, the encoded signal can be recovered from these projections with arbitrary precision. For the video time encoding machine architecture, we demonstrate that band-limited video streams of finite energy can be faithfully recovered from the spike trains and provide a stable algorithm for perfect recovery. The key condition for recovery calls for the number of neurons in the population to be above a threshold value.
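To make the integrate-and-fire encoding step concrete, the sketch below implements a single ideal integrate-and-fire neuron in discrete time. It is a simplified illustration, not the authors' video time encoding machine: the filter bank and feedback are omitted, and the parameter names (`bias`, `kappa`, `delta`) and the constant test input are assumptions made here.

```python
def iaf_encode(samples, dt, bias, kappa, delta):
    """Encode a sampled signal into spike times with an ideal IAF neuron.

    The neuron integrates (u + bias) / kappa; each time the integral reaches
    the threshold delta, a spike time is recorded and delta is subtracted.
    """
    spike_times = []
    integral = 0.0
    for k, u in enumerate(samples):
        integral += (u + bias) * dt / kappa
        if integral >= delta:
            spike_times.append((k + 1) * dt)
            integral -= delta
    return spike_times


# Constant input: the interspike interval should be kappa * delta / (u + bias).
dt = 1e-4
samples = [0.5] * 10000          # one second of a constant stimulus
spikes = iaf_encode(samples, dt, bias=1.5, kappa=1.0, delta=0.02)
intervals = [b - a for a, b in zip(spikes, spikes[1:])]
expected_interval = 1.0 * 0.02 / (0.5 + 1.5)

print(f"{len(spikes)} spikes, expected interval {expected_interval:.4f} s")
```

Each interspike interval fixes the integral of the stimulus over that interval, which is one way to read the abstract's statement that the spike sequence determines projections of the signal; recovery then amounts to inverting those integral constraints under the stated band-limit and population-size conditions.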
Lazar, Aurel A.; Pnevmatikakis, Eftychios A.
2013-01-01
We investigate architectures for time encoding and time decoding of visual stimuli such as natural and synthetic video streams (movies, animation). The architecture for time encoding is akin to models of the early visual system. It consists of a bank of filters in cascade with single-input multi-output neural circuits. Neuron firing is based on either a threshold-and-fire or an integrate-and-fire spiking mechanism with feedback. We show that analog information is represented by the neural circuits as projections on a set of band-limited functions determined by the spike sequence. Under Nyquist-type and frame conditions, the encoded signal can be recovered from these projections with arbitrary precision. For the video time encoding machine architecture, we demonstrate that band-limited video streams of finite energy can be faithfully recovered from the spike trains and provide a stable algorithm for perfect recovery. The key condition for recovery calls for the number of neurons in the population to be above a threshold value. PMID:21296708
Tensor Basis Neural Network v. 1.0 (beta)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ling, Julia; Templeton, Jeremy
This software package can be used to build, train, and test a neural network machine learning model. The neural network architecture is specifically designed to embed tensor invariance properties by enforcing that the model predictions sit on an invariant tensor basis. This neural network architecture can be used in developing constitutive models for applications such as turbulence modeling, materials science, and electromagnetism.
Deep learning based classification of breast tumors with shear-wave elastography.
Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong
2016-12-01
This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.
Atkinson, Jonathan A; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E; Griffiths, Marcus; Wells, Darren M
2017-10-01
Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. © The Authors 2017. Published by Oxford University Press.
Atkinson, Jonathan A.; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E.; Griffiths, Marcus
2017-01-01
Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. PMID:29020748
Assessment of Mechanical Performance of Bone Architecture Using Rapid Prototyping Models
NASA Astrophysics Data System (ADS)
Saparin, Peter; Woesz, Alexander; Thomsen, Jasper S.; Fratzl, Peter
2008-06-01
The aim of this ongoing research project is to assess the influence of bone microarchitecture on the mechanical performance of trabecular bone. A testing chain consisting of three steps was established: 1) micro computed tomography (μCT) imaging of human trabecular bone; 2) building of models of the bone from a light-sensitive polymer using Rapid Prototyping (RP); 3) mechanical testing of the models in a material testing machine. A direct resampling procedure was developed to convert μCT data into the format of the RP machine. Standardized parameters for production and testing of the plastic models were established by use of regular cellular structures. Next, normal, osteoporotic, and extreme osteoporotic vertebral trabecular bone architectures were reproduced by RP and compression tested. We found that the normal architecture of vertebral trabecular bone exhibits behaviour characteristic of a cellular structure. In normal bone the fracture occurs at much higher strain values than in osteoporotic bone. After the fracture a normal trabecular architecture is able to carry much higher loads than an osteoporotic architecture. However, no statistically significant differences were found in maximal stress during uniaxial compression of the central part of normal, osteoporotic, and extreme osteoporotic vertebral trabecular bone. This supports the hypothesis that osteoporotic trabecular bone can compensate for a loss of trabeculae by thickening the remaining trabeculae in the loading direction (compensatory hypertrophy). The developed approach could be used for mechanical evaluation of structural data acquired non-invasively and assessment of changes in performance of bone architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williams, Samuel; Patterson, David; Oliker, Leonid
This article consists of a collection of slides from the authors' conference presentation. The Roofline model is a visually intuitive figure for kernel analysis and optimization. We believe undergraduates will find it useful in assessing performance and scalability limitations. It is easily extended to other architectural paradigms and to other metrics: performance (sort, graphics, crypto, ...) and bandwidth (L2, PCIe, ...). Furthermore, performance counters could be used to generate a runtime-specific roofline that would greatly aid optimization.
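The core of the Roofline model is a single bound: attainable performance is the lesser of peak compute throughput and memory bandwidth multiplied by arithmetic intensity. The sketch below evaluates that bound; the machine figures are hypothetical and are not taken from the slides.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte):
    """Roofline bound: min(compute roof, bandwidth roof * arithmetic intensity)."""
    return min(peak_gflops, peak_bandwidth_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (flops/byte) at which a kernel becomes compute bound."""
    return peak_gflops / peak_bandwidth_gbs


# Hypothetical machine: 100 GFLOP/s peak compute, 20 GB/s memory bandwidth.
PEAK_GFLOPS = 100.0
PEAK_BANDWIDTH_GBS = 20.0

ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
low_intensity = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 0.5)
high_intensity = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, 8.0)

print(f"ridge point           : {ridge} flops/byte")
print(f"0.5 flops/byte kernel : {low_intensity} GFLOP/s (memory bound)")
print(f"8.0 flops/byte kernel : {high_intensity} GFLOP/s (compute bound)")
```

Swapping in a different bandwidth figure (for example an L2 or PCIe bandwidth) gives the additional ceilings that the slides mention as extensions.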
National Test Bed Security and Communications Architecture Working Group Report
1992-04-01
computer systems via a physical medium. Most of those physical media are tappable or interceptable. This means that all the data that flows across the...provides the capability for NTBN nodes to support users operating in differing COIs to share the computing resources and communication media and for...representation. Again generally speaking, the NTBN must act as the high-speed, wide-bandwidth communications media that would provide the "near real-time
Toolkits and Libraries for Deep Learning.
Erickson, Bradley J; Korfiatis, Panagiotis; Akkus, Zeynettin; Kline, Timothy; Philbrick, Kenneth
2017-08-01
Deep learning is an important new area of machine learning which encompasses a wide range of neural network architectures designed to complete various tasks. In the medical imaging domain, example tasks include organ segmentation, lesion detection, and tumor classification. The most popular network architecture for deep learning for images is the convolutional neural network (CNN). Whereas traditional machine learning requires determination and calculation of features from which the algorithm learns, deep learning approaches learn the important features as well as the proper weighting of those features to make predictions for new data. In this paper, we will describe some of the libraries and tools that are available to aid in the construction and efficient execution of deep learning as applied to medical images.
Implementation of the force decomposition machine for molecular dynamics simulations.
Borštnik, Urban; Miller, Benjamin T; Brooks, Bernard R; Janežič, Dušanka
2012-09-01
We present the design and implementation of the force decomposition machine (FDM), a cluster of personal computers (PCs) that is tailored to running molecular dynamics (MD) simulations using the distributed diagonal force decomposition (DDFD) parallelization method. The cluster interconnect architecture is optimized for the communication pattern of the DDFD method. Our implementation of the FDM relies on standard commodity components even for networking. Although the cluster is meant for DDFD MD simulations, it remains general enough for other parallel computations. An analysis of several MD simulation runs on both the FDM and a standard PC cluster demonstrates that the FDM's interconnect architecture provides a greater performance compared to a more general cluster interconnect. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Harper, R. E.; Alger, L. S.; Babikyan, C. A.; Butler, B. P.; Friend, S. A.; Ganska, R. J.; Lala, J. H.; Masotto, T. K.; Meyer, A. J.; Morton, D. P.
1992-01-01
Digital computing systems needed for Army programs such as the Computer-Aided Low Altitude Helicopter Flight Program and the Armored Systems Modernization (ASM) vehicles may be characterized by high computational throughput and input/output bandwidth, hard real-time response, high reliability and availability, and maintainability, testability, and producibility requirements. In addition, such a system should be affordable to produce, procure, maintain, and upgrade. To address these needs, the Army Fault Tolerant Architecture (AFTA) is being designed and constructed under a three-year program comprised of a conceptual study, detailed design and fabrication, and demonstration and validation phases. Described here are the results of the conceptual study phase of the AFTA development. Given here is an introduction to the AFTA program, its objectives, and key elements of its technical approach. A format is designed for representing mission requirements in a manner suitable for first order AFTA sizing and analysis, followed by a discussion of the current state of mission requirements acquisition for the targeted Army missions. An overview is given of AFTA's architectural theory of operation.
NASA Astrophysics Data System (ADS)
Garg, Amit Kumar; Madavi, Amresh Ashok; Janyani, Vijay
2017-02-01
A flexible hybrid wavelength division multiplexing-time division multiplexing passive optical network architecture is proposed that allows dual rate signals to be sent at 1 and 10 Gbps to each optical networking unit depending upon the traffic load. The proposed design allows dynamic wavelength allocation with pay-as-you-grow deployment capability. This architecture is capable of providing up to 40 Gbps of equal data rates to all optical distribution networks (ODNs) and up to 70 Gbps of an asymmetrical data rate to a specific ODN. The proposed design handles broadcasting capability with simultaneous point-to-point transmission, which further reduces energy consumption. In this architecture, each module sends a wavelength to each ODN, thus making the architecture fully flexible; this flexibility allows network providers to use only the required OLT components and switch off the others. The design is also resilient to any module or TRx failure and continues to provide services without disruption. Dynamic wavelength allocation and pay-as-you-grow deployment support network extensibility and bandwidth scalability to handle future generation access networks.
Di Lucente, S; Luo, J; Centelles, R Pueyo; Rohit, A; Zou, S; Williams, K A; Dorren, H J S; Calabretta, N
2013-01-14
Data centers have to sustain the rapid growth of data traffic due to the increasing demand for bandwidth-hungry internet services. The current intra-data center fat tree topology causes communication bottlenecks in the server interaction process and requires power-hungry O-E-O conversions that limit the minimum latency and the power efficiency of these systems. In this paper we numerically and experimentally investigate an optical packet switch architecture with a modular structure and highly distributed control that allows configuration times on the order of nanoseconds. Numerical results indicate that the candidate architecture, scaled to over 4000 ports, provides an overall throughput over 50 Tb/s and a packet loss rate below 10^-6 while assuring sub-microsecond latency. We present experimental results that demonstrate the feasibility of a 16x16 optical packet switch based on parallel 1x4 integrated optical cross-connect modules. Error-free operation can be achieved with a 4 dB penalty, while the overall energy consumption is 66 pJ/b. Based on those results, we discuss the feasibility of scaling the architecture to a much larger port count.
2004-06-01
element can be applied to achieve this goal. Summary: This document describes the study of a circularly polarized printed antenna fabricated on a...LTCC (low temperature co-fired ceramic) material. This antenna is used as the radiating element of a phased array having an architecture of...the analysis of an elementary antenna that can be used in a phased array having a “tile”-type architecture operating in the EHF band. The
Special Issue on a Fault Tolerant Network on Chip Architecture
NASA Astrophysics Data System (ADS)
Janidarmian, Majid; Tinati, Melika; Khademzadeh, Ahmad; Ghavibazou, Maryam; Fekr, Atena Roshan
2010-06-01
In this paper, a fast and efficient spare switch selection algorithm is presented for a reliable NoC architecture, called FERNA, based on a specific application mapped onto a mesh topology. Based on the ring concept used in FERNA, this algorithm achieves results equivalent to those of the exhaustive algorithm with much less run time, improving two parameters. The inputs of the FERNA algorithm for minimizing system response time and extra communication cost are derived from high-level transaction simulation using SystemC TLM and from a mathematical formulation, respectively. The results demonstrate that improving the above-mentioned parameters advances whole-system reliability, which is calculated analytically. The mapping algorithm has also been investigated as a factor affecting extra bandwidth requirements and system reliability.
NASA Astrophysics Data System (ADS)
Valerio Testa, Paolo; Klein, Bernhard; Hahnel, Ronny; Plettemeier, Dirk; Carta, Corrado; Ellinger, Frank
2017-09-01
This paper presents an overview of the research work currently being performed within the frame of project DAAB and its successor DAAB-TX towards the integration of ultra-wideband transceivers operating at mm-wave frequencies and capable of data rates up to 100 Gbit/s. Two basic system architectures are being considered: integrating a broadband antenna with a distributed amplifier, and integrating antennas centered at adjacent frequencies with broadband active combiners or dividers. The paper discusses in detail the design of such systems and their components, from the distributed amplifiers and combiners to the broadband silicon antennas and their single-chip integration. All components are designed for fabrication in a commercially available SiGe:C BiCMOS technology. The presented results represent the state of the art in their respective areas: 170 GHz is the highest reported bandwidth for distributed amplifiers integrated in silicon; 89 GHz is the widest reported bandwidth for integrated-system antennas; the simulated performance of the two-antenna integrated receiver spans 105 GHz centered at 148 GHz, which would improve the state of the art by a factor in excess of 4 even against III-V implementations, if confirmed by measurements.
Cislan-2 extension final document by University of Twente (Netherlands)
NASA Astrophysics Data System (ADS)
Niemegeers, Ignas; Baumann, Frank; Beuwer, Wim; Jordense, Marcel; Pras, Aiko; Schutte, Leon; Tracey, Ian
1992-01-01
Results of work performed under the so-called Cislan extension contract are presented. The adaptation of the Cislan 2 prototype design to an environment of interconnected Local Area Networks (LANs) instead of a single 802.5 token ring LAN is considered. In order to extend the network architecture, the Interconnection Function (IF) protocol layer was subdivided into two protocol layers: a new IF layer and, below it, the Medium Enhancement (ME) protocol layer. Some small enhancements to the distributed bandwidth allocation protocol were developed, which in fact are also applicable to the 'normal' Cislan 2 system. The new services and protocols are described together with some scenarios and requirements for the new internetting Cislan 2 system. How to overcome the degradation of the quality of speech due to packet loss on the LAN subsystem was studied. Experiments were planned in order to measure this speech quality degradation. Two Cislan subsystems were simulated: the distributed bandwidth allocation protocol and the clock synchronization mechanism. Results of both simulations, performed on SUN workstations using QNAP as a simulation tool, are given.
A macrochip interconnection network enabled by silicon nanophotonic devices.
Zheng, Xuezhe; Cunningham, John E; Koka, Pranay; Schwetman, Herb; Lexau, Jon; Ho, Ron; Shubin, Ivan; Krishnamoorthy, Ashok V; Yao, Jin; Mekis, Attila; Pinguet, Thierry
2010-03-01
We present an advanced wavelength-division multiplexing point-to-point network enabled by silicon nanophotonic devices. This network offers strictly non-blocking all-to-all connectivity while maximizing bisection bandwidth, making it ideal for multi-core and multi-processor interconnections. We introduce one of the key components, the nanophotonic grating coupler, and discuss, for the first time, how this device can be useful for practical implementations of the wavelength-division multiplexing network using optical proximity communications. Finite difference time-domain simulation of the nanophotonic grating coupler device indicates that it can be made compact (20 μm x 50 μm), low loss (3.8 dB), and broadband (100 nm). These couplers require subwavelength material modulation at the nanoscale to achieve the desired functionality. We show that optical proximity communication provides unmatched optical I/O bandwidth density to electrical chips, which enables the application of wavelength-division multiplexing point-to-point network in macrochip with unprecedented bandwidth-density. The envisioned physical implementation is discussed. The benefits of such an interconnect network include a 5-6x improvement in latency when compared to a purely electronic implementation. Performance analysis shows that the wavelength-division multiplexing point-to-point network offers better overall performance over other optical network architectures.
NASA Astrophysics Data System (ADS)
Chang, Daniel Y.; Rowe, Neil C.
2013-05-01
While conducting cutting-edge research in a specific domain, we realized that (1) requirements clarity and correctness are crucial to our success [1], (2) hardware is hard to change, so most of the work is in software requirements development, coding, and testing [2], (3) requirements are constantly changing, so that configurability, reusability, scalability, adaptability, modularity, and testability are important non-functional attributes [3], (4) cross-domain knowledge is necessary for complex systems [4], and (5) if our research is successful, the results could be applied to other domains with similar problems. In this paper, we propose to use model-driven requirements engineering (MDRE) to model and guide our requirements/development, since models are easy to understand, execute, and modify. The domain for our research is Electronic Warfare (EW) real-time ultra-wide instantaneous bandwidth (IBW1) signal simulation. The four proposed MDRE models are (1) Switch-and-Filter architecture, (2) multiple parallel data bit streams alignment, (3) post-ADC and pre-DAC bits re-mapping, and (4) Discrete Fourier Transform (DFT) filter bank. This research is unique since the instantaneous bandwidth we are dealing with is in the gigahertz range instead of the conventional megahertz range.
Digital Intermediate Frequency Receiver Module For Use In Airborne Sar Applications
Tise, Bertice L.; Dubbert, Dale F.
2005-03-08
A digital IF receiver (DRX) module directly compatible with advanced radar systems such as synthetic aperture radar (SAR) systems. The DRX can combine a 1 G-Sample/sec 8-bit ADC with a high-speed digital signal processor, such as high gate-count FPGA technology or ASICs, to realize a wideband IF receiver. DSP operations implemented in the DRX can include quadrature demodulation and multi-rate, variable-bandwidth IF filtering. Pulse-to-pulse (Doppler domain) filtering can also be implemented in the form of a presummer (accumulator) and an azimuth prefilter. An out-of-band noise source can be employed to provide a dither signal to the ADC, and later be removed by digital signal processing. Both the range and Doppler domain filtering operations can be implemented using a unique pane architecture which allows on-the-fly selection of the filter decimation factor, and hence, the filter bandwidth. The DRX module can include a standard VME-64 interface for control, status, and programming. An interface can provide phase history data to the real-time image formation processors. A third front-panel data port (FPDP) interface can send wide bandwidth, raw phase histories to a real-time phase history recorder for ground processing.
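The presummer (accumulator) mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible form, in which a fixed number of consecutive pulses are summed sample by sample so that the pulse rate drops by that factor; the function name `presum` and the list-of-lists data layout are hypothetical and are not taken from the patent.

```python
def presum(pulses, factor):
    """Sum each group of `factor` consecutive pulses, range bin by range bin.

    `pulses` is a list of equal-length sample lists (one list per pulse).
    Leftover pulses that do not fill a complete group are dropped.
    This is an illustrative accumulator, not the patented design.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    out = []
    for start in range(0, len(pulses) - factor + 1, factor):
        group = pulses[start:start + factor]
        # zip(*group) walks the same range bin across all pulses in the group
        out.append([sum(bin_samples) for bin_samples in zip(*group)])
    return out


summed = presum([[1, 2], [3, 4], [5, 6], [7, 8]], 2)
print(summed)
```

Summing before any further Doppler filtering reduces the data rate the downstream stages must handle, which is presumably why the abstract pairs the presummer with an azimuth prefilter.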
Automated planning for intelligent machines in energy-related applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weisbin, C.R.; de Saussure, G.; Barhen, J.
1984-01-01
This paper discusses the current activities of the Center for Engineering Systems Advanced Research (CESAR) program related to plan generation and execution by an intelligent machine. The system architecture for the CESAR mobile robot (named HERMIES-1) is described. The minimal cut-set approach is developed to reduce the tree search time of conventional backward chaining planning techniques. Finally, a real-time concept of an Intelligent Machine Operating System is presented in which planning and reasoning are embedded in a system for resource allocation and process management.
Table-driven software architecture for a stitching system
NASA Technical Reports Server (NTRS)
Thrash, Patrick J. (Inventor); Miller, Jeffrey L. (Inventor); Pallas, Ken (Inventor); Trank, Robert C. (Inventor); Fox, Rhoda (Inventor); Korte, Mike (Inventor); Codos, Richard (Inventor); Korolev, Alexandre (Inventor); Collan, William (Inventor)
2001-01-01
Native code for a CNC stitching machine is generated by generating a geometry model of a preform; generating tool paths from the geometry model, the tool paths including stitching instructions for making stitches; and generating additional instructions indicating thickness values. The thickness values are obtained from a lookup table. When the stitching machine runs the native code, it accesses a lookup table to determine a thread tension value corresponding to the thickness value. The stitching machine accesses another lookup table to determine a thread path geometry value corresponding to the thickness value.
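The table-driven lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the table contents, the units, the rule of picking the first entry whose thickness bound is not exceeded, and the names `TENSION_TABLE`, `PATH_GEOMETRY_TABLE`, and `lookup` are all hypothetical and do not come from the patent.

```python
from bisect import bisect_left

# Hypothetical tables: (maximum thickness, value) pairs sorted by thickness.
TENSION_TABLE = [(5.0, 1.0), (10.0, 1.5), (20.0, 2.2)]
PATH_GEOMETRY_TABLE = [(5.0, "short"), (10.0, "medium"), (20.0, "long")]


def lookup(table, thickness):
    """Return the value of the first entry whose bound is >= thickness."""
    bounds = [bound for bound, _ in table]
    index = bisect_left(bounds, thickness)
    if index == len(table):
        raise ValueError(f"thickness {thickness} exceeds the table range")
    return table[index][1]


# The same thickness value drives both machine settings.
thickness = 7.0
print(lookup(TENSION_TABLE, thickness), lookup(PATH_GEOMETRY_TABLE, thickness))
```

Keeping the machine settings in tables rather than in the generated native code means the code only needs to carry thickness values, and the settings can be retuned without regenerating tool paths.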
System software for the finite element machine
NASA Technical Reports Server (NTRS)
Crockett, T. W.; Knott, J. D.
1985-01-01
The Finite Element Machine is an experimental parallel computer developed at Langley Research Center to investigate the application of concurrent processing to structural engineering analysis. This report describes system-level software which has been developed to facilitate use of the machine by applications researchers. The overall software design is outlined, and several important parallel processing issues are discussed in detail, including processor management, communication, synchronization, and input/output. Based on experience using the system, the hardware architecture and software design are critiqued, and areas for further work are suggested.
A system framework of inter-enterprise machining quality control based on fractal theory
NASA Astrophysics Data System (ADS)
Zhao, Liping; Qin, Yongtao; Yao, Yiyong; Yan, Peng
2014-03-01
In order to meet the quality control requirement of dynamic and complicated product machining processes among enterprises, a system framework of inter-enterprise machining quality control based on fractal theory was proposed. In this system framework, the fractal-specific characteristic of the inter-enterprise machining quality control function was analysed, and the model of inter-enterprise machining quality control was constructed from the nature of fractal structures. Furthermore, the goal-driven strategy of inter-enterprise quality control and the dynamic organisation strategy of inter-enterprise quality improvement were constructed by characteristic analysis of this model. In addition, the architecture of inter-enterprise machining quality control based on fractal theory was established by means of Web services. Finally, a case study for application was presented. The result showed that the proposed method was feasible, and could provide guidance for quality control and support for product reliability in inter-enterprise machining processes.
Minimalism context-aware displays.
Cai, Yang
2004-12-01
Despite the rapid development of cyber technologies, today we still have very limited attention and communication bandwidth to process the increasing information flow. The goal of the study is to develop a context-aware filter to match the information load with particular needs and capacities. The functions include bandwidth-resolution trade-off and user context modeling. From the empirical lab studies, it is found that the resolution of images can be reduced by an order of magnitude if the viewer knows that he/she is looking for particular features. The adaptive display queue is optimized with real-time operational conditions and the user's inquiry history. Instead of measuring the operator's behavior directly, ubiquitous computing models are developed to anticipate the user's behavior from the operational environment data. A case study of video stream monitoring for transit security is discussed in the paper. In addition, the author addresses the future direction of coherent human-machine vision systems.
Modeling node bandwidth limits and their effects on vector combining algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Littlefield, R.J.
Each node in a message-passing multicomputer typically has several communication links. However, the maximum aggregate communication speed of a node is often less than the sum of its individual link speeds. Such computers are called node bandwidth limited (NBL). The NBL constraint is important when choosing algorithms because it can change the relative performance of different algorithms that accomplish the same task. This paper introduces a model of communication performance for NBL computers and uses the model to analyze the overall performance of three algorithms for vector combining (global sum) on the Intel Touchstone DELTA computer. Each of the three algorithms is found to be at least 33% faster than the other two for some combinations of machine size and vector length. The NBL constraint is shown to significantly affect the conditions under which each algorithm is fastest.
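The node bandwidth limit defined above can be made concrete with a small sketch. It assumes the simplest reading of the definition, namely that a node's aggregate rate is capped at the smaller of the summed link speeds and a per-node ceiling; this is not the model from the paper, and the function names, parameters, and numbers are hypothetical.

```python
def aggregate_bandwidth(links_active, link_bw, node_bw):
    """Aggregate rate of one node: summed link speed, capped by the node limit."""
    return min(links_active * link_bw, node_bw)


def transfer_time(bytes_per_link, links_active, link_bw, node_bw, latency=0.0):
    """Time to move `bytes_per_link` over each active link concurrently."""
    total_bytes = bytes_per_link * links_active
    return latency + total_bytes / aggregate_bandwidth(links_active, link_bw, node_bw)


# With a 25 MB/s node limit, driving four 10 MB/s links gains little over three.
for k in (1, 2, 3, 4):
    print(k, aggregate_bandwidth(k, 10.0, 25.0))
```

Under such a cap, an algorithm that uses all links at once loses part of its expected advantage over one that uses fewer links per step, which is one way the relative ranking of algorithms could shift with machine size and vector length.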
Virtualization - A Key Cost Saver in NASA Multi-Mission Ground System Architecture
NASA Technical Reports Server (NTRS)
Swenson, Paul; Kreisler, Stephen; Sager, Jennifer A.; Smith, Dan
2014-01-01
With science team budgets being slashed, and a lack of adequate facilities for science payload teams to operate their instruments, there is a strong need for innovative new ground systems that are able to provide necessary levels of capability, processing power, system availability and redundancy while maintaining a small footprint in terms of physical space, power utilization and cooling. The ground system architecture being presented is based off of heritage from several other projects currently in development or operations at Goddard, but was designed and built specifically to meet the needs of the Science and Planetary Operations Control Center (SPOCC) as a low-cost payload command, control, planning and analysis operations center. However, this SPOCC architecture was designed to be generic enough to be re-used partially or in whole by other labs and missions (since its inception that has already happened in several cases!). The SPOCC architecture leverages a highly available VMware-based virtualization cluster with shared SAS Direct-Attached Storage (DAS) to provide an extremely high-performing, low-power-utilization and small-footprint compute environment that provides Virtual Machine resources shared among the various tenant missions in the SPOCC. The storage is also expandable, allowing future missions to chain up to 7 additional 2U chassis of storage at an extremely competitive cost if they require additional archive or virtual machine storage space. The software architecture provides a fully-redundant GMSEC-based message bus architecture based on the ActiveMQ middleware to track all health and safety status within the SPOCC ground system.
All virtual machines utilize the GMSEC system agents to report system host health over the GMSEC bus, and spacecraft payload health is monitored using the Hammers Integrated Test and Operations System (ITOS) Galaxy Telemetry and Command (TC) system, which performs near-real-time limit checking and data processing on the downlinked data stream and injects messages into the GMSEC bus that are monitored to automatically page the on-call operator or Systems Administrator (SA) when an off-nominal condition is detected. This architecture, like the LTSP thin clients, is shared across all tenant missions. Other required IT security controls are implemented at the ground system level, including physical access controls, logical system-level authentication authorization management, auditing and reporting, network management and a NIST 800-53 FISMA-Moderate IT Security plan Risk Assessment Contingency Plan, helping multiple missions share the cost of compliance with agency-mandated directives. The SPOCC architecture provides science payload control centers and backup mission operations centers with a cost-effective, standardized approach to virtualizing and monitoring resources that were traditionally multiple racks full of physical machines. The increased agility in deploying new virtual systems and thin client workstations can provide significant savings in personnel costs for maintaining the ground system. The cost savings in procurement, power, rack footprint and cooling as well as the shared multi-mission design greatly reduce upfront cost for missions moving into the facility. Overall, the authors hope that this architecture will become a model for how future NASA operations centers are constructed!
NASA Astrophysics Data System (ADS)
Zheng, Jun; Ansari, Nirwan
2005-03-01
Call for Papers: Optical Access Networks With the wide deployment of fiber-optic technology over the past two decades, we have witnessed a tremendous growth of bandwidth capacity in the backbone networks of today's telecommunications infrastructure. However, access networks, which cover the "last-mile" areas and serve numerous residential and small business users, have not been scaled up commensurately. The local subscriber lines for telephone and cable television are still using twisted pairs and coaxial cables. Most residential connections to the Internet are still through dial-up modems operating at a low speed on twisted pairs. As the demand for access bandwidth increases with emerging high-bandwidth applications, such as distance learning, high-definition television (HDTV), and video on demand (VoD), the last-mile access networks have become a bandwidth bottleneck in today's telecommunications infrastructure. To ease this bottleneck, it is imperative to provide sufficient bandwidth capacity in the access networks to open the bottleneck and thus present more opportunities for the provisioning of multiservices. Optical access solutions promise huge bandwidth to service providers and low-cost high-bandwidth services to end users and are therefore widely considered the technology of choice for next-generation access networks. To realize the vision of optical access networks, however, many key issues still need to be addressed, such as network architectures, signaling protocols, and implementation standards. The major challenges lie in the fact that an optical solution must be not only robust, scalable, and flexible, but also implemented at a low cost comparable to that of existing access solutions in order to increase the economic viability of many potential high-bandwidth applications. In recent years, optical access networks have been receiving tremendous attention from both academia and industry. 
A large number of research activities have been carried out or are now underway in this hot area. The purpose of this feature issue is to expose the networking community to the latest research breakthroughs and progress in the area of optical access networks. This feature issue aims to present a collection of papers that focus on the state-of-the-art research in various networking aspects of optical access networks. Original papers are solicited from all researchers involved in the area of optical access networks. Topics of interest include but are not limited to:
Finding idle machines in a workstation-based distributed system
NASA Technical Reports Server (NTRS)
Theimer, Marvin M.; Lantz, Keith A.
1989-01-01
The authors describe the design and performance of scheduling facilities for finding idle hosts in a workstation-based distributed system. They focus on the tradeoffs between centralized and decentralized architectures with respect to scalability, fault tolerance, and simplicity of design, as well as several implementation issues of interest when multicast communication is used. They conclude that the principal tradeoff between the two approaches is that a centralized architecture can be scaled to a significantly greater degree and can more easily monitor global system statistics, whereas a decentralized architecture is simpler to implement.
Bit-parallel arithmetic in a massively-parallel associative processor
NASA Technical Reports Server (NTRS)
Scherson, Isaac D.; Kramer, David A.; Alleyne, Brian D.
1992-01-01
A simple but powerful new architecture based on a classical associative processor model is presented. Algorithms for performing the four basic arithmetic operations both for integer and floating point operands are described. For m-bit operands, the proposed architecture makes it possible to execute complex operations in O(m) cycles as opposed to O(m^2) for bit-serial machines. A word-parallel, bit-parallel, massively-parallel computing system can be constructed using this architecture with VLSI technology. The operation of this system is demonstrated for the fast Fourier transform and matrix multiplication.
2008-09-01
Abbreviations ATM automated teller machine BEA business enterprise architecture DOD...Limitations Automated Teller Machines (ATMs)-At-Sea 1988 Localized, shipboard ATMs that received and accounted for a portion of sailors’ and...use smart card technology for electronic retail transactions and (2) economically justified on the basis of reliable analyses of estimated costs and
Specification and Analysis of Parallel Machine Architecture
1990-03-17
Parallel Machine Architecture C.V. Ramamoorthy Computer Science Division Dept. of Electrical Engineering and Computer Science University of California...capacity. (4) Adaptive: The overhead in resolution of deadlocks, etc. should be in proportion to their frequency. (5) Avoid rollbacks: Rollbacks can be...snapshots of system state graphically at a rate proportional to simulation time. Some of the examples are as follows: (1) When the simulation clock of
Reference Architecture for MNE 5 Technical System
2007-05-30
of being available in most experiments. Core Services A core set of applications whi directories, web portal and collaboration applications etc. A...classifications Messages (xml, JMS, content level…) Meta data filtering, who can initiate services Web browsing Collaboration & messaging Border...Exchange Ref Architecture for MNE5 Tech System.doc 9 of 21 audit logging Person and machine Data lev objects, web services, messages rification el
A Boltzmann machine for the organization of intelligent machines
NASA Technical Reports Server (NTRS)
Moed, Michael C.; Saridis, George N.
1990-01-01
A three-tier structure consisting of organization, coordination, and execution levels forms the architecture of an intelligent machine using the principle of increasing precision with decreasing intelligence from a hierarchically intelligent control. This system has been formulated as a probabilistic model, where uncertainty and imprecision can be expressed in terms of entropies. The optimal strategy for decision planning and task execution can be found by minimizing the total entropy in the system. The focus is on the design of the organization level as a Boltzmann machine. Since this level is responsible for planning the actions of the machine, the Boltzmann machine is reformulated to use entropy as the cost function to be minimized. Simulated annealing, expanding subinterval random search, and the genetic algorithm are presented as search techniques to efficiently find the desired action sequence and illustrated with numerical examples.
Optical Peaking Enhancement in High-Speed Ring Modulators
Müller, J.; Merget, F.; Azadeh, S. Sharif; Hauck, J.; García, S. Romero; Shen, B.; Witzens, J.
2014-01-01
Ring resonator modulators (RRM) combine extreme compactness, low power consumption and wavelength division multiplexing functionality, making them a frontrunner for addressing the scalability requirements of short distance optical links. To extend data rates beyond the classically assumed bandwidth capability, we derive and experimentally verify closed form equations of the electro-optic response and asymmetric side band generation resulting from inherent transient time dynamics and leverage these to significantly improve device performance. An equivalent circuit description with a commonly used peaking amplifier model allows straightforward assessment of the effect on existing communication system architectures. A small signal analytical expression of peaking in the electro-optic response of RRMs is derived and used to extend the electro-optic bandwidth of the device above 40 GHz as well as to open eye diagrams penalized by intersymbol interference at 32, 40 and 44 Gbps. Predicted peaking and asymmetric side band generation are in excellent agreement with experiments. PMID:25209255
An acquisition system for CMOS imagers with a genuine 10 Gbit/s bandwidth
NASA Astrophysics Data System (ADS)
Guérin, C.; Mahroug, J.; Tromeur, W.; Houles, J.; Calabria, P.; Barbier, R.
2012-12-01
This paper presents a high data throughput acquisition system for pixel detector readout such as CMOS imagers. This CMOS acquisition board offers a genuine 10 Gbit/s bandwidth to the workstation and can provide an on-line and continuous high frame rate imaging capability. On-line processing can be implemented either on the Data Acquisition Board or on the multi-core workstation depending on the complexity of the algorithms. The different parts composing the acquisition board have been designed to be used first with a single-photon detector called LUSIPHER (800×800 pixels), developed in our laboratory for scientific applications ranging from nano-photonics to adaptive optics. The architecture of the acquisition board is presented and the performance achieved by the produced boards is described. The future developments (hardware and software) concerning the on-line implementation of algorithms dedicated to single-photon imaging are discussed.
Wide-Range Motion Estimation Architecture with Dual Search Windows for High Resolution Video Coding
NASA Astrophysics Data System (ADS)
Dung, Lan-Rong; Lin, Meng-Chun
This paper presents a memory-efficient motion estimation (ME) technique for high-resolution video compression. The main objective is to reduce external memory access, especially when the local memory resource is limited. Reducing memory access also lowers power consumption. The key to reducing memory accesses is a center-biased algorithm, which performs the motion vector (MV) search with the minimum search data. To exploit data reusability, the proposed dual-search-windowing (DSW) approaches use the secondary window optionally, only when the search requires it. This lightens the loading of search windows and hence reduces the required external memory bandwidth. The proposed techniques can save up to 81% of external memory bandwidth and require only 135 MBytes/sec, while the quality degradation is less than 0.2 dB for 720p HDTV clips coded at 8 Mbits/sec.
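The two bandwidth figures quoted above can be related with a short calculation. The sketch assumes, without confirmation from the abstract, that the 81% saving and the 135 MBytes/sec requirement refer to the same operating point; the function name `implied_baseline` is hypothetical.

```python
def implied_baseline(required_bw, saving_fraction):
    """Baseline bandwidth implied by a remaining requirement and a fractional saving."""
    if not 0.0 <= saving_fraction < 1.0:
        raise ValueError("saving_fraction must be in [0, 1)")
    return required_bw / (1.0 - saving_fraction)


# Assumption: both quoted figures describe the same test condition.
baseline = implied_baseline(135.0, 0.81)
print(f"implied baseline: {baseline:.1f} MBytes/sec")
```

If the assumption holds, the unoptimized search would need roughly 710 MBytes/sec of external memory bandwidth; if the two figures come from different clips or settings, the result is only an order-of-magnitude indication.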
Wang, Kang; Gu, Huaxi; Yang, Yintang; Wang, Kun
2015-08-10
With the number of cores increasing, there is an emerging need for a high-bandwidth low-latency interconnection network, serving core-to-memory communication. In this paper, aiming at the goal of simultaneous access to multi-rank memory, we propose an optical interconnection network for core-to-memory communication. In the proposed network, the wavelength usage is delicately arranged so that cores can communicate with different ranks at the same time and broadcast for flow control can be achieved. A distributed memory controller architecture that works in a pipeline mode is also designed for efficient optical communication and transaction address processes. The scaling method and wavelength assignment for the proposed network are investigated. Compared with traditional electronic bus-based core-to-memory communication, the simulation results based on the PARSEC benchmark show that the bandwidth enhancement and latency reduction are apparent.
SINET3: advanced optical and IP hybrid network
NASA Astrophysics Data System (ADS)
Urushidani, Shigeo
2007-11-01
This paper introduces the new Japanese academic backbone network called SINET3, which has been in full-scale operation since June 2007. SINET3 provides a wide variety of network services, such as multi-layer transfer, enriched VPN, enhanced QoS, and layer-1 bandwidth on demand (BoD) services to create an innovative and prolific science infrastructure for more than 700 universities and research institutions. The network applies an advanced hybrid network architecture composed of 75 layer-1 switches and 12 high-performance IP routers to accommodate such diversified services in a single network platform, and provides sufficient bandwidth using Japan's first STM256 (40 Gbps) lines. The network adopts lots of the latest networking technologies, such as next-generation SDH (VCAT/GFP/LCAS), GMPLS, advanced MPLS, and logical-router technologies, for high network convergence, flexible resource assignment, and high service availability. This paper covers the network services, network design, and networking technologies of SINET3.
Behzadi, Kobra; Baghelani, Masoud
2014-05-01
This paper presents a third order continuous time current mode ΣΔ modulator for WLAN 802.11b standard applications. The proposed circuit utilized feedback architecture with scaled and optimized DAC coefficients. At circuit level, we propose a modified cascade current mirror integrator with reduced input impedance which results in more bandwidth and linearity and hence improves the dynamic range. Also, a very fast and precise novel dynamic latch based current comparator is introduced with low power consumption. This ultra fast comparator facilitates increasing the sampling rate toward GHz frequencies. The modulator exhibits dynamic range of more than 60 dB for 20 MHz signal bandwidth and OSR of 10 while consuming only 914 μW from 1.8 V power supply. The FoM of the modulator is calculated from two different methods, and excellent performance is achieved for proposed modulator.
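The figures quoted in the abstract above can be cross-checked with a short calculation. The abstract does not say which two figure-of-merit definitions the authors used, so the sketch below assumes the two commonly cited ones (a Walden-style energy per conversion step and a Schreier-style dynamic-range FoM) and takes the dynamic range as exactly 60 dB, although the abstract says "more than 60 dB". All function names are hypothetical.

```python
import math


def sampling_rate(bandwidth_hz, osr):
    """Sampling rate implied by the oversampling ratio: fs = 2 * OSR * BW."""
    return 2.0 * osr * bandwidth_hz


def enob_from_dr(dr_db):
    """Effective number of bits from dynamic range, using the 6.02N + 1.76 dB rule."""
    return (dr_db - 1.76) / 6.02


def walden_fom(power_w, bandwidth_hz, dr_db):
    """Assumed Walden-style FoM in joules per conversion step."""
    return power_w / (2.0 * bandwidth_hz * 2.0 ** enob_from_dr(dr_db))


def schreier_fom(power_w, bandwidth_hz, dr_db):
    """Assumed Schreier-style FoM in dB: DR + 10*log10(BW / P)."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)


fs = sampling_rate(20e6, 10)              # 20 MHz bandwidth, OSR of 10
fom_w = walden_fom(914e-6, 20e6, 60.0)    # 914 microwatts, DR taken as 60 dB
fom_s = schreier_fom(914e-6, 20e6, 60.0)
print(f"fs = {fs / 1e6:.0f} MHz, Walden = {fom_w * 1e15:.1f} fJ/step, Schreier = {fom_s:.1f} dB")
```

The implied 400 MHz sampling rate is consistent with the abstract's remark that the fast comparator pushes the sampling rate toward GHz frequencies; the FoM values are only as reliable as the assumed definitions.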
Behzadi, Kobra; Baghelani, Masoud
2013-01-01
This paper presents a third order continuous time current mode ΣΔ modulator for WLAN 802.11b standard applications. The proposed circuit utilized feedback architecture with scaled and optimized DAC coefficients. At circuit level, we propose a modified cascade current mirror integrator with reduced input impedance which results in more bandwidth and linearity and hence improves the dynamic range. Also, a very fast and precise novel dynamic latch based current comparator is introduced with low power consumption. This ultra fast comparator facilitates increasing the sampling rate toward GHz frequencies. The modulator exhibits dynamic range of more than 60 dB for 20 MHz signal bandwidth and OSR of 10 while consuming only 914 μW from 1.8 V power supply. The FoM of the modulator is calculated from two different methods, and excellent performance is achieved for proposed modulator. PMID:25685504
A Sensor-Based Method for Diagnostics of Machine Tool Linear Axes.
Vogl, Gregory W; Weiss, Brian A; Donmez, M Alkan
2015-01-01
A linear axis is a vital subsystem of machine tools, which are vital systems within many manufacturing operations. When installed and operating within a manufacturing facility, a machine tool needs to stay in good condition for parts production. All machine tools degrade during operations, yet knowledge of that degradation is elusive; specifically, accurately detecting degradation of linear axes is a manual and time-consuming process. Thus, manufacturers need automated and efficient methods to diagnose the condition of their machine tool linear axes without disruptions to production. The Prognostics and Health Management for Smart Manufacturing Systems (PHM4SMS) project at the National Institute of Standards and Technology (NIST) developed a sensor-based method to quickly estimate the performance degradation of linear axes. The multi-sensor-based method uses data collected from a 'sensor box' to identify changes in linear and angular errors due to axis degradation; the sensor box contains inclinometers, accelerometers, and rate gyroscopes to capture this data. The sensors are expected to be cost effective with respect to savings in production losses and scrapped parts for a machine tool. Numerical simulations, based on sensor bandwidth and noise specifications, show that changes in straightness and angular errors could be known with acceptable test uncertainty ratios. If a sensor box resides on a machine tool and data is collected periodically, then the degradation of the linear axes can be determined and used for diagnostics and prognostics to help optimize maintenance, production schedules, and ultimately part quality.
A Sensor-Based Method for Diagnostics of Machine Tool Linear Axes
Vogl, Gregory W.; Weiss, Brian A.; Donmez, M. Alkan
2017-01-01
A linear axis is a vital subsystem of machine tools, which are vital systems within many manufacturing operations. When installed and operating within a manufacturing facility, a machine tool needs to stay in good condition for parts production. All machine tools degrade during operations, yet knowledge of that degradation is elusive; specifically, accurately detecting degradation of linear axes is a manual and time-consuming process. Thus, manufacturers need automated and efficient methods to diagnose the condition of their machine tool linear axes without disruptions to production. The Prognostics and Health Management for Smart Manufacturing Systems (PHM4SMS) project at the National Institute of Standards and Technology (NIST) developed a sensor-based method to quickly estimate the performance degradation of linear axes. The multi-sensor-based method uses data collected from a ‘sensor box’ to identify changes in linear and angular errors due to axis degradation; the sensor box contains inclinometers, accelerometers, and rate gyroscopes to capture this data. The sensors are expected to be cost effective with respect to savings in production losses and scrapped parts for a machine tool. Numerical simulations, based on sensor bandwidth and noise specifications, show that changes in straightness and angular errors could be known with acceptable test uncertainty ratios. If a sensor box resides on a machine tool and data is collected periodically, then the degradation of the linear axes can be determined and used for diagnostics and prognostics to help optimize maintenance, production schedules, and ultimately part quality. PMID:28691039
MARTI: man-machine animation real-time interface
NASA Astrophysics Data System (ADS)
Jones, Christian M.; Dlay, Satnam S.
1997-05-01
The research introduces MARTI (man-machine animation real-time interface) for the realization of natural human-machine interfacing. The system uses simple vocal sound-tracks of human speakers to provide lip synchronization of computer graphical facial models. We present novel research in a number of engineering disciplines, which include speech recognition, facial modeling, and computer animation. This interdisciplinary research utilizes the latest hybrid connectionist/hidden Markov model speech recognition system to provide very accurate phone recognition and timing for speaker independent continuous speech, and expands on knowledge from the animation industry in the development of accurate facial models and automated animation. The research has many real-world applications which include the provision of a highly accurate and 'natural' man-machine interface to assist user interactions with computer systems and communication with one another using human idiosyncrasies; a complete special effects and animation toolbox providing automatic lip synchronization without the normal constraints of head-sets, joysticks, and skilled animators; compression of video data to well below standard telecommunication channel bandwidth for video communications and multi-media systems; assisting speech training and aids for the handicapped; and facilitating player interaction for 'video gaming' and 'virtual worlds.' MARTI has introduced a new level of realism to man-machine interfacing and special effect animation which has been previously unseen.
Integrated Short Range, Low Bandwidth, Wearable Communications Networking Technologies
2012-04-30
Only (FOUO) Table of Contents Introduction 7 Research Discussions 7 1 Specifications 8 2 SAN Radio 9 2.1 R.F. Design Improvements 9 2.1.1 LNA...Characterization and Verification Testing 26 2.2 Digital Design Improvements 26 2.2.1 Improve Processor Access to Memory Resources 26 2.2.2...integrated and tested . A hybrid architecture of the automatic gain control (AGC) was designed to Page 7 of 116 For Official Use Only (FOUO
Measured performance of the GTA rf systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denney, P.M.; Jachim, S.P.
1993-06-01
This paper describes the performance of the RF systems on the Ground Test Accelerator (GTA). The RF system architecture is briefly described. Among the RF performance results presented are RF field flatness and stability, amplitude and phase control resolution, and control system bandwidth and stability. The rejection by the RF systems of beam-induced disturbances, such as transients and noise, is analyzed. The observed responses are also compared to computer-based simulations of the RF systems for validation.
Measured performance of the GTA rf systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Denney, P.M.; Jachim, S.P.
1993-01-01
This paper describes the performance of the RF systems on the Ground Test Accelerator (GTA). The RF system architecture is briefly described. Among the RF performance results presented are RF field flatness and stability, amplitude and phase control resolution, and control system bandwidth and stability. The rejection by the RF systems of beam-induced disturbances, such as transients and noise, is analyzed. The observed responses are also compared to computer-based simulations of the RF systems for validation.
Dense wavelength division multiplexing devices for metropolitan-area datacom and telecom networks
NASA Astrophysics Data System (ADS)
DeCusatis, Casimer M.; Priest, David G.
2000-12-01
Large data processing environments in use today can require multi-gigabyte or terabyte capacity in the data communication infrastructure; these requirements are being driven by storage area networks with access to petabyte databases, new architectures for parallel processing that require high-bandwidth optical links, and rapidly growing network applications such as electronic commerce over the Internet or virtual private networks. These datacom applications require high availability, fault tolerance, security, and the capacity to recover from any single point of failure without relying on traditional SONET-based networking. These requirements, coupled with fiber exhaust in metropolitan areas, are driving the introduction of dense optical wavelength division multiplexing (DWDM) in data communication systems, particularly for large enterprise servers or mainframes. In this paper, we examine the technical requirements for emerging next-generation DWDM systems. Protocols for storage area networks and computer architectures such as Parallel Sysplex are presented, including their fiber bandwidth requirements. We then describe two commercially available DWDM solutions, a first-generation 10-channel system and a recently announced next-generation 32-channel system. Technical requirements, network management and security, fault-tolerant network designs, new network topologies enabled by DWDM, and the role of time division multiplexing in the network are all discussed. Finally, we present a description of testing conducted on these networks and future directions for this technology.
Digital Front End for Wide-Band VLBI Science Receiver
NASA Technical Reports Server (NTRS)
Jongeling, Andre; Sigman, Elliott; Navarro, Robert; Goodhart, Charles; Rogstad, Steve; Chandra, Kumar; Finley, Sue; Trinh, Joseph; Soriano, Melissa; White, Les;
2006-01-01
An upgrade to the very-long-baseline-interferometry (VLBI) science receiver (VSR), a radio receiver used in NASA's Deep Space Network (DSN), is currently being implemented. The current VSR samples standard DSN intermediate-frequency (IF) signals at 256 MHz and, after digital down-conversion, records data from up to four 16-MHz baseband channels. Currently, IF signals are limited to the 265-to-375-MHz range, and recording rates are limited to less than 80 Mbps. The new digital front end, denoted the Wideband VSR, provides improvements to enable the receiver to process wider bandwidth signals and accommodate more data channels for recording. The Wideband VSR utilizes state-of-the-art commercial analog-to-digital converter and field-programmable gate array (FPGA) integrated circuits, and fiber-optic connections in a custom architecture. It accepts IF signals from 100 to 600 MHz, sampling the signal at 1.28 GHz. The sample data are sent to a digital processing module, using a fiber-optic link for isolation. The digital processing module includes boards designed around an Advanced Telecom Computing Architecture (ATCA) industry-standard backplane. Digital signal processing implemented in FPGAs down-converts the data signals into up to 16 baseband channels with programmable bandwidths from 1 kHz to 16 MHz. Baseband samples are transmitted to a computer via multiple Ethernet connections, allowing recording to disk at rates of up to 1 Gbps.
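The digital down-conversion step mentioned in this abstract can be illustrated with a minimal sketch. This is not the Wideband VSR firmware: the boxcar averaging filter, the function name down_convert, and the 320 MHz test tone are assumptions chosen for illustration, and only the 1.28 GHz sampling rate and the 100-to-600-MHz IF range come from the abstract. A real FPGA design would more likely use polyphase or FIR decimation filters.

```python
import cmath
import math


def down_convert(samples, fs_hz, f_center_hz, decimation):
    """Mix real samples to complex baseband, then boxcar-average and decimate.

    The boxcar average is a deliberately crude low-pass filter (an assumption
    for this sketch), used only to keep the example short.
    """
    out = []
    for start in range(0, len(samples) - decimation + 1, decimation):
        acc = 0j
        for n in range(start, start + decimation):
            # Complex local oscillator: shifts f_center_hz down to 0 Hz.
            lo = cmath.exp(-2j * math.pi * f_center_hz * n / fs_hz)
            acc += samples[n] * lo
        out.append(acc / decimation)
    return out


FS = 1.28e9      # sampling rate quoted in the abstract
F_TONE = 320e6   # hypothetical test tone inside the 100-600 MHz IF band

# A real cosine at F_TONE; after mixing, half its amplitude lands at DC.
tone = [math.cos(2 * math.pi * F_TONE * n / FS) for n in range(64)]
baseband = down_convert(tone, FS, F_TONE, decimation=8)
```

With these particular values the image term at twice the tone frequency averages out exactly over each block of eight samples, so every baseband sample is the constant half-amplitude phasor. Other frequencies would need a better filter to suppress the image.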
Brain Network Architecture and Global Intelligence in Children with Focal Epilepsy.
Paldino, M J; Golriz, F; Chapieski, M L; Zhang, W; Chu, Z D
2017-02-01
The biologic basis for intelligence rests to a large degree on the capacity for efficient integration of information across the cerebral network. We aimed to measure the relationship between network architecture and intelligence in the pediatric, epileptic brain. Patients were retrospectively identified with the following: 1) focal epilepsy; 2) brain MR imaging at 3T, including resting-state functional MR imaging; and 3) full-scale intelligence quotient measured by a pediatric neuropsychologist. The cerebral cortex was parcellated into approximately 700 gray matter network "nodes." The strength of a connection between 2 nodes was defined by the correlation between their blood oxygen level-dependent time-series. We calculated the following topologic properties: clustering coefficient, transitivity, modularity, path length, and global efficiency. A machine learning algorithm was used to measure the independent contribution of each metric to the intelligence quotient after adjusting for all other metrics. Thirty patients met the criteria (4-18 years of age); 20 patients required anesthesia during MR imaging. After we accounted for age and sex, clustering coefficient and path length were independently associated with full-scale intelligence quotient. Neither motion parameters nor general anesthesia was an important variable with regard to accurate intelligence quotient prediction by the machine learning algorithm. A longer history of epilepsy was associated with shorter path lengths (P = .008), consistent with reorganization of the network on the basis of seizures. Considering only patients receiving anesthesia during machine learning did not alter the patterns of network architecture contributing to global intelligence. These findings support the physiologic relevance of imaging-based metrics of network architecture in the pathologic, developing brain. © 2017 by American Journal of Neuroradiology.
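Two of the topologic properties named in this abstract, clustering coefficient and global efficiency, can be sketched on a toy network. The sketch makes simplifying assumptions that are not taken from the study: the correlation matrix is invented, connections are binarized at a hypothetical cutoff of 0.5 (the study defines connection strength by correlation and may use weighted metrics), and the function names are illustrative.

```python
from collections import deque


def threshold_graph(corr, cutoff):
    """Build an undirected binary graph: link i-j when corr[i][j] >= cutoff."""
    n = len(corr)
    return {i: {j for j in range(n) if j != i and corr[i][j] >= cutoff}
            for i in range(n)}


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(u, v) over ordered node pairs; unreachable pairs add 0."""
    n = len(adj)
    total = 0.0
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


# Invented 4-node correlation matrix; edges at cutoff 0.5: 0-1, 0-2, 1-2, 2-3.
corr = [
    [1.0, 0.8, 0.6, 0.1],
    [0.8, 1.0, 0.7, 0.2],
    [0.6, 0.7, 1.0, 0.9],
    [0.1, 0.2, 0.9, 1.0],
]
graph = threshold_graph(corr, cutoff=0.5)
efficiency = global_efficiency(graph)
```

In this toy graph, nodes 0, 1, and 2 form a triangle and node 3 hangs off node 2, so node 0 has a clustering coefficient of 1 while node 2, whose neighbor 3 links to neither 0 nor 1, has 1/3.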
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Song, Shuaiwen; Fu, Haohuan
2014-08-16
Support Vector Machine (SVM) has been widely used in data-mining and Big Data applications as modern commercial databases start to attach an increasing importance to the analytic capabilities. In recent years, SVM was adapted to the field of High Performance Computing for power/performance prediction, auto-tuning, and runtime scheduling. However, even at the risk of losing prediction accuracy due to insufficient runtime information, researchers can only afford to apply offline model training to avoid significant runtime training overhead. To address the challenges above, we designed and implemented MICSVM, a highly efficient parallel SVM for x86-based multi-core and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi coprocessor (MIC).
A proposal of an architecture for the coordination level of intelligent machines
NASA Technical Reports Server (NTRS)
Beard, Randall; Farah, Jeff; Lima, Pedro
1993-01-01
The issue of obtaining a practical, structured, and detailed description of an architecture for the Coordination Level of the Center for Intelligent Robotic Systems for Space Exploration (CIRSSE) Testbed Intelligent Controller is addressed. Previous theoretical and implementation work was the departure point for the discussion. The document is organized as follows: after this introductory section, section 2 summarizes the overall view of the Intelligent Machine (IM) as a control system, proposing a performance measure on which to base its design. Section 3 addresses implementation issues in some detail. A hierarchical Petri net with feedback-based learning capabilities is proposed. Finally, section 4 is an attempt to address the feedback problem. Feedback is used for two functions: error recovery and reinforcement learning of the correct translations for the Petri net transitions.
Proceedings of the NASA Conference on Space Telerobotics, volume 2
NASA Technical Reports Server (NTRS)
Rodriguez, Guillermo (Editor); Seraji, Homayoun (Editor)
1989-01-01
These proceedings contain papers presented at the NASA Conference on Space Telerobotics held in Pasadena, January 31 to February 2, 1989. The theme of the Conference was man-machine collaboration in space. The Conference provided a forum for researchers and engineers to exchange ideas on the research and development required for application of telerobotics technology to the space systems planned for the 1990s and beyond. The Conference: (1) provided a view of current NASA telerobotic research and development; (2) stimulated technical exchange on man-machine systems, manipulator control, machine sensing, machine intelligence, concurrent computation, and system architectures; and (3) identified important unsolved problems of current interest which can be dealt with by future research.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational codes was developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of an earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This opportunity arises primarily from architectural differences between the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Mathematical defense method of networked servers with controlled remote backups
NASA Astrophysics Data System (ADS)
Kim, Song-Kyoo
2006-05-01
The networked server defense model focuses on reliability and availability from a security perspective. The (remote) backup servers are connected by a VPN (Virtual Private Network) over a high-speed optical network and replace broken main servers immediately. The networked servers can be represented as "machines," and the system then deals with main unreliable, spare, and auxiliary spare machines. During vacation periods, when the system performs mandatory routine maintenance, auxiliary machines are used for backups; the information on the system is naturally delayed. An analog of the N-policy is used to restrict the usage of auxiliary machines to some reasonable quantity. The results are demonstrated in the network architecture using stochastic optimization techniques.
NASA Astrophysics Data System (ADS)
McGrath, Carl J.
1994-11-01
Continued evolution of consumer broadband services such as digital video and digital multimedia has placed renewed emphasis on the need for network solutions to the broadband connectivity challenge. Although still important to architectural planners, connection oriented broadband services based on ISDN concepts must now compete with a wider array of broadcast and highly asymmetrical services for bandwidth on the network. For network operators, the business imperative is to identify and execute a network rebuild plan that will meet the capacity and flexibility needs of these services and compete with the inevitable alternate paths into the home. This paper focuses on some of the key issues facing broadband network planners as they search for the best architecture to meet the business and operations goals in their segment of the market. It will be apparent that no single optimum solution exists for all deployment scenarios, emphasizing the need for flexible and modular sources (such as servers) and network interfaces (such as set tops) which preserve the value of content, the ultimate driver in this round of network revolution.
Lewis, Richard L; Shvartsman, Michael; Singh, Satinder
2013-07-01
We explore the idea that eye-movement strategies in reading are precisely adapted to the joint constraints of task structure, task payoff, and processing architecture. We present a model of saccadic control that separates a parametric control policy space from a parametric machine architecture, the latter based on a small set of assumptions derived from research on eye movements in reading (Engbert, Nuthmann, Richter, & Kliegl, 2005; Reichle, Warren, & McConnell, 2009). The eye-control model is embedded in a decision architecture (a machine and policy space) that is capable of performing a simple linguistic task integrating information across saccades. Model predictions are derived by jointly optimizing the control of eye movements and task decisions under payoffs that quantitatively express different desired speed-accuracy trade-offs. The model yields distinct eye-movement predictions for the same task under different payoffs, including single-fixation durations, frequency effects, accuracy effects, and list position effects, and their modulation by task payoff. The predictions are compared to, and found to accord with, eye-movement data obtained from human participants performing the same task under the same payoffs, but they are found not to accord as well when the assumptions concerning payoff optimization and processing architecture are varied. These results extend work on rational analysis of oculomotor control and adaptation of reading strategy (Bicknell & Levy, ; McConkie, Rayner, & Wilson, 1973; Norris, 2009; Wotschack, 2009) by providing evidence for adaptation at low levels of saccadic control that is shaped by quantitatively varying task demands and the dynamics of processing architecture. Copyright © 2013 Cognitive Science Society, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mou, J.I.; King, C.
The focus of this study is to develop a sensor-fused process modeling and control methodology to model, assess, and then enhance the performance of a hexapod machine for precision product realization. A deterministic modeling technique was used to derive models for machine performance assessment and enhancement. Sensor fusion methodology was adopted to identify the parameters of the derived models. Empirical models and computational algorithms were also derived and implemented to model, assess, and then enhance the machine performance. The developed sensor fusion algorithms can be implemented on a PC-based open architecture controller to receive information from various sensors, assess the status of the process, determine the proper action, and deliver the command to actuators for task execution. This will enhance a hexapod machine's capability to produce workpieces within the imposed dimensional tolerances.
Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms.
Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel
2014-01-01
With more and more workflow systems adopting the cloud as their execution environment, it becomes increasingly challenging to efficiently manage various workflows, virtual machines (VMs), and workflow execution on VM instances. To make the system scalable and easy to extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond to continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost, and the price/performance ratio via experimental studies.
Rebooting Computers as Learning Machines
DeBenedictis, Erik P.
2016-06-13
Artificial neural networks could become the technological driver that replaces Moore's law, boosting computers' utility through a process akin to automatic programming--although physics and computer architecture would also be factors.
Rebooting Computers as Learning Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik P.
Artificial neural networks could become the technological driver that replaces Moore's law, boosting computers' utility through a process akin to automatic programming--although physics and computer architecture would also be factors.
An Analysis of Hardware-Assisted Virtual Machine Based Rootkits
2014-06-01
certain aspects of TPM implementation just to name a few. HyperWall is an architecture proposed by Szefer and Lee to protect guest VMs from...DISTRIBUTION CODE 13. ABSTRACT (maximum 200 words) The use of virtual machine (VM) technology has expanded rapidly since AMD and Intel implemented ...Intel VT-x implementations of Blue Pill to identify commonalities in the respective versions’ attack methodologies from both a functional and technical
NASA Astrophysics Data System (ADS)
Land, Walker H., Jr.; Lewis, Michael; Sadik, Omowunmi; Wong, Lut; Wanekaya, Adam; Gonzalez, Richard J.; Balan, Arun
2004-04-01
This paper extends the classification approaches described in reference [1] in the following ways: (1) developing and evaluating a new method for evolving organophosphate nerve agent Support Vector Machine (SVM) classifiers using Evolutionary Programming (EP), (2) conducting research experiments using a larger database of organophosphate nerve agents, and (3) upgrading the architecture to an object-based grid system for evaluating the classification of EP-derived SVMs. Due to the increased threats of chemical and biological weapons of mass destruction (WMD) by international terrorist organizations, a significant effort is underway to develop tools that can be used to detect and effectively combat biochemical warfare. This paper reports the integration of multi-array sensors with Support Vector Machines (SVMs) for the detection of organophosphate nerve agents using a grid computing system called Legion. Grid computing is the use of large collections of heterogeneous, distributed resources (including machines, databases, devices, and users) to support large-scale computations and wide-area data access. Finally, preliminary results using EP-derived support vector machines designed to operate on distributed systems have provided accurate classification results. In addition, distributed training time architectures are 50 times faster when compared to standard iterative training time methods.
Radioastronomic signal processing cores for the SKA radio telescope
NASA Astrophysics Data System (ADS)
Comorett, G.; Chiarucc, S.; Belli, C.
Modern radio telescopes require the processing of wideband signals, with sample rates from tens of MHz to tens of GHz, and are composed of hundreds to as many as a million individual antennas. Digital signal processing of these signals includes digital receivers (the digital equivalent of the heterodyne receiver), beamformers, channelizers, and spectrometers. FPGAs offer relatively low power consumption compared with GPUs or dedicated computers, a wide signal data path, and high interconnectivity. Efficient algorithms have been developed for these applications. Here we will review some of the signal processing cores developed for the SKA telescope. The LFAA beamformer/channelizer architecture is based on an oversampling channelizer, where the channelizer output sampling rate and channel spacing can be set independently. This is useful where an overlap between adjacent channels is required to provide a uniform spectral coverage. The architecture allows for an efficient and distributed channelization scheme, with a final resolution corresponding to a million spectral channels, minimum leakage, and high out-of-band rejection. An optimized filter design procedure is used to provide an equiripple response with a very large number of spectral channels. A wideband digital receiver has been designed in order to select the processed bandwidth of the SKA Mid receiver. The receiver extracts a 2.5 MHz bandwidth from a 14 GHz input bandwidth. The design allows for non-integer ratios between the input and output sampling rates, with a resource usage comparable to that of a conventional decimating digital receiver. Finally, some considerations on quantization of radioastronomic signals are presented. Due to the stochastic nature of the signal, quantization using few data bits is possible. Good accuracies and dynamic range are possible even with 2-3 bits, but the nonlinearity in the correlation process must be corrected in post-processing. With at least 6 bits it is possible to have a very linear response of the instrument, with nonlinear terms below 80 dB, provided the signal amplitude is kept within bounds.
A direct-to-drive neural data acquisition system.
Kinney, Justin P; Bernstein, Jacob G; Meyer, Andrew J; Barber, Jessica B; Bolivar, Marti; Newbold, Bryan; Scholvin, Jorg; Moore-Kochlacs, Caroline; Wentz, Christian T; Kopell, Nancy J; Boyden, Edward S
2015-01-01
Driven by the increasing channel count of neural probes, there is much effort being directed to creating increasingly scalable electrophysiology data acquisition (DAQ) systems. However, all such systems still rely on personal computers for data storage, and thus are limited by the bandwidth and cost of the computers, especially as the scale of recording increases. Here we present a novel architecture in which a digital processor receives data from an analog-to-digital converter, and writes that data directly to hard drives, without the need for a personal computer to serve as an intermediary in the DAQ process. This minimalist architecture may support exceptionally high data throughput, without incurring costs to support unnecessary hardware and overhead associated with personal computers, thus facilitating scaling of electrophysiological recording in the future.
A direct-to-drive neural data acquisition system
Kinney, Justin P.; Bernstein, Jacob G.; Meyer, Andrew J.; Barber, Jessica B.; Bolivar, Marti; Newbold, Bryan; Scholvin, Jorg; Moore-Kochlacs, Caroline; Wentz, Christian T.; Kopell, Nancy J.; Boyden, Edward S.
2015-01-01
Driven by the increasing channel count of neural probes, there is much effort being directed to creating increasingly scalable electrophysiology data acquisition (DAQ) systems. However, all such systems still rely on personal computers for data storage, and thus are limited by the bandwidth and cost of the computers, especially as the scale of recording increases. Here we present a novel architecture in which a digital processor receives data from an analog-to-digital converter, and writes that data directly to hard drives, without the need for a personal computer to serve as an intermediary in the DAQ process. This minimalist architecture may support exceptionally high data throughput, without incurring costs to support unnecessary hardware and overhead associated with personal computers, thus facilitating scaling of electrophysiological recording in the future. PMID:26388740
GPU and APU computations of Finite Time Lyapunov Exponent fields
NASA Astrophysics Data System (ADS)
Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros
2012-03-01
We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.
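For readers unfamiliar with the quantity being computed, the standard textbook definition of the FTLE is sketched below. The notation is an assumption introduced here and does not appear in the abstract: \Phi is the flow map that advects a particle from position x at time t_0 over an integration time T, and \lambda_{\max} denotes the largest eigenvalue.

```latex
% Flow map: position at time t_0 + T of the particle that started at x at time t_0
\Phi_{t_0}^{t_0+T}(\mathbf{x})

% Right Cauchy-Green deformation tensor built from the flow-map gradient
\mathbf{C}(\mathbf{x}) =
  \left(\nabla \Phi_{t_0}^{t_0+T}(\mathbf{x})\right)^{\!\top}
  \nabla \Phi_{t_0}^{t_0+T}(\mathbf{x})

% Finite-Time Lyapunov Exponent
\sigma_{t_0}^{T}(\mathbf{x}) =
  \frac{1}{|T|} \ln \sqrt{\lambda_{\max}\!\left(\mathbf{C}(\mathbf{x})\right)}
```

Evaluating the flow-map gradient requires advecting particles seeded around every grid point, which is the resampling step the abstract identifies as limited by memory bandwidth.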
High-speed fiber-optic links for distribution of satellite traffic
NASA Technical Reports Server (NTRS)
Daryoush, Afshin S.; Saedi, Reza; Ackerman, Edward; Kunath, Richard; Shalkhauser, Kurt
1990-01-01
Low-loss fiberoptic links are designed for distribution of data and the frequency reference in large-aperture phased-array antennas based on the transmit/receive-level data mixing architecture. In particular, design aspects of a fiberoptic link satisfying the distribution requirements of satellite data traffic are presented. The design is addressed in terms of reactively matched optical transmitter and receiver modules. Analog and digital characterization of a 50-m fiberoptic link realized using these modules indicates the applicability of this architecture as the only viable alternative for distribution of data signals inside a satellite at present. It is demonstrated that the design of reactive matching modules enhances the link performance. A dynamic range of 88 dB/MHz was measured for analog data over a 500-1000-MHz bandwidth.
Scalability analysis methodology for passive optical interconnects in data center networks using PAM
NASA Astrophysics Data System (ADS)
Lin, R.; Szczerba, Krzysztof; Agrell, Erik; Wosinska, Lena; Tang, M.; Liu, D.; Chen, J.
2017-11-01
A framework is developed for modeling the fundamental impairments in optical datacenter interconnects, i.e., the power loss and the receiver noises. This framework makes it possible to analyze the trade-offs between data rates, modulation order, and number of ports that can be supported in optical interconnect architectures, while guaranteeing that the required signal-to-noise ratios are satisfied. To the best of our knowledge, this important assessment methodology is not yet available. As a case study, the trade-offs are investigated for three coupler-based top-of-rack interconnect architectures, which suffer from serious insertion loss. The results show that using single-port transceivers with 10 GHz bandwidth, avalanche photodiode detectors, and quadratical pulse amplitude modulation, more than 500 ports can be supported.
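The link between insertion loss and port count can be illustrated with a back-of-the-envelope power budget. This is only a sketch, not the paper's framework: it ignores receiver noise and modulation order entirely, assumes an ideal 1:N passive coupler whose splitting loss is 10*log10(N) dB, and uses invented transmit power, receiver sensitivity, and excess loss figures.

```python
import math


def splitting_loss_db(ports):
    """Ideal power-splitting loss of a 1:N passive coupler, in dB."""
    return 10.0 * math.log10(ports)


def max_ports(tx_power_dbm, rx_sensitivity_dbm, excess_loss_db=0.0):
    """Largest N whose ideal splitting loss still fits within the power budget.

    excess_loss_db lumps together connector, fiber, and coupler excess losses
    (a simplification made for this sketch).
    """
    budget_db = tx_power_dbm - rx_sensitivity_dbm - excess_loss_db
    if budget_db < 0:
        return 0
    return math.floor(10 ** (budget_db / 10.0))


# Hypothetical figures: 3 dBm launch power, -20 dBm sensitivity, 2 dB excess loss.
supported = max_ports(3.0, -20.0, excess_loss_db=2.0)
```

Each doubling of the port count costs about 3 dB, so a receiver that is more sensitive by 3 dB roughly doubles the supportable port count under these assumptions.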
Sensing and perception: Connectionist approaches to subcognitive computing
NASA Technical Reports Server (NTRS)
Skrrypek, J.
1987-01-01
New approaches to machine sensing and perception are presented. The motivation for crossdisciplinary studies of perception in terms of AI and neurosciences is suggested. The question of computing architecture granularity as related to global/local computation underlying perceptual function is considered and examples of two environments are given. Finally, the examples of using one of the environments, UCLA PUNNS, to study neural architectures for visual function are presented.
ISA-97 Compliant Architecture Testbed (ICAT) Projectry Organizations
1992-03-30
by the System Integration Directorate of the USAISEC, August 29, 1992. The report discusses the refinement of the ISA-97 Compliant Architecture Model...browser and iconic representations of system objects and resources. When the user is interacting with an application which has multiple components, it is...computer communications, it is not uncommon for large information systems to be shared by users on multiple machines. The trend towards the desktop
The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor
1991-06-01
Symposium on Compiler Construction, June 1986. [14] Daniel Gajski, David Kuck, Duncan Lawrie, and Ahmed Saleh. Cedar - A Large Scale Multiprocessor. In...Directory Methods. In Proceedings 17th Annual International Symposium on Computer Architecture, June 1990. [31] G. M. Papadopoulos and D. E. Culler...Monsoon: An Explicit Token-Store Architecture. In Proceedings 17th Annual International Symposium on Computer Architecture, June 1990. [32] G. F
GREAT: a web portal for Genome Regulatory Architecture Tools
Bouyioukos, Costas; Bucchini, François; Elati, Mohamed; Képès, François
2016-01-01
GREAT (Genome REgulatory Architecture Tools) is a novel web portal for tools designed to generate user-friendly and biologically useful analysis of genome architecture and regulation. The online tools of GREAT are freely accessible and compatible with essentially any operating system which runs a modern browser. GREAT is based on the analysis of genome layout (defined as the respective positioning of co-functional genes) and its relation with chromosome architecture and gene expression. GREAT tools allow users to systematically detect regular patterns along co-functional genomic features in an automatic way consisting of three individual steps and respective interactive visualizations. In addition to the complete analysis of regularities, GREAT tools enable the use of periodicity and position information for improving the prediction of transcription factor binding sites using a multi-view machine learning approach. The outcome of this integrative approach features a multivariate analysis of the interplay between the location of a gene and its regulatory sequence. GREAT results are plotted in web interactive graphs and are available for download either as individual plots, self-contained interactive pages, or as machine-readable tables for downstream analysis. The GREAT portal can be reached at the following URL https://absynth.issb.genopole.fr/GREAT and each individual GREAT tool is available for downloading. PMID:27151196
Cognitive Architectures and Autonomy: A Comparative Review
NASA Astrophysics Data System (ADS)
Thórisson, Kristinn; Helgasson, Helgi
2012-05-01
One of the original goals of artificial intelligence (AI) research was to create machines with very general cognitive capabilities and a relatively high level of autonomy. It has taken the field longer than many had expected to achieve even a fraction of this goal; the community has focused on building specific, targeted cognitive processes in isolation, and as of yet no system exists that integrates a broad range of capabilities or presents a general solution to autonomous acquisition of a large set of skills. Among the reasons for this are the highly limited machine learning and adaptation techniques available, and the inherent complexity of integrating numerous cognitive and learning capabilities in a coherent architecture. In this paper we review selected systems and architectures built expressly to address integrated skills. We highlight principles and features of these systems that seem promising for creating generally intelligent systems with some level of autonomy, and discuss them in the context of the development of future cognitive architectures. Autonomy is a key property for any system to be considered generally intelligent, in our view; we use this concept as an organizing principle for comparing the reviewed systems. Features that remain largely unaddressed in present research, but seem nevertheless necessary for such efforts to succeed, are also discussed.
2001-09-01
testing is performed between two machines connected by either a 100 Mbps Ethernet connection or a 56K modem connection. This testing is performed...and defined as follows: • The available bandwidth is set at two different levels (Ethernet 100 Mbps and 56K modem). • The packet size is set...modem connection. These two connections represent the target 100 Mbps high end and 56 kbps low end of anticipated client connections in web-based
Light-operated machines based on threaded molecular structures.
Credi, Alberto; Silvi, Serena; Venturi, Margherita
2014-01-01
Rotaxanes and related species represent the most common implementation of the concept of artificial molecular machines, because the supramolecular nature of the interactions between the components and their interlocked architecture allow precise control over the position and movement of the molecular units. The use of light to power artificial molecular machines is particularly valuable because it can play the dual role of "writing" and "reading" the system. Moreover, light-driven machines can operate without accumulation of waste products, and photons are the ideal inputs to enable autonomous operation mechanisms. In appropriately designed molecular machines, light can be used to control not only the stability of the system, which affects the relative position of the molecular components, but also the kinetics of the mechanical processes, thereby enabling control over the direction of the movements. This step forward is necessary in order to make a leap from molecular machines to molecular motors.
Approach to design neural cryptography: a generalized architecture and a heuristic rule.
Mu, Nankun; Liao, Xiaofeng; Huang, Tingwen
2013-06-01
Neural cryptography, a type of public key exchange protocol, is widely considered as an effective method for sharing a common secret key between two neural networks on public channels. How to design neural cryptography remains a great challenge. In this paper, in order to provide an approach to solve this challenge, a generalized network architecture and a significant heuristic rule are designed. The proposed generic framework is named as tree state classification machine (TSCM), which extends and unifies the existing structures, i.e., tree parity machine (TPM) and tree committee machine (TCM). Furthermore, we carefully study and find that the heuristic rule can improve the security of TSCM-based neural cryptography. Therefore, TSCM and the heuristic rule can guide us to designing a great deal of effective neural cryptography candidates, in which it is possible to achieve the more secure instances. Significantly, in the light of TSCM and the heuristic rule, we further expound that our designed neural cryptography outperforms TPM (the most secure model at present) on security. Finally, a series of numerical simulation experiments are provided to verify validity and applicability of our results.
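For readers unfamiliar with the tree parity machine (TPM) that the abstract uses as its baseline, the following is a minimal sketch of the commonly described TPM and its Hebbian synchronisation step. It does not implement TSCM or the paper's heuristic rule. The parameter names K, N, L, the sign(0) convention, and all class and function names are illustrative assumptions, not details taken from the abstract.

```python
import random


def sign(value):
    # Assumed convention: a zero local field maps to -1.
    return 1 if value > 0 else -1


class TreeParityMachine:
    """K hidden units, N inputs per unit, integer weights in [-L, L]."""

    def __init__(self, k, n, l, rng):
        self.k, self.n, self.l = k, n, l
        self.weights = [[rng.randint(-l, l) for _ in range(n)] for _ in range(k)]
        self.sigma = [1] * k
        self.tau = 1

    def output(self, x):
        # Each hidden unit takes the sign of its local field;
        # the machine output tau is the product of the hidden outputs.
        self.sigma = [
            sign(sum(w * xi for w, xi in zip(row, inputs)))
            for row, inputs in zip(self.weights, x)
        ]
        self.tau = 1
        for s in self.sigma:
            self.tau *= s
        return self.tau

    def hebbian_update(self, x, tau_other):
        # Update only when both parties agree, and only the hidden
        # units whose output matches the common output.
        if self.tau != tau_other:
            return
        for k in range(self.k):
            if self.sigma[k] != self.tau:
                continue
            for j in range(self.n):
                new_w = self.weights[k][j] + x[k][j] * self.tau
                self.weights[k][j] = max(-self.l, min(self.l, new_w))


def synchronise(k=3, n=10, l=3, seed=0, max_steps=100000):
    """Run two TPMs on shared random inputs until their weights match."""
    rng = random.Random(seed)
    a = TreeParityMachine(k, n, l, rng)
    b = TreeParityMachine(k, n, l, rng)
    for step in range(1, max_steps + 1):
        x = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(k)]
        tau_a, tau_b = a.output(x), b.output(x)
        a.hebbian_update(x, tau_b)
        b.hebbian_update(x, tau_a)
        if a.weights == b.weights:
            return step, a.weights, b.weights
    return None, a.weights, b.weights


if __name__ == "__main__":
    steps, w_a, w_b = synchronise()
    print("synchronised after", steps, "steps" if steps else "(not within limit)")
```

Only the outputs tau are exchanged between the two parties; the shared weights, once equal, would serve as the key material.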
Using deep learning for content-based medical image retrieval
NASA Astrophysics Data System (ADS)
Sun, Qinpei; Yang, Yuanyuan; Sun, Jianyong; Yang, Zhiming; Zhang, Jianguo
2017-03-01
Content-based medical image retrieval (CBMIR) has been a highly active research area over the past few years. The retrieval performance of a CBMIR system depends crucially on the feature representation, which has been extensively studied by researchers for decades. Although a variety of techniques have been proposed, it remains one of the most challenging problems in current CBMIR research, mainly due to the well-known "semantic gap" between low-level image pixels captured by machines and high-level semantic concepts perceived by humans [1]. Recent years have witnessed some important advances in machine learning techniques. One important breakthrough is known as "deep learning". Unlike conventional machine learning methods, which often use "shallow" architectures, deep learning mimics the human brain, which is organized in a deep architecture and processes information through multiple stages of transformation and representation. This means that we do not need to spend enormous energy extracting features manually. In this presentation, we propose a novel framework that uses deep learning to retrieve medical images, improving the accuracy and speed of CBIR in an integrated RIS/PACS.
DeepX: Deep Learning Accelerator for Restricted Boltzmann Machine Artificial Neural Networks.
Kim, Lok-Won
2018-05-01
Although there have been many decades of research and commercial presence on high performance general purpose processors, there are still many applications that require fully customized hardware architectures for further computational acceleration. Recently, deep learning has been successfully used to learn in a wide variety of applications, but its heavy computational demand has considerably limited its practical use. This paper proposes a fully pipelined acceleration architecture to alleviate the high computational demand of restricted Boltzmann machine (RBM) artificial neural networks (ANNs). The implemented RBM ANN accelerator (integrating network size, using 128 input cases per batch, and running at a 303-MHz clock frequency) integrated in a state-of-the-art field-programmable gate array (FPGA) (Xilinx Virtex 7 XC7V-2000T) provides a computational performance of 301 billion connection-updates-per-second and about 193 times higher performance than a software solution running on general purpose processors. Most importantly, the architecture enables over 4 times (12 times in batch learning) higher performance compared with a previous work when both are implemented in an FPGA device (XC2VP70).
Designing a connectionist network supercomputer.
Asanović, K; Beck, J; Feldman, J; Morgan, N; Wawrzynek, J
1993-12-01
This paper describes an effort at UC Berkeley and the International Computer Science Institute to develop a supercomputer for artificial neural network applications. Our perspective has been strongly influenced by earlier experiences with the construction and use of a simpler machine. In particular, we have observed Amdahl's Law in action in our designs and those of others. These observations inspire attention to many factors beyond fast multiply-accumulate arithmetic. We describe a number of these factors along with rough expressions for their influence and then give the applications targets, machine goals and the system architecture for the machine we are currently designing.
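The reference to Amdahl's Law can be made concrete with a short sketch. It uses the standard form of the law; the function name and the example fractions are illustrative assumptions and are not figures from the paper.

```python
def amdahl_speedup(accelerated_fraction: float, unit_speedup: float) -> float:
    """Overall speedup when a fraction of the run time is accelerated.

    accelerated_fraction: share of the original run time that benefits
        (for example, the multiply-accumulate kernels).
    unit_speedup: how much faster that share becomes.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if unit_speedup <= 0.0:
        raise ValueError("unit_speedup must be positive")
    untouched = 1.0 - accelerated_fraction
    return 1.0 / (untouched + accelerated_fraction / unit_speedup)


if __name__ == "__main__":
    # Hypothetical workload: 90% of the time is multiply-accumulate work.
    for s in (2, 10, 100, 1000):
        print(f"arithmetic {s:>4}x faster -> overall {amdahl_speedup(0.9, s):.2f}x")
```

However fast the arithmetic unit becomes, the overall gain is bounded by the reciprocal of the untouched fraction, which is why factors other than multiply-accumulate speed matter.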
Layered Architectures for Quantum Computers and Quantum Repeaters
NASA Astrophysics Data System (ADS)
Jones, Nathan C.
This chapter examines how to organize quantum computers and repeaters using a systematic framework known as layered architecture, where machine control is organized in layers associated with specialized tasks. The framework is flexible and could be used for analysis and comparison of quantum information systems. To demonstrate the design principles in practice, we develop architectures for quantum computers and quantum repeaters based on optically controlled quantum dots, showing how a myriad of technologies must operate synchronously to achieve fault-tolerance. Optical control makes information processing in this system very fast, scalable to large problem sizes, and extendable to quantum communication.
Rio: a dynamic self-healing services architecture using Jini networking technology
NASA Astrophysics Data System (ADS)
Clarke, James B.
2002-06-01
Current mainstream distributed Java architectures offer great capabilities embracing conventional enterprise architecture patterns and designs. These traditional systems provide robust transaction oriented environments that are in large part focused on data and host processors. Typically, these implementations require that an entire application be deployed on every machine that will be used as a compute resource. In order for this to happen, the application is usually taken down, installed and started with all systems in-sync and knowing about each other. Static environments such as these present an extremely difficult environment to setup, deploy and administer.
The roofline model: A pedagogical tool for program analysis and optimization
Williams, Samuel; Patterson, David; Oliker, Leonid; ...
2008-08-01
This article consists of a collection of slides from the authors' conference presentation. The Roofline model is a visually intuitive figure for kernel analysis and optimization. We believe undergraduates will find it useful in assessing performance and scalability limitations. It is easily extended to other architectural paradigms and to other metrics: performance (sort, graphics, crypto, ...) and bandwidth (L2, PCIe, ...). Furthermore, performance counters could be used to generate a runtime-specific roofline that would greatly aid optimization.
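The basic roofline bound can be written in a few lines. This is a minimal sketch of the usual formulation (attainable performance is the lesser of peak compute and arithmetic intensity times peak memory bandwidth); the function names and the machine numbers in the usage example are hypothetical.

```python
def attainable_gflops(arithmetic_intensity: float,
                      peak_gflops: float,
                      peak_bandwidth_gbs: float) -> float:
    """Roofline bound for a kernel.

    arithmetic_intensity: floating-point operations per byte moved.
    peak_gflops: compute ceiling of the machine.
    peak_bandwidth_gbs: memory bandwidth ceiling in GB/s.
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops: float, peak_bandwidth_gbs: float) -> float:
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / peak_bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine: 100 GFLOP/s peak, 20 GB/s memory bandwidth.
    for ai in (0.25, 1.0, 5.0, 16.0):
        print(f"AI={ai:5.2f} flop/byte -> {attainable_gflops(ai, 100.0, 20.0):6.1f} GFLOP/s")
```

Kernels to the left of the ridge point are limited by memory bandwidth; kernels to the right are limited by compute.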
Shuttle Global Positioning System (GPS) system design study
NASA Technical Reports Server (NTRS)
Nilsen, P. W.
1979-01-01
The various integration problems in the Shuttle GPS system were investigated. The analysis of the Shuttle GPS link was studied. A preamplifier was designed since the Shuttle GPS antennas must be located remotely from the receiver. Several GPS receiver architecture trade-offs were discussed. The Shuttle RF harmonics and intermode that fall within the GPS receiver bandwidth were analyzed. The GPS PN code acquisition was examined. Since the receiver clock strongly affects both GPS carrier and code acquisition performance, a clock model was developed.
2015-09-01
Gateway 2 4. Voice Packet Flow: SIP , Session Description Protocol (SDP), and RTP 3 5. Voice Data Analysis 5 6. Call Analysis 6 7. Call Metrics 6...analysis processing is designed for a general VoIP system architecture based on Session Initiation Protocol ( SIP ) for negotiating call sessions and...employs Skinny Client Control Protocol for network communication between the phone and the local CallManager (e.g., for each dialed digit), SIP
Coarse-Grain Bandwidth Estimation Scheme for Large-Scale Network
NASA Technical Reports Server (NTRS)
Cheung, Kar-Ming; Jennings, Esther H.; Sergui, John S.
2013-01-01
A large-scale network that supports a large number of users can have an aggregate data rate of hundreds of Mbps at any time. High-fidelity simulation of a large-scale network might be too complicated and memory-intensive for typical commercial-off-the-shelf (COTS) tools. Unlike a large commercial wide-area-network (WAN) that shares diverse network resources among diverse users and has a complex topology that requires routing mechanisms and flow control, the ground communication links of a space network operate under the assumption of a guaranteed dedicated bandwidth allocation between specific sparse endpoints in a star-like topology. This work solved the network design problem of estimating the bandwidths of ground network architecture options that offer different service classes to meet the latency requirements of different user data types. In this work, a top-down analysis and simulation approach was created to size the bandwidths of a store-and-forward network for a given network topology, a mission traffic scenario, and a set of data types with different latency requirements. These techniques were used to estimate the WAN bandwidths of the ground links for different architecture options of the proposed Integrated Space Communication and Navigation (SCaN) Network. A new analytical approach, called the "leveling scheme," was developed to model the store-and-forward mechanism of the network data flow. The term "leveling" refers to the spreading of data across a longer time horizon without violating the corresponding latency requirement of the data type. Two versions of the leveling scheme were developed:
1. A straightforward version that simply spreads the data of each data type across the time horizon, does not take into account the interactions among data types within a pass or between data types across overlapping passes at a network node, and is inherently sub-optimal.
2. A two-state Markov leveling scheme that takes into account the second-order behavior of the store-and-forward mechanism and the interactions among data types within a pass.
The novelty of this approach lies in the modeling of the store-and-forward mechanism of each network node. The term store-and-forward refers to the data traffic regulation technique in which data is sent to an intermediate network node, where it is temporarily stored and sent at a later time to the destination node or to another intermediate node. Store-and-forward can be applied both to space-based networks that have intermittent connectivity and to ground-based networks with deterministic connectivity. For ground-based networks, the store-and-forward mechanism is used to regulate the network data flow and link resource utilization such that the user data types can be delivered to their destination nodes without violating their respective latency requirements.
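The straightforward version of the leveling scheme can be illustrated as follows. This is a minimal sketch under stated assumptions: each data item is spread uniformly over its own latency window and the link is sized to the peak of the summed rates. The function name, the tuple layout, and the example traffic are hypothetical, and the two-state Markov variant is not modeled.

```python
def leveled_peak_rate(items):
    """Peak link rate (bits/s) when each item is spread over its latency window.

    items: iterable of (arrival_s, volume_bits, latency_s) tuples. Each item
    is assumed to be sent at the constant rate volume / latency during
    [arrival, arrival + latency), which meets its latency requirement.
    """
    events = []
    for arrival_s, volume_bits, latency_s in items:
        if latency_s <= 0:
            raise ValueError("latency must be positive")
        rate = volume_bits / latency_s
        events.append((arrival_s, rate))               # transmission starts
        events.append((arrival_s + latency_s, -rate))  # transmission ends
    # Sort by time; at equal times apply rate decreases before increases.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0.0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    # Hypothetical traffic: (arrival in s, volume in bits, latency in s).
    traffic = [(0, 100, 10), (5, 60, 20)]
    print("required bandwidth (bits/s):", leveled_peak_rate(traffic))
```

Because each item is leveled independently, overlapping windows simply add, which is the source of the sub-optimality noted for this version.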
Yang, Hui; Zhang, Jie; Ji, Yuefeng; Tan, Yuanlong; Lin, Yi; Han, Jianrui; Lee, Young
2015-09-07
Data center interconnection with elastic optical networks is a promising scenario to meet the high burstiness and high-bandwidth requirements of data center services. In our previous work, we implemented cross stratum optimization of optical network and application stratum resources to accommodate data center services. Building on this, this study extends the data center resources to the user side to enhance the end-to-end quality of service. We propose a novel data center service localization (DCSL) architecture based on virtual resource migration in software defined elastic data center optical networks. A migration evaluation scheme (MES) is introduced for DCSL based on the proposed architecture. The DCSL can enhance the responsiveness to dynamic end-to-end data center demands, and effectively reduce the blocking probability to globally optimize optical network and application resources. The overall feasibility and efficiency of the proposed architecture are experimentally verified on the control plane of our OpenFlow-based enhanced SDN testbed. The performance of MES under a heavy traffic load scenario is also quantitatively evaluated based on the DCSL architecture in terms of path blocking probability, provisioning latency and resource utilization, compared with other provisioning schemes.
Analog Module Architecture for Space-Qualified Field-Programmable Mixed-Signal Arrays
NASA Technical Reports Server (NTRS)
Edwards, R. Timothy; Strohbehn, Kim; Jaskulek, Steven E.; Katz, Richard
1999-01-01
Spacecraft require all manner of both digital and analog circuits. Onboard digital systems are constructed almost exclusively from field-programmable gate array (FPGA) circuits providing numerous advantages over discrete design including high integration density, high reliability, fast turn-around design cycle time, lower mass, volume, and power consumption, and lower parts acquisition and flight qualification costs. Analog and mixed-signal circuits perform tasks ranging from housekeeping to signal conditioning and processing. These circuits are painstakingly designed and built using discrete components due to a lack of options for field-programmability. FPAA (Field-Programmable Analog Array) and FPMA (Field-Programmable Mixed-signal Array) parts exist but not in radiation-tolerant technology and not necessarily in an architecture optimal for the design of analog circuits for spaceflight applications. This paper outlines an architecture proposed for an FPAA fabricated in an existing commercial digital CMOS process used to make radiation-tolerant antifuse-based FPGA devices. The primary concerns are the impact of the technology and the overall array architecture on the flexibility of programming, the bandwidth available for high-speed analog circuits, and the accuracy of the components for high-performance applications.
NASA Astrophysics Data System (ADS)
Hanson, Jeffrey A.; McLaughlin, Keith L.; Sereno, Thomas J.
2011-06-01
We have developed a flexible, target-driven, multi-modal, physics-based fusion architecture that efficiently searches sensor detections for targets and rejects clutter while controlling the combinatoric problems that commonly arise in data-driven fusion systems. The informational constraints imposed by long lifetime requirements make systems vulnerable to false alarms. We demonstrate that our data fusion system significantly reduces false alarms while maintaining high sensitivity to threats. In addition, mission goals can vary substantially in terms of targets-of-interest, required characterization, acceptable latency, and false alarm rates. Our fusion architecture provides the flexibility to match these trade-offs with mission requirements, unlike many conventional systems that require significant modifications for each new mission. We illustrate our data fusion performance with case studies that span many of the potential mission scenarios including border surveillance, base security, and infrastructure protection. In these studies, we deployed multi-modal sensor nodes - including geophones, magnetometers, accelerometers and PIR sensors - with low-power processing algorithms and low-bandwidth wireless mesh networking to create networks capable of multi-year operation. The results show our data fusion architecture maintains high sensitivities while suppressing most false alarms for a variety of environments and targets.
Multitasking runtime systems for the Cedar Multiprocessor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guzzi, M.D.
1986-07-01
The programming of a MIMD machine is more complex than for SISD and SIMD machines. The multiple computational resources of the machine must be made available to the programming language compiler and to the programmer so that multitasking programs may be written. This thesis will explore the additional complexity of programming a MIMD machine, the Cedar Multiprocessor specifically, and the multitasking runtime system necessary to provide multitasking resources to the user. First, the problem will be well defined: the Cedar machine, its operating system, the programming language, and multitasking concepts will be described. Second, a solution to the problem, called macrotasking, will be proposed. This solution provides multitasking facilities to the programmer at a very coarse level with many visible machine dependencies. Third, an alternate solution, called microtasking, will be proposed. This solution provides multitasking facilities of a much finer grain. This solution does not depend so rigidly on the specific architecture of the machine. Finally, the two solutions will be compared for effectiveness. 12 refs., 16 figs.
Wide-bandwidth high-resolution search for extraterrestrial intelligence
NASA Technical Reports Server (NTRS)
Horowitz, Paul
1993-01-01
A third antenna was added to the system. It is a terrestrial low-gain feed, to act as a veto for local interference. The 3-chip design for a 4 megapoint complex FFT was reduced to finished working hardware. The 4-Megachannel circuit board contains 36 MByte of DRAM, 5 CPLDs, the three large FFT ASICs, and 74 ICs in all. The Austek FDP-based Spectrometer/Power Accumulator (SPA) has now been implemented as a 4-layer printed circuit. A PC interface board has been designed and together with its associated user interface and control software allows an IBM compatible computer to control the SPA board, and facilitates the transfer of spectra to the PC for display, processing, and storage. The Feature Recognizer Array cards receive the stream of modulus words from the 4M FFT cards, and forward a greatly thinned set of reports to the PC's in whose backplane they reside. In particular, a powerful ROM-based state-machine architecture has been adopted, and DRAM has been added to permit integration modes when tracking or reobserving source candidates. The general purpose (GP) array consists of twenty '486 PC class computers, each of which receives and processes the data from a feature extractor/correlator board set. The array performs a first analysis on the provided 'features' and then passes this information on to the workstation. The core workstation software is now written. That is, the communication channels between the user interface, the backend monitor program and the PC's have working software.
UniBoard: generic hardware for radio astronomy signal processing
NASA Astrophysics Data System (ADS)
Hargreaves, J. E.
2012-09-01
UniBoard is a generic high-performance computing platform for radio astronomy, developed as a Joint Research Activity in the RadioNet FP7 Programme. The hardware comprises eight Altera Stratix IV Field Programmable Gate Arrays (FPGAs) interconnected by a high speed transceiver mesh. Each FPGA is connected to two DDR3 memory modules and three external 10 Gbps ports. In addition, a total of 128 low voltage differential input lines permit connection to external ADC cards. The DSP capability of the board exceeds 644E9 complex multiply-accumulate operations per second. The first production run of eight boards was distributed to partners in The Netherlands, France, Italy, UK, China and Korea in May 2011, with further production runs completed in December 2011 and early 2012. The function of the board is determined by the firmware loaded into its FPGAs. Current applications include beamformers, correlators, digital receivers, RFI mitigation for pulsar astronomy, and pulsar gating and search machines. The new UniBoard-based correlator for the European VLBI network (EVN) uses an FX architecture with half the resources of the board devoted to station-based processing (delay and phase correction and channelization) and half to the correlation function. A single UniBoard can process a 64 MHz band from 32 stations, 2 polarizations, sampled at 8 bits. Adding more UniBoards can expand the total bandwidth of the correlator. The design is able to process both prerecorded and real time (eVLBI) data.
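The correlator figures quoted in the abstract (32 stations, 2 polarizations, 64 MHz band, 8-bit samples) imply a workload that can be estimated with simple arithmetic. The sketch below is a back-of-envelope illustration only: it assumes real-valued Nyquist sampling (two samples per hertz of bandwidth), counts cross-correlations without autocorrelations, and uses hypothetical function names. None of these assumptions is stated in the abstract.

```python
def cross_baselines(n_stations: int) -> int:
    """Number of distinct station pairs (autocorrelations excluded)."""
    return n_stations * (n_stations - 1) // 2


def polarization_products(n_stations: int, n_pol: int) -> int:
    """Cross-correlation products when every polarization pair is formed."""
    return cross_baselines(n_stations) * n_pol * n_pol


def station_input_rate_bps(bandwidth_hz: int, bits_per_sample: int,
                           n_pol: int, samples_per_hz: int = 2) -> int:
    """Raw input rate per station, assuming real Nyquist sampling."""
    return bandwidth_hz * samples_per_hz * bits_per_sample * n_pol


if __name__ == "__main__":
    stations, pols = 32, 2
    print("baselines:", cross_baselines(stations))
    print("polarization products:", polarization_products(stations, pols))
    rate = station_input_rate_bps(64_000_000, 8, pols)
    print("input rate per station (Gbps):", rate / 1e9)
```

The quadratic growth in baselines is what makes the correlation half of an FX correlator grow faster with station count than the station-based half.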
Plagianakos, V P; Magoulas, G D; Vrahatis, M N
2006-03-01
Distributed computing is a process through which a set of computers connected by a network is used collectively to solve a single problem. In this paper, we propose a distributed computing methodology for training neural networks for the detection of lesions in colonoscopy. Our approach is based on partitioning the training set across multiple processors using a parallel virtual machine. In this way, interconnected computers of varied architectures can be used for the distributed evaluation of the error function and gradient values, and, thus, training neural networks utilizing various learning methods. The proposed methodology has large granularity and low synchronization, and has been implemented and tested. Our results indicate that the parallel virtual machine implementation of the training algorithms developed leads to considerable speedup, especially when large network architectures and training sets are used.
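The partitioning idea described above, where each processor evaluates the error gradient on its own share of the training set and the partial results are combined, can be sketched as follows. This is a minimal illustration under assumptions: a thread pool stands in for the parallel virtual machine, a linear least-squares model stands in for the neural network, and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_gradient(weights, shard):
    """Gradient of the summed squared error of a linear model on one shard."""
    grad = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2.0 * err * xi
    return grad


def partition(data, n_workers):
    """Split the training set into n_workers interleaved shards."""
    return [data[i::n_workers] for i in range(n_workers)]


def distributed_gradient(weights, data, n_workers=4):
    """Evaluate shard gradients concurrently and sum them."""
    shards = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: partial_gradient(weights, s), shards))
    # The error function is a sum over patterns, so shard gradients add.
    return [sum(component) for component in zip(*partials)]


if __name__ == "__main__":
    data = [([1.0, float(i)], 3.0 * i + 1.0) for i in range(10)]
    w = [0.0, 0.0]
    print("single process:", partial_gradient(w, data))
    print("distributed   :", distributed_gradient(w, data, n_workers=3))
```

Workers only exchange one gradient vector per evaluation, which matches the large-granularity, low-synchronization character the abstract describes.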
Migrating EO/IR sensors to cloud-based infrastructure as service architectures
NASA Astrophysics Data System (ADS)
Berglie, Stephen T.; Webster, Steven; May, Christopher M.
2014-06-01
The Night Vision Image Generator (NVIG), a product of US Army RDECOM CERDEC NVESD, is a visualization tool used widely throughout Army simulation environments to provide fully attributed, synthesized, full motion video using physics-based sensor and environmental effects. The NVIG relies heavily on contemporary hardware-based acceleration and GPU processing techniques, which push the envelope of both enterprise and commodity-level hypervisor support for providing virtual machines with direct access to hardware resources. The NVIG has successfully been integrated into fully virtual environments where system architectures leverage cloud-based technologies to various extents in order to streamline infrastructure and service management. This paper details the challenges presented to engineers seeking to migrate GPU-bound processes, such as the NVIG, to virtual machines and, ultimately, Cloud-Based IAS architectures. In addition, it presents the path that led to success for the NVIG. A brief overview of Cloud-Based infrastructure management tool sets is provided, and several virtual desktop solutions are outlined. A distinction is drawn between general purpose virtual desktop technologies and technologies that expose GPU-specific capabilities, including direct rendering and hardware-based video encoding. Candidate hypervisor/virtual machine configurations that nominally satisfy the virtualized hardware-level GPU requirements of the NVIG are presented, and each is subsequently reviewed in light of its implications for higher-level Cloud management techniques. Implementation details are included from the hardware level, through the operating system, to the 3D graphics APIs required by the NVIG and similar GPU-bound tools.
Fast adaptive composite grid methods on distributed parallel architectures
NASA Technical Reports Server (NTRS)
Lemke, Max; Quinlan, Daniel
1992-01-01
The fast adaptive composite (FAC) grid method is compared with the adaptive composite method (AFAC) under a variety of conditions including vectorization and parallelization. Results are given for distributed memory multiprocessor architectures (SUPRENUM, Intel iPSC/2 and iPSC/860). It is shown that the good performance of AFAC and its superiority over FAC in a parallel environment is a property of the algorithm and not dependent on peculiarities of any machine.
NASA Astrophysics Data System (ADS)
Bamiedakis, N.; Chen, J.; Penty, R. V.; White, I. H.
2016-03-01
Multimode polymer waveguides are being increasingly considered for use in short-reach board-level optical interconnects as they exhibit favourable optical properties and allow direct integration onto standard PCBs with conventional methods of the electronics industry. Siloxane-based multimode waveguides have been demonstrated with excellent optical transmission performance, while a wide range of passive waveguide components that offer routing flexibility and enable the implementation of complex on-board interconnection architectures has been reported. In recent work, we have demonstrated that these polymer waveguides can exhibit very high bandwidth-length products in excess of 30 GHz×m despite their highly-multimoded nature, while it has been shown that even larger values of > 60 GHz×m can be achieved by adjusting their refractive index profile. Furthermore, the combination of refractive index engineering and launch conditioning schemes can ensure high bandwidth (> 100 GHz×m) and high coupling efficiency (<1 dB loss) with standard multimode fibre inputs with relatively large alignment tolerances (~17×15 μm²). In the work presented here, we investigate the effects of refractive index engineering on the performance of passive waveguide components (crossings, bends) and provide suitable design rules for their on-board use. It is shown that, depending on the interconnection layout and link requirements, appropriate choice of refractive index profile can provide enhanced component performance, ensuring low loss interconnection and adequate link bandwidth. The results highlight the strong potential of this versatile optical technology for the formation of high-performance board-level optical interconnects with high routing flexibility.
Workflow as a Service in the Cloud: Architecture and Scheduling Algorithms
Wang, Jianwu; Korambath, Prakashan; Altintas, Ilkay; Davis, Jim; Crawl, Daniel
2017-01-01
With more and more workflow systems adopting the cloud as their execution environment, it becomes increasingly challenging to efficiently manage various workflows, virtual machines (VMs) and workflow execution on VM instances. To make the system scalable and easy to extend, we design a Workflow as a Service (WFaaS) architecture with independent services. A core part of the architecture is how to efficiently respond to continuous workflow requests from users and schedule their executions in the cloud. Based on different targets, we propose four heuristic workflow scheduling algorithms for the WFaaS architecture, and analyze the differences and best usages of the algorithms in terms of performance, cost and the price/performance ratio via experimental studies. PMID:29399237
Micro-Machined High-Frequency (80 MHz) PZT Thick Film Linear Arrays
Zhou, Qifa; Wu, Dawei; Liu, Changgeng; Zhu, Benpeng; Djuth, Frank; Shung, K. Kirk
2010-01-01
This paper presents the development of a micro-machined high-frequency linear array using PZT piezoelectric thick films. The linear array has 32 elements with an element width of 24 μm and an element length of 4 mm. Array elements were fabricated by deep reactive ion etching of PZT thick films, which were prepared from spin-coating of PZT solgel composite. Detailed fabrication processes, especially PZT thick film etching conditions and a novel transferring-and-etching method, are presented and discussed. Array designs were evaluated by simulation. Experimental measurements show that the array had a center frequency of 80 MHz and a fractional bandwidth (−6 dB) of 60%. An insertion loss of −41 dB and adjacent element crosstalk of −21 dB were found at the center frequency. PMID:20889407
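The reported center frequency (80 MHz) and −6 dB fractional bandwidth (60%) translate into band edges as sketched below. The sketch assumes the common definition in which fractional bandwidth is the −6 dB bandwidth divided by the center frequency, with the band symmetric about the center; the function name is hypothetical.

```python
def band_edges(center_hz: float, fractional_bandwidth: float):
    """Lower and upper band edges for a band symmetric about center_hz.

    fractional_bandwidth is (f_high - f_low) / center_hz, e.g. 0.60 for 60%.
    """
    if center_hz <= 0 or not 0.0 <= fractional_bandwidth < 2.0:
        raise ValueError("invalid center frequency or fractional bandwidth")
    half_width = 0.5 * fractional_bandwidth * center_hz
    return center_hz - half_width, center_hz + half_width


if __name__ == "__main__":
    low, high = band_edges(80e6, 0.60)
    print(f"-6 dB band: {low / 1e6:.0f} MHz to {high / 1e6:.0f} MHz")
```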
Orchestrating Bulk Data Movement in Grid Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vazhkudai, SS
2005-01-25
Data Grids provide a convenient environment for researchers to manage and access massively distributed bulk data by addressing several system and transfer challenges inherent to these environments. This work addresses issues involved in the efficient selection and access of replicated data in Grid environments in the context of the Globus Toolkit™, building middleware that (1) selects datasets in highly replicated environments, enabling efficient scheduling of data transfer requests; (2) predicts transfer times of bulk wide-area data transfers using extensive statistical analysis; and (3) co-allocates bulk data transfer requests, enabling parallel downloads from mirrored sites. These efforts have demonstrated a decentralized data scheduling architecture, a set of forecasting tools that predict bandwidth availability within 15% error, and a co-allocation architecture and heuristics that expedite data downloads by up to 2 times.
NASA Astrophysics Data System (ADS)
Arun, S.; Choudhury, Vishal; Balaswamy, V.; Supradeepa, V. R.
2018-02-01
We have demonstrated a 34 W continuous wave supercontinuum using standard telecom fiber (SMF 28e). The supercontinuum spans a bandwidth of 1000 nm (>1 octave) from 880 nm to 1900 nm with a substantial power spectral density of >1 mW/nm from 880-1350 nm and 50-100 mW/nm in 1350-1900 nm. The distributed feedback Raman laser architecture was used for pumping the supercontinuum, which ensured high efficiency Raman conversions and helped in achieving a very high efficiency of 44% for supercontinuum generation. Using this architecture, a Yb laser operating at any wavelength can be used for generating the supercontinuum, and this was demonstrated by using two different Yb lasers operating at 1117 nm and 1085 nm to pump the supercontinuum.
WDM-PON Architecture for FTTx Networks
NASA Astrophysics Data System (ADS)
Iannone, E.; Franco, P.; Santoni, S.
Broadband services for residential users in European countries have until now largely relied on xDSL technologies, while FTTx technologies have been mainly exploited in Asia and North America. The increasing bandwidth demand and the growing penetration of new services are pushing the deployment of optical access networks, and major European operators are now announcing FTTx projects. While FTTH is recognized as the target solution to bring broadband services to residential users, the identification of an FTTx evolutionary path able to seamlessly migrate to FTTH is key to enabling a massive deployment, easing the huge investments needed. WDM-PON architecture is an interesting solution that is able to accommodate the strategic need of building a new fiber-based access infrastructure with the possibility of adapting investments to actual demands and evolving to FTTH without requiring further interventions on fiber infrastructures.
NASA Astrophysics Data System (ADS)
Jenkins, David R.; Basden, Alastair; Myers, Richard M.
2018-05-01
We propose a solution to the increased computational demands of Extremely Large Telescope (ELT) scale adaptive optics (AO) real-time control with the Intel Xeon Phi Knights Landing (KNL) Many Integrated Core (MIC) Architecture. The computational demands of an AO real-time controller (RTC) scale with the fourth power of telescope diameter and so the next generation ELTs require orders of magnitude more processing power for the RTC pipeline than existing systems. The Xeon Phi contains a large number (≥64) of low power x86 CPU cores and high bandwidth memory integrated into a single socketed server CPU package. The increased parallelism and memory bandwidth are crucial to providing the performance for reconstructing wavefronts with the required precision for ELT scale AO. Here, we demonstrate that the Xeon Phi KNL is capable of performing ELT scale single conjugate AO real-time control computation at over 1.0kHz with less than 20μs RMS jitter. We have also shown that with a wavefront sensor camera attached the KNL can process the real-time control loop at up to 966Hz, the maximum frame-rate of the camera, with jitter remaining below 20μs RMS. Future studies will involve exploring the use of a cluster of Xeon Phis for the real-time control of the MCAO and MOAO regimes of AO. We find that the Xeon Phi is highly suitable for ELT AO real time control.
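The scaling and timing figures in the abstract can be explored with a short sketch. It takes the fourth-power scaling with telescope diameter from the abstract as given; the function names and the diameters used in the example (an 8 m telescope versus a 39 m ELT) are illustrative assumptions.

```python
def rtc_compute_ratio(diameter_new_m: float, diameter_ref_m: float) -> float:
    """Relative RTC compute demand, assuming demand scales as diameter**4."""
    return (diameter_new_m / diameter_ref_m) ** 4


def frame_period_us(frame_rate_hz: float) -> float:
    """Time available per control-loop iteration, in microseconds."""
    return 1e6 / frame_rate_hz


def jitter_fraction(jitter_us: float, frame_rate_hz: float) -> float:
    """RMS jitter expressed as a fraction of the frame period."""
    return jitter_us / frame_period_us(frame_rate_hz)


if __name__ == "__main__":
    # Hypothetical comparison: 39 m ELT versus an 8 m class telescope.
    print(f"compute ratio: {rtc_compute_ratio(39.0, 8.0):.0f}x")
    for rate in (966.0, 1000.0):
        print(f"{rate:6.0f} Hz -> period {frame_period_us(rate):7.1f} us, "
              f"20 us jitter = {100 * jitter_fraction(20.0, rate):.1f}% of period")
```

Under these assumptions the diameter ratio alone accounts for a demand increase of several hundred times, consistent with the "orders of magnitude" stated in the abstract.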
NASA Astrophysics Data System (ADS)
Elder, Delwin L.; Johnson, Lewis E.; Tillack, Andreas F.; Robinson, Bruce H.; Haffner, Christian; Heni, Wolfgang; Hoessbacher, Claudia; Fedoryshyn, Yuriy; Salamin, Yannick; Baeuerle, Benedikt; Josten, Arne; Ayata, Masafumi; Koch, Ueli; Leuthold, Juerg; Dalton, Larry R.
2018-02-01
Multi-scale (correlated quantum and statistical mechanics) modeling methods have been advanced and employed to guide the improvement of organic electro-optic (OEO) materials, including by analyzing electric field poling induced electro-optic activity in nanoscopic plasmonic-organic hybrid (POH) waveguide devices. The analysis of in-device electro-optic activity emphasizes the importance of considering both the details of intermolecular interactions within organic electro-optic materials and interactions at interfaces between OEO materials and device architectures. Dramatic improvements in electro-optic device performance, including voltage-length performance, bandwidth, energy efficiency, and lower optical losses, have been realized. These improvements are critical to applications in telecommunications, computing, sensor technology, and metrology. Multi-scale modeling methods illustrate the complexity of improving the electro-optic activity of organic materials, including the necessity of considering the trade-off between improving poling-induced acentric order through chromophore modification and the reduction of chromophore number density associated with such modification. Computational simulations also emphasize the importance of developing chromophore modifications that serve multiple purposes including matrix hardening for enhanced thermal and photochemical stability, control of matrix dimensionality, influence on material viscoelasticity, improvement of chromophore molecular hyperpolarizability, control of material dielectric permittivity and index of refraction properties, and control of material conductance. Consideration of new device architectures is critical to the implementation of chipscale integration of electronics and photonics and achieving the high bandwidths for applications such as next generation (e.g., 5G) telecommunications.
Autonomous Distributed Congestion Control Scheme in WCDMA Network
NASA Astrophysics Data System (ADS)
Ahmad, Hafiz Farooq; Suguri, Hiroki; Choudhary, Muhammad Qaisar; Hassan, Ammar; Liaqat, Ali; Khan, Muhammad Umer
Wireless technology has become widely popular and an important means of communication. A key issue in delivering wireless services is the problem of congestion, which has an adverse impact on the Quality of Service (QoS), especially timeliness. Although a lot of work has been done in the context of RRM (Radio Resource Management), the delivery of quality service to the end user still remains a challenge. Therefore there is a need for a system that provides real-time services to the users with high assurance. We propose an intelligent agent-based approach to guarantee a predefined Service Level Agreement (SLA) with heterogeneous user requirements for appropriate bandwidth allocation in QoS-sensitive cellular networks. The proposed system architecture exploits the Case Based Reasoning (CBR) technique to handle the RRM process of congestion management. The system accomplishes the predefined SLA through the use of a Retrieval and Adaptation Algorithm based on a CBR case library. The proposed intelligent agent architecture gives autonomy to the Radio Network Controller (RNC) or Base Station (BS) in accepting, rejecting or buffering a connection request to manage system bandwidth. Instead of simply blocking the connection request as congestion hits the system, different buffering durations are allocated to diverse classes of users based on their SLA. This increases the opportunity of connection establishment and reduces the call blocking rate extensively in a changing environment. We carry out a simulation of the proposed system that verifies efficient performance for congestion handling. The results also show the built-in dynamism of our system to cater for a variety of SLA requirements.
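The accept / buffer / reject decision described above can be sketched as a nearest-case lookup. This is a minimal illustration of the retrieval step only, not the authors' algorithm: the case fields, the distance function, the SLA ranks and every number in the case library are hypothetical, and the adaptation step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    load: float      # fraction of cell bandwidth in use, 0..1
    sla_rank: int    # hypothetical SLA class: 0 = premium, 2 = best effort
    action: str      # "accept", "buffer" or "reject"
    buffer_s: float  # how long a buffered request is held, in seconds


# Hypothetical case library; a real one would be learned from past decisions.
CASE_LIBRARY = [
    Case(0.50, 0, "accept", 0.0),
    Case(0.50, 2, "accept", 0.0),
    Case(0.90, 0, "buffer", 8.0),
    Case(0.90, 2, "buffer", 2.0),
    Case(1.00, 0, "buffer", 4.0),
    Case(1.00, 2, "reject", 0.0),
]


def retrieve(load, sla_rank):
    """Return the stored case closest to the current situation."""
    def distance(case):
        # Penalise a mismatched SLA class more than any load difference.
        return abs(case.load - load) + (0.0 if case.sla_rank == sla_rank else 1.0)
    return min(CASE_LIBRARY, key=distance)


def decide(load, sla_rank):
    """Map a connection request to (action, buffering duration)."""
    case = retrieve(load, sla_rank)
    return case.action, case.buffer_s


print(decide(0.92, 0))  # premium user under congestion
print(decide(0.97, 2))  # best-effort user under congestion
```

The point of the sketch is that a congested cell does not block every request outright: the same load level maps to different buffering durations depending on the SLA class.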
Open architecture of smart sensor suites
NASA Astrophysics Data System (ADS)
Müller, Wilmuth; Kuwertz, Achim; Grönwall, Christina; Petersson, Henrik; Dekker, Rob; Reinert, Frank; Ditzel, Maarten
2017-10-01
Experiences from recent conflicts show the strong need for smart sensor suites comprising different multi-spectral imaging sensors as core elements as well as additional non-imaging sensors. Smart sensor suites should be part of a smart sensor network - a network of sensors, databases, evaluation stations and user terminals. Its goal is to optimize the use of various information sources for military operations such as situation assessment, intelligence, surveillance, reconnaissance, target recognition and tracking. Such a smart sensor network will enable commanders to achieve higher levels of situational awareness. Within the study at hand, an open system architecture was developed in order to increase the efficiency of sensor suites. The open system architecture for smart sensor suites, based on a system-of-systems approach, enables combining different sensors in multiple physical configurations, such as distributed sensors, co-located sensors combined in a single package, tower-mounted sensors, sensors integrated in a mobile platform, and trigger sensors. The architecture was derived from a set of system requirements and relevant scenarios. Its mode of operation is adaptable to a series of scenarios with respect to relevant objects of interest, activities to be observed, available transmission bandwidth, etc. The presented open architecture is designed in accordance with the NATO Architecture Framework (NAF). The architecture allows smart sensor suites to be part of a surveillance network, linked e.g. to a sensor planning system and a C4ISR center, and to be used in combination with future RPAS (Remotely Piloted Aircraft Systems) for supporting a more flexible dynamic configuration of RPAS payloads.
2006-08-01
information. For example, to say “If Asimov is right, then his three laws hold,” we could write r → (As1 ∧ As2 ∧ As3) where As stands for Asimov’s law. The...Robots Via Mechanized Deontic Logic,” tech. report Machine Ethics: papers from the AAAI Fall Symp.; FS–05–06, 2005b. 3. I. Asimov, I, Robot, Spectra, 2004
Sequence invariant state machines
NASA Technical Reports Server (NTRS)
Whitaker, S.; Manjunath, S.
1990-01-01
A synthesis method and new VLSI architecture are introduced to realize sequential circuits that have the ability to implement any state machine having N states and m inputs, regardless of the actual sequence specified in the flow table. A design method is proposed that utilizes BTS logic to implement regular and dense circuits. A given state sequence can be programmed with power supply connections or dynamically reallocated if stored in a register. Arbitrary flow table sequences can be modified or programmed to dynamically alter the function of the machine. This allows VLSI controllers to be designed with the programmability of a general purpose processor but with the compact size and performance of dedicated logic.
Sequence-invariant state machines
NASA Technical Reports Server (NTRS)
Whitaker, Sterling R.; Manjunath, Shamanna K.; Maki, Gary K.
1991-01-01
A synthesis method and an MOS VLSI architecture are presented to realize sequential circuits that have the ability to implement any state machine having N states and m inputs, regardless of the actual sequence specified in the flow table. The design method utilizes binary tree structured (BTS) logic to implement regular and dense circuits. The desired state sequence can be hardwired with power supply connections or can be dynamically reallocated if stored in a register. This allows programmable VLSI controllers to be designed with a compact size and performance approaching that of dedicated logic. Results of ICV implementations are reported and an example sequence-invariant state machine is contrasted with implementations based on traditional methods.
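The idea of a state machine whose logic is fixed while its sequence is programmable can be illustrated in software. The sketch below is an analogy under stated assumptions, not the BTS circuit from the abstract: the class name, the table layout and the counter examples are all hypothetical.

```python
class SequenceInvariantFSM:
    """Fixed stepping logic; the state sequence lives entirely in a reloadable table."""

    def __init__(self, n_states, n_input_bits):
        self.n_states = n_states
        self.n_codes = 2 ** n_input_bits  # m input bits give 2^m input codes
        self.table = [[0] * self.n_codes for _ in range(n_states)]
        self.state = 0

    def program(self, flow_table):
        """Load a new flow table: flow_table[state][input_code] -> next state."""
        if len(flow_table) != self.n_states:
            raise ValueError("flow table must have one row per state")
        for row in flow_table:
            if len(row) != self.n_codes:
                raise ValueError("each row needs one entry per input code")
            if any(not 0 <= s < self.n_states for s in row):
                raise ValueError("next state out of range")
        self.table = [list(row) for row in flow_table]

    def step(self, input_code):
        self.state = self.table[self.state][input_code]
        return self.state


def run(fsm, inputs):
    return [fsm.step(i) for i in inputs]


# Two different sequences on the same machine: input 0 holds, input 1 advances.
up = [[s, (s + 1) % 4] for s in range(4)]
down = [[s, (s - 1) % 4] for s in range(4)]

fsm = SequenceInvariantFSM(n_states=4, n_input_bits=1)
fsm.program(up)
trace_up = run(fsm, [1, 1, 0, 1])
fsm.program(down)  # reprogram without touching the stepping logic
trace_down = run(fsm, [1, 1])
```

Reloading the table plays the role of the register-stored sequence mentioned in the abstract; the `step` method never changes.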
Parallel processors and nonlinear structural dynamics algorithms and software
NASA Technical Reports Server (NTRS)
Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.; Plaskacz, Edward J.
1989-01-01
The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine, is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14.
Oweiss, Karim G
2006-07-01
This paper suggests a new approach for data compression during extracutaneous transmission of neural signals recorded by a high-density microelectrode array in the cortex. The approach is based on exploiting the temporal and spatial characteristics of the neural recordings in order to strip the redundancy and infer the useful information early in the data stream. The proposed signal processing algorithms augment current filtering and amplification capability and may be a viable replacement for the on-chip spike detection and sorting currently employed to remedy the bandwidth limitations. Temporal processing is devised by exploiting the sparseness capabilities of the discrete wavelet transform, while spatial processing exploits the reduction in the number of physical channels through quasi-periodic eigendecomposition of the data covariance matrix. Our results demonstrate that substantial improvements are obtained in terms of lower transmission bandwidth, reduced latency and optimized processor utilization. We also demonstrate the improvements qualitatively in terms of superior denoising capabilities and higher fidelity of the obtained signals.
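The temporal step, where a wavelet transform concentrates a spike into a few large coefficients, can be sketched as follows. The Haar wavelet and the hard threshold are stand-ins chosen for brevity; the abstract does not say which wavelet or thresholding rule the paper uses, and the sample signal is invented.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Full multi-level orthonormal Haar DWT; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = approx + detail  # approximations first, then details
        n = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = merged
        n *= 2
    return out


def compress(x, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in haar_forward(x)]


# Invented example: low-level noise with one spike-like excursion.
signal = [0.0, 0.1, 0.0, -0.1, 4.0, -3.0, 0.1, 0.0]
kept = compress(signal, threshold=0.5)
nonzero = sum(1 for c in kept if c != 0.0)
approximation = haar_inverse(kept)
```

Only the surviving nonzero coefficients (and their positions) would need to be transmitted, which is where the bandwidth saving comes from in this kind of scheme.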
Very high frequency (beyond 100 MHz) PZT kerfless linear arrays.
Wu, Da-Wei; Zhou, Qifa; Geng, Xuecang; Liu, Chang-Geng; Djuth, Frank; Shung, K Kirk
2009-10-01
This paper presents the design, fabrication, and measurements of very high frequency kerfless linear arrays prepared from PZT film and PZT bulk material. A 12-µm PZT thick film fabricated from PZT-5H powder/solution composite and a piece of 15-µm PZT-5H sheet were used to fabricate 32-element kerfless high-frequency linear arrays with photolithography. The PZT thick film was prepared by spin-coating of PZT sol-gel composite solution. The thin PZT-5H sheet sample was prepared by lapping a PZT-5H ceramic with a precision lapping machine. The measured results of the 2 arrays were compared. The PZT film array had a center frequency of 120 MHz, a bandwidth of 60% with a parylene matching layer, and an insertion loss of 41 dB. The PZT ceramic sheet array was found to have a center frequency of 128 MHz with a poorer bandwidth (40% with a parylene matching layer) but a better sensitivity (28 dB insertion loss).
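The percentage bandwidths quoted above can be converted to absolute figures. The sketch assumes the percentage is a fractional bandwidth relative to the center frequency and that the passband is symmetric about it; the abstract does not state the dB level at which the bandwidth was measured, and the function name is hypothetical.

```python
def band_edges(center_mhz, fractional_bw_percent):
    """Return (lower edge, upper edge, width) in MHz for a symmetric passband."""
    width = center_mhz * fractional_bw_percent / 100.0
    return center_mhz - width / 2.0, center_mhz + width / 2.0, width


film = band_edges(120.0, 60.0)   # PZT film array
sheet = band_edges(128.0, 40.0)  # PZT ceramic sheet array

print("film:  %.1f to %.1f MHz (%.1f MHz wide)" % film)
print("sheet: %.1f to %.1f MHz (%.1f MHz wide)" % sheet)
```

Under these assumptions the film array covers roughly 72 MHz of spectrum against about 51 MHz for the sheet array, which is the trade the abstract describes against the sheet array's 13 dB lower insertion loss.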
Very High Frequency (Beyond 100 MHz) PZT Kerfless Linear Arrays
Wu, Da-Wei; Zhou, Qifa; Geng, Xuecang; Liu, Chang-Geng; Djuth, Frank; Shung, K. Kirk
2010-01-01
This paper presents the design, fabrication, and measurements of very high frequency kerfless linear arrays prepared from PZT film and PZT bulk material. A 12-µm PZT thick film fabricated from PZT-5H powder/solution composite and a piece of 15-µm PZT-5H sheet were used to fabricate 32-element kerfless high-frequency linear arrays with photolithography. The PZT thick film was prepared by spin-coating of PZT sol-gel composite solution. The thin PZT-5H sheet sample was prepared by lapping a PZT-5H ceramic with a precision lapping machine. The measured results of the 2 arrays were compared. The PZT film array had a center frequency of 120 MHz, a bandwidth of 60% with a parylene matching layer, and an insertion loss of 41 dB. The PZT ceramic sheet array was found to have a center frequency of 128 MHz with a poorer bandwidth (40% with a parylene matching layer) but a better sensitivity (28 dB insertion loss). PMID:19942516
Low-complexity transcoding algorithm from H.264/AVC to SVC using data mining
NASA Astrophysics Data System (ADS)
Garrido-Cantos, Rosario; De Cock, Jan; Martínez, Jose Luis; Van Leuven, Sebastian; Cuenca, Pedro; Garrido, Antonio
2013-12-01
Nowadays, networks and terminals with diverse characteristics of bandwidth and capabilities coexist. To ensure a good quality of experience, this diverse environment demands adaptability of the video stream. In general, video content is compressed to save storage capacity and to reduce the bandwidth required for its transmission. Therefore, if these compressed video streams were compressed using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since most multimedia content is compressed using H.264/AVC, it cannot benefit from that scalability. This paper proposes a low-complexity algorithm to convert an H.264/AVC bitstream without scalability to scalable bitstreams with temporal scalability in baseline and main profiles by accelerating the mode decision task of the scalable video coding encoding stage using machine learning tools. The results show that when our technique is applied, the complexity is reduced by 87% while maintaining coding efficiency.
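Temporal scalability means that frames are grouped into layers so that dropping the highest layer lowers the frame rate without breaking decoding. The sketch below shows the common dyadic layer assignment as a general illustration; it is not the transcoding algorithm of the paper, and the GOP size of 8 is an assumed example value.

```python
def temporal_layer(frame_idx, gop_size):
    """Dyadic temporal layer of a frame; gop_size must be a power of two."""
    levels = gop_size.bit_length() - 1  # log2(gop_size)
    pos = frame_idx % gop_size
    if pos == 0:
        return 0  # key frames form the base layer
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def frames_up_to_layer(n_frames, gop_size, max_layer):
    """Frames a receiver keeps when it decodes only layers 0..max_layer."""
    return [i for i in range(n_frames) if temporal_layer(i, gop_size) <= max_layer]


layers = [temporal_layer(i, 8) for i in range(9)]
half_rate = frames_up_to_layer(8, 8, max_layer=2)     # drop the top layer
quarter_rate = frames_up_to_layer(8, 8, max_layer=1)  # drop the top two layers
```

Each dropped layer halves the frame rate, which is how a single scalable bitstream can serve terminals and networks of different capability.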
First 3 years of operation of RIACS (Research Institute for Advanced Computer Science) (1983-1985)
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
The focus of the Research Institute for Advanced Computer Science (RIACS) is to explore matches between advanced computing architectures and the processes of scientific research. An architecture evaluation of the MIT static dataflow machine, specification of a graphical language for expressing distributed computations, and specification of an expert system for aiding in grid generation for two-dimensional flow problems were initiated. Research projects for 1984 and 1985 are summarized.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Draeger, E. W.
The Advanced Architecture and Portability Specialists team (AAPS) worked with a select set of LLNL application teams to develop and/or implement a portability strategy for next-generation architectures. The team also investigated new and updated programming models and helped develop programming abstractions targeting maintainability and performance portability. Significant progress was made on both fronts in FY17, resulting in multiple applications being significantly more prepared for the next-generation machines than before.
GREAT: a web portal for Genome Regulatory Architecture Tools.
Bouyioukos, Costas; Bucchini, François; Elati, Mohamed; Képès, François
2016-07-08
GREAT (Genome REgulatory Architecture Tools) is a novel web portal for tools designed to generate user-friendly and biologically useful analysis of genome architecture and regulation. The online tools of GREAT are freely accessible and compatible with essentially any operating system which runs a modern browser. GREAT is based on the analysis of genome layout -defined as the respective positioning of co-functional genes- and its relation with chromosome architecture and gene expression. GREAT tools allow users to systematically detect regular patterns along co-functional genomic features in an automatic way consisting of three individual steps and respective interactive visualizations. In addition to the complete analysis of regularities, GREAT tools enable the use of periodicity and position information for improving the prediction of transcription factor binding sites using a multi-view machine learning approach. The outcome of this integrative approach features a multivariate analysis of the interplay between the location of a gene and its regulatory sequence. GREAT results are plotted in web interactive graphs and are available for download either as individual plots, self-contained interactive pages or as machine readable tables for downstream analysis. The GREAT portal can be reached at the following URL https://absynth.issb.genopole.fr/GREAT and each individual GREAT tool is available for downloading. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
On the Performance of an Algebraic Multigrid Solver on Multicore Clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, A H; Schulz, M; Yang, U M
2010-04-29
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.
Zhao, Jiangsan; Bodner, Gernot; Rewald, Boris
2016-01-01
Phenotyping local crop cultivars is becoming more and more important, as they are an important genetic source for breeding – especially in regard to inherent root system architectures. Machine learning algorithms are promising tools to assist in the analysis of complex data sets; novel approaches are needed to apply them to root phenotyping data of mature plants. A greenhouse experiment was conducted in large, sand-filled columns to differentiate 16 European Pisum sativum cultivars based on 36 manually derived root traits. Through combining random forest and support vector machine models, machine learning algorithms were successfully used for unbiased identification of the most distinguishing root traits and subsequent pairwise cultivar differentiation. Up to 86% of pea cultivar pairs could be distinguished based on the top five important root traits (Timp5) – Timp5 differed widely between cultivar pairs. Selecting top important root traits (Timp) provided a significantly improved classification compared to using all available traits or randomly selected trait sets. The most frequent Timp of mature pea cultivars was total surface area of lateral roots originating from tap root segments at 0–5 cm depth. The high classification rate implies that culturing did not lead to a major loss of variability in root system architecture in the studied pea cultivars. Our results illustrate the potential of machine learning approaches for unbiased (root) trait selection and cultivar classification based on rather small, complex phenotypic data sets derived from pot experiments. Powerful statistical approaches are essential to make use of the increasing amount of (root) phenotyping information, integrating the complex trait sets describing crop cultivars. PMID:27999587
Wireless data transmission for high energy physics applications
NASA Astrophysics Data System (ADS)
Dittmeier, Sebastian; Brenner, Richard; Dancila, Dragos; Dehos, Cedric; De Lurgio, Patrick; Djurcic, Zelimir; Drake, Gary; Gonzalez Gimenez, Jose Luis; Gustafsson, Leif; Kim, Do-Won; Locci, Elizabeth; Pfeiffer, Ullrich; Röhrich, Dieter; Rydberg, Anders; Schöning, André; Siligaris, Alexandre; Soltveit, Hans Kristian; Ullaland, Kjetil; Vincent, Pierre; Rodriguez Vazquez, Pedro; Wiedner, Dirk; Yang, Shiming
2017-08-01
Silicon tracking detectors operated at high luminosity collider experiments pose a challenge for current and future readout systems regarding bandwidth, radiation, space and power constraints. With the latest developments in wireless communications, wireless readout systems might be an attractive alternative to commonly used wired optical and copper based readout architectures. The WADAPT group (Wireless Allowing Data and Power Transmission) has been formed to study the feasibility of wireless data transmission for future tracking detectors. These proceedings cover current developments focused on communication in the 60 GHz band. This frequency band offers a high bandwidth, a small form factor and an already mature technology. The motivation for wireless data transmission for high energy physics applications and the developments towards a demonstrator prototype are summarized. Feasibility studies concerning the construction and operation of a wireless transceiver system have been performed. Data transmission tests with a transceiver prototype operating at even higher frequencies in the 240 GHz band are described. Data transmission rates of up to 10 Gb/s have been obtained successfully using binary phase shift keying.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Weikuan; Vetter, Jeffrey S
Parallel NFS (pNFS) is touted as an emergent standard protocol for parallel I/O access in various storage environments. Several pNFS prototypes have been implemented for initial validation and protocol examination. Previous efforts have focused on realizing the pNFS protocol to expose the best bandwidth potential from underlying file and storage systems. In this presentation, we provide an initial characterization of two pNFS prototype implementations, lpNFS (a Lustre-based parallel NFS implementation) and spNFS (another reference implementation from Network Appliance, Inc.). We show that both lpNFS and spNFS can faithfully achieve the primary goal of pNFS, i.e., aggregating I/O bandwidth from many storage servers. However, they both face the challenge of scalable metadata management. In particular, the throughput of spNFS metadata operations degrades significantly with an increasing number of data servers. Even for the better-performing lpNFS, we discuss its architecture and propose a direct I/O request flow protocol to improve its performance.
Farrell, Alan C.; Senanayake, Pradeep; Hung, Chung-Hong; El-Howayek, Georges; Rajagopal, Abhejit; Currie, Marc; Hayat, Majeed M.; Huffaker, Diana L.
2015-01-01
Avalanche photodiodes (APDs) are essential components in quantum key distribution systems and active imaging systems requiring both ultrafast response time to measure photon time of flight and high gain to detect low photon flux. The internal gain of an APD can improve system signal-to-noise ratio (SNR). Excess noise is typically kept low through the selection of material with intrinsically low excess noise, using separate-absorption-multiplication (SAM) heterostructures, or taking advantage of the dead-space effect using thin multiplication regions. In this work we demonstrate the first measurement of excess noise and gain-bandwidth product in III–V nanopillars exhibiting substantially lower excess noise factors compared to bulk and gain-bandwidth products greater than 200 GHz. The nanopillar optical antenna avalanche detector (NOAAD) architecture is utilized for spatially separating the absorption region from the avalanche region via the NOA resulting in single carrier injection without the use of a traditional SAM heterostructure. PMID:26627932
High-speed digital fiber optic links for satellite traffic
NASA Technical Reports Server (NTRS)
Daryoush, A. S.; Ackerman, E.; Saedi, R.; Kunath, R. R.; Shalkhauser, K.
1989-01-01
Large aperture phased array antennas operating at millimeter wave frequencies are designed for space-based communications and imaging platforms. Array elements are comprised of active T/R modules which are linked to the central processing unit through high-speed fiber-optic networks. The system architecture satisfying system requirements at millimeter wave frequency is T/R level data mixing where data and frequency reference signals are distributed independently before mixing at the T/R modules. This paper demonstrates design procedures of a low loss high-speed fiber-optic link used for transmission of data signals over a 600-900 MHz bandwidth inside a satellite. The fiber-optic link is characterized for transmission of analog and digital data. A dynamic range of 79 dB/MHz was measured for analog data over the bandwidth. On the other hand, for bursted SMSK satellite traffic at 220 Mbps rates, a BER of 2 × 10^-7 was measured for an Eb/No of 14.3 dB.
RXIO: Design and implementation of high performance RDMA-capable GridFTP
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tian, Yuan; Yu, Weikuan; Vetter, Jeffrey S.
2011-12-21
For its low latency, high bandwidth, and low CPU utilization, Remote Direct Memory Access (RDMA) has established itself as an effective data movement technology in many networking environments. However, the transport protocols of grid run-time systems, such as GridFTP in Globus, are not yet capable of utilizing RDMA. In this study, we examine the architecture of GridFTP for the feasibility of enabling RDMA. An RDMA-capable XIO (RXIO) framework is designed and implemented to extend its XIO system and match the characteristics of RDMA. Our experimental results demonstrate that RDMA can significantly improve the performance of GridFTP, reducing the latency by 32% and increasing the bandwidth by more than three times. In achieving such performance improvements, RDMA dramatically cuts down CPU utilization of GridFTP clients and servers. In conclusion, these results demonstrate that RXIO can effectively exploit the benefits of RDMA for GridFTP. It offers a good prototype to further leverage GridFTP on wide-area RDMA networks.
Zhang, Tao; Zhang, Jian; Luo, Heng; Deng, Lianwen; Zhou, Pengyu; Wen, Guangwu; Xia, Long; Zhong, Bo; Zhang, Haibin
2018-06-08
Carbon-based materials have excited extensive interest for their remarkable electrical properties and low density for application in electromagnetic (EM) wave absorbents. However, the processing of heteroatom doping in carbon nanostructures is an insuperable challenge for attaining effective reflection loss and EM matching. Herein, a facile method for large-scale synthesis of boron and nitrogen doped carbon nanotubes decorated by ferrite particles is proposed. The BCN nanotubes (50-100 nm in diameter) imbedded with nanosized Fe_x(B/C/N)_y (10-20 nm) are successfully constructed by two steps of polymerization and carbothermic reduction. The product exhibits an outstanding reflection loss (RL) performance, in that the minimum RL is -47.97 dB at 11.44 GHz with a broad bandwidth of 11.2 GHz (from 3.76 to 14.9 GHz) below -10 dB, indicating a competitive absorbent in stealth materials. Crystalline and theoretical studies of the absorption mechanism indicate a unique dielectric dispersion effect in the absorbing bandwidth.
MarFS, a Near-POSIX Interface to Cloud Objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inman, Jeffrey Thornton; Vining, William Flynn; Ransom, Garrett Wilson
The engineering forces driving development of “cloud” storage have produced resilient, cost-effective storage systems that can scale to 100s of petabytes, with good parallel access and bandwidth. These features would make a good match for the vast storage needs of High-Performance Computing datacenters, but cloud storage gains some of its capability from its use of HTTP-style Representational State Transfer (REST) semantics, whereas most large datacenters have legacy applications that rely on POSIX file-system semantics. MarFS is an open-source project at Los Alamos National Laboratory that allows us to present cloud-style object-storage as a scalable near-POSIX file system. We have also developed a new storage architecture to improve bandwidth and scalability beyond what’s available in commodity object stores, while retaining their resilience and economy. Additionally, we present a scheme for scaling the POSIX interface to allow billions of files in a single directory and trillions of files in total.
NASA Astrophysics Data System (ADS)
Zhang, Tao; Zhang, Jian; Luo, Heng; Deng, Lianwen; Zhou, Pengyu; Wen, Guangwu; Xia, Long; Zhong, Bo; Zhang, Haibin
2018-06-01
Carbon-based materials have excited extensive interest for their remarkable electrical properties and low density for application in electromagnetic (EM) wave absorbents. However, the processing of heteroatom doping in carbon nanostructures is an insuperable challenge for attaining effective reflection loss and EM matching. Herein, a facile method for large-scale synthesis of boron and nitrogen doped carbon nanotubes decorated by ferrite particles is proposed. The BCN nanotubes (50–100 nm in diameter) imbedded with nanosized Fe_x(B/C/N)_y (10–20 nm) are successfully constructed by two steps of polymerization and carbothermic reduction. The product exhibits an outstanding reflection loss (RL) performance, in that the minimum RL is -47.97 dB at 11.44 GHz with a broad bandwidth of 11.2 GHz (from 3.76 to 14.9 GHz) below -10 dB, indicating a competitive absorbent in stealth materials. Crystalline and theoretical studies of the absorption mechanism indicate a unique dielectric dispersion effect in the absorbing bandwidth.
MarFS, a Near-POSIX Interface to Cloud Objects
Inman, Jeffrey Thornton; Vining, William Flynn; Ransom, Garrett Wilson; ...
2017-01-01
The engineering forces driving development of “cloud” storage have produced resilient, cost-effective storage systems that can scale to 100s of petabytes, with good parallel access and bandwidth. These features would make a good match for the vast storage needs of High-Performance Computing datacenters, but cloud storage gains some of its capability from its use of HTTP-style Representational State Transfer (REST) semantics, whereas most large datacenters have legacy applications that rely on POSIX file-system semantics. MarFS is an open-source project at Los Alamos National Laboratory that allows us to present cloud-style object-storage as a scalable near-POSIX file system. We have also developed a new storage architecture to improve bandwidth and scalability beyond what’s available in commodity object stores, while retaining their resilience and economy. Additionally, we present a scheme for scaling the POSIX interface to allow billions of files in a single directory and trillions of files in total.
Interfacing insect brain for space applications.
Di Pino, Giovanni; Seidl, Tobias; Benvenuto, Antonella; Sergi, Fabrizio; Campolo, Domenico; Accoto, Dino; Maria Rossini, Paolo; Guglielmelli, Eugenio
2009-01-01
Insects exhibit remarkable navigation capabilities that current control architectures are still far from successfully mimicking and reproducing. In this chapter, we present the results of a study on conceptualizing insect/machine hybrid controllers for improving the autonomy of exploratory vehicles. First, the different principally possible levels of interfacing between insect and machine are examined, followed by a review of current approaches towards hybridity and enabling technologies. Based on the insights of this activity, we propose a double hybrid control architecture which hinges on the concept of "insect-in-a-cockpit." It integrates both biological/artificial (insect/robot) modules and deliberative/reactive behavior. The basic assumption is that "low-level" tasks are managed by the robot, while the "insect intelligence" is exploited whenever high-level problem solving and decision making is required. Both neural and natural interfacing have been considered to achieve robustness and redundancy of exchanged information.
Dynamic extreme learning machine and its approximation capability.
Zhang, Rui; Lan, Yuan; Huang, Guang-Bin; Xu, Zong-Ben; Soh, Yeng Chai
2013-12-01
Extreme learning machines (ELMs) have been proposed for generalized single-hidden-layer feedforward networks which need not be neuron alike and perform well in both regression and classification applications. The problem of determining the suitable network architectures is recognized to be crucial in the successful application of ELMs. This paper first proposes a dynamic ELM (D-ELM) where the hidden nodes can be recruited or deleted dynamically according to their significance to network performance, so that not only the parameters can be adjusted but also the architecture can be self-adapted simultaneously. Then, this paper proves in theory that such D-ELM using Lebesgue p-integrable hidden activation functions can approximate any Lebesgue p-integrable function on a compact input set. Simulation results obtained over various test problems demonstrate and verify that the proposed D-ELM does a good job reducing the network size while preserving good generalization performance.
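For orientation, a basic (static) ELM can be written in a few lines: the hidden-layer weights are drawn at random and never trained, and only the output weights are solved for by least squares. The sketch below shows that baseline and nothing more. It does not implement the dynamic recruitment and deletion of hidden nodes that defines D-ELM, and the sigmoid activation, ridge term, weight ranges and target function are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def train_elm(xs, ys, n_hidden=10, ridge=1e-3, seed=0):
    """Random fixed hidden layer; output weights from ridge least squares."""
    rng = random.Random(seed)
    w = [rng.uniform(-4.0, 4.0) for _ in range(n_hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    hidden = [[sigmoid(wi * x + bi) for wi, bi in zip(w, b)] for x in xs]
    # Normal equations: (H^T H + ridge * I) beta = H^T y
    gram = [[sum(h[i] * h[j] for h in hidden) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * y for h, y in zip(hidden, ys)) for i in range(n_hidden)]
    return w, b, solve(gram, rhs)


def predict(model, x):
    w, b, beta = model
    return sum(bt * sigmoid(wi * x + bi) for wi, bi, bt in zip(w, b, beta))


# Assumed toy regression target: y = x^2 on [-1, 1].
xs = [i / 20.0 - 1.0 for i in range(41)]
ys = [x * x for x in xs]
model = train_elm(xs, ys)
mse = sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

In this static form `n_hidden` has to be chosen by hand, which is the architecture-selection problem the abstract says D-ELM addresses by adjusting the node count during training.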
A system for routing arbitrary directed graphs on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1987-01-01
There are many problems which can be described in terms of directed graphs that contain a large number of vertices where simple computations occur using data from connecting vertices. A method is given for parallelizing such problems on an SIMD machine model that is bit-serial and uses only nearest neighbor connections for communication. Each vertex of the graph will be assigned to a processor in the machine. Algorithms are given that will be used to implement movement of data along the arcs of the graph. This architecture and algorithms define a system that is relatively simple to build and can do graph processing. All arcs can be traversed in parallel in time O(T), where T is empirically proportional to the diameter of the interconnection network times the average degree of the graph. Modifying or adding a new arc takes the same time as parallel traversal.
Diskless supercomputers: Scalable, reliable I/O for the Tera-Op technology base
NASA Technical Reports Server (NTRS)
Katz, Randy H.; Ousterhout, John K.; Patterson, David A.
1993-01-01
Computing is seeing an unprecedented improvement in performance; over the last five years there has been an order-of-magnitude improvement in the speeds of workstation CPU's. At least another order of magnitude seems likely in the next five years, to machines with 500 MIPS or more. The goal of the ARPA Teraop program is to realize even larger, more powerful machines, executing as many as a trillion operations per second. Unfortunately, we have seen no comparable breakthroughs in I/O performance; the speeds of I/O devices and the hardware and software architectures for managing them have not changed substantially in many years. We have completed a program of research to demonstrate hardware and software I/O architectures capable of supporting the kinds of internetworked 'visualization' workstations and supercomputers that will appear in the mid 1990s. The project had three overall goals: high performance, high reliability, and scalable, multipurpose system.
Active semi-supervised learning method with hybrid deep belief networks.
Zhou, Shusen; Chen, Qingcai; Wang, Xiaolong
2014-01-01
In this paper, we develop a novel semi-supervised learning algorithm called active hybrid deep belief networks (AHD), to address the semi-supervised sentiment classification problem with deep learning. First, we construct the first several hidden layers using restricted Boltzmann machines (RBM), which can reduce the dimension and abstract the information of the reviews quickly. Second, we construct the following hidden layers using convolutional restricted Boltzmann machines (CRBM), which can abstract the information of reviews effectively. Third, the constructed deep architecture is fine-tuned by gradient-descent based supervised learning with an exponential loss function. Finally, an active learning method is incorporated into the proposed deep architecture. We conducted several experiments on five sentiment classification datasets, and show that AHD is competitive with previous semi-supervised learning algorithms. Experiments are also conducted to verify the effectiveness of our proposed method with different numbers of labeled and unlabeled reviews.
Hardware/software codesign for embedded RISC core
NASA Astrophysics Data System (ADS)
Liu, Peng
2001-12-01
This paper describes a hardware/software codesign method for the extendible embedded RISC core VIRGO, which is based on the MIPS-I instruction set architecture. VIRGO is described in the Verilog hardware description language, has a five-stage pipeline with a shared 32-bit cache/memory interface, and is controlled by a distributed control scheme. Every pipeline stage has one small controller, which controls the status of that stage and its cooperation with the other stages. Since the description uses a high-level language and the structure is distributed, the VIRGO core is highly extensible and can meet the requirements of applications. Taking the high-definition television MPEG2 MPHL decoder chip as an example, we constructed a hardware/software codesign virtual prototyping machine that can be used to study the VIRGO core instruction set architecture, system-on-chip memory size requirements, system-on-chip software, etc. We can also evaluate the system-on-chip design and the RISC instruction set on the virtual prototyping machine platform.
Multilayer Extreme Learning Machine With Subnetwork Nodes for Representation Learning.
Yang, Yimin; Wu, Q M Jonathan
2016-11-01
The extreme learning machine (ELM), which was originally proposed for "generalized" single-hidden layer feedforward neural networks, provides efficient unified learning solutions for the applications of clustering, regression, and classification. It presents competitive accuracy with superb efficiency in many applications. However, ELM with subnetwork nodes architecture has not attracted much research attention. Recently, many methods have been proposed for supervised/unsupervised dimension reduction or representation learning, but these methods normally only work for one type of problem. This paper studies the general architecture of multilayer ELM (ML-ELM) with subnetwork nodes, showing that: 1) the proposed method provides a representation learning platform with unsupervised/supervised and compressed/sparse representation learning and 2) experimental results on ten image datasets and 16 classification datasets show that the proposed ML-ELM with subnetwork nodes performs competitively with or much better than other conventional feature learning methods.
A Collaborative Knowledge Plane for Autonomic Networks
NASA Astrophysics Data System (ADS)
Mbaye, Maïssa; Krief, Francine
Autonomic networking aims to give network components self-managing capabilities. Several autonomic architectures have been proposed. Each of these architectures includes some sort of knowledge plane, which is very important for mimicking autonomic behavior. The knowledge plane has a central role for self-functions by providing suitable knowledge to equipment, and it needs to learn new strategies for more accuracy. However, defining the knowledge plane's architecture is still a challenge for researchers, in particular defining and implementing the way cognitive supports interact with each other in the knowledge plane. The decision making process depends on these interactions between the reasoning and learning parts of the knowledge plane. In this paper we propose a knowledge plane architecture based on the machine learning (inductive logic programming) paradigm and a situated view to deal with the distributed environment. This architecture is focused on two self-functions that include all other self-functions: self-adaptation and self-organization. Case studies are given and implemented.
3D printing of robotic soft actuators with programmable bioinspired architectures.
Schaffner, Manuel; Faber, Jakob A; Pianegonda, Lucas; Rühs, Patrick A; Coulter, Fergal; Studart, André R
2018-02-28
Soft actuation allows robots to interact safely with humans, other machines, and their surroundings. Full exploitation of the potential of soft actuators has, however, been hindered by the lack of simple manufacturing routes to generate multimaterial parts with intricate shapes and architectures. Here, we report a 3D printing platform for the seamless digital fabrication of pneumatic silicone actuators exhibiting programmable bioinspired architectures and motions. The actuators comprise an elastomeric body whose surface is decorated with reinforcing stripes at a well-defined lead angle. Similar to the fibrous architectures found in muscular hydrostats, the lead angle can be altered to achieve elongation, contraction, or twisting motions. Using a quantitative model based on lamination theory, we establish design principles for the digital fabrication of silicone-based soft actuators whose functional response is programmed within the material's properties and architecture. Exploring such programmability enables 3D printing of a broad range of soft morphing structures.
On the suitability of the connection machine for direct particle simulation
NASA Technical Reports Server (NTRS)
Dagum, Leonard
1990-01-01
The algorithmic structure of the vectorizable Stanford particle simulation (SPS) method was examined, and the structure is reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation and a master/slave algorithm is developed to minimize communication cost in large table look up. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of the lack of a broad-based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms specifically of use for direct particle simulation but which also expand the data parallel diction.
Improvement of multiprocessing performance by using optical centralized shared bus
NASA Astrophysics Data System (ADS)
Han, Xuliang; Chen, Ray T.
2004-06-01
With the ever-increasing need to solve larger and more complex problems, multiprocessing is attracting more and more research efforts. One of the challenges facing the multiprocessor designers is to fulfill in an effective manner the communications among the processes running in parallel on multiple multiprocessors. The conventional electrical backplane bus provides narrow bandwidth as restricted by the physical limitations of electrical interconnects. In the electrical domain, in order to operate at high frequency, the backplane topology has been changed from the simple shared bus to the complicated switched medium. However, the switched medium is an indirect network. It cannot support multicast/broadcast as effectively as the shared bus. Besides the additional latency of going through the intermediate switching nodes, signal routing introduces substantial delay and considerable system complexity. Alternatively, optics has been well known for its interconnect capability. Therefore, it has become imperative to investigate how to improve multiprocessing performance by utilizing optical interconnects. From the implementation standpoint, the existing optical technologies still cannot fulfill the intelligent functions that a switch fabric should provide as effectively as their electronic counterparts. Thus, an innovative optical technology that can provide sufficient bandwidth capacity, while at the same time, retaining the essential merits of the shared bus topology, is highly desirable for the multiprocessing performance improvement. In this paper, the optical centralized shared bus is proposed for use in the multiprocessing systems. This novel optical interconnect architecture not only utilizes the beneficial characteristics of optics, but also retains the desirable properties of the shared bus topology. Meanwhile, from the architecture standpoint, it fits well in the centralized shared-memory multiprocessing scheme. 
Therefore, a smooth migration with substantial multiprocessing performance improvement is expected. To prove the technical feasibility from the architecture standpoint, a conceptual emulation of the centralized shared-memory multiprocessing scheme is demonstrated on a generic PCI subsystem with an optical centralized shared bus.
Can High Bandwidth and Latency Justify Large Cache Blocks in Scalable Multiprocessors?
1994-01-01
400 MB/second. 4 Dubnicki's work used trace-driven simulation, with traces collected on an 8-processor machine. We would expect such small-scale...Figure 17: Miss rate of Ind Blocked LU. Figure 18: MCPR of Ind Blocked LU. overall miss rate of TGauss is a factor of...easily. 17 This approach assumes that the model parameters we collect from simulations with infinite bandwidth (such as the miss rate and the
Distributed and Modular CAN-Based Architecture for Hardware Control and Sensor Data Integration
Losada, Diego P.; Fernández, Joaquín L.; Paz, Enrique; Sanz, Rafael
2017-01-01
In this article, we present a CAN-based (Controller Area Network) distributed system to integrate sensors, actuators and hardware controllers in a mobile robot platform. With this work, we provide a robust, simple, flexible and open system to make hardware elements or subsystems communicate, that can be applied to different robots or mobile platforms. Hardware modules can be connected to or disconnected from the CAN bus while the system is working. It has been tested in our mobile robot Rato, based on a RWI (Real World Interface) mobile platform, to replace the old sensor and motor controllers. It has also been used in the design of two new robots: BellBot and WatchBot. Currently, our hardware integration architecture supports different sensors, actuators and control subsystems, such as motor controllers and inertial measurement units. The integration architecture was tested and compared with other solutions through a performance analysis of relevant parameters such as transmission efficiency and bandwidth usage. The results conclude that the proposed solution implements a lightweight communication protocol for mobile robot applications that avoids transmission delays and overhead. PMID:28467381
Considerations for an Earth Relay Satellite with RF and Optical Trunklines
NASA Technical Reports Server (NTRS)
Israel, David J.
2016-01-01
Support for user platforms through the use of optical links to geosynchronous relay spacecraft is expected to be part of the future space communications architecture. The European Data Relay Satellite System (EDRS) has its first node, EDRS-A, in orbit. The EDRS architecture includes space-to-space optical links with a Ka-Band feeder link or trunkline. NASA's Laser Communications Relay Demonstration (LCRD) mission, originally baselined to support a space-to-space optical link relayed with an optical trunkline, has added a Radio Frequency (RF) trunkline. The use of an RF trunkline avoids the outages suffered by an optical trunkline due to clouds, but an RF trunkline will be bandwidth limited. A space relay architecture with both RF and optical trunklines could relay critical realtime data, while also providing a high data volume capacity. This paper considers the relay user scenarios that could be supported, and the implications for the space relay system and operations. System trades such as the amount of onboard processing and storage required, the use of link layer switching vs. network layer routing, and the use of Delay/Disruption Tolerant Networking (DTN) are discussed.
Medusa: A Scalable MR Console Using USB
Stang, Pascal P.; Conolly, Steven M.; Santos, Juan M.; Pauly, John M.; Scott, Greig C.
2012-01-01
MRI pulse sequence consoles typically employ closed proprietary hardware, software, and interfaces, making difficult any adaptation for innovative experimental technology. Yet MRI systems research is trending to higher channel count receivers, transmitters, gradient/shims, and unique interfaces for interventional applications. Customized console designs are now feasible for researchers with modern electronic components, but high data rates, synchronization, scalability, and cost present important challenges. Implementing large multi-channel MR systems with efficiency and flexibility requires a scalable modular architecture. With Medusa, we propose an open system architecture using the Universal Serial Bus (USB) for scalability, combined with distributed processing and buffering to address the high data rates and strict synchronization required by multi-channel MRI. Medusa uses a modular design concept based on digital synthesizer, receiver, and gradient blocks, in conjunction with fast programmable logic for sampling and synchronization. Medusa is a form of synthetic instrument, being reconfigurable for a variety of medical/scientific instrumentation needs. The Medusa distributed architecture, scalability, and data bandwidth limits are presented, and its flexibility is demonstrated in a variety of novel MRI applications. PMID:21954200
A wide-range programmable frequency synthesizer based on a finite state machine filter
NASA Astrophysics Data System (ADS)
Alser, Mohammed H.; Assaad, Maher M.; Hussin, Fawnizu A.
2013-11-01
In this article, an FPGA-based design and implementation of a fully digital wide-range programmable frequency synthesizer based on a finite state machine filter is presented. The advantages of the proposed architecture are that it simultaneously generates a high-frequency signal from a low-frequency reference signal (i.e., synthesis) and synchronises the two signals (the signals have the same phase or a constant phase difference) without jitter accumulation. The architecture is portable and can be easily implemented for various platforms, such as FPGAs and integrated circuits. The frequency synthesizer circuit can be used as a part of SERDES devices in intra/inter chip communication in system-on-chip (SoC). The proposed circuit is designed using the Verilog language and synthesized for the Altera DE2-70 development board, with the Cyclone II (EP2C35F672C6) device on board. Simulation and experimental results are included; they demonstrate the synthesizing and tracking features of the proposed architecture. The generated clock signal, with a frequency range from 19.8 MHz to 440 MHz, is synchronized to the input reference clock with a frequency step of 0.12 MHz.
Neural networks and applications tutorial
NASA Astrophysics Data System (ADS)
Guyon, I.
1991-09-01
The importance of neural networks has grown dramatically during this decade. While only a few years ago they were primarily of academic interest, now dozens of companies and many universities are investigating the potential use of these systems and products are beginning to appear. The idea of building a machine whose architecture is inspired by that of the brain has roots which go far back in history. Nowadays, technological advances in computers and the availability of custom integrated circuits permit simulations of hundreds or even thousands of neurons. In conjunction, the growing interest in learning machines, non-linear dynamics and parallel computation has spurred renewed attention to artificial neural networks. Many tentative applications have been proposed, including decision systems (associative memories, classifiers, data compressors and optimizers), or parametric models for signal processing purposes (system identification, automatic control, noise canceling, etc.). While they do not always outperform standard methods, neural network approaches are already used in some real world applications for pattern recognition and signal processing tasks. The tutorial is divided into six lectures that were presented at the Third Graduate Summer Course on Computational Physics (September 3-7, 1990) on Parallel Architectures and Applications, organized by the European Physical Society: (1) Introduction: machine learning and biological computation. (2) Adaptive artificial neurons (perceptron, ADALINE, sigmoid units, etc.): learning rules and implementations. (3) Neural network systems: architectures, learning algorithms. (4) Applications: pattern recognition, signal processing, etc. (5) Elements of learning theory: how to build networks which generalize. (6) A case study: a neural network for on-line recognition of handwritten alphanumeric characters.
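As an illustration of the kind of learning rule covered in lecture (2), the classic perceptron update can be sketched as follows. This is a generic textbook version, not material taken from the tutorial itself; the AND-gate data, the learning rate, and the function names are assumptions made for the example.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: move the weights by lr * (target - output) * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Linearly separable toy problem: logical AND (illustrative data).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [predict(weights, bias, x) for x, _ in and_gate]
print(weights, bias, outputs)
```

Because the AND problem is linearly separable, the rule stops changing the weights after a few passes; on non-separable data the same loop would keep oscillating, which is one motivation for the ADALINE and sigmoid units named in the same lecture.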
Executable Architecture of Net Enabled Operations: State Machine of Federated Nodes
2009-11-01
verbal descriptions from operators) of the current Command and Control (C2) practices into model form. In theory these should be Standard Operating...a large amount of data will be needed to ensure that the model reflects the actual processes, the authors recommend that the machine...descriptions from operators) of the current C2 practices into model form. In theory these should be SOPs that execute as a thread from start to finish. The
NASA Technical Reports Server (NTRS)
Schreiber, Robert; Simon, Horst D.
1992-01-01
We are surveying current projects in the area of parallel supercomputers. The machines considered here will become commercially available in the 1990 - 1992 time frame. All are suitable for exploring the critical issues in applying parallel processors to large scale scientific computations, in particular CFD calculations. This chapter presents an overview of the surveyed machines, and a detailed analysis of the various architectural and technology approaches taken. Particular emphasis is placed on the feasibility of a Teraflops capability following the paths proposed by various developers.
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
Artificial Intelligence research has come under fire for failing to fulfill its promises. A growing number of AI researchers are reexamining the bases of AI research and are challenging the assumption that intelligent behavior can be fully explained as manipulation of symbols by algorithms. Three recent books -- Mind over Machine (H. Dreyfus and S. Dreyfus), Understanding Computers and Cognition (T. Winograd and F. Flores), and Brains, Behavior, and Robots (J. Albus) -- explore alternatives and open the door to new architectures that may be able to learn skills.
LHCb experience with running jobs in virtual machines
NASA Astrophysics Data System (ADS)
McNab, A.; Stagni, F.; Luzzi, C.
2015-12-01
The LHCb experiment has been running production jobs in virtual machines since 2013 as part of its DIRAC-based infrastructure. We describe the architecture of these virtual machines and the steps taken to replicate the WLCG worker node environment expected by user and production jobs. This relies on the uCernVM system for providing root images for virtual machines. We use the CernVM-FS distributed filesystem to supply the root partition files, the LHCb software stack, and the bootstrapping scripts necessary to configure the virtual machines for us. Using this approach, we have been able to minimise the amount of contextualisation which must be provided by the virtual machine managers. We explain the process by which the virtual machine is able to receive payload jobs submitted to DIRAC by users and production managers, and how this differs from payloads executed within conventional DIRAC pilot jobs on batch queue based sites. We describe our operational experiences in running production on VM based sites managed using Vcycle/OpenStack, Vac, and HTCondor Vacuum. Finally we show how our use of these resources is monitored using Ganglia and DIRAC.
NASA Astrophysics Data System (ADS)
McMahon, Jeff
Sub-millimeter observations are crucial for answering questions about star and galaxy formation; understanding galactic dust foregrounds; and for removing these foregrounds to detect the faint signature of inflationary gravitational waves in the polarization of the Cosmic Microwave Background (CMB). Achieving these goals requires improved, broad-band antireflection coated lenses and half-wave plates (HWPs). These optical elements will significantly boost the sensitivity and capability of future sub-millimeter and CMB missions. We propose to develop wide-bandwidth metamaterial antireflection coatings for silicon lenses and sapphire HWPs with 3:1 ratio bandwidth that are scalable across the sub-millimeter band from 300 GHz to 3 THz. This is an extension of our successful work on saw cut metamaterial AR coatings for silicon optics at millimeter wavelengths. These and the proposed coatings consist of arrays of sub-wavelength scale features cut into optical surfaces that behave like simple dielectrics. We have demonstrated saw cut 3:1 bandwidth coatings on silicon lenses, but these coatings are limited to the millimeter wave band by the limitations of dicing saw machining. The crucial advance needed to extend these broad band coatings throughout the sub-millimeter band is the development of laser cut graded index metamaterial coatings. The proposed work includes developing the capability to fabricate these coatings, optimizing the design of these metamaterials, fabricating and testing prototype lenses and HWPs, and working with the PIPER collaboration to achieve a sub-orbital demonstration of this technology. The proposed work will develop potentially revolutionary new high performance coatings for the sub-millimeter bands, and carry this technology to TRL 7, paving the way for its use in space. We anticipate that there will be a wide range of applications for these coatings on future NASA balloons and satellites.
Design and optimization of G-band extended interaction klystron with high output power
NASA Astrophysics Data System (ADS)
Li, Renjie; Ruan, Cunjun; Zhang, Huafeng
2018-03-01
A ladder-type Extended Interaction Klystron (EIK) with unequal-length slots in the G-band is proposed and designed. The key parameters of resonance cavities working in the π mode are obtained based on the theoretical analysis and 3D simulation. The influence of the device fabrication tolerance on the high-frequency performance is analyzed in detail, and it is found that at least 5 μm of machining precision is required. Thus, the dynamic tuning is required to compensate for the frequency shift and increase the bandwidth. The input and output coupling hole dimensions are carefully designed to achieve high output power along with a broad bandwidth. The effect of surface roughness of the metallic material on the output power has been investigated, and it is proposed that lower surface roughness leads to higher output power. The focusing magnetic field is also optimized to 0.75 T in order to maintain the beam transportation and achieve high output power. With 16.5 kV operating voltage and 0.30 A beam current, the output power of 360 W, the efficiency of 7.27%, the gain of 38.6 dB, and the 3 dB bandwidth of 500 MHz are predicted. The output properties of the EIK show great stability with the effective suppression of oscillation and mode competition. Moreover, small-signal theory analysis and 1D code AJDISK calculations are carried out to verify the results of 3D PIC simulations. A close agreement among the three methods proves the relative validity and the reliability of the designed EIK. Thus, it is indicated that the EIK with unequal-length slots has potential for power improvement and bandwidth extension.
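The operating figures quoted in the abstract above are mutually consistent, which a short calculation makes explicit. The sketch below assumes the usual definitions (efficiency as output power over DC beam power, gain as the output-to-input power ratio in decibels); the implied drive power is a derived quantity, not a value stated in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
beam_voltage_v = 16.5e3   # operating voltage, 16.5 kV
beam_current_a = 0.30     # beam current, 0.30 A
output_power_w = 360.0    # predicted output power
gain_db = 38.6            # predicted gain

# DC beam power and electronic efficiency (assumed definition: P_out / (V * I)).
beam_power_w = beam_voltage_v * beam_current_a
efficiency_percent = 100.0 * output_power_w / beam_power_w

# Drive power implied by the gain (assumed definition: 10 * log10(P_out / P_in)).
input_power_w = output_power_w / 10 ** (gain_db / 10.0)

print(f"beam power  = {beam_power_w:.0f} W")
print(f"efficiency  = {efficiency_percent:.2f} %")
print(f"drive power = {input_power_w * 1e3:.1f} mW")
```

The beam power works out to 4950 W, so 360 W of output corresponds to the quoted 7.27% efficiency, and a 38.6 dB gain implies a drive power of roughly 50 mW.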
Tradeoffs in the design of a system for high level language interpretation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osorio, F.C.C.; Patt, Y.N.
The problem of designing a system for high-level language interpretation (HLLI) is considered. First, a model of the design process is presented where several styles of design, e.g. Turing machine interpretation, CISC architecture interpretation and RISC architecture interpretation, are treated uniformly. Second, the most significant characteristics of HLLI are analysed in the context of different design styles, and some guidelines are presented on how to identify the most suitable design style for a given high-level language problem. 12 references.
HgCdTe avalanche photodiodes: A review
NASA Astrophysics Data System (ADS)
Singh, Anand; Srivastav, Vanya; Pal, Ravinder
2011-10-01
This paper presents a comprehensive review of fundamental issues, device architectures, technology development and applications of HgCdTe based avalanche photodiodes (APD). High gain above 5×10^3, a low excess noise factor close to unity, a THz gain-bandwidth product, and fast response in the picosecond range have been achieved by electron-initiated avalanche multiplication for SWIR, MWIR, and LWIR detector applications involving low optical signals. Detector arrays with good element-to-element uniformity have been fabricated, paving the way for fabrication of HgCdTe-APD FPAs.
An adaptive vector quantization scheme
NASA Technical Reports Server (NTRS)
Cheung, K.-M.
1990-01-01
Vector quantization is known to be an effective compression scheme to achieve a low bit rate so as to minimize communication channel bandwidth and also to reduce digital memory storage while maintaining the necessary fidelity of the data. However, the large number of computations required in vector quantizers has been a handicap in using vector quantization for low-rate source coding. An adaptive vector quantization algorithm is introduced that is inherently suitable for simple hardware implementation because it has a simple architecture. It allows fast encoding and decoding because it requires only addition and subtraction operations.
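To make the encode/decode step of vector quantization concrete, here is a minimal sketch of a nearest-codeword encoder. It is not the adaptive algorithm of the paper, whose details the abstract does not give; it uses the sum of absolute differences as the distortion measure because that needs only additions, subtractions and comparisons, which is in the spirit of the abstract. The codebook values and function names are hypothetical.

```python
def encode(vector, codebook):
    """Return the index of the codeword with the smallest L1 distance."""
    best_index, best_dist = 0, None
    for index, codeword in enumerate(codebook):
        # Sum of absolute differences: additions and subtractions only.
        dist = sum(abs(a - b) for a, b in zip(vector, codeword))
        if best_dist is None or dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def decode(index, codebook):
    """Reconstruct the vector by looking the index up in the codebook."""
    return codebook[index]

# Hypothetical 2-D codebook with three codewords.
codebook = [[0, 0], [10, 10], [0, 10]]
index = encode([9, 8], codebook)
reconstruction = decode(index, codebook)
print(index, reconstruction)
```

Only the index is transmitted or stored, which is where the bit-rate saving comes from: three codewords need two bits per vector regardless of the precision of the original samples.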
Execution of parallel algorithms on a heterogeneous multicomputer
NASA Astrophysics Data System (ADS)
Isenstein, Barry S.; Greene, Jonathon
1995-04-01
Many aerospace/defense sensing and dual-use applications require high-performance computing, extensive high-bandwidth interconnect and realtime deterministic operation. This paper will describe the architecture of a scalable multicomputer that includes DSP and RISC processors. A single chassis implementation is capable of delivering in excess of 10 GFLOPS of DSP processing power with 2 Gbytes/s of realtime sensor I/O. A software approach to implementing parallel algorithms called the Parallel Application System (PAS) is also presented. An example of applying PAS to a DSP application is shown.
2014-08-01
search required for SPH are described in Sect. 3. Section 4 contains a performance analysis of the algorithm using Kepler-type GPU cards. 2. Numerical...generation of Kepler architecture, code name GK104, which is also implemented in Tesla K10. The Kepler architecture relies on a Graphics Processing Cluster (GPC...latter is 512 KB large and has a bandwidth of 512 B/clock cycle. Constant memory (read only per grid): 48 KB per Kepler SM. Used to hold constants
Requirements and Usage of NVM in Advanced Onboard Data Processing Systems
NASA Technical Reports Server (NTRS)
Some, R.
2001-01-01
This viewgraph presentation gives an overview of the requirements and uses of non-volatile memory (NVM) in advanced onboard data processing systems. Supercomputing in space presents the only viable approach to the bandwidth problem (can't get data down to Earth), controlling constellations of cooperating satellites, reducing mission operating costs, and real-time intelligent decision making and science data gathering. Details are given on the REE vision and impact on NASA and Department of Defense missions, objectives of REE, baseline architecture, and issues. NVM uses and requirements are listed.
Design and Implementation of the PALM-3000 Real-Time Control System
NASA Technical Reports Server (NTRS)
Truong, Tuan N.; Bouchez, Antonin H.; Burruss, Rick S.; Dekany, Richard G.; Guiwits, Stephen R.; Roberts, Jennifer E.; Shelton, Jean C.; Troy, Mitchell
2012-01-01
This paper reflects, from a computational perspective, on the experience gathered in designing and implementing realtime control of the PALM-3000 adaptive optics system currently in operation at the Palomar Observatory. We review the algorithms that serve as functional requirements driving the architecture developed, and describe key design issues and solutions that contributed to the system's low compute-latency. Additionally, we describe an implementation of dense matrix-vector multiplication for wavefront reconstruction that exceeds 95% of the maximum sustained achievable bandwidth on an NVIDIA GeForce 8800 GTX GPU.
Peer-to-peer architectures for exascale computing : LDRD final report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vorobeychik, Yevgeniy; Mayo, Jackson R.; Minnich, Ronald G.
2010-09-01
The goal of this research was to investigate the potential for employing dynamic, decentralized software architectures to achieve reliability in future high-performance computing platforms. These architectures, inspired by peer-to-peer networks such as botnets that already scale to millions of unreliable nodes, hold promise for enabling scientific applications to run usefully on next-generation exascale platforms (~10^18 operations per second). Traditional parallel programming techniques suffer rapid deterioration of performance scaling with growing platform size, as the work of coping with increasingly frequent failures dominates over useful computation. Our studies suggest that new architectures, in which failures are treated as ubiquitous and their effects are considered as simply another controllable source of error in a scientific computation, can remove such obstacles to exascale computing for certain applications. We have developed a simulation framework, as well as a preliminary implementation in a large-scale emulation environment, for exploration of these 'fault-oblivious computing' approaches. High-performance computing (HPC) faces a fundamental problem of increasing total component failure rates due to increasing system sizes, which threaten to degrade system reliability to an unusable level by the time the exascale range is reached (~10^18 operations per second, requiring of order millions of processors). As computer scientists seek a way to scale system software for next-generation exascale machines, it is worth considering peer-to-peer (P2P) architectures that are already capable of supporting 10^6-10^7 unreliable nodes. Exascale platforms will require a different way of looking at systems and software because the machine will likely not be available in its entirety for a meaningful execution time.
Realistic estimates of failure rates range from a few times per day to more than once per hour for these platforms. P2P architectures give us a starting point for crafting applications and system software for exascale. In the context of the Internet, P2P applications (e.g., file sharing, botnets) have already solved this problem for 10^6-10^7 nodes. Usually based on a fractal distributed hash table structure, these systems have proven robust in practice to constant and unpredictable outages, failures, and even subversion. For example, a recent estimate of botnet turnover (i.e., the number of machines leaving and joining) is about 11% per week. Nonetheless, P2P networks remain effective despite these failures: The Conficker botnet has grown to ~5 x 10^6 peers. Unlike today's system software and applications, those for next-generation exascale machines cannot assume a static structure and, to be scalable over millions of nodes, must be decentralized. P2P architectures achieve both, and provide a promising model for 'fault-oblivious computing'. This project aimed to study the dynamics of P2P networks in the context of a design for exascale systems and applications. Having no single point of failure, the most successful P2P architectures are adaptive and self-organizing. While there has been some previous work applying P2P to message passing, little attention has been previously paid to the tightly coupled exascale domain. Typically, the per-node footprint of P2P systems is small, making them ideal for HPC use. The implementation on each peer node cooperates en masse to 'heal' disruptions rather than relying on a controlling 'master' node. Understanding this cooperative behavior from a complex systems viewpoint is essential to predicting useful environments for the inextricably unreliable exascale platforms of the future.
We sought to obtain theoretical insight into the stability and large-scale behavior of candidate architectures, and to work toward leveraging Sandia's Emulytics platform to test promising candidates in a realistic (ultimately ≥ 10^7 nodes) setting. Our primary example applications are drawn from linear algebra: a Jacobi relaxation solver for the heat equation, and the closely related technique of value iteration in optimization. We aimed to apply P2P concepts in designing implementations capable of surviving an unreliable machine of 10^6 nodes.
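The idea of treating failures as one more controllable source of error can be illustrated with the Jacobi example the abstract names. The following is a minimal sketch, not the report's implementation: the 1-D grid, the boundary values, the function name and the per-sweep drop probability are all assumptions chosen for illustration. Each grid point plays the role of a node that occasionally fails to update and keeps its stale value.

```python
import random

def jacobi_with_dropped_updates(n=32, sweeps=5000, drop_prob=0.1, seed=0):
    """Relax the steady 1-D heat equation u'' = 0 with u(0) = 0, u(1) = 1.

    Every interior point stands in for one node. In each sweep a node skips
    its update with probability drop_prob (a simulated failure) and keeps
    its old value. Returns the maximum error against the exact solution.
    """
    rng = random.Random(seed)
    u = [0.0] * (n + 1)
    u[n] = 1.0  # fixed boundary values
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, n):
            if rng.random() < drop_prob:
                continue  # failed node: stale value is carried forward
            new[i] = 0.5 * (u[i - 1] + u[i + 1])  # Jacobi average of neighbours
        u = new
    # The exact steady state is the straight line u(x) = x.
    return max(abs(u[i] - i / n) for i in range(n + 1))

if __name__ == "__main__":
    print("no failures :", jacobi_with_dropped_updates(drop_prob=0.0))
    print("10% failures:", jacobi_with_dropped_updates(drop_prob=0.1))
```

In this toy setting dropped updates slow convergence but do not prevent it, because each update is an average of neighbouring values and a skipped update leaves the value unchanged, so the maximum error never grows. Whether that carries over to a tightly coupled machine is the question the report investigates.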
NASA Astrophysics Data System (ADS)
Siddiqui, Aleem; Reinke, Charles; Shin, Heedeuk; Jarecki, Robert L.; Starbuck, Andrew L.; Rakich, Peter
2017-05-01
The performance of electronic systems for radio-frequency (RF) spectrum analysis is critical for agile radar and communications systems, ISR (intelligence, surveillance, and reconnaissance) operations in challenging electromagnetic (EM) environments, and EM-environment situational awareness. While considerable progress has been made in size, weight, and power (SWaP) and performance metrics in conventional RF technology platforms, fundamental limits make continued improvements increasingly difficult. Alternatively, we propose employing cascaded transduction processes in a chip-scale nano-optomechanical system (NOMS) to achieve a spectral sensor with exceptional signal-linearity, high dynamic range, narrow spectral resolution and ultra-fast sweep times. By leveraging the optimal capabilities of photons and phonons, the system we pursue in this work has performance metrics scalable well beyond the fundamental limitations inherent to all electronic systems. In our device architecture, information processing is performed on wide-bandwidth RF-modulated optical signals by photon-mediated phononic transduction of the modulation to the acoustical-domain for narrow-band filtering, and then back to the optical-domain by phonon-mediated phase modulation (the reverse process). Here, we rely on photonics to efficiently distribute signals for parallel processing, and on phononics for effective and flexible RF-frequency manipulation. This technology is used to create RF-filters that are insensitive to the optical wavelength, with wide center frequency bandwidth selectivity (1-100GHz), ultra-narrow filter bandwidth (1-100MHz), and high dynamic range (70dB), which we will present. Additionally, using this filter as a building block, we will discuss current results and progress toward demonstrating a multichannel-filter with a bandwidth of < 10MHz per channel, while minimizing cumulative optical/acoustic/optical transduced insertion-loss to ideally < 10dB. 
These proposed metrics represent significant improvements over RF-platforms.
NASA Technical Reports Server (NTRS)
Schoenwald, Adam J.; Bradley, Damon C.; Mohammed, Priscilla N.; Piepmeier, Jeffrey R.; Wong, Mark
2016-01-01
Radio-frequency interference (RFI) is a known problem for passive remote sensing as evidenced in the L-band radiometers SMOS, Aquarius and more recently, SMAP. Various algorithms have been developed and implemented on SMAP to improve science measurements. This was achieved by the use of a digital microwave radiometer. RFI mitigation becomes more challenging for microwave radiometers operating at higher frequencies in shared allocations. At higher frequencies larger bandwidths are also desirable for lower measurement noise, further adding to processing challenges. This work focuses on finding improved RFI mitigation techniques that will be effective at additional frequencies and at higher bandwidths. To aid the development and testing of applicable detection and mitigation techniques, a wide-band RFI algorithm testing environment has been developed using the Reconfigurable Open Architecture Computing Hardware System (ROACH) built by the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER) Group. The testing environment also consists of various test equipment used to reproduce typical signals that a radiometer may see, including those with and without RFI. The testing environment permits quick evaluations of RFI mitigation algorithms and shows that they are implementable in hardware. The algorithm implemented is a complex signal kurtosis detector which was modeled and simulated. The complex signal kurtosis detector showed improved performance over the real kurtosis detector under certain conditions. The real kurtosis is implemented on SMAP at 24 MHz bandwidth. The complex signal kurtosis algorithm was then implemented in hardware at 200 MHz bandwidth using the ROACH. In this work, performance of the complex signal kurtosis and the real signal kurtosis are compared.
Performance evaluations and comparisons in both simulation as well as experimental hardware implementations were done with the use of receiver operating characteristic (ROC) curves.
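A real-signal kurtosis test of the kind mentioned above can be sketched in a few lines. This is an illustrative sketch only: it covers the real kurtosis, not the complex signal kurtosis detector the work evaluates, and the tolerance, sample count and interfering tone are assumed values, not SMAP or ROACH parameters. It relies on the standard fact that the ratio m4 / m2^2 equals 3 for Gaussian noise, so a departure from 3 suggests a non-Gaussian component such as RFI.

```python
import math
import random

def kurtosis(samples):
    """Ratio m4 / m2**2 of central moments; equals 3 for Gaussian noise."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2)

def flags_rfi(samples, tolerance=0.3):
    """Flag the block when kurtosis departs from 3 by more than tolerance.

    The tolerance is a hypothetical value picked for this example.
    """
    return abs(kurtosis(samples) - 3.0) > tolerance

rng = random.Random(1)
N = 20000
# Clean radiometer-like input: zero-mean Gaussian noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]
# Hypothetical interferer: a continuous-wave tone added to the same noise.
contaminated = [x + 2.0 * math.sin(0.37 * k) for k, x in enumerate(noise)]

if __name__ == "__main__":
    print("clean        kurtosis:", kurtosis(noise), flags_rfi(noise))
    print("contaminated kurtosis:", kurtosis(contaminated), flags_rfi(contaminated))
```

Sweeping the tolerance and recording detection and false-alarm rates for each setting is one way to build the ROC curves the abstract uses for comparison.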
NASA Astrophysics Data System (ADS)
Trauger, John T.; Moody, D. C.
2010-05-01
Among the leading architectures for the imaging and spectroscopy of nearby exoplanetary systems is the space coronagraph, which provides in principle very high (10 billion to one) suppression of diffracted and scattered starlight at very small separations (a few tenths of arcseconds) from the star. The concept of a band-limited Lyot coronagraph, introduced by Kuchner and Traub (2002), provides the theoretical basis for mathematically perfect starlight suppression. In practice, the optical characteristics of available materials and practical aspects of the fabrication processes impose limitations on contrast and spectral bandwidths that are achievable in the real world. Nevertheless, the band-limited Lyot coronagraph approach has produced the best laboratory validated performance among known types of internal coronagraph for contrast and spectral bandwidth, and alone it has demonstrated high-contrast imaging performance at levels required for exoplanet exploration. We report the design and fabrication of hybrid focal-plane masks for Lyot coronagraphy, composed of thickness-profiled metallic and dielectric thin films, vacuum deposited on a glass substrate. These masks are in principle band-limited in both the real and imaginary parts of the complex amplitude characteristics. Together with a deformable mirror for control of wavefront phase, these masks have the potential for contrast performance better than 10^-9 at inner working angles of 3 lambda/D or better over spectral bandwidths of 20% or more, and with throughput efficiencies up to 60%. We report recent laboratory demonstrations of high contrast with nickel-dielectric masks, including the demonstration of 2x10^-9 contrast with a 3 lambda/D inner working angle over 20% spectral bandwidths.
Yang, Hui; Zhang, Jie; Zhao, Yongli; Ji, Yuefeng; Wu, Jialin; Lin, Yi; Han, Jianrui; Lee, Young
2015-05-18
Inter-data center interconnect with IP over elastic optical network (EON) is a promising scenario to meet the high burstiness and high-bandwidth requirements of data center services. In our previous work, we implemented multi-stratum resources integration among IP networks, optical networks and application stratum resources to accommodate data center services. Building on this, the present study considers service resilience in case of edge optical node failure. We propose a novel multi-stratum resources integrated resilience (MSRIR) architecture for the services in software defined inter-data center interconnect based on IP over EON. A global resources integrated resilience (GRIR) algorithm is introduced based on the proposed architecture. The MSRIR can enable cross stratum optimization, provide resilience using the resources of multiple strata, and enhance the responsiveness of data center service resilience to dynamic end-to-end service demands. The overall feasibility and efficiency of the proposed architecture are experimentally verified on the control plane of our OpenFlow-based enhanced SDN (eSDN) testbed. The performance of the GRIR algorithm under a heavy traffic load scenario is also quantitatively evaluated based on the MSRIR architecture in terms of path blocking probability, resilience latency and resource utilization, compared with other resilience algorithms.
Pseudo Asynchronous Level Crossing ADC for ECG Signal Acquisition.
Marisa, T; Niederhauser, T; Haeberlin, A; Wildhaber, R A; Vogel, R; Goette, J; Jacomet, M
2017-02-07
A new pseudo asynchronous level crossing analogue-to-digital converter (ADC) architecture targeted for low-power, implantable, long-term biomedical sensing applications is presented. In contrast to most of the existing asynchronous level crossing ADC designs, the proposed design has no digital-to-analogue converter (DAC) and no continuous time comparators. Instead, the proposed architecture uses an analogue memory cell and dynamic comparators. The architecture retains the signal activity dependent sampling operation by generating events only when the input signal is changing. The architecture offers the advantages of smaller chip area, energy saving and fewer analogue system components. Besides lower energy consumption, the use of dynamic comparators results in a more robust performance in noise conditions. Moreover, dynamic comparators make interfacing the asynchronous level crossing system to synchronous processing blocks simpler. The proposed ADC was implemented in [Formula: see text] complementary metal-oxide-semiconductor (CMOS) technology; the hardware occupies a chip area of 0.0372 mm^2 and operates from a supply voltage of [Formula: see text] to [Formula: see text]. The ADC's power consumption is as low as 0.6 μW with signal bandwidth from [Formula: see text] to [Formula: see text] and achieves an equivalent number of bits (ENOB) of up to 8 bits.
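The activity-dependent sampling described above can be illustrated in software. The sketch below is a behavioural model only, under stated assumptions: it takes an already sampled input, uses a uniform level spacing `step`, and all names are hypothetical. It says nothing about the analogue memory cell or dynamic comparators of the actual circuit. An event is produced only when the input moves one full level away from the last reported level, so a quiet input produces no output.

```python
import math

def level_crossing_events(signal, step):
    """Return (sample_index, level_value) events for a sampled input.

    The tracker remembers the last reported level and emits an event each
    time the input reaches the next level above or below it. Nothing is
    emitted while the input stays within one step of the last level.
    """
    events = []
    level = round(signal[0] / step)  # index of the last reported level
    for k, x in enumerate(signal):
        while x >= (level + 1) * step:  # input rose past the next level
            level += 1
            events.append((k, level * step))
        while x <= (level - 1) * step:  # input fell past the next level
            level -= 1
            events.append((k, level * step))
    return events

# Hypothetical test input: a quiet baseline followed by one sine burst.
quiet = [0.0] * 50
burst = [math.sin(2.0 * math.pi * k / 40.0) for k in range(40)]
events = level_crossing_events(quiet + burst, step=0.2)

if __name__ == "__main__":
    print(len(quiet) + len(burst), "input samples ->", len(events), "events")
```

All events fall inside the burst, which is the property that lets a level-crossing converter save energy on sparse signals such as an ECG trace.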
Yang, Hui; Zhang, Jie; Ji, Yuefeng; Tian, Rui; Han, Jianrui; Lee, Young
2015-11-30
Data center interconnect with elastic optical network is a promising scenario to meet the high burstiness and high-bandwidth requirements of data center services. In our previous work, we implemented multi-stratum resilience between IP and elastic optical networks to accommodate data center services. Building on this, the present study considers resource integration that breaks the limits of individual network devices, which can enhance resource utilization. We propose a novel multi-stratum resources integration (MSRI) architecture based on network function virtualization in software defined elastic data center optical interconnect. A resource integrated mapping (RIM) scheme for MSRI is introduced in the proposed architecture. The MSRI can accommodate the data center services with resources integration when a single function or resource is too scarce to provision the services, and enhance globally integrated optimization of optical network and application resources. The overall feasibility and efficiency of the proposed architecture are experimentally verified on the control plane of an OpenFlow-based enhanced software defined networking (eSDN) testbed. The performance of the RIM scheme under a heavy traffic load scenario is also quantitatively evaluated based on the MSRI architecture in terms of path blocking probability, provisioning latency and resource utilization, compared with other provisioning schemes.
MEDIC: medical embedded device for individualized care.
Wu, Winston H; Bui, Alex A T; Batalin, Maxim A; Au, Lawrence K; Binney, Jonathan D; Kaiser, William J
2008-02-01
The presented work highlights the development and initial validation of a medical embedded device for individualized care (MEDIC), which is based on a novel software architecture, enabling sensor management and disease prediction capabilities, and on commercially available microelectronic components, sensors and a conventional personal digital assistant (PDA) or cell phone. In this paper, we present a general architecture for a wearable sensor system that can be customized to an individual patient's needs. This architecture is based on embedded artificial intelligence that permits autonomous operation, sensor management and inference, and may be applied to general-purpose wearable medical diagnostics. A prototype of the system has been developed based on a standard PDA and wireless sensor nodes equipped with commercially available Bluetooth radio components, permitting real-time streaming of high-bandwidth data from various physiological and contextual sensors. We also present the results of abnormal gait diagnosis using the complete system from our evaluation, and illustrate how the wearable system and its operation can be remotely configured and managed by either enterprise systems or medical personnel at centralized locations. By using commercially available hardware components and the software architecture presented in this paper, the MEDIC system can be rapidly configured, providing medical researchers with broadband sensor data from remote patients and platform access to best adapt operation to diagnostic objectives.
Architectures and Design for Next-Generation Hybrid Circuit/Packet Networks
NASA Astrophysics Data System (ADS)
Vadrevu, Sree Krishna Chaitanya
Internet traffic is increasing rapidly at an annual growth rate of 35% with aggregate traffic exceeding several exabytes per month. The traffic is also becoming heterogeneous in bandwidth and quality-of-service (QoS) requirements with growing popularity of cloud computing, video-on-demand (VoD), e-science, etc. Hybrid circuit/packet networks which can jointly support circuit and packet services along with the adoption of high-bit-rate transmission systems form an attractive solution to address the traffic growth. 10 Gbps and 40 Gbps transmission systems are widely deployed in telecom backbone networks such as Comcast, AT&T, etc., and network operators are considering migration to 100 Gbps and beyond. This dissertation proposes robust architectures, capacity migration strategies, and novel service frameworks for next-generation hybrid circuit/packet architectures. In this dissertation, we study two types of hybrid circuit/packet networks: a) IP-over-WDM networks, in which the packet (IP) network is overlaid on top of the circuit (optical WDM) network and b) Hybrid networks in which the circuit and packet networks are deployed side by side such as US DoE's ESnet. We investigate techniques to dynamically migrate capacity between the circuit and packet sections by exploiting traffic variations over a day, and our methods show that significant bandwidth savings can be obtained with improved reliability of services. Specifically, we investigate how idle backup circuit capacity can be used to support packet services in IP-over-WDM networks, and similarly, excess capacity in packet network to support circuit services in ESnet. Control schemes that enable our mechanisms are also discussed. In IP-over-WDM networks, with upcoming 100 Gbps and beyond, dedicated protection will induce significant under-utilization of backup resources. We investigate design strategies to loan idle circuit backup capacity to support IP/packet services.
However, failure of backup circuits will preempt IP services routed over them, and thus it is important to ensure IP topology survivability to successfully re-route preempted IP services. Integer-linear-program (ILP) and heuristic solutions have been developed and network cost reduction up to 60% has been observed. In ESnet, we study loaning packet links to support circuit services. Mixed-line-rate (MLR) networks supporting 10/40/100 Gbps on the same fiber are becoming increasingly popular. Services that accept degradation in bandwidth, latency, jitter, etc. under failure scenarios for lower cost are known as degraded services. We study degradation in bandwidth for lower cost under failure scenarios, a concept called partial protection, in the context of MLR networks. We notice partial protection enables significant cost savings compared to full protection. To cope with traffic growth, network operators need to deploy equipment at periodic time intervals, and this is known as the multi-period planning and upgrade problem. We study three important multi-period planning approaches, namely incremental planning, all-period planning, and two-period planning with mixed line rates. Our approaches predict the network equipment that needs to be deployed optimally at which nodes and at which time periods in the network to meet QoS requirements.
Zhang, Fangzheng; Guo, Qingshui; Pan, Shilong
2017-10-23
Real-time and high-resolution target detection is highly desirable in modern radar applications. Electronic techniques have encountered grave difficulties in the development of such radars, which strictly rely on a large instantaneous bandwidth. In this article, a photonics-based real-time high-range-resolution radar is proposed with optical generation and processing of broadband linear frequency modulation (LFM) signals. A broadband LFM signal is generated in the transmitter by photonic frequency quadrupling, and the received echo is de-chirped to a low frequency signal by photonic frequency mixing. The system can operate at a high frequency and a large bandwidth while enabling real-time processing by low-speed analog-to-digital conversion and digital signal processing. A conceptual radar is established. Real-time processing of an 8-GHz LFM signal is achieved with a sampling rate of 500 MSa/s. Accurate distance measurement is implemented with a maximum error of 4 mm within a range of ~3.5 meters. Detection of two targets is demonstrated with a range-resolution as high as 1.875 cm. We believe the proposed radar architecture is a reliable solution to overcome the limitations of current radar on operation bandwidth and processing speed, and we hope it will be used in future radars for real-time and high-resolution target detection and imaging.
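The quoted range resolution follows from the standard relation between bandwidth and resolution for a pulse-compression radar, delta_R = c / (2 B). The short check below is a sketch that assumes this textbook relation and a speed of light rounded to 3 x 10^8 m/s; the function name is hypothetical. With B = 8 GHz it gives 0.01875 m, consistent with the 1.875 cm stated in the abstract.

```python
SPEED_OF_LIGHT_M_S = 3.0e8  # rounded value, assumed for this check

def range_resolution_m(bandwidth_hz):
    """Theoretical range resolution c / (2 B) of an LFM radar, in metres."""
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)

if __name__ == "__main__":
    resolution_cm = range_resolution_m(8e9) * 100.0
    print(f"8 GHz bandwidth -> {resolution_cm:.3f} cm range resolution")
```

The same relation shows why a large instantaneous bandwidth matters: halving the bandwidth to 4 GHz would double the resolvable target separation to 3.75 cm.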
Frequency-Offset Cartesian Feedback Based on Polyphase Difference Amplifiers
Zanchi, Marta G.; Pauly, John M.; Scott, Greig C.
2010-01-01
A modified Cartesian feedback method called “frequency-offset Cartesian feedback” and based on polyphase difference amplifiers is described that significantly reduces the problems associated with quadrature errors and DC-offsets in classic Cartesian feedback power amplifier control systems. In this method, the reference input and feedback signals are down-converted and compared at a low intermediate frequency (IF) instead of at DC. The polyphase difference amplifiers create a complex control bandwidth centered at this low IF, which is typically offset from DC by 200–1500 kHz. Consequently, the loop gain peak does not overlap DC where voltage offsets, drift, and local oscillator leakage create errors. Moreover, quadrature mismatch errors are significantly attenuated in the control bandwidth. Since the polyphase amplifiers selectively amplify the complex signals characterized by a +90° phase relationship representing positive frequency signals, the control system operates somewhat like single sideband (SSB) modulation. However, the approach still allows the same modulation bandwidth control as classic Cartesian feedback. In this paper, the behavior of the polyphase difference amplifier is described through both the results of simulations, based on a theoretical analysis of their architecture, and experiments. We then describe our first printed circuit board prototype of a frequency-offset Cartesian feedback transmitter and its performance in open and closed loop configuration. This approach should be especially useful in magnetic resonance imaging transmit array systems. PMID:20814450
Ultra-Compact Transputer-Based Controller for High-Level, Multi-Axis Coordination
NASA Technical Reports Server (NTRS)
Zenowich, Brian; Crowell, Adam; Townsend, William T.
2013-01-01
The design of machines that rely on arrays of servomotors such as robotic arms, orbital platforms, and combinations of both, imposes a heavy computational burden to coordinate their actions to perform coherent tasks. For example, the robotic equivalent of a person tracing a straight line in space requires enormously complex kinematics calculations, and complexity increases with the number of servo nodes. A new high-level architecture for coordinated servo-machine control enables a practical, distributed transputer alternative to conventional central processor electronics. The solution is inherently scalable, dramatically reduces bulkiness and number of conductor runs throughout the machine, requires only a fraction of the power, and is designed for cooling in a vacuum.
Job Scheduling in a Heterogeneous Grid Environment
NASA Technical Reports Server (NTRS)
Shan, Hong-Zhang; Smith, Warren; Oliker, Leonid; Biswas, Rupak
2004-01-01
Computational grids have the potential for solving large-scale scientific problems using heterogeneous and geographically distributed resources. However, a number of major technical hurdles must be overcome before this potential can be realized. One problem that is critical to effective utilization of computational grids is the efficient scheduling of jobs. This work addresses this problem by describing and evaluating a grid scheduling architecture and three job migration algorithms. The architecture is scalable and does not assume control of local site resources. The job migration policies use the availability and performance of computer systems, the network bandwidth available between systems, and the volume of input and output data associated with each job. An extensive performance comparison is presented using real workloads from leading computational centers. The results, based on several key metrics, demonstrate that the performance of our distributed migration algorithms is significantly greater than that of a local scheduling framework and comparable to a non-scalable global scheduling approach.
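The inputs the abstract lists for the migration policies (system availability and performance, network bandwidth between systems, and the volume of input and output data) can be combined into a simple turnaround estimate. The sketch below is purely illustrative and is not one of the three algorithms evaluated in the paper: the cost model, the field names and every number in it are assumptions.

```python
def estimated_turnaround_s(site, job, home_site):
    """Estimated seconds until the job's results are available at home_site.

    Cost model (assumed): queue wait + run time scaled by site speed
    + time to move input and output data over the link to the home site.
    """
    run_s = job["work_s"] / site["relative_speed"]
    if site["name"] == home_site:
        transfer_s = 0.0  # data is already local, nothing to move
    else:
        data_bytes = job["input_bytes"] + job["output_bytes"]
        transfer_s = data_bytes / site["bytes_per_s_to_home"]
    return site["queue_wait_s"] + run_s + transfer_s

def choose_site(sites, job, home_site):
    """Pick the site with the smallest estimated turnaround."""
    return min(sites, key=lambda s: estimated_turnaround_s(s, job, home_site))

# Hypothetical grid: a busy home site and two remote candidates.
sites = [
    {"name": "home", "queue_wait_s": 7200.0, "relative_speed": 1.0,
     "bytes_per_s_to_home": None},
    {"name": "fast_remote", "queue_wait_s": 600.0, "relative_speed": 2.0,
     "bytes_per_s_to_home": 10e6},
    {"name": "thin_link", "queue_wait_s": 0.0, "relative_speed": 1.0,
     "bytes_per_s_to_home": 1e6},
]
job = {"work_s": 3600.0, "input_bytes": 8e9, "output_bytes": 2e9}
best = choose_site(sites, job, "home")

if __name__ == "__main__":
    for s in sites:
        print(s["name"], estimated_turnaround_s(s, job, "home"))
    print("migrate to:", best["name"])
```

In this made-up case the idle site behind a slow link loses to a busier site with a faster link, which is the kind of trade-off that makes data volume and bandwidth relevant to a migration decision.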
MEMS-Based Communications Systems for Space-Based Applications
NASA Technical Reports Server (NTRS)
DeLosSantos, Hector J.; Brunner, Robert A.; Lam, Juan F.; Hackett, Le Roy H.; Lohr, Ross F., Jr.; Larson, Lawrence E.; Loo, Robert Y.; Matloubian, Mehran; Tangonan, Gregory L.
1995-01-01
As user demand for higher capacity and flexibility in communications satellites increases, new ways to cope with the inherent limitations posed by the prohibitive mass and power consumption, needed to satisfy those requirements, are under investigation. Recent studies suggest that while new satellite architectures are necessary to enable multi-user, multi-data rate, multi-location satellite links, these new architectures will inevitably increase power consumption, and in turn, spacecraft mass, to such an extent that their successful implementation will demand novel lightweight/low power hardware approaches. In this paper, following a brief introduction to the fundamentals of communications satellites, we address the impact of micro-electro-mechanical systems (MEMS) technology, in particular micro-electro-mechanical (MEM) switches to mitigate the above mentioned problems and show that low-loss/wide bandwidth MEM switches will go a long way towards enabling higher capacity and flexibility space-based communications systems.
Efficient image data distribution and management with application to web caching architectures
NASA Astrophysics Data System (ADS)
Han, Keesook J.; Suter, Bruce W.
2003-03-01
We present compact image data structures and associated packet delivery techniques for effective Web caching architectures. Presently, images on a web page are inefficiently stored, using a single image per file. Our approach is to use clustering to merge similar images into a single file in order to exploit the redundancy between images. Our studies indicate that a 30-50% image data size reduction can be achieved by eliminating the redundancies of color indexes. Attached to this file is new metadata to permit an easy extraction of images. This approach will permit a more efficient use of the cache, since a shorter list of cache references will be required. Packet and transmission delays can be reduced by 50% by eliminating redundant TCP/IP headers and connection time. Thus, this innovative paradigm for the elimination of redundancy may provide valuable benefits for optimizing packet delivery in IP networks by reducing latency and minimizing the bandwidth requirements.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deslippe, Jack; da Jornada, Felipe H.; Vigil-Fowler, Derek
2016-10-06
We profile and optimize calculations performed with the BerkeleyGW code on the Xeon-Phi architecture. BerkeleyGW depends both on hand-tuned critical kernels as well as on BLAS and FFT libraries. We describe the optimization process and performance improvements achieved. We discuss a layered parallelization strategy to take advantage of vector, thread and node-level parallelism. We discuss locality changes (including the consequence of the lack of L3 cache) and effective use of the on-package high-bandwidth memory. We show preliminary results on Knights-Landing including a roofline study of code performance before and after a number of optimizations. We find that the GW method is particularly well-suited for many-core architectures due to the ability to exploit a large amount of parallelism over plane-wave components, band-pairs, and frequencies.
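A roofline study of the kind mentioned above bounds a kernel's attainable performance by the smaller of the peak compute rate and the product of arithmetic intensity and memory bandwidth. The sketch below shows that bound only; the peak and bandwidth figures are placeholders invented for the example, not Knights Landing measurements, and the function name is hypothetical.

```python
def attainable_gflops(arith_intensity, peak_gflops, bandwidth_gb_s):
    """Roofline bound: min(peak compute, arithmetic intensity * bandwidth).

    arith_intensity is in floating-point operations per byte moved.
    """
    return min(peak_gflops, arith_intensity * bandwidth_gb_s)

# Placeholder machine parameters (assumed, for illustration only).
PEAK_GFLOPS = 2000.0
BANDWIDTH_GB_S = 400.0

if __name__ == "__main__":
    for intensity in (0.5, 2.0, 5.0, 10.0):
        bound = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GB_S)
        kind = "memory-bound" if bound < PEAK_GFLOPS else "compute-bound"
        print(f"{intensity:5.1f} flop/byte -> {bound:7.1f} GFLOP/s ({kind})")
```

With these placeholder numbers the ridge point sits at 5 flop/byte. Kernels below it gain from better locality or faster memory such as on-package high-bandwidth memory, while kernels above it gain from vectorization and threading.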
Supporting shared data structures on distributed memory architectures
NASA Technical Reports Server (NTRS)
Koelbel, Charles; Mehrotra, Piyush; Vanrosendale, John
1990-01-01
Programming nonshared memory systems is more difficult than programming shared memory systems, since there is no support for shared data structures. Current programming languages for distributed memory architectures force the user to decompose all data structures into separate pieces, with each piece owned by one of the processors in the machine, and with all communication explicitly specified by low-level message-passing primitives. A new programming environment is presented for distributed memory architectures, providing a global name space and allowing direct access to remote parts of data values. The analysis and program transformations required to implement this environment are described, and the efficiency of the resulting code on the NCUBE/7 and iPSC/2 hypercubes is reported.
Data flow language and interpreter for a reconfigurable distributed data processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hurt, A.D.; Heath, J.R.
1982-01-01
An analytic language and an interpreter whereby an application's data flow graph may serve as an input to a reconfigurable distributed data processor are proposed. The architecture considered consists of a number of loosely coupled computing elements (CEs) which may be linked to data and file memories through fully nonblocking interconnect networks. The real-time performance of such an architecture depends upon its ability to alter its topology in response to changes in application, asynchronous data rates and faults. Such a data flow language enhances the versatility of a reconfigurable architecture by allowing the user to specify the machine's topology at a very high level. 11 references.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, Tyler Barratt; Urrea, Jorge Mario
2012-06-01
The aim of the Authenticating Cache architecture is to ensure that machine instructions in a Read Only Memory (ROM) are legitimate from the time the ROM image is signed (immediately after compilation) to the time they are placed in the cache for the processor to consume. The proposed architecture allows the detection of ROM image modifications during distribution or when the image is loaded into memory. It also ensures that modified instructions will not execute in the processor, as the cache will not be loaded with a page that fails an integrity check. The authenticity of the instruction stream can also be verified in this architecture. The combination of integrity and authenticity assurance greatly improves the security profile of a system.
NASA Technical Reports Server (NTRS)
Smith, T. B., III; Lala, J. H.
1984-01-01
The FTMP architecture is a high reliability computer concept modeled after a homogeneous multiprocessor architecture. Elements of the FTMP are operated in tight synchronism with one another, and hardware fault detection and fault masking are provided in a way that is transparent to the software. Operating system design and user software design are thus greatly simplified. Performance of the FTMP is also comparable to that of a simplex equivalent due to the efficiency of the fault handling hardware. The FTMP project constructed an engineering module of the FTMP, programmed the machine, and extensively tested the architecture through fault injection and other stress testing. This testing confirmed the soundness of the FTMP concepts.
Secure Autonomous Automated Scheduling (SAAS). Rev. 1.1
NASA Technical Reports Server (NTRS)
Walke, Jon G.; Dikeman, Larry; Sage, Stephen P.; Miller, Eric M.
2010-01-01
This report describes network-centric operations, where a virtual mission operations center autonomously receives sensor triggers, and schedules space and ground assets using Internet-based technologies and service-oriented architectures. For proof-of-concept purposes, sensor triggers are received from the United States Geological Survey (USGS) to determine targets for space-based sensors. The Surrey Satellite Technology Limited (SSTL) Disaster Monitoring Constellation satellite, the UK-DMC, is used as the space-based sensor. The UK-DMC's availability is determined via machine-to-machine communications using SSTL's mission planning system. Access to/from the UK-DMC for tasking and sensor data is via SSTL's and Universal Space Network's (USN) ground assets. The availability and scheduling of USN's assets can also be performed autonomously via machine-to-machine communications. All communication, both on the ground and between ground and space, uses open Internet standards